'Big Data'

Preservation and Management Strategies for Exceptionally Large Data Formats

by Justin Barton
March 18, 2009
The Archaeology Data Service (ADS), hosted by the University of York, investigated the current problems associated with storing, using, and disseminating the large (and increasingly large) data sets generated by the cultural heritage field (researchers, archaeologists, and CRM practitioners). There is growing use of, and demand for, advanced technologies such as LiDAR, GIS, and computer modeling in cultural resource management and research, yet there is still a lack of standardization and 'good practice' methodologies for dealing with these vast data sets. The ADS 'Big Data' project covers archival strategies, metadata standards, preservation and ongoing management of data, and known accessibility and usability issues surrounding large, proprietary data sets.

For laser scanning/LiDAR in particular, this relatively new field lacks cohesive standards for long-term data migration and usability. There are few non-proprietary data formats in the field, and those are hardly used, so basic ASCII exports are the best solution; yet most researchers leave their work in the native format of their equipment. As we reach a point where vast, fast data dissemination is more and more possible, this cannot be the way forward. It is not workable for everyone to use a different format for archiving and data exchange. This paper covers important topics, and it arrives at a critical stage, when changes can still be made before the problem becomes unmanageable.