Preserving Digital Geospatial Data: The NC Geospatial Data Archiving Project (NCGDAP) Steven P. Morris North Carolina State University Libraries CRADLE.

1 Preserving Digital Geospatial Data: The NC Geospatial Data Archiving Project (NCGDAP) Steven P. Morris North Carolina State University Libraries CRADLE Seminar November 17, 2006

2 NC Geospatial Data Archiving Project
Partnership between university library (NCSU) and state agency (NCCGIA)
Focus on state and local geospatial data in North Carolina (state demonstration)
Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventories
Objective: engage existing state/federal geospatial data infrastructures in preservation
Project approaches: Technical and Social
Serve as catalyst for discussion within industry
Note: Percentages based on the actual number of respondents to each question

3 Targeted data: Digital orthophotography
85+ NC counties with orthophotos
1-5 flights per county
gb per flight

4 Targeted data: Vector data (w/tabular)
Economic, infrastructure, and ethnographic data

5 Today’s geospatial data as tomorrow’s cultural heritage
Future uses of data are difficult to anticipate (as with Sanborn Maps).

6 Risks to State/Local Geospatial Data
Producer focus on current data
  Data overwrite as common practice
Future support of data formats in question
  No open, supported format for vector data
Shift to web services-based access
  Data becoming more ephemeral
Inadequate or nonexistent metadata
  Impedes discovery and use
Increasing use of spatial databases for data management
  The whole is greater than the sum of the parts

7 Challenge: Vector Data Formats
No widely-supported, open vector formats for geospatial data
  Spatial Data Transfer Standard (SDTS) not widely supported
  Geography Markup Language (GML) – diversity of application schemas and profiles threatens permanent access
Spatial Databases
  The whole is more than the sum of the parts, and the whole is very difficult to preserve
  Can export individual data layers for curation
  Some thinking of using the spatial database as the primary archival platform

8 Challenge: Cartographic Representation
Counterpart to the map is not just the dataset but also models, symbolization, classification, annotation, etc.

9 Challenge: Geospatial Web Services
How to capture records from decision-making processes?
Possible: Atlas collections from automated image capture
Web 2.0 impact: Emerging tiling and caching schemes (archive target?)

10 Different Ways to Approach Preservation
Technical solutions: How do we archive acquired content over the long term?
Build a data repository: not as an end in itself but as a catalyst for discussion within the data community
Develop a repository ingest workflow: create technical points of engagement with the digital preservation community

11 Different Ways to Approach Preservation
Cultural/Organizational solutions: How do we make the data more preservable, and more likely to be archived, from the point of production?
Engage data producer community and spatial data infrastructure through outreach; influence practice
Sell the problem to software vendors and standards development
Find overlap with more compelling business problems: disaster preparedness, business continuity, road building, etc.
Start a discussion about roles at the local, state, and federal level

12 NCGDAP Technical Approach
Receive data as is – variety of distribution methods
Migration of some at-risk formats
Metadata remediation, normalization, and synchronization
Distilling complex objects into repository ingest items (not easy)
Using DSpace for demonstration purposes (keeping repository platform at arm's length)
In development: use METS record as dormant item “brain” within the repository
Some unsustainable activities – for learning experience

13 Building Data Bundles: The Zip Codes Example

14 Where is the Dataset?

15 Here’s One!
Files: Multi-file dataset; Georeferencing; Metadata file; Symbolization file; Additional documentation; License; Disclaimer
More Metadata: FGDC; Acquisition metadata; Transfer metadata; Ingest metadata; Archive rights; Archive processes; Collection metadata; Series metadata
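
One way to picture the bundle on this slide is as a manifest that groups a dataset's files with its layers of metadata. The sketch below is illustrative only: the class name `DataBundle`, its fields, and the sample file names are hypothetical and are not taken from the NCGDAP ingest workflow.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataBundle:
    """Hypothetical manifest for one archived dataset (all names illustrative)."""
    title: str
    files: Dict[str, List[str]] = field(default_factory=dict)  # role -> file names
    metadata: Dict[str, str] = field(default_factory=dict)     # metadata type -> file name

    def add_file(self, role: str, name: str) -> None:
        """Register a file under a role such as 'dataset' or 'georeferencing'."""
        self.files.setdefault(role, []).append(name)

    def missing_metadata(self, required: List[str]) -> List[str]:
        """Return the required metadata types that the bundle does not yet carry."""
        return [kind for kind in required if kind not in self.metadata]


# Assumed example: a multi-file dataset with only its FGDC record attached so far.
bundle = DataBundle(title="Zip Codes")
for part in ("zipcodes.shp", "zipcodes.shx", "zipcodes.dbf"):
    bundle.add_file("dataset", part)
bundle.add_file("georeferencing", "zipcodes.prj")
bundle.metadata["fgdc"] = "zipcodes.xml"

print(bundle.missing_metadata(["fgdc", "acquisition", "ingest"]))
```

A check like `missing_metadata` hints at why "Where is the Dataset?" is a real question: the bundle is only complete once every expected role and metadata layer is accounted for.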

16 Hub-and-Spoke Metadata Workflow

17 Hub-and-Spoke Metadata Workflow

18 Hub-and-Spoke Metadata Workflow
Issues:
  Ingest process needs access to repository specifics (e.g., what collections exist)
  Understanding of what the core elements should be is refined as spokes are added
  Need to consider repository response to SIP or AIP evolution
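
The hub-and-spoke idea can be sketched as a small core record (the hub) that each source or repository format (a spoke) maps into or out of. This is a minimal sketch under stated assumptions, not the project's workflow: the core element names, the flattened FGDC-style keys, the Dublin Core-style target fields, and the sample record are all illustrative.

```python
# Hub: a small set of core elements (hypothetical names).
CORE_ELEMENTS = ("title", "originator", "pub_date", "abstract")

# Illustrative crosswalks; real FGDC and Dublin Core mappings are far richer.
FGDC_TO_HUB = {
    "title": "idinfo.citation.title",
    "originator": "idinfo.citation.origin",
    "pub_date": "idinfo.citation.pubdate",
    "abstract": "idinfo.descript.abstract",
}
HUB_TO_DC = {
    "title": "dc.title",
    "originator": "dc.creator",
    "pub_date": "dc.date.issued",
    "abstract": "dc.description.abstract",
}


def to_hub(source: dict, mapping: dict) -> dict:
    """Translate a source record into the hub record; fail if core elements are absent."""
    hub = {core: source.get(src_key) for core, src_key in mapping.items()}
    missing = [name for name in CORE_ELEMENTS if not hub.get(name)]
    if missing:
        raise ValueError(f"source record lacks core elements: {missing}")
    return hub


def to_spoke(hub: dict, mapping: dict) -> dict:
    """Translate the hub record into one repository-specific (spoke) record."""
    return {target: hub[core] for core, target in mapping.items()}


# Hypothetical sample record, for illustration only.
fgdc_record = {
    "idinfo.citation.title": "Zip Codes",
    "idinfo.citation.origin": "Example County GIS",
    "idinfo.citation.pubdate": "2006",
    "idinfo.descript.abstract": "Hypothetical sample record.",
}
dc_record = to_spoke(to_hub(fgdc_record, FGDC_TO_HUB), HUB_TO_DC)
print(dc_record["dc.title"])
```

Adding a new spoke means adding one mapping rather than one converter per pair of formats, which is also why the set of core elements tends to be revised as spokes accumulate.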

19 Metadata: Going Beyond a Passive Role
Feedback to the NC OneMap Metadata Outreach Program vis-à-vis metadata quality problems encountered in repository ingest
Engage standards body (Open Geospatial Consortium -- OGC) in discussions about:
  content packaging standards for geospatial data
  better practices for time-versioned data
  persistent identifier schemes
  contributing archive use cases to GeoDRM
Meetings with major software vendor development teams

20 Social Issues: Changing Industry Thinking
Is the geospatial industry “temporally-impaired?”
  Lack of access to older data
  Lack of tool/model support for temporal analysis
  Metadata: poor support for changing data
  Education: building class projects around available data (i.e., not temporal)
Increased interest now in temporal applications?
  Increased demand for temporal data?
  Improved tool support: ArcGIS 9.2 animation tools; Geodatabase History, etc.
IMPORTANT: Gathering business cases for using older data

21 Social Issues: Content Exchange Networks
Solving the present-day problems of data sharing is a prerequisite to solving the problem of long-term access
Leveraging more compelling business problems: disaster preparedness and business continuity needs can put the data in motion (siphon off to the archive)
Geospatial data: large data volumes, frequent data update, complex datasets, ambiguous rights
Content exchange network technical challenges:
  Rights management
  Large-scale transfers on network
  Content packaging (MPEG 21 DIDL, XFDU, METS, …)

22 Content Issues: Frequency of Capture Survey
Survey objective:
  Document current practices for obtaining archival snapshots of county/municipal geospatial vector data layers
  Seek guidance about frequency of capture
Survey topics:
  General questions about data archiving practice
  Specific questions about parcels, street centerlines, jurisdictional boundaries, and zoning
Survey subjects: All 100 counties and 25 municipalities -- 58% response rate
Survey conducted September 2006
Added benefit: Survey socialized the preservation issue

23 NC County/Municipal Agency Frequency of Capture: Parcel Data
Based on a percentage of the respondents that indicate they actually archive some data Note: Percentages based on the actual number of respondents to each question

24 Content Issues: What About Commercial Data?
Project Status
Cultivating a commercial market for older data
Part of “permanent access” is marketing, advertising, and putting older data into the path of the user

25 New Challenges: “Platial” vs. Spatial Imagery
Mobile, LBS, and social networking applications drive demand for place-based data
Example sources:
  Oblique Imagery
  Street-view Imagery (e.g., A9.com)
  Transportation Dept. Videologs
Long-term cultural heritage value in non-overhead imagery: more descriptive of place and function
Emerging: “Tricorder” applications

26 New Challenges: Ajax Applications, Google Earth and All That
Emerging online environments are increasingly used to make decisions; how are these decisions documented?
Web mashup/AJAX interactions with existing systems spur creation of intermediate content layers: e.g., tiling and caching of WMS services
Formulation of a standard tiling scheme may create a new preservation opportunity (temporal axis on caches?)
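
To make the "temporal axis on caches" idea concrete, the sketch below adds a capture date to a tile address. The slide does not name a tiling scheme; the Web Mercator XYZ indexing used here is an assumption chosen for illustration, and the function names, layer name, and key layout are hypothetical.

```python
import math
from datetime import date
from typing import Tuple


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> Tuple[int, int]:
    """Return (x, y) tile indices for a point in the Web Mercator XYZ scheme."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edge still fall inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def archive_key(layer: str, captured: date, zoom: int, x: int, y: int) -> str:
    """Build a cache key in which the capture date acts as a temporal axis."""
    return f"{layer}/{captured.isoformat()}/{zoom}/{x}/{y}.png"


# Assumed example: the tile covering longitude 0, latitude 0 at zoom level 1.
x, y = lonlat_to_tile(0.0, 0.0, 1)
print(archive_key("ortho", date(2006, 11, 17), 1, x, y))
```

With the date in the key, a cache refresh writes new tiles beside the old ones instead of overwriting them, so the cache itself becomes a series of dated snapshots.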

27 Web mashup/AJAX interactions with existing systems spur creation of intermediate content layers: e.g., tiling and caching of WMS services
Identification of a standard tiling scheme may create a new preservation opportunity (temporal axis on caches?)

28 Working with New Partners
State Archives now an informal member of the NCGDAP project
Collaboration with NARA
Working with the Open Geospatial Consortium on standards issues
Associate Partnership with JISC-funded UK-wide project
Site visits with ESRI (major software vendor) development groups
Participation in a variety of content exchange network activities
More …

29 Next Steps
Working with NARA and the OGC Interoperability Institute to develop an OGC Data Preservation Working Group charter
Evaluating results for the frequency of capture survey
Stepping up data acquisition and repository ingest
Evaluating initial data acquisition efforts (time factors, content variety, technical/legal barriers)
Partnership with content exchange network activities
Ramping up partnerships with broader (non-geospatial) data repository efforts

30 Questions?
http://www.lib.ncsu.edu/ncgdap
Contact: Steve Morris, Head, Digital Library Initiatives, NCSU Libraries
ph: (919)

