Collection and Preservation of At-Risk Digital Geospatial Data: the North Carolina NDIIPP Project


1 Collection and Preservation of At-Risk Digital Geospatial Data: the North Carolina NDIIPP Project
Partners:
  NCSU Libraries (Project Lead: Steve Morris)
  NC Center for Geographic Information & Analysis (Project Lead: Zsolt Nagy)
LCFS Database Group, May 30, 2005

2 Project Context
Partnership between university library (NCSU) and state agency (NCCGIA)
Focus on state and local geospatial content in North Carolina (state demonstration)
Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventory information
Objective: engage existing state/federal geospatial data infrastructures in preservation
Note: Percentages based on the actual number of respondents to each question

3 Targeted Content
Resource Types
  GIS “vector” (point/line/polygon) data
  Digital orthophotography
  Digital maps
  Tabular data (e.g. assessment data)
Content Producers
  Mostly state, local, regional agencies
  Some university, not-for-profit, commercial
  Selected local federal projects

4 NC Local GIS Landscape
100 counties, 92 with GIS
80 counties with high resolution orthophotography
65+ counties with unique map servers
Growing number of municipal systems
Value: $162 million plus investment

5 NC OneMap Initial Data Layers Produced by Cities and Counties

6 Vector data (scale, accuracy, currency, etc.)

7 Time series – vector data
Parcel Boundary Changes 2001-2004, North Raleigh, NC

8 Aerial imagery (image resolution, etc.)

9 Aerial imagery (image resolution, etc.)

10 Aerial imagery (image resolution, etc.)

11 Aerial imagery (image resolution, etc.)

12 Time series – Ortho imagery
Vicinity of Raleigh-Durham International Airport 1993-2002

13 Tabular data (combined with vector data)

14 Tabular data (combined with vector data)

15 Tabular data (combined with vector data)

16 Today’s geospatial data as tomorrow’s cultural heritage

17 Risks to Digital Geospatial Data
Producer focus on current data
  Time-versioned content generally not archived
Future support of data formats in question
  Vast and complex range of data formats in use
Shift to “streaming data” for access
  Archives have been a by-product of providing access
Preservation metadata requirements
  Descriptive, administrative, technical, DRM
Geodatabases
  Complex functionality

18 GIS Software Used – Local Agencies
Source: NC OneMap Data Inventory 2004

19 Earlier NCSU Acquisition Efforts
NCSU University Extension project 2000-2001
  Target: County/city data in eastern NC
  “Digital rescue” not “digital preservation”
Project learning outcomes
  Confirmed concerns about long term access
  Need for efficient inventory/acquisition
  Wide range in rights/licensing
  Need to work within statewide infrastructure
  Acquired experience; unanticipated collaboration

20 Exploring Approaches to Sharing Data
County and City GIS Directories

21 Processing Ingested Data
e.g. Testing for data gaps in county orthophoto sets
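
One way to picture the gap test mentioned on this slide is a comparison between the tiles that were actually received and the tiles a complete set should contain. The sketch below is only an illustration, not the project's method: it assumes, hypothetically, that tiles can be keyed by (row, col) positions on a simple rectangular grid, which real county tiling schemes may not follow.

```python
from itertools import product


def find_missing_tiles(present, rows, cols):
    """Return sorted (row, col) indices expected in the grid but not received.

    Assumption (hypothetical): the orthophoto set covers a full rows x cols
    rectangle and each delivered tile has already been mapped to its index.
    """
    expected = set(product(range(rows), range(cols)))
    return sorted(expected - set(present))


# Hypothetical delivery: a 2 x 2 county grid with one tile absent.
received = [(0, 0), (0, 1), (1, 1)]
print(find_missing_tiles(received, rows=2, cols=2))
```

In practice the expected set would more likely come from a tile index layer supplied by the producer than from a bare rectangle.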

22 Content Identification and Selection
Work from NC OneMap Data Inventory
Combine with inventory information from various state agencies and from previous NCSU efforts
Develop methodology for selecting from among “early,” “middle,” and “late” stage products
Develop criteria for time series development
Investigate use of emerging Open Geospatial Consortium technologies in data identification

23 Content Acquisition
Work from NC OneMap Data Sharing Agreements as a starting point (the “blanket”)
Secure individual agreements (the “quilt”)
Investigate use of OGC technologies in capture
Explore use of METS as a metadata wrapper
  Ingest FGDC metadata; crosswalk to MODS? PREMIS?
  Maybe METS DRM short term; GeoDRM long term
  Consider links to services; version management
Get the geospatial community to tackle the content packaging problem (maybe MPEG-21?)

24 Partnership Building
Work within context of the NC OneMap initiative
  State, local, federal partnership
  State expression of the National Map
  Defined characteristic: “Historic and temporal data will be maintained and available”
  Advisory Committee drawn from the NC Geographic Information Coordinating Council subcommittees
Seek external partners
  National States Geographic Information Council
  FGDC Historical Data Committee
  … more

25 Content Retention and Transfer
Ingest into DSpace
  Explore how geospatial content interacts with existing digital repository software environments
Investigate re-ingest into a second platform
  Challenge: keep the collection repository-agnostic
Start to define format migration paths
  Special problem: geodatabases
Pursue long term solution
  Roles of data producing agencies, state agencies; NC OneMap; NCSU

26 Rights Issues
Various interpretations of public records law
  53.9% of local NC agencies charge for data
  43.7% of local NC agencies restrict redistribution
Desire for downstream control of data
  Disclaimer clickthrough; liability concerns
  Filtered locations/individuals; post 9/11 issues
  Restrictions on redistribution; commercial resale
Web services area in “Wild West” stage
  Both content and technical agreements
  GeoDRM initiative in the works

27 Big Challenges
Format migration paths
Management of data versions over time
Preservation metadata
Harnessing geospatial web services
Preserving cartographic representation
Keeping content repository-agnostic
Preserving geodatabases
More …

28 Vector Data Format Issues
Vector data much more complicated than image data
‘Archiving’ vs. ‘Permanent access’
  An ‘open’ pile of XML might make an archive, but if using it requires a team of programmers to do digital archaeology then it does not provide permanent access
Piles of XML need to be widely understood piles
  GML: need widely accepted application schemas (like OSMM?)
The Geodatabase conundrum
  Export feature classes, and lose topology, annotation, relationships, etc.
  … or use the Geodatabase as the primary archival platform (some are now thinking this way)

29 Vector Data Format Options
Option A: use an open format, accepting a really unfortunate transformation and limited vendor support for the output object
Option B: use a closed format but retain the original content and count on short- and medium-term vendor support
Option C: do both to buy time and look for an open, ASCII solution (watch GML activity)
No sweet spot, just an evolving and changing mix of flawed options that are used in combination.

30 Managing Time-versioned Content

31 Managing Time-versioned Content
Many local agency data layers continuously updated
  E.g., some county cadastral data updated daily; older versions not generally available
Individual versioned datasets will wander off from the archive
  How do users “get current metadata/DRM/object” from a versioned dataset found “in the wild”?
  How do we certify concurrency and agreement between the metadata and the data?

32 Managing Time-versioned Content
Can we manage the relationship loosely using a persistent identifier link to a parent object?
[Diagram labels: version, Persistent ID Resolver, Parent Object Manager, version]
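
The loose coupling asked about on this slide can be pictured with a small sketch. Everything here is hypothetical (the class names, the `example:` identifier scheme, the metadata fields); the slide only names a persistent ID resolver and a parent object manager. The idea shown is that a dataset version carries nothing but the parent's persistent identifier, and the current metadata and rights information are looked up from the parent at the time of use.

```python
from dataclasses import dataclass, field


@dataclass
class ParentObject:
    """Hypothetical parent record holding the current metadata for all versions."""
    persistent_id: str
    current_metadata: dict
    version_dates: list = field(default_factory=list)


class PersistentIdResolver:
    """Hypothetical resolver: maps a persistent ID to its parent object."""

    def __init__(self):
        self._parents = {}

    def register(self, parent):
        self._parents[parent.persistent_id] = parent

    def resolve(self, persistent_id):
        # Returns None when the identifier is unknown to this resolver.
        return self._parents.get(persistent_id)


resolver = PersistentIdResolver()
resolver.register(ParentObject(
    persistent_id="example:county-parcels",  # made-up identifier
    current_metadata={"rights": "redistribution restricted"},
    version_dates=["2001-01-01", "2004-01-01"],
))

# A version found "in the wild" only needs to carry the parent's ID.
stray_version = {"parent_id": "example:county-parcels", "date": "2001-01-01"}
parent = resolver.resolve(stray_version["parent_id"])
print(parent.current_metadata)
```

The design choice sketched here is that metadata is never copied into each version, so a correction made at the parent reaches every version that can still resolve its identifier.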

33 Preservation Metadata Issues
FGDC Metadata
  Many flavors, incoming metadata needs processing
  Cross-walk elements to PREMIS, MODS?
Metadata wrapper
  METS (Metadata Encoding and Transmission Standard) vs. other industry solutions
  Need a geospatial industry solution for the ‘METS-like problem’
  GeoDRM a likely trigger: wrapper to enforce licensing (MPEG 21 references in OGIS Web Services 3)
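
As a rough illustration of what a crosswalk step might involve, the sketch below reads a few FGDC CSDGM elements (by their standard short names such as `idinfo`, `citeinfo`, `origin`) and files their values under MODS element paths. The mapping table is an assumption made for illustration, not an official or project-endorsed crosswalk, and the sample record is invented. Elements that are missing from the incoming record are simply skipped, which is one way to tolerate the "many flavors" of incoming metadata.

```python
import xml.etree.ElementTree as ET

# Illustrative mapping only: FGDC CSDGM path -> MODS element path.
CROSSWALK = {
    "idinfo/citation/citeinfo/title": "titleInfo/title",
    "idinfo/citation/citeinfo/origin": "name/namePart",
    "idinfo/citation/citeinfo/pubdate": "originInfo/dateIssued",
    "idinfo/descript/abstract": "abstract",
}


def crosswalk_fgdc(xml_text):
    """Return {mods_path: value} for each mapped FGDC element that is present."""
    root = ET.fromstring(xml_text)  # root element is <metadata>
    result = {}
    for fgdc_path, mods_path in CROSSWALK.items():
        element = root.find(fgdc_path)
        if element is not None and element.text and element.text.strip():
            result[mods_path] = element.text.strip()
    return result


# Invented sample record with no abstract, to show a gap being tolerated.
SAMPLE = """<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <origin>Example County GIS</origin>
        <pubdate>2004</pubdate>
        <title>Example County Parcels</title>
      </citeinfo>
    </citation>
  </idinfo>
</metadata>"""

mods_fields = crosswalk_fgdc(SAMPLE)
print(mods_fields)
```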

34 Harnessing Geospatial Web Services

40 Harnessing Geospatial Web Services
Automated content identification
  ‘capabilities files,’ registries, catalog services
WMS (Web Map Service) for batch extraction of image atlases
  last ditch capture option
  preserve cartographic representation
  retain records of decision-making process
… feature services (WFS) later.
Rights issues in the web services space are ambiguous
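
Batch extraction of an image atlas from a WMS server could, in outline, amount to splitting an area of interest into a grid and issuing one GetMap request per cell. The sketch below only builds the request URLs and does not fetch anything. The query parameters are the standard WMS 1.1.1 GetMap parameters; the server address, layer name, bounding box, and tile size are all hypothetical.

```python
from urllib.parse import urlencode


def getmap_urls(base_url, layer, bbox, nx, ny, tile_px=512, srs="EPSG:4326"):
    """Build one WMS 1.1.1 GetMap URL per cell of an nx x ny grid over bbox.

    bbox is (minx, miny, maxx, maxy) in the units of the given SRS.
    """
    minx, miny, maxx, maxy = bbox
    dx = (maxx - minx) / nx
    dy = (maxy - miny) / ny
    urls = []
    for j in range(ny):
        for i in range(nx):
            cell = (minx + i * dx, miny + j * dy,
                    minx + (i + 1) * dx, miny + (j + 1) * dy)
            params = {
                "SERVICE": "WMS",
                "VERSION": "1.1.1",
                "REQUEST": "GetMap",
                "LAYERS": layer,
                "STYLES": "",  # empty value requests the server default style
                "SRS": srs,
                "BBOX": ",".join(f"{v:.6f}" for v in cell),
                "WIDTH": tile_px,
                "HEIGHT": tile_px,
                "FORMAT": "image/png",
            }
            urls.append(f"{base_url}?{urlencode(params)}")
    return urls


# Hypothetical server and layer; a 2 x 2 atlas over a one-degree square.
urls = getmap_urls("https://example.org/wms", "parcels", (-79, 35, -78, 36), 2, 2)
print(len(urls))
print(urls[0])
```

Because the images come back already symbolized by the server, this kind of capture keeps the cartographic representation but not the underlying features.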

41 Preserving Cartographic Representation

42 Preserving Cartographic Representation
The true counterpart of the old map is not the GIS dataset, but rather the cartographic representation that builds on that data:
  Intellectual choices about symbolization, layer combinations
  Data models, analysis, annotations
Cartographic representation typically encoded in proprietary files (.avl, .lyr, .apr, .mxd) that do not lend themselves well to migration
Symbologies have meaning to particular communities at particular points in time; preserving information about symbol sets and their meaning is a different problem

43 Preserving Cartographic Representation
Image-based approaches
  Generate images using Map Book or similar tools
  Harvest existing atlas images
  Capture atlases from WMS servers
  Export ‘layouts’ or ‘maps’ to image
Vector-based approaches
  Store explicitly in the data format (e.g. Feature Class Representation in ArcGIS 9.2)
  Archive and upward-migrate existing files (.avl, .apr, .lyr, .mxd, etc.)
  SVG, VML or other XML approaches
Other?

44 Preserving Cartographic Representation

45 Preserving Cartographic Representation

46 Repository Architecture Issues
Interest in how geospatial content interacts with widely available digital repository software
  Focus on salient, domain-specific issues
Challenge: remain repository agnostic
  Avoid “imprinting” on repository software environment
  Preservation package should not be the same as the ingest object of the first environment
  Tension between exploiting repository software features vs. becoming software dependent

47 Preserving Geodatabases
Spatial databases in general vs. ESRI Geodatabase “format”
  Not just data layers and attributes: also topology, annotation, relationships, behaviors
ESRI Geodatabase archival issues
  XML Export, Geodatabase History, File Geodatabase, Geodatabase Replication
Growing use of geodatabases by municipal, county agencies
  Some looking to Geodatabase as archival platform (in addition to feature class export)

48 Questions?
Contact: Steve Morris, Head, Digital Library Initiatives, NCSU Libraries
Steven_Morris@ncsu.edu

