Safeguarding the Citation Lifecycle for Geospatial Repositories (based on presentation at EGU 2007)
Guy McGarva, EDINA National Data Centre
Rajendra Bose, DCC and School of Informatics
University of Edinburgh
Tuesday 15 May 2007, CLADDIER Project Workshop, Chilworth, Southampton, UK
Contents
Issues
– Datasets vs. Features
Citation of Geospatial Features
– OS MasterMap® example
Dataset Citation
– Example using Go-Geo! and GRADE
Work in progress…
Geospatial Dataset vs Geospatial Features
Citation may need to be at either the dataset or the feature level:
– You need to cite the dataset if:
   the dataset is small (number of features, extent, etc.)
   the whole dataset is used in the analysis
   the data is not feature based (e.g. raster/surface)
– But you need to cite specific features if:
   only a small extent is taken from a very large geodatabase
   only specific features are used in the analysis
   the database changes continuously (though only a small proportion of the whole database), so you need to know which version was used
Geospatial Feature Citation
Some assumptions for features within a geospatial repository:
– Every feature has a unique ID, e.g. OS TOIDs (Topographic Object IDs)
– Features have version numbers/time-stamps
Note: this concerns vector data that could be used to produce cartographic products, not the maps themselves
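The two assumptions above suggest that a citable reference to a feature is the pair (identifier, version) rather than the identifier alone. A minimal Python sketch of that idea follows; the class name `FeatureRef`, its field names, and the identifier `osgb-EXAMPLE-1` are made up for illustration and are not taken from OS MasterMap or from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureRef:
    """Hypothetical reference to one version of one feature."""
    toid: str      # unique feature identifier (placeholder values below)
    version: int   # version number of that feature


# Two versions of the same feature are distinct citation targets.
v1 = FeatureRef("osgb-EXAMPLE-1", 1)
v2 = FeatureRef("osgb-EXAMPLE-1", 2)

# frozen=True makes instances hashable, so a set removes exact duplicates
# while keeping different versions of the same feature apart.
cited = {v1, v2, FeatureRef("osgb-EXAMPLE-1", 1)}
```

Treating the pair as the key is what lets a later reader ask for the feature as it was when cited, not as it is today.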
Citation lifecycle for journal articles
[Diagram: citation creation → citation included in published work (journal article) → (future) discovery of citation → retrieve/resolve citation → citation target: access a copy of the article held in a library or institutional repository]
Citation lifecycle (broken) for geospatial data
[Diagram: citation creation → citation included in published work ("Author, 2006, Citation target,…" in a journal article held in a library or institutional repository) → (future) discovery of citation → retrieve/resolve citation → citation target: reassemble geospatial database contents from data held in a geospatial repository]
Citation lifecycle for geospatial data
[Diagram: the same stages as above, ending at the citation target: reassemble geospatial database contents from geospatial data held in a geospatial repository]
Sample Feature Data
e.g. OS MasterMap:
   toid="osgb " version="2"
   toid="osgb " version="1"
e.g. User data:
   toid="rkb_ _2"
   toid="rkb_ _1"
Citation creation (Researcher_1, GIS_1, Jan 2007)
– Base data (e.g. OS MasterMap) and other data source (e.g. user data), where the features are, e.g.:
   toid="osgb " version="2"
   toid="osgb " version="1"
   toid="rkb_ _1"
   toid="rkb_ _2"
– Generation of manifest
– Incorporation into bridge service web page (with embedded manifest)
– Citation in publication (References: Map1… link) links to bridge service (via doi?)
Sample Citation
So, the citation for this data becomes:
Bose, Rajendra Building geohazard data set. University of Edinburgh Database Group.
Where 'manifest' is an XML file containing a list of feature identifiers and version numbers,
and 'bridge service' is a web site that provides access to geospatial metadata
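As an illustration of the manifest described above, the following Python sketch writes a list of feature identifiers and version numbers to XML. The element names `manifest` and `feature`, the function `build_manifest`, and the identifiers are assumptions made for this example; the `toid` and `version` attribute names follow the sample feature data, but the slides do not define a manifest format.

```python
import xml.etree.ElementTree as ET


def build_manifest(features):
    """Return an XML string listing (identifier, version) pairs.

    The element names are hypothetical; only the attribute names
    toid and version come from the sample feature data.
    """
    root = ET.Element("manifest")
    for toid, version in features:
        ET.SubElement(root, "feature", toid=toid, version=str(version))
    return ET.tostring(root, encoding="unicode")


# Placeholder identifiers standing in for reference data and user data.
manifest_xml = build_manifest([
    ("osgb-EXAMPLE-1", 2),
    ("rkb-EXAMPLE-1", 1),
])
print(manifest_xml)
```

The manifest carries only identifiers and version numbers, not geometry, which is what would keep it compact and free of the underlying data's copyright.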
Example of how bridge service would work: embed manifest in ISO 19115, FGDC CSDGM, etc. metadata for set of features
Example of how bridge service would work:
DNF Schema to support data association
Sample XML data based on the DNF schema:
– can be used to transfer object identifiers and version numbers
– can be used to associate users' data with reference data
Labelled parts of the sample: Object Identifier; Version number; Some application layer specific information
STD-DOI project
Citation retrieval: completes the life-cycle (Researcher_2, GIS_2; timeline Jan 2007 to Jan 2010)
– Citation in publication (References: Map1… link) links to bridge service (e.g. via doi as in STD-DOI project?)
– Extract manifest and access archive from bridge service
– Repositories/archives use manifest to provide historical MasterMap and other features if user has permissions
– Features can now be reassembled with a GIS
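The retrieval step can be sketched as follows: parse a manifest, then ask an archive for exactly the cited version of each feature. This is an illustration under stated assumptions, not the project's implementation. The manifest layout, the in-memory `ARCHIVE` dictionary, the function `resolve_manifest`, and all identifiers are hypothetical, and permission checks are left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical manifest as it might be extracted from a bridge service page.
MANIFEST = """<manifest>
  <feature toid="osgb-EXAMPLE-1" version="1"/>
  <feature toid="rkb-EXAMPLE-1" version="2"/>
</manifest>"""

# Stand-in for a repository that keeps every historical version,
# keyed by (identifier, version). Values are placeholders for feature data.
ARCHIVE = {
    ("osgb-EXAMPLE-1", 1): "reference feature as of version 1",
    ("osgb-EXAMPLE-1", 2): "reference feature as of version 2",
    ("rkb-EXAMPLE-1", 2): "user feature as of version 2",
}


def resolve_manifest(manifest_xml, archive):
    """Return (resolved, missing) for the features listed in a manifest."""
    resolved, missing = {}, []
    for element in ET.fromstring(manifest_xml).iter("feature"):
        key = (element.get("toid"), int(element.get("version")))
        if key in archive:
            resolved[key] = archive[key]
        else:
            missing.append(key)  # cited version no longer available
    return resolved, missing


resolved, missing = resolve_manifest(MANIFEST, ARCHIVE)
```

Note that the lookup returns version 1 of the reference feature even though the archive also holds version 2; reassembling the cited state depends on the repository retaining superseded versions.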
Geospatial Dataset Citation Example
An example using:
– Go-Geo! metadata portal and
– GRADE geospatial repository
Shows how the life cycle could be completed by maintaining the citation in the metadata and resolving it through a domain repository
GRADE solves problems of access and authorisation by being accessible only to users of Digimap (for uploading and downloading of data)
Could be extended to support feature manifests using the DNF schema
Go-Geo! geospatial metadata portal
Linking a record in Go-Geo with the data in GRADE Location of data
Retrieving the data from GRADE URL from Go-Geo! Download File
Conclusions
Current guidelines for citing a selection of features within a geospatial database are inadequate
An XML manifest could serve as a definitive, compact, portable (and copyright-free) list of geospatial features that could facilitate citation
A bridge service or metadata portal could provide a means of retrieving citations
Further work…
– How do manifests interact with archives/repositories?
– Web services
GRADE Geospatial Repository
Spatial Search