Scoping a Geospatial Repository for Academic Deposit and Extraction
Anne Robertson, EDINA National Data Centre, University of Edinburgh
JISC Geospatial Working Group, UCL, 9th January 2006
JISC Digital Repositories Programme
June 2005 JISC £4m programme
Aim of encouraging growth of repositories in UK universities and colleges
Programme consists of 25 projects exploring role and operation of repositories
Focus on how repositories can assist academic researchers both to do and share work more easily
Open access is key driver, plus growing demand for outputs of publicly-funded research to be freely available on the web
Today's climate
According to the OECD Follow-up Group on Issues of Access to Publicly Funded Research Data [1]: … More widespread and efficient access to and sharing of research data will have substantial benefits for most areas of scientific research.
Evidence of re-use of data within UK data centres is low:
–Level of re-use of data held in the AHDS and ESRC archives has been disappointingly low (Alison Allden, 2003)
–NERC spends about £5 million per annum on data management, but it is unclear what benefit it derives from this. More research is needed to establish benefits and value of data re-use (Mark Thorley, 2003)
–Qualidata survey of qualitative data re-use (2000): 44% of respondents used a colleague's data rather than acquiring archived data via a dissemination service (33%)
[1] Interim Report, 20 October 2002
GRADE project introduction
GRADE is part of JISC Digital Repositories Programme
GRADE will investigate and report on the technical and cultural issues around the reuse of geospatial data within the context of discipline-based repositories
Investigative in nature, not building a geospatial repository
Particular focus on sharing and reuse of derived geospatial data
EDINA leading GRADE with consortium partners:
–AHRC Research Centre for Studies in Intellectual Property and Technology Law, School of Law, Edinburgh University
–National Oceanography Centre, Southampton University
Variety of other associate partners including NCGDAP, BADC, Ordnance Survey, Geog depts, HEASCs
Project Work Programme
June 05 – Apr 07
Budget £160k
5 discrete work packages:
–Digital rights issues: when we consider the reuse of derived geospatial data, concerns over data ownership, IPR and copyright are commonplace
–Debate over institutional repository: one size fits all? Cultural aspects of allegiance to discipline, not institution
–Interoperability issues: how could a geospatial repository interact within JISC IE, and how could it make its assets available to the GRID and eScience community?
–Establish user-based evidence for the requirements and functionality of a repository capable of managing licensed geospatial assets
–Investigate and make an assessment of informal mechanisms for geospatial data sharing
Example of derived data scenario
Derived Data Example
[Workflow diagram with three columns: Input, Processing, Output. Only the labels survive; the arrows between them are lost.]
Labels relating to source data: OS Landline, Historic OS Maps, 2001 Orthophotos, Ground survey, GPS survey
Labels relating to processing: Scan, Georeference, Planimetric correction, Accuracy assessment, Digitise coastline positions, Calculation of cliff retreat
Output: ESRI Shapefile and tables of retreat
Source: Use case provision of derived geospatial data as part of the GRADE project in scoping digital repositories (draft report)
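The derived-data scenario raises the question of which rights carry through from source data to a derived product. A minimal sketch of one way to record that lineage follows, assuming a simple rule that every upstream rights holder is retained. The `Dataset` class, the `inherited_rights` function and the rights-holder values are illustrative assumptions, not part of GRADE, and real inheritance depends on the licence terms involved.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Dataset:
    """One node in a lineage graph (hypothetical structure)."""
    name: str
    rights_holder: Optional[str] = None           # None if no claim is recorded
    sources: List["Dataset"] = field(default_factory=list)


def inherited_rights(ds: Dataset) -> Set[str]:
    """Collect every rights holder found on ds and, recursively, its sources."""
    holders = set()
    if ds.rights_holder:
        holders.add(ds.rights_holder)
    for src in ds.sources:
        holders |= inherited_rights(src)
    return holders


# Illustrative values only: the rights holders below are assumptions.
landline = Dataset("OS Landline", "Ordnance Survey")
survey = Dataset("Ground survey", "Researcher")
coastline = Dataset("Digitised coastline positions", sources=[landline, survey])
retreat = Dataset("Tables of cliff retreat", "Researcher", sources=[coastline])

print(sorted(inherited_rights(retreat)))
```

A repository could store such a graph alongside the deposited asset so that licensing conditions on the inputs remain visible to anyone reusing the output.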
Example of more informal geospatial data sharing
Progress to Date
Compendium of examples of derived geospatial data highlighting copyright issues
–Basis upon which legal team can understand issues in building a framework for data sharing that respects licensing conditions
–Focus our thoughts on broader concepts of copyright inheritance, degrees of derived data
Literature review providing global snapshot of geospatial data and repositories
Developed a first demonstrator, built on existing open source repository software: ready to invite interaction, aim is to elicit feedback on user requirements
Identified technical issues that we as a geographic community have not yet focussed upon. We presented these to OGC Technical Committee in November ……..
Issues – Content Packaging
Consider a geospatial data asset deposited into a repository: it's more than one file
–GML and associated schema!
–proprietary vector format plus cartographic representation detail
–geodatabase
–raster with header file
–Data set metadata and IPR info
What is best method to package data?
In the eLibrary world, the Metadata Encoding and Transmission Standard (METS), IMS content package (IMS CP) and MPEG-21 DIDL are used for repository objects
What direction is the GI industry taking with content packaging?
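To make the multi-file problem concrete, here is a minimal sketch of a METS-style manifest that lists every file in one deposit. It uses only the Python standard library. The `build_manifest` function, the file names and the `role` labels are hypothetical, and the output omits the structMap that a schema-valid METS document requires, so it should be read as an illustration of the idea rather than a conformant package.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def build_manifest(asset_id, files):
    """Return a METS-style XML string listing every file in one deposit.

    files: list of (relative_path, mime_type, role) tuples. `role` is a
    hypothetical label used to group files; it is not a METS vocabulary.
    """
    ET.register_namespace("mets", METS_NS)
    ET.register_namespace("xlink", XLINK_NS)
    root = ET.Element(f"{{{METS_NS}}}mets", {"OBJID": asset_id})
    file_sec = ET.SubElement(root, f"{{{METS_NS}}}fileSec")
    groups = {}
    for n, (path, mime, role) in enumerate(files, start=1):
        if role not in groups:
            # One fileGrp per role, e.g. data, schema, metadata.
            groups[role] = ET.SubElement(
                file_sec, f"{{{METS_NS}}}fileGrp", {"USE": role})
        entry = ET.SubElement(
            groups[role], f"{{{METS_NS}}}file",
            {"ID": f"F{n}", "MIMETYPE": mime})
        ET.SubElement(
            entry, f"{{{METS_NS}}}FLocat",
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": path})
    return ET.tostring(root, encoding="unicode")


# Hypothetical deposit: a GML file, its schema, and a metadata record.
manifest = build_manifest("asset-001", [
    ("coastline.gml", "application/gml+xml", "data"),
    ("coastline.xsd", "application/xml", "schema"),
    ("metadata.xml", "application/xml", "metadata"),
])
print(manifest)
```

The same grouping would extend to a raster with its header file or to IPR information, which is why a packaging format matters more for geospatial assets than for single-file documents.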
Issues – GML for archiving?
If content packaging asks what the best method to package data is, the next question concerns the content being packaged.
Permanent access requirements:
–profiles and application schemas that are widely understood and supported, to avoid requiring digital archaeology
–Role of GML: current focus is as a transfer format
Assessing formats for preservation: sustainability v. quality v. functionality
How to handle proprietary formats?
–Spatial databases pose a special challenge
Issues – Persistent Identifiers
Once a geospatial data asset is deposited within a repository, it needs to be persistently identifiable
Particular repository software packages use particular schemes, e.g. Fedora uses the info URI scheme
Requirement to ensure identifier is actionable
We are thinking about OpenURL resolvers and perhaps Digital Object Identifier (DOI) for handle schemes
What direction is GI industry taking with persistent identifiers?
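An identifier is "actionable" when a client can turn it into something it can fetch. The sketch below shows one way to do that mapping, assuming the public doi.org and hdl.handle.net proxy resolvers for DOIs and Handles. The `actionable_url` function and the `repository_base` address are hypothetical; an info:fedora identifier has no global resolver, so it has to be routed to whichever repository holds the object.

```python
from urllib.parse import quote

# Public proxy resolvers for DOIs and Handles.
DOI_RESOLVER = "https://doi.org/"
HANDLE_RESOLVER = "https://hdl.handle.net/"


def actionable_url(identifier,
                   repository_base="https://repository.example.org/objects/"):
    """Map a persistent identifier to a URL a browser could follow.

    repository_base is a placeholder for a local repository's object
    endpoint; it is an assumption, not a real service.
    """
    if identifier.startswith("doi:"):
        return DOI_RESOLVER + quote(identifier[len("doi:"):], safe="/")
    if identifier.startswith("hdl:"):
        return HANDLE_RESOLVER + quote(identifier[len("hdl:"):], safe="/")
    if identifier.startswith("info:fedora/"):
        pid = identifier[len("info:fedora/"):]
        return repository_base + quote(pid, safe=":")
    raise ValueError(f"Unrecognised identifier scheme: {identifier}")


# The identifiers below are made-up examples.
for pid in ("doi:10.1000/182", "hdl:1234/5678", "info:fedora/demo:1"):
    print(pid, "->", actionable_url(pid))
```

Keeping the stored identifier separate from the resolver address means the repository can move or change software without breaking citations, which is the point of persistence.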
What can JISC GWG do for GRADE?
Take an interest
Share your experiences
–examples of research data that should be shared and made available for reuse ………….
Provide input on demonstrators and guidance on user requirements ………
Thank you