Metadata and Data Management activities at CSIRO Marine Research, Australia Kim Finney & Tony Rees Divisional Data Centre CSIRO Marine Research, Hobart.

1 Metadata and Data Management activities at CSIRO Marine Research, Australia Kim Finney & Tony Rees Divisional Data Centre CSIRO Marine Research, Hobart http://www.marine.csiro.au/datacentre/

2 The Australian MarLIN Connection. Great Minds Think Alike! – Almost simultaneous emergence of UK and Australian MarLIN projects. – Different emphases but many overlapping problems. Why Are We Here? – To exchange ideas, make some new data friends and hopefully leverage off UK developments that can also address Oz marine data issues. Who Are We? – CSIRO Division of Marine Research (CMR), an Australian Commonwealth Government research agency with approximately 300 staff. One of a number of such agencies (others include AIMS and GBRMPA).

3 Orientation information... RV Franklin: oceanographic research vessel. FRV Southern Surveyor: fisheries research vessel. CMR. 16 million km² ocean territory.

4 CMR - Data Centre. Established in 1997: – 12 staff (multidisciplinary); – services the Division and two ships; – focal point for promoting a data management culture within CMR. Data Management Strategy: – developed in 1997; – outlines actions that CMR must take to move its data management practices into the 21st century; – covers policy, technology issues, data handling procedures, and standards development/adoption; – available on the Data Centre web site.

5 What Are Some Of The Issues We Face? Corporate knowledge of datasets held (internally & externally sourced). Purchase & sharing of externally sourced data. Access to & re-use of data generated by individuals. Data archiving for re-use. Coordination of external data exchange/data provision. Data pricing policies. Divisional use of WWW & database technology. Conformance with national & international standards (data exchange, data processing, data documentation). Contribution to national data management issues & activities. Data management tools (availability, development for re-use, divisional software libraries). Integration of data, records, publications and financial systems.

6 What Is Our Approach? Diagram labels: Divisional Data Policies; E.Commerce Module; Data Licensing Module; Basic WWW Metadata Directory; Hyperlinked Data Files; Hyperlinked Publications; Hyperlinked Databases; Standards.

7 Development Of CMR’s Research Database. Architecture diagram labels: Client (Java Applet, or Browser); Network Protocol (RMI, HTTP); Server: Servlet (Database Access Program); JDBC; ORACLE Database; Spatial Option. Approach diagram labels: Divisional Data Policies; E.Commerce Module; Data Licensing Module; Basic WWW Metadata Directory; Hyperlinked Data Files; Hyperlinked Publications; Hyperlinked Databases; Standards. Database content labels: Yet to be included; Video Data Catalogue; Conceptual/Physical Deployment; Project Information; Model Sources; GIS Sources; Device Sources; Time Series Data Types; Profile Data Types; Photo Data; Catch Data; Model Data; Image Data; Meteorological; Sediment; Sample Data.

8 Video Data Catalogue {long table indexing all features in the database} Conceptual/ Physical Deployment Project Information Model Sources GIS Sources Device Sources Time Series Data Types Profile Data Types Photo Data Catch Data Model Data Image Data Meteorological Data Sediment Data Sample Data

9 Concluding Remarks

10 MarLIN - Marine Laboratories Information Network and CAAB - Codes for Australian Aquatic Biota

11 Situation at CMR pre-MarLIN. Diagram labels: Centrally-held data; Derived products; CMR-produced reference works & guides; Scientific publications; project/voyage/person details; Supporting information; CAAB taxonomic database; Externally sourced data; Indexes and catalogues; Dispersed data (numerous dispersed resources).

12 MarLIN metadatabase as at July 1999: showing pointers/links to ( ) or information sourced from ( ). Diagram labels: Centrally-held data; Derived products; CMR-produced reference works & guides; Scientific publications; project/voyage/person details; Supporting information; CAAB taxonomic database; Externally sourced data; Indexes and catalogues; Dispersed data.

13 MarLIN design questions... How to make data querying, entry and maintenance easily user-accessible (but maintain metadata standards)? – Use www interfaces, but moderate user entries and updates. What information to store, in what manner? – Use ANZLIC and “Blue Pages” elements, plus additional ones as deemed useful for Divisional needs. What metadata standards, thesauri, etc. to follow? – Mostly follow ANZLIC & “Blue Pages”, with some extensions & replacements. How to handle taxon-level information? – Store taxonomic codes in MarLIN, referenced to scientific and common names from the Division’s “CAAB” taxonomic database. What about subject-based searching? – Use “MarLIN subject categories”, developed from the ASFA (R) scheme.

14 MarLIN metadatabase implementation. Oracle database, with www front end and HTML forms/Java interfaces: – www used for searching and metadata submission/metadata update, also for most administrative functions. Relational design: – aspects common to numerous records (e.g. project, voyage, person information) are stored in separate tables. Data entry and update is via user logon (restricted to users on the CMR computer domain): – enterer details, time, etc. are automatically logged and added to the record on submission. “Submitted” records reside in separate (parallel) tables until approved by the database administrator. A nightly script runs to generate CMR’s “Blue Pages” entries from MarLIN metadata records.
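The moderation step described on this slide (submitted records held in parallel tables until an administrator approves them) can be sketched roughly as follows. This is a minimal illustration only: it uses Python's built-in sqlite3 module as a stand-in for the Oracle database, and the table names, column names and function names are all hypothetical rather than taken from MarLIN.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: a live table plus a parallel "submitted" table
# with the same columns, as an assumption about how staging might look.
SCHEMA = """
CREATE TABLE dataset (
    id INTEGER PRIMARY KEY, title TEXT NOT NULL, abstract TEXT,
    entered_by TEXT, entered_at TEXT);
CREATE TABLE dataset_submitted (
    id INTEGER PRIMARY KEY, title TEXT NOT NULL, abstract TEXT,
    entered_by TEXT, entered_at TEXT);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database holding both tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def submit(conn: sqlite3.Connection, user: str, title: str, abstract: str) -> int:
    """Store a new record in the staging table, logging who entered it and when."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO dataset_submitted (title, abstract, entered_by, entered_at) "
            "VALUES (?, ?, ?, ?)",
            (title, abstract, user, stamp),
        )
    return cur.lastrowid


def approve(conn: sqlite3.Connection, submitted_id: int) -> int:
    """Move one staged record into the live table; return its new id."""
    row = conn.execute(
        "SELECT title, abstract, entered_by, entered_at "
        "FROM dataset_submitted WHERE id = ?",
        (submitted_id,),
    ).fetchone()
    if row is None:
        raise LookupError(f"no submitted record with id {submitted_id}")
    with conn:  # copy and delete happen in a single transaction
        cur = conn.execute(
            "INSERT INTO dataset (title, abstract, entered_by, entered_at) "
            "VALUES (?, ?, ?, ?)",
            row,
        )
        conn.execute("DELETE FROM dataset_submitted WHERE id = ?", (submitted_id,))
    return cur.lastrowid


def count(conn: sqlite3.Connection, table: str) -> int:
    """Count rows in one of the two known tables."""
    if table not in ("dataset", "dataset_submitted"):
        raise ValueError(table)
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

Keeping staged and live records in separate tables means that searches against the live table never see unapproved entries, at the cost of maintaining two table definitions in step.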

15 MarLIN metadata elements (# = “Blue Pages” extension to ANZLIC standard, * = new element added for MarLIN)
Dataset...: Title; * Identifier/Short Title; # Data Type; Custodian Organisation; * Contributors; * Acknowledgements; # References; * Publication Date; Abstract; * Author's Comments; On-Line Links (Data, Graphics, Documentation); Location Keywords; Bounding Coordinates
Subject Categories and Search Words: * MarLIN Subject Categories; # Habitat Keywords; # Taxonomy Keywords; * CAAB Species Codes; # Parameters Measured; # Equipment Used; # Blue Pages Themes; ANZLIC Search Words
Project, vessel and voyage details: # Originating Project Name; * Project Details; # Platform/Vessel Name; * Voyage Identifier; * Voyage Details
Data Currency and Status: Date range (Beginning and End Dates); Progress; Maintenance
Data Access: Stored Data Format(s); * Stored Data Volume; * Stored Data Location; * Specific Data Location; * Specific Software Requirements; * Stored Data Documentation; Available Format Type(s); Access Constraints
Data Quality: Data Source, Processing, and Quality Control; * GIS Datum and scale used (if relevant); Logical Consistency Report; Positional Accuracy; Parameter Accuracy; Completeness
Contact point: Contact Person and Details
Metadata Information: * Related MarLIN Datasets; Additional Metadata; * Metadata Availability; Metadata Created On/By... (date, person); * Metadata Last Updated On/By... (date, person)

16 Aspects of MarLIN “Search” interface...

17 Example search results: lists of titles; summary information; links to voyage tracks.

18

19 External MarLIN linkages (July 1999). Diagram labels: Hyperlinks to documents, data, etc.; Selected details exported to...; Online link back to...; Internet search engines; “Blue Pages” HTML documents (many organisations’ records); MarLIN database (CMR’s records); Blue Pages search facility; MarLIN search facility.

20 MarLIN continuing development... Incorporate “live” links to other databases, e.g. CAAB, CMR corporate databases, library systems. Increase data coverage; try to maintain currency and consistency of entries. Continue to “sell the concept” for users to document their own data. Make a “view” of MarLIN records visible to ASDD. Possible future links with metadata systems based on other standards, using “crosswalks”. MarLIN v.2 to be developed in c. 12 months … closely integrated with new Divisional data storage system (with parallel development of interfaces etc., automated retrieval of data as well as metadata).
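A “crosswalk” of the kind mentioned on this slide is essentially a mapping from one standard's element names to another's. The sketch below is an assumption-laden illustration: the slides do not say which other standards were under consideration, so Dublin Core element names are used here purely as an example target, and the particular pairings are hypothetical rather than an agreed mapping.

```python
# Hypothetical crosswalk from MarLIN element names (taken from the element
# list in this presentation) to Dublin Core element names. The choice of
# Dublin Core and the pairings themselves are illustrative assumptions.
MARLIN_TO_DC = {
    "Title": "title",
    "Abstract": "description",
    "Custodian Organisation": "publisher",
    "Contributors": "contributor",
    "Publication Date": "date",
    "Access Constraints": "rights",
}


def crosswalk(record: dict, mapping: dict) -> tuple:
    """Translate a metadata record into the target standard.

    Returns (mapped, unmapped): mapped holds target element -> list of values,
    unmapped holds the source elements that have no equivalent in the mapping.
    """
    mapped = {}
    unmapped = {}
    for element, value in record.items():
        target = mapping.get(element)
        if target is None:
            unmapped[element] = value
        else:
            mapped.setdefault(target, []).append(value)
    return mapped, unmapped


# Example record with made-up values.
example = {
    "Title": "Example voyage dataset",
    "Abstract": "Made-up abstract for illustration.",
    "CAAB Species Codes": "37 001001",
}
mapped, unmapped = crosswalk(example, MARLIN_TO_DC)
```

Returning the unmapped elements separately makes visible what would be lost in translation, which is usually the main difficulty when linking systems built on different standards.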

21 MarLIN present ( ) and future ( ) operation. Diagram labels: Centrally-held data; Derived products; CMR-produced reference works & guides; Scientific publications; project/voyage/person details; Supporting information; CAAB taxonomic database; Externally sourced data; Indexes and catalogues; Dispersed data.

22 CAAB Codes for Australian Aquatic Biota http://www.marine.csiro.au/caab/

23 Example CAAB codes (hammerhead sharks) (dogfishes)

24 CAAB rationale/historic reasons for existence. Taxonomists needed a tool for organising specimen collections and supporting information. Field biologists needed a tool for rapid data entry (to include categories corresponding to “non orthodox groups”). Data custodians needed a system for storing taxon-related information in a long-term, stable form (independent of future name changes). Use of “intelligent” codes permits rapid human- or computer-based sorting of taxa, and retrieval of supporting information.

25 CAAB implementation. CAAB has 47 “major categories” (e.g. fish, mammals, Algae - Phaeophyta, angiosperms), each with up to 999,999 available codes for allocation to Australian aquatic taxa. Coverage of Australian fish species (c. 4,500) is essentially complete, as is coverage of some smaller groups (marine reptiles and mammals). Other categories are populated on an “as needs” basis (e.g. 300 molluscs, 350 crustaceans, 60 angiosperms, plus ongoing additions). The 2-digit prefix (category code) and 3-digit family code are machine-sortable, e.g.: – 37 = fish; 37 001 = fish family 1; 37 001001 = fish family 1, species 1; – families are in contiguous blocks, e.g. families 37 005 to 37 024 are all types of sharks. The numeric code is attached to the taxon, independent of changes of scientific or common name (gives relative stability for data storage). The master CAAB database stores taxon/voucher specimen details, present and any previous scientific names, common names, comments and other information.
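The code layout described on this slide (a 2-digit category, a 3-digit family, and three further digits for the species) lends itself to a short parsing and sorting sketch. The class and function names below are hypothetical, and the code 37 019004 is an arbitrary value chosen only because it falls in the 37 005 to 37 024 shark block; it is not claimed to be a real CAAB entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class CaabCode:
    """An 8-digit code split into the three parts described on the slide."""
    category: int  # 2-digit major category, e.g. 37 = fish
    family: int    # 3-digit family number within the category
    species: int   # 3-digit species number within the family

    def __str__(self) -> str:
        # Same layout as the slide's example, "37 001001".
        return f"{self.category:02d} {self.family:03d}{self.species:03d}"


def parse_caab(text: str) -> CaabCode:
    """Parse a code such as '37 001001' or '37001001', ignoring whitespace."""
    digits = "".join(ch for ch in text if not ch.isspace())
    if len(digits) != 8 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected 8 digits, got {text!r}")
    return CaabCode(int(digits[:2]), int(digits[2:5]), int(digits[5:]))


def is_shark(code: CaabCode) -> bool:
    """True for fish families 37 005 to 37 024, the shark block on the slide."""
    return code.category == 37 and 5 <= code.family <= 24


# Sorting by (category, family, species) keeps related taxa together.
codes = [parse_caab(c) for c in ("37 019004", "37 001001", "37005002")]
ordered = [str(c) for c in sorted(codes)]
```

Because the dataclass is declared with `order=True`, sorting compares category first, then family, then species, which gives the same order as sorting the zero-padded code strings.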

26 Present usage of CAAB information. Diagram labels: CMR-produced reference works & guides; CAAB taxonomic database; CAAB-generated species lists; Other organisations’ databases; CMR databases (including MarLIN); used in...; generates...; Quoted in...

27 Intended future CAAB operation. Diagram labels: Links to on-line information; CAAB taxonomic database; CAAB species lists - on-line generation; CAAB www interface; CAAB taxon-level report; Additional search facilities - e.g. MarLIN, other CMR databases, ITIS, www, etc.; Users’ databases.

28 CAAB continuing tasks... Taxon-level information from other local databases to be incorporated into CAAB (coverage will gradually be extended to most groups of aquatic organisms). Database structure will be improved to suit external www user access to the database. Species common names to be handled in a structured way, permitting user-definable output formats, more comprehensive searching, etc. Hyperlinks will be incorporated to electronic versions of maps, images, etc. as available. On-line links to other databases from CAAB will be enabled (and vice versa).

29 Selected data and metadata developments elsewhere in Australia: on-line data, data products, and summaries; collection-based information; on-line references; other metadata systems.

