Shoaib Sufi CCLRC e-Science Centre CCLRC Scientific Metadata (CSMD) Model April 2004 NESC.

1 Shoaib Sufi CCLRC e-Science Centre CCLRC Scientific Metadata (CSMD) Model April 2004 NESC

2 Model Motivation
A common general format/standard for the metadata of Scientific Studies and their data holdings does not exist
By proposing a Model and an Implementation:
– Form a specification for the types of metadata that should be captured by Scientific Studies
– Ease citation, collaboration, exploitation and integration
– Allow easy integration of distributed heterogeneous metadata systems into a homogeneous (albeit virtual) platform

3 Structure of Metadata Model
The CCLRC Scientific Metadata Model (CSMD) is a study-dataset orientated model:
– Indexing
– Provenance
– Data Description
– Data Location
– Access Conditions
– Related Material

4 What influenced CSMD
CIP from Earth Observation
DDI from Social Sciences
DublinCore from the Library community
– Publication only metadata
XSIL as used on LIGO
– Low level Scientific Data Objects focus
CERA from the MPIM
– A bit specific to Earth Sciences but close
… hence the need to develop our own General Model: the CCLRC Scientific Metadata Model

5 Some Model aims
Abstract, class-orientated description of the types of metadata that should be captured by Scientific Studies
Create a common denominator for Scientific Study metadata which forms a specification
At the metadata workshop at NIEES in 2002, a discussion on metadata standards asked whether people are capturing metadata at the moment; the simple answer given was no!

6 CSMD Used on DataPortal
XML Implementation used as Data Interface for DataPortal
Single view of heterogeneous systems/schemas
Acts as a stress test of the model
– Limitations feed into Model Requirements
– New requirements fed back into implementation

7 Model Breakdown: Provenance
The Study contains the following metadata:
– The Study Name
– The Study Institution
– The Investigator
– Extended Study Information: Abstract, Funding, Start and End times
– Investigations

8 Investigations
A Study can have more than one investigation; possible investigation types include experiment, simulation, measurement, etc. Each investigation contains:
– Name
– Investigation Type
– Abstract
– Resource
– Link to DataHolding
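The Study and Investigation slides can be read as a pair of record types. The following is a minimal sketch, assuming Python dataclasses; the class and field names are illustrative choices, not names taken from the CSMD specification or its XML implementation.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class InvestigationType(Enum):
    # Values named on the slide; a real vocabulary may contain more.
    EXPERIMENT = "experiment"
    SIMULATION = "simulation"
    MEASUREMENT = "measurement"


@dataclass
class Investigation:
    name: str
    investigation_type: InvestigationType
    abstract: str = ""
    resource: str = ""                       # free text here; exact meaning is assumed
    data_holding_ref: Optional[str] = None   # link to the DataHolding


@dataclass
class Study:
    name: str
    institution: str
    investigator: str
    # Extended study information
    abstract: str = ""
    funding: str = ""
    start_time: Optional[str] = None
    end_time: Optional[str] = None
    # A study can have more than one investigation.
    investigations: List[Investigation] = field(default_factory=list)


# Hypothetical example record.
study = Study(name="Example study", institution="Example institute",
              investigator="A. N. Other")
study.investigations.append(Investigation("Run 1", InvestigationType.EXPERIMENT))
study.investigations.append(Investigation("Model 1", InvestigationType.SIMULATION))
```

Keeping the investigation type as an enumeration, rather than free text, mirrors the slide's remark that the possible types are enumerated.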

9 Topic (for indexing)
Keywords
– Discipline (i.e. domain)
– Keyword Source (e.g. domain dictionary)
– Keyword
Subjects
– Discipline
– Subject Source (e.g. domain taxonomy)
– Subject
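As a rough illustration of the Topic structure, the sketch below stores keywords and subjects as (discipline, source, term) triples and looks terms up by discipline. The type names, the `terms` helper and the sample values are assumptions made for this example only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Keyword:
    discipline: str   # i.e. domain
    source: str       # e.g. a domain dictionary
    keyword: str


@dataclass(frozen=True)
class Subject:
    discipline: str
    source: str       # e.g. a domain taxonomy
    subject: str


@dataclass
class Topic:
    keywords: List[Keyword] = field(default_factory=list)
    subjects: List[Subject] = field(default_factory=list)

    def terms(self, discipline: str) -> List[str]:
        """Return all keywords, then all subjects, recorded for one discipline."""
        found = [k.keyword for k in self.keywords if k.discipline == discipline]
        found += [s.subject for s in self.subjects if s.discipline == discipline]
        return found


# Hypothetical sample values.
topic = Topic(
    keywords=[Keyword("physics", "example dictionary", "neutron scattering")],
    subjects=[Subject("physics", "example taxonomy", "condensed matter"),
              Subject("chemistry", "example taxonomy", "catalysis")],
)
```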

10 Access Condition & Related Material
Access Conditions
– Contains a list of users or groups who are allowed access to the metadata and data, or a pointer to an access control system which holds this information for the study
Related Material
– One or more links and/or textual descriptions of material related to this study, e.g. earlier studies or parallel studies
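The two forms of access condition (an inline list of users or groups, or a pointer to an external access control system) could be sketched as follows. This is an illustration under assumed names; `access_control_uri` and `permits` are hypothetical and not part of any published CSMD interface.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class AccessCondition:
    allowed_users: Set[str] = field(default_factory=set)
    allowed_groups: Set[str] = field(default_factory=set)
    # Hypothetical pointer to an external access control system.
    access_control_uri: Optional[str] = None

    def is_delegated(self) -> bool:
        """True when the decision belongs to an external system."""
        return self.access_control_uri is not None

    def permits(self, user: str, groups: Set[str]) -> bool:
        """Check the inline lists; refuse to guess when access is delegated."""
        if self.is_delegated():
            raise LookupError("ask the external access control system instead")
        return user in self.allowed_users or bool(self.allowed_groups & groups)


# Hypothetical users and groups.
inline = AccessCondition(allowed_users={"alice"}, allowed_groups={"beamline-team"})
delegated = AccessCondition(access_control_uri="https://example.org/acl/study-1")
```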

11 Data
Data Description holds a logical description of the Study's data:
– Data Name
– Type of Data
– Status
– Data Topic
– Parameters
– Related Data Ref
– Relation type (e.g. derived)
Data Location contains the link between the logical name and physical URIs:
– Data Name
– Locator(s)
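The split between a logical description and a physical location can be sketched as a lookup from data name to one or more locators. The names below (`DataDescription`, `DataLocation`, `resolve`) and the example URIs are assumptions for illustration, not the model's own identifiers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataDescription:
    data_name: str                          # logical name
    data_type: str
    status: str
    related_data_ref: Optional[str] = None  # logical name of related data
    relation_type: Optional[str] = None     # e.g. "derived"


@dataclass
class DataLocation:
    # Maps a logical data name to its physical URIs.
    locators: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, data_name: str, uri: str) -> None:
        self.locators.setdefault(data_name, []).append(uri)

    def resolve(self, data_name: str) -> List[str]:
        """Return a copy of the locators for a name (empty if unknown)."""
        return list(self.locators.get(data_name, []))


# Hypothetical example: one raw dataset and one dataset derived from it.
raw = DataDescription("run-001-raw", "raw", "complete")
reduced = DataDescription("run-001-reduced", "processed", "complete",
                          related_data_ref="run-001-raw", relation_type="derived")
location = DataLocation()
location.add("run-001-raw", "ftp://example.org/data/run-001.raw")
location.add("run-001-raw", "file:///archive/run-001.raw")
```

Because the description only ever refers to the logical name, a dataset can be moved or replicated by editing the locators alone.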

12 More on Parameters
Parameters contain a lot of information about the data objects (DO) and collections
A collection/DO can have many parameter entries; each parameter entry contains:
– Parameter derivation (e.g. measured/fixed)
– The value
– The units
– Range
– Error margin
Parameter aggregation is also supported
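A parameter entry might be represented as below. This is a sketch only: the slide does not define how aggregation works, so modelling it as a parameter that holds child parameters is an assumption, as are all field names and the sample values.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Parameter:
    name: str
    derivation: str                                    # e.g. "measured" or "fixed"
    value: Optional[float] = None
    units: str = ""
    value_range: Optional[Tuple[float, float]] = None  # (low, high)
    error_margin: Optional[float] = None
    # Assumed form of aggregation: a parameter grouping other parameters.
    children: List["Parameter"] = field(default_factory=list)

    def in_range(self) -> bool:
        """True when no range or value is recorded, or the value lies inside the range."""
        if self.value is None or self.value_range is None:
            return True
        low, high = self.value_range
        return low <= self.value <= high


# Hypothetical sample values.
temperature = Parameter("temperature", "measured", value=293.0, units="K",
                        value_range=(0.0, 500.0), error_margin=0.5)
environment = Parameter("sample_environment", "fixed", children=[temperature])
```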

13 Cardinality Issues
The model recommends a certain cardinality of elements
Certain metadata components are necessary for one to have an instance of the implemented model; treating everything as optional is not acceptable
It is thought that implementations may adapt the cardinalities to their own needs; the model attempts to remain ideal (i.e. to state the most common cardinality)
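One way an implementation could enforce recommended cardinalities is a small table of (minimum, maximum) counts checked against each record. The element names and the limits below are hypothetical; the slides do not list the model's actual cardinalities.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical cardinalities as (min, max); None means unbounded.
CARDINALITY: Dict[str, Tuple[int, Optional[int]]] = {
    "study_name": (1, 1),
    "investigator": (1, None),
    "investigation": (1, None),
    "related_material": (0, None),
}


def check_cardinality(record: Dict[str, List[str]]) -> List[str]:
    """Return one message per element whose count is outside its limits."""
    problems = []
    for element, (low, high) in CARDINALITY.items():
        count = len(record.get(element, []))
        if count < low:
            problems.append(f"{element}: expected at least {low}, found {count}")
        elif high is not None and count > high:
            problems.append(f"{element}: expected at most {high}, found {count}")
    return problems


# Hypothetical records.
good = {"study_name": ["Example"], "investigator": ["A. N. Other"],
        "investigation": ["Run 1", "Run 2"]}
bad = {"study_name": ["A", "B"]}
```

An implementation that needs different limits would change only the table, which matches the slide's point that the model states the common case and implementations may adapt it.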

14 Enumeration Issues
Enumerations (or controlled vocabularies), e.g. types of investigator or types of institution, are distinct from the model, just as taxonomies are
However, they are necessary for the model to work, so implementations (e.g. the CCLRC DataPortal XML implementation of the model) propose some enumerations for common things
It is hoped that implementations will use recognised and relevant controlled vocabularies where they are available
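Keeping the vocabularies outside the model can be illustrated by storing them in a separate lookup that an implementation swaps for a recognised vocabulary when one exists. The vocabulary names and most of the terms below are invented for this sketch; only the investigation types come from the slides.

```python
from typing import Dict, FrozenSet

# Vocabularies live apart from the model's structure, so they can be replaced.
VOCABULARIES: Dict[str, FrozenSet[str]] = {
    "investigation_type": frozenset({"experiment", "simulation", "measurement"}),
    # Hypothetical terms, for illustration only.
    "institution_type": frozenset({"university", "research laboratory"}),
}


def is_valid_term(vocabulary: str, term: str) -> bool:
    """Check a term against a named vocabulary, ignoring case."""
    allowed = VOCABULARIES.get(vocabulary)
    if allowed is None:
        raise KeyError(f"unknown vocabulary: {vocabulary}")
    return term.lower() in allowed
```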

15 Conformance Level
For a complete study-dataset metadata record, a large amount of metadata has to be stored/processed
So it is useful to have conformance levels
The model uses 5 levels
Each level specifies that more metadata (and indexing information) should be held

16 Level 1
Type of information captured:
– Study and Investigation metadata with indexing at the Study level
Level 1 metadata is similar to library/publication style metadata (e.g. DublinCore)

17 Level 2
Type of information captured:
– Level 1 + DataHolding metadata (i.e. DataSets and DataObjects)

18 Level 3
Type of information captured:
– Level 2 + related material, access condition, indexing to data collection level

19 Level 4
Type of information captured:
– Level 3 + indexing to data object level and data object parameter information

20 Level 5
Type of information captured:
– All metadata components are filled: Level 4 + funding, resources used, facilities used, etc.
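Since each level adds components to the one before it, a record's conformance level can be worked out as the highest level whose components, and those of every lower level, are all present. The sketch below assumes that cumulative reading; the component names are hypothetical labels for the items listed on the level slides.

```python
from enum import IntEnum
from typing import Dict, FrozenSet, Set


class Conformance(IntEnum):
    NONE = 0
    L1 = 1
    L2 = 2
    L3 = 3
    L4 = 4
    L5 = 5


# Components each level adds on top of the previous one (labels are hypothetical).
ADDED_AT: Dict[Conformance, FrozenSet[str]] = {
    Conformance.L1: frozenset({"study", "investigation", "study_indexing"}),
    Conformance.L2: frozenset({"data_holding"}),
    Conformance.L3: frozenset({"related_material", "access_condition",
                               "collection_indexing"}),
    Conformance.L4: frozenset({"object_indexing", "object_parameters"}),
    Conformance.L5: frozenset({"funding", "resources_used", "facilities_used"}),
}


def conformance_level(present: Set[str]) -> Conformance:
    """Return the highest level whose requirements are all met, cumulatively."""
    achieved = Conformance.NONE
    for level in sorted(ADDED_AT):
        if not ADDED_AT[level] <= present:
            break  # a gap at this level caps the result
        achieved = level
    return achieved


level_two = {"study", "investigation", "study_indexing", "data_holding"}
partial_three = level_two | {"collection_indexing"}
```

A record that has only some Level 3 components, such as `partial_three`, is still reported as Level 2 under this reading.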

21 Conformance Levels
L1 is similar to library/publication style metadata (e.g. DublinCore)
The current DataPortal sits somewhere between L2 and L3: indexing is at study level, moving towards collection level, but with parameter information
It is envisaged that only new systems designed with CSMD will conform to L4 or above
Benefit of conformance levels: the higher the level of conformance to the CSMD, the richer the clients that operate on the data can be
– e.g. identifying datasets and objects which link directly to keywords/taxonomies, and not just studies

23 Facilities using CSMD
CCLRC Facilities (via CCLRC DataPortal):
– ISIS: Neutron Spallation at Rutherford Appleton Laboratory (test)
– SR: Synchrotron Radiation source at Daresbury Laboratory (test)
– British Atmospheric Data Centre (BADC) at RAL (prototype)
External Facilities (via CCLRC DataPortal):
– Max-Planck-Institut für Meteorologie (MPIM) in Hamburg
External Projects using CSMD:
– NERC funded E-mineral environment from the molecular level
– EPSRC funded E-materials project
– Manchester MyGrid project uses an adapted version
– ISIS (RAL) have taken data needs in-house and use a model based heavily on CSMD

24 The Future
Increased use of, and recommendation for the use of, controlled vocabularies
Increased support for formal identification systems
Feeding in relevant ideas from other standards
Update the XML and Relational implementations so they more closely track the model
Look into internationalisation issues and see if these affect the model or the implementations

25 More information
Latest Model description
– http://www-dienst.rl.ac.uk/library/2002/tr/dltr pdf
For an XML implementation and Relational Implementation, newer draft of the model documentation with the subject containing [metadata model

