
1 Ocean Observatories Initiative Data Management (DM) Subsystem Overview Michael Meisinger September 29, 2009

2 Outline (OOI CI Kick-Off Meeting, Sept 9-11, 2009)
Subsystem Architecture Overview
Scope of Release 1
Selected Components
–Data Distribution based on the Exchange
–Data Store as a Service

3 Data Distribution w/ Exchange
Context of DM within CI
Exchange handles data distribution

4 Data Processing and Availability
Multiple aspects of data management
Data processing and analysis at various levels of abstraction
Data distribution critical to global scientific research

5 Requirements
Focus on high-risk requirements
The CI shall implement an OOI-standard metadata model for resources
–The OOI-standard metadata model shall support a description of physical resource behavior
–The OOI-standard metadata model shall support a description of physical resource content
–The OOI-standard metadata model shall support a syntactic description for the content of an information resource
–The OOI-standard metadata model shall support a semantic description for the content of an information resource
–The OOI-standard metadata model shall support tracking of resource provenance
–The OOI-standard metadata model shall support tracking of quality
–The OOI-standard metadata model shall support tracking of context
–The OOI-standard metadata model shall support tracking of correspondence
–The OOI-standard metadata model shall support tracking of citation
–The OOI-standard metadata model shall support tracking of lineage
–The OOI-standard metadata model shall be extensible
The CI shall provide semantic services to support ontological representations and relationships
–The semantic services shall utilize domain-specific vocabularies
–A user interface to define vocabulary terms shall be provided
–The vocabularies shall be extensible
–The semantic services shall recommend new terms to enter into the vocabulary
–The semantic services shall implement an ontological language
–The semantic services shall implement an ontological engine
The CI shall provide persistent archive services
–The persistent archive services shall be data format agnostic
–The persistent archive services shall be subject to policy
–The persistent archive services shall preserve all associations between data and metadata
–The persistent archive services shall ingest data independent of delivery order
–The persistent archive services shall guarantee the integrity of archived data
–The persistent archive services shall support distributed data repositories
–The persistent archive services shall support federation
–The persistent archive services shall support data versioning
–The persistent archive services shall acknowledge requests for data and provide an estimate for response time

6 Scope of Release 1
Common data and metadata model
–Resource metadata, behavior, lifecycle, content, provenance, lineage, citation, quality, context, correspondence
–Extensible vocabularies and ontologies
–Data formats (syntax and semantics)
Dynamic data distribution services
–Pub/sub, topics, processing chaining, sequestration
Data catalog and repository
–Discovery, metadata management
Persistent archive services
–Repository management, common repository framework, ingestion services, long-term archival
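
The pub/sub, topic, and processing-chaining items listed under dynamic data distribution can be illustrated with a minimal in-memory sketch. This is not the OOI Exchange interface: `Exchange`, `chain`, the topic names, and the scaling factor are all hypothetical, chosen only to show one stage consuming a topic and republishing its result on another.

```python
from typing import Any, Callable, Dict, List

Handler = Callable[[Any], None]


class Exchange:
    """Hypothetical in-memory stand-in that routes messages by topic name."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = {}

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Register a handler that is called for every message on the topic.
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, message: Any) -> None:
        # Deliver the message to each current subscriber of the topic.
        for handler in list(self._subscribers.get(topic, [])):
            handler(message)


def chain(exchange: Exchange, source: str, target: str,
          transform: Callable[[Any], Any]) -> None:
    """Processing chaining: consume `source`, republish the result on `target`."""
    exchange.subscribe(
        source, lambda message: exchange.publish(target, transform(message))
    )


def run_demo() -> List[Dict[str, float]]:
    exchange = Exchange()
    received: List[Dict[str, float]] = []
    # Hypothetical processing step: scale raw instrument counts.
    chain(exchange, "ctd.raw", "ctd.scaled",
          lambda m: {"temperature_c": m["counts"] * 0.5})
    exchange.subscribe("ctd.scaled", received.append)
    exchange.publish("ctd.raw", {"counts": 40})
    exchange.publish("ctd.other", {"counts": 1})  # no subscribers, dropped
    return received
```

The subscriber only knows the topic name, not the publisher, which is the decoupling that lets processing steps be inserted between producer and consumer.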

7 DM Functional Components
DX Prototype
Data Exchange (DX) prototype barely touches the Ingestion/Transformation/Exchange/Preservation in the context of a data distribution model
DX strongly informs further refinements of the DM architecture and technology choices

8 Information Container Model
Encapsulates all kinds of information resources, such as scientific data, user identities, process definitions, virtual machine images, etc.
Multiple levels of metadata
Separation of concerns between Information services

9 Ingestion
Provides basic mechanisms for identifying the data streams and formats, parsing the content and identifying the associated metadata, adding version information, and registering the streams with an ISN Repository
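
The ingestion steps named on this slide (identify the format, parse the content, derive metadata, add a version, register the stream) can be sketched roughly as follows. Everything here is an assumption made for illustration: the `Repository` class is an in-memory placeholder rather than the ISN Repository, the format detection only distinguishes JSON from CSV, and the metadata fields are invented.

```python
import csv
import io
import json
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class IngestedStream:
    stream_id: str
    data_format: str
    records: List[Dict[str, Any]]
    metadata: Dict[str, Any]
    version: int


class Repository:
    """Hypothetical in-memory repository that keeps every registered version."""

    def __init__(self) -> None:
        self._streams: Dict[str, List[IngestedStream]] = {}

    def next_version(self, stream_id: str) -> int:
        return len(self._streams.get(stream_id, [])) + 1

    def register(self, stream: IngestedStream) -> None:
        self._streams.setdefault(stream.stream_id, []).append(stream)

    def latest(self, stream_id: str) -> IngestedStream:
        return self._streams[stream_id][-1]


def identify_format(payload: str) -> str:
    # Assumption: payloads are either JSON or CSV text.
    return "json" if payload.lstrip().startswith(("{", "[")) else "csv"


def parse(payload: str, data_format: str) -> List[Dict[str, Any]]:
    if data_format == "json":
        parsed = json.loads(payload)
        return parsed if isinstance(parsed, list) else [parsed]
    return [dict(row) for row in csv.DictReader(io.StringIO(payload))]


def ingest(repo: Repository, stream_id: str, payload: str) -> IngestedStream:
    data_format = identify_format(payload)            # identify format
    records = parse(payload, data_format)             # parse content
    metadata = {                                      # derive simple metadata
        "fields": sorted(records[0]) if records else [],
        "record_count": len(records),
    }
    version = repo.next_version(stream_id)            # add version information
    stream = IngestedStream(stream_id, data_format, records, metadata, version)
    repo.register(stream)                             # register with repository
    return stream
```

Calling `ingest` twice with the same `stream_id` yields versions 1 and 2, and `Repository.latest` returns the most recently registered one.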

10 Ingestion Service Data Model
Relationship between the constituents of the Ingestion Service and the Information Container Model

11 Transformation Service Data Model
Relationship between the constituents of the Transformation Service and the Information Container Model

12 Preservation Service Data Model
Relationship between the constituents of the Preservation Service and the Information Repository Model

13 Scientific Data Transport
Currently DAP as canonical form
As DAP evolves, Unidata’s CDM may be its successor*
–OpenDAP
–netCDF
–HDF5
* Comparison available at: http://wiki.opendap.org/twiki/bin/view/Developers/ModelSummary

14 Data Store as Service
Exchange makes data transport possible and physical location of data becomes transparent to application
Storage mechanisms abstracted to improve flexibility
Ability to choose the best technology for the available platform that fits the intended purpose
Multiple different storage “back-ends” possible
Attribute Store prototype as the predecessor to a storage architecture

15 Attribute Store
Generic repository of information organized around key + value pairs, intended to provide fast, reliable data storage and retrieval for lightweight data elements (not a full-blown SQL engine).
Decomposition:
–Command Processor – interfaces with other OOI entities and abstracts from Repository technology
–Repository – stores the actual content using the best technology available for the selected platform
–Specification – describes Repository and how to store/retrieve/match elements to/from Repository
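
The three-part decomposition above can be sketched as a small key/value store. This is only an illustration under stated assumptions, not the prototype's design: the slides name a WRITE command, while the READ command, the method names, the namespace-based `Specification`, and the in-memory back-end are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


class Repository(Protocol):
    """Back-end contract; any storage technology could implement it."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> Optional[str]: ...


class InMemoryRepository:
    """Hypothetical back-end that keeps the content in a plain dict."""

    def __init__(self) -> None:
        self.items: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self.items[key] = value

    def get(self, key: str) -> Optional[str]:
        return self.items.get(key)


@dataclass(frozen=True)
class Specification:
    """Hypothetical description of how elements map to repository keys."""

    namespace: str

    def storage_key(self, key: str) -> str:
        return f"{self.namespace}/{key}"


class CommandProcessor:
    """Accepts commands from callers and hides the repository technology."""

    def __init__(self, repository: Repository, specification: Specification) -> None:
        self._repository = repository
        self._specification = specification

    def execute(self, command: str, key: str,
                value: Optional[str] = None) -> Optional[str]:
        storage_key = self._specification.storage_key(key)
        name = command.upper()
        if name == "WRITE":
            if value is None:
                raise ValueError("WRITE requires a value")
            self._repository.put(storage_key, value)
            return None
        if name == "READ":  # assumed command, not named in the slides
            return self._repository.get(storage_key)
        raise ValueError(f"unknown command: {command}")
```

Because callers only see `CommandProcessor.execute`, swapping `InMemoryRepository` for another back-end does not change client code, which is the abstraction the slide describes.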

16 Attribute Store - Design
Fundamental Interaction Pattern
Internal Interaction Pattern for the WRITE Cmd.
Command Set

17 Data Representation
Data Representation/Encoding Standards
–Processing
–Transport
–Storage
Many choices… with overlapping capabilities

18 Technology Mapping

19 Thanks!

20 DM Components
Base is DM FDR presentation
Data Distribution based on the Exchange
–Data Exchange architecture after services OV2 slide as example for a data distribution model (vs. storage model, the older model); real architecture has not been chosen; DX strongly informs. Covers Ingestion, Transformation, Preservation in the context of a data distribution model
–DAP as canonical form for transport of data. For given streams there are canonical forms (e.g. DAP), but not for the system in general (i.e. a database). That’s why we chose the new model. Be aware that the underlying data model of DAP is in evolution. Unidata CDM. Insert a few references to these models.
–Reference to encoding formats, FIPA header
–Query against the past (e.g. archive query) or the future (e.g. subscriptions). Pointer to SQLstream prototype
Data Store as a Service
–Attribute store as the predecessor to a storage architecture
–Model, commands
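
The note about querying "against the past" or "the future" can be made concrete with a small sketch in which the same predicate serves both an archive query and a subscription. This is an assumption-laden illustration and says nothing about how the SQLstream prototype works: `DataStream`, `query_past`, `subscribe`, and the record layout are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]
Predicate = Callable[[Record], bool]
Callback = Callable[[Record], None]


class DataStream:
    """Hypothetical stream that archives records and notifies subscribers."""

    def __init__(self) -> None:
        self._archive: List[Record] = []
        self._subscriptions: List[Tuple[Predicate, Callback]] = []

    def query_past(self, predicate: Predicate) -> List[Record]:
        # Query against the past: filter what has already been archived.
        return [record for record in self._archive if predicate(record)]

    def subscribe(self, predicate: Predicate, callback: Callback) -> None:
        # Query against the future: keep the predicate and apply it on arrival.
        self._subscriptions.append((predicate, callback))

    def append(self, record: Record) -> None:
        self._archive.append(record)
        for predicate, callback in list(self._subscriptions):
            if predicate(record):
                callback(record)


def is_warm(record: Record) -> bool:
    # Hypothetical predicate shared by both kinds of query.
    return record["temp"] > 5.0
```

A client would call `query_past(is_warm)` for existing matches and `subscribe(is_warm, handler)` to be notified of later ones, so one predicate definition covers both cases.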

21 FIPA
Provides valuable models for
–Communication patterns
–Message structure

22 Subsystem
Data and Information Access
–Search & Navigation
–External observatory access (IOOS, Neptune Canada, …)
Transformation and Mediation
–Attribution & Association
–Aggregation
–Syntactical Transformation
–Ontology-based mediation between vocabularies
Dynamic Data/Information Distribution
–Persistent Archive
–Information Catalog & Repository

