The Federated Data System DataFed: Experiences in Data Homogenization and Networking R.B. Husar, K. Hoijarvi, S. R. Falke, E. M. Robinson, Washington University,


The Federated Data System DataFed: Experiences in Data Homogenization and Networking
R.B. Husar, K. Hoijarvi, S. R. Falke, E. M. Robinson, Washington University, St. Louis
G. Leptoukh, NASA GSFC
Spring AGU, May 29, 2008, Ft. Lauderdale

DataFed in a Nutshell:
- A federation of autonomous, distributed data providers
- Performs non-intrusive wrapping of data into web services
- Provides service-based analysis services and tools

General Experience with DataFed:
- It is an agile virtual data system that can deliver info products to diverse users
- Third-party mediation can homogenize distributed data on the fly
- Since 2005, DataFed has been used by EPA and in research

DataFed Motivated by GEOSS:
- DataFed development is guided by the meme of GEOSS

Five practices for agile, seamless data federation:
1. Space-Time Query for standardized access to all data (WCS)
2. Data Wrappers for turning heterogeneous data into web services
3. Data Mediators for transforming data into ‘Views’
4. Mashups for connecting autonomous applications
5. DataSpaces for shared metadata by the users, for the users

Parameter-Space-Time Query Using the OGC WCS Data Access Protocol

Regardless of the data location, data type and format, the parameter-space-time query is the same. The return is in a user-selectable format from the offerings.

Query parts: Parameter (Coverage), Bounding Box (BBOX), Time Range (TIME), Output Format (FORMAT)

Grid:         Coverage=THEEDDS.T&BBOX=-126,24,-65,52,0,0&TIME= / &FORMAT=NetCDF
Image:        Coverage=SEAW.Refl&BBOX=-126,24,-65,52,0,0&TIME= / &FORMAT=GeoTIFF
Station Data: Coverage=SURF.Bext&BBOX=-126,24,-65,52,0,0&TIME= / &FORMAT=NetCDF-table
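The uniform shape of the query can be illustrated with a small Python sketch that assembles the same four key-value pairs for each data type. This is only an illustration under stated assumptions: the endpoint URL and the time range are placeholders (the slide does not give the service address, and the TIME values are missing from the transcript), and a complete OGC WCS request would normally carry further parameters, such as the service and request names, that the slide does not show.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the slides do not state the actual service URL.
BASE_URL = "http://example.org/wcs"

# Bounding box exactly as shown on the slide.
BBOX = (-126, 24, -65, 52, 0, 0)

# Placeholder time range; the real values are missing from the transcript.
TIME_RANGE = ("2008-05-01", "2008-05-02")


def build_query(coverage, bbox, time_range, fmt):
    """Return a parameter-space-time query URL with the four slide parameters."""
    params = {
        "Coverage": coverage,                    # parameter (dataset.field)
        "BBOX": ",".join(str(v) for v in bbox),  # space
        "TIME": "/".join(time_range),            # time range as start/end
        "FORMAT": fmt,                           # output format from the offerings
    }
    # Keep commas and slashes unescaped so the URL reads like the slide.
    return BASE_URL + "?" + urlencode(params, safe=",/")


# The same call shape serves grid, image and station data.
QUERIES = [
    build_query("THEEDDS.T", BBOX, TIME_RANGE, "NetCDF"),
    build_query("SEAW.Refl", BBOX, TIME_RANGE, "GeoTIFF"),
    build_query("SURF.Bext", BBOX, TIME_RANGE, "NetCDF-table"),
]

for url in QUERIES:
    print(url)
```

Only the coverage name and the output format change between the three requests; the space and time constraints are expressed identically.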

Third Party Data Wrappers
DataFed wrappers are non-intrusive and third-party.
Heterogeneous input data >>> Homogeneous (WCS) Query
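The wrapper idea can be sketched as a thin adapter per source that leaves the provider's data untouched and exposes one common record shape, so a single space-time filter works for all of them. This is a minimal sketch, not the DataFed implementation: the class names, column names, record fields and sample values are all hypothetical.

```python
import csv
import io
from datetime import date


class CsvWrapper:
    """Wraps a provider that publishes CSV text with its own column names."""

    def __init__(self, text):
        self._text = text

    def records(self):
        for row in csv.DictReader(io.StringIO(self._text)):
            yield {
                "lon": float(row["LONGITUDE"]),
                "lat": float(row["LATITUDE"]),
                "time": date.fromisoformat(row["DATE"]),
                "value": float(row["OBS"]),
            }


class DictWrapper:
    """Wraps a provider that returns dictionaries with a different layout."""

    def __init__(self, rows):
        self._rows = rows

    def records(self):
        for r in self._rows:
            yield {"lon": r["x"], "lat": r["y"],
                   "time": date(*r["ymd"]), "value": r["v"]}


def query(wrapper, bbox, time_range):
    """Apply the same space-time filter to any wrapped source."""
    west, south, east, north = bbox
    start, end = time_range
    return [
        r for r in wrapper.records()
        if west <= r["lon"] <= east
        and south <= r["lat"] <= north
        and start <= r["time"] <= end
    ]


# Made-up sample data for two heterogeneous providers.
CSV_TEXT = (
    "LATITUDE,LONGITUDE,DATE,OBS\n"
    "38.6,-90.2,2008-05-29,12.5\n"
    "60.0,10.0,2008-05-29,3.0\n"
)
DICT_ROWS = [{"x": -80.1, "y": 26.1, "ymd": (2008, 5, 29), "v": 7.0}]

BBOX = (-126, 24, -65, 52)
MAY_2008 = (date(2008, 5, 1), date(2008, 5, 31))

print(query(CsvWrapper(CSV_TEXT), BBOX, MAY_2008))
print(query(DictWrapper(DICT_ROWS), BBOX, MAY_2008))
```

Neither provider has to change its format; all the adaptation lives in the wrapper, which is what makes the approach non-intrusive.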

Mediated User-Data Interface
- Mediator turns data into Views
- Mediated integration is a flexible design pattern for a System of Systems
- Client-Server design is demanding: the user carries the burden of integration
Diagram labels: Query, Data, Views
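One way to picture the mediator pattern is a small component that sits between the user and the sources, fans a request out to the registered sources, and returns ready-made views. The sketch below is an assumption-laden illustration: the `Mediator` class, its method names, the record fields and the sample values are hypothetical and are not taken from DataFed.

```python
from collections import defaultdict
from statistics import mean


class Mediator:
    """Collects records from registered sources and turns them into views."""

    def __init__(self):
        self._sources = {}

    def register(self, name, fetch):
        # fetch is any callable that returns records in the common shape.
        self._sources[name] = fetch

    def _gather(self, names):
        for name in names:
            yield from self._sources[name]()

    def time_series_view(self, names):
        """Average all values per time step across the chosen sources."""
        by_time = defaultdict(list)
        for rec in self._gather(names):
            by_time[rec["time"]].append(rec["value"])
        return [(t, mean(values)) for t, values in sorted(by_time.items())]

    def map_view(self, names):
        """List (lon, lat, value) points across the chosen sources."""
        return [(r["lon"], r["lat"], r["value"]) for r in self._gather(names)]


# Two made-up sources that already deliver homogeneous records.
def source_a():
    return [{"lon": -90.2, "lat": 38.6, "time": "2008-05-29", "value": 10.0}]


def source_b():
    return [
        {"lon": -80.1, "lat": 26.1, "time": "2008-05-29", "value": 20.0},
        {"lon": -80.1, "lat": 26.1, "time": "2008-05-30", "value": 8.0},
    ]


mediator = Mediator()
mediator.register("a", source_a)
mediator.register("b", source_b)

print(mediator.time_series_view(["a", "b"]))
print(mediator.map_view(["a", "b"]))
```

In a plain client-server arrangement the merging and averaging would have to be done by each user; here that burden moves into the mediator.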

Mashups: Loose Coupling of Autonomous Applications
DataFed – Wiki – GoogleEarth
Diagram labels: SOAP, RDF, Mashup, Workflow

DataSpaces for Datasets: Shared Metadata by the Users, for the Users
Community DataSpaces links to GEOSS Core
[Diagram, shown in three successive builds; adapted from Percivall, Feb 2008, by R. Husar, March 2008. Components: GEOSS Comp. Registry; GEOSS Clearinghouse; Standards, SIF Registry; Community AQ Portal; Community AQ Catalog; Service Workflow. Service Offerors and Users: Service Offeror, Catalog User, Data Analyst, Policy Analyst, Decision Maker. Link labels: registers, extracts, catalog list, searches, harvests, invokes, references, publishes, provides, composes, visualizes, reports to, informs, find, views report.]

Wiki ‘DataSpaces’: Creating and Sharing Metadata
- Community Catalog: Find Dataset, Describe Dataset, Discuss Dataset
- ESIP Communal Wiki
- Semantic Wiki: Structured (RDF) and Unstructured Content
- Open, Standard Metadata (RDF): Ready for Export/Harvesting by Registries, Catalogs

Sharing Best Practices: GEO Best Practice Wiki

Developments and Challenges:

Favorable Engineering Developments:
- A core network for Air Quality data sharing is emerging
- Standards are available for sharing previously unstructured data
- Third-party mediation can homogenize the distributed data
- Agile SOA-based systems can deliver info products to diverse users
- Since 2005, one such IS, DataFed, has been used by EPA and in research

However:
- Service interfaces are still uneven; networks are still fragile
- The utility of social networking in science is not understood
- Users cannot provide feedback to upstream providers
- Many cultural, legal and other barriers hamper progress

ESIP Coordination Application