


Ohio State University Department of Computer Science and Engineering

1 Cyberinfrastructure for Coastal Forecasting and Change Analysis
Gagan Agrawal, Hakan Ferhatosmanoglu, Xutong Niu, Ron Li, Keith Bedford
The Ohio State University

2 Context
New Award from Office of Cyberinfrastructure (OCI)
  – Under Cyberinfrastructure for Environmental Observatories Program
  – September 2006 to August 2009, total amount $1,400,000
Involves 2 Computer Scientists and 2 Environmental Scientists
  – G. Agrawal (PI): Grid Middleware
  – H. Ferhatosmanoglu: Databases
  – K. Bedford: Great Lakes Now/Forecasting
  – R. Li: Coastal Erosion Analysis

3 Coastal Forecasting and Change Detection (Lake Erie)

4 Project Premise
Limitations of Current Environmental Observation Systems
  – Tightly coupled systems
      » No reuse of algorithms
      » Very hard to experiment with new algorithms
  – Closely tied to existing resources
Our claim
  – Emerging trends towards web services and grid services can help

5 Challenges
Existing Grid Middleware Systems have not considered
  – Processing of Streaming Data
  – Data Integration Issues
The applications involved need techniques for multi-modal data fusion, query planning, and data mining
  – Need to implement them as grid or web services

6 Proposed Infrastructure and Collaboration

7 Application Details: Great Lakes Now/Forecasting
GLOS: Great Lakes Observing System
  – Co-designer/project manager: K. Bedford, a co-PI on this project
  – Collaboration with NOAA
Limitations: Hard-wired
  – Cannot incorporate new streams or algorithms
Create an Implementation using our Middleware for Streaming Data

8 Application Details: Coastal Erosion Prediction and Analysis
Focus: Erosion along Lake Erie Shore
  – Serious problem
  – Substantial Economic Losses
Prediction requires data from
  – Variety of Satellites
  – In-situ sensors
  – Historical Records
Challenges
  – Analyzing distributed data
  – Data Integration/Fusion

9 Middleware Developed at Ohio State
Automatic Data Virtualization Framework
  – Enabling processing and integration of data in low-level formats
GATES (Grid-based AdapTive Execution on Streams)
  – Processing of distributed data streams
FREERIDE-G (FRamework for Rapid Implementation of Datamining Engines in Grid)
  – Supporting scalable data analysis on remote data

10 Automatic Data Virtualization: Motivation
Access mechanisms for remote repositories
  – Complex low-level formats make accessing and processing of data difficult
  – Main desired functionality
      » Ability to select, download, and process a subset of data
Sensor Data
  – Again, low-level data
  – Need to convert formats
  – Need a flexible architecture

11 Data Virtualization
[Diagram labels: dataset, Data Service, Data Virtualization (an abstract view of data)]
By Global Grid Forum’s DAIS working group:
  – A Data Virtualization describes an abstract view of data.
  – A Data Service implements the mechanism to access and process data through the Data Virtualization.

12 Our Approach: Automatic Data Virtualization
Automatically create data services
  – A new application of compiler technology
A metadata descriptor describes the layout of data on a repository
An abstract view is exposed to the users
Two implementations:
  – Relational/SQL-based
  – XML/XQuery-based
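
The descriptor idea on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: the flat binary record layout, the `DESCRIPTOR` format, the field names, and the `scan` and `select` helpers are all hypothetical and are not the project's actual descriptor language or generated data services.

```python
import struct
from typing import Callable, Dict, Iterator, List, Tuple

# Hypothetical metadata descriptor: field names and struct codes describing
# how each fixed-size record is laid out in the low-level file.
DESCRIPTOR: List[Tuple[str, str]] = [
    ("time", "i"),         # 4-byte signed integer
    ("x", "f"),            # 4-byte float
    ("y", "f"),            # 4-byte float
    ("water_level", "f"),  # 4-byte float
]


def record_format(descriptor: List[Tuple[str, str]]) -> str:
    """Build a little-endian struct format string from the descriptor."""
    return "<" + "".join(code for _, code in descriptor)


def scan(raw: bytes, descriptor: List[Tuple[str, str]] = DESCRIPTOR) -> Iterator[Dict]:
    """Expose raw bytes as a virtual table: one dict per record."""
    names = [name for name, _ in descriptor]
    for values in struct.iter_unpack(record_format(descriptor), raw):
        yield dict(zip(names, values))


def select(raw: bytes, columns: List[str], where: Callable[[Dict], bool]) -> List[tuple]:
    """Rough analogue of: SELECT columns FROM dataset WHERE predicate."""
    return [tuple(row[c] for c in columns) for row in scan(raw) if where(row)]


# Build a tiny in-memory "file" of three records for demonstration.
raw = b"".join(
    struct.pack(record_format(DESCRIPTOR), t, x, y, level)
    for t, x, y, level in [(1, 0.0, 0.0, 1.5), (2, 1.0, 0.0, 2.5), (3, 1.0, 1.0, 3.5)]
)

# The user works with the abstract view and never touches byte offsets.
rows = select(raw, ["time", "water_level"], lambda r: r["time"] >= 2)
```

In this sketch the user queries named columns while the descriptor alone determines how bytes are decoded, so supporting a new low-level format would mean writing a new descriptor, not new access code.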

13 Streaming Data Model
Continuous data arrival and processing
Emerging model for data processing
  – Sources that produce data continuously: sensors, long running simulations
  – Critical in Environmental Observatories
Active topic in many computer science communities
  – Databases
  – Data Mining
  – Networking …

14 Need for a Grid-Based Stream Processing Middleware
Application developers interested in data stream processing
  – Would like to have abstracted
      » Grid standards and interfaces
      » Adaptation function
  – Would like to focus on algorithms only
GATES is a middleware for grid-based, self-adapting data stream processing

15 Adaptation for Real-time Processing
Analysis on streaming data is approximate
The trade-off between accuracy and execution rate can be captured by certain parameters (adaptation parameters)
  – Sampling Rate
  – Size of summary structure
Application developers can expose these parameters and a range of values
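
A minimal sketch of how an exposed adaptation parameter might be tuned at run time. The `AdaptationParameter` class, the `adapt` function, and the 0.8 and 0.2 load thresholds are assumptions made for illustration; they are not the GATES API or its actual adaptation algorithm.

```python
from dataclasses import dataclass


@dataclass
class AdaptationParameter:
    """A tunable parameter exposed together with its allowed range (hypothetical)."""
    value: float
    low: float
    high: float

    def adjust(self, factor: float) -> None:
        # Scale the value, then clamp it to the range the developer exposed.
        self.value = min(self.high, max(self.low, self.value * factor))


def adapt(rate: AdaptationParameter, backlog: int, capacity: int) -> None:
    """Lower the sampling rate when the input queue fills up,
    and raise it again when there is spare capacity."""
    load = backlog / capacity
    if load > 0.8:
        rate.adjust(0.5)   # falling behind: trade accuracy for speed
    elif load < 0.2:
        rate.adjust(1.5)   # mostly idle: recover accuracy


# The developer exposes the parameter and its range; the middleware tunes it.
sampling_rate = AdaptationParameter(value=1.0, low=0.1, high=1.0)
history = []
for backlog in [90, 95, 50, 10, 5]:
    adapt(sampling_rate, backlog, capacity=100)
    history.append(sampling_rate.value)
```

Here the application only declares the parameter and its bounds, while the decision of when to move within those bounds stays outside the analysis code.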

16 FREERIDE-G: Supporting Distributed Data-Intensive Science
[Diagram labels: Data Repository Cluster, Compute Cluster, User, ?]

17 Challenges for Application Development
Analysis of large amounts of disk-resident data
Incorporating parallel processing into analysis
Processing needs to be independent of other elements and easy to specify
Coordination of storage, network and computing resources required
Transparency of data retrieval, staging and caching is desired

18 FREERIDE-G Goals
Support High-End Processing
  – Enable efficient processing of large-scale data mining computations
Ease Use of Parallel Configurations
  – Support shared and distributed memory parallelization starting from a common high-level interface
Hide Details of Data Movement and Caching
  – Data staging and caching (when feasible/appropriate) need to be transparent to the application developer
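
One way a common high-level interface could look is a reduction-style API, in which the application supplies only a per-chunk reduction and a merge step. The sketch below assumes that style for illustration; `local_reduce`, `combine`, and `run` are hypothetical names and not FREERIDE-G's actual interface, and a thread pool stands in for the real parallel runtime.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def local_reduce(chunk):
    """Application code: summarize one chunk of data as (count, sum)."""
    return (len(chunk), sum(chunk))


def combine(a, b):
    """Application code: merge two partial summaries."""
    return (a[0] + b[0], a[1] + b[1])


def run(data, n_workers=4, chunk_size=1000):
    """Hypothetical runtime: chunking, scheduling, and merging are hidden
    from the application, which supplies only local_reduce and combine."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(local_reduce, chunks))
    return reduce(combine, partials, (0, 0))


count, total = run(list(range(10_000)))
mean = total / count
```

Because the application never names threads, processes, or data locations, the same two functions could in principle be run by a shared-memory or a distributed-memory runtime.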

19 Data Analysis Services
Multi-model Multi-Sensor Data Integration
  – Built on our Data Virtualization Framework
Query Planning Service
  – Feature Extraction: Integration with Grid Metadata Catalogs
Remote Mining of Spatio-Temporal Data
  – Built using FREERIDE-G
Mining algorithms for Data Streams
  – Built using GATES

20 Recap

21 Looking For
Feedback on our approach
Synergy with other efforts
Lessons learnt by others

