
1 Institut für Softwarewissenschaft - Universität Wien, P. Brezany
Toward Knowledge Discovery in Databases Attached to Grids
Peter Brezany
Institute for Software Science, University of Vienna
E-mail: brezany@par.univie.ac.at

2 Media That Radically Influenced Society
- 1500s: Printing Press
- 1840s: Penny Post
- 1850s: Telegraph
- 1920s: Telephone
- 1930s: Radio
- 1950s: TV
- 1990s: Web
- 20xx: Grid

3 Talk Outline
- Data Mining on the Grid – Background Information
- Application Examples
- Architecture of a Traditional Data Mining System
- GridMiner – A Framework for Data Mining on the Grid
- GridMiner Architecture
- Functional and Data Access Model
- Conclusions

4 Data Mining on the Grid
- Data mining on the Grid (DMG): finding unknown data patterns in an environment with geographically distributed data and computation.
- Data may be highly heterogeneous, with a high update frequency.
- A good DMG algorithm analyzes data in a distributed fashion with modest data communication overhead.
- A typical DMG algorithm involves local data analysis followed by the generation of a global data model.
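The "local data analysis followed by the generation of a global data model" pattern can be illustrated with a minimal sketch. It assumes the global model is just the mean and variance of one numeric attribute. Each site ships only a three-number summary rather than its raw data, which keeps communication overhead modest. The names LocalSummary, analyze_local, and build_global_model are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass
class LocalSummary:
    """Sufficient statistics computed at one site (hypothetical structure)."""
    n: int           # number of local records
    total: float     # sum of the attribute values
    total_sq: float  # sum of the squared attribute values


def analyze_local(values):
    """Local data analysis: reduce a site's raw values to a small summary."""
    return LocalSummary(len(values), sum(values), sum(v * v for v in values))


def build_global_model(summaries):
    """Global model generation: merge the per-site summaries."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    variance = total_sq / n - mean * mean  # population variance
    return {"n": n, "mean": mean, "variance": variance}


# Two sites with made-up data; only the summaries leave each site.
site_a = [1.0, 2.0, 3.0]
site_b = [4.0, 5.0]
model = build_global_model([analyze_local(site_a), analyze_local(site_b)])
```

The merged result is the same as if the statistics had been computed over the combined data, although the raw values never leave their sites.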

5 Application Examples
- Finding out how the emergence of hepatitis C depends on weather patterns: this requires access to a large hepatitis C DB at one location and an environmental DB at another location.
- Two major financial organizations want to cooperate. They need to share data patterns relevant to the data mining task, but they do not want to share the data itself because it is sensitive, so combining the databases may not be feasible.
- Federating Brain Data Project: integrating several neuroscience DBs.
- A major multinational corporation wants to analyze its customer transaction records to quickly develop successful business strategies.
  - It has thousands of establishments throughout the world.
  - Collecting all the data in a centralized data warehouse, followed by analysis using existing commercial data mining software, takes too long.

6 Telemedical Applications
AMG – Austrian Medical Grid
Figure labels: Web; Raw Medical Data; Reconstructed Medical Data; Derived Medical Data; Database

7 Telemedical Collaboration - Example
- A patient living in a remote village has a heart problem.
- An EEG is taken by the local doctor, and all the patient's details are stored in the doctor's PC-based telemedical system.
- MRI and CT scans are taken in different departments of a general hospital and stored in the telemedical DB.
- A consultant compiles a report and saves it in the DB.
- If necessary, a 3D ultrasound scan is taken in a specialized clinic and a further report is compiled.
- If complicated surgery is required, an external specialist using Virtual Reality techniques defines how the surgery should be planned.
- The resulting operation is recorded on video, e.g., for education.
Data mining support/assistance is needed.

8 Architecture of a Data Mining System
Figure labels, from top to bottom:
- Graphical user interface
- Pattern evaluation
- Data mining engine
- Database or data warehouse server
- Data cleaning, data integration; Filtering
- Database; Data warehouse
- Knowledge base

9 On-Line Analytical Mining (OLAM)

10 GridMiner – A Framework for Data Mining on Grids
System requirements:
- Algorithm and data publishing and integration
- Compatibility with Grid infrastructure and Grid awareness
- Openness
- Scalability
- Security and data privacy
Functionality requirements:
- Mining different kinds of knowledge in databases
- Incremental data mining algorithms
- Interactive mining of knowledge at multiple levels of abstraction

11 GridMiner (Layered) Architecture (based on K. F. Jeffery's idea)

12 Functional and Data Access Model
Figure label: MDS

13 Example: Mining Patterns for Data Classification and Associations
Classification query:
  use database dat1, dat2
  mine classifications
  analyze credit_rating
  using g_parsimony
  display as tree
Association query:
  use database DBs attributes
  mine associations
  using method attributes
  display as rules
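To show how the clauses of such a query could be separated before dispatching work to the Grid, here is a small sketch of a clause splitter. It is not GridMiner's parser. The keyword list is inferred only from the two example queries on this slide, and the function name parse_query and the dictionary layout are hypothetical.

```python
import re

# Clause keywords inferred from the slide's two example queries (an assumption,
# not a complete grammar of the query language).
CLAUSES = ["use database", "mine", "analyze", "using", "display as"]
_CLAUSE_RE = re.compile(r"\b(" + "|".join(re.escape(c) for c in CLAUSES) + r")\b")


def parse_query(text):
    """Split a query into a {clause keyword: argument} dictionary."""
    normalized = " ".join(text.split())  # collapse line breaks and extra spaces
    parts = _CLAUSE_RE.split(normalized)
    # parts[0] is any text before the first keyword; after that, keywords and
    # their arguments alternate.
    return {kw: arg.strip() for kw, arg in zip(parts[1::2], parts[2::2])}


query = parse_query("""
    use database dat1, dat2
    mine classifications
    analyze credit_rating
    using g_parsimony
    display as tree
""")
databases = [name.strip() for name in query["use database"].split(",")]
```

Here query maps each clause keyword to its argument, and databases holds the individual database names from the first clause.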

14 Knowledge Grid Architecture Layers
High level layer:
- Data Access Service
- Tools and Algorithms Access Service
- Execution Plan Management
- Result Present. Service
Core layer:
- Knowledge Directory Service
- Resource Allocation Execution Management
Generic Grid and Data Grid Services

15 Conclusions
- Grid data mining is a relevant research topic.
- The GridMiner approach may contribute to this research domain.
- Collaborations are needed.
- IPG (Information Power Grid) is the only Grid project that aims to address knowledge discovery issues.
- Looking for one or more pilot applications.
- Open issue: which basic Grid technology to use (Globus, DataGrid, Jini, JXTA?).

16 Data Storage and the Components
Figure labels: Site A; Site B; Site C; Site D; Preprocessing; Local DM; Construction of the Global Model; GUI; Site E
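The stages named on this slide (preprocessing, local DM, construction of the global model) can be sketched as a small pipeline. This is only an illustration under stated assumptions: the local mining step is reduced to counting item occurrences, the global model is the set of items whose overall support reaches a threshold, and the placement of stages on particular sites is assumed rather than taken from the figure. All function names and the sample data are hypothetical.

```python
from collections import Counter


def preprocess(records):
    """Stand-in cleaning step: normalize item names and drop empty records."""
    cleaned = []
    for record in records:
        items = [item.strip().lower() for item in record if item.strip()]
        if items:
            cleaned.append(items)
    return cleaned


def local_dm(records):
    """Local DM: count in how many local records each item occurs."""
    counts = Counter()
    for record in records:
        counts.update(set(record))
    return counts, len(records)


def construct_global_model(local_results, min_support):
    """Merge per-site counts and keep items with enough overall support."""
    total_counts = Counter()
    total_records = 0
    for counts, n in local_results:
        total_counts.update(counts)
        total_records += n
    return {
        item: count / total_records
        for item, count in total_counts.items()
        if count / total_records >= min_support
    }


# Made-up transaction data held at two sites.
sites = {
    "Site A": [["Bread", "milk"], ["bread"], []],
    "Site B": [["milk ", "eggs"], ["bread", "milk"]],
}
local_results = [local_dm(preprocess(records)) for records in sites.values()]
global_model = construct_global_model(local_results, min_support=0.5)
```

Only the per-site counts are passed to the global step, so the raw records stay where they are stored.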

