Toward Knowledge Discovery in Databases Attached to Grids
Peter Brezany, Institut für Softwarewissenschaft, Universität Wien

Presentation transcript:

Slide 1: Toward Knowledge Discovery in Databases Attached to Grids
Peter Brezany, Institute for Software Science, University of Vienna

Slide 2: Media That Radically Influenced Society
- 1500s: Printing Press
- 1840s: Penny Post
- 1850s: Telegraph
- 1920s: Telephone
- 1930s: Radio
- 1950s: TV
- 1990s: Web
- 20xx: Grid

Slide 3: Talk Outline
- Data Mining on the Grid: Background Information
- Application Examples
- Architecture of a Traditional Data Mining System
- GridMiner: A Framework for Data Mining on the Grid
- GridMiner Architecture
- Functional and Data Access Model
- Conclusions

Slide 4: Data Mining on the Grid
- Data mining on the Grid (DMG): finding unknown data patterns in an environment with geographically distributed data and computation.
- The data may be highly heterogeneous and updated frequently.
- A good DMG algorithm analyzes data in a distributed fashion with modest data communication overhead.
- A typical DMG algorithm involves local data analysis followed by the generation of a global data model.
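The "local analysis, then global model" pattern can be illustrated with a minimal Python sketch. It is not taken from the talk: the site names, the sample values, and the functions `analyze_locally` and `build_global_model` are hypothetical, and the "model" is deliberately trivial (a global mean and variance). The point is only that each site ships a three-number summary rather than its raw records, which keeps the communication overhead modest.

```python
from dataclasses import dataclass


@dataclass
class LocalSummary:
    """Sufficient statistics a site shares instead of its raw records."""
    count: int
    total: float
    total_sq: float


def analyze_locally(values):
    """Local data analysis: runs at the site that owns the data."""
    return LocalSummary(
        count=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def build_global_model(summaries):
    """Global step: combine the per-site summaries into one model."""
    n = sum(s.count for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    # Population variance from the merged sums: E[x^2] - (E[x])^2.
    variance = total_sq / n - mean ** 2
    return {"count": n, "mean": mean, "variance": variance}


# Hypothetical data held at three separate sites (illustrative values only).
site_data = {
    "site_a": [1.0, 2.0, 3.0],
    "site_b": [4.0, 5.0],
    "site_c": [6.0],
}

summaries = [analyze_locally(values) for values in site_data.values()]
global_model = build_global_model(summaries)
print(global_model)
```

Real DMG algorithms exchange richer local models (for example local classifiers or frequent itemsets), but the division of work between the sites and the global step is the same.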

Slide 5: Application Examples
- Finding out how the emergence of hepatitis C depends on weather patterns: this requires access to a large hepatitis C DB at one location and an environmental DB at another location.
- Two major financial organizations want to cooperate. They need to share data patterns relevant to the data mining task, but they do not want to share the data itself because it is sensitive, so combining the databases may not be feasible.
- Federating Brain Data Project: integrating several neuroscience DBs.
- A major multinational corporation wants to analyze its customer transaction records in order to quickly develop successful business strategies.
  - It has thousands of establishments throughout the world.
  - Collecting all the data in a centralized data warehouse, followed by analysis using existing commercial data mining software, takes too long.

Slide 6: Telemedical Applications
[Figure labels: AMG (Austrian Medical Grid); Web; Raw Medical Data; Reconstructed Medical Data; Derived Medical Data; Database]

Slide 7: Telemedical Collaboration - Example
- A patient living in a remote village has a heart problem.
- An EEG is taken by the local doctor, and all the patient's details are stored in the doctor's PC-based telemedical system.
- MRI and CT scans are taken in different departments of a general hospital and stored in the telemedical DB.
- A consultant compiles a report and saves it in the DB.
- If necessary, a 3D ultrasound scan is taken in a specialized clinic and a further report is compiled.
- If complicated surgery is required, an external specialist uses Virtual Reality techniques to define how the surgery should be planned.
- The resulting operation is recorded on video for, e.g., education.
Conclusion: data mining support/assistance is needed.

Slide 8: Architecture of a Data Mining System
[Figure: components of the architecture]
- Graphical user interface
- Pattern evaluation
- Data mining engine
- Database or data warehouse server
- Knowledge base
- Data cleaning, data integration, filtering
- Database, data warehouse

Slide 9: On-Line Analytical Mining (OLAM)

Slide 10: GridMiner: A Framework for Data Mining on Grids
System requirements:
- Algorithm and data publishing and integration
- Compatibility with Grid infrastructure and Grid awareness
- Openness
- Scalability
- Security and data privacy
Functionality requirements:
- Mining different kinds of knowledge in databases
- Incremental data mining algorithms
- Interactive mining of knowledge at multiple levels of abstraction

Slide 11: GridMiner (Layered) Architecture
(Based on K.F. Jeffery's idea)

Slide 12: Functional and Data Access Model
[Figure label: MDS]

Slide 13: Example: Mining Patterns for Data Classification and Associations

    use database dat1, dat2
    mine classifications
    analyze credit_rating
    using g_parsimony
    display as tree

    use database DBs attributes
    mine associations
    using method attributes
    display as rules
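To make concrete what a "mine associations ... display as rules" request might produce, here is a small Python sketch. It does not reproduce the GridMiner query language or its mining method: the function `mine_association_rules`, the thresholds, and the toy transactions are all assumptions for illustration, and only rules between single items are considered, using the standard support and confidence measures.

```python
from itertools import combinations


def mine_association_rules(transactions, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) rules between single items."""
    n = len(transactions)
    item_counts = {}
    pair_counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore duplicates, fix pair order
        for item in items:
            item_counts[item] = item_counts.get(item, 0) + 1
        for pair in combinations(items, 2):
            pair_counts[pair] = pair_counts.get(pair, 0) + 1

    rules = []
    for (a, b), together in pair_counts.items():
        support = together / n  # fraction of transactions with both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # Confidence: of the transactions containing lhs, how many
            # also contain rhs.
            confidence = together / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical transactions (illustrative data, not from the talk).
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]

rules = mine_association_rules(transactions, min_support=0.5, min_confidence=0.6)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} => {rhs} (support={support:.2f}, confidence={confidence:.2f})")
```

In a Grid setting the counting loop would run at each site and only the count tables would be merged centrally, in line with the local-analysis and global-model split described earlier in the talk.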

Slide 14: Knowledge Grid Architecture Layers
[Figure: layers and their services]
- High level layer: Data Access Service; Tools and Algorithms Access Service; Execution Plan Management; Result Presentation Service
- Core layer: Knowledge Directory Service; Resource Allocation Execution Management
- Generic Grid and Data Grid Services

Slide 15: Conclusions
- Grid data mining is a relevant research topic.
- The GridMiner approach may contribute to this research domain.
- Collaborations are needed.
- IPG (Information Power Grid) is the only Grid project that aims to address knowledge discovery issues.
- We are looking for pilot applications.
- Open issue: which basic Grid technology to use (Globus, DataGrid, Jini, JXTA)?

Slide 16: Data Storage and the Components
[Figure labels: Site A, Site B, Site C, Site D; Preprocessing; Local DM; Construction of the Global Model; GUI; Site E]