Www.gridminer.org… Intelligent Grid Solutions GridMiner A Framework for Knowledge Discovery on the Grid – from a Vision to Design and Implementation Peter.

Presentation transcript:

Intelligent Grid Solutions
GridMiner: A Framework for Knowledge Discovery on the Grid – from a Vision to Design and Implementation
Peter Brezany, Ivan Janciak, Alexander Wöhrer, A Min Tjoa
University of Vienna, Institute for Software Science

CGW'04, 13. Dec. 04

GridMiner Overview
- Start: Jan
- Host: University of Vienna, Vienna University of Technology
- Target: provide tools to discover and access relevant knowledge and information from different distributed and heterogeneous data sources
- Test application area: medical (traumatic brain injury treatment)
  - Predicting the outcome of seriously ill patients
  - The analytical part focuses on data mining and On-Line Analytical Processing (OLAP)

Project members
- Project leader:
  - Prof. A Min Tjoa, Vienna University of Technology
  - Prof. Peter Brezany, University of Vienna
- Visualization: Radoslav Ivanov
- Data streaming: Nguyen Manh Tho
- OLAP: Bernhard Fiser, Umut Onan, Ibrahim Elsayed
- Data mediation: Alexander Wöhrer
- Knowledge Mgt: Ivan Janciak
- Job Control: Günter Kickinger
- Sequence Rules: Michael Rinner
- Clustering: Markus Mayer
- Decision rules: Christian Kloner, Juergen Hofer
- GUI: Paul Panhofer
- Autonomic aspects: Michael Bergmann

Outline
- Motivation / Requirements
- GridMiner Services
- Architecture
- Dynamic Service Composition Engine
- OLAP
- Knowledge base
- Data Integration
- Graphical user interface
- Implementation
- Summary

The process to cover
- Data is distributed over the participating hospitals
- Access from different platforms (handheld, PC, …) for data generation, querying, and analysis
- The process needs to access various data sources

GridMiner
- Motivation
  - Integrate knowledge discovery and knowledge management as an autonomic system
  - Manage and control the whole lifecycle of knowledge
  - Give strong support to other intelligent entities in their needs for knowledge
- Basic Requirements
  - Ability to access and analyze a huge amount of information, typically heterogeneous and geographically distributed
  - Intelligent behavior: ability to maintain, discover, extend, present and communicate knowledge
  - High-performance (real-time or soft real-time) query processing
  - High security guarantee

GridMiner Services
- Dynamic Workflow Control Service
- Data mining services
  - Sequences (SPADE)
  - Clustering (SimpleKMeans)
  - Decision rules (SPRINT)
- OLAP (sequential/parallel version)
  - Association rules on OLAP
- Grid Data Mediator Service

GridMiner Architecture
[Architecture diagram; component labels: Graphical User Interface; Knowledge Base; Service configuration; Dynamic service control engine (DSCE); Data Access and Integration; Data mining services; Grid; Web; User environment; DSCE Client]

Dynamic Service Control Engine
- Processes a workflow described by DSCL
- Based on the Open Grid Services Architecture
- Supports both interactive and batch processing
- User-independent processing of the workflow
- Provision of all intermediate results from the involved services
- Full user control during workflow execution
- Supports the OGSA Notification Model
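As a rough illustration of the behaviour listed for the Dynamic Service Control Engine, the sketch below runs a sequence of steps, keeps every intermediate result, and notifies a listener after each step. It is not DSCL and not the GridMiner engine: the function `run_workflow`, the step names, and the sample rows are all hypothetical, and plain Python callables stand in for Grid services.

```python
def run_workflow(steps, initial, on_event=None):
    """Run (name, service) steps in order and keep every intermediate result."""
    results = {}
    value = initial
    for name, service in steps:
        value = service(value)       # output of one step feeds the next
        results[name] = value        # intermediate results stay available
        if on_event is not None:
            on_event(name, value)    # simple stand-in for a notification
    return results


# Hypothetical two-step workflow: select rows, then count them.
events = []
steps = [
    ("select", lambda rows: [r for r in rows if r["age"] >= 18]),
    ("count", lambda rows: len(rows)),
]
results = run_workflow(
    steps,
    [{"age": 30}, {"age": 12}, {"age": 45}],
    on_event=lambda name, value: events.append(name),
)
print(results["count"], events)
```

Keeping the results in a dictionary keyed by step name is one simple way to make each intermediate result retrievable after the run.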

Dynamic Service Control Engine (cont.)

Knowledge Base
[Diagram; labels: Metadata; Domain Ontology; Activity Ontology; Data mining Ont.; Data source Ont.; Rules; Facts; XML, XML Schema (XSL) (webrowset, pmml…); Web Ontology Language OWL + OWL-S; SWRL; OWL]

OLAP
- Multidimensional data analysis by sequential and distributed/parallel OLAP engines
- Cube construction and querying
- Representation of query results by OLAP Modeling Markup Language
- Integration with data mining engines (association rules on OLAP)
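Cube construction and querying can be sketched in a few lines. The example below is a minimal in-memory illustration, not the GridMiner OLAP engine: the function `build_cube`, the dimension names, and the fact rows are hypothetical. Each fact contributes to every cell obtained by either keeping a dimension value or rolling it up to `ALL`.

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # marker for a rolled-up dimension


def build_cube(facts, dimensions, measure):
    """Aggregate `measure` over every combination of kept/rolled-up dimensions."""
    cube = defaultdict(float)
    for fact in facts:
        # For each dimension: keep the concrete value or roll up to ALL.
        choices = [(fact[d], ALL) for d in dimensions]
        for cell in product(*choices):
            cube[cell] += fact[measure]
    return dict(cube)


# Hypothetical fact rows.
facts = [
    {"hospital": "H1", "year": 2003, "patients": 4},
    {"hospital": "H1", "year": 2004, "patients": 6},
    {"hospital": "H2", "year": 2004, "patients": 5},
]
cube = build_cube(facts, ["hospital", "year"], "patients")

print(cube[("H1", ALL)])    # total for hospital H1 across all years
print(cube[(ALL, 2004)])    # total for 2004 across all hospitals
```

A query is then a dictionary lookup on a cell. Materialising every cell this way grows exponentially with the number of dimensions, which is one reason a distributed or parallel engine becomes attractive.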

Grid Data Mediation Service
- Principles
  - Tight federation: global (relational) schema
  - Virtual integration: leave the data where it is, so the data is always up to date
  - No proprietary solution: inherits well-solved aspects from OGSA-DAI
  - Not bound to a special architecture
- Supported data sources: RDBMS (via JDBC), XMLDB (Xindice), CSV files
- Operators: "Union all" and "inner join"
- Operators are XQuery-based (using SAXON)
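The semantics of the two operators can be sketched independently of the technology. The slides state that the real operators are XQuery-based; the plain-Python version below only illustrates what "union all" and "inner join" compute. The functions `union_all` and `inner_join` and all sample rows are hypothetical.

```python
def union_all(*sources):
    """Concatenate rows from all sources, keeping duplicates."""
    for source in sources:
        yield from source


def inner_join(left, right, key):
    """Hash join: emit a merged row for every pair with equal key values."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    for row in left:
        for match in index.get(row[key], []):
            yield {**row, **match}


# Hypothetical rows standing in for an RDBMS source and a CSV source.
rdbms_rows = [{"id": 1, "name": "Ann Lee"}, {"id": 2, "name": "Bob Ray"}]
csv_rows = [{"id": 2, "name": "Bob Ray"}, {"id": 3, "name": "Cy Orr"}]
visits = [{"id": 2, "ward": "ICU"}, {"id": 3, "ward": "ER"}]

patients = list(union_all(rdbms_rows, csv_rows))   # duplicate id 2 is kept
joined = list(inner_join(patients, visits, "id"))  # id 1 has no visit and drops out
print(len(patients), len(joined))
```

Because "union all" does not remove duplicates, the row for id 2 appears twice in `patients` and therefore twice in `joined`.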

Data Integration Scenario
- Heterogeneities:
  - Name in A is "First Last" (as in the target format)
  - Name in C has to be combined
- Distribution: 3 data sources

Data Integration Scenario (cont.)
- Query: SELECT p_name FROM patient WHERE id=10
[Diagram; labels: Standard, optimized]
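The scenario can be imitated with a global view over differently shaped sources. The sketch below uses Python's built-in `sqlite3` as a stand-in for the mediator, which is an assumption made for illustration only. The tables `source_a` and `source_c`, their columns, and the sample patients are hypothetical, and only sources A and C are modelled because the slides describe only their name formats.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_a (id INTEGER, name TEXT);
    CREATE TABLE source_c (id INTEGER, first TEXT, last TEXT);
    INSERT INTO source_a VALUES (10, 'Anna Berg');
    INSERT INTO source_c VALUES (11, 'Carl', 'Dorn');

    CREATE VIEW patient AS
        SELECT id, name AS p_name FROM source_a
        UNION ALL
        SELECT id, first || ' ' || last AS p_name FROM source_c;
""")
# source_a already stores the target format; source_c is combined in the view.

# The query from the slide, posed against the global schema.
rows = conn.execute("SELECT p_name FROM patient WHERE id=10").fetchall()
# A patient from source C comes back in the same "First Last" format.
combined = conn.execute("SELECT p_name FROM patient WHERE id=11").fetchall()
print(rows, combined)
```

The view is virtual: nothing is copied, so every query sees the current contents of the underlying tables, in line with the "virtual integration" principle.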

Implementation/Technology
- Globus 3.2
- OGSA-DAI
- GUI: workflow construction / results visualization (JGraph, Java Web Start, JavaServer Pages)
- Service configuration (JavaServer Pages/PHP/..)
- Knowledge base (XML, OWL)

Data mining Scenario
[Workflow diagram; labels: Database (100k rows); (Select 10k rows); Decision Rules (SPRINT); Decision Rules (C45); (Select 20k rows); Decision Rules (C45)]

Graphical User Interface

Summary
- Integrated data mining infrastructure
  - Covers the whole process
  - Service Oriented Architecture
  - Implemented prototype
- Project ongoing
  - New data mining tasks (algorithms)
  - Knowledge management
- More information:

Thank you. Questions?