Djamel A. Zighed and Nicolas Nicoloyannis ERIC Laboratory University of Lyon 2 (France) Prague Sept. 04.

About the Computer Science Department
In Lyon, there are 3 universities, [...] students
Lumière University Lyon 2 has [...] students
Lyon 2 is mainly a liberal arts university
The Faculty of Economics has three departments, among them the Computer Science Department
We belong to this department
We have Bachelor, Master and PhD programs for 300 students

ERIC Lab at the University
Faculties of the University of Lyon 2: Economics, Sociology, Linguistics, Law
Research centers of the university: ERIC (Knowledge Engineering Research Center)
- The budget of ERIC does not depend on the university; it is granted by the national Ministry of Education
- We have a large autonomy in decision making

ERIC Lab
Founded in 1995
11 professors (N. Nicoloyannis, director)
15 PhD students
Grants + contracts + WK + … = 200 K€/year
Research topics
–Data mining (theory, tools and applications)
–Data warehouse management (T,T,A)

Data Mining (T,T,A)
Theory
–Induction graphs
–Learning and classification
Tools
–SIPINA: platform for data mining
Applications
–Medical fields
–Chemical applications
–Human sciences
–…
Data mining TTA for complex data

Data mining on complex data
An example: breast cancer diagnosis

Motivations
Contingency table
Association measure: it measures the strength of the relationship between X and Y
According to a specific association measure, can we improve the strength of the relationship by merging some rows and/or some columns?

An example

Goal: find the groupings that maximize the association between attributes
Yes, we can improve the association by reducing the size of the contingency table
For the preceding example, the maximization of Tschuprow's t gives
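As a rough illustration of how merging can raise the association, here is a minimal sketch. It is not the authors' implementation: the table is hypothetical, and the function names `tschuprow_t` and `merge_rows` are assumptions made for this example. It uses the standard definition of Tschuprow's T, the square root of chi-square divided by n times sqrt((r-1)(c-1)), for an r × c table with n cases.

```python
import math


def tschuprow_t(table):
    """Tschuprow's T for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    r, c = len(row_totals), len(col_totals)
    if n == 0 or r < 2 or c < 2:
        return 0.0  # degenerate table: 0 by convention in this sketch
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * math.sqrt((r - 1) * (c - 1))))


def merge_rows(table, i, j):
    """Return a new table in which rows i and j are summed into one row."""
    merged = [a + b for a, b in zip(table[i], table[j])]
    return [row for k, row in enumerate(table) if k not in (i, j)] + [merged]


# Hypothetical 3 x 2 table: the first two rows have similar profiles.
table = [[10, 0], [9, 1], [0, 10]]
before = tschuprow_t(table)                   # roughly 0.78
after = tschuprow_t(merge_rows(table, 0, 1))  # roughly 0.93
```

In this made-up table, grouping the two similar rows yields a smaller table with a higher T, which is the effect the slide describes.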

Extension
Contingency table
According to a specific association measure, can we find the optimal reduced contingency table?

Optimal solution (exhaustive search) Goal : Find the best cross partition on T
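The exhaustive search can be sketched as follows. This is only an illustration under stated assumptions: categories are treated as nominal (any grouping of rows or of columns is allowed), at least two groups are kept on each attribute, the criterion is Tschuprow's T, and all function names are hypothetical.

```python
import math


def tschuprow_t(table):
    """Tschuprow's T for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    r, c = len(row_totals), len(col_totals)
    if n == 0 or r < 2 or c < 2:
        return 0.0
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * math.sqrt((r - 1) * (c - 1))))


def set_partitions(items):
    """Yield every partition of `items` as a list of groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing group in turn...
        for k in range(len(partition)):
            yield partition[:k] + [[first] + partition[k]] + partition[k + 1:]
        # ...or open a new group containing only `first`.
        yield [[first]] + partition


def collapse(table, row_groups, col_groups):
    """Sum the cells of `table` according to the row and column groupings."""
    return [[sum(table[i][j] for i in rg for j in cg) for cg in col_groups]
            for rg in row_groups]


def best_cross_partition(table):
    """Try every pair (row partition, column partition) and keep the best T."""
    rows = list(range(len(table)))
    cols = list(range(len(table[0])))
    best = (-1.0, None, None)
    for row_groups in set_partitions(rows):
        if len(row_groups) < 2:
            continue  # assumption: keep at least two row groups
        for col_groups in set_partitions(cols):
            if len(col_groups) < 2:
                continue  # assumption: keep at least two column groups
            t = tschuprow_t(collapse(table, row_groups, col_groups))
            if t > best[0]:
                best = (t, row_groups, col_groups)
    return best


# Hypothetical table, not the example from the slides.
t, row_groups, col_groups = best_cross_partition([[10, 0], [9, 1], [0, 10]])
```

Enumerating all partitions is only feasible for very small tables, which is why the sketch is kept to a 3 × 2 example.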

According to a specific association measure, can we find the optimal reduced contingency table?
Yes, but the solution is intractable in practice because of its high time complexity
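To get a feel for the growth of the search space, the sketch below counts candidate cross partitions under one assumption that the slides do not state: categories are nominal, so any set partition of the rows and of the columns is admissible. The count is then the product of two Bell numbers. The names `bell_numbers` and `cross_partition_count` are hypothetical.

```python
def bell_numbers(limit):
    """Return [B_0, ..., B_limit], computed with the Bell triangle."""
    bells = [1]
    row = [1]
    for _ in range(limit):
        new_row = [row[-1]]  # each row starts with the last entry of the previous row
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells


def cross_partition_count(n_rows, n_cols):
    """Number of (row partition, column partition) pairs for nominal categories."""
    bells = bell_numbers(max(n_rows, n_cols))
    return bells[n_rows] * bells[n_cols]


small = cross_partition_count(6, 6)    # 203 * 203 = 41209
large = cross_partition_count(10, 10)  # 115975 * 115975
```

Under this assumption a 6 × 6 table still has a manageable number of candidates, while a 10 × 10 table already has more than ten billion.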

Heuristic
Successively group the 2 values (rows or columns) whose merging maximizes the increase in the association criterion.
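The greedy procedure described above could look like the following sketch. Several points are assumptions rather than details from the slides: the criterion is Tschuprow's T, the loop stops when no single merge increases it, each attribute keeps at least two categories, and every function name is hypothetical.

```python
import math
from itertools import combinations


def tschuprow_t(table):
    """Tschuprow's T for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    r, c = len(row_totals), len(col_totals)
    if n == 0 or r < 2 or c < 2:
        return 0.0
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * math.sqrt((r - 1) * (c - 1))))


def transpose(table):
    return [list(col) for col in zip(*table)]


def merge_pair(items, a, b, combine):
    """Drop items a and b and append their combination at the end."""
    kept = [x for k, x in enumerate(items) if k not in (a, b)]
    return kept + [combine(items[a], items[b])]


def add_vectors(u, v):
    return [x + y for x, y in zip(u, v)]


def greedy_grouping(table):
    """Repeatedly apply the single row or column merge that raises T the most."""
    current = [list(row) for row in table]
    row_groups = [[i] for i in range(len(current))]
    col_groups = [[j] for j in range(len(current[0]))]
    best_t = tschuprow_t(current)
    while True:
        best = None  # (t, axis, a, b, resulting table)
        for axis in ("row", "col"):
            view = current if axis == "row" else transpose(current)
            if len(view) <= 2:
                continue  # assumption: keep at least two categories
            for a, b in combinations(range(len(view)), 2):
                merged = merge_pair(view, a, b, add_vectors)
                candidate = merged if axis == "row" else transpose(merged)
                t = tschuprow_t(candidate)
                if t > (best[0] if best else best_t):
                    best = (t, axis, a, b, candidate)
        if best is None:
            break  # no merge improves the criterion
        best_t, axis, a, b, current = best
        if axis == "row":
            row_groups = merge_pair(row_groups, a, b, lambda g, h: g + h)
        else:
            col_groups = merge_pair(col_groups, a, b, lambda g, h: g + h)
    return best_t, row_groups, col_groups


# Hypothetical table, not the example from the slides.
start = [[10, 0], [9, 1], [0, 10]]
t, row_groups, col_groups = greedy_grouping(start)
```

Each pass examines only pairs of rows and pairs of columns, so the work per step is polynomial in the table size, at the price of possibly missing the true optimum.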

Complexity

Simulation
Goal: how far is the quasi-optimal solution from the true optimum?
Comparison tractable for tables not greater than 6 × 6.
Simulation design
–Randomly generate 200 tables
–Analyze the distribution of the deviations between optima and quasi-optima
Generating the tables
–Cases distributed among the c × r cells of the table with a uniform distribution (worst case)
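The table-generation step might be sketched as below. The number of cases per table is not given in this transcript, so `n_cases=1000` is a placeholder, and the 6 × 6 size, the seed and the function name `random_table` are likewise assumptions for illustration only.

```python
import random


def random_table(n_rows, n_cols, n_cases, rng):
    """Drop n_cases cases into the n_rows x n_cols cells, each cell equally likely."""
    table = [[0] * n_cols for _ in range(n_rows)]
    for _ in range(n_cases):
        cell = rng.randrange(n_rows * n_cols)
        table[cell // n_cols][cell % n_cols] += 1
    return table


rng = random.Random(0)  # fixed seed so the run is reproducible
# 200 tables as in the simulation design; size and case count are placeholders.
tables = [random_table(6, 6, n_cases=1000, rng=rng) for _ in range(200)]
```

With uniformly distributed cases the rows and columns are close to independent, so no grouping stands out clearly, which is presumably why the slide calls this the worst case.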

Quasi-optimal solution

Conclusion
Implementation in a new approach to decision tree induction.
–Zighed, D.A., Ritschard, G., Erray, W. and Scuturici, V.-M. (2003), Abogodaï, a New Approach for Decision Trees, in Lavrac, N., Gamberger, D., Todorovski, L. and Blockeel, H. (eds), Knowledge Discovery in Databases: PKDD 2003, LNAI 2838, Berlin: Springer.
–Zighed, D.A., Ritschard, G., Erray, W. and Scuturici, V.-M. (2003), Decision tree with optimal join partitioning, to appear in Journal of Information Intelligent Systems, Kluwer (2004).
Divisive top-down approach
Extension to the multidimensional case