Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Definition Data mining is the exploration and analysis of large quantities of data.

Slides:



Advertisements
Similar presentations
Association Rules Evgueni Smirnov.
Advertisements

Data Mining: What? WHY? HOW?
Association Rule and Sequential Pattern Mining for Episode Extraction Jonathan Yip.
Association Rules Spring Data Mining: What is it?  Two definitions:  The first one, classic and well-known, says that data mining is the nontrivial.
IT 433 Data Warehousing and Data Mining Association Rules Assist.Prof.Songül Albayrak Yıldız Technical University Computer Engineering Department
Nadia Andreani Dwiyono DESIGN AND MAKE OF DATA MINING MARKET BASKET ANALYSIS APLICATION AT DE JOGLO RESTAURANT.
Data Mining Sangeeta Devadiga CS 157B, Spring 2007.
Data Mining 198:541. Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Definition Data mining is the exploration and analysis of large.
ICS 421 Spring 2010 Data Mining 1 Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 4/6/20101Lipyeow Lim.
1 ACCTG 6910 Building Enterprise & Business Intelligence Systems (e.bis) Association Rule Mining Olivia R. Liu Sheng, Ph.D. Emma Eccles Jones Presidential.
Data Mining Adrian Tuhtan CS157A Section1.
Lecture14: Association Rules
Mining Association Rules
Chapter Extension 12 Database Marketing.
Data Mining – Intro.
CS157A Spring 05 Data Mining Professor Sin-Min Lee.
Data mining By Aung Oo.
Lesson Outline Introduction: Data Flood
TURKISH STATISTICAL INSTITUTE INFORMATION TECHNOLOGIES DEPARTMENT (Muscat, Oman) DATA MINING.
Enterprise systems infrastructure and architecture DT211 4
1 Data Mining, Database Tuning Tuesday, Feb. 27, 2007.
Dr. Awad Khalil Computer Science Department AUC
Knowledge Discovery & Data Mining process of extracting previously unknown, valid, and actionable (understandable) information from large databases Data.
Chapter 5: Data Mining for Business Intelligence
Data Mining Dr. Chang Liu. What is Data Mining Data mining has been known by many different terms Data mining has been known by many different terms Knowledge.
Kansas State University Department of Computing and Information Sciences CIS 830: Advanced Topics in Artificial Intelligence From Data Mining To Knowledge.
Data Mining An Introduction.
Tang: Introduction to Data Mining (with modification by Ch. Eick) I: Introduction to Data Mining A.Short Preview 1.Initial Definition of Data Mining 2.Motivation.
Forecast Anything! The Seven Data Mining Models Andy Cheung ISV Developer Evangelist Microsoft Hong Kong.
3 Objects (Views Synonyms Sequences) 4 PL/SQL blocks 5 Procedures Triggers 6 Enhanced SQL programming 7 SQL &.NET applications 8 OEM DB structure 9 DB.
Data Mining CS157B Fall 04 Professor Lee By Yanhua Xue.
1 1 Slide Introduction to Data Mining and Business Intelligence.
Knowledge Discovery and Data Mining Evgueni Smirnov.
Data Mining By Fu-Chun (Tracy) Juang. What is Data Mining? ► The process of analyzing LARGE databases to find useful patterns. ► Attempts to discover.
Association Rule By Kenneth Leung. Data Mining The process of extracting valid, previously unknown, comprehensible, and actionable information from large.
Knowledge Discovery and Data Mining Evgueni Smirnov.
Data Mining – Intro. Course Overview Spatial Databases Temporal and Spatio-Temporal Databases Multimedia Databases Data Mining.
Introduction of Data Mining and Association Rules cs157 Spring 2009 Instructor: Dr. Sin-Min Lee Student: Dongyi Jia.
Advanced Database Course (ESED5204) Eng. Hanan Alyazji University of Palestine Software Engineering Department.
CRM - Data mining Perspective. Predicting Who will Buy Here are five primary issues that organizations need to address to satisfy demanding consumers:
Instructor : Prof. Marina Gavrilova. Goal Goal of this presentation is to discuss in detail how data mining methods are used in market analysis.
Association Rule Mining Data Mining and Knowledge Discovery Prof. Carolina Ruiz and Weiyang Lin Department of Computer Science Worcester Polytechnic Institute.
Data Warehousing Lecture-30 What can Data Mining do? Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics Research.
Association Rule Mining
DATA MINING By Cecilia Parng CS 157B.
CURE Clustering Using Representatives Handles outliers well. Hierarchical, partition First a constant number of points c, are chosen from each cluster.
MIS2502: Data Analytics Advanced Analytics - Introduction.
Academic Year 2014 Spring Academic Year 2014 Spring.
Data Mining Copyright KEYSOFT Solutions.
Customer Relationship Management (CRM) Chapter 4 Customer Portfolio Analysis Learning Objectives Why customer portfolio analysis is necessary for CRM implementation.
Chap 6: Association Rules. Rule Rules!  Motivation ~ recent progress in data mining + warehousing have made it possible to collect HUGE amount of data.
DATA MINING It is a process of extracting interesting(non trivial, implicit, previously, unknown and useful ) information from any data repository. The.
Chapter 26: Data Mining Prepared by Assoc. Professor Bela Stantic.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 28 Data Mining Concepts.
Introduction.  Instructor: Cengiz Örencik   Course materials:  myweb.sabanciuniv.edu/cengizo/courses.
Introduction to Data Mining Mining Association Rules Reference: Tan et al: Introduction to data mining. Some slides are adopted from Tan et al.
Chapter 3 Building Business Intelligence Chapter 3 DATABASES AND DATA WAREHOUSES Building Business Intelligence 6/22/2016 1Management Information Systems.
Data Mining – Introduction (contd…) Compiled By: Umair Yaqub Lecturer Govt. Murray College Sialkot.
Data Mining – Intro.
Data Mining Motivation: “Necessity is the Mother of Invention”
MIS2502: Data Analytics Advanced Analytics - Introduction
A Research Oriented Study Report By :- Akash Saxena
Association Rules Repoussis Panagiotis.
Adrian Tuhtan CS157A Section1
Sangeeta Devadiga CS 157B, Spring 2007
Data Mining Association Rules Assoc.Prof.Songül Varlı Albayrak
כריית נתונים.
Kenneth C. Laudon & Jane P. Laudon
Welcome! Knowledge Discovery and Data Mining
Fast Algorithms for Mining Association Rules
Presentation transcript:

Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Definition Data mining is the exploration and analysis of large quantities of data in order to discover valid, novel, potentially useful, and ultimately understandable patterns in data. Example pattern (Census Bureau Data): If (relationship = husband), then (gender = male). 99.6%

Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Definition (Cont.) Data mining is the exploration and analysis of large quantities of data in order to discover valid, novel, potentially useful, and ultimately understandable patterns in data. Valid: The patterns hold in general. Novel: We did not know the pattern beforehand. Useful: We can devise actions from the patterns. Understandable: We can interpret and comprehend the patterns.

Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Advanced Scout Example pattern: An analysis of the data from a game played between the New York Knicks and the Charlotte Hornets revealed that “ When Glenn Rice played the shooting guard position, he shot 5/6 (83%) on jump shots." Pattern is interesting: The average shooting percentage for the Charlotte Hornets during that game was 54%.

Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Example Application: Sky Survey Input data: 3 TB of image data with 2 billion sky objects, took more than six years to complete Goal: Generate a catalog with all objects and their type Method: Use decision trees as data mining model Results: 94% accuracy in predicting sky object classes Increased number of faint objects classified by 300% Helped team of astronomers to discover 16 new high red-shift quasars in one order of magnitude less observation time

Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Gold Nuggets? Investment firm mailing list: Discovered that old people do not respond to IRA mailings Bank clustered their customers. One cluster: Older customers, no mortgage, less likely to have a credit card “ Bank of 1911 ” Customer churn example

Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Market Basket Analysis Consider shopping cart filled with several items Market basket analysis tries to answer the following questions: Who makes purchases? What do customers buy together? In what order do customers purchase items?

Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Market Basket Analysis Given: A database of customer transactions Each transaction is a set of items Example: Transaction with TID 111 contains items {Pen, Ink, Milk, Juice}

Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Market Basket Analysis (Contd.) Coocurrences 80% of all customers purchase items X, Y and Z together. Association rules 60% of all customers who purchase X and Y also buy Z. Sequential patterns 60% of customers who first buy X also purchase Y within three weeks.

Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Confidence and Support We prune the set of all possible association rules using two interestingness measures: Confidence of a rule: X  Y has confidence c if P(Y|X) = c Support of a rule: X  Y has support s if P(XY) = s We can also define Support of an itemset (a coocurrence) XY: XY has support s if P(XY) = s

Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Example Examples: {Pen} => {Milk} Support: 75% Confidence: 75% {Ink} => {Pen} Support: 100% Confidence: 100%

Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Example Find all itemsets with support >= 75%?

Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Example Can you find all association rules with support >= 50%?

Ramakrishnan and Gehrke. Database Management Systems, 3 rd Edition. Market Basket Analysis: Applications Sample Applications Direct marketing Fraud detection for medical insurance Floor/shelf planning Web site layout Cross-selling