Frequent patterns and Association Rules


CSCI N317 Computation for Scientific Applications
Unit 3 - 3 Weka: Frequent patterns and Association Rules

Frequent Pattern Analysis
- Frequent pattern: a pattern (a set of items, subsequences, etc.) that occurs frequently in a data set
- Motivation: finding inherent regularities in data
  - What products are often purchased together? Beer and diapers?!
  - What are the subsequent purchases after buying a PC?
  - What kinds of DNA are sensitive to this new drug?
- Market basket analysis: analyze customers' buying habits by finding associations between the different items placed in their "shopping baskets"
  - Helps design store layouts
    - Items frequently purchased together can be placed together to further encourage the sale of both items
    - Alternatively, placing such items at opposite ends of the store may entice customers to pick up other items along the way
  - Helps retailers plan which items to put on sale at a reduced price

Association Rules
- Patterns can be represented in the form of association rules
- E.g. computer => antivirus_software [support = 2%, confidence = 60%]
  - Support: 2% of all the transactions under analysis show that a computer and antivirus software are purchased together
  - Confidence: 60% of the customers who purchased a computer also bought the software

Basic Concepts: Frequent Patterns and Association Rules

TID   Items bought
10    I1, I2, I4
20    I1, I3, I4
30    I1, I4, I5
40    I2, I5, I6
50    I2, I3, I4, I5, I6

- Itemset I = {I1, ..., Im}
- Find all the rules A => B with minimum support and confidence
  - support, s: probability that a transaction contains A ∪ B (that is, every item in A and every item in B)
  - confidence, c: conditional probability that a transaction having A also contains B
- [Figure: Venn diagram of customers who buy beer, customers who buy diapers, and customers who buy both]
- Let supmin = 50%, confmin = 50%
- Association rules:
  - I1 => I4 (60%, 100%)
  - I4 => I1 (60%, 75%)
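The support and confidence figures for the two rules can be checked by hand. The sketch below is only an illustration, not part of the course material: the helper names `support_count`, `support`, and `confidence` are made up for this example, and the transactions are copied from the table above.

```python
# Transactions from the table above, keyed by TID.
transactions = {
    10: {"I1", "I2", "I4"},
    20: {"I1", "I3", "I4"},
    30: {"I1", "I4", "I5"},
    40: {"I2", "I5", "I6"},
    50: {"I2", "I3", "I4", "I5", "I6"},
}


def support_count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for items in transactions.values() if itemset <= items)


def support(itemset):
    """Fraction of all transactions that contain `itemset`."""
    return support_count(itemset) / len(transactions)


def confidence(antecedent, consequent):
    """Fraction of transactions containing the antecedent that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support_count(both) / support_count(antecedent)


print(support({"I1", "I4"}))       # 0.6  (3 of 5 transactions)
print(confidence({"I1"}, {"I4"}))  # 1.0  (3 of the 3 transactions with I1)
print(confidence({"I4"}, {"I1"}))  # 0.75 (3 of the 4 transactions with I4)
```

Both rules share the same support because support depends only on the combined itemset {I1, I4}; the confidences differ because I4 appears in more transactions than I1.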

Basic Concepts: Frequent Patterns and Association Rules
- Itemset I = {I1, ..., Im}
- Let A and B be sets of items
- An association rule is an implication of the form A => B, where A ⊂ I, B ⊂ I, and A ∩ B = ∅
- Support(A => B) = P(A ∪ B)
- Confidence(A => B) = P(B | A) = support_count(A ∪ B) / support_count(A)
- Rules that satisfy both a minimum support threshold and a minimum confidence threshold are called strong
- The occurrence frequency of an itemset is the number of transactions that contain the itemset; it is also called the frequency, support count, or count
- If the support of an itemset satisfies the minimum support threshold, the itemset is a frequent itemset
- Looking for rules with high support
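To make the definitions of frequent itemsets and strong rules concrete, here is a minimal brute-force sketch. It enumerates every candidate itemset, which is only practical for tiny examples; it is not Apriori or the algorithm used by Weka or arules. The five transactions and the 50% thresholds are taken from the earlier slide; all variable and function names are assumptions made for this illustration.

```python
from itertools import combinations

# Five sample transactions and the 50% thresholds from the earlier slide.
transactions = [
    {"I1", "I2", "I4"},
    {"I1", "I3", "I4"},
    {"I1", "I4", "I5"},
    {"I2", "I5", "I6"},
    {"I2", "I3", "I4", "I5", "I6"},
]
MIN_SUPPORT = 0.5
MIN_CONFIDENCE = 0.5

n = len(transactions)
items = sorted(set().union(*transactions))


def support_count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


# Step 1: keep every itemset whose support meets the minimum support threshold.
frequent = {}
for size in range(1, len(items) + 1):
    for combo in combinations(items, size):
        count = support_count(frozenset(combo))
        if count / n >= MIN_SUPPORT:
            frequent[frozenset(combo)] = count

# Step 2: split each frequent itemset into A => B and keep the strong rules.
strong_rules = []
for itemset, count in frequent.items():
    for size in range(1, len(itemset)):
        for lhs in combinations(sorted(itemset), size):
            lhs = frozenset(lhs)
            rhs = itemset - lhs
            conf = count / support_count(lhs)
            if conf >= MIN_CONFIDENCE:
                strong_rules.append((sorted(lhs), sorted(rhs), count / n, conf))

for lhs, rhs, sup, conf in strong_rules:
    print(f"{lhs} => {rhs}  support={sup:.0%}  confidence={conf:.0%}")
```

With these thresholds the only frequent itemset of size two is {I1, I4}, so the sketch reports the same two strong rules as the slide: I1 => I4 and I4 => I1.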

Interestingness Measure: Correlations (Lift)
- Strong rules are not necessarily interesting
- E.g. of 10,000 transactions, 6,000 include computer games, 7,500 include videos, and 4,000 include both computer games and videos
- Association rule generated: buys(X, "computer games") => buys(X, "videos") [support = 40%, confidence = 66%]
- The rule is misleading because the probability of purchasing videos is 75%, which is even larger than 66%
- In fact, computer games and videos are negatively associated, because the purchase of one decreases the likelihood of purchasing the other
- A correlation measure can be used to augment the support-confidence framework: A => B [support, confidence, correlation]

Interestingness Measure: Correlations (Lift)
- Measure of dependent/correlated events: lift(A, B) = P(A ∪ B) / (P(A) P(B))
  - P(A ∪ B): the likelihood of purchasing both
  - P(A) P(B): the likelihood of purchasing both if the two purchases are independent
- lift > 1: A and B are positively correlated
- lift < 1: A and B are negatively correlated
- lift = 1: A and B are independent
- Looking for a rule with lift > 1

Interestingness Measure: Correlations (Lift)

Contingency table:

            Computer game   No computer game   Total
Video       4,000           3,500              7,500
No video    2,000           500                2,500
Total       6,000           4,000              10,000

- lift = 0.40 / (0.60 × 0.75) ≈ 0.89, which is smaller than 1
- The two items are negatively correlated, thus the rule is not interesting
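The lift values can be computed directly from the contingency counts. The sketch below is an illustration only: the dictionary layout and the `lift` helper are assumptions made for this example, and the four cell counts come from the computer game and video example.

```python
# Cell counts from the contingency table, keyed by (game column, video row).
counts = {
    ("game", "video"): 4000,
    ("no game", "video"): 3500,
    ("game", "no video"): 2000,
    ("no game", "no video"): 500,
}
total = sum(counts.values())  # 10,000 transactions


def lift(game_value, video_value):
    """lift = P(A and B) / (P(A) * P(B)), computed from raw counts."""
    joint = counts[(game_value, video_value)]
    game_total = sum(c for (g, _), c in counts.items() if g == game_value)
    video_total = sum(c for (_, v), c in counts.items() if v == video_value)
    # Multiplying numerator and denominator by total**2 avoids intermediate fractions.
    return joint * total / (game_total * video_total)


print(round(lift("game", "video"), 2))     # 0.89
print(round(lift("game", "no video"), 2))  # 1.33
```

The second value shows the other side of the negative correlation: buying a computer game is positively associated with not buying a video.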

Association Rule Mining in R and Weka
- https://www.youtube.com/watch?v=Z4VZsF96QfU
- https://www.youtube.com/watch?v=4J3gX4ySw1s
- R:
  - https://www.youtube.com/watch?v=b5hgDPa7a2k
  - https://www.youtube.com/watch?v=Gy_nqzJMNrI
- You need to install the "arules" package to follow the examples in the videos. You may need to install the package from a local zip file.
- Sample code: dataMiningExample.R