Market Basket Analysis 포항공대 산업공학과 PASTA Lab. 석사과정 20012557 신원영.

Slides:



Advertisements
Similar presentations
Association Rule Mining
Advertisements

Data Mining Techniques Association Rule
DATA MINING Association Rule Discovery. AR Definition aka Affinity Grouping Common example: Discovery of which items are frequently sold together at a.
Association Rules Spring Data Mining: What is it?  Two definitions:  The first one, classic and well-known, says that data mining is the nontrivial.
LOGO Association Rule Lecturer: Dr. Bo Yuan
Pertemuan XIV FUNGSI MAYOR Assosiation. What Is Association Mining? Association rule mining: –Finding frequent patterns, associations, correlations, or.
Association Rule Mining. 2 The Task Two ways of defining the task General –Input: A collection of instances –Output: rules to predict the values of any.
MIS2502: Data Analytics Association Rule Mining. Uses What products are bought together? Amazon’s recommendation engine Telephone calling patterns Association.
Association Analysis. Association Rule Mining: Definition Given a set of records each of which contain some number of items from a given collection; –Produce.
Data Mining Techniques So Far: Cluster analysis K-means Classification Decision Trees J48 (C4.5) Rule-based classification JRIP (RIPPER) Logistic Regression.
Data Mining Association Analysis: Basic Concepts and Algorithms Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach, Kumar Introduction.
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Data Mining, Frequent-Itemset Mining
Data Mining Association Analysis: Basic Concepts and Algorithms
Data Mining Association Analysis: Basic Concepts and Algorithms
Asssociation Rules Prof. Sin-Min Lee Department of Computer Science.
Data Mining, Frequent-Itemset Mining. Data Mining Some mining problems Find frequent itemsets in "market-basket" data – "50% of the people who buy hot.
Association Rule Mining Part 1 Introduction to Data Mining with Case Studies Author: G. K. Gupta Prentice Hall India, 2006.
Fast Algorithms for Association Rule Mining
IBM SPSS Modeler 14.2 Data Mining Concepts Introduction to Undirected Data Mining: Association Analysis Prepared by David Douglas, University of ArkansasHosted.
MIS 451 Building Business Intelligence Systems Association Rule Mining (1)
Chapter 13 – Association Rules
Association Rules. 2 Customer buying habits by finding associations and correlations between the different items that customers place in their “shopping.
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining By Tan, Steinbach, Kumar Lecture.
Modul 7: Association Analysis. 2 Association Rule Mining  Given a set of transactions, find rules that will predict the occurrence of an item based on.
Association Rule By Kenneth Leung. Data Mining The process of extracting valid, previously unknown, comprehensible, and actionable information from large.
ASSOCIATION RULE DISCOVERY (MARKET BASKET-ANALYSIS) MIS2502 Data Analytics Adapted from Tan, Steinbach, and Kumar (2004). Introduction to Data Mining.
CIS 600: Master's Project Online Trading and Data Mining- Based Marketing of IT Books Supervisor : Dr. Haiping Xu Student : Tsung-Ta Tu Student ID :
Supermarket shelf management – Market-basket model:  Goal: Identify items that are bought together by sufficiently many customers  Approach: Process.
EXAM REVIEW MIS2502 Data Analytics. Exam What Tool to Use? Evaluating Decision Trees Association Rules Clustering.
CS 8751 ML & KDDSupport Vector Machines1 Mining Association Rules KDD from a DBMS point of view –The importance of efficiency Market basket analysis Association.
1 What is Association Analysis: l Association analysis uses a set of transactions to discover rules that indicate the likely occurrence of an item based.
Frequent-Itemset Mining. Market-Basket Model A large set of items, e.g., things sold in a supermarket. A large set of baskets, each of which is a small.
Association Rule Mining
DATA MINING By Cecilia Parng CS 157B.
ASSOCIATION RULES (MARKET BASKET-ANALYSIS) MIS2502 Data Analytics Adapted from Tan, Steinbach, and Kumar (2004). Introduction to Data Mining.
1 Knowledge discovery & data mining Association rules and market basket analysis --introduction UCLA CS240A Course Notes* __________________________ *
Association Rules presented by Zbigniew W. Ras *,#) *) University of North Carolina – Charlotte #) ICS, Polish Academy of Sciences.
Data Mining  Association Rule  Classification  Clustering.
Chapter 8 Association Rules. Data Warehouse and Data Mining Chapter 10 2 Content Association rule mining Mining single-dimensional Boolean association.
Chap 6: Association Rules. Rule Rules!  Motivation ~ recent progress in data mining + warehousing have made it possible to collect HUGE amount of data.
Data Mining Association Rules Mining Frequent Itemset Mining Support and Confidence Apriori Approach.
Elective-I Examination Scheme- In semester Assessment: 30 End semester Assessment :70 Text Books: Data Mining Concepts and Techniques- Micheline Kamber.
Jerry Post Copyright © Database Management Systems: Data Mining Market Baskets Association Rules.
1 Data Mining Lecture 6: Association Analysis. 2 Association Rule Mining l Given a set of transactions, find rules that will predict the occurrence of.
MIS2502: Data Analytics Association Rule Mining David Schuff
Chapter 13 – Association Rules DM for Business Intelligence.
MIS2502: Data Analytics Association Rule Mining Jeremy Shafer
Data Mining Association Analysis: Basic Concepts and Algorithms
Knowledge discovery & data mining Association rules and market basket analysis--introduction UCLA CS240A Course Notes*
Frequent Pattern Mining
Association Rules.
I. Association Market Basket Analysis.
Data Mining Association Analysis: Basic Concepts and Algorithms
Association Rule Mining
Data Mining Association Analysis: Basic Concepts and Algorithms
Data Mining Association Rules Assoc.Prof.Songül Varlı Albayrak
Transactional data Algorithm Applications
Data Mining Association Analysis: Basic Concepts and Algorithms
Analysis of Customer Behavior and Service Modeling
Frequent patterns and Association Rules
MIS2502: Data Analytics Association Rule Mining
Market Basket Analysis and Association Rules
MIS2502: Data Analytics Association Rule Mining
I. Association Market Basket Analysis.
Lecture 11 (Market Basket Analysis)
Association Rues Analysis .Event A -> Event ?
MIS2502: Data Analytics Association Rule Learning
Chapter 14 – Association Rules
Association Analysis: Basic Concepts
Presentation transcript:

Market Basket Analysis 포항공대 산업공학과 PASTA Lab. 석사과정 신원영

What is Market Basket Analysis? Finding useful information in ‘ Market Basket ’ Useful information – Who customers are – Which products tend to be purchased together – Why some products tend to be purchased together Useful information like “ If Item A then Item B ” is called ‘ Association rule ’

Association Rules (1) The clear and useful result of market basket analysis Only one result in association rules are strongly recommended – “ If diapers and Thursday,then beer ” is more useful than “ If Thursday,then diapers and beer ” Three Rules – The trivial, the useful, and the inexplicable rules – The trivial rule is already known and simple like “ If paint, then paint brushes ”

Association Rules (2) - The Useful Rule Contains high quality,actionable information Once the patter(rule) is found, it is often not hard to justify – Example ; Rules like “ On Thursday, Customer who purchase diapers are likely to purchase beer ”  Young couples prepare for the weekend by stocking up on diapers for the infants and beer for dad Can be applied to Market layout – Same Ex.) Placing other baby products within sight of the beer

Association Rules (3) - The Inexplicable rules Inexplicable rules give a new fact but no information about consumer behavior and future actions. – Example) “ When a new hardware store opens, one of the most commonly sold items is toilet toilet rings ” Inexplicable rules can be flukes More investigation might give explanation

The Basic Process  Choosing the right set of items (by using taxonomies and virtual items)  Generating rules (by deciphering the counts in the co-occurrence matrix)  Overcoming the practical limits (imposed by thousands or tens or tens of thousands of items appearing in combinations)

Taxonomies - Choosing the right set of items (1) Why? – To reduce too many item combinations – When the items occur in about the same number of times in the data, the analysis produce the best results How to find the appropriate level? – Consider frequency (roll up rare items to higher levels to help to generalize items) – Consider importance (roll down expensive item to lower levels)

Virtual Items - Choosing the right set of items (2) Items that cross product boundaries – So not to appear in the product taxonomy – Possible to include information about transactions themselves(ex.seasonal information) There is danger of using Virtual Items – Redundant Rules; If Coke in summer, then Coke – If diet soda and coke then pizza  If diet coke then pizza

Generating rules (1)  Gathering transactions for selected items ( including virtual items)  Making Co-Occurrence Matrix  Finding the most frequent combination from the matrix  Making a distinction between ‘ condition ’ and ‘ result ’ from that combination – the two keywords ; confidence, improvment

Generating rules (2) Example of transactions CustomerItems 1Orange juice, soda 2Milk, Orange juice, window Cl 3Orange juice, detergent 4Orange juice, detergent,soda 5Window Cl, soda

Generating rules (3) Example of Co-Occurrence Matrix OJWindow Cl.MilkSodaDetergent OJ4 (0.8) 1 (0.2) 1 (0.2) 2 (0.4) 1 (0.2) Window Cl1 (0.2) 2 (0.4) 1 (0.2) 1 (0.2) 0 Milk1 (0.2) 1 (0.2) 1 (0.2) 00 Soda2 (0.4) 1 (0.2) 03 (0.6) 1 (0.2) Detergent1 (0.2) 001 (0.2) 2 (0.4)

Generating rules (4) Assume most common combination ‘ A,B,C ’ CombinationProbabilityCombinationProbability A45%A and B25% B42.5%A and C20% C40%B and C15% A and B and C5%

Generating rules (5) Which is Result between A,B,C? Setting a result on the basis of ‘ confidence ’ Confidence of rule “ If condition then result ” – P(Result|Condition) = P(Result and Condition)/P(Condition) Confidence of rule “ If AB then C ” = P(ABC|AB) Association Rule P(condition)P(condition and result) confidence If AB then C 25%5%0.20 If AC then B 20%5%0.25 If BC then A 15%5%0.33

Generating rules (6) What if P(R) > P(R|C) (= Confidence) ? ‘ Improvement ’ tells how much better a rule is than just prediction the result Improvement = P(R|C) / P(R) = P(RC)/P(R)P(C) – If Improvement > 1  the rule is better – If Improvement < 1  “ If C,then NOT R ” is better RuleconfidenceImprovement If BC then A If BC then Not A If A then B

Overcoming the practical limits Exponential growth as problem size increases – On menu with 100 items,how many combinations are there?  161,700 (3 items ) How to solve the problem of Big Data? – To use the taxonomy ; generalize items that can meet criterion – To use Pruning ; throw out item or combination of items that do not meet criterion. “ Minimum support pruning ” is the most common pruning mechanism

Strengths of Market Basket Analysis It produces clear and understandable results – because the results are association rules It supports both directed and undirected data mining It can handle transactions themselves The computations it uses are simple to understand – The calculation of confidence,and improvement is simple

Weaknesses of Market Basket Analysis It requires exponentially more computational effort as the problem size grows – number of items,complexity of the rules It has limited support for data attribute – Virtual items make rules more expressive It is difficult to determine the right number of items It discounts rare items

Application of Market Basket Analysis It can suggest store layouts It can tell which products are amenable to promotion It is used to compare stores It can be applied to time-series problems