Published by Ernest Haswell. Modified about 1 year ago.

1
Data Mining Techniques Association Rule

2
What Is Association Mining?
Association Rule Mining
- Finding frequent patterns, associations, correlations, or causal structures among item sets in transaction databases, relational databases, and other information repositories
Applications
- Market basket analysis (marketing strategy: items to put on sale at reduced prices), cross-marketing, catalog design, shelf space layout design, etc.
Examples
- Rule form: Body ⇒ Head [Support, Confidence]
- buys(x, "Computer") ⇒ buys(x, "Software") [2%, 60%]
- major(x, "CS") ^ takes(x, "DB") ⇒ grade(x, "A") [1%, 75%]

3
Market Basket Analysis
Typically, association rules are considered interesting if they satisfy both a minimum support threshold and a minimum confidence threshold.

4
Rule Measures: Support and Confidence
With a minimum support of 50% and a minimum confidence of 50%, we have
- A ⇒ C [50%, 66.6%]
- C ⇒ A [50%, 100%]

5
Support & Confidence

6
Association Rule: Basic Concepts
Given
- (1) a database of transactions
- (2) each transaction is a list of items (purchased by a customer in one visit)
Find all rules that correlate the presence of one set of items with that of another set of items
Find all the rules A ⇒ B with minimum confidence and support
- support, s: P(A ∪ B), the probability that a transaction contains all items of both A and B
- confidence, c: P(B|A), the conditional probability that a transaction containing A also contains B
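The two measures can be sketched in a few lines of Python. The four-transaction database below is hypothetical (the table from the slides is not part of this transcript); it was chosen so that the results agree with the figures quoted on the rule measures slide, A ⇒ C [50%, 66.6%] and C ⇒ A [50%, 100%]. The function names `support` and `confidence` are illustrative only.

```python
from fractions import Fraction

# Hypothetical database (an assumption, not taken from the slides).
transactions = [
    {"A", "B", "C"},
    {"A", "C"},
    {"A", "D"},
    {"B", "E", "F"},
]

def support(itemset, db):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in db if itemset <= t)
    return Fraction(hits, len(db))

def confidence(body, head, db):
    """Estimate P(head | body) as support(body + head) / support(body).

    Raises ZeroDivisionError if no transaction contains `body`.
    """
    return support(set(body) | set(head), db) / support(body, db)

print(support({"A", "C"}, transactions))       # 1/2
print(confidence({"A"}, {"C"}, transactions))  # 2/3
print(confidence({"C"}, {"A"}, transactions))  # 1
```

Exact fractions are used instead of floats so that 2/3 is not rounded; a rule is then kept when both values reach the chosen thresholds.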

7
Terminologies
Item
- I1, I2, I3, …
- A, B, C, …
Itemset
- {I1}, {I1, I7}, {I2, I3, I5}, …
- {A}, {A, G}, {B, C, E}, …
1-Itemset
- {I1}, {I2}, {A}, …
2-Itemset
- {I1, I7}, {I3, I5}, {A, G}, …

8
Terminologies
K-Itemset
- An itemset that contains K items
Frequent (Large) K-Itemset
- A K-itemset that satisfies a minimum support threshold
Association Rule
- A rule that satisfies both a minimum support threshold and a minimum confidence threshold

9
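Analysis
The number of possible itemsets grows exponentially with the number of items: n distinct items give C(n, k) itemsets of cardinality k and 2^n - 1 non-empty itemsets in total.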
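As a rough illustration of this growth, the counts can be computed directly. This is a minimal sketch; the function names are hypothetical and the item counts (5 and 100) are arbitrary choices.

```python
from math import comb

def count_k_itemsets(n_items, k):
    """Number of distinct itemsets of cardinality k over n_items items."""
    return comb(n_items, k)

def count_all_itemsets(n_items):
    """Number of non-empty itemsets over n_items items."""
    return 2 ** n_items - 1

print(count_k_itemsets(5, 2))    # 10
print(count_all_itemsets(5))     # 31
print(count_k_itemsets(100, 2))  # 4950
print(count_k_itemsets(100, 3))  # 161700
```

Counting the support of every possible itemset is therefore impractical for realistic item catalogues, which motivates pruning the search space.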

10
Fast Algorithms for Mining Association Rules

11
Mining Association Rules: Apriori Principle
Min. support 50%, min. confidence 50%
For rule A ⇒ C:
- support = support({A, C}) = 50%
- confidence = support({A, C}) / support({A}) = 66.6%
The Apriori principle:
- Any subset of a frequent itemset must be frequent

12
Mining Frequent Itemsets: the Key Step
Find the frequent itemsets: the sets of items that have minimum support
- A subset of a frequent itemset must also be a frequent itemset, i.e., if {A, B} is a frequent itemset, both {A} and {B} must be frequent itemsets
- Iteratively find frequent itemsets with cardinality from 1 to k (k-itemsets)
Use the frequent itemsets to generate association rules
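The second step, turning frequent itemsets into rules, might look like the sketch below. It assumes the supports of all frequent itemsets are already available in a dictionary; because every subset of a frequent itemset is frequent, the support of each rule body can be looked up there. The input values are hypothetical and `generate_rules` is an illustrative name, not a function from any library.

```python
from fractions import Fraction
from itertools import combinations

def generate_rules(supports, min_conf):
    """Return (body, head, support, confidence) for every rule body => head
    whose confidence is at least min_conf.

    supports maps each frequent itemset (a frozenset) to its support.
    """
    rules = []
    for itemset, sup in supports.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty body and a non-empty head
        for r in range(1, len(itemset)):
            for body in combinations(sorted(itemset), r):
                body = frozenset(body)
                head = itemset - body
                conf = sup / supports[body]
                if conf >= min_conf:
                    rules.append((body, head, sup, conf))
    return rules

# Hypothetical frequent itemsets at a 50% minimum support (assumed values).
supports = {
    frozenset("A"): Fraction(3, 4),
    frozenset("B"): Fraction(1, 2),
    frozenset("C"): Fraction(1, 2),
    frozenset("AC"): Fraction(1, 2),
}

for body, head, sup, conf in generate_rules(supports, Fraction(1, 2)):
    print(sorted(body), "=>", sorted(head), sup, conf)
# ['A'] => ['C'] 1/2 2/3
# ['C'] => ['A'] 1/2 1
```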

13
Example
Database D (one transaction per line):
  1 3 4
  2 3 5
  1 2 3 5
  2 5
Scan D to count C1:
  itemset  count
  {1}      2
  {2}      3
  {3}      3
  {4}      1
  {5}      3
Generate L1: {1}, {2}, {3}, {5}
Generate C2: {1 2}, {1 3}, {1 5}, {2 3}, {2 5}, {3 5}
Scan D to count C2:
  itemset  count
  {1 2}    1
  {1 3}    2
  {1 5}    1
  {2 3}    2
  {2 5}    3
  {3 5}    2
Generate L2: {1 3}, {2 3}, {2 5}, {3 5}
Generate C3: {2 3 5}
Scan D to count C3:
  itemset  count
  {2 3 5}  2
Generate L3: {2 3 5}
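The level-wise procedure of this example can be reproduced with a short sketch. Two assumptions are made here: the minimum support count of 2 is inferred from the counts in the example (the transcript does not state the threshold), and candidates are formed by taking unions of frequent itemsets and then pruning, which is a simplification that yields the same candidate sets as a prefix-based self-join. The name `apriori` is illustrative.

```python
from itertools import combinations

def apriori(db, min_count):
    """Return [L1, L2, ...]; each level maps a frozenset itemset to its count."""
    # Scan the database once to count the 1-itemsets (C1).
    counts = {}
    for t in db:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}

    levels = []
    k = 1
    while current:
        levels.append(current)
        k += 1
        prev = set(current)
        # Generate Ck: unions of two frequent (k-1)-itemsets that have k items
        # and whose (k-1)-subsets are all frequent (the Apriori pruning step).
        candidates = set()
        for a, b in combinations(prev, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in prev for s in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Scan the database to count each candidate, then keep the frequent ones.
        counts = {c: sum(1 for t in db if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_count}
    return levels

D = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
levels = apriori(D, min_count=2)  # threshold of 2 is an inferred assumption

for k, level in enumerate(levels, 1):
    print(k, sorted((sorted(s), c) for s, c in level.items()))
# 1 [([1], 2), ([2], 3), ([3], 3), ([5], 3)]
# 2 [([1, 3], 2), ([2, 3], 2), ([2, 5], 3), ([3, 5], 2)]
# 3 [([2, 3, 5], 2)]
```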

14
Example of Generating Candidates
L3 = {abc, abd, acd, ace, bcd}
Self-joining: L3 * L3
- abcd from abc and abd
- acde from acd and ace
Pruning:
- acde is removed because ade is not in L3
C4 = {abcd}
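A minimal sketch of the self-join and pruning steps follows, assuming itemsets are kept as sorted tuples so that two itemsets are joined when they agree on everything except the last item. The function name `generate_candidates` is hypothetical.

```python
from itertools import combinations

def generate_candidates(prev_level):
    """Return (joined, kept): the itemsets produced by the self-join,
    and those that survive pruning."""
    prev = sorted(tuple(sorted(s)) for s in prev_level)
    if not prev:
        return [], []
    prev_set = set(prev)
    k = len(prev[0]) + 1

    # Self-join: combine two itemsets that share their first k-2 items.
    joined = []
    for i, a in enumerate(prev):
        for b in prev[i + 1:]:
            if a[:-1] == b[:-1]:
                joined.append(a + (b[-1],))

    # Prune: drop a candidate if any of its (k-1)-subsets is not frequent.
    kept = [
        c for c in joined
        if all(s in prev_set for s in combinations(c, k - 1))
    ]
    return joined, kept

L3 = ["abc", "abd", "acd", "ace", "bcd"]
joined, C4 = generate_candidates(L3)
print(["".join(c) for c in joined])  # ['abcd', 'acde']
print(["".join(c) for c in C4])      # ['abcd']
```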

15
Example

16
Apriori Algorithm

19
Exercise 4 min-sup = 20% min-conf = 80%

20
Demo: IBM Intelligent Miner

21
Demo Database

25
Multi-Dimensional Association
Single-Dimensional (Intra-Dimension) Rules: Single Dimension (Predicate) with Multiple Occurrences
- buys(X, "milk") ⇒ buys(X, "bread")
Multi-Dimensional Rules: 2 or More Dimensions
- Inter-dimension association rules (no repeated predicates): age(X, "19-25") ^ occupation(X, "student") ⇒ buys(X, "coke")
- Hybrid-dimension association rules (repeated predicates): age(X, "19-25") ^ buys(X, "popcorn") ⇒ buys(X, "coke")
Categorical (Nominal) Attributes
- Finite number of possible values, no ordering among values
Quantitative Attributes
- Numeric, implicit ordering among values
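One common way to handle several dimensions, sketched here under assumptions, is to encode each (predicate, value) pair as a single item, so that the usual support and confidence counting applies unchanged. The customer records are hypothetical, and the age attribute is assumed to be already discretized into ranges.

```python
# Hypothetical customer records (assumed data, not from the slides).
records = [
    {"age": "19-25", "occupation": "student", "buys": ["coke", "popcorn"]},
    {"age": "19-25", "occupation": "student", "buys": ["coke"]},
    {"age": "26-35", "occupation": "engineer", "buys": ["popcorn"]},
    {"age": "19-25", "occupation": "clerk", "buys": ["coke", "popcorn"]},
]

def to_items(record):
    """Encode a record as a set of (predicate, value) items."""
    items = set()
    for predicate, value in record.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            items.add((predicate, v))
    return items

def support_count(itemset, db):
    """Number of encoded records that contain every item in itemset."""
    return sum(1 for t in db if itemset <= t)

db = [to_items(r) for r in records]

# Inter-dimension rule: age ^ occupation => buys
body = {("age", "19-25"), ("occupation", "student")}
head = {("buys", "coke")}
print(support_count(body | head, db) / len(db))                 # 0.5
print(support_count(body | head, db) / support_count(body, db))  # 1.0

# Hybrid-dimension rule: the buys predicate appears on both sides
body = {("age", "19-25"), ("buys", "popcorn")}
print(support_count(body | head, db) / support_count(body, db))  # 1.0
```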

26
Exercise 5 min-sup = 20% min-conf = 80%

27
Research Topics
Quantitative Association Rules
- buys(bread, 5) ⇒ buys(milk, 3)
Weighted Association Rules
High Utility Association Rules
Non-redundant Association Rules
Constrained Association Rules
Mining Multi-dimensional Association Rules
Generalized Association Rules
Negative Association Rules
Incremental Mining of Association Rules
Data Stream Association Rule Mining
Interactive Mining of Association Rules
