Data Mining Techniques Association Rule. What Is Association Mining? Association Rule Mining – Finding frequent patterns, associations, correlations,


1 Data Mining Techniques Association Rule

2 What Is Association Mining?
Association Rule Mining
– Finding frequent patterns, associations, correlations, or causal structures among item sets in transaction databases, relational databases, and other information repositories
Applications
– Market basket analysis (marketing strategy: items to put on sale at reduced prices), cross-marketing, catalog design, shelf space layout design, etc.
Examples
– Rule form: Body ⇒ Head [Support, Confidence]
– buys(x, “Computer”) ⇒ buys(x, “Software”) [2%, 60%]
– major(x, “CS”) ∧ takes(x, “DB”) ⇒ grade(x, “A”) [1%, 75%]

3 Market Basket Analysis Typically, association rules are considered interesting if they satisfy both a minimum support threshold and a minimum confidence threshold.

4 Rule Measures: Support and Confidence
With minimum support 50% and minimum confidence 50%, we have
– A ⇒ C [50%, 66.6%]
– C ⇒ A [50%, 100%]

5 Support & Confidence

6 Association Rule: Basic Concepts
Given
– (1) a database of transactions
– (2) each transaction is a list of items (purchased by a customer in a visit)
Find all rules that correlate the presence of one set of items with that of another set of items
Find all the rules A ⇒ B with minimum confidence and support
– support, s = P(A ∪ B): the probability that a transaction contains every item of A and of B
– confidence, c = P(B|A): the probability that a transaction containing A also contains B
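
The two measures can be computed directly from their definitions. The following is a minimal sketch, not taken from the slides: the four transactions are a hypothetical database, and the function names `support` and `confidence` are illustrative.

```python
# Hypothetical transaction database (an assumption, not from the slides);
# each transaction is the set of items bought in one visit.
transactions = [
    {"A", "B", "C"},
    {"A", "C"},
    {"A", "D"},
    {"B", "E", "F"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) as support(A ∪ B) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

print(support({"A", "C"}, transactions))       # support of A => C (and of C => A)
print(confidence({"A"}, {"C"}, transactions))  # confidence of A => C
print(confidence({"C"}, {"A"}, transactions))  # confidence of C => A
```

On this hypothetical data, A ⇒ C has support 0.5 and confidence of roughly 0.667, while C ⇒ A has the same support and confidence 1.0. Support is symmetric in the two sides of a rule; confidence is not.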

7 Terminologies
Item
– I1, I2, I3, …
– A, B, C, …
Itemset
– {I1}, {I1, I7}, {I2, I3, I5}, …
– {A}, {A, G}, {B, C, E}, …
1-Itemset
– {I1}, {I2}, {A}, …
2-Itemset
– {I1, I7}, {I3, I5}, {A, G}, …

8 Terminologies
K-Itemset
– An itemset that contains K items
Frequent (Large) K-Itemset
– A K-itemset that satisfies a minimum support threshold
Association Rule
– A rule that satisfies both a minimum support threshold and a minimum confidence threshold (often called a strong rule)

9 Analysis The number of itemsets of a given cardinality tends to grow exponentially
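
A quick count makes the growth concrete: over n distinct items there are C(n, k) itemsets of size k and 2^n − 1 non-empty itemsets in total. The sketch below only evaluates these two formulas; the function names are illustrative and not from the slides.

```python
from math import comb

def count_k_itemsets(n_items, k):
    """Number of distinct itemsets of size k drawn from n_items items."""
    return comb(n_items, k)

def count_all_itemsets(n_items):
    """Number of non-empty itemsets over n_items items."""
    return 2 ** n_items - 1

for n in (10, 20, 100):
    print(n, count_k_itemsets(n, 2), count_all_itemsets(n))
```

Even a modest catalog of 100 items has 4950 possible 2-itemsets and about 1.27 × 10^30 itemsets overall, which is why exhaustive counting is impractical.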

10 Fast Algorithms for Mining Association Rules

11 Mining Association Rules: Apriori Principle
Min. support 50%, min. confidence 50%
For rule A ⇒ C:
– support = support(A ∪ C) = 50%
– confidence = support(A ∪ C) / support(A) = 66.6%
The Apriori principle:
– Any subset of a frequent itemset must be frequent

12 Mining Frequent Itemsets: the Key Step
Find the frequent itemsets: the sets of items that have minimum support
– Every subset of a frequent itemset must also be a frequent itemset, i.e., if {A, B} is a frequent itemset, both {A} and {B} must be frequent itemsets
– Iteratively find frequent itemsets with cardinality from 1 to k (k-itemsets)
Use the frequent itemsets to generate association rules
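
The two steps above can be sketched as follows. This is a minimal, unoptimized illustration of level-wise mining, not the slides' own code: the function names and the four-transaction database are assumptions, and a production miner would count candidates far more efficiently.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise search: return {itemset: support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def keep_frequent(candidates):
        # Scan the database once per level and keep candidates meeting min_support.
        result = {}
        for c in candidates:
            s = sum(1 for t in transactions if c <= t) / n
            if s >= min_support:
                result[c] = s
        return result

    items = {item for t in transactions for item in t}
    current = keep_frequent({frozenset([i]) for i in items})
    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        current = keep_frequent(candidates)
        frequent.update(current)
        k += 1
    return frequent

def generate_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) for qualifying rules."""
    rules = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                # frequent[a] exists because subsets of frequent itemsets are frequent.
                conf = s / frequent[a]
                if conf >= min_confidence:
                    rules.append((a, itemset - a, s, conf))
    return rules

# Hypothetical database (an assumption, not from the slides).
db = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}]
freq = frequent_itemsets(db, min_support=0.5)
for a, c, s, conf in generate_rules(freq, min_confidence=0.5):
    print(sorted(a), "=>", sorted(c), f"[{s:.0%}, {conf:.1%}]")
```

On this hypothetical data the frequent itemsets are {A}, {B}, {C}, and {A, C}, and the two rules found are A ⇒ C and C ⇒ A.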

13 Example
[Figure: Apriori walk-through on database D. Scan D to count the candidates C1 and generate L1; generate C2 from L1, scan D to count C2, and generate L2; generate C3 from L2, scan D to count C3, and generate L3.]

14 Example of Generating Candidates
L3 = {abc, abd, acd, ace, bcd}
Self-joining: L3 * L3
– abcd from abc and abd
– acde from acd and ace
Pruning:
– acde is removed because ade is not in L3
C4 = {abcd}
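
The join and prune steps of this example can be reproduced in a few lines. This is a sketch under the usual assumption that each itemset is kept as a sorted tuple and that two itemsets are joined only when they agree on all but their last item; the name `apriori_gen` is illustrative.

```python
from itertools import combinations

def apriori_gen(prev_frequent):
    """Build candidate k-itemsets from the frequent (k-1)-itemsets."""
    prev = sorted(tuple(sorted(s)) for s in prev_frequent)
    prev_set = set(prev)
    k = len(prev[0]) + 1
    joined = []
    for i, a in enumerate(prev):
        for b in prev[i + 1:]:
            # Self-join: merge two itemsets that share their first k-2 items.
            if a[:-1] == b[:-1]:
                joined.append(a + (b[-1],))
    # Prune: drop any candidate that has a (k-1)-subset missing from prev.
    return [c for c in joined
            if all(s in prev_set for s in combinations(c, k - 1))]

L3 = ["abc", "abd", "acd", "ace", "bcd"]
C4 = apriori_gen(L3)
print(["".join(c) for c in C4])  # prints ['abcd']
```

The join produces abcd and acde; acde is then pruned because its subset ade (and also cde) is not in L3.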

15 Example

16 Apriori Algorithm

19 Exercise 4 min-sup = 20% min-conf = 80%

20 Demo: IBM Intelligent Miner

21 Demo Database

25 Multi-Dimensional Association
Single-Dimensional (Intra-Dimension) Rules: a single dimension (predicate) with multiple occurrences
– buys(X, “milk”) ⇒ buys(X, “bread”)
Multi-Dimensional Rules: ≥ 2 dimensions (predicates)
– Inter-dimension association rules (no repeated predicates): age(X, “19-25”) ∧ occupation(X, “student”) ⇒ buys(X, “coke”)
– Hybrid-dimension association rules (repeated predicates): age(X, “19-25”) ∧ buys(X, “popcorn”) ⇒ buys(X, “coke”)
Categorical (Nominal) Attributes
– finite number of possible values, no ordering among values
Quantitative Attributes
– numeric, implicit ordering among values
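
One common way to evaluate multi-dimensional rules with single-dimensional tooling is to encode every attribute value as an (attribute, value) item. The sketch below assumes that encoding. The records, the helper `to_items`, and the function `rule_measures` are all hypothetical and not from the slides.

```python
# Hypothetical customer records (an assumption, not from the slides).
records = [
    {"age": "19-25", "occupation": "student", "buys": ["coke", "popcorn"]},
    {"age": "19-25", "occupation": "student", "buys": ["coke"]},
    {"age": "19-25", "occupation": "engineer", "buys": ["milk"]},
    {"age": "26-35", "occupation": "student", "buys": ["bread", "milk"]},
]

def to_items(record):
    """Flatten a record into a set of (attribute, value) items."""
    items = set()
    for attr, value in record.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            items.add((attr, v))
    return items

def rule_measures(antecedent, consequent, itemsets):
    """Return (support, confidence) of antecedent => consequent."""
    both = antecedent | consequent
    n_antecedent = sum(1 for t in itemsets if antecedent <= t)
    n_both = sum(1 for t in itemsets if both <= t)
    return n_both / len(itemsets), n_both / n_antecedent

itemsets = [to_items(r) for r in records]

# Inter-dimension rule: age ∧ occupation => buys (no repeated predicate).
inter = rule_measures({("age", "19-25"), ("occupation", "student")},
                      {("buys", "coke")}, itemsets)
# Hybrid-dimension rule: age ∧ buys => buys (the buys predicate repeats).
hybrid = rule_measures({("age", "19-25"), ("buys", "popcorn")},
                       {("buys", "coke")}, itemsets)
print(inter, hybrid)
```

Because a repeated predicate such as buys simply contributes several items to one record, the same encoding covers both inter-dimension and hybrid-dimension rules. Quantitative attributes would first need to be discretized into ranges, as the age attribute is here.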

26 Exercise 5 min-sup = 20% min-conf = 80%

27 Research Topics
– Quantitative Association Rules, e.g. buys(bread, 5) ⇒ buys(milk, 3)
– Weighted Association Rules
– High Utility Association Rules
– Non-redundant Association Rules
– Constrained Association Rules
– Mining Multi-dimensional Association Rules
– Generalized Association Rules
– Negative Association Rules
– Incremental Mining of Association Rules
– Data Stream Association Rule Mining
– Interactive Mining of Association Rules

