1
**Data Mining Techniques Association Rule**

2
**What Is Association Mining?**

Association Rule Mining: finding frequent patterns, associations, correlations, or causal structures among item sets in transaction databases, relational databases, and other information repositories.

Applications: market basket analysis (marketing strategy: items to put on sale at reduced prices), cross-marketing, catalog design, shelf space layout design, etc.

Examples:
- Rule form: Body → Head [Support, Confidence]
- buys(x, “Computer”) → buys(x, “Software”) [2%, 60%]
- major(x, “CS”) ∧ takes(x, “DB”) → grade(x, “A”) [1%, 75%]

3
**Market Basket Analysis**

Typically, association rules are considered interesting if they satisfy both a minimum support threshold and a minimum confidence threshold.

4
**Rule Measures: Support and Confidence**

With a minimum support of 50% and a minimum confidence of 50%, we have:
- A → C [50%, 66.6%]
- C → A [50%, 100%]

5
Support & Confidence

6
**Association Rule: Basic Concepts**

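Given: (1) a database of transactions, where (2) each transaction is a list of items (purchased by a customer in a visit).

Find all rules that correlate the presence of one set of items with that of another set of items.

Find all the rules A → B with minimum confidence and support:
- support, s: P(A ∪ B), the probability that a transaction contains every item of both A and B
- confidence, c: P(B|A), the probability that a transaction containing A also contains B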
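The two measures can be sketched in a few lines of Python. The four-transaction database below is an assumption (the slide's own table did not survive extraction); it was chosen so that the rules between A and C have 50% support, with confidences of about 66.7% and 100%. The function names `support` and `confidence` are illustrative only.

```python
# Assumed toy database (not taken from the slides): each transaction is a set of items.
transactions = [
    {"A", "B", "C"},
    {"A", "C"},
    {"A", "D"},
    {"B", "E", "F"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(body, head, transactions):
    """Estimate P(head | body) as support(body and head together) / support(body)."""
    both = set(body) | set(head)
    return support(both, transactions) / support(body, transactions)

print(support({"A", "C"}, transactions))       # support of A -> C (and of C -> A)
print(confidence({"A"}, {"C"}, transactions))  # confidence of A -> C
print(confidence({"C"}, {"A"}, transactions))  # confidence of C -> A
```

Note that support is symmetric in body and head, while confidence is not, which is why the two rules share a support value but differ in confidence.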

7
**Terminologies**

- Item: I1, I2, I3, … or A, B, C, …
- Itemset: {I1}, {I1, I7}, {I2, I3, I5}, … or {A}, {A, G}, {B, C, E}, …
- 1-Itemset: {I1}, {I2}, {A}, …
- 2-Itemset: {I1, I7}, {I3, I5}, {A, G}, …

8
**Terminologies**

- K-Itemset: an itemset of length K.
- Frequent (Large) K-Itemset: an itemset of length K that satisfies a minimum support threshold.
- Association Rule: a rule that satisfies both a minimum support threshold and a minimum confidence threshold.

9
Analysis The number of itemsets of a given cardinality tends to grow exponentially
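One way to make this concrete, as a rough sketch: with n distinct items there are C(n, k) possible k-itemsets and 2^n − 1 non-empty itemsets in total. The catalog sizes below are hypothetical and the helper names are illustrative.

```python
from math import comb

def count_k_itemsets(n_items, k):
    """Number of distinct k-itemsets that can be formed from n_items items."""
    return comb(n_items, k)

def count_all_itemsets(n_items):
    """Number of non-empty itemsets over n_items items."""
    return 2 ** n_items - 1

# Hypothetical catalog sizes, chosen only to show the growth.
for n in (10, 20, 100):
    print(n, count_k_itemsets(n, 2), count_all_itemsets(n))
```

Even a modest catalog makes exhaustive counting of every itemset impractical, which is the motivation for pruning the search space.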

10
**Fast Algorithms for Mining Association Rules**

11
**Mining Association Rules: Apriori Principle**

Min. support 50%, min. confidence 50%.

For rule A → C:
- support = support({A, C}) = 50%
- confidence = support({A, C}) / support({A}) = 66.6%

The Apriori principle: any subset of a frequent itemset must be frequent.

12
**Mining Frequent Itemsets: the Key Step**

- Find the frequent itemsets: the sets of items that have minimum support.
- A subset of a frequent itemset must also be a frequent itemset, i.e., if {A, B} is a frequent itemset, both {A} and {B} must be frequent itemsets.
- Iteratively find frequent itemsets with cardinality from 1 to k (k-itemsets).
- Use the frequent itemsets to generate association rules.
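The last step, turning a frequent itemset into rules, might look like the sketch below. It assumes the support of the itemset and of each of its non-empty subsets is already known (the Apriori principle guarantees those subsets are frequent, so a miner would have counted them). The function name `rules_from_itemset` and the support values are illustrative assumptions.

```python
from itertools import combinations

def rules_from_itemset(itemset, support, min_conf):
    """Yield (body, head, confidence) for each rule body -> head split from `itemset`.

    `support` maps frozenset -> support and must contain `itemset`
    and every non-empty proper subset of it.
    """
    itemset = frozenset(itemset)
    for r in range(1, len(itemset)):
        for body in combinations(sorted(itemset), r):
            body = frozenset(body)
            conf = support[itemset] / support[body]
            if conf >= min_conf:
                yield body, itemset - body, conf

# Hypothetical support values for items A and C.
support = {
    frozenset("A"): 0.75,
    frozenset("C"): 0.50,
    frozenset("AC"): 0.50,
}

for body, head, conf in rules_from_itemset("AC", support, min_conf=0.5):
    print(sorted(body), "->", sorted(head), round(conf, 3))
```

Because every rule from the same itemset has the same support, only the confidence needs to be checked at this stage.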

13
**Example**

Database D (four transactions): {1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}

[Figure: Apriori trace on database D]
- Scan D, count C1, generate L1 = {1}, {2}, {3}, {5}
- Generate C2 (which includes {1, 2} and {1, 5}), scan D, count C2, generate L2 = {1, 3}, {2, 3}, {2, 5}, {3, 5}
- Generate C3, scan D, count C3, generate L3 = {2, 3, 5}
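The scan, count, and generate loop of this example can be sketched as follows. This is a minimal, unoptimized illustration rather than the exact procedure on the slides: the function name `apriori` is hypothetical, and the threshold of 2 transactions (50% of the four in D) is an assumption, since the slide does not state it.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {k: {itemset: count}} of frequent itemsets, found level by level."""
    transactions = [frozenset(t) for t in transactions]
    # C1: every single item is a candidate.
    candidates = {frozenset([i]) for t in transactions for i in t}
    levels, k = {}, 1
    while candidates:
        # Scan D and count how many transactions contain each candidate.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        frequent = {c: n for c, n in counts.items() if n >= min_count}
        if not frequent:
            break
        levels[k] = frequent
        # Generate C(k+1): join frequent k-itemsets and keep only unions of
        # size k+1 whose k-subsets are all frequent (Apriori pruning).
        k += 1
        candidates = {
            a | b
            for a in frequent
            for b in frequent
            if len(a | b) == k
            and all(frozenset(s) in frequent for s in combinations(a | b, k - 1))
        }
    return levels

D = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
levels = apriori(D, min_count=2)  # assumed threshold: 2 of 4 transactions
for k, frequent in levels.items():
    print(k, sorted(sorted(s) for s in frequent))
```

Under that assumed threshold the sketch yields the same L1, L2, and L3 as listed in the example.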

14
**Example of Generating Candidates**

L3 = {abc, abd, acd, ace, bcd}

Self-joining: L3 * L3
- abcd from abc and abd
- acde from acd and ace

Pruning:
- acde is removed because ade is not in L3

C4 = {abcd}
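The join and prune steps can be sketched as below, under the assumption that each itemset is stored as a tuple of items in sorted order so that two itemsets are joined when they share their first k−1 items. The function name `generate_candidates` is illustrative.

```python
from itertools import combinations

def generate_candidates(frequent_k):
    """Build C(k+1) from L(k); itemsets are tuples of items in sorted order."""
    frequent = set(frequent_k)
    candidates = []
    for a, b in combinations(sorted(frequent), 2):
        # Self-join: merge two itemsets that agree on their first k-1 items.
        if a[:-1] == b[:-1]:
            candidate = a + (b[-1],)
            # Prune: every k-subset of the candidate must itself be in L(k).
            if all(s in frequent for s in combinations(candidate, len(a))):
                candidates.append(candidate)
    return candidates

L3 = [tuple(s) for s in ("abc", "abd", "acd", "ace", "bcd")]
C4 = generate_candidates(L3)
print(["".join(c) for c in C4])  # prints ['abcd']
```

Here acde is produced by the join but discarded by the subset check, because ade (and cde) are missing from L3.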

15
Example

16
Apriori Algorithm

17
Apriori Algorithm

18
Apriori Algorithm

19
Exercise 4 min-sup = 20% min-conf = 80%

20
**Demo: IBM Intelligent Miner**

21
Demo Database

25
**Multi-Dimensional Association**

Single-Dimensional (Intra-Dimension) Rules: a single dimension (predicate) with multiple occurrences.
- buys(X, “milk”) → buys(X, “bread”)

Multi-Dimensional Rules: two or more dimensions (predicates).
- Inter-dimension association rules (no repeated predicates): age(X, “19-25”) ∧ occupation(X, “student”) → buys(X, “coke”)
- Hybrid-dimension association rules (repeated predicates): age(X, “19-25”) ∧ buys(X, “popcorn”) → buys(X, “coke”)

Attribute types:
- Categorical (Nominal) Attributes: finite number of possible values, no ordering among values
- Quantitative Attributes: numeric, implicit ordering among values
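The distinction between the three rule types depends only on which predicates appear and whether any is repeated. A small sketch, assuming a rule is represented as two lists of (predicate, value) pairs; this representation and the function name `classify_rule` are illustrative choices, not part of the slides.

```python
def classify_rule(body, head):
    """Classify a rule by the predicates used in its body and head.

    `body` and `head` are lists of (predicate, value) pairs.
    """
    predicates = [p for p, _ in body + head]
    distinct = set(predicates)
    if len(distinct) == 1:
        return "single-dimensional"   # one predicate, possibly repeated
    if len(distinct) == len(predicates):
        return "inter-dimension"      # several predicates, none repeated
    return "hybrid-dimension"         # several predicates, at least one repeated

print(classify_rule([("buys", "milk")], [("buys", "bread")]))
print(classify_rule([("age", "19-25"), ("occupation", "student")], [("buys", "coke")]))
print(classify_rule([("age", "19-25"), ("buys", "popcorn")], [("buys", "coke")]))
```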

26
Exercise 5 min-sup = 20% min-conf = 80%

27
**Research Topics**

- Quantitative Association Rules, e.g. buys(bread, 5) → buys(milk, 3)
- Weighted Association Rules
- High Utility Association Rules
- Non-redundant Association Rules
- Constrained Association Rules
- Mining Multi-dimensional Association Rules
- Generalized Association Rules
- Negative Association Rules
- Incremental Mining of Association Rules
- Data Stream Association Rule Mining
- Interactive Mining of Association Rules
