Data Mining – Association Rules

Presentation on theme: "Data Mining – Association Rules"— Presentation transcript:

1 Data Mining – Association Rules
Tutorial 4

2 Contents
Review Questions
Question 1: Data Mining and Metrics
Algorithm Questions
Question 2: Applying the Apriori Algorithm
Question 3: Finding Association Rules

3 Review Questions

5 Question 1: Data Mining and Metrics
What is an Association Rule? Given a set of transactions, each of which contains some number of items from a given collection, an association rule is a dependency rule that predicts the occurrence of an item based on the occurrences of other items in the same transaction. For example, if every transaction that contains coke also contains milk, then the rule {coke} -> {milk} holds. The reverse rule, {milk} -> {coke}, does not necessarily hold, since not all milk is bought with coke.

7 Question 1: Data Mining and Metrics
What are the metrics for evaluating association rules? The evaluation metrics for an association rule X -> Y are “Support” (s) and “Confidence” (c). Support is the fraction of all transactions that contain both X and Y. Confidence is the fraction of the transactions containing X that also contain Y.

8 Question 1: Data Mining and Metrics
What are the metrics for evaluating association rules? For example, given the following table, these are the support and confidence values:

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

Example Association Rule: {Milk, Diaper} => {Beer}
s = count(Milk, Diaper, Beer) / Total Transactions = 2/5 = 0.4
c = count(Milk, Diaper, Beer) / count(Milk, Diaper) = 2/3 = 0.67
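The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the helper names count, support and confidence are made up for illustration and do not come from the slides.

```python
# Transactions from the example table (TID 1 to 5).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def support(x, y, transactions):
    """Fraction of all transactions that contain both X and Y."""
    return count(x | y, transactions) / len(transactions)

def confidence(x, y, transactions):
    """Fraction of the transactions containing X that also contain Y."""
    return count(x | y, transactions) / count(x, transactions)

x, y = {"Milk", "Diaper"}, {"Beer"}
print(support(x, y, transactions))               # 0.4
print(round(confidence(x, y, transactions), 2))  # 0.67
```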

9 Algorithm Questions

10 Question 2: Applying the Apriori Algorithm
Apply the Apriori algorithm to find all itemsets with support >= 0.2 from the following data:

Transaction  Items in Transaction
1            Milk, Bread, Eggs
2            Milk, Juice
3            Juice, Butter
4            Milk, Bread, Eggs
5            Coffee, Eggs
6            Coffee
7            Coffee, Juice
8            Milk, Bread, Cookies, Eggs
9            Cookies, Butter
10           Milk, Bread

11 Question 2: Applying the Apriori Algorithm
Apriori Principle Step 1: Count the occurrences of each single item. Note: with 10 transactions, a support of 0.2 means an itemset must appear in at least 2 transactions.

Itemset  Count
Milk     5
Bread    4
Eggs     4
Juice    3
Butter   2
Coffee   3
Cookies  2

All seven items meet the threshold.
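Step 1 can be reproduced with collections.Counter. In this sketch, transaction 4 is assumed to be Milk, Bread, Eggs, since that is the only content consistent with the counts on the slides; the variable names are illustrative.

```python
from collections import Counter

# Question 2 data. Transaction 4 is an ASSUMPTION (Milk, Bread, Eggs),
# inferred from the item counts shown on the slides.
transactions = [
    {"Milk", "Bread", "Eggs"},
    {"Milk", "Juice"},
    {"Juice", "Butter"},
    {"Milk", "Bread", "Eggs"},            # assumed
    {"Coffee", "Eggs"},
    {"Coffee"},
    {"Coffee", "Juice"},
    {"Milk", "Bread", "Cookies", "Eggs"},
    {"Cookies", "Butter"},
    {"Milk", "Bread"},
]
min_support = 0.2

# Count how many transactions each single item appears in.
item_counts = Counter(item for t in transactions for item in t)

# Keep the items whose support reaches the threshold.
frequent_items = {
    item for item, n in item_counts.items()
    if n / len(transactions) >= min_support
}

for item, n in item_counts.most_common():
    print(item, n)
```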

12 Question 2: Applying the Apriori Algorithm
Apriori Principle Step 2: Look for frequent 2-itemsets (an itemset is frequent if its count is at least 2):

Itemset          Count  Frequent
Milk, Bread      4      yes
Milk, Eggs       3      yes
Milk, Juice      1      no
Milk, Cookies    1      no
Bread, Eggs      3      yes
Bread, Cookies   1      no
Eggs, Coffee     1      no
Eggs, Cookies    1      no
Juice, Butter    1      no
Juice, Coffee    1      no
Butter, Cookies  1      no
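The pair counts can be tallied the same way. Like the slide, this sketch counts only the pairs that actually occur in some transaction; it again assumes that transaction 4 is Milk, Bread, Eggs, and the names are illustrative.

```python
from collections import Counter
from itertools import combinations

# Question 2 data. Transaction 4 is an ASSUMPTION (Milk, Bread, Eggs).
transactions = [
    {"Milk", "Bread", "Eggs"},
    {"Milk", "Juice"},
    {"Juice", "Butter"},
    {"Milk", "Bread", "Eggs"},            # assumed
    {"Coffee", "Eggs"},
    {"Coffee"},
    {"Coffee", "Juice"},
    {"Milk", "Bread", "Cookies", "Eggs"},
    {"Cookies", "Butter"},
    {"Milk", "Bread"},
]
min_support = 0.2

# Count every pair of items that occurs together in a transaction.
# Sorting makes each pair a canonical (alphabetical) tuple.
pair_counts = Counter()
for t in transactions:
    for pair in combinations(sorted(t), 2):
        pair_counts[pair] += 1

frequent_pairs = {
    pair: n for pair, n in pair_counts.items()
    if n / len(transactions) >= min_support
}
print(frequent_pairs)
```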

13 Question 2: Applying the Apriori Algorithm
Apriori Principle Step 3: Look for frequent 3-itemsets:

Itemset            Count
Milk, Bread, Eggs  3

Therefore, the largest frequent itemset is {Milk, Bread, Eggs}.
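Steps 1 to 3 together form a level-wise search. The following is a minimal sketch of that idea, not a reference implementation: the function name apriori is illustrative, and transaction 4 is assumed to be Milk, Bread, Eggs.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset: count} for every itemset meeting min_support."""
    n = len(transactions)

    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    def keep_frequent(candidates):
        counted = {c: count(c) for c in candidates}
        return {c: k for c, k in counted.items() if k / n >= min_support}

    # Level 1: frequent single items.
    level = keep_frequent({frozenset([i]) for t in transactions for i in t})
    frequent = dict(level)
    size = 2
    while level:
        prev = set(level)
        # Join: unions of two frequent itemsets that have exactly `size` items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == size}
        # Prune (Apriori principle): every (size - 1)-subset must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, size - 1))
        }
        level = keep_frequent(candidates)
        frequent.update(level)
        size += 1
    return frequent

# Question 2 data. Transaction 4 is an ASSUMPTION (Milk, Bread, Eggs).
transactions = [
    {"Milk", "Bread", "Eggs"}, {"Milk", "Juice"}, {"Juice", "Butter"},
    {"Milk", "Bread", "Eggs"}, {"Coffee", "Eggs"}, {"Coffee"},
    {"Coffee", "Juice"}, {"Milk", "Bread", "Cookies", "Eggs"},
    {"Cookies", "Butter"}, {"Milk", "Bread"},
]
frequent = apriori(transactions, 0.2)
for itemset, k in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), k)
```

Unlike the table in Step 2, this sketch generates candidate pairs from all frequent single items, so it also considers pairs that never occur together; they are discarded because their count is 0.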

14 Question 3: Finding Association Rules
Using the frequent itemsets found in Question 2 (for example, {Milk, Bread, Eggs}), find all the association rules with support >= 0.2 and confidence >= 0.8. Take “{Milk, Bread} -> {Eggs}”, where {Milk, Bread} is X and {Eggs} is Y:
Support = count(X and Y) / total transactions
Confidence = count(X and Y) / count(X)
To do this, we check each candidate rule that can be formed from a frequent itemset.
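The candidate rules for one itemset are all the ways of splitting it into a non-empty X and a non-empty Y. A small sketch of that enumeration follows; the function name candidate_rules is illustrative.

```python
from itertools import combinations

def candidate_rules(itemset):
    """Yield (X, Y) for every split of itemset into non-empty X -> Y."""
    items = frozenset(itemset)
    for r in range(1, len(items)):
        for x in combinations(sorted(items), r):
            x = frozenset(x)
            yield x, items - x

rules = list(candidate_rules({"Milk", "Bread", "Eggs"}))
for x, y in rules:
    print(sorted(x), "->", sorted(y))
print(len(rules))  # 6
```

A three-item set gives six candidate rules. The following slides work through the three that have a single item on the right-hand side.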

18 Question 3: Finding Association Rules
Association Rules for {Milk, Bread, Eggs}:
{Milk, Bread} -> {Eggs}: Support = 3/10 = 0.3, Confidence = 3/4 = 0.75
{Milk, Eggs} -> {Bread}: Support = 3/10 = 0.3, Confidence = 3/3 = 1
{Eggs, Bread} -> {Milk}: Support = 3/10 = 0.3, Confidence = 3/3 = 1

21 Question 3: Finding Association Rules
Association Rules for {Milk, Bread}:
{Milk} -> {Bread}: Support = 4/10 = 0.4, Confidence = 4/5 = 0.8
{Bread} -> {Milk}: Support = 4/10 = 0.4, Confidence = 4/4 = 1

24 Question 3: Finding Association Rules
Association Rules for {Milk, Eggs}:
{Milk} -> {Eggs}: Support = 3/10 = 0.3, Confidence = 3/5 = 0.6
{Eggs} -> {Milk}: Support = 3/10 = 0.3, Confidence = 3/4 = 0.75

27 Question 3: Finding Association Rules
Association Rules for {Bread, Eggs}:
{Bread} -> {Eggs}: Support = 3/10 = 0.3, Confidence = 3/4 = 0.75
{Eggs} -> {Bread}: Support = 3/10 = 0.3, Confidence = 3/4 = 0.75

28 Question 3: Finding Association Rules
Therefore, the only association rules that satisfy the restriction of having support >= 0.2 and confidence >= 0.8 are:
{Milk, Eggs} -> {Bread} (s=0.3, c=1)
{Eggs, Bread} -> {Milk} (s=0.3, c=1)
{Milk} -> {Bread} (s=0.4, c=0.8)
{Bread} -> {Milk} (s=0.4, c=1)
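The final answer can be cross-checked by scoring every candidate rule from the four frequent itemsets with two or more items. This is a sketch under the same assumption as before (transaction 4 is Milk, Bread, Eggs); the names count and rules_from are illustrative.

```python
from itertools import combinations

# Question 2 data. Transaction 4 is an ASSUMPTION (Milk, Bread, Eggs).
transactions = [
    {"Milk", "Bread", "Eggs"}, {"Milk", "Juice"}, {"Juice", "Butter"},
    {"Milk", "Bread", "Eggs"}, {"Coffee", "Eggs"}, {"Coffee"},
    {"Coffee", "Juice"}, {"Milk", "Bread", "Cookies", "Eggs"},
    {"Cookies", "Butter"}, {"Milk", "Bread"},
]

def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def rules_from(itemset, min_support, min_confidence):
    """Return (X, Y, support, confidence) for each rule X -> Y that passes."""
    itemset = frozenset(itemset)
    s = count(itemset) / len(transactions)
    found = []
    for r in range(1, len(itemset)):
        for x in combinations(sorted(itemset), r):
            x = frozenset(x)
            c = count(itemset) / count(x)
            if s >= min_support and c >= min_confidence:
                found.append((set(x), set(itemset - x), s, c))
    return found

# Frequent itemsets with at least two items, from Question 2.
frequent_itemsets = [
    {"Milk", "Bread", "Eggs"},
    {"Milk", "Bread"},
    {"Milk", "Eggs"},
    {"Bread", "Eggs"},
]
all_rules = [
    rule
    for itemset in frequent_itemsets
    for rule in rules_from(itemset, 0.2, 0.8)
]
for x, y, s, c in all_rules:
    print(sorted(x), "->", sorted(y), f"s={s}", f"c={c}")
```

The sketch also scores rules with a two-item right-hand side, such as {Milk} -> {Bread, Eggs}. None of those reach a confidence of 0.8, so the four rules listed above are the complete answer.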

29 Questions?

