Discover The Association Rules of Different Patterns Xuemei Fan.


1 Discover The Association Rules of Different Patterns Xuemei Fan

2 Introduction
  Description of the dataset
  State the problems
  Describe the method used in this project
  Show the results
  Analyze the results

3 Description of the dataset
  The dataset is about 1 GB
  Each transaction includes one user id, one date and one item
  There are about 5.5 million transactions
  user id  date       item
  001      11/1/2007  Carrot
  001      11/1/2007  Banana
  002      2/1/2007   Lettuce
  002      4/1/2007   Icecream
  001      4/1/2007   Broccoli
  003      4/1/2007   Potato

4 State the problems
  Discover the association rules from different patterns
  Find out the dietary habits in different areas
  Determine whether the association rules from different patterns are related

5 Describe the Method used in this project
  Filter Data
  Group Data
  Remove Duplicate Data
  Extract the Regular Customer
  Generate Association Rule

6 Filter Data
  Top 3000 list
  Filter data based on the list
  Calculate the frequency of different items

7 Filter Data
  001  4/1/2007   Banana
  001  11/1/2007  Dog food
  001  11/1/2007  Carrot
  001  11/1/2007  Banana
  002  2/1/2007   Lettuce
  002  4/1/2007   Icecream
  001  4/1/2007   Broccoli
  003  4/1/2007   Potato
  Banana, Carrot, ....
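
The filtering step could be sketched roughly as follows. This is only an illustration, not the code used in the project: the rows are the sample rows from the slide, while `top_items` is a small hypothetical stand-in for the top 3000 list (which is not shown), and the function names are made up for the example.

```python
from collections import Counter

# Sample rows from the slide: (user_id, date, item).
transactions = [
    ("001", "4/1/2007", "Banana"),
    ("001", "11/1/2007", "Dog food"),
    ("001", "11/1/2007", "Carrot"),
    ("001", "11/1/2007", "Banana"),
    ("002", "2/1/2007", "Lettuce"),
    ("002", "4/1/2007", "Icecream"),
    ("001", "4/1/2007", "Broccoli"),
    ("003", "4/1/2007", "Potato"),
]

# Hypothetical stand-in for the top 3000 list (assumption, not the real list).
top_items = {"Banana", "Carrot", "Lettuce", "Icecream", "Broccoli", "Potato"}


def filter_transactions(rows, allowed_items):
    """Keep only the rows whose item appears in the allowed list."""
    return [row for row in rows if row[2] in allowed_items]


def item_frequencies(rows):
    """Count how often each item occurs in the given rows."""
    return Counter(item for _, _, item in rows)


filtered = filter_transactions(transactions, top_items)
frequencies = item_frequencies(filtered)
print(frequencies.most_common())
```

With this stand-in list, the "Dog food" row is dropped and every other row is kept.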

8 Group Data
  Group the records that have the same user_id and date
  {001, 4/1/2007, Banana, Banana, Broccoli}
  most frequent to least frequent
  001  4/1/2007   Banana
  001  11/1/2007  Carrot
  001  11/1/2007  Carrot
  001  4/1/2007   Banana
  002  2/1/2007   Lettuce
  002  4/1/2007   Icecream
  001  4/1/2007   Broccoli
  003  4/1/2007   Potato
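
A minimal sketch of the grouping step, using the sample rows from the slide. Reading "most frequent, least frequent" as "sort each group's items by their overall frequency" is an assumption here, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Sample rows from the slide: (user_id, date, item).
rows = [
    ("001", "4/1/2007", "Banana"),
    ("001", "11/1/2007", "Carrot"),
    ("001", "11/1/2007", "Carrot"),
    ("001", "4/1/2007", "Banana"),
    ("002", "2/1/2007", "Lettuce"),
    ("002", "4/1/2007", "Icecream"),
    ("001", "4/1/2007", "Broccoli"),
    ("003", "4/1/2007", "Potato"),
]


def group_by_user_and_date(rows):
    """Collect the items of all rows that share the same (user_id, date)."""
    groups = defaultdict(list)
    for user_id, date, item in rows:
        groups[(user_id, date)].append(item)
    return dict(groups)


def order_by_frequency(groups):
    """Sort each group's items from most to least frequent overall.

    Assumption: frequency is counted over all rows; ties are broken by name.
    """
    counts = Counter(item for items in groups.values() for item in items)
    return {
        key: sorted(items, key=lambda item: (-counts[item], item))
        for key, items in groups.items()
    }


groups = group_by_user_and_date(rows)
ordered = order_by_frequency(groups)
print(ordered[("001", "4/1/2007")])
```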

9 Remove Duplicate Data
  Remove the duplicate items within the same group
  {001, 4/1/2007, Banana, Banana, Broccoli} becomes {001, 4/1/2007, Banana, Broccoli}
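
One simple way to drop repeated items while keeping their order is sketched below. The helper name is hypothetical and the project may have done this differently.

```python
def remove_duplicates(items):
    """Drop repeated items, keeping the first occurrence of each in order."""
    # dict keys are unique and preserve insertion order.
    return list(dict.fromkeys(items))


# The grouped transaction from the slide.
user_id, date, items = ("001", "4/1/2007", ["Banana", "Banana", "Broccoli"])
deduplicated = (user_id, date, remove_duplicates(items))
print(deduplicated)
```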

10 Extract the Regular Customer
  001  4/1/2007   Banana, Broccoli
  001  11/1/2007  Banana, Carrot
  001  11/8/2007  Carrot, Broccoli, Lettuce
  002  4/1/2007   Banana, Lettuce
  006  2/1/2007   Lettuce
  002  4/1/2007   Banana, Lettuce, Icecream
  ...
  >=24
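
The extraction of regular customers might look something like the sketch below. The slide only shows ">=24"; treating it as "a customer with at least 24 grouped transactions" is an assumption, as are the function and constant names. The sample call uses a lower threshold so that the small sample from the slide produces a non-empty result.

```python
from collections import Counter

# ">=24" is taken from the slide; what exactly it counts is an assumption.
REGULAR_THRESHOLD = 24


def split_regular(baskets, min_transactions=REGULAR_THRESHOLD):
    """Split baskets into those of regular and of non-regular customers.

    A customer is treated as regular if they have at least
    `min_transactions` grouped transactions (assumed definition).
    """
    visits = Counter(user_id for user_id, _, _ in baskets)
    regular_ids = {uid for uid, n in visits.items() if n >= min_transactions}
    regular = [b for b in baskets if b[0] in regular_ids]
    non_regular = [b for b in baskets if b[0] not in regular_ids]
    return regular, non_regular


# Sample grouped transactions from the slide: (user_id, date, items).
baskets = [
    ("001", "4/1/2007", ["Banana", "Broccoli"]),
    ("001", "11/1/2007", ["Banana", "Carrot"]),
    ("001", "11/8/2007", ["Carrot", "Broccoli", "Lettuce"]),
    ("002", "4/1/2007", ["Banana", "Lettuce"]),
    ("006", "2/1/2007", ["Lettuce"]),
    ("002", "4/1/2007", ["Banana", "Lettuce", "Icecream"]),
]

# Threshold lowered to 3 purely for this tiny sample.
regular, non_regular = split_regular(baskets, min_transactions=3)
print(sorted({b[0] for b in regular}))
```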

11 Generate Association Rule
  Generate the association rules with the FP-Tree algorithm
  Support = 5.0
  Confidence = 20.0
  UserID: 6008944149905810
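
To make the two thresholds concrete, the sketch below computes support and confidence for a single rule by direct counting. It is not an FP-Tree implementation, only the definitions the thresholds apply to. Reading 5.0 and 20.0 as percentages is an assumption, and the baskets are the small sample from the earlier slide rather than real results.

```python
MIN_SUPPORT = 5.0      # from the slide; assumed to be a percentage
MIN_CONFIDENCE = 20.0  # from the slide; assumed to be a percentage


def count_containing(baskets, itemset):
    """Number of baskets that contain every item of the itemset."""
    return sum(1 for basket in baskets if itemset <= basket)


def support(baskets, itemset):
    """Percentage of baskets that contain the whole itemset."""
    return 100.0 * count_containing(baskets, itemset) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Percentage of baskets with the antecedent that also hold the consequent."""
    with_antecedent = count_containing(baskets, antecedent)
    if with_antecedent == 0:
        return 0.0
    with_both = count_containing(baskets, antecedent | consequent)
    return 100.0 * with_both / with_antecedent


# Sample baskets (illustration only, not results from the project).
baskets = [
    {"Banana", "Broccoli"},
    {"Banana", "Carrot"},
    {"Carrot", "Broccoli", "Lettuce"},
    {"Banana", "Lettuce"},
    {"Lettuce"},
    {"Banana", "Lettuce", "Icecream"},
]

rule_support = support(baskets, {"Lettuce", "Banana"})
rule_confidence = confidence(baskets, {"Lettuce"}, {"Banana"})
keep_rule = rule_support >= MIN_SUPPORT and rule_confidence >= MIN_CONFIDENCE
print(rule_support, rule_confidence, keep_rule)
```

An FP-Tree miner finds the frequent itemsets without rescanning every basket for each candidate, but the rules it reports are judged against these same two measures.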

12 The Results: Filter Data
  Filter dataset based on top 3000 list
  Split dataset based on area (Burnside, Elizabeth)
  Burnside contains 1.9 million records
  Elizabeth has 1.5 million transactions

13 Top 50 Frequent Items in Burnside

14 Top 50 Frequent Items in Elizabeth

15 The Results: Group Data
  247292 grouped transactions in Burnside
  165480 transactions in Elizabeth
  The largest single purchase by a customer is 29 items in Burnside
  The largest single purchase by a customer is 51 items in Burnside

16 The Results: Regular Customers
  Shop 462
    total: 165480
    regular customers: 1994
    regular transactions: 92978
    non-regular customers: 17448
    non-regular transactions: 72502
  Shop 453
    total: 247292
    regular customers: 3200
    regular transactions: 171009
    non-regular customers: 20227
    non-regular transactions: 76283
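
The figures above can be checked and turned into shares with a few lines of arithmetic. The dictionary layout and function name below are just for this example. The shop totals match the grouped-transaction counts reported earlier (165480 for Elizabeth, 247292 for Burnside), which suggests that shop 462 is Elizabeth and shop 453 is Burnside.

```python
# Figures copied from the slide.
shops = {
    "462": {
        "total": 165480,
        "regular_customers": 1994,
        "regular_transactions": 92978,
        "non_regular_customers": 17448,
        "non_regular_transactions": 72502,
    },
    "453": {
        "total": 247292,
        "regular_customers": 3200,
        "regular_transactions": 171009,
        "non_regular_customers": 20227,
        "non_regular_transactions": 76283,
    },
}


def regular_transaction_share(stats):
    """Percentage of a shop's grouped transactions made by regular customers."""
    return 100.0 * stats["regular_transactions"] / stats["total"]


for shop_id, stats in shops.items():
    # Regular and non-regular transactions should add up to the total.
    parts = stats["regular_transactions"] + stats["non_regular_transactions"]
    assert parts == stats["total"]
    print(shop_id, round(regular_transaction_share(stats), 1))
```

On these figures, regular customers account for roughly 56% of the grouped transactions at shop 462 and roughly 69% at shop 453, although they are a small minority of the customers at both shops.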

17 The Results: FP-Tree
  Top 10 Association Rules in Burnside
  {AVOCADO, BROCCOLI} --> BANANAS  53.03%
  {CUCUMBERS, BROCCOLI} --> BANANAS
  APPLES --> BANANAS
  PEARS --> BANANAS
  {AVOCADO, ONIONS} --> BANANAS
  ORANGES --> BANANAS
  {BROCCOLI, ONIONS} --> BANANAS
  {BROCCOLI, ZUCCHINI} --> BANANAS
  MANDARINS --> BANANAS

18 The Results: FP-Tree
  Top 10 Association Rules in Elizabeth
  {CUCUMBERS, BROCCOLI} --> BANANAS
  PEARS --> BANANAS
  ORANGES --> BANANAS
  GRAPE --> BANANAS
  {BROCCOLI, SMART BUY CARROTS} --> BANANAS
  WATERMELON --> BANANAS
  {CUCUMBERS, SMART BUY CARROTS} --> BANANAS
  STRAWBERRIES --> BANANAS
  CARROTS --> BANANAS
  {BANANAS, LETTUCE} --> CUCUMBERS

19 Analysis: Nutrition Condition
  Burnside
  Better purchase habits
  Prefer healthy food over unhealthy food
  The top association rules:
    Fruit with fruit
    Fruit with vegetables
    Vegetables with vegetables
    Milk with fruit

20 Analysis: Nutrition Condition
  Elizabeth
  The purchase habits are more varied, with more purchases of soft drinks than in Burnside
  The confidence of the healthy-food associations is lower than in Burnside

21 Analysis: FP-Tree, Burnside

22 Analysis: FP-Tree, Elizabeth

23 Conclusion
  The association rules generated from regular customers show strong relations
  If regular customers are the majority in the all-customers dataset, its association rules show stronger relations than those from non-regular customers

24 Thanks

