
Association Rules (market basket analysis) Retail shops are often interested in associations between different items that people buy. Someone who buys.


1 Association Rules (market basket analysis) Retail shops are often interested in associations between different items that people buy. Someone who buys bread is quite likely also to buy milk. A person who bought the book Database System Concepts is quite likely also to buy the book Operating System Concepts. Association information can be used in several ways, e.g. when a customer buys a particular book, an online shop may suggest associated books. Association rules: bread ⇒ milk; DB-Concepts, OS-Concepts ⇒ Networks. Left-hand side: antecedent; right-hand side: consequent. An association rule must have an associated population; the population consists of a set of instances. E.g. each transaction (sale) at a shop is an instance, and the set of all transactions is the population.

2 Association Rule Definitions Set of items: I = {I_1, I_2, …, I_m}. Transactions: D = {t_1, t_2, …, t_n}, t_j ⊆ I. Itemset: {I_i1, I_i2, …, I_ik} ⊆ I. Support of an itemset: percentage of transactions that contain that itemset. Large (frequent) itemset: itemset whose number of occurrences is above a threshold.

3 Association Rules Example I = { Beer, Bread, Jelly, Milk, PeanutButter}

4 Association Rule Definitions Association Rule (AR): implication X ⇒ Y where X, Y ⊆ I and X ∩ Y = ∅. Support of AR (s) X ⇒ Y: percentage of transactions that contain X ∪ Y. Confidence of AR X ⇒ Y: ratio of the number of transactions that contain X ∪ Y to the number that contain X.

5 Association Rules Ex (cont’d)

6 Of 5 transactions, 3 involve both Bread and PeanutButter, so the support of Bread ⇒ PeanutButter is 3/5 = 60%. Of the 4 transactions that involve Bread, 3 also involve PeanutButter, so its confidence is 3/4 = 75%.
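
The two figures on this slide can be checked with a few lines of Python. This is only a sketch: the transaction table from slide 3 did not survive in this transcript, so the five transactions below are an assumption, chosen to match the counts quoted here (4 transactions with Bread, 3 with both Bread and PeanutButter). The names `TRANSACTIONS`, `count`, `support`, and `confidence` are illustrative.

```python
# Hypothetical transactions over I = {Beer, Bread, Jelly, Milk, PeanutButter}.
# They are NOT taken from the slides; they only reproduce the quoted counts.
TRANSACTIONS = [
    {"Bread", "Jelly", "PeanutButter"},
    {"Bread", "PeanutButter"},
    {"Bread", "Milk", "PeanutButter"},
    {"Beer", "Bread"},
    {"Beer", "Milk"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain `itemset`."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Transactions containing X ∪ Y divided by transactions containing X."""
    both = frozenset(antecedent) | frozenset(consequent)
    return count(both, transactions) / count(antecedent, transactions)


s = support({"Bread", "PeanutButter"}, TRANSACTIONS)        # 3/5
c = confidence({"Bread"}, {"PeanutButter"}, TRANSACTIONS)   # 3/4
print(f"support = {s:.0%}, confidence = {c:.0%}")
```

Confidence is computed from raw counts rather than from two support fractions, which avoids a floating-point rounding step and mirrors the "3 of the 4" wording on the slide.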

7 Association Rule Problem Given a set of items I = {I_1, I_2, …, I_m} and a database of transactions D = {t_1, t_2, …, t_n} where t_i = {I_i1, I_i2, …, I_ik} and I_ij ∈ I, the Association Rule Problem is to identify all association rules X ⇒ Y with a minimum support and confidence (supplied by the user). NOTE: The support of X ⇒ Y is the same as the support of X ∪ Y.

8 Association Rule Algorithm (Basic Idea) 1. Find large itemsets. 2. Generate rules from frequent itemsets. This is the simple naïve algorithm; better algorithms exist.

9 Association Rule Algorithm We are generally only interested in association rules with reasonably high support (e.g. support of 2% or greater). Naïve algorithm: 1. Consider all possible sets of relevant items. 2. For each set, find its support (i.e. count how many transactions purchase all items in the set). Large itemsets: sets with sufficiently high support. Use large itemsets to generate association rules: from itemset A generate the rule A - {b} ⇒ b for each b ∈ A. Support of rule = support(A). Confidence of rule = support(A) / support(A - {b}).
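
Steps 1 and 2 of the naïve algorithm could be sketched as follows. The function name `naive_large_itemsets`, the sample transactions, and the 40% threshold are all assumptions made for illustration; none of them come from the slides. The sketch enumerates every non-empty subset of the items, so its cost grows exponentially with the number of items, which is why better algorithms exist.

```python
from itertools import combinations

# Hypothetical transactions, used only to exercise the function.
TRANSACTIONS = [
    {"Bread", "Jelly", "PeanutButter"},
    {"Bread", "PeanutButter"},
    {"Bread", "Milk", "PeanutButter"},
    {"Beer", "Bread"},
    {"Beer", "Milk"},
]


def naive_large_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    large = {}
    # Step 1: consider all possible non-empty sets of items.
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            cset = frozenset(candidate)
            # Step 2: count the transactions that contain the whole set.
            hits = sum(1 for t in transactions if cset <= t)
            if hits / n >= min_support:
                large[cset] = hits / n
    return large


for itemset, s in naive_large_itemsets(TRANSACTIONS, 0.4).items():
    print(sorted(itemset), s)
```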

10 From itemset A generate the rule A - {b} ⇒ b for each b ∈ A. Support of rule = support(A). Confidence of rule = support(A) / support(A - {b}). Let's say itemset A = {Bread, Butter, Milk}. Then A - {b} ⇒ b for each b ∈ A includes 3 possibilities: {Bread, Butter} ⇒ Milk; {Bread, Milk} ⇒ Butter; {Butter, Milk} ⇒ Bread.
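
The rule-generation step for A = {Bread, Butter, Milk} might look like this in Python. It is a sketch under stated assumptions: the support values in `SUPPORT` are invented for illustration (the slides give none for this itemset), and `rules_from_itemset` is a hypothetical helper name.

```python
def rules_from_itemset(itemset, support):
    """Build one rule A - {b} => b for each item b in A.

    `support` maps frozensets to their support. Each rule is returned as
    (antecedent, consequent, rule_support, rule_confidence).
    """
    a = frozenset(itemset)
    if len(a) < 2:
        return []  # a single item would leave an empty antecedent
    rules = []
    for b in sorted(a):
        antecedent = a - {b}
        conf = support[a] / support[antecedent]
        rules.append((antecedent, b, support[a], conf))
    return rules


# Invented support values, not data from the slides.
SUPPORT = {
    frozenset({"Bread", "Butter", "Milk"}): 0.2,
    frozenset({"Bread", "Butter"}): 0.4,
    frozenset({"Bread", "Milk"}): 0.25,
    frozenset({"Butter", "Milk"}): 0.2,
}

for antecedent, b, s, conf in rules_from_itemset({"Bread", "Butter", "Milk"}, SUPPORT):
    print(f"{sorted(antecedent)} => {b}: support={s:.2f}, confidence={conf:.2f}")
```

All three rules share the support of A; only the confidence differs, because each rule divides by the support of a different antecedent.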

11 Apriori Large Itemset Property: Any subset of a large itemset is large. Contrapositive: If an itemset is not large, none of its supersets are large.

12 Large Itemset Property

13 If B is not frequent, then none of the supersets of B can be frequent. If {ACD} is frequent, then all subsets of {ACD} ({AC}, {AD}, {CD}) must be frequent, and so must their subsets ({A}, {C}, {D}).
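
One common way to exploit the large itemset property is to discard a candidate itemset as soon as one of its immediate subsets is known not to be frequent, without counting it against the transactions. The sketch below assumes that use; `prune_candidates` is a hypothetical helper, and the frequent pairs are made up to mirror the {ACD} example (B is not frequent, so no pair containing B appears).

```python
from itertools import combinations


def prune_candidates(candidates, frequent_smaller):
    """Keep candidates whose every (k-1)-item subset is known to be frequent."""
    kept = []
    for candidate in candidates:
        c = frozenset(candidate)
        subsets = (frozenset(s) for s in combinations(c, len(c) - 1))
        if all(s in frequent_smaller for s in subsets):
            kept.append(c)
    return kept


# Made-up frequent 2-itemsets: AC, AD and CD are frequent, nothing with B is.
FREQUENT_PAIRS = {frozenset("AC"), frozenset("AD"), frozenset("CD")}

candidates = [{"A", "C", "D"}, {"A", "B", "C"}]
print(prune_candidates(candidates, FREQUENT_PAIRS))
```

Here {A, C, D} survives because all three of its pairs are frequent, while {A, B, C} is dropped because {A, B} and {B, C} are not.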

14 My Personal View of Association Rules A vastly overstudied problem, of dubious utility.

15 Student Presentations Starting next week, students will be giving presentations. The presentation can be on the student project or on a paper chosen by the student (per my approval). The presentation should last 8 to 15 minutes. You need to tell me in advance how long the talk will be. You must email me the slides by midnight before the talk. There will be a signup sheet (topic and date) on my door tomorrow.

16 Tips for Giving a Good Talk Winter 2003 Dr Eamonn Keogh Computer Science & Engineering Department University of California - Riverside Riverside, CA 92521 eamonn@cs.ucr.edu Modified from the notes of Edward R. Tufte, Craig S. Kaplan, Eamonn Keogh and others

17 Outline Advice on giving talks General advice Organization Making clear overheads Avoiding common pitfalls Conclusion

18 General Advice I Show up early. You may have a chance to head off some technical or ergonomic problem. Have a backup plan. If your lecture is based on a PowerPoint presentation, have overhead backups of each page. Check out the room ahead of time. Before your talk, check out the room, and make sure it has everything you need.

19 General Advice II Never apologize. Most people wouldn’t have noticed the issues for which you’re apologizing, and it just sounds lame. Invest in a laser pointer. They are inexpensive and extremely useful. Rehearse timing. Getting the timing wrong is the most common sin!!!

20 Overheads I Use large fonts. Use the biggest fonts realistically possible. Small fonts are hard to read. Use highly contrasting colors. Avoid busy backgrounds. Too much in the background makes the text hard to read.

21 Overheads II Avoid using red text. Red text is often hard to read. AVOID ALL CAPS! All caps look like you're shouting. …Include a good combination of words, pictures, and graphics. A variety keeps the presentation interesting

22 Overheads III Be terse. Instead of “The sales forecasts show an increase on the horizon”, write “Sales are up”. Use bullets or numbered items appropriately. Goals: ease of use, reusability, reliability. Outline of our method: 1. Design 2. Implementation 3. Testing

23 Overheads IV Begin with an introduction slide (who you are, why you are giving a talk, the title of the talk). Next, give an outline (“roadmap”). For a short talk, you might want to combine this with the above. State your point (one simple slide). Demonstrate your point (a few slides). Review your point (one simple slide).

24 Overheads V End with a slide that reviews the entire talk… We introduced the TSP problem. We explained why it is an important problem. We explained why it is a hard problem. We introduced a new heuristic to solve TSP. We empirically demonstrated the utility of our approach. End “cleanly”; don’t fade away.

25 Overheads VI Avoid using “standard” clipart, backgrounds, etc. I have seen this at least 20 times in conference presentations.

26 Overheads VII Be careful with acronyms… [Example figure crowded with unexplained labels: C_max, C_min, Range_i, Diameter_i, R_1, D_1, R_2, D_2, “Neighboring Unlabeled Token”]

27 Annoying Personal Habits I (This means you) Playing with jewelry; licking and/or biting your lips; constantly adjusting your glasses; popping the top of a pen; playing with facial hair (men); playing with/twirling your hair (women).

28 Annoying Personal Habits II (This means you) Jingling change in your pocket; leaning against anything for support; fillers such as “ah”, “um”, and “and” (“basically” and “essentially” seem to be the current favorites); starting every sentence with the same word; sticky floor syndrome; avoiding eye contact; lack of enthusiasm.

29 Conclusion We have motivated the need for a high-quality talk. We have seen various tips on creating high-quality overheads. We have seen various hints on avoiding common pitfalls.

30 Questions? Dr Eamonn Keogh Computer Science & Engineering Department University of California - Riverside Riverside, CA 92521 eamonn@cs.ucr.edu

