Presentation is loading. Please wait.

Presentation is loading. Please wait.

By Arijit Chatterjee Dr

Similar presentations


Presentation on theme: "By Arijit Chatterjee Dr"— Presentation transcript:

1 MAXIMUM LIKELIHOOD USED TO CALCULATE CONFIDENCE IN ASSOCIATION RULE MINING
By Arijit Chatterjee Dr. William Perrizo Computer Science Department North Dakota State University

2 Contents Data Mining Applications Association Rule Mining
Market Basket Analysis Statistical Estimation and Maximum Likelihood Function MLF used to calculate confidence and collective confidence of Association Rules

3 Data Mining Data mining is becoming an important tool and a highly potent process capable of discovering and extracting important information from huge volumes of raw data. Data Mining is sometimes erroneously treated synonymously to the broader process of Knowledge and Data Discovery (KDD). Analyzing data from different perspectives and summarizing it into useful information.

4 Knowledge Discovery in Database(KDD)
Reference :

5 Applications of Data Mining
Business Science and Engineering Spatial Data Mining Surveillance Games

6 Data Mining Tasks Characterization – General characteristics and features of a specific class of data. Discrimination – Characteristics in comparison with contrasting classes. Classification – Using training dataset with known class samples to come up with a model that predicts the unknown class labels of new samples.

7 continued….. Clustering – Process of grouping data objects into different clusters. Outlier Detection – Process of studying the objects that lie scattered around a dataset. Association Rule Mining – Popular method in finding interesting relations between variables in large databases.

8 Association Rules Association rule mining finds interesting associations and/or correlation relationships among large set of data items. Association rules show attribute value conditions that occur frequently together in a given dataset. A typical and widely-used example of association rule mining is Market Basket Analysis.

9 Market Basket Analysis
Definition: Market Basket Analysis sometimes also referred to as Association Analysis is a mathematical modeling technique based upon the theory that if you buy certain group of items ,you are likely to buy another group of items. Usage: Understanding behavior of shoppers Basket data consist of collection of transaction date and items bought in a transaction Item set Retail organizations interested in generating qualified decisions and strategy based on analysis of transaction data what to put on sale, how to place merchandise on shelves for maximizing profit, customer segmentation based on buying pattern

10 Rules and Examples Association rules provide information of this type in the form of "if-then" statements. These rules are computed from the data and, unlike the if-then rules of logic, association rules are probabilistic in nature.  Rule form: LHS -> RHS IF a customer buys diapers, THEN they also buy beer diapers -> beer “Transactions in which bread and butter are purchased also purchased milk” bread  butter  milk In association analysis the antecedent and consequent are sets of items (called item sets) that are disjoint (do not have any items in common). 

11 How can Maximum Likelihood Function be used to Calculate Confidence Of Association Rules In Market Baskets ?

12 Statistical Estimation
The Theory of Estimation requires a random sample x1, x2…………………xn on a variable x whose distribution in the population involves an unknown parameter . It is required to find an estimate of  on the basis of random values. The estimation is done in two different ways: i) Point Estimation and ii) Interval Estimation. In Point Estimation, the estimated value is given by a single quantity which is a function of sample observations (i.e. statistic). This function is called the ‘estimator’ of the parameter and the value of the estimator in a particular example is called an ‘estimate’. Consider, x1, x2…………………xn be a random sample from a population with probability mass function, f(x,), where  is the parameter. Then the joint distribution of the sample observations denoted as L is defined by: L = f(x1, ) f(x2, ) f(x3, ) f(x4,)..f(xn, )

13 Method of Maximum Likelihood
The method of Maximum Likelihood consists in choosing as an estimator of  and that statistic when substituted for  maximizes the likelihood function L. Such a statistic is called the Maximum Likelihood Estimator (M.L.E). Since log L is maximum when L is maximum the maximum likelihood estimator of  is obtained by maximizing log L. This is achieved by differentiating log L partially with respect to  and using the two relations as listed in the following: The Two Conditions:-  log L = when  =0  2 log L < 0 when  =0 2

14 Probability Distribution
Probability Distribution of a random variable is a statement specifying the set of its possible values together with their respective probabilities. When a random experiment is theoretically assumed to serve as a model, the probabilities can be given as a function of the random variable. The probability distribution concerned is then generally known as theoretical distribution. The Binomial Distribution is a discrete probability distribution and is defined by the p.m.f f(x) = nCx px (1-p)n-x (x = 0,1,2,………n) where p is the probability of success.

15 Example : If a supermarket database has 100,000 point of sale transactions, out of which 4,000 include both items A and B and 1000 of these include item C then the association rule “If A and B are purchased then C is purchased in the same trip” has a support of 1% and a confidence of 25%.

16 Can we use the Maximum Likelihood Function to calculate the confidence of an Association Rule ?
Let p be the probability that when items A and B are purchased then item C is also purchased of the transactions have items A and B and out of those 1000 include item C. So the probability that item C occurs 1000 times in 4000 trials is 4000C1000 p1000 (1-p) Since this is a single association rule then the likelihood function is given by L = 4000C1000 p1000 (1-p) = 4000C1000 p1000 (1-p) 3000 log L = log (4000C1000) log p log (1-p) Differentiating with respect to p on both sides :  Log L = 1000/p – 3000/ (1-p) P The maximum likelihood estimator po is therefore obtained by solving 1000/ po – 3000/ (1- po) = 0 Solving , po = .25

17 Using Maximum Likelihood Function to calculate collective Confidence of Association Rules
Let’s assume that out of n items in which A & B are together purchased x number of those items contain item C. If p is the probability of the item C to be occurring then the likelihood function of this rule can be expressed as follows: L1 = nCx px (1-p)n-x where (x = 0,1,2,………n) (i) If we have another independent transaction which shows that “When items D & E are purchased then also item C is also purchased in the same trip”. If out of m items in which items D & E are together purchased y number of those items contain C then the likelihood function of this association rule can be expressed as follows: L2 = mCy py (1-p)m-y where (y = 0,1,2,………m) (ii) Based on both transactions the collective confidence of both of the association rules can be thought to be the maximum likelihood estimate of buying item C whenever a transaction happens.

18 The likelihood function L is given by the product of these likelihood functions. L = L1 . L2 = nCx px (1-p)n-x . mCy py (1-p)m-y =( nCx mCy ) . px+y . (1-p)[(m+n)-(x+y)] log L=log( nCx mCy ) + (x+y)log p +[(m+n)-(x+y)]log(1-p) Differentiating with respect to p on both sides:  Log L = (x+y)/p – [(m+n)-(x+y)] / (1-p) P The maximum likelihood estimator po is therefore obtained by solving (x+y)/ po – [(m+n)-(x+y)] / (1- po) = 0 Solving, po = (x+y) / (m+n) The collective confidence of selecting item C from the association rules is thus found to be (x+y) / (m+n) .

19


Download ppt "By Arijit Chatterjee Dr"

Similar presentations


Ads by Google