# Hypothesis testing and parameter estimation Bhuvan Urgaonkar “Empirical methods in AI” by P. Cohen.


2 System behavior in unknown situations

- Self-tuning systems ought to behave properly in situations not previously encountered
- How to quantify the goodness of a system in dealing with unknown situations?
- Statistical inference is one way

3 Statistical inference

- Process of drawing inferences about an unseen population given a relatively small sample
- Populations and samples
- Statistics: functions on samples
- Parameters: functions on populations

4 Examples

- Example 1: Toss a fair coin
  - Parameter: the expected number of heads in 10 tosses
  - Can be determined analytically
- Example 2: Two chess programs A and B play 15 games; A wins 10, draws 2, loses 3
  - Parameter: the probability that A wins, $p_{win}$
  - The population of all possible chess games is too large to enumerate, so we cannot know the exact value
    - We can estimate $p_{win}$ as $p = 10/15 \approx 0.67$
    - $p$ is a statistic derived from the above sample

5 Two kinds of statistical inference

- Hypothesis testing: answer a yes-or-no question about a population and assess the probability that the answer is wrong
  - Assume $p_{win} = 0.5$ and assess the probability of the sample result $p = 0.67$
  - If this probability is very small, conclude that A and B are probably not equally strong
- Parameter estimation: estimate the true value of a parameter given a statistic
  - If $p = 0.67$, what is the “best” estimate of $p_{win}$?
  - How wide an interval must be drawn around $p$ to be confident that $p_{win}$ falls within it?
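Both kinds of inference can be sketched for the chess example. This is a minimal illustration, not a method from the slides: it assumes each game is an independent win/not-win trial (draws count as not-wins), uses a one-sided binomial tail for the test, and uses a normal-approximation interval for the estimate. The names `binomial_tail`, `tail`, and `interval` are made up for this sketch.

```python
from math import comb, sqrt

def binomial_tail(n: int, k: int, p: float) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_games, wins = 15, 10
p_hat = wins / n_games  # the statistic p, about 0.67

# Hypothesis testing: how likely are 10 or more wins if p_win were 0.5?
tail = binomial_tail(n_games, wins, 0.5)

# Parameter estimation: approximate 95% interval around p_hat
# (normal approximation; rough for a sample as small as 15 games).
se = sqrt(p_hat * (1 - p_hat) / n_games)
interval = (p_hat - 1.96 * se, p_hat + 1.96 * se)

print(f"P(>= {wins} wins | p_win = 0.5) = {tail:.3f}")
print(f"approx. 95% interval for p_win: ({interval[0]:.2f}, {interval[1]:.2f})")
```

Under these assumptions the tail probability is not especially small, so 15 games alone would give only weak grounds for rejecting the hypothesis that the programs are equal.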

7 Hypothesis testing example

- Two programs, A and B, summarize news stories
  - Performance is measured as recall: the proportion of the important parts of a story that make it into the summary
- Suppose you run A every day for 120 days and each day record the mean recall score over 10 stories
- Then you run B and want to answer:
  - Is B better than A?

8 Hypothesis testing steps

- Formulate a null hypothesis
  - mean(A) = mean(B)
- Gather a sample of 10 news stories and run them through B; call the sample mean Emean(B)
- Assuming the null hypothesis is true, estimate the distribution of mean recall scores for all possible samples of size 10 run through B
- Calculate the probability of obtaining a value at least as extreme as Emean(B) given this distribution
- If this probability is low, reject the null hypothesis

10 Sampling distributions

- Distribution of a statistic calculated from all possible samples of a given size, drawn from a given population
- Example: two tosses of a fair coin; let the sample statistic be the number of heads
  - The sampling distribution is discrete
  - Its elements are 0, 1, 2 with probabilities 0.25, 0.5, 0.25
- How to get sampling distributions?
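The two-toss example is small enough to enumerate directly. The sketch below lists every equally likely sequence of tosses and tallies the number of heads; the function name `sampling_distribution` is an illustrative choice, not something from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def sampling_distribution(n_tosses: int) -> dict[int, Fraction]:
    """Exact distribution of the number of heads over all fair-coin sequences."""
    outcomes = list(product("HT", repeat=n_tosses))  # every equally likely sample
    counts = Counter(seq.count("H") for seq in outcomes)
    return {heads: Fraction(c, len(outcomes)) for heads, c in sorted(counts.items())}

dist = sampling_distribution(2)
for heads, prob in dist.items():
    print(heads, float(prob))
```

Enumeration only works while the number of possible samples stays small, which is why the later slides turn to exact formulas, the central limit theorem, and simulation.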

11 Exact sampling distributions

- A coin is tossed 20 times and lands heads 16 times
  - Is the coin fair?
- Consider the sampling distribution of the proportion of heads $p_h$ under the null hypothesis that the coin is fair
- It is easy to calculate exact probabilities of all the values of $p_h$ for $N$ coin tosses
  - Possible values: $0/N, 1/N, \ldots, N/N$
  - $\Pr(p_h = i/N) = \dfrac{N!}{i!\,(N-i)!} \cdot 0.5^N$
  - $\Pr(p_h = 16/20) = 0.0046$: very unlikely if the coin is fair
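The exact probability on this slide can be checked with a few lines of Python. The sketch evaluates the binomial formula for a fair coin; the helper name `prob_heads` and the extra one-sided tail (16 or more heads) are additions for illustration.

```python
from math import comb

def prob_heads(i: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly i heads in n tosses: C(n, i) * p^i * (1-p)^(n-i)."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)

p16 = prob_heads(16, 20)                                # exactly 16 heads in 20 tosses
tail16 = sum(prob_heads(i, 20) for i in range(16, 21))  # 16 or more heads

print(f"Pr(p_h = 16/20)  = {p16:.4f}")
print(f"Pr(p_h >= 16/20) = {tail16:.4f}")
```

A test would normally use the tail probability, since a result even more extreme than 16 heads would count against the null hypothesis at least as strongly.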

12 Estimated sampling distributions

- Unlike the sampling distribution of the proportion, that of the mean cannot be calculated exactly
  - Recall the news story example
- It can, however, be estimated thanks to a remarkable theorem

13 Central limit theorem

- The sampling distribution of the mean of samples of size N approaches a normal distribution as N increases
  - If samples are drawn from a population with mean M and standard deviation SD, then the mean of the sampling distribution is M and its standard deviation is SD/sqrt(N)
  - This holds irrespective of the shape of the population distribution, provided its variance is finite
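A small simulation makes the theorem concrete. The sketch below assumes a strongly skewed population (exponential with mean 1 and standard deviation 1, chosen only for illustration) and compares the observed spread of sample means against the predicted SD/sqrt(N). The sample size, trial count, and seed are arbitrary.

```python
import random
import statistics

def sample_means(n: int, trials: int, rng: random.Random) -> list[float]:
    """Means of `trials` samples of size n from an exponential(1) population."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(trials)
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 30
means = sample_means(n, 20_000, rng)

observed_mean = statistics.fmean(means)
observed_sd = statistics.stdev(means)
predicted_sd = 1.0 / n**0.5  # SD / sqrt(N) with SD = 1

print(f"mean of sample means: {observed_mean:.3f} (population mean 1.0)")
print(f"sd of sample means:   {observed_sd:.3f} (predicted {predicted_sd:.3f})")
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the population itself is far from normal.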

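14 The missing piece in hypothesis testing

- Null hypothesis
  - mean(A) = mean(B)
- We don’t know the sampling distribution of Emean(B), but we can estimate that of Emean(A) from the 120 days of recorded results
  - CLT: the sampling distribution of Emean(A) is centered on mean(A), which equals mean(B) under the null hypothesis
  - So, under the null hypothesis, the distribution estimated for A can stand in for that of Emean(B)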
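Putting the pieces together, the news-summarizer test might be sketched as follows. The 120 daily means for A are each a mean over 10 stories, so their spread directly estimates the standard error for samples of size 10. The numbers are synthetic placeholders generated in the code, not results from the slides, and the function name `z_test_upper` is hypothetical.

```python
import random
import statistics
from statistics import NormalDist

def z_test_upper(a_daily_means: list[float], b_mean: float) -> tuple[float, float]:
    """One-sided test of 'B is no better than A' using A's daily means.

    Returns the z score of b_mean and the probability, under the null
    hypothesis, of a sample mean at least that large.
    """
    mu = statistics.fmean(a_daily_means)   # estimate of mean(A)
    se = statistics.stdev(a_daily_means)   # spread of means of 10 stories
    z = (b_mean - mu) / se
    return z, 1.0 - NormalDist().cdf(z)

# Synthetic placeholder data: 120 daily mean recall scores for A.
rng = random.Random(1)
a_daily = [rng.gauss(0.60, 0.05) for _ in range(120)]

z, p_value = z_test_upper(a_daily, b_mean=0.70)  # 0.70 is a made-up Emean(B)
print(f"z = {z:.2f}, P(Emean >= 0.70 | null) = {p_value:.4f}")
```

If the printed probability falls below a chosen threshold such as 0.05, the null hypothesis would be rejected in favor of B being better than A.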

15 Computer-aided methods for estimating sampling distributions

- Use simulation to estimate the sampling distribution
- Monte Carlo tests
  - Used if the population distribution is known but the sampling distribution of the test statistic is not
  - Draw samples from this known distribution
- Bootstrap methods
  - The population distribution is unknown
  - Idea: resample from the sample (treat the sample as the population!)
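The bootstrap idea fits in a few lines. The sketch below resamples with replacement from a single sample to approximate the sampling distribution of the median, then reads off a rough 95% percentile interval. The recall scores are hypothetical values invented for the example, and `bootstrap_distribution` is an illustrative name.

```python
import random
import statistics
from typing import Callable, Sequence

def bootstrap_distribution(
    sample: Sequence[float],
    statistic: Callable[[Sequence[float]], float],
    n_resamples: int,
    rng: random.Random,
) -> list[float]:
    """Statistic computed on resamples drawn with replacement from the sample."""
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)]

# Hypothetical recall scores for 10 stories (made up for this sketch).
sample = [0.52, 0.61, 0.58, 0.70, 0.66, 0.49, 0.73, 0.64, 0.55, 0.68]

rng = random.Random(42)
boot = sorted(bootstrap_distribution(sample, statistics.median, 5000, rng))

# Rough 95% percentile interval for the median.
lo = boot[int(0.025 * len(boot))]
hi = boot[int(0.975 * len(boot)) - 1]
print(f"sample median = {statistics.median(sample):.3f}, interval = ({lo:.3f}, {hi:.3f})")
```

A Monte Carlo test follows the same loop, except that each sample is drawn from the known population distribution instead of from the observed sample.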

16 Other related concepts/techniques

- Hypothesis tests that work under different conditions
  - Z-test, t-test (small values of N)
  - Ref: Paul Cohen
- Parameter estimation
  - Confidence intervals
  - Analysis of variance: interaction among variables
  - Contingency tables
  - Ref: Paul Cohen
- Expectation maximization
  - X: observed data, Z: unobserved data; let $Y = X \cup Z$
  - Searches for the hypothesis $h$ that maximizes $E[\ln P(Y \mid h)]$
  - Ref: “Machine Learning” by Tom Mitchell
