Hypothesis testing and parameter estimation
Bhuvan Urgaonkar
Ref: “Empirical methods in AI” by P. Cohen

2 System behavior in unknown situations
- Self-tuning systems ought to behave properly in situations not previously encountered
- How do we quantify how well a system deals with unknown situations?
- Statistical inference is one way

3 Statistical inference
- Process of drawing inferences about an unseen population given a relatively small sample
- Populations and samples
- Statistics: functions on samples
- Parameters: functions on populations

4 Examples
- Example 1: Toss a fair coin
  - Parameter: number of heads in 10 tosses
  - Can be determined analytically
- Example 2: Two chess programs A and B play 15 games; A wins 10, draws 2, loses 3
  - Parameter: probability that A wins, p_win
  - The population of all possible chess games is too large to enumerate, so we cannot know the exact value
  - We can estimate p_win as p = 10/15 ≈ 0.67
  - p is a statistic derived from the above sample

5 Two kinds of statistical inference
- Hypothesis testing: answer a yes-or-no question about a population and assess the probability that the answer is wrong
  - Assume p_win = 0.5 and assess the probability of the sample result p = 0.67
  - If this probability is very small, conclude that A and B are not equally strong
- Parameter estimation: estimate the true value of a parameter given a statistic
  - If p = 0.67, what is the “best” estimate of p_win?
  - How wide an interval must we draw around p to be confident that p_win falls within it?
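
Both kinds of inference can be tried on the chess example. The sketch below is only illustrative and rests on assumptions the slides do not state: draws are counted as non-wins, the test is one-sided (A at least as successful as observed), and the interval uses the normal (Wald) approximation, which is rough for only 15 games. The function names are made up for this example.

```python
from math import comb, sqrt


def binomial_tail(wins: int, games: int, p_null: float = 0.5) -> float:
    """P(X >= wins) for X ~ Binomial(games, p_null)."""
    return sum(
        comb(games, k) * p_null**k * (1 - p_null) ** (games - k)
        for k in range(wins, games + 1)
    )


def wald_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% interval for a proportion (normal approximation)."""
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


wins, games = 10, 15          # sample from the slides (draws treated as non-wins)
p_hat = wins / games          # the statistic p, about 0.67

# Hypothesis testing: how likely is a result at least this good if p_win = 0.5?
p_value = binomial_tail(wins, games)

# Parameter estimation: how wide an interval around p?
low, high = wald_interval(p_hat, games)

print(f"p = {p_hat:.2f}, P(X >= {wins} | p_win = 0.5) = {p_value:.3f}")
print(f"approximate 95% interval for p_win: ({low:.2f}, {high:.2f})")
```

Under these assumptions the tail probability is about 0.15, which is not "very small", and the interval is wide; 15 games say little about p_win.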

7 Hypothesis testing example
- Two programs A and B that summarize news stories
  - Performance is measured as recall, the proportion of the important parts of a story that make it into the summary
- Suppose you run A every day for 120 days and record each day's mean recall score over 10 stories
- Then you run B and want to answer:
  - Is B better than A?

8 Hypothesis testing steps
- Formulate a null hypothesis
  - mean(A) = mean(B)
- Gather a sample of 10 news stories and run them through B. Call the sample mean Emean(B)
- Assuming the null hypothesis is right, estimate the distribution of mean recall scores for all possible samples of size 10 run through B
- Calculate the probability of obtaining Emean(B) given this distribution
- If this probability is low, reject the null hypothesis

10 Sampling distributions
- Distribution of a statistic calculated from all possible samples of a given size, drawn from a given population
- Example: two tosses of a fair coin; let the sample statistic be the number of heads
  - Sampling distribution is discrete
  - Elements are 0, 1, 2 with probabilities 0.25, 0.5, 0.25
- How to get sampling distributions?
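
For a sample this small, the sampling distribution can be obtained by listing every possible sample. A minimal sketch (the function name is illustrative, and enumeration is only practical for a handful of tosses):

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sampling_distribution(n_tosses: int) -> dict[int, Fraction]:
    """Exact distribution of the number of heads over all equally likely samples."""
    outcomes = list(product("HT", repeat=n_tosses))      # every possible sample
    counts = Counter(sample.count("H") for sample in outcomes)
    return {heads: Fraction(c, len(outcomes)) for heads, c in sorted(counts.items())}


dist = sampling_distribution(2)
for heads, prob in dist.items():
    print(heads, float(prob))
# prints:
# 0 0.25
# 1 0.5
# 2 0.25
```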

11 Exact sampling distributions
- Coin tossed 20 times, number of heads = 16
  - Is the coin fair?
- Sampling distribution of the proportion p_h under the null hypothesis that the coin is fair
- Easy to calculate exact probabilities of all the values of p_h for N coin tosses
  - Possible values: 0/N, 1/N, …, N/N
  - $\Pr(p_h = i/N) = \frac{N!}{i!\,(N-i)!} \cdot 0.5^N$
  - $\Pr(p_h = 16/20) \approx 0.0046$: next to impossible!
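
The exact probabilities for the coin example can be checked directly from the binomial formula. A small sketch; the tail probability (16 or more heads) is an addition here, since a test usually asks about results at least as extreme as the one observed:

```python
from math import comb


def prob_heads(i: int, n: int, p: float = 0.5) -> float:
    """P(exactly i heads in n tosses) for a coin with heads probability p."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)


n_tosses, observed_heads = 20, 16

exact = prob_heads(observed_heads, n_tosses)
tail = sum(prob_heads(i, n_tosses) for i in range(observed_heads, n_tosses + 1))

print(f"P(p_h = 16/20)  = {exact:.4f}")   # about 0.0046
print(f"P(p_h >= 16/20) = {tail:.4f}")    # about 0.0059
```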

12 Estimated sampling distributions
- Unlike the sampling distribution of the proportion, that of the mean cannot be calculated exactly
  - Recall the news story example
- It can, however, be estimated, thanks to a remarkable theorem

13 Central limit theorem
- The sampling distribution of the mean of samples of size N approaches a normal distribution as N increases
  - If samples are drawn from a population with mean M and std. dev. SD, then the mean of the sampling distribution is M and its std. dev. is SD/sqrt(N)
  - This holds irrespective of the shape of the population distribution!
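
The theorem's claims about the mean and standard deviation can be checked by simulation. The sketch below assumes an exponential population with rate 1 (so M = 1 and SD = 1), chosen only because it is clearly non-normal; the sample size and number of samples are arbitrary.

```python
import random
from math import sqrt
from statistics import fmean, pstdev


def sample_means(draw, sample_size: int, n_samples: int) -> list[float]:
    """Means of n_samples independent samples, each of size sample_size."""
    return [fmean(draw() for _ in range(sample_size)) for _ in range(n_samples)]


rng = random.Random(0)                 # fixed seed so the run is repeatable
N = 30                                 # sample size
M, SD = 1.0, 1.0                       # mean and std. dev. of Exponential(rate=1)

means = sample_means(lambda: rng.expovariate(1.0), N, n_samples=20_000)

print(f"mean of sample means: {fmean(means):.3f}  (theory: {M:.3f})")
print(f"std. dev. of sample means: {pstdev(means):.3f}  (theory: {SD / sqrt(N):.3f})")
```

A histogram of `means` would also look roughly bell-shaped even though the population itself is strongly skewed.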

14 The missing piece in hypothesis testing
- Null hypothesis
  - mean(A) = mean(B)
- We don’t know the sampling distribution of Emean(B), but we do know the distribution of Emean(A)!
  - CLT: the distribution of Emean(A) is centered on mean(A), which equals mean(B) under the null hypothesis
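
Putting the pieces together for the news story example: the 120 daily means of A stand in for the sampling distribution of Emean(B) under the null hypothesis. The sketch below is one possible reading, not the slides' own procedure. The recall numbers are invented, the test is one-sided ("is B better?"), and it relies on the normal approximation from the CLT.

```python
import random
from statistics import NormalDist, fmean, stdev


def z_test_upper(sample_mean: float, null_means: list[float]) -> tuple[float, float]:
    """One-sided z-test: is sample_mean unusually high relative to null_means?

    null_means are sample means observed under the null hypothesis; their
    spread serves as the standard error of the sampling distribution.
    """
    center = fmean(null_means)
    std_error = stdev(null_means)
    z = (sample_mean - center) / std_error
    p_value = 1.0 - NormalDist().cdf(z)   # P(Z >= z) under the standard normal
    return z, p_value


rng = random.Random(1)
# Hypothetical data: A's mean recall over 10 stories on each of 120 days.
daily_means_a = [rng.gauss(0.60, 0.03) for _ in range(120)]
emean_b = 0.70                            # hypothetical mean recall of B on 10 stories

z, p_value = z_test_upper(emean_b, daily_means_a)
print(f"z = {z:.2f}, one-sided p = {p_value:.4f}")
if p_value < 0.05:                        # 0.05 is a conventional, arbitrary threshold
    print("reject the null hypothesis mean(A) = mean(B)")
```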

15 Computer-aided methods for estimating sampling distributions
- Use simulation to estimate the sampling distribution
- Monte Carlo tests
  - Used when the population distribution is known but the sampling distribution of the test statistic is not
  - Draw samples from this known distribution
- Bootstrap methods
  - Population distribution is unknown
  - Idea: resample from the sample (treat the sample as the population!)
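
Both simulation approaches fit in a few lines. This is a rough sketch with made-up inputs: the "known" population for the Monte Carlo case is assumed uniform on [0, 1), and the ten recall scores for the bootstrap case are invented. The function names are illustrative.

```python
import random
from statistics import fmean, stdev


def monte_carlo_distribution(draw, statistic, sample_size, n_rep, rng):
    """Population known: draw fresh samples from it and record the statistic."""
    return [statistic([draw(rng) for _ in range(sample_size)]) for _ in range(n_rep)]


def bootstrap_distribution(sample, statistic, n_rep, rng):
    """Population unknown: resample the sample itself, with replacement."""
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(n_rep)]


rng = random.Random(42)

# Monte Carlo: sampling distribution of the mean of 10 uniform draws.
mc_means = monte_carlo_distribution(lambda r: r.random(), fmean, 10, 5000, rng)

# Bootstrap: sampling distribution of the mean recall from one observed sample.
observed = [0.52, 0.61, 0.58, 0.70, 0.66, 0.49, 0.63, 0.57, 0.68, 0.60]  # invented
boot_means = bootstrap_distribution(observed, fmean, 5000, rng)

print(f"Monte Carlo: mean {fmean(mc_means):.3f}, std. dev. {stdev(mc_means):.3f}")
print(f"Bootstrap:   mean {fmean(boot_means):.3f}, std. dev. {stdev(boot_means):.3f}")
```

Any statistic can be passed in place of `fmean` (the median, for instance), which is the main appeal of these methods when no formula for the sampling distribution is available.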

16 Other related concepts/techniques
- Hypothesis tests that work under different conditions
  - Z-test, t-test (small values of N)
  - Ref: Paul Cohen
- Parameter estimation
  - Confidence intervals
  - Analysis of variance: interaction among variables
  - Contingency tables
  - Ref: Paul Cohen
- Expectation maximization
  - X: observed data, Z: unobserved data; let Y = X ∪ Z
  - Searches for the hypothesis h that maximizes E[ln P(Y | h)]
  - Ref: “Machine Learning” by Tom Mitchell