Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Slides:



Advertisements
Similar presentations
This presentation is made available through a Creative Commons Attribution- Noncommercial license. Details of the license and permitted uses are available.
Advertisements

STA305 Spring 2014 This started with excerpts from STA2101f13
A. The Basic Principle We consider the multivariate extension of multiple linear regression – modeling the relationship between m responses Y 1,…,Y m and.
Chapter 10, part D. IV. Inferences about differences between two population proportions You will have two population proportions, p1 and p2. The true.
CHAPTER 21 Inferential Statistical Analysis. Understanding probability The idea of probability is central to inferential statistics. It means the chance.
Estimation  Samples are collected to estimate characteristics of the population of particular interest. Parameter – numerical characteristic of the population.
Copyright © 2014 by McGraw-Hill Higher Education. All rights reserved.
Confidence Interval and Hypothesis Testing for:
Single-Sample t-Test What is the Purpose of a Single-Sample t- Test? How is it Different from a z-Test?What Are the Assumptions?
Nemours Biomedical Research Statistics March 19, 2009 Tim Bunnell, Ph.D. & Jobayer Hossain, Ph.D. Nemours Bioinformatics Core Facility.
Today Today: Chapter 9 Assignment: Recommended Questions: 9.1, 9.8, 9.20, 9.23, 9.25.
DATA ANALYSIS I MKT525. Plan of analysis What decision must be made? What are research objectives? What do you have to know to reach those objectives?
The z-Test What is the Purpose of a z-Test? What are the Assumptions for a z- Test? How Does a z-Test Work?
Basic Elements of Testing Hypothesis Dr. M. H. Rahbar Professor of Biostatistics Department of Epidemiology Director, Data Coordinating Center College.
CSE 221: Probabilistic Analysis of Computer Systems Topics covered: Statistical inference (Sec. )
Horng-Chyi HorngStatistics II_Five43 Inference on the Variances of Two Normal Population &5-5 (&9-5)
Today Today: Chapter 9 Assignment: 9.2, 9.4, 9.42 (Geo(p)=“geometric distribution”), 9-R9(a,b) Recommended Questions: 9.1, 9.8, 9.20, 9.23, 9.25.
Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 14 Goodness-of-Fit Tests and Categorical Data Analysis.
Inference about a Mean Part II
Statistics 101 Class 9. Overview Last class Last class Our FAVORATE 3 distributions Our FAVORATE 3 distributions The one sample Z-test The one sample.
Independent Sample T-test Often used with experimental designs N subjects are randomly assigned to two groups (Control * Treatment). After treatment, the.
Independent Sample T-test Classical design used in psychology/medicine N subjects are randomly assigned to two groups (Control * Treatment). After treatment,
5-3 Inference on the Means of Two Populations, Variances Unknown
1 Confidence Intervals for Means. 2 When the sample size n< 30 case1-1. the underlying distribution is normal with known variance case1-2. the underlying.
The Neymann-Pearson Lemma Suppose that the data x 1, …, x n has joint density function f(x 1, …, x n ;  ) where  is either  1 or  2. Let g(x 1, …,
Study Design and Analysis in Epidemiology: Where does modeling fit? Meaningful Modeling of Epidemiologic Data, 2010 AIMS, Muizenberg, South Africa Steve.
Methods to Study and Control Diseases in Wild Populations Steve Bellan, MPH Department of Environmental Sci, Pol & Mgmt University of California at Berkeley.
Additional Slides on Bayesian Statistics for STA 101 Prof. Jerry Reiter Fall 2008.
Maximum Likelihood See Davison Ch. 4 for background and a more thorough discussion. Sometimes.
STATISTICAL INFERENCE PART I POINT ESTIMATION
Chapter 9 Hypothesis Testing and Estimation for Two Population Parameters.
01/20151 EPI 5344: Survival Analysis in Epidemiology Maximum Likelihood Estimation: An Introduction March 10, 2015 Dr. N. Birkett, School of Epidemiology,
Multinomial Distribution
Maximum Likelihood Estimator of Proportion Let {s 1,s 2,…,s n } be a set of independent outcomes from a Bernoulli experiment with unknown probability.
This presentation is made available through a Creative Commons Attribution- Noncommercial license. Details of the license and permitted uses are available.
S-012 Testing statistical hypotheses The CI approach The NHST approach.
Inference for 2 Proportions Mean and Standard Deviation.
Week 8 October Three Mini-Lectures QMM 510 Fall 2014.
1 CHAPTER 4 CHAPTER 4 WHAT IS A CONFIDENCE INTERVAL? WHAT IS A CONFIDENCE INTERVAL? confidence interval A confidence interval estimates a population parameter.
© Copyright McGraw-Hill 2004
Inferences Concerning the Difference in Population Proportions (9.4) Previous sections (9.1,2,3): We compared the difference in the means (  1 -  2 )
Various Topics of Interest to the Inquiring Orthopedist Richard Gerkin, MD, MS BGSMC GME Research.
Fitting Mathematical Models to Data Adapting Likelihood Based Inference Meaningful Modeling of Epidemiologic Data, 2010 AIMS, Muizenberg, South Africa.
Statistical Inference Statistical inference is concerned with the use of sample data to make inferences about unknown population parameters. For example,
M.Sc. in Economics Econometrics Module I Topic 4: Maximum Likelihood Estimation Carol Newman.
ES 07 These slides can be found at optimized for Windows)
Goodness-of-fit (GOF) Tests Testing for the distribution of the underlying population H 0 : The sample data came from a specified distribution. H 1 : The.
Lecture 11 Dustin Lueker.  A 95% confidence interval for µ is (96,110). Which of the following statements about significance tests for the same data.
Sample Size Needed to Achieve High Confidence (Means)
Lec. 19 – Hypothesis Testing: The Null and Types of Error.
Chapter 14 Single-Population Estimation. Population Statistics Population Statistics:  , usually unknown Using Sample Statistics to estimate population.
Inference about the slope parameter and correlation
Statistical Estimation
Chapter 9 Hypothesis Testing.
Chapter 24 Comparing Means.
Chapter 4. Inference about Process Quality
STAT 312 Chapter 7 - Statistical Intervals Based on a Single Sample
Math 4030 – 10b Inferences Concerning Variances: Hypothesis Testing
Hypothesis Tests: One Sample
Quantitative Methods PSY302 Quiz Chapter 9 Statistical Significance
When we free ourselves of desire,
Problems: Q&A chapter 6, problems Chapter 6:
Statistical Inference
Lecture 2 Interval Estimation
Interval Estimation and Hypothesis Testing
CHAPTER 12 Inference for Proportions
CHAPTER 12 Inference for Proportions
Confidence Interval.
Statistical Inference for the Mean: t-test
Presentation transcript:

Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental Science, Policy & Management University of California at Berkeley

barplot(dbinom(x = 0:100, size = 100, prob =.3), names.arg = 0:size) In a population of 1,000,000 people with a true prevalence of 30%, the probability distribution of number of positive individuals if 100 are sampled:

> rbinom(n = 1, size = 100, prob =.3) [1] 28 We sample 100 people once and 28 are positive:

> rbinom(n = 1, size = 100, prob =.3) [1] 28 We sample 100 people once and 28 are positive: We don’t know the true prevalence! But we can calculate the probability of 28 or a more extreme value occurring for a given prevalence.

We sample 100 people once and 28 are positive. p-value = 0.74 > 2*pbinom(28,100,.3)) [1] Cumulative Probability & P Values for 30% prevalence: p(28 or a more extreme value occurring) =

x2

Which hypotheses do we reject? IF GIVEN THE HYPOTHESIS p value < cutoff THEN REJECT HYPOTHESIS Cutoff usually chosen as α = 0.05

Which hypotheses do we reject?

Which hypotheses do we NOT reject: CONFIDENCE INTERVAL

We don’t know the true prevalence, but the probability that we had exactly 28/100 with 30% prevalence is: > dbinom(x = 28, size = 100, prob =.3) [1] > rbinom(n = 1, size = 100, prob =.3) [1] 28 We sample 100 people once and 28 are positive: Let’s take another approach

Which prevalence gives the greatest probability of observing exactly 28/100?

Which of these prevalence values is most likely given our data?

Maximum Likelihood Estimate parameter value giving greatest probability of the data having occurred. MLE = 28/100 = 0.28 What do you think is the MLE here? true unknown value = 0.30 different null hypotheses

Defining Likelihood L(parameter | data) = p(data | parameter) Not a probability distribution. Probabilities taken from many different distributions. function of x PDF: function of p LIKELIHOOD:

Deriving the Maximum Likelihood Estimate maximize minimize

Deriving the Maximum Likelihood Estimate

The proportion of positives! XX

Maximum Likelihood Estimate

Building Confidence Intervals Likelihood Ratio Test > qchisq(p =.95, df = 1) [1] So if our α =.05, then we reject any null hypothesis for which When l MLE - l null > 1.92, we reject that null hypothesis. If the null hypothesis were true then

Let’s zoom in… Building Confidence Intervals Likelihood Ratio Test Maximum Likelihood Estimate

Building Confidence Intervals Likelihood Ratio Test

Comparing Confidence Intervals

Advantages of Likelihood Practical method for estimating parameters estimating variance of our estimates Easily adaptable to different probability distributions & dynamic models

This presentation is made available through a Creative Commons Attribution-Noncommercial license. Details of the license and permitted uses are available at © 2010 Steve Bellan and the Meaningful Modeling of Epidemiological Data Clinichttp://creativecommons.org/licenses/by-nc/3.0/ Title: Introduction to Likelihood Attribution: Steve Bellan, Clinic on the Meaningful Modeling of Epidemiological Data Source URL: For further information please contact Steve Bellan