The Detail of the Normal Distribution


Scientific Practice: The Detail of the Normal Distribution

The Binomial Distribution. This distribution can be seen when the outcomes have discrete values, eg rolling dice. Assumptions: a fixed number of trials (eg we will roll the dice 10 times); independent trials (one roll cannot influence another); two different classifications (rolled/didn’t roll a 12 = ‘success/failure’); the probability of success stays the same for all trials (we didn’t add extra dice halfway through).
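The four assumptions above are what the binomial formula relies on. A minimal sketch, assuming the "roll a 12" example means two fair dice per trial (so the chance of success is 1/36 per roll); the function name `binomial_pmf` is only illustrative:

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Assumption: two fair dice per roll, so a 12 has probability 1/36.
n_trials = 10
p_twelve = 1 / 36

# Probability of 0, 1, 2, ... 10 twelves in 10 rolls.
pmf = [binomial_pmf(k, n_trials, p_twelve) for k in range(n_trials + 1)]
for k, prob in enumerate(pmf[:4]):
    print(f"P({k} twelves in {n_trials} rolls) = {prob:.4f}")
```

The probabilities over all possible counts add up to 1, which is a quick check that the formula has been applied correctly.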

Rolling Dice. One die: outcome values are 1, 2, 3, 4, 5 or 6, each equally probable (1 in 6). The distribution is… boring!

Rolling Dice. Two dice: outcome values are 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12. There are 36 ways of making these, and the totals are not equally probable: only 1 way to get 2 (1+1), 3 ways to get 4, etc. The distribution is… slightly less boring!

Rolling Dice. Three dice: outcome values are 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18. There are 216 ways of making these, and the totals are not equally probable: 27 ways each to throw a 10 or an 11, only 1 way each to get a 3 or an 18. The distribution is… starting to curve.

Rolling Dice. Four dice: outcome values are 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24. There are 1296 ways of making these, and the totals are not equally probable. The distribution is… looking familiar!

Rolling Dice. 24 dice: outcome values are 24 to 144. There are 4.73838134 × 10^18 ways of making these! The totals are not equally probable. The distribution is… looking very familiar! Still discrete outcomes, though.
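The counts quoted in the dice slides can be reproduced by building up the totals one die at a time. This is only a sketch; the helper name `ways_to_total` is made up for illustration:

```python
from collections import Counter

def ways_to_total(num_dice: int, sides: int = 6) -> Counter:
    """Count how many ordered rolls of num_dice dice give each total."""
    ways = Counter({0: 1})  # one way to score 0 with no dice
    for _ in range(num_dice):
        nxt = Counter()
        # Add one more die: every existing total can grow by 1..sides.
        for total, count in ways.items():
            for face in range(1, sides + 1):
                nxt[total + face] += count
        ways = nxt
    return ways

for n in (1, 2, 3, 4, 24):
    ways = ways_to_total(n)
    print(f"{n} dice: {sum(ways.values())} ways, totals {min(ways)} to {max(ways)}")

print("3 dice, ways to throw 10:", ways_to_total(3)[10])
```

Plotting the counts for each number of dice (for example as a bar chart) shows the flat shape for one die turning into the bell shape as dice are added.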

Rolling Dice. Infinite number of dice: outcome values are no longer discrete but continuous, and the Binomial Distribution becomes known as… the Normal Distribution (Bell Curve). Substitute dice for something like height and… height, being determined by the sum effect of a large number of factors (genes, nutrition, etc), looks like a continuous variable and approximates the Normal Distribution, ie its variation becomes definable/predictable: we can expect our data to behave in a certain way.

The Normal Distribution. Represents the idealised distribution of a large number of things we measure in biology; many parameters approximate to the ND. It is defined by just two things: the population mean µ (mu), the centre of the distribution (mean = median = mode); and the population standard deviation (SD) σ (sigma), the distribution ‘width’ (the distance from the mean to the point of inflexion). One SD either side of the mean encompasses 68% of the area under the curve; 95% of the area is found within 1.96 σ either side of the mean.

The Normal Distribution is symmetrical: mean = median = mode.

The Normal Distribution. One SD either side of the mean includes 68% of the represented population. The SD boundary is the inflexion point, where the curvature changes direction (the ‘s’ bit). 2 SD either side covers about 95%, and 3 SD covers 99.7%.
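The 68% / 95% / 99.7% figures can be checked numerically. A minimal sketch using the error function from Python's standard library (the helper name `area_within` is illustrative only):

```python
from math import erf, sqrt

def area_within(k: float) -> float:
    """Fraction of a Normal Distribution lying within k SDs of the mean."""
    return erf(k / sqrt(2))

for k in (1, 1.96, 2, 3):
    print(f"within ±{k} SD: {area_within(k) * 100:.2f}%")
```

Note that exactly 95% corresponds to 1.96 SD; "2 SD covers 95%" is the convenient rounded version.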

The Normal Distribution. All Normal Distributions are similar; they differ only in terms of the mean and the SD (which governs how ‘spiky’ the curve is). [Figure: Normal Distributions with 4 different SDs and 2 different means]

Standardising Normal Distributions. Regardless of what they measure, all Normal Distributions can be made identical by: subtracting the mean from every reading (the mean then becomes zero); then dividing each reading by the SD (a reading one SD bigger than the mean → +1). The results are called Standard Scores or z-scores. Amazing! Different measurements → same ‘view’.

Standard (z) Scores. A ‘pure’ way to represent a data distribution: the actual measurements (mg, m, sec) disappear and are replaced by the number of SDs from the mean (zero). For any reading, z = (x - µ) / σ. A survey of daily travel time had these results (in minutes): 26, 33, 65, 28, 34, 55, 25, 44, 50, 36, 26, 37, 43, 62, 35, 38, 45, 32, 28, 34. The mean is 38.8 min, and the (population) SD is 11.4 min. To convert the values to z-scores, eg to convert 26: first subtract the mean, 26 - 38.8 = -12.8; then divide by the Standard Deviation, -12.8 / 11.4 = -1.12. So 26 is -1.12 Standard Deviations from the mean.
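The travel-time conversion can be sketched in a few lines of Python. This assumes the 20 readings are treated as the whole population (which is what gives an SD of 11.4), so `pstdev` is used rather than `stdev`:

```python
from statistics import mean, pstdev

travel_times = [26, 33, 65, 28, 34, 55, 25, 44, 50, 36,
                26, 37, 43, 62, 35, 38, 45, 32, 28, 34]  # minutes

mu = mean(travel_times)        # population mean
sigma = pstdev(travel_times)   # population SD (divides by n)

# z = (x - mu) / sigma for every reading
z_scores = [(x - mu) / sigma for x in travel_times]

print(f"mean = {mu:.1f} min, SD = {sigma:.1f} min")
print(f"z-score of 26 min = {z_scores[0]:.2f}")
```

After the conversion the z-scores themselves have a mean of 0 and an SD of 1, whatever units the original readings were in.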

Familiarity with the Normal Distribution. 95% of the class are between 1.1 and 1.7m tall: what are the mean and SD? Assuming a normal distribution: the distribution is symmetrical, so the mean height is the midpoint, (1.1 + 1.7) / 2 = 1.4m; the range 1.1 to 1.7m covers 95% of the class, which equals ± 2 SDs (4 SDs in total), so one SD = (1.7 – 1.1) / 4 = 0.6 / 4 = 0.15m.
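The same working as a short sketch. The variable names are illustrative, and the last line shows what happens if the more exact 1.96 SD is used instead of the rounded 2 SD:

```python
lower, upper = 1.1, 1.7  # metres; the range covering 95% of the class

# Symmetrical distribution, so the mean is the midpoint of the range.
mean_height = (lower + upper) / 2

# 95% is roughly ±2 SD, so the range spans 4 SDs.
sd_height = (upper - lower) / 4

# Slightly more exact version: 95% is ±1.96 SD.
sd_height_exact = (upper - lower) / (2 * 1.96)

print(f"mean = {mean_height:.2f} m")
print(f"SD (using 2 SD)    = {sd_height:.3f} m")
print(f"SD (using 1.96 SD) = {sd_height_exact:.3f} m")
```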

Familiarity with the Normal Distribution. One of that class is 1.85m tall: what is the z-score of that measurement? Assuming a normal distribution: z-score = (x - µ) / σ, so z = (1.85m - 1.4m) / 0.15m = 0.45m / 0.15m = 3 (note there are no units). 3 SDs either side cover 99.7% of the population, so only about 1.5 in 1000 of the class will be as tall/taller (a big class, with fractional students!).
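A sketch of the same calculation, with the upper-tail area computed directly rather than read off the rounded 99.7% figure. The helper name `upper_tail` is illustrative; the computed tail comes out at roughly 1.35 in 1000, close to the 1.5 in 1000 obtained from the rounded percentage:

```python
from math import erfc, sqrt

def upper_tail(z: float) -> float:
    """Fraction of a Normal Distribution lying above z SDs from the mean."""
    return 0.5 * erfc(z / sqrt(2))

mu, sigma = 1.4, 0.15   # metres, from the class-height example
x = 1.85                # metres

z = (x - mu) / sigma    # units cancel, so z is a pure number
print(f"z = {z:.2f}")
print(f"fraction as tall or taller = {upper_tail(z):.5f}")
print(f"that is about {upper_tail(z) * 1000:.2f} in 1000")
```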

Familiarity with the Normal Distribution. 36 students took a test; you were 0.5 SD above the average; how many students did better? From the curve, 50% sit above zero; from the curve, 19.1% sit between 0 and 0.5 SD; so 30.9% sit above you. 30.9% of 36 is about 11.
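Instead of reading the percentages from the curve, they can be computed. A minimal sketch using `NormalDist` from Python's standard `statistics` module, assuming the test scores really are normally distributed:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)  # the z-score distribution

my_z = 0.5
class_size = 36

below_me = standard_normal.cdf(my_z)   # fraction scoring lower
between = below_me - 0.5               # fraction between the mean and me
above_me = 1 - below_me                # fraction scoring higher

print(f"between 0 and {my_z} SD: {between * 100:.1f}%")
print(f"above {my_z} SD: {above_me * 100:.1f}%")
print(f"students who did better: about {round(above_me * class_size)}")
```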

Familiarity with the Normal Distribution. You need to have a ‘feel’ for this…

Populations and Samples – a Diversion. A couple of seemingly pedantic but important points about distributions. Population: the potentially infinite group on which measurements might be made; we don’t often measure the whole population. Sample: a sub-set of the population on which measurements are actually made; most studies will sample the population. n is the number studied, and n-1 is called the ‘degrees of freedom’. We often extrapolate sample results to the population.

Populations and Samples – so what? The two are described/calculated differently: μ is the population mean, x̄ (x-bar) is the sample mean; σ or σn is the population SD, s or σn-1 is the sample SD. Calculating the SD is different for each: the population SD divides by n, the sample SD by n-1. Most calculators do it for you, as long as you choose the right type (pop vs samp).
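The only difference between the two SD calculations is the divisor. A sketch written out by hand (the function name `sd` is illustrative), reusing the travel-time readings from the z-score slide and cross-checking against `pstdev` and `stdev` from Python's `statistics` module:

```python
from math import sqrt
from statistics import pstdev, stdev

def sd(values, sample: bool) -> float:
    """SD of values: divide by n-1 for a sample, by n for a population."""
    n = len(values)
    m = sum(values) / n
    sum_sq = sum((x - m) ** 2 for x in values)  # squared deviations
    divisor = n - 1 if sample else n
    return sqrt(sum_sq / divisor)

travel_times = [26, 33, 65, 28, 34, 55, 25, 44, 50, 36,
                26, 37, 43, 62, 35, 38, 45, 32, 28, 34]

pop_sd = sd(travel_times, sample=False)   # the 'σn' calculator button
samp_sd = sd(travel_times, sample=True)   # the 'σn-1' calculator button

print(f"population SD = {pop_sd:.2f}  (library: {pstdev(travel_times):.2f})")
print(f"sample SD     = {samp_sd:.2f}  (library: {stdev(travel_times):.2f})")
```

The sample SD is always a little larger than the population SD for the same readings, and the gap shrinks as n grows.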

Populations and Samples – choosing. Analysing the results of a class test: population, since you don’t intend extrapolating the results to all students everywhere. Analysing the results of a drug trial: sample, since you expect the conclusions to apply to the larger population. A national census collecting information about age: population, since by definition the census is about the population taking part in the survey. If in doubt, use the sample SD; as n increases, the difference decreases.

Populations and Samples – implications. The sample mean and SD are estimates of the population mean and the population SD, ie you calculate σn-1 (or s). If the sample observed is the population, then the mean and SD of that sample are the population mean and the population SD, ie you calculate σ (or σn).

Implications of Estimating Pop Mean. For a sample, the ‘quality’ of the estimate of the population mean and SD depends on the number of observations made. If you sampled, say, 1 member of the population, it’s unlikely to be close to the population mean; if you sampled the whole population, your estimate is the population mean; in between, adding extra observations will improve the estimate. Each sampling gives a different mean, so repeated sampling → a variety of means. That set of means will have its own SD (!), called the Standard Error of the Mean (SEM).

The Standard Error of the Mean. Recap: each sampling of a distribution will produce a different estimate of the population mean; the variation in those estimates is called the SEM. It is surprisingly easy to calculate: SEM = sample standard deviation / square root of the number of observations in the sample, ie SEM = s / √N. Eg if N = 16, then √N = 4, so the SEM is 4x smaller than the SD.
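A small simulation can make the SEM idea concrete. This is a sketch only: the population mean of 38.8 and SD of 11.4 are borrowed from the travel-time example as a hypothetical population, and the seed and number of repeats are arbitrary choices:

```python
import random
from math import sqrt
from statistics import mean, pstdev, stdev

def sem(sample) -> float:
    """Standard Error of the Mean: s / sqrt(N)."""
    return stdev(sample) / sqrt(len(sample))

random.seed(1)  # arbitrary seed so the run is repeatable

POP_MEAN, POP_SD = 38.8, 11.4  # hypothetical population values
N = 16                         # observations per sample
REPEATS = 5000                 # number of separate samplings

# Take many samples of N observations and record each sample's mean.
sample_means = [
    mean([random.gauss(POP_MEAN, POP_SD) for _ in range(N)])
    for _ in range(REPEATS)
]

spread_of_means = pstdev(sample_means)  # SD of the set of means
predicted = POP_SD / sqrt(N)            # SD / 4 when N = 16

print(f"SD of the sample means = {spread_of_means:.2f}")
print(f"predicted SD / sqrt(N) = {predicted:.2f}")

# In practice only one sample is available, so the SEM is estimated from it.
one_sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
print(f"SEM estimated from one sample = {sem(one_sample):.2f}")
```

The spread of the simulated means should land close to the predicted value, which is the point of the formula: one sample is enough to estimate how much the mean would vary across repeated samplings.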

Summary. The Binomial Distribution is a basic distribution, eg rolling dice. With lots of dice, Binomial Dist → Normal Dist. The Normal Dist is fully defined just by the mean and SD. Transformation to z-scores makes all NDs identical. The SD calculation differs for sample vs population: a sample is a subset of the whole population; the population is, erm, the whole population. Estimation of the population mean from a sample is always prone to uncertainty. The Standard Error of the Mean (s/√N) reflects that uncertainty.