Psyc 235: Introduction to Statistics


Psyc 235: Introduction to Statistics http://www.psych.uiuc.edu/~jrfinley/p235/ DON’T FORGET TO SIGN IN FOR CREDIT! Open demo before class: http://intuitor.com/statistics/CentralLim.html

About the Graded Assessment… The number one predictor of performance on the assessment is how much of the content you’ve covered. Time on ALEKS is important: it provides a measure to pace yourself and keeps you on track for the option of the extra credit final. However, your grade is based on how much of the content you’ve learned. You need to keep up with the content goals!

Trouble meeting content goals? All content goals are listed on the syllabus (available on the course webpage). Please attend office hours and lab. We are here to help! Special Invited Lectures: mandatory for invited students; they will cover topics that are giving folks trouble. Expect notices in the next couple of weeks.

Concerned about assessment grade? Catch up on content as soon as possible. Remember the final extra credit option. Feel free to contact us for more specific advice.

Moving Forward: Mid-course evaluation forms soon. Suggestions for course, lecture, lab format.

Data World vs. Theory World
Theory World: an idealization of reality (an idealization of what you might expect from a simple experiment). POPULATION. Parameter: a number that describes the population; fixed but usually unknown.
Data World: data that result from an actual simple experiment. SAMPLE. Statistic: a number that describes the sample (ex: mean, standard deviation, sum, ...).

Last Week… Binomial: n: # of independent trials; p: probability of “success”; q: probability of failure (1-p); X = # of the n trials that are “successes”. $\mu_X = np$, $\sigma_X = \sqrt{np(1-p)}$

Binomial Probability Formula: $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$, where X is the binomial random variable, k is the specific # of successes you could get, p is the probability of success, n-k is the specific # of failures, 1-p is the probability of failure, and $\binom{n}{k}$ is the combination called the Binomial Coefficient. Example: n=10, k=7, p=.5 gives P(X=k)=.117. Note: for P(X ≥ k), sum P(X=k) for each k in the range.
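The worked example on this slide can be checked with a few lines of Python. This is only a sketch; the function names `binomial_pmf` and `binomial_at_least` are made up for illustration and are not part of ALEKS or the course materials.

```python
from math import comb

def binomial_pmf(n, k, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    # comb(n, k) is the binomial coefficient: the number of ways to pick k successes.
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_at_least(n, k, p):
    """P(X >= k): add up the probability of each count from k through n."""
    return sum(binomial_pmf(n, j, p) for j in range(k, n + 1))

# Slide example: n = 10 tosses, k = 7 successes, p = .5
print(round(binomial_pmf(10, 7, 0.5), 3))       # 0.117
print(round(binomial_at_least(10, 7, 0.5), 3))  # 0.172
```

The second function is the "sum for each k in range" note from the slide, written out as a loop.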

Jason’s Coin Toss Demo: Population: Outcomes of all possible coin tosses (for a fair coin) Bernoulli Trial: one coin toss Success=Heads p=.5 10 tosses n=10 (sample size) Sample: X = .... Sampling Distribution

Jason’s Coin Toss Demo: Population: Outcomes of all possible coin tosses (for a fair coin). And, we can use the formulas we’ve learned to calculate the population parameters for the sampling distribution: $\mu_X = np = 10 \times .5 = 5$, $\sigma_X = \sqrt{np(1-p)} = \sqrt{2.5} \approx 1.58$. Sample: X = .... Sampling Distribution
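One way to see these two numbers in action is to repeat the coin toss demo many times on a computer. The sketch below is an assumption-laden stand-in for the in-class demo: the function name, the number of repeated samples (20,000), and the random seed are all arbitrary choices made for illustration.

```python
import random
import statistics

def simulate_head_counts(n_tosses=10, p=0.5, n_samples=20000, seed=235):
    """Repeat the demo: each sample is the number of heads in n_tosses tosses."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.random() < p for _ in range(n_tosses))  # count heads in one sample
        for _ in range(n_samples)
    ]

counts = simulate_head_counts()
sim_mean = statistics.fmean(counts)
sim_sd = statistics.pstdev(counts)

# Theory says the mean should be near np = 5 and the
# standard deviation near sqrt(np(1-p)), about 1.58.
print(f"simulated mean = {sim_mean:.2f}, simulated sd = {sim_sd:.2f}")
```

The simulated values will not match the formulas exactly, but they should land close to them, and closer still as the number of repeated samples grows.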

With different sample sizes, you all discovered something interesting… With large n, the binomial distribution starts to look like a normal distribution!

What is a Normal Distribution? A class of distributions with the same overall shape. A continuous probability distribution defined by two parameters: mean: $\mu$, stdev: $\sigma$. Special: Standard Normal Distribution

Standard Normal Distribution: A distribution of z-scores (standardized scores). Scores derived by: $z = \frac{x - \mu}{\sigma}$. Note: $\mu = 0$, $\sigma = 1$. Allows comparisons of scores from different normal distributions. Note: link between area and p(x): Area = probability. Note also: +1 unit equals +1 $\sigma$.

Probability & Standardizing Scores: The standard normal distribution allows us to easily calculate probabilities for any normal distribution. Example: Say that we know that the checking account balance for a UIUC student is normally distributed with an average balance of $150 and a standard deviation of $125. What is the probability of a randomly selected student having a balance of…
More than $250? z = (250-150)/125 = .8, p(z>.8) = .212 (Note: the ALEKS button only does <, so you must do 1-p.)
Less than $0? z = (0-150)/125 = -1.2, p(z<-1.2) = .115
Between $100 and $200? z = (100-150)/125 = -.4 and z = (200-150)/125 = .4, p(z<.4) - p(z<-.4) = .31
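The three checking account questions can also be worked outside the ALEKS calculator. The sketch below uses only Python's standard library; `normal_cdf` is a helper name invented here, and it plays the role of the "<" button (area to the left of a value).

```python
from math import erf, sqrt

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area to the left of x under a normal curve with mean mu and stdev sigma."""
    z = (x - mu) / sigma  # standardize the score
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 150, 125  # average balance and standard deviation from the example

# More than $250: z = (250 - 150) / 125 = 0.8, and we need the area to the right.
p_more_than_250 = 1 - normal_cdf(250, mu, sigma)

# Less than $0: z = -1.2, area to the left.
p_less_than_0 = normal_cdf(0, mu, sigma)

# Between $100 and $200: area left of z = 0.4 minus area left of z = -0.4.
p_between = normal_cdf(200, mu, sigma) - normal_cdf(100, mu, sigma)

print(round(p_more_than_250, 3))  # about 0.212
print(round(p_less_than_0, 3))    # about 0.115
print(round(p_between, 3))        # about 0.311
```

Note that the "more than" case is computed as 1 minus the left-hand area, which is the same workaround needed when a calculator only offers "<".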

Why do we care so much about Normal Distributions? What happened to the binomial distribution as n increased? Central Limit Theorem: As the sample size n increases, the distribution of the sample average approaches the normal distribution with mean $\mu$ and variance $\sigma^2/n$, irrespective of the shape of the original distribution.

Wait. What? Example: Rolling one die, multiple dice… http://www.stat.sc.edu/~west/javahtml/CLT.html So, just like flipping the coin, if we take many samples and record the sum of the n observations in each, the distribution of those sums approaches the normal. Since the mean of a sample is the sum of all observations divided by n (where n is constant for all samples), this same principle applies to the sample mean.
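The dice demo can be imitated with a short simulation. This is a rough sketch, not the applet linked on the slide: the function name, the sample counts, and the seed are assumptions chosen for illustration. It rolls n dice, records the sample mean, and repeats many times.

```python
import random
import statistics
from math import sqrt

DIE_MEAN = 3.5           # mean of one fair six-sided die
DIE_SD = sqrt(35 / 12)   # standard deviation of one fair die, about 1.71

def sample_means(n_dice, n_samples=20000, seed=1):
    """Roll n_dice dice n_samples times and return the mean of each roll."""
    rng = random.Random(seed)
    return [
        sum(rng.randint(1, 6) for _ in range(n_dice)) / n_dice
        for _ in range(n_samples)
    ]

results = {}
for n_dice in (1, 2, 30):
    means = sample_means(n_dice)
    results[n_dice] = (statistics.fmean(means), statistics.pstdev(means))
    print(
        f"n = {n_dice:2d}: mean of sample means = {results[n_dice][0]:.3f}, "
        f"sd of sample means = {results[n_dice][1]:.3f}, "
        f"theory sd = {DIE_SD / sqrt(n_dice):.3f}"
    )
```

The center stays near 3.5 for every n, while the spread of the sample means shrinks roughly like the single-die standard deviation divided by √n. Plotting a histogram of `means` for each n would show the flat single-die shape turning into a bell shape.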

Hmm. Ok… But, does the underlying distribution really not matter? http://intuitor.com/statistics/CentralLim.html Note that the size of n slightly changed the shape of the normal distribution. Also, note that the central limit theorem stated the mean was $\mu$ and the variance $\sigma^2/n$ (so stdev = $\sigma/\sqrt{n}$). The variance is a little different than before, isn’t it?

T distributions To adjust for the fact that the normal distribution is a better approximation for a sampling distribution as n increases, we have the T distribution… So, the t distribution varies depending on the number of degrees of freedom (n-1) With lower n, the t distribution is more spread out. This means that getting more extreme values is more probable with low n.
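The claim that extreme values are more probable with low n can be checked by simulation. The sketch below makes several assumptions for illustration: samples are drawn from a standard normal population, "extreme" means a t value farther than 2 from zero, and the helper names, sample counts, and seed are arbitrary.

```python
import random
from math import sqrt

def t_statistic(sample, mu):
    """t = (sample mean - mu) / (sample stdev / sqrt(n)), with n-1 degrees of freedom."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
    return (mean - mu) / (sqrt(variance) / sqrt(n))

def tail_fraction(n, cutoff=2.0, n_samples=10000, seed=7):
    """Fraction of simulated samples of size n whose |t| exceeds the cutoff."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_samples):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        if abs(t_statistic(sample, 0)) > cutoff:
            extreme += 1
    return extreme / n_samples

small_n = tail_fraction(5)    # 4 degrees of freedom
large_n = tail_fraction(50)   # 49 degrees of freedom
print(f"n = 5:  fraction with |t| > 2 = {small_n:.3f}")
print(f"n = 50: fraction with |t| > 2 = {large_n:.3f}")
```

With the small sample size, noticeably more of the simulated t values fall beyond the cutoff, which is what "more spread out" means in practice.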

So what good does that do us, anyway? Because we can assume that a sampling distribution will be approximately normal with a large n, we can use this distribution to estimate the probability of obtaining a given sample.

Example: (aka excuse to show pictures of my dog) A large dog shelter in Chicago wants to increase awareness of the adorable pups they have for adoption by bringing some dogs to a local festival. They have 50 people who have volunteered to walk the dogs around the festival. In the shelter there are several hundred dogs. The shelter knows that on average their dogs have a 14 point adoptability score (combination of things like behavior, training, breeding, cuteness, etc.), and the scores tend to vary by about 3. The shelter would prefer to show dogs that have an average of at least a 16 adoptability score. Should they go through all the dogs and select 50 by hand, or are they likely to get a group with this average by chance? Notice that we don’t know what the underlying distribution of adoptability scores looks like at this shelter, but because of CLT we can still come up with an answer.

Example, continued: What information is important here? $\mu = 14$, $\sigma = 3$, $\bar{X} = 16$, $N = 50$. $T = (16-14)/(3/\sqrt{50}) = 4.714$, $p(t > 4.714) = .00001$. Better hand pick the dogs.
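The arithmetic for the shelter example can be sketched as follows. One assumption to flag: this sketch uses the normal approximation from the Central Limit Theorem for the tail probability instead of the t calculator, so the probability it prints will not match a t-based value exactly. The conclusion is the same either way.

```python
from math import erfc, sqrt

mu = 14       # average adoptability score of all dogs in the shelter
sigma = 3     # how much the scores tend to vary
target = 16   # average score the shelter would like for the group
n = 50        # number of dogs (one per volunteer)

# Standard deviation of the sampling distribution of the mean: sigma / sqrt(n)
standard_error = sigma / sqrt(n)

# How many standard errors is the target average above the population mean?
z = (target - mu) / standard_error

# Upper-tail area under the standard normal curve (normal approximation).
p_upper = 0.5 * erfc(z / sqrt(2))

print(f"standard error = {standard_error:.3f}")
print(f"standardized score = {z:.3f}")
print(f"approximate chance of a random group averaging 16 or more = {p_upper:.2e}")
```

A standardized score above 4 means a random group of 50 would almost never average 16 or more, so the shelter is better off hand picking the dogs.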

A couple more distributions: There are 2 more distributions that we will need later. ALEKS is familiarizing you with them now so that you know how to use the calculators etc. when they come up. Generally, you should know: the shape of the distribution; how to use the distribution practically (at this point this means using the ALEKS calculator to find the probability of a given value in a distribution, so don’t worry); and a vague concept of what the distribution means.

Chi Square ($\chi^2$) Distribution: Distribution of the sum of two or more squared standard normal variables. This is useful because later, when we’re comparing multiple distributions, we will want to determine whether two distributions are the same thing added together or are actually two separate distributions. Where k is the number of groups.
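To get a feel for the shape of this distribution, one can build it directly from its definition. In the sketch below, k is taken to be the number of squared standard normal values being summed; that reading, the function name, the draw count, and the seed are assumptions made for illustration.

```python
import random
import statistics

def chi_square_draws(k, n_draws=20000, seed=3):
    """Each draw is the sum of k squared standard normal values."""
    rng = random.Random(seed)
    return [
        sum(rng.gauss(0, 1) ** 2 for _ in range(k))
        for _ in range(n_draws)
    ]

draws = chi_square_draws(k=3)
draw_mean = statistics.fmean(draws)
draw_median = statistics.median(draws)

# Squares are never negative, so every draw is >= 0, and the
# long right tail pulls the mean above the median (right skew).
print(f"smallest draw = {min(draws):.3f}")
print(f"mean = {draw_mean:.2f} (theory: k = 3), median = {draw_median:.2f}")
```

The simulated mean sits near k, and the mean being larger than the median reflects the right-skewed shape seen in the ALEKS calculator.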

F distribution: Distribution of the variance of one sample from a normally distributed population divided by the variance of another. This will be useful later when we want to test if there is more variance within a group than across groups (ANOVA)… if there is greater within-group variance, then it’s unlikely that the groupings are meaningful. d1 is the degrees of freedom of the top (numerator) distribution; d2 is the degrees of freedom of the bottom (denominator) distribution.
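The definition above can be turned into a small simulation: draw two samples from the same normal population, divide one sample variance by the other, and repeat. The sample sizes (6 and 21, giving d1 = 5 and d2 = 20), the helper names, and the seed are arbitrary choices for illustration.

```python
import random
import statistics

def sample_variance(values):
    """Sample variance with n - 1 in the denominator."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def f_ratio_draws(n1, n2, n_draws=20000, seed=11):
    """Ratios of two sample variances, both samples from a standard normal population."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_draws):
        top = [rng.gauss(0, 1) for _ in range(n1)]     # numerator sample, d1 = n1 - 1
        bottom = [rng.gauss(0, 1) for _ in range(n2)]  # denominator sample, d2 = n2 - 1
        ratios.append(sample_variance(top) / sample_variance(bottom))
    return ratios

ratios = f_ratio_draws(n1=6, n2=21)
d2 = 21 - 1
ratio_mean = statistics.fmean(ratios)

# When both samples come from the same population the ratio hovers around 1;
# the theoretical mean of an F distribution is d2 / (d2 - 2) for d2 > 2.
print(f"mean ratio = {ratio_mean:.3f}, theory = {d2 / (d2 - 2):.3f}")
```

Ratios far above 1 are rare when the two variances really are equal, which is why a large F value later counts as evidence that the groups differ.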

Next Week: Keep up with the content goals. Watch for an email about course evaluations/suggestions. Please let us know if you want or need help. If you’ve fallen behind, expect to be contacted by email. Have a good week everyone!