Psychology 10 Analysis of Psychological Data February 26, 2014
The Plan for Today The law of large numbers. The frequentist approach to probability. Areas under the normal curve. Rules for combining probabilities. The binomial distribution. Introducing the idea of the sampling distribution.
Probability distributions A probability distribution is the set of values that a random variable could take on, if we were to observe it… …together with the long-run relative frequencies or probabilities of those values.
The law of large numbers A phenomenon known as “the law of large numbers” states that if a random process is repeated a large number of times, the proportion of times that a particular event occurs will approach the probability of that event. If we were to toss our unfair coin a very large number of times, the proportion of “heads” outcomes would approach .7.
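A quick simulation can make the law of large numbers concrete. This is only a sketch: the function name `proportion_heads`, the fixed seed, and the particular numbers of tosses are assumptions chosen for illustration; only the .7 probability of heads comes from the slide.

```python
import random


def proportion_heads(n_tosses, p_heads=0.7, seed=0):
    """Toss a coin with P(heads) = p_heads n_tosses times; return the proportion of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(1 for _ in range(n_tosses) if rng.random() < p_heads)
    return heads / n_tosses


# The proportion wanders for small n and settles near .7 as n grows.
for n in (10, 100, 10_000, 1_000_000):
    print(n, proportion_heads(n))
```

With few tosses the proportion can be far from .7; with many tosses it should sit very close to it.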
Frequentist approach to probability That idea leads to the frequentist approach to probability. Frequentists define probability as long-run relative frequency. Note that the frequentist definition and the analytical approach to probability converge. For example, if you were to draw a card with replacement a very large number of times, in the long run the proportion of times you draw the Jack of Hearts will approach 1/52.
Another example of a probability distribution
Die roll    Probability
1           1/…
Another type of random variable Imagine that we are going to observe the heights of adult white males. I can imagine that there would be a lot of them around 69.5 inches tall. As I think about values further from 69.5 inches, I expect that they would occur less frequently.
Continuous random variables I have just drawn a probability distribution. But how many possible values are there? Infinitely many. If I let each of those values have a positive probability, the probabilities won’t sum to 1.0. Hence, the probability of a continuous random variable taking on any particular value must be zero.
Continuous random variables (cont.) We can still draw pictures of the relative likelihoods of values. But events must be defined in terms of ranges of those values. For example, we can look at this picture and estimate that the probability of observing an adult white male who is taller than 69.5 inches is about ½.
Another continuous random variable The uniform distribution. The relative likelihood of the possible values is a rectangle. Curves like these (for continuous random variables) are not probabilities. The technical term for such a curve is probability density function (or pdf for short).
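To see why the height of a density curve is not a probability, here is a small sketch for a uniform distribution. The interval from 0 to 0.5 and the helper names `uniform_pdf` and `uniform_prob` are assumptions made up for this example; they do not come from the slides.

```python
def uniform_pdf(x, low, high):
    """Height of the uniform density on [low, high] at the point x."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def uniform_prob(a, b, low, high):
    """P(a <= X <= b) for X uniform on [low, high]: the area of a rectangle."""
    left, right = max(a, low), min(b, high)
    return max(0.0, right - left) / (high - low)


low, high = 0.0, 0.5  # hypothetical interval
print(uniform_pdf(0.25, low, high))          # height of the curve: 2.0
print(uniform_prob(0.1, 0.2, low, high))     # area over a range of values
print(uniform_prob(0.25, 0.25, low, high))   # a single value has zero area
```

The curve's height here is 2, which could not be a probability. Probabilities come from areas over ranges, and the area over the whole interval is 1.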
The normal probability density function How to draw a normal curve. How to find probabilities associated with particular ranges of values under the normal curve. Using the unit normal table.
Normal probabilities Here are a few probabilities for the normal distribution that come up frequently. You should commit those in bold to memory: –The probability that a random draw will be within 1 sd of the mean is about .68. –The probability that a random draw will be greater than 1.645 is .05. –The probability that a random draw will be greater than 1.96 is .025. –The probability that a random draw will be greater than 2.576 is .005.
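These areas can be checked without a printed table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) as a stand-in for the unit normal table; the cutoffs 1.645 and 2.576 are the usual table values for upper-tail areas of .05 and .005, and the variable names are assumptions for this example.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, sd 1

within_1_sd = z.cdf(1) - z.cdf(-1)   # area between -1 and +1
above_1645 = 1 - z.cdf(1.645)        # upper-tail area beyond 1.645
above_196 = 1 - z.cdf(1.96)          # upper-tail area beyond 1.96
above_2576 = 1 - z.cdf(2.576)        # upper-tail area beyond 2.576

for label, p in [("within 1 sd", within_1_sd),
                 ("> 1.645", above_1645),
                 ("> 1.96", above_196),
                 ("> 2.576", above_2576)]:
    print(label, round(p, 3))
```

Rounded to three decimal places, the results should agree with the values on the slide.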
Some terms related to probability An event is a well defined outcome of a random process. –Examples: coin = heads, exam score > 79. Two events are mutually exclusive if they cannot both occur. –A coin cannot be both H and T on one toss. Two events are independent if knowing something about one event tells you nothing about the other.
Rules for combining probabilities The addition rule: If two events (A and B) are mutually exclusive, then P(A or B) = P(A) + P(B). Example: What is the probability that a single roll of a fair die will be 1 or 2? Note that a single roll cannot be both 1 and 2, so the events are mutually exclusive. P(1 or 2) = P(1) + P(2) = 1/6 + 1/6 = 1/3.
The addition rule (cont.) What is the probability that a randomly observed adult male will be over 6 feet tall or less than 5 feet 6 inches tall? Over 6 feet: (72 – 69.2) / 2.8 = 1.0. The area above 1.0 in a standard normal curve is .1587. Under 5 feet 6 inches: (66 – 69.2) / 2.8 = –1.14. The area below –1.14 is about .1271. The two events are mutually exclusive, so the probability is .1587 + .1271 = .2858.
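The same addition-rule calculation can be sketched in code. The mean of 69.2 inches and sd of 2.8 inches are the values used in the z-score arithmetic on the slide; using `statistics.NormalDist` in place of the unit normal table, and the variable names, are assumptions of this example.

```python
from statistics import NormalDist

# Mean and sd (in inches) as used in the slide's z-score arithmetic.
heights = NormalDist(mu=69.2, sigma=2.8)

p_over_6ft = 1 - heights.cdf(72)    # taller than 72 inches
p_under_5ft6 = heights.cdf(66)      # shorter than 66 inches

# The two events cannot both occur, so the addition rule applies.
p_either = p_over_6ft + p_under_5ft6
print(round(p_over_6ft, 4), round(p_under_5ft6, 4), round(p_either, 4))
```

The total may differ from the table-based .2858 in the third decimal place, because the table lookup rounds the z-score to two decimals and the code does not.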
The multiplication rule The probability of two independent events A and B both occurring is P(A) × P(B). Events are independent if knowing about the outcome of one tells us nothing at all about the outcome of the other.
Independent events Are these independent? –Ethnicity and eye color? –No –Age and annual income? –No –One coin toss and a second coin toss? –Yes –One randomly chosen IQ score and a second randomly chosen IQ score? –Yes
The multiplication rule (cont.) So I cannot (without more tools) answer a question like “What is the probability that a randomly observed person will be Caucasian and blue eyed?” But I can answer questions like “What is the probability that two randomly observed IQs are both > 120?” (If I know the distribution of IQ.)
The multiplication rule (cont.) Many IQ tests are designed to have mean = 100, sd = 15. P(one IQ is > 120)? (120 – 100) / 15 = 1.33. Area > 1.33 = .0918. So the probability of two independent scores both being above 120 is .0918 × .0918 = .0084.
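Here is the multiplication-rule calculation as a short sketch. The mean of 100 and sd of 15 come from the slide; `statistics.NormalDist` stands in for the unit normal table, and the variable names are assumptions of this example.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # mean and sd from the slide

p_one = 1 - iq.cdf(120)   # P(a single randomly chosen IQ is above 120)

# Two randomly chosen scores are independent, so multiply.
p_both = p_one * p_one
print(round(p_one, 4), round(p_both, 4))
```

The result is close to the table-based .0084. The product is only valid because the two scores are independent; it would not apply to the husband-and-wife case.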
The multiplication rule (cont.) BUT: What is the probability that the husband and wife in a randomly observed married couple will both have IQs above 120? We cannot say, because the two events are not independent.
Why do we care about probability? Probability is concerned with making statements about what will happen in the world, given that certain things are true. Inferential statistics is concerned with making statements about what is true in the world, given what has happened. Those are opposite interests. Nevertheless, probability is a crucial tool for inferential statistics.
Introducing the concept of the sampling distribution (Coin tossing exercise.)
In-class exercises Three probability problems: –What is the probability that a single draw from a standard normal distribution will be greater than 0.24? –What is the probability that a single draw from a normal distribution with a mean of 60 and a standard deviation of 8 will be between 59 and 62? –What is the probability that both of those events will occur in two independent draws?