Presentation on theme: "Scatterplots and Correlation"— Presentation transcript:
1 3.1-3.2 Scatterplots and Correlation. AP Statistics
2 Learning Objectives: Differentiate between an explanatory and a response variable. Draw and interpret a scatterplot. Add categorical data to a scatterplot. Calculate and interpret the correlation between two variables.
3 Response variable: measures an outcome of a study. Explanatory variable: attempts to explain the observed outcomes. Explanatory variables are often called independent variables, and response variables are often called dependent variables. **This simply tells us the response variable depends on the explanatory variable.**
4 State average SAT math score vs. percent of graduates taking the SAT. Go to page 29 in your textbook (we are just looking at the columns SAT Math and percent taking). Where does the explanatory variable go? x-axis. Where does the response variable go? y-axis. Identify each variable (which one is independent? Which is dependent?). Also decide if there is a positive or negative association. Discuss this with your groups and then share your reasoning with the class.
5 Independent: % taking. Dependent: SAT Math. Remember: you sign up for the exam first, so we know the percent of kids taking it first. Then you take the exam, so we find out the state's average SAT math score last. It should have a negative association. Think of it this way: - If a very small % takes the SAT, they are most likely all taking it to get into college and tend to be stronger students, so the average is higher. - If a large % takes it, the average is lower (think of the ACT at Athens, where they pay for everyone to take it: some kids who know they aren’t going to college don’t even attempt the test, and they lower the overall average).
6 Draw a scatter plot (we are just going to input the data from AL to KY to save time). Step 1: Input % taking into L1 and SAT Math into L2. Step 2: 2nd-statplot-1-enter so the cursor is highlighting ON. Xlist: L1. Ylist: L2. Then hit zoom 9 (this will fit your window to the data you entered). Sketch a quick scatter plot in your notes (your axes don’t have to start at 0).
7 This is a rough sketch of the scatter plot. Make sure you label it.
8 Can you give an example with no explanatory-response distinction? Take 2 minutes and come up with an example in your groups. Did anyone say hair color vs. weight? (This is incorrect because hair color is categorical, and you need two quantitative variables!)
9 Examining a scatterplot: We want to look for an overall pattern (linear?). We can describe the overall pattern of a scatterplot by its direction, form, and strength! An important kind of deviation from the pattern is an outlier.
10 Example: Use the scatter plot to answer a-c. a) Are there any clusters? b) Are there any outliers? c) Is there a clear direction?
15 Correlation (r): The correlation measures the direction and strength of the linear relationship between two quantitative variables. Correlation is usually written as r.
16 1- The mean and standard deviation of the two variables are denoted as: 2- The correlation between x and y is: Your calculator does this for you! First, make sure your diagnostics are on: 2nd-catalog (0)-scroll down to DiagnosticOn-enter-enter. (You only have to set your calculator to this once, unless you change the batteries.)
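For reference, here is a sketch of the standard textbook definition that these two steps usually refer to. The symbols are an assumption, since they are not spelled out in the text above: x-bar and s_x are the mean and standard deviation of the x-values, y-bar and s_y are the same for the y-values, and n is the number of individuals.

```latex
\bar{x},\ s_x \ \text{(mean and standard deviation of the } x\text{-values)}, \qquad
\bar{y},\ s_y \ \text{(mean and standard deviation of the } y\text{-values)}

r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```

Each factor in the sum is a z-score, so r is the sum of the products of the paired z-scores, divided by n - 1.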
17 Find the correlation between the two bones in the fossil specimens. Columns: Femur, Humerus. Input femur into L1 and humerus into L2. Graph it: does it look positive or negative? Is the correlation strong, moderately strong, or weak?
18 To find the correlation (r): TI-84: Stat-calc-8 (use 8, not 4). Xlist: L1. Ylist: L2. FreqList: (leave this blank). Store RegEQ: Y1 (vars-Yvars-1-1). Calculate. TI-83: Stat-calc-8 (use 8, not 4), then type in L1, L2, Y1, enter. r= There is a very strong positive linear relationship between the femur and the humerus. (Make sure you write out the sentence that describes the r value, not just r= .)
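If you want to check the calculator away from the TI, the same computation can be sketched in Python. This is only an illustration: the function names are made up for this example, and the bone lengths below are hypothetical placeholder values, not the data from the slide.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def correlation(xs, ys):
    """Correlation r: sum of products of paired z-scores, divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = sample_sd(xs), sample_sd(ys)
    z_products = [
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    ]
    return sum(z_products) / (len(xs) - 1)


# Hypothetical bone lengths, for illustration only (NOT the slide's data).
femur = [40, 55, 60, 65, 75]      # plays the role of L1
humerus = [43, 62, 69, 73, 85]    # plays the role of L2

r = correlation(femur, humerus)
print(f"r = {r:.3f}")
```

As on the calculator, the number alone is not a full answer: you would still write the sentence describing strength, direction, and form.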
19 When a question asks for the correlation, you have to give the r value and ALSO describe it in a sentence EVERY TIME! (That is what they grade you on for the AP exam.) The sentence should include the strength (strong, weak, ...), the direction (positive or negative), and the form (linear). So if r = 0.87 for your test grades versus the hours you studied, the answer is: r = 0.87. There is a moderately strong positive linear relationship between hours studied and test grades.
20 Facts About Correlation. #1- Positive r indicates a positive association; negative r indicates a negative association. #2- Correlation always falls between -1 and 1 (the closer to 1 or -1, the stronger the linear relationship).
21 #3- r is standardized, so it does not change when the units of measurement change. (Go back and look at the actual formula for r. It is really just converting the x’s and y’s to z-scores.) #4- Correlation measures the strength of only linear relationships between two variables. #5- Correlation is strongly affected by outliers! #6- Correlation is non-directional (flipping x and y doesn’t change it!). Correlation is not a complete description of two-variable data!
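Facts #3 through #6 can be checked numerically. The sketch below is one way to do it in Python; all of the data values are hypothetical and chosen only to make each fact visible, and the helper function is written for this example.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Correlation r as the sum of z-score products divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    total = sum(
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    )
    return total / (len(xs) - 1)


# Hypothetical data: hours studied and test grades.
hours = [1, 2, 3, 4, 5, 6]
grades = [62, 70, 71, 80, 84, 91]
r = correlation(hours, grades)

# Fact 3: changing units (hours -> minutes) leaves r unchanged.
minutes = [h * 60 for h in hours]
r_minutes = correlation(minutes, grades)

# Fact 6: swapping x and y leaves r unchanged.
r_swapped = correlation(grades, hours)

# Fact 5: one hypothetical outlier (12 hours, grade of 40) changes r a lot.
r_with_outlier = correlation(hours + [12], grades + [40])

# Fact 4: a perfect but curved relationship (y = x squared) is not linear,
# so r does not report it as strong.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [x ** 2 for x in curve_x]
r_curve = correlation(curve_x, curve_y)

print(f"original r:        {r:.3f}")
print(f"x in minutes:      {r_minutes:.3f}")
print(f"x and y swapped:   {r_swapped:.3f}")
print(f"with one outlier:  {r_with_outlier:.3f}")
print(f"curved pattern:    {r_curve:.3f}")
```

The curved example is also a reminder of the last point on the slide: always look at the scatterplot, because r alone is not a complete description of the data.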