CORRELATION VS. CAUSATION 4.2: CAUTIONS ABOUT CORRELATION AND REGRESSION

Presentation transcript:

2 CORRELATION VS. CAUSATION 4.2

3 CAUTIONS ABOUT CORRELATION AND REGRESSION
Correlation and regression describe ONLY linear relationships.
r and the least-squares line are NOT resistant: extreme values and influential points can have a large effect.
Plot your scatter plot FIRST!
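To see what "not resistant" means in practice, here is a minimal sketch in Python. The data are invented for illustration and the helper names (`correlation`, `least_squares`) are not from the slides. It computes r and the least-squares slope for six nearly linear points, then again after adding one extreme point.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def correlation(xs, ys):
    # Pearson's r from the sums of squares and cross-products.
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def least_squares(xs, ys):
    # Slope and intercept of the least-squares line y = intercept + slope * x.
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Made-up data: six points lying close to a line with slope 2.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

# The same data plus one made-up influential point far from the rest.
xs_out = xs + [20]
ys_out = ys + [5.0]

r_clean = correlation(xs, ys)
r_out = correlation(xs_out, ys_out)
slope_clean, _ = least_squares(xs, ys)
slope_out, _ = least_squares(xs_out, ys_out)

print(f"without the extreme point: r = {r_clean:.3f}, slope = {slope_clean:.3f}")
print(f"with the extreme point:    r = {r_out:.3f}, slope = {slope_out:.3f}")
```

With these particular numbers a single added point pulls r from nearly 1 down to nearly 0 and flattens the slope, which is why the scatter plot should come first.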

4 EXTRAPOLATION
Extrapolation is predicting y for x values outside the range of your data.
You SHOULD remain within the domain of your data, or very close to it.
Predictions outside your domain are often VERY inaccurate.
A least-squares regression equation was obtained for a young child's height in feet (y) compared to her age in years (x), using the data below. Assuming the girl will live to be 52, predict her height at this ripe old age.

Age (yrs)   Height (ft)
3           2.795
4           2.925
4.25        2.9575
4.5         3.0225
4.75        3.055
5           3.0875

Prediction: about 10 feet tall.
Obviously people don't continue to grow over time. Just remember to be careful when extrapolating!
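The slide's regression equation is not reproduced in this transcript, so the sketch below assumes it was fitted to the six age and height values listed on the slide. It refits a least-squares line to those values in Python and extrapolates to age 52. The function name `least_squares` is only illustrative.

```python
def least_squares(xs, ys):
    # Slope and intercept of the least-squares line y = intercept + slope * x.
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Age (years) and height (feet) as listed on the slide.
ages = [3, 4, 4.25, 4.5, 4.75, 5]
heights = [2.795, 2.925, 2.9575, 3.0225, 3.055, 3.0875]

slope, intercept = least_squares(ages, heights)

# Age 52 is far outside the observed range of 3 to 5 years.
predicted_at_52 = intercept + slope * 52

print(f"height = {intercept:.4f} + {slope:.4f} * age")
print(f"predicted height at age 52: {predicted_at_52:.1f} ft")
```

Under that assumption the fitted line predicts a height of roughly 10 feet at age 52, which matches the slide's answer. The line fits well between ages 3 and 5 and fails badly far outside that range.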

5 LURKING VARIABLES
A lurking variable is a variable not in your study that can (and probably does) affect the interpretation of the relationship between your two measured variables.
It often accounts for the "left over" variation that r² does not explain.
It may be hidden.
It can make a relationship look "strong" or "weak" when it isn't.
It is dangerous to data and interpretations.
What do I do about them?
Try to identify them BEFORE the study.
Talk about their possible effects in your interpretations.
Use a residual plot with time as your x to try to identify potential effects.

6 SHOULD I USE AVERAGED DATA?
Averaged data is okay, BUT it shouldn't really be used to predict or interpret for INDIVIDUALS.
Correlations based on averaged data are often too high when applied to individuals.
Averaged data should be used to make predictions about averages.
So what do I need to do? Pay attention to the WHOLE situation:
Look at the data (contextually).
Look for possible lurking variables.
Make sure to DOUBLE CHECK any contextual inferences you make!
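A small sketch of why correlations from averaged data run too high for individuals. The numbers are invented: five groups, each with three individuals scattered around the group's typical y value. The Python below compares r computed over individuals with r computed over the group averages.

```python
from math import sqrt

def correlation(xs, ys):
    # Pearson's r from the sums of squares and cross-products.
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data: for each x from 1 to 5, three individuals whose y values
# are spread around 2 * x by -3, 0 and +3.
groups = {x: [2 * x + spread for spread in (-3, 0, 3)] for x in range(1, 6)}

# One (x, y) pair per individual.
individual_x = [x for x, ys in groups.items() for _ in ys]
individual_y = [y for ys in groups.values() for y in ys]

# One (x, average y) pair per group: averaging removes the individual spread.
average_x = list(groups)
average_y = [sum(ys) / len(ys) for ys in groups.values()]

r_individuals = correlation(individual_x, individual_y)
r_averages = correlation(average_x, average_y)

print(f"r over individuals:    {r_individuals:.3f}")
print(f"r over group averages: {r_averages:.3f}")
```

In this example the averages fall on a perfect line, while the individuals behind them show a noticeably weaker correlation, so the averaged r overstates how well x predicts y for any one person.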

7 CAUSATION
r and r², our regression statistics, describe an association between two variables. But does this association mean that the explanatory variable CAUSES the response variable?
An obvious example comes from an actual study performed over a one-year time span, which found a statistically strong relationship between the number of ice cream cones sold in a month and the number of homicides in the same month.
While there appeared to be a statistical association between these two variables, we know that it would be incorrect to say that the number of ice cream cones sold CAUSES the number of homicides. This is where a LURKING variable comes into play.

8 CAUSATION (VISUALLY)
Below are three visual examples of different situations and underlying variables that can explain an association.
[Diagrams: Causation (x, y); Common Response, lurking variable (x, y, z); Confounding (x, y, z). Dotted lines = association; arrow = causal relationship.]
Causation doesn't mean there aren't other factors that affect the result, just that the response is directly caused by the explanatory variable.

9 CAUSATION (DIRECT)
Let's look at situations where direct causation occurs.
A study recorded the heights of young males (between the ages of 12 and 15) and their fathers. The study found an association between the two heights with an r² of about 25%.
There is a direct causal relationship between the height of a father and that of his son through heredity. It is possible to have direct causation with a low r²; it just says that the father's height explains only about 25% of the variation in the son's height.
A study performed on a number of lab rats found an association between the number of ounces of battery acid eaten and the thickness of the stomach lining.
While there is a direct causal link between the ounces of battery acid eaten and the thickness of the rat's stomach lining, this is an example of a situation that you can't generalize to all cases, i.e. the effect might not be the same for humans.

10 COMMON RESPONSE (LURKING VARIABLE)
Let's look at situations where there is a "lurking" variable.
An actual study performed over a one-year time span found a statistically strong relationship between the number of ice cream cones sold in a month and the number of homicides in the same month.
While this study provided evidence that there was an association between ice cream and homicides, both are probably affected by a lurking variable such as heat/temperature, i.e. when people are hot they eat ice cream, and when they are hot they are CRANKY.
Earlier we found a fairly good association between the number of TVs that a person owns and their life expectancy.
While this study may show an association between the two, we know that there are many other "lurking" variables that can have an effect on life expectancy and the number of TVs you own. (DISCUSSION!)
The MORAL: Association doesn't mean CAUSATION.
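The common-response idea can be sketched with invented numbers. In the Python below, monthly temperature drives both ice cream sales and homicides, and neither of those two appears in the other's formula. All values, coefficients and noise patterns are made up for illustration and are not data from the study mentioned above. Correlating the residuals after regressing each variable on temperature is one simple way to check what is left once temperature is accounted for. It goes a step beyond the slide.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def correlation(xs, ys):
    # Pearson's r from the sums of squares and cross-products.
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def residuals(xs, ys):
    # Residuals of ys after fitting the least-squares line on xs.
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Made-up monthly temperatures (the lurking variable z).
temperature = [30, 35, 45, 55, 65, 75, 85, 83, 72, 60, 45, 33]

# Made-up noise patterns, chosen to be unrelated to each other.
noise_cones = [40, -40] * 6
noise_homicides = [1.5, 1.5, -1.5, -1.5] * 3

# Both responses depend on temperature only, never on each other.
cones = [100 + 10 * t + e for t, e in zip(temperature, noise_cones)]
homicides = [5 + 0.2 * t + e for t, e in zip(temperature, noise_homicides)]

r_raw = correlation(cones, homicides)

# Remove the part of each variable that temperature explains, then correlate
# what is left over.
r_after_temperature = correlation(
    residuals(temperature, cones), residuals(temperature, homicides)
)

print(f"r between cones and homicides:        {r_raw:.3f}")
print(f"r after accounting for temperature:   {r_after_temperature:.3f}")
```

In this constructed example the raw correlation is strong even though neither variable influences the other, and it all but disappears once temperature is taken out of both.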

11 CONFOUNDING
Two variables are "confounded" when you can't tell which variable is affecting the response.
Mr. Arnold and Mr. Reed have been selected to compare the effectiveness of two well-known laundry detergents, PRIDE and NONE. Each takes his detergent home, washes his clothes, and then brings them to a panel of judges for submission. It is found that PRIDE is the better detergent because Mr. Reed's clothes are cleaner.
While we can say that the detergent had an effect on the cleanliness of their clothes, there are other factors that could equally have affected the outcome: washer quality, water quality, laundry cycle, etc.
When we can't tell whether the "lurking" variables or the explanatory variable had the effect, the study is CONFOUNDED.
The MORAL: Association doesn't mean CAUSATION.

12 SO WHEN CAN I SAY CAUSE?
Remember, even HIGH correlation doesn't mean CAUSATION.
When can I say it? If you do an EXPERIMENT and control lurking variables, OR if you can show a strong association consistently over repeated studies, then you can say the magic word: CAUSE.
"Man, I look good!!"

13 MORAL OF THE STORY
Correlation and association don't mean CAUSATION.
Really examine the CONTEXT of your data. Don't just look at the numbers.
"Numbers tell you everything!! I love numbers!!"
"Don't listen to that geek! You'd better look at the CONTEXT, not just the numbers."

14 HOMEWORK #38-45

