Preview of Chapters 10 and 11, Tuesday 30 September 2008. Chapter 10 Correlation and Regression 1. Correlation 2. Regression 3. Variation and Prediction Intervals.


Preview of Chapters 10 and 11, Tuesday 30 September 2008

Chapter 10 Correlation and Regression 1. Correlation 2. Regression 3. Variation and Prediction Intervals 4. Rank Correlation

1. Correlation. Relationship between two measured variables in a data set at the interval or ratio level. In this book: linear relationships only. Mind the requirements! Measure: Pearson product-moment correlation r or rho. No correlation: r = 0; maximum correlation: r = -1 or +1. Critical values: Table A-6.

Scatterplots of Paired Data Figure 10-2

Formula 10-1: $r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}}$. The linear correlation coefficient r measures the strength of a linear relationship between the paired values in a sample. Calculators can compute r.
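A minimal Python sketch of Formula 10-1 may make the sums concrete. The function name `pearson_r` and the paired data are hypothetical and invented for illustration; they do not come from the book.

```python
import math

def pearson_r(xs, ys):
    """Linear correlation coefficient r, computed from the sums in Formula 10-1."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical paired data (assumption, not taken from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 3))  # 0.775
```

The value of r would then be compared with the critical value from Table A-6 for the given sample size.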

Figure 10-3 Hypothesis Test for a Linear Correlation

2. Regression. Follow-up to correlation. Computation of the regression line in the scatterplot: the line that best fits the cloud of points. Goal: predicting values.

Regression. The typical equation of a straight line y = mx + b is expressed in the form $\hat{y} = b_0 + b_1 x$, where $b_0$ is the y-intercept and $b_1$ is the slope. The regression equation expresses a relationship between x (called the independent variable, predictor variable or explanatory variable) and $\hat{y}$ (called the dependent variable or response variable).

Formulas for $b_0$ and $b_1$. Formula 10-2 (slope): $b_1 = \dfrac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$. Formula 10-3 (y-intercept): $b_0 = \bar{y} - b_1\bar{x}$. Calculators or computers can compute these values.
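As a rough illustration, Formulas 10-2 and 10-3 can be sketched in Python as follows. The helper name `regression_line` and the data are assumptions made up for this example.

```python
def regression_line(xs, ys):
    """Return (b0, b1) for y-hat = b0 + b1 * x using Formulas 10-2 and 10-3."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Formula 10-2: slope
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Formula 10-3: y-intercept, b0 = y-bar - b1 * x-bar
    b0 = sum_y / n - b1 * (sum_x / n)
    return b0, b1

# Hypothetical paired data (assumption, not taken from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = regression_line(xs, ys)
y_hat = b0 + b1 * 6  # predicted value for x = 6
print(round(b0, 3), round(b1, 3), round(y_hat, 3))  # 2.2 0.6 5.8
```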

Example: Old Faithful (continued). Given the sample data in Table 10-1, find the regression equation.

Procedure for Predicting Figure 10-7

3. Variation and Prediction Intervals. Follow-up to the regression line. (Ch. 7) Confidence interval = interval estimate of population parameters: proportion, mean, variance. Here: interval estimate of the predicted value of a variable.

Key Concept In this section we proceed to consider a method for constructing a prediction interval, which is an interval estimate of a predicted value of y.

Prediction Interval for an Individual y: $\hat{y} - E < y < \hat{y} + E$, where $E = t_{\alpha/2}\, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x_0 - \bar{x})^2}{n(\sum x^2) - (\sum x)^2}}$. $x_0$ represents the given value of x; $t_{\alpha/2}$ has n – 2 degrees of freedom.

Standard Error of Estimate. Definition: the standard error of estimate, denoted by $s_e$, is a measure of the differences (or distances) between the observed sample y-values and the predicted values $\hat{y}$ that are obtained using the regression equation.
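The sketch below shows one way to compute the standard error of estimate and the prediction interval in Python. It assumes the usual definition $s_e = \sqrt{\sum (y - \hat{y})^2 / (n - 2)}$, which the slides do not spell out. The function name, the data, and the critical value 3.182 (a t-table value for 3 degrees of freedom at 95% confidence) are illustrative assumptions.

```python
import math

def prediction_interval(xs, ys, x0, t_crit):
    """Return (lower, upper, s_e) for an individual y at x = x0.

    t_crit is the critical value t_{alpha/2} with n - 2 degrees of freedom,
    looked up by the caller in a t table (assumption: not computed here).
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = sum_y / n - b1 * (sum_x / n)
    # Standard error of estimate: spread of observed y around predicted y-hat.
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(sse / (n - 2))
    x_bar = sum_x / n
    margin = t_crit * s_e * math.sqrt(
        1 + 1 / n + n * (x0 - x_bar) ** 2 / (n * sum_x2 - sum_x ** 2)
    )
    y_hat = b0 + b1 * x0
    return y_hat - margin, y_hat + margin, s_e

# Hypothetical paired data (assumption, not taken from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
lower, upper, s_e = prediction_interval(xs, ys, x0=4, t_crit=3.182)
```

The interval is centered on the predicted value $\hat{y}$ and widens as $x_0$ moves away from $\bar{x}$.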

4. Rank Correlation. Nonparametric method = distribution-free test = no assumptions about the distribution in the population. Association test on two variables. Spearman's: $r_s$ (sample) or, for the population, $\rho_s$. Procedure in fig. (p. 537).
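One common way to obtain Spearman's $r_s$ is to replace each value by its rank and then apply the Pearson formula to the ranks. The Python sketch below follows that approach under stated assumptions: ties receive the mean of the ranks they span, and the function names and data are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    return (n * sum_xy - sum_x * sum_y) / (
        math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    )

def spearman_rs(xs, ys):
    """Spearman rank correlation: Pearson r applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data (assumption): two rankings that mostly agree.
rs = spearman_rs([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(round(rs, 3))  # 0.8
```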

Example

Chapter 11 Multinomial Experiments and Contingency Tables 1. Goodness-of-fit: multinomial 2. Contingency tables 3. Analysis of variance (ANOVA)

Overview. We focus on analysis of categorical (qualitative or attribute) data that can be separated into different categories (often called cells). Use the $\chi^2$ (chi-square) test statistic (Table A-4). The goodness-of-fit test uses a one-way frequency table (single row or column). The contingency table uses a two-way frequency table (two or more rows and columns).

1. Goodness-of-fit: multinomial. Does an observed probability distribution of a nominal variable match an expected distribution? H0: p1 = x, p2 = y, p3 = z, p4 = etc. H1: At least one of the observed proportions differs from the expected probability.

Goodness-of-Fit Test in Multinomial Experiments. Test statistic: $\chi^2 = \sum \dfrac{(O - E)^2}{E}$. Critical values: 1. Found in Table A-4 using k – 1 degrees of freedom, where k = number of categories. 2. Goodness-of-fit hypothesis tests are always right-tailed.
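A short Python sketch of the goodness-of-fit statistic follows. The function name `chi_square_gof` and the observed counts are hypothetical; the expected frequencies are taken as n times the probabilities claimed under H0.

```python
def chi_square_gof(observed, expected_probs):
    """Return (chi2, df) for a goodness-of-fit test.

    observed: observed counts O per category.
    expected_probs: probabilities claimed under H0 (should sum to 1).
    """
    n = sum(observed)
    expected = [n * p for p in expected_probs]  # E = n * p for each category
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # k - 1 degrees of freedom
    return chi2, df

# Hypothetical counts for four equally likely categories (assumption).
observed = [18, 22, 30, 30]
chi2, df = chi_square_gof(observed, [0.25, 0.25, 0.25, 0.25])
print(round(chi2, 2), df)  # 4.32 3
```

Because the test is right-tailed, H0 would be rejected only if chi2 exceeds the critical value found in Table A-4 for df degrees of freedom.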

Example: Last Digit Analysis Test the claim that the digits in Table 11-2 do not occur with the same frequency.

Relationships Among the $\chi^2$ Test Statistic, P-Value, and Goodness-of-Fit. Figure 11-3

2. Contingency tables. In this section we consider contingency tables (or two-way frequency tables), which include frequency counts for categorical data arranged in a table with at least two rows and at least two columns. We present a method for testing the claim that the row and column variables are independent of each other. We will use the same method for a test of homogeneity, whereby we test the claim that different populations have the same proportion of some characteristics.

Case-Control Study of Motorcycle Drivers. Table columns: Black, White, Yellow/Orange, Row Totals. Rows: Controls (not injured), Cases (injured or killed), Column Totals. Expected frequency: $E = \dfrac{(\text{row total})(\text{column total})}{(\text{grand total})}$. For the upper left-hand cell: $E = \dfrac{(899)(704)}{1232}$.
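The expected-frequency rule can be checked with a few lines of Python. The function name `expected_frequency` is an assumption; the three numbers are the ones given for the upper left-hand cell.

```python
def expected_frequency(row_total, column_total, grand_total):
    """E = (row total)(column total) / (grand total) for one table cell."""
    return row_total * column_total / grand_total

# Upper left-hand cell, using the totals quoted in the slide.
e_upper_left = expected_frequency(899, 704, 1232)
print(round(e_upper_left, 3))  # 513.714
```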

Case-Control Study of Motorcycle Drivers. Table columns: Black, White, Yellow/Orange, Row Totals. Rows: Controls (not injured) with expected frequencies, Cases (injured or killed) with expected frequencies, Column Totals.

Case-Control Study of Motorcycle Drivers. $H_0$: Row and column variables are independent. $H_1$: Row and column variables are dependent. The test statistic is $\chi^2 = \ldots$; $\alpha = 0.05$. The number of degrees of freedom is (r – 1)(c – 1) = (2 – 1)(3 – 1) = 2. The critical value (from Table A-4) is $\chi^2_{.05,2} = \ldots$
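To illustrate how the test statistic and degrees of freedom for a test of independence are obtained, here is a minimal Python sketch. The function name and the 2 x 3 table of counts are hypothetical; they are not the motorcycle data, whose cell values are not reproduced here.

```python
def chi_square_independence(table):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)  # (r - 1)(c - 1)
    return chi2, df

# Hypothetical 2 x 3 table of observed counts (assumption).
table = [
    [10, 20, 30],
    [20, 20, 20],
]
chi2, df = chi_square_independence(table)
print(round(chi2, 3), df)  # 5.333 2
```

H0 would be rejected when chi2 exceeds the Table A-4 critical value for df degrees of freedom at the chosen significance level.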

Case-Control Study of Motorcycle Drivers. We reject the null hypothesis. It appears there is an association between helmet color and motorcycle safety. Figure 11-4

3. Analysis of variance (ANOVA). ANalysis Of VAriance. H0: several population means are equal. F distribution (Table A-7). Test based on the P-value.

IN CLOSING: Bayesian statistics. Texts and 2 assignments (to be handed out). 1. Intuitive approach 2. Formal approach

Example problem. Given: in Orange County, USA, 51% of the population is male; 9.5% of the men smoke cigars, compared with 1.7% of the women. Question: what is the probability that a randomly selected cigar smoker is a man?

1. Intuitive approach

2. Formal approach
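Both approaches to the cigar-smoker problem can be sketched in Python. This is an illustration under stated assumptions: the population is split into men and women only (so 49% are women), the intuitive version uses an arbitrary population of 100,000 people, and all function and variable names are hypothetical.

```python
def posterior_male_given_cigar(p_male, p_cigar_given_male, p_cigar_given_female):
    """Bayes' theorem: P(male | cigar smoker)."""
    p_female = 1 - p_male  # assumption: everyone is either male or female
    joint_male = p_male * p_cigar_given_male
    joint_female = p_female * p_cigar_given_female
    return joint_male / (joint_male + joint_female)

# Formal approach, with the percentages from the example problem.
p_formal = posterior_male_given_cigar(0.51, 0.095, 0.017)

# Intuitive approach: count people in an assumed population of 100,000.
men = 51_000
women = 49_000
men_smokers = men * 95 // 1000      # 9.5% of the men -> 4845
women_smokers = women * 17 // 1000  # 1.7% of the women -> 833
p_intuitive = men_smokers / (men_smokers + women_smokers)

print(round(p_formal, 3), round(p_intuitive, 3))  # 0.853 0.853
```

Both routes give the same answer because counting people in a fixed population is Bayes' theorem with every probability multiplied by the population size.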

End of preview. Next week (week 6): - Q&A session - No new material - Preparation for the practice exam