SADC Course in Statistics Simple Linear Regression (Session 02)


1 SADC Course in Statistics Simple Linear Regression (Session 02)

2 Learning Objectives
At the end of this session, you will be able to:
– understand the meaning of a simple linear regression model, its aims and terminology
– determine the best-fitting line describing the relationship between a quantitative response (y) and a quantitative explanatory variable (x)
– interpret the unknown parameters of the regression line

3 An illustrative example
The data on the next slide show the average number of cigarettes smoked per adult in 1930 and the death rate per million in 1952 for sixteen countries. The question of interest is whether there is a relationship between the death rate (y) and the level of smoking (x). Here both y and x are quantitative measurements.

4 The Data
Country             Cig. smoked (x)   Death rate (y)
England and Wales
Finland
Austria
Netherlands
Belgium
Switzerland
New Zealand
U.S.A.
Denmark
Australia
Canada
France
Italy
Sweden              388               89
Norway              359               77
Japan               723               40

5 Start by plotting
A plot shows the pattern: a straight-line relationship seems plausible here.

6 Recall reasons for modelling
– To determine which of (often) several factors explain variability in the key response of interest;
– To summarise the relationship(s);
– For predictive purposes, e.g. predicting y for given xs, or identifying xs that optimise y in some way.
Note: Presence of an association between variables does not necessarily imply causation.

7 Describing the Regression Model
Describe variation in the response (here death rate) in terms of its relationship with the explanatory variable (here cigarette numbers).
Model: data = pattern + residual
– the pattern can be described as a + bx, if a straight-line relationship seems reasonable
– the residual is unexplained variation, assumed to be random.

8 Simple Linear Regression Model
If there is only one explanatory variable, we have a Simple Linear Regression Model. Here data = pattern + residual becomes:
y = α + βx + ε
where α + βx = pattern and ε = residual.
– α is called the intercept
– β is called the slope
– the εs represent the departures of the observed values from the true line.
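As a rough illustration of data = pattern + residual, the sketch below generates responses from a straight line plus normally distributed residuals. The function name simulate_responses, the x values and the residual standard deviation sigma = 40 are assumptions made for illustration only; the intercept and slope are the rounded estimates quoted later in the session, used here as if they were the true values.

```python
import random

def simulate_responses(xs, intercept, slope, sigma, seed=0):
    """Return y values generated as pattern + residual.

    pattern  = intercept + slope * x
    residual = random draw from Normal(0, sigma), same sigma for every x
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    ys = []
    for x in xs:
        pattern = intercept + slope * x
        residual = rng.gauss(0.0, sigma)
        ys.append(pattern + residual)
    return ys

# Hypothetical x values and parameter values, for illustration only.
xs = [200, 400, 600, 800, 1000]
ys = simulate_responses(xs, intercept=28.3, slope=0.24, sigma=40.0)
for x, y in zip(xs, ys):
    print(x, round(y, 1))
```

Setting sigma to 0 removes the residual, so the generated values fall exactly on the line.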

9 A Diagrammatic Representation

10 Parameters of Model & Assumptions
α and β are the unknown parameters in the model. They are estimated from the data.
The random error, ε, is assumed to have a
– normal distribution
– with constant variance (whatever the value of x).
We shall return to these assumptions later.

11 Results of model fitting
deathrate | Coef.  Std.Err.  t  P>|t|  [95% Conf.Int.]
Cigars    | [values not preserved in transcript]
Const.    | [values not preserved in transcript]
These are estimates of the coefficients of the regression equation, since this is a sample of data; their precision is quantified by the standard errors.
Estimated equation is: y = 28.3 + 0.24 * x (coefficients rounded as on slide 13).
Note: The t and P>|t| columns will be discussed in the next session.
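The coefficient estimates in output like this come from a least-squares fit. A minimal sketch of that calculation follows, assuming the usual least-squares formulas; the function name fit_line and the small data set are hypothetical and are not the course data. It returns only the two coefficients, not the standard errors or confidence intervals shown in the software output.

```python
def fit_line(xs, ys):
    """Least-squares estimates (intercept, slope) for y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Corrected sums of squares and cross-products
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if s_xx == 0:
        raise ValueError("x values must not all be equal")
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data lying exactly on y = 5 + 2x, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [7.0, 9.0, 11.0, 13.0]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)
```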

12 The fitted line

13 Interpreting model parameters
Slope (regression coefficient): if the number of cigarettes smoked increases by 1 unit, the estimated death rate increases by 0.24 units. In other words, if cigarettes smoked increases by 100 units, the estimated death rate increases by 24 units.
Intercept: the value of 28.3 only has meaning if the range of x values (cigarettes smoked) under study includes zero. Here, zero cigarettes smoked still gives an estimated death rate of 28.3.

14 Predictions from the line
The model equation can also be used to predict y at a given value of x. Thus, from y = 28.3 + 0.24x (rounded coefficients), the predicted death rate (ŷ) in a country where the number of cigarettes smoked is x = 1000 is given by ŷ = 28.3 + 0.24(1000) = 268.3.
Note: Predictions will be discussed in greater detail in Session 9.
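The prediction step can be sketched in a few lines. The function name predict_death_rate is hypothetical, and the default coefficients are the rounded values 28.3 and 0.24 quoted on the interpretation slide, so the result may differ slightly from a prediction based on the unrounded estimates.

```python
def predict_death_rate(cigarettes, intercept=28.3, slope=0.24):
    """Predicted death rate per million from the fitted line (rounded coefficients)."""
    return intercept + slope * cigarettes

# Prediction for a country where x = 1000 cigarettes per adult
print(round(predict_death_rate(1000), 1))  # 268.3
```

Predictions are most trustworthy for x values inside the range covered by the data.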

15 Computation of model estimates (for reference only)
[formulas not preserved in transcript]
Note: Can also write [formula not preserved in transcript]
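The formulas on this slide did not survive in the transcript. For reference, the standard least-squares estimates for a line a + bx are given below; the notation (a and b for the estimated intercept and slope, n for the number of observations) is an assumption and may not match the slide.

```latex
b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}

% An algebraically equivalent form of the slope:
b = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2}
```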

16 Practical work follows to ensure learning objectives are achieved…

