1
**STATISTICS Linear Statistical Models**

Professor Ke-Sheng Cheng Department of Bioenvironmental Systems Engineering National Taiwan University

2
**The Method of Least Squares**

Consider the data shown in the following table and figure. We are interested in fitting a straight line to the points in order to obtain a simple mathematical relationship between runoff and rainfall.

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

4
Intuitively, we want the value of runoff predicted by the line to be as close as possible to the observed value for each observed value of rainfall. This is equivalent to saying that we want the vertical deviations of the points from the line to be as small as possible.

5
One method of constructing such a straight line to fit the observed data is called the method of least squares. It requires the sum of the squares of the vertical deviations of all the points from the fitted line to be a minimum. Let x and y denote the rainfall and runoff data in the above figure, respectively. The fitted line is a linear function of x whose intercept and slope are chosen to minimize that sum.
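The least squares criterion can be sketched in a few lines of code. This is a minimal illustration, not the slides' own derivation: the function name `fit_line` and the rainfall and runoff numbers are hypothetical, and the closed-form expressions used are the standard ones for a straight line with one predictor.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line through the points (x, y)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = s_xy / s_xx                # minimizes the sum of squared vertical deviations
    intercept = y_bar - slope * x_bar  # the fitted line passes through (x_bar, y_bar)
    return intercept, slope


# Hypothetical rainfall (x) and runoff (y) values, for illustration only.
rainfall = [1.0, 2.0, 3.0, 4.0, 5.0]
runoff = [0.8, 1.9, 3.2, 3.9, 5.2]
a, b = fit_line(rainfall, runoff)
print(f"fitted line: y = {a:.3f} + {b:.3f} x")
```

The slope is the ratio of the cross-product sum to the sum of squares of x, and the intercept follows from the fact that the least-squares line always passes through the point of means.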

10
Remarks

13
**Given a value of x, what does the predicted value of y really represent?**

It is unlikely that the predicted value will always equal the observed value. The two may coincide in only a few cases, and in some cases the predicted value differs greatly from the observed value. The linear model will therefore overpredict some observed values and underpredict others.

14
**Linear statistical model**

We are not able to predict y without error because of the random component. If a phenomenon is stochastic in nature, it cannot be predicted without error.
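One common way to write such a model is given below. The symbols are assumptions chosen for illustration ($\beta_0$ for the intercept, $\beta_1$ for the slope, $\varepsilon_i$ for the random component, $\sigma^2$ for its variance) and may differ from the notation on the original slide.

```latex
Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad
E[\varepsilon_i] = 0, \qquad
\operatorname{Var}(\varepsilon_i) = \sigma^2, \qquad i = 1, \dots, n,

E[Y \mid x] = \beta_0 + \beta_1 x .
```

Under this formulation the straight line describes only the conditional mean of Y given x, while $\varepsilon_i$ accounts for the scatter of individual observations about that mean.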

16
**Coefficient of determination**

How well does the least squares line explain the variation in the data? The coefficient of determination represents the proportion of data variation that can be explained by the linear regression model.
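A small numerical sketch may help make this proportion concrete. The function name `r_squared` and the data are hypothetical, and the computation assumes a straight-line model with one predictor fitted by least squares.

```python
def r_squared(x, y):
    """Coefficient of determination of the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Unexplained variation: squared vertical deviations from the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variation of y about its mean.
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / sst


# Hypothetical rainfall (x) and runoff (y) values, for illustration only.
rainfall = [1.0, 2.0, 3.0, 4.0, 5.0]
runoff = [0.8, 1.9, 3.2, 3.9, 5.2]
print(f"R^2 = {r_squared(rainfall, runoff):.4f}")
```

A value near 1 means that almost all of the variation of y about its mean is captured by the line; a value near 0 means the line explains little of it.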

19
**Estimating the variance of Y|x**

RSS (residual sum of squares) = SSE (sum of squared errors)

Note: The variance of Y|x is NOT the same as the variance of Y.
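The distinction in the note can be checked numerically. The sketch below assumes a simple linear regression with one predictor and uses the usual estimator SSE / (n - 2) for the variance of Y|x; the function name `residual_variance` and the data are hypothetical.

```python
import statistics


def residual_variance(x, y):
    """Estimate Var(Y|x) as SSE / (n - 2) for a least-squares straight line."""
    n = len(x)
    if n < 3:
        raise ValueError("at least 3 points are needed to estimate the variance")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return sse / (n - 2)  # two degrees of freedom are used by intercept and slope


# Hypothetical rainfall (x) and runoff (y) values, for illustration only.
rainfall = [1.0, 2.0, 3.0, 4.0, 5.0]
runoff = [0.8, 1.9, 3.2, 3.9, 5.2]
print("estimated Var(Y|x):", residual_variance(rainfall, runoff))
print("sample Var(Y):     ", statistics.variance(runoff))
```

When the line fits well, the estimated variance of Y|x is much smaller than the sample variance of Y, because most of the spread in Y is attributable to the spread in x.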

21
**Unbiasedness of the least squares estimators**

22
**Confidence intervals of the regression coefficients**

Pivotal quantities
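For reference, the pivotal quantities usually used for a simple linear regression with normal errors are sketched below. The notation is an assumption and may differ from the original slide: $\hat{\beta}_0$ and $\hat{\beta}_1$ are the least squares estimators, $s^2 = \mathrm{SSE}/(n-2)$, and $S_{xx} = \sum_i (x_i - \bar{x})^2$.

```latex
\frac{\hat{\beta}_1 - \beta_1}{s / \sqrt{S_{xx}}} \sim t_{n-2},
\qquad
\frac{\hat{\beta}_0 - \beta_0}{s \sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}}}} \sim t_{n-2}.
```

Because the distribution of each quantity does not depend on the unknown parameters, inverting it gives a confidence interval, for example $\hat{\beta}_1 \pm t_{\alpha/2,\,n-2}\, s / \sqrt{S_{xx}}$ for the slope.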

23
**Hypothesis tests for regression coefficients**

30
**Simple linear regression using R**

Useful material: Chapter 11 of Introduction to Probability and Statistics Using R (G. J. Kerns) is highly recommended.

31
**Defining linear regression models**

32
**Conducting regression**

lm(y~model), where model is the right-hand side of the formula; for example, lm(y~x) fits a simple linear regression of y on x.

33
**Other useful commands**

34
**For prediction (x values not observed)**

35
**Graphing the Confidence and Prediction Bands**

You may want to change the x values at which the bands are evaluated. For example, data.frame(x=seq(20,30,by=0.5))

36
**Confidence and prediction intervals**

Line of prediction. It represents the estimated conditional expectation of y given x.
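The two kinds of interval can be written out as follows for a simple linear regression with normal errors. The notation is assumed for illustration: $\hat{y}_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0$ is the value on the line of prediction at $x_0$, $s^2 = \mathrm{SSE}/(n-2)$, and $S_{xx} = \sum_i (x_i - \bar{x})^2$.

```latex
\text{Confidence interval for } E[Y \mid x_0]: \quad
\hat{y}_0 \pm t_{\alpha/2,\,n-2}\; s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}

\text{Prediction interval for a new } Y \text{ at } x_0: \quad
\hat{y}_0 \pm t_{\alpha/2,\,n-2}\; s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}
```

The prediction interval is always wider because it accounts for the random component of a single new observation in addition to the uncertainty in the estimated line. Both bands are narrowest at $x_0 = \bar{x}$.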

38
**Multiple regression**

The following slides are provided for your reference only. Because of time constraints, they will not be covered in this class.

39
**Now let’s consider fitting a linear function of several variables**

Suppose that we have the following data set:

44
**The Linear Regression Model**

49
**Covariance and Correlation Coefficient**

Suppose we have observed the following data. We wish to measure both the direction and the strength of the relationship between Y and X.
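Direction and strength can be illustrated with the sample covariance and the sample correlation coefficient. The sketch below uses the usual definitions with an n - 1 divisor for the covariance; the function names and the data are hypothetical.

```python
import math


def covariance(x, y):
    """Sample covariance of x and y (divisor n - 1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


def correlation(x, y):
    """Sample correlation coefficient: covariance scaled by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


# Hypothetical observations, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.8, 1.9, 3.2, 3.9, 5.2]
print("covariance: ", covariance(x, y))
print("correlation:", correlation(x, y))
```

The sign of the covariance gives the direction of the relationship, but its magnitude depends on the units of X and Y. The correlation coefficient removes the units and lies between -1 and 1, so it measures the strength of the linear relationship.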

57
**The Analysis of Variance (ANOVA)**

58
**Given X, the Y’s are independent normal random variables**

The residual sum of squares (or sum of squared errors, SSE) is the sum of the squared differences between the observed values of Y and the values fitted by the model.

64
The total sum of squares corrected for the mean is referred to as the total variation. This total variation is split into two parts: the regression part (SSRm), “explained by the model”, and the residual part (SSE).
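The split of the total variation can be verified numerically. The sketch below is a simplified illustration with a single predictor rather than the general multiple regression case; the function name `sums_of_squares` and the data are hypothetical.

```python
def sums_of_squares(x, y):
    """Return (SST, SSR, SSE) for the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)                   # total variation
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)              # explained by the model
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))     # residual part
    return sst, ssr, sse


# Hypothetical observations, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.8, 1.9, 3.2, 3.9, 5.2]
sst, ssr, sse = sums_of_squares(x, y)
print(f"SST = {sst:.4f}, SSR + SSE = {ssr + sse:.4f}")
```

For a least-squares fit that includes an intercept, the regression and residual parts add up to the total variation, apart from floating-point rounding.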

65
**The ratio of the regression sum of squares (SSRm) to the total variation is known as the coefficient of determination.**

If the coefficient of determination is large, the model provides a good fit to the data. It represents the part of the total variation that is explained by the model.

69
**Properties of the Estimators**

73
**Confidence Intervals**

74
**The 100(1 – α)% confidence interval of σ² is**

76
However, because the true value of σ is unknown, the above equation cannot be used to establish the confidence interval of the regression coefficients. We then substitute s for σ, and it is known that the resulting standardized quantity has a t-distribution with (n–p) degrees of freedom.

77
**The 100(1 – α)% confidence interval of each regression coefficient is**

80
**Example 1**

A scientist carries out an experiment on the relationship between the yield Y of a crop and the amount of irrigation water X. It is believed that the relationship between expected yield and amount of irrigation water (ignore the units) can be described adequately as

81
**The data shown in the following table were collected in the field.**

86
**Example 2**

The data in the following table are rainfall (x) and runoff (y) measured during the rainy season in a study area. A regression model is postulated for these data.

92
Test of Hypotheses
