STATISTICS Linear Statistical Models Professor Ke-Sheng Cheng Department of Bioenvironmental Systems Engineering National Taiwan University.

1 STATISTICS Linear Statistical Models Professor Ke-Sheng Cheng Department of Bioenvironmental Systems Engineering National Taiwan University

2 The Method of Least Squares Consider the data shown in the following table and figure. We are interested in fitting a straight line to the points in order to obtain a simple mathematical relationship for runoff and rainfall. 2014/1/31 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU 2

4 Intuitively, for each observed value of rainfall we want the runoff given by the fitted line to be as close as possible to the observed runoff. Equivalently, we want the vertical deviations of the points from the line to be as small as possible.

5 One method of constructing such a straight line to fit the observed data is called the method of least squares. It requires the sum of the squares of the vertical deviations of all the points from the fitted line to be a minimum. Let the rainfall and runoff data in the above figure be represented by x and y, respectively. The fitted line is expressed as a linear function of x, with an intercept and a slope to be estimated from the data.
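
The least squares criterion described above can be written out as follows. The symbols for the estimated intercept and slope are assumed notation, since the slide's own equation is not preserved in this transcript; the closed-form solution is the standard one for a straight-line fit.

```latex
\hat{y} = \hat\beta_0 + \hat\beta_1 x,
\qquad
S(\hat\beta_0, \hat\beta_1) = \sum_{i=1}^{n} \bigl( y_i - \hat\beta_0 - \hat\beta_1 x_i \bigr)^2

% Setting the partial derivatives of S to zero gives
\hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}
```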

10 Remarks

12 Given a value of x, what does the predicted value of y really represent?

13
– It is unlikely that the predicted value will be the same as the observed value at all times.
– It may even be possible that the predicted value is the same as the observed value only in very few cases.
– In some cases, the predicted values are far from the observed values.
The linear model will generally overpredict some observed values and underpredict others.

14 Linear statistical model. Random component: we are not able to predict y without error because of the random component. If a phenomenon is stochastic in nature, it cannot be predicted without error.
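
A common way to write such a model for the straight-line case is sketched below. The symbols are assumed notation (the slide's equation is not preserved): the first two terms form the deterministic component and the last term is the random component.

```latex
Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,
\qquad
E(\varepsilon_i) = 0,
\quad
\operatorname{Var}(\varepsilon_i) = \sigma^2

% so that the fitted line estimates the conditional mean
E(Y_i \mid x_i) = \beta_0 + \beta_1 x_i
```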

16 Coefficient of determination How well does the least squares line explain the variation in the data? The coefficient of determination represents the proportion of data variation that can be explained by the linear regression model.
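
As a minimal sketch, the coefficient of determination can be computed by hand and compared with the value reported by R. The rainfall and runoff numbers below are made up for illustration and are not the data from the slides.

```r
# Hypothetical rainfall (x) and runoff (y) values, made up for illustration
x <- c(10, 15, 20, 25, 30, 35, 40)
y <- c(3.1, 5.2, 8.0, 9.9, 13.2, 14.8, 18.1)

fit <- lm(y ~ x)

sse <- sum(residuals(fit)^2)      # variation left unexplained by the line
sst <- sum((y - mean(y))^2)       # total variation of y about its mean
r2_manual <- 1 - sse / sst        # proportion of variation explained
r2_lm <- summary(fit)$r.squared   # value reported by summary()

print(c(manual = r2_manual, lm = r2_lm))
```

For a straight-line fit with an intercept, this value also equals the squared sample correlation between x and y.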

19 Estimating the variance of Y | x. RSS (residual sum of squares) = SSE (sum of squared errors). Note: The variance of Y|x is NOT the same as the variance of Y.
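
The distinction in the note can be illustrated with a short sketch. It assumes the usual estimator SSE/(n - 2) for a straight-line fit (the slide's own formula is not preserved), and the data are made up for illustration.

```r
# Hypothetical rainfall (x) and runoff (y) values, made up for illustration
x <- c(10, 15, 20, 25, 30, 35, 40)
y <- c(3.1, 5.2, 8.0, 9.9, 13.2, 14.8, 18.1)

fit <- lm(y ~ x)
n <- length(y)

sse <- sum(residuals(fit)^2)    # RSS = SSE
s2 <- sse / (n - 2)             # estimate of Var(Y | x); two coefficients were estimated
s2_lm <- summary(fit)$sigma^2   # the same quantity as reported by summary()
var_y <- var(y)                 # sample variance of Y, ignoring x

print(c(s2 = s2, var_y = var_y))
```

When x explains much of the variation in y, the estimated variance of Y | x is far smaller than the sample variance of Y.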

21 Unbiasedness of the least squares estimators
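
Unbiasedness can be checked informally by simulation, as sketched below. The true coefficients, error standard deviation, design points, and seed are all hypothetical choices; this is an illustration, not the proof given on the slide.

```r
set.seed(1)       # arbitrary seed so the run is reproducible

# Hypothetical true values of the model Y = beta0 + beta1 * x + error
beta0 <- 2
beta1 <- 0.5
sigma <- 1
x <- 1:20         # fixed design points

# Simulate one data set and return its least squares estimates
one_fit <- function() {
  y <- beta0 + beta1 * x + rnorm(length(x), mean = 0, sd = sigma)
  coef(lm(y ~ x))
}

estimates <- replicate(2000, one_fit())   # 2 x 2000 matrix: intercept row, slope row
avg <- rowMeans(estimates)                # averages should sit close to beta0 and beta1

print(avg)
```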

22 Confidence intervals of the regression coefficients Pivotal quantities
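
A sketch of how such intervals can be obtained in R, assuming the usual t-based pivotal quantity with n - 2 degrees of freedom for a straight-line fit. The data are made up for illustration.

```r
# Hypothetical rainfall (x) and runoff (y) values, made up for illustration
x <- c(10, 15, 20, 25, 30, 35, 40)
y <- c(3.1, 5.2, 8.0, 9.9, 13.2, 14.8, 18.1)

fit <- lm(y ~ x)
est <- coef(summary(fit))   # columns: Estimate, Std. Error, t value, Pr(>|t|)

alpha <- 0.05
tcrit <- qt(1 - alpha / 2, df = df.residual(fit))   # t quantile with n - 2 df

# Estimate plus or minus t quantile times standard error
ci_manual <- cbind(est[, "Estimate"] - tcrit * est[, "Std. Error"],
                   est[, "Estimate"] + tcrit * est[, "Std. Error"])

ci_lm <- confint(fit, level = 1 - alpha)   # built-in equivalent
print(ci_lm)
```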

23 Hypothesis tests for regression coefficients
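
As an illustration, the sketch below tests the null hypothesis that the slope is zero, assuming the usual two-sided t-test. The specific hypotheses on the original slide are not preserved, and the data are made up.

```r
# Hypothetical rainfall (x) and runoff (y) values, made up for illustration
x <- c(10, 15, 20, 25, 30, 35, 40)
y <- c(3.1, 5.2, 8.0, 9.9, 13.2, 14.8, 18.1)

fit <- lm(y ~ x)
tab <- coef(summary(fit))

b1 <- tab["x", "Estimate"]
se_b1 <- tab["x", "Std. Error"]

t_stat <- b1 / se_b1   # test statistic for H0: slope = 0
p_value <- 2 * pt(abs(t_stat), df = df.residual(fit), lower.tail = FALSE)

print(c(t = t_stat, p = p_value))
```

A small p-value is evidence against the null hypothesis that y does not depend linearly on x.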

30 Simple linear regression using R
Useful material
– Chapter 11 of Introduction to Probability and Statistics Using R (G. J. Kerns) is highly recommended.
– /ac /Class8/Using%20R%20for%20linear %20regression.pdf

31 Defining linear regression models

32 Conducting regression lm(y~model)
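
A minimal usage sketch of lm() follows. The data frame and its column names (rainfall, runoff) are hypothetical and chosen only for illustration; the extractor functions shown are standard ones.

```r
# Hypothetical data frame; the column names and values are illustrative only
dat <- data.frame(
  rainfall = c(10, 15, 20, 25, 30, 35, 40),
  runoff   = c(3.1, 5.2, 8.0, 9.9, 13.2, 14.8, 18.1)
)

# Response on the left of ~, model terms on the right
fit <- lm(runoff ~ rainfall, data = dat)

print(coef(fit))        # estimated intercept and slope
print(fitted(fit))      # fitted values at the observed rainfall
print(residuals(fit))   # observed minus fitted runoff
print(summary(fit))     # standard errors, t-tests, R-squared
```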

33 Other useful commands

34 – For prediction (x values not observed)

35 Graphing the Confidence and Prediction Bands. You may want to change the data frame of x values at which the bands are evaluated. For example, data.frame(x=seq(20,30,by=0.5))

36 Confidence and prediction intervals. Line of prediction: it represents the estimated conditional expectation of y given x.
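
The sketch below shows one way to compute both bands with predict(). The data are made up, and the helper draw_bands() is a hypothetical name introduced here; it is defined but not called, so the block runs without opening a graphics device.

```r
# Hypothetical rainfall (x) and runoff (y) values, made up for illustration
x <- c(10, 15, 20, 25, 30, 35, 40)
y <- c(3.1, 5.2, 8.0, 9.9, 13.2, 14.8, 18.1)

fit <- lm(y ~ x)

# Grid of x values at which the bands are evaluated
new_x <- data.frame(x = seq(20, 30, by = 0.5))

# Each call returns a matrix with columns fit, lwr, upr
conf_band <- predict(fit, newdata = new_x, interval = "confidence", level = 0.95)
pred_band <- predict(fit, newdata = new_x, interval = "prediction", level = 0.95)

# Hypothetical helper: scatter plot, fitted line, and both bands
draw_bands <- function() {
  plot(x, y, xlab = "x", ylab = "y")
  abline(fit)
  matlines(new_x$x, conf_band[, c("lwr", "upr")], lty = 2, col = "blue")
  matlines(new_x$x, pred_band[, c("lwr", "upr")], lty = 3, col = "red")
}
```

Calling draw_bands() in an interactive session draws the plot. The prediction band is wider than the confidence band because it accounts for the random component of a new observation in addition to the uncertainty in the estimated line.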

38 Multiple regression – The following slides are provided for your reference only. Because of time constraints, they will not be covered in this class.

39 Now let's consider fitting a linear function of several variables. Suppose that we have the following data set:
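
A sketch of fitting a linear function of two variables in R. The data set and the variable names x1, x2, and y are hypothetical; the slide's own table is not preserved. The last lines solve the normal equations directly as a cross-check, which is a standard approach and not necessarily the derivation used in the slides.

```r
# Hypothetical data with two explanatory variables, made up for illustration
dat <- data.frame(
  x1 = c(1, 2, 3, 4, 5, 6, 7, 8),
  x2 = c(2, 1, 4, 3, 6, 5, 8, 7),
  y  = c(4.1, 4.8, 9.2, 9.9, 14.1, 15.2, 19.0, 19.8)
)

fit <- lm(y ~ x1 + x2, data = dat)
print(coef(fit))

# Cross-check: solve the normal equations (X'X) b = X'y
X <- model.matrix(fit)   # design matrix with a leading column of ones
beta_hat <- solve(crossprod(X), crossprod(X, dat$y))
print(beta_hat)
```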

44 The Linear Regression Model
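
In matrix form, the linear regression model is commonly written as below. The notation is assumed, since the slide's equations are not preserved: Y is the n-vector of responses, X the n-by-p design matrix, and p the number of regression coefficients.

```latex
\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon},
\qquad
\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \sigma^2 \mathbf{I})

% Least squares estimator, provided X'X is invertible
\hat{\boldsymbol{\beta}} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{Y}
```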

49 Covariance and Correlation Coefficient Suppose we have observed the following data. We wish to measure both the direction and the strength of the relationship between Y and X.
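
A short sketch of the two measures, using the usual sample definitions with divisor n - 1. The data are made up for illustration and are not the observations from the slide.

```r
# Hypothetical paired observations, made up for illustration
x <- c(10, 15, 20, 25, 30, 35, 40)
y <- c(3.1, 5.2, 8.0, 9.9, 13.2, 14.8, 18.1)
n <- length(x)

# Sample covariance: the sign gives the direction of the relationship
cov_manual <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# Correlation coefficient: covariance scaled to lie between -1 and 1,
# so its magnitude measures the strength of the linear relationship
cor_manual <- cov_manual / (sd(x) * sd(y))

print(c(cov = cov_manual, cor = cor_manual))
```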

57 The Analysis of Variance (ANOVA)

58 Given X, the Y's are independent normal random variables. The residual sum of squares (or sum of squared errors, SSE) is the sum of the squared differences between the observed and fitted values.

64 The total sum of squares corrected for the mean is referred to as the total variation. This total variation is split into two parts:
– the regression part (SSR_m) explained by the model, and
– the residual part (SSE).

65 The ratio of the regression sum of squares (SSR_m) to the total sum of squares is known as the coefficient of determination. It represents the part of the total variation which is explained by the model. If the coefficient of determination is large, then the model provides a good fit to the data.
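
The decomposition of the total variation can be verified numerically, as sketched below with made-up data. The variable names sst, ssr, and sse are illustrative; anova() is the built-in function that reports the same sums of squares.

```r
# Hypothetical rainfall (x) and runoff (y) values, made up for illustration
x <- c(10, 15, 20, 25, 30, 35, 40)
y <- c(3.1, 5.2, 8.0, 9.9, 13.2, 14.8, 18.1)

fit <- lm(y ~ x)

sst <- sum((y - mean(y))^2)             # total variation, corrected for the mean
ssr <- sum((fitted(fit) - mean(y))^2)   # regression part, explained by the model
sse <- sum((y - fitted(fit))^2)         # residual part

r2 <- ssr / sst                         # coefficient of determination

tab <- anova(fit)                       # ANOVA table with the same sums of squares
print(tab)
```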

69 Properties of the Estimators

73 Confidence Intervals

74 The 100(1 – α)% confidence interval of σ² is
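
A sketch of one standard construction of an interval for the error variance. It assumes the chi-square pivotal quantity (n - p)s²/σ² with n - p degrees of freedom; the slide's own formula is not preserved, and the data are made up.

```r
# Hypothetical rainfall (x) and runoff (y) values, made up for illustration
x <- c(10, 15, 20, 25, 30, 35, 40)
y <- c(3.1, 5.2, 8.0, 9.9, 13.2, 14.8, 18.1)

fit <- lm(y ~ x)
n <- length(y)
p <- length(coef(fit))                  # number of estimated coefficients

s2 <- sum(residuals(fit)^2) / (n - p)   # estimate of the error variance

alpha <- 0.05
# Assumed pivot: (n - p) * s2 / sigma^2 follows a chi-square law with n - p df
lower <- (n - p) * s2 / qchisq(1 - alpha / 2, df = n - p)
upper <- (n - p) * s2 / qchisq(alpha / 2, df = n - p)

print(c(lower = lower, s2 = s2, upper = upper))
```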

76 However, the true value of σ is unknown, so the above equation cannot be used to establish the confidence interval of the regression coefficients. We then substitute s for σ, and it is known that the resulting statistic has a t-distribution with (n – p) degrees of freedom.

77 The 100(1 – α)% confidence interval of the coefficient is
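
One standard form of this interval is sketched below. The notation is assumed because the slide's symbols are not preserved: c_jj denotes the j-th diagonal element of the inverse of X'X, and s² is SSE/(n - p).

```latex
\frac{\hat\beta_j - \beta_j}{s\sqrt{c_{jj}}} \sim t_{n-p},
\qquad
c_{jj} = \bigl[(\mathbf{X}^{\top}\mathbf{X})^{-1}\bigr]_{jj}

% which yields the 100(1 - \alpha)% confidence interval
\hat\beta_j \pm t_{\alpha/2,\, n-p}\; s\sqrt{c_{jj}}
```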

80 Example 1 A scientist carries out an experiment on the relationship between the yield Y of a crop and the amount of irrigation water X. It is believed that the relationship between expected yield and amount of irrigation water (ignore the units) can be described adequately as

81 The data shown in the following table were collected in the field.

86 Example 2 The data in the following table are rainfall (x) and runoff (y) measured during the rainy season in a study area. A regression model is postulated for these data.

92 Test of Hypotheses
