STATISTICS Linear Statistical Models

Slides:



Advertisements
Similar presentations
EcoTherm Plus WGB-K 20 E 4,5 – 20 kW.
Advertisements

Números.
Trend for Precision Soil Testing % Zone or Grid Samples Tested compared to Total Samples.
Trend for Precision Soil Testing % Zone or Grid Samples Tested compared to Total Samples.
AGVISE Laboratories %Zone or Grid Samples – Northwood laboratory
Trend for Precision Soil Testing % Zone or Grid Samples Tested compared to Total Samples.
PDAs Accept Context-Free Languages
Reflection nurulquran.com.
EuroCondens SGB E.
Worksheets.
Slide 1Fig 26-CO, p.795. Slide 2Fig 26-1, p.796 Slide 3Fig 26-2, p.797.
Slide 1Fig 25-CO, p.762. Slide 2Fig 25-1, p.765 Slide 3Fig 25-2, p.765.
& dding ubtracting ractions.
Sequential Logic Design
STATISTICS Joint and Conditional Distributions
STATISTICS Random Variables and Probability Distributions
STATISTICS HYPOTHESES TEST (I)
STATISTICS INTERVAL ESTIMATION Professor Ke-Sheng Cheng Department of Bioenvironmental Systems Engineering National Taiwan University.
STATISTICS POINT ESTIMATION Professor Ke-Sheng Cheng Department of Bioenvironmental Systems Engineering National Taiwan University.
STATISTICS Univariate Distributions
Addition and Subtraction Equations
1 When you see… Find the zeros You think…. 2 To find the zeros...
Western Public Lands Grazing: The Real Costs Explore, enjoy and protect the planet Forest Guardians Jonathan Proctor.
Add Governors Discretionary (1G) Grants Chapter 6.
CALENDAR.
CHAPTER 18 The Ankle and Lower Leg
The 5S numbers game..
突破信息检索壁垒 -SciFinder Scholar 介绍
A Fractional Order (Proportional and Derivative) Motion Controller Design for A Class of Second-order Systems Center for Self-Organizing Intelligent.
Numerical Analysis 1 EE, NCKU Tien-Hao Chang (Darby Chang)
Break Time Remaining 10:00.
The basics for simulations
Factoring Quadratics — ax² + bx + c Topic
PP Test Review Sections 6-1 to 6-6
MM4A6c: Apply the law of sines and the law of cosines.
Figure 3–1 Standard logic symbols for the inverter (ANSI/IEEE Std
Regression with Panel Data
TCCI Barometer March “Establishing a reliable tool for monitoring the financial, business and social activity in the Prefecture of Thessaloniki”
Dynamic Access Control the file server, reimagined Presented by Mark on twitter 1 contents copyright 2013 Mark Minasi.
TCCI Barometer March “Establishing a reliable tool for monitoring the financial, business and social activity in the Prefecture of Thessaloniki”
Copyright © 2012, Elsevier Inc. All rights Reserved. 1 Chapter 7 Modeling Structure with Blocks.
Progressive Aerobic Cardiovascular Endurance Run
MaK_Full ahead loaded 1 Alarm Page Directory (F11)
TCCI Barometer September “Establishing a reliable tool for monitoring the financial, business and social activity in the Prefecture of Thessaloniki”
When you see… Find the zeros You think….
2011 WINNISQUAM COMMUNITY SURVEY YOUTH RISK BEHAVIOR GRADES 9-12 STUDENTS=1021.
Before Between After.
2011 FRANKLIN COMMUNITY SURVEY YOUTH RISK BEHAVIOR GRADES 9-12 STUDENTS=332.
Foundation Stage Results CLL (6 or above) 79% 73.5%79.4%86.5% M (6 or above) 91%99%97%99% PSE (6 or above) 96%84%100%91.2%97.3% CLL.
Subtraction: Adding UP
Numeracy Resources for KS2
1 Non Deterministic Automata. 2 Alphabet = Nondeterministic Finite Accepter (NFA)
Static Equilibrium; Elasticity and Fracture
Converting a Fraction to %
Resistência dos Materiais, 5ª ed.
& dding ubtracting ractions.
Copyright © 2013 Pearson Education, Inc. All rights reserved Chapter 11 Simple Linear Regression.
Lial/Hungerford/Holcomb/Mullins: Mathematics with Applications 11e Finite Mathematics with Applications 11e Copyright ©2015 Pearson Education, Inc. All.
Biostatistics course Part 14 Analysis of binary paired data
16. Mean Square Estimation
A Data Warehouse Mining Tool Stephen Turner Chris Frala
1 Dr. Scott Schaefer Least Squares Curves, Rational Representations, Splines and Continuity.
Chart Deception Main Source: How to Lie with Charts, by Gerald E. Jones Dr. Michael R. Hyman, NMSU.
1 Non Deterministic Automata. 2 Alphabet = Nondeterministic Finite Accepter (NFA)
Introduction Embedded Universal Tools and Online Features 2.
úkol = A 77 B 72 C 67 D = A 77 B 72 C 67 D 79.
Schutzvermerk nach DIN 34 beachten 05/04/15 Seite 1 Training EPAM and CANopen Basic Solution: Password * * Level 1 Level 2 * Level 3 Password2 IP-Adr.
Dept of Bioenvironmental Systems Engineering National Taiwan University Lab for Remote Sensing Hydrology and Spatial Modeling STATISTICS Linear Statistical.
Presentation transcript:

STATISTICS Linear Statistical Models Professor Ke-Sheng Cheng Department of Bioenvironmental Systems Engineering National Taiwan University

The Method of Least Squares Consider the data shown in the following table and figure. We are interested in fitting a straight line to the points in order to obtain a simple mathematical relationship for runoff and rainfall. 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Intuitively, we want that, for each observed value of rainfall, the corresponding value of runoff will be as close as possible to the observed value. It is equivalent to say that we want the vertical deviations to be as small as possible. 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

One method of constructing such a straight line to fit the observed data is called the method of least squares. It requires the sum of the squares of the vertical deviations of all the points from the fitted line to be a minimum. Let the rainfall and runoff data in the above figure be respectively represented by x and y. The fitted line is expressed by 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Remarks 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Given a value of x, what dose the predicted value of y really represent? 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Given a value of x, what dose the predicted value of y really represent? It is unlikely that the predicted value will be the same as the observed value at all times. It may even be possible that the predicted value is the same as the observed value only in very few cases. In some cases, the predicted values are far different from observed values. We are sure that the linear model may overpredict or underpredict the observed values. 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Linear statistical model We are not able to predict y without errors due to existence of the random component. If a phenomenon is stochastic in nature, it cannot be predicted without errors. Random component 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Coefficient of determination How well does the least squares line explain the variation in the data? The coefficient of determination represents the proportion of data variation that can be explained by the linear regression model. 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Estimating the variance of Y|x RSS (Residual sum of squares) = SSE (sum of squared errors) Note: The variance of Y|x is NOT the same as the variance of Y. 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Unbiasedness of the least squares estimators 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Confidence intervals of the regression coefficients Pivotal quantities 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Hypothesis tests for regression coefficients 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Simple linear regression using R Useful material Chapter 11 of Introduction to Probability and Statistics Using R (G. J. Kerns) is highly recommended. http://www.montefiore.ulg.ac.be/~kvansteen/GBIO0009-1/ac20092010/Class8/Using%20R%20for%20linear%20regression.pdf 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Defining linear regression models 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Conducting regression lm(y~model) 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Other useful commands 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

For prediction (x values not observed) 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Graphing the Confidence and Prediction Bands You may want to change it. For example, data.frame(x=seq(20,30,by=0.5)) 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Confidence and prediction intervals Line of prediction. It represents the estimated conditional expectation of y given x. 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Multiple regression The following slides are provided for your reference only. Due to the time constraint, they will not be covered in this class. 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Now let’s consider fitting a linear function of several variables Now let’s consider fitting a linear function of several variables. Suppose that we have the following data set: 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

The Linear Regression Model 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Covariance and Correlation Coefficient Suppose we have observed the following data. We wish to measure both the direction and the strength of the relationship between Y and X. 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

The Analysis of Variance (ANOVA) 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Given X, Y’s are independent normal random variables, i.e., The residual sum of squares (or sum of squared errors, SSE) is expressed by 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

The total sum of squares corrected for the mean is referred to as the total variation. This total variation is split up in two parts: the regression part (SSRm) “explained by the model”, and the residual part (SSE). 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

The ratio is known as the coefficient of determination. If the coefficient of determination is large then the model provides a good fit to the data. It also represents the part of the total variation which is explained by the model. 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Properties of the Estimators 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Confidence Intervals 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

The 100(1 – )% confidence interval of 2 is 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

(n–p) degree of freedom. However, the true value of  is unknown, the above equation can not be used to establish the confidence interval of . We then use s to substitute  and it is known that has a t-distribution with (n–p) degree of freedom. 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

The 100(1 – )% confidence interval of is 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Example 1 A scientist carries out an experiment on the relationship between the yield Y of a crop and the amount of irrigation water X. It is believed that the relationship between expected yield and amount of irrigation water (ignore the units) can be described adequately as 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

The data shown in the following table were collected in the field. 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Example 2 Data in the following table are rainfall (x) and runoff (y) measured during the rainy season in a study area. A regression model is postulated for the above data 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

Test of Hypotheses 2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU

2017/3/27 Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering, NTU