Published by Carlos Vega. Modified over 2 years ago.

1
A Flavour of Errors in Variables Modelling
Jonathan Gillard

2
Constructing the Model
We have two variables, $\xi$ and $\eta$, which are linearly related in the form $\eta = \alpha + \beta\xi$. Instead of observing the $n$ pairs $(\xi_i, \eta_i)$, we observe the $n$ data pairs $(x_i, y_i)$, where
$x_i = \xi_i + \delta_i$
$y_i = \eta_i + \varepsilon_i$
and it is assumed that $\delta_i$ and $\varepsilon_i$ are independent error terms having zero mean and variances $\sigma_\delta^2$ and $\sigma_\varepsilon^2$ respectively.
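To make the model concrete, the following minimal sketch simulates observed pairs from it. The function name `simulate_eiv`, the uniform distribution for the true values $\xi_i$, and the normal error distributions are illustrative assumptions; the slide only requires zero-mean independent errors with the stated variances.

```python
import random

def simulate_eiv(n, alpha, beta, sigma_delta, sigma_eps, seed=0):
    """Draw n observed pairs (x_i, y_i) from the errors-in-variables model.

    Assumptions not taken from the slides: true values xi are uniform on
    [0, 10] and both error terms are normally distributed.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        xi = rng.uniform(0.0, 10.0)            # unobserved true value
        eta = alpha + beta * xi                # exact linear relation
        x = xi + rng.gauss(0.0, sigma_delta)   # x_i = xi_i + delta_i
        y = eta + rng.gauss(0.0, sigma_eps)    # y_i = eta_i + eps_i
        pairs.append((x, y))
    return pairs

pairs = simulate_eiv(n=5, alpha=1.0, beta=2.0, sigma_delta=0.5, sigma_eps=0.5)
for x, y in pairs:
    print(round(x, 3), round(y, 3))
```

Setting both standard deviations to zero recovers points lying exactly on the line, which is a quick sanity check on the construction.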

3
Down's Syndrome
Affects 1 in 1000 children born in the UK. Down's syndrome is caused by the presence of an extra chromosome: an extra copy of chromosome 21 is included when the sperm and the egg combine to form the embryo. Screening tests are used to calculate the chance of a baby having the condition.

4
The Data Set

5
How can we fit a line?
There are clearly errors in both variables.
"To use standard statistical techniques of estimation to estimate $\beta$, one needs additional information about the variance of the estimators" – Madansky (1959)
We know the dating error is ±2 days, and this is enough information!

6
Method of Moments
"The method of moments has a long history, involves an enormous amount of literature, has been through periods of severe turmoil associated with its sampling properties compared to other estimation procedures, yet survives as an effective tool, easily implemented and of wide generality" – Bowman and Shenton

7
Method of Moments
"The maximum likelihood approach to estimation is primarily justified by asymptotic (as the sample size goes to infinity) considerations" – Cheng and Van Ness

8
Estimating the Parameters
As the dating error is ±2 days, we take $\sigma_\delta = 2$.
Use a modified y on x regression estimator: $\hat{\beta} = s_{xy} / (s_{xx} - \sigma_\delta^2)$.
Other parameters, such as the intercept $\alpha$, can be estimated from the method of moments equations.
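A minimal sketch of this corrected estimator follows. The function name `corrected_fit`, the divisor $n$ in the sample moments, and the intercept formula $\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$ (the first-moment equation) are assumptions made for illustration; the slides do not spell them out.

```python
def corrected_fit(xs, ys, sigma_delta):
    """Method-of-moments fit when the error s.d. in x is known.

    Sample moments use divisor n (an assumption; n - 1 is also common).
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n

    # Subtract the known error variance from the observed variance of x.
    denom = s_xx - sigma_delta ** 2
    if denom <= 0:
        raise ValueError("s_xx must exceed sigma_delta**2")

    beta_hat = s_xy / denom
    alpha_hat = y_bar - beta_hat * x_bar   # first-moment equation
    return alpha_hat, beta_hat

# Toy data, not the data set from the slides.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
alpha_hat, beta_hat = corrected_fit(xs, ys, sigma_delta=0.5)
print(alpha_hat, beta_hat)
```

With `sigma_delta=0` the function reduces to the ordinary y on x least squares fit, so the correction can be switched off for comparison. The guard on `denom` reflects that the estimator is unusable when the observed variance of x does not exceed the assumed error variance.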

9
Regression Lines

10
Typology of Residuals
What are residuals used for?
1. Prediction
2. Model checking
3. Leverage
4. Influence
5. Deletion

11
Estimating the true points
Two naive m.m.e.s of $\xi$:
The optimal linear combination is:
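The formulas for this slide are not present in the transcript. One plausible reading, offered purely as an assumption, is that the two naive estimators of $\xi_i$ are $x_i$ and $(y_i - \alpha)/\beta$, and that they are combined with the weight that minimises the variance of the combination. The sketch below implements that reading; the function name `estimate_true_x` is hypothetical, and it assumes $\sigma_\varepsilon$ is available, which the slides do not confirm.

```python
def estimate_true_x(x, y, alpha, beta, sigma_delta, sigma_eps):
    """Combine two naive estimates of the true value xi (assumed forms).

    from_x has error variance sigma_delta**2;
    from_y has error variance sigma_eps**2 / beta**2.
    The weight w is the minimum-variance choice for independent errors.
    Requires beta != 0 and at least one non-zero error s.d.
    """
    from_x = x
    from_y = (y - alpha) / beta
    w = sigma_eps ** 2 / (sigma_eps ** 2 + beta ** 2 * sigma_delta ** 2)
    return w * from_x + (1.0 - w) * from_y

# Toy values, not taken from the slides.
print(estimate_true_x(x=3.0, y=5.0, alpha=1.0, beta=2.0,
                      sigma_delta=1.0, sigma_eps=2.0))
```

Under this reading, a very noisy x (large `sigma_delta`) pushes the weight towards the estimate derived from y, and vice versa.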

12
The Estimated True Points

13
Estimated true against observed

14
A residual?
Attempt to write as a usual regression model: $y = \alpha + \beta x + (\varepsilon - \beta\delta)$
1. x is always random due to random error
2. $\mathrm{Cov}(x, \varepsilon - \beta\delta) = -\beta\sigma_\delta^2$
3. Using ordinary least squares estimates leads to inconsistent estimators
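The covariance in point 2 can be checked in a few lines. The derivation below assumes, beyond what the slide states, that $\xi$ is uncorrelated with both error terms; the symbol $\sigma_\xi^2$ for the variance of $\xi$ is introduced here for illustration and does not appear in the slides.

```latex
\begin{align*}
\operatorname{Cov}(x,\ \varepsilon - \beta\delta)
  &= \operatorname{Cov}(\xi + \delta,\ \varepsilon - \beta\delta) \\
  &= \operatorname{Cov}(\xi, \varepsilon) - \beta\operatorname{Cov}(\xi, \delta)
     + \operatorname{Cov}(\delta, \varepsilon) - \beta\operatorname{Var}(\delta) \\
  &= -\beta\sigma_\delta^2 .
\end{align*}
% Consequence for the ordinary least squares slope as n grows:
\[
\frac{s_{xy}}{s_{xx}} \;\longrightarrow\;
\frac{\beta\sigma_\xi^2}{\sigma_\xi^2 + \sigma_\delta^2} \neq \beta
\quad \text{whenever } \beta \neq 0 \text{ and } \sigma_\delta^2 > 0 .
\]
```

Because the regressor is correlated with the composite error term, the least squares slope is pulled towards zero, which is why subtracting $\sigma_\delta^2$ from $s_{xx}$ restores consistency.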

15
Residuals

16
Residuals again!

17
Questions?
