ELEC 303 – Random Signals Lecture 18 – Classical Statistical Inference, Dr. Farinaz Koushanfar ECE Dept., Rice University Nov 4, 2010.


Lecture outline
Reading:
– Confidence intervals
  – Central limit theorem
  – Student t-distribution
– Linear regression

Confidence interval
Consider an estimator Θ̂_n for an unknown parameter θ
We fix a confidence level 1−α
For every θ, we replace the single point estimator with a lower estimator Θ̂_n⁻ and an upper estimator Θ̂_n⁺ such that
P(Θ̂_n⁻ ≤ θ ≤ Θ̂_n⁺) ≥ 1−α
We call [Θ̂_n⁻, Θ̂_n⁺] a 1−α confidence interval

Confidence interval – example
Observations X_i are i.i.d. normal with unknown mean θ and known variance σ²
The sample mean Θ̂_n = (X_1 + … + X_n)/n is then normal with mean θ and variance σ²/n
Let α = 0.05
Find the 95% confidence interval:
[Θ̂_n − 1.96 σ/√n, Θ̂_n + 1.96 σ/√n]
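A minimal sketch of this computation. The readings and the "known" σ below are made-up illustration values, not data from the lecture; 1.96 is the standard 97.5th normal percentile.

```python
import math

# Sketch of the slide's setup: X_i i.i.d. normal with unknown mean theta
# and KNOWN variance sigma^2. The data and sigma are made up for illustration.
data = [2.1, 1.9, 2.4, 2.0, 1.8, 2.3, 2.2, 2.1]
sigma = 0.2          # assumed known standard deviation
n = len(data)
z = 1.96             # Phi(1.96) = 0.975, so P(|Z| <= 1.96) = 0.95

sample_mean = sum(data) / n
half_width = z * sigma / math.sqrt(n)
ci = (sample_mean - half_width, sample_mean + half_width)
print(sample_mean, ci)
```

Note that the interval width depends only on the known σ and on n, not on the spread of the observed sample.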

Confidence interval (CI)
Wrong: "the true parameter lies in the CI with 95% probability"
Correct interpretation:
– Suppose θ is fixed
– We construct a CI many times, using the same statistical procedure: each time we obtain a collection of n observations and construct the corresponding CI
– About 95% of these CIs will include θ

A note on the Central Limit Theorem (CLT)
Let X_1, X_2, …, X_n be a sequence of n independent and identically distributed RVs with finite mean µ and variance σ² > 0
CLT: as the sample size n increases, the PDF of the sample average of the RVs approaches N(µ, σ²/n), irrespective of the shape of the original distribution

CLT
[Figure: a probability density function, and the densities of sums of two, three, and four such variables, approaching the normal shape]

CLT
Let S_n = X_1 + … + X_n be the sum of the n random variables, and define the standardized RV
Z_n = (S_n − nµ)/(σ√n)
The distribution of Z_n converges to N(0,1) as n → ∞ (this is convergence in distribution)
In terms of CDFs: lim_{n→∞} P(Z_n ≤ z) = Φ(z) for every z
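A quick Monte Carlo sketch of this convergence. The choice of Uniform(0,1) summands and the sample sizes are illustrative assumptions; any distribution with finite variance would do.

```python
import math
import random

# Standardize the sum of n i.i.d. Uniform(0,1) variables and check that
# Z_n = (S_n - n*mu) / (sigma*sqrt(n)) has mean ~0 and variance ~1.
random.seed(0)
n = 30                                 # summands per draw
mu, sigma = 0.5, math.sqrt(1 / 12)     # mean and std of Uniform(0,1)

def z_n():
    s = sum(random.random() for _ in range(n))
    return (s - n * mu) / (sigma * math.sqrt(n))

samples = [z_n() for _ in range(20000)]
mean = sum(samples) / len(samples)
var = sum((z - mean) ** 2 for z in samples) / len(samples)
print(mean, var)   # both should be close to 0 and 1
```

A histogram of `samples` would reproduce the bell shape sketched in the figure above.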

Confidence interval approximation
Suppose the observations X_i are i.i.d. with mean θ and variance σ², both unknown
Estimate the mean and the (unbiased) variance:
Θ̂_n = (X_1 + … + X_n)/n,  Ŝ_n² = (1/(n−1)) Σ_i (X_i − Θ̂_n)²
We may estimate the variance σ²/n of the sample mean by Ŝ_n²/n
For any given α, we may use the CLT to approximate the confidence interval:
[Θ̂_n − z Ŝ_n/√n, Θ̂_n + z Ŝ_n/√n]
From the normal table, z = Φ⁻¹(1 − α/2); e.g., z = 1.96 for α = 0.05

Confidence interval approximation
Two different approximations are in effect:
– Treating the sample mean as if it were a normal RV
– Replacing the true variance by the variance estimated from the sample
Even in the special case where the X_i's are i.i.d. normal, the variance is an estimate, and the RV T_n below is not normally distributed:
T_n = √n (Θ̂_n − θ)/Ŝ_n

t-distribution
For normal X_i, it can be shown that the PDF of T_n does not depend on θ or σ
This is called the t-distribution with n−1 degrees of freedom

t-distribution
It is also symmetric and bell-shaped (like the normal)
The probabilities of various intervals are available in tables
When the X_i's are normal and n is relatively small, a more accurate CI is
[Θ̂_n − t_{n−1, 1−α/2} Ŝ_n/√n, Θ̂_n + t_{n−1, 1−α/2} Ŝ_n/√n]
where t_{n−1, 1−α/2} is the (1−α/2)-percentile of the t-distribution with n−1 degrees of freedom

Example
The weight of an object is measured 8 times using an electronic scale
The scale reports the true weight plus a random error ~N(0, σ²), with σ unknown
Measurements: 0.5547, 0.5404, 0.6364, 0.6438, 0.4917, 0.5674, 0.5564, 0.6066
Compute the 95% confidence interval using the t-distribution
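The computation can be sketched as follows; the t percentile t_{7, 0.975} ≈ 2.365 comes from standard tables, and the unbiased sample standard deviation plays the role of Ŝ_n.

```python
import math
from statistics import mean, stdev

# The eight scale readings from the slide; sigma is unknown, so we use the
# sample standard deviation and the t-distribution with n-1 = 7 d.o.f.
data = [0.5547, 0.5404, 0.6364, 0.6438, 0.4917, 0.5674, 0.5564, 0.6066]
n = len(data)
t_975_7 = 2.365          # 97.5th percentile of t with 7 d.o.f. (from tables)

xbar = mean(data)        # sample mean
s = stdev(data)          # unbiased sample standard deviation
half_width = t_975_7 * s / math.sqrt(n)
ci = (xbar - half_width, xbar + half_width)
print(xbar, ci)
```

With only 8 observations, the t percentile (2.365) is noticeably larger than the normal percentile (1.96), so the t-based interval is wider, reflecting the extra uncertainty from estimating σ.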

Linear regression
Builds a model of the relation between two or more variables of interest
Consider two variables x and y, and a collection of data points (x_i, y_i), i = 1, …, n
Assume the scatter plot of these two variables shows a systematic, approximately linear relationship between x_i and y_i
It is natural to build a model: y ≈ θ₀ + θ₁x

Linear regression
Often we cannot expect an exact fit; instead, we estimate the parameters θ̂₀ and θ̂₁
The i-th residual is: y_i − (θ̂₀ + θ̂₁ x_i)

Linear regression
The parameters are chosen to minimize the sum of the squared residuals
Σ_i (y_i − θ₀ − θ₁x_i)²
Always keep in mind that the postulated model may not be true
To perform the minimization, we set the partial derivatives with respect to θ₀ and θ₁ to zero

Linear regression
Given n data pairs (x_i, y_i), the estimates that minimize the sum of the squared residuals are
θ̂₁ = Σ_i (x_i − x̄)(y_i − ȳ) / Σ_i (x_i − x̄)²,  θ̂₀ = ȳ − θ̂₁ x̄
where x̄ = (1/n) Σ_i x_i and ȳ = (1/n) Σ_i ȳ_i
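The closed-form estimates above can be sketched directly; the (x_i, y_i) pairs below are made up for illustration (they lie near the line y = 2x).

```python
# Least-squares estimates from the slide's formulas, on a small made-up
# data set that lies close to the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(xs)

xbar = sum(xs) / n
ybar = sum(ys) / n
# theta1_hat = sum (x_i - xbar)(y_i - ybar) / sum (x_i - xbar)^2
theta1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) \
         / sum((x - xbar) ** 2 for x in xs)
# theta0_hat = ybar - theta1_hat * xbar
theta0 = ybar - theta1 * xbar
print(theta0, theta1)
```

The fitted slope comes out near 2 and the intercept near 0, as expected from how the data were constructed.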

Example
The Leaning Tower of Pisa continuously tilts
Measurements between …
Find the linear regression

Solution

Justification of least squares
– Maximum likelihood
– Approximation of Bayesian linear LMS (under a possibly nonlinear model)
– Approximation of Bayesian LMS estimation (under a linear model)

Maximum likelihood justification
Assume the x_i's are given numbers and the y_i's are realizations of RVs Y_i, where
Y_i = θ₀ + θ₁ x_i + W_i
and the W_i's are i.i.d. ~N(0, σ²)
The likelihood function has the form
f(y; θ) = Π_i (1/√(2πσ²)) exp(−(y_i − θ₀ − θ₁x_i)² / (2σ²))
so ML is equivalent to minimizing the sum of squared residuals

Approximate Bayesian linear LMS
Assume the x_i and y_i are realizations of RVs X_i and Y_i, where the pairs (X_i, Y_i) are i.i.d. with unknown joint PDF
Assume an additional independent pair (X₀, Y₀)
We observe X₀ and want to estimate Y₀ by a linear estimator of the form
Ŷ₀ = θ₀ + θ₁ X₀

Approximate Bayesian LMS
For the previous scenario, make the additional assumption of a linear model
Y_i = θ₀ + θ₁ X_i + W_i
where the W_i's are i.i.d. ~N(0, σ²), independent of X_i
We know that E[Y₀ | X₀] minimizes the mean squared estimation error, and under this model E[Y₀ | X₀] = θ₀ + θ₁ X₀
As n → ∞, the least-squares estimates converge to θ₀ and θ₁

Multiple linear regression
Many phenomena involve multiple underlying variables, also called explanatory variables
Such models are called multiple regression
E.g., for a triplet of data points (x_i, y_i, z_i) we wish to estimate the model y ≈ θ₀ + θ₁x + θ₂z
Minimize: Σ_i (y_i − θ₀ − θ₁x_i − θ₂z_i)²
In general, we can consider the model y ≈ θ₀ + Σ_j θ_j h_j(x)
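A sketch of the two-explanatory-variable case, assuming NumPy is available. The data are generated from known parameters θ = (1, 2, −0.5) plus small noise, so we can check that least squares recovers them.

```python
import numpy as np

# Multiple regression y ~ theta0 + theta1*x + theta2*z via least squares.
# The synthetic data are an illustration, not from the lecture.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)
z = rng.uniform(0, 10, n)
y = 1.0 + 2.0 * x - 0.5 * z + rng.normal(0, 0.1, n)   # true theta = (1, 2, -0.5)

A = np.column_stack([np.ones(n), x, z])   # design matrix: rows [1, x_i, z_i]
theta, *_ = np.linalg.lstsq(A, y, rcond=None)
print(theta)
```

The same code handles the general model y ≈ θ₀ + Σ_j θ_j h_j(x): simply stack the columns h_j(x) in the design matrix.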

Nonlinear regression
Sometimes the model is nonlinear in the unknown parameter θ
Variables x and y obey the form y ≈ h(x; θ)
Minimize Σ_i (y_i − h(x_i; θ))²
The minimization typically has no closed-form solution
Assuming Y_i = h(x_i; θ) + W_i with W_i i.i.d. ~N(0, σ²), the ML function is again maximized by minimizing the sum of squared residuals
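Since there is no closed form, the minimization must be done numerically. The sketch below uses a crude grid search (a stand-in for the iterative solvers used in practice) on an assumed model h(x; θ) = exp(θx) with made-up, noise-free data.

```python
import math

# Nonlinear least squares for y ~ h(x; theta) = exp(theta * x),
# minimized by grid search over theta (illustration only).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
theta_true = 0.8
ys = [math.exp(theta_true * x) for x in xs]   # noise-free for illustration

def ssr(theta):
    """Sum of squared residuals for a candidate theta."""
    return sum((y - math.exp(theta * x)) ** 2 for x, y in zip(xs, ys))

candidates = [i / 1000 for i in range(0, 2001)]   # theta in [0, 2], step 0.001
theta_hat = min(candidates, key=ssr)
print(theta_hat)
```

With noisy data the recovered θ̂ would only approximate θ, and in real problems one would use a gradient-based or Gauss–Newton-type solver instead of a grid.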

Practical considerations
– Heteroskedasticity
– Nonlinearity
– Multicollinearity
– Overfitting
– Causality