Lecture 2 Estimating the population mean


1 Lecture 2 Estimating the population mean

2 On submitting the exercises
Submit the exercises as a single document. Include your name and student number in every submission. If you do the exercises with pen & paper, scanning the papers is fine as long as the quality is sufficient.

3 Estimating the population mean
Why spend time on estimating the mean? Highlights the difference between an object and an estimate of it. Helps to appreciate the Law of Large Numbers. Helps to appreciate the Central Limit Theorem.

4 Basic concepts: population; sample, random sampling; random variable; distribution; moments.

5 Population The group or collection of all possible entities of interest (“students in the emetrics class of Otto”). In the developments here, think of a very large (infinitely large) population.

6 Sample A subset of the population (“all students in the $n$-th row/column of the class”).

7 Random sampling Random sampling: each object in the population (“student”) has the same probability of being selected into the sample. Any two objects give no information about each other:  Independently distributed. Before being chosen, they are in expectation equal:  Identically distributed.

8 Random variable Numerical summary of a random outcome (“height of student”).

9 Distribution
All the values that the variable, say Y, may take, plus the probability of getting each of those values. Example: coin tosses, lottery numbers, height of students in Otto’s class. Conditional distribution: the distribution of Y conditional on another variable, say X (“height of students in Otto’s class, conditional on gender”).

10 Moments
How to describe a distribution? The first moment: the mean, $E[Y] = \mu_Y$. Conditional first (and higher) moments: $E[Y|X] = \mu_{Y|X}$. Higher moments: variance, skewness, kurtosis, ...
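
A small Python sketch of a conditional first moment, using made-up heights Y and a hypothetical binary grouping variable X (e.g. two groups of students):

```python
import numpy as np

# Made-up data for illustration: heights Y (cm) and a conditioning variable X.
Y = np.array([168, 173, 171, 190, 165, 164, 186, 184], dtype=float)
X = np.array([0,   0,   0,   0,   1,   1,   1,   1])

# Unconditional first moment E[Y] and the conditional means E[Y | X = x].
print("E[Y]     =", Y.mean())
print("E[Y|X=0] =", Y[X == 0].mean())
print("E[Y|X=1] =", Y[X == 1].mean())
```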

11 HS, today's paper (Helsingin Sanomat)

12

13 Estimating the mean of a population
Estimator = function of a sample of data drawn randomly from the population. Estimate = numerical value of the estimator, given a particular sample.

14 Estimating the mean of a population
Population mean: $\mu_Y = \frac{1}{N}\sum_{i=1}^{N} Y_i$. Sample mean: $\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$, with $n < N$. $\bar{Y}$ is a natural estimate of $\mu_Y$.
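
A minimal simulation sketch of the distinction between the two objects: a finite population with mean $\mu_Y$, and the sample mean $\bar{Y}$ computed from one random sample of size n < N (the population values are simulated, not the class data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated finite population of N heights (cm); the parameters are illustrative.
N = 10_000
population = rng.normal(loc=174, scale=9, size=N)
mu_Y = population.mean()            # population mean: the object of interest

# One random sample of size n < N; its average is the estimate of mu_Y.
n = 50
sample = rng.choice(population, size=n, replace=False)
Y_bar = sample.mean()

print(f"population mean mu_Y = {mu_Y:.2f}")
print(f"sample mean    Y_bar = {Y_bar:.2f}")
```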

15 Estimating the mean of a population
Two questions: 1. What are the properties of $\bar{Y}$? 2. Why use $\bar{Y}$ and not some other estimator?

16 Properties of $\bar{Y}$: $\bar{Y}$ is a random variable.
Its properties are determined by the sampling distribution (“otantajakauma”). The individual observations which are used to calculate $\bar{Y}$ were chosen randomly.

17 Properties of $\bar{Y}$: $\bar{Y}$ is random.
Q: what happens if you take a different random sample? The distribution of $\bar{Y}$ over different samples of the same size (n) is called the sampling distribution.

18 Properties of $\bar{Y}$ Sampling distribution: all the values that $\bar{Y}$ can take for a given sample size n, plus the probability of each of these values. The mean and variance of $\bar{Y}$ are the mean and variance of its sampling distribution. The sampling distribution is very important.

19 Properties of $\bar{Y}$ If $E[\bar{Y}] = \mu_Y$, then $\bar{Y}$ is an unbiased (harhaton) estimator of $\mu_Y$ (the same definition applies to any estimator $\hat{\mu}_Y$ of $\mu_Y$). If $\bar{Y} \to \mu_Y$ as $n \to \infty$, then $\bar{Y}$ is a consistent (tarkentuva) estimator of $\mu_Y$. This is the case, under certain conditions, due to the Law of Large Numbers (“suurten lukujen laki”).

20 Law of Large Numbers: conditions
$Y_i$ are independently and identically distributed. $E[Y_i] = \mu_Y$. No large outliers: $\mathrm{var}(Y_i) = \sigma_Y^2 < \infty$.
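
A small simulation sketch of the Law of Large Numbers under these conditions; the i.i.d. draws below are simulated normal heights with an arbitrary mean and variance:

```python
import numpy as np

rng = np.random.default_rng(0)

# i.i.d. draws with E[Y_i] = mu_Y and finite variance, as the LLN requires.
mu_Y = 174.0
Y = rng.normal(loc=mu_Y, scale=9, size=100_000)

# The sample mean approaches mu_Y as the sample size n grows.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>7}: Y_bar = {Y[:n].mean():.3f}")
```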

21 Properties of $\bar{Y}$ How precise is $\bar{Y}$, and how does this depend on n?
In other words, how large is the variance of $\bar{Y}$? Central Limit Theorem (”Keskeinen raja-arvolause”).

22 Central Limit Theorem Suppose the sample is random and i.i.d.
$E[Y_i] = \mu_Y$, $\mathrm{var}(Y_i) = \sigma_Y^2$, with $0 < \sigma_Y^2 < \infty$. Then, as $n \to \infty$, the distribution of $(\bar{Y} - \mu_Y)/\sigma_{\bar{Y}}$, where $\sigma_{\bar{Y}} = \sigma_Y/\sqrt{n}$, becomes arbitrarily well approximated by the standard normal distribution.

23 Central Limit Theorem CLT is about the distribution of the estimate of the mean. CLT applies no matter what the underlying distribution is. Examples: coin tosses (binary), age (only positive values / integers observed), …
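
A small simulation sketch of the CLT for a binary (coin-toss) underlying distribution; the success probability, sample size, and number of replications are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Underlying distribution: Bernoulli(0.3) "coin tosses", far from normal.
p = 0.3
mu_Y = p
sigma_Y = np.sqrt(p * (1 - p))

# For many samples of size n, compute the standardized sample mean
# (Y_bar - mu_Y) / (sigma_Y / sqrt(n)); by the CLT it is ~ N(0, 1) for large n.
n, reps = 200, 50_000
samples = rng.binomial(1, p, size=(reps, n))
z = (samples.mean(axis=1) - mu_Y) / (sigma_Y / np.sqrt(n))

print("mean of z:   ", round(z.mean(), 3))    # close to 0
print("sd of z:     ", round(z.std(), 3))     # close to 1
print("P(z <= 1.96):", (z <= 1.96).mean())    # close to 0.975
```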

24 Properties of $\bar{Y}$ Result: $E[\bar{Y}] = \mu_Y$ and $\mathrm{var}(\bar{Y}) = \sigma_Y^2/n$.
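
A quick simulated check of these two results, using an arbitrary illustrative value for $\sigma_Y$ and several sample sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
mu_Y, sigma_Y = 174.0, 9.0   # illustrative population mean and standard deviation

# Simulate many samples of each size n; the variance of their sample means
# should be close to sigma_Y^2 / n, and their average close to mu_Y.
for n in (4, 25, 100):
    means = rng.normal(loc=mu_Y, scale=sigma_Y, size=(20_000, n)).mean(axis=1)
    print(f"n = {n:>3}: mean of Y_bar = {means.mean():.2f}, "
          f"var(Y_bar) = {means.var():.2f}, sigma^2/n = {sigma_Y**2 / n:.2f}")
```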

25 Height of students in class
Heights (cm); columns = groups 1–7, rows = individuals within each group:
168 165 172 170 174 160
173 164 178 177
171 186 180
190 184 185 175

26 Data in a histogram

27 Same data, with some statistics
Heights (cm); columns = groups 1–7, rows = individuals, with row AVG and SD:
168 165 172 170 174 160 | AVG 168.43, SD 4.69
173 164 178 177 | AVG 170.29, SD 6.55
171 186 180 | AVG 174.86, SD 9.25
190 184 185 175 | AVG 182.29, SD 8.06
Group AVG: 174.75 171.75 172.25 173.75 178.00 174.50 172.75 (overall AVG 173.96)
Group SD: 10.24 9.00 10.21 9.32 5.89 11.12 11.24 (overall SD 8.80)

28 Seven samples of size 4

29

30 10 samples of size 4; 100 samples of size 4

31 $\bar{Y}$ as a least squares estimator
$\bar{Y}$ minimizes the sum of squared residuals: $\min_m \sum_{i=1}^{n} (Y_i - m)^2$. Optimizing (see App. 3.2) yields $\hat{m} = \frac{1}{n}\sum_{i=1}^{n} Y_i = \bar{Y}$.
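
A small numeric sketch of this result on made-up data: a grid search over candidate values m gives (up to grid spacing) the same minimizer of the sum of squared residuals as the sample mean:

```python
import numpy as np

# Made-up data (heights in cm).
Y = np.array([168, 173, 171, 190, 165, 186], dtype=float)

# SSR(m) = sum_i (Y_i - m)^2, evaluated on a fine grid of candidate values m.
grid = np.linspace(Y.min(), Y.max(), 10_001)
ssr = ((Y[None, :] - grid[:, None]) ** 2).sum(axis=1)

m_hat = grid[ssr.argmin()]
print("grid minimizer of SSR:", round(m_hat, 2))
print("sample mean Y_bar:    ", round(Y.mean(), 2))   # the two agree
```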

32 $\bar{Y}$ as a least squares estimator
$\bar{Y}$ has the smallest variance among all linear unbiased estimators. ⇒ $\bar{Y}$ is more efficient than other (linear) estimators. ⇒ $\bar{Y}$ is BLUE (best linear unbiased estimator).

33 Choosing an objective / loss function
Least squares. Absolute deviations. Min/max. The choice may depend on context: think of a basketball team; think of the number of incubators relative to need.
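
A small sketch comparing the minimizers of these loss functions on made-up data with one very tall player (the basketball-team idea): squared loss is minimized at the mean, absolute deviations at the median, and the min/max criterion at the midrange:

```python
import numpy as np

# Made-up heights (cm) with one tall outlier.
Y = np.array([168, 170, 172, 174, 205], dtype=float)

grid = np.linspace(160, 210, 50_001)
dev = Y[None, :] - grid[:, None]

sq_loss  = (dev ** 2).sum(axis=1)       # least squares
abs_loss = np.abs(dev).sum(axis=1)      # absolute deviations
max_loss = np.abs(dev).max(axis=1)      # min/max criterion

print(f"squared loss minimizer:  {grid[sq_loss.argmin()]:.1f} (mean     = {Y.mean():.1f})")
print(f"absolute loss minimizer: {grid[abs_loss.argmin()]:.1f} (median   = {np.median(Y):.1f})")
print(f"max loss minimizer:      {grid[max_loss.argmin()]:.1f} (midrange = {(Y.min() + Y.max()) / 2:.1f})")
```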

34 Comparing means Two means, $\bar{Y}_1$ and $\bar{Y}_2$ (height of male / female students). Are they (not) different? ⇒ Is $\bar{Y}_1 - \bar{Y}_2 = 0$? What else do you know? You have an estimate of the variances of the means.

35 Comparing means $\bar{Y}_1$ and $\bar{Y}_2$ are independently distributed.
⇒ Their difference is (approximately) normally distributed. ⇒ The variance of $\bar{Y}_1 - \bar{Y}_2$ is $\sigma_1^2/n_1 + \sigma_2^2/n_2$.
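
A minimal sketch of this comparison on simulated data: the difference of the two sample means, its estimated standard error $\sqrt{s_1^2/n_1 + s_2^2/n_2}$, and the resulting t-statistic (group sizes and parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated samples: heights (cm) of two independent groups of students.
Y1 = rng.normal(loc=167, scale=7, size=40)
Y2 = rng.normal(loc=179, scale=8, size=35)

diff = Y1.mean() - Y2.mean()
# Estimated variance of (Y1_bar - Y2_bar) for independent samples: s1^2/n1 + s2^2/n2.
se = np.sqrt(Y1.var(ddof=1) / len(Y1) + Y2.var(ddof=1) / len(Y2))

print(f"difference of means: {diff:.2f}")
print(f"standard error:      {se:.2f}")
print(f"t-statistic:         {diff / se:.2f}")
```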

36 Female height distribution; male height distribution

37

