ICCS 2009 IDB Seminar – Nov 24-26, 2010 – IEA DPC, Hamburg, Germany. Training Workshop on the ICCS 2009 Database: Weights and Variance Estimation

Presentation transcript:

ICCS 2009 IDB Seminar – Nov 24-26, 2010 – IEA DPC, Hamburg, Germany. Training Workshop on the ICCS 2009 database: Weights and Variance Estimation

Content of this presentation: Weights – what are they? Why do we need them? How do we use them? Standard errors – what are they? Why do we need them? How do we estimate them?

What are weights? Values assigned to all sampling units (students, teachers, schools). The weight of a sampled unit indicates the number of units in the population that are represented by this sampled unit.

Weights and selection probabilities Weights are based on the sample selection probabilities. ICCS sampling units had different selection probabilities: a high selection probability leads to a small weight, a low selection probability to a large weight. Weights are applied at each sampling stage: school sampling weights and within-school sampling weights.
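In standard sampling notation (a sketch added here, not shown on the slide itself), the base weight of a sampled unit is the inverse of its selection probability, and the final student weight is the product of the stage weights:

```latex
w_i^{\text{base}} = \frac{1}{\pi_i},
\qquad
w_i = w^{\text{school}}_{s(i)} \times w^{\text{within}}_{i}
```

where \pi_i is the selection probability of unit i and s(i) denotes the school of student i.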

Weights and non-participation adjustments Weights are adjusted for non-participation. Non-participation can happen at each level: school level and within-school level. Non-participation adjustments compensate for losses among specific groups of sampled units.

Why do we need weights? Weights allow conclusions to be drawn about the population based on information from the sample. Weights allow estimates of population parameters. Un-weighted data only allow conclusions about the sampled units.

Population → sample Example: in the population, 20% of the students are in private schools (red) and 80% are in public schools (blue). In the sample, 50% of the students are from private schools and 50% are from public schools.

Sample → estimate In order to estimate the correct proportion of students in the population, different weights must be assigned to the students in the sample.
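A minimal sketch of this private/public school example in Python. The population and sample counts below are illustrative numbers consistent with the 20%/80% and 50%/50% shares on the slides, not actual ICCS figures:

```python
# Illustrative sketch of the private/public school example from the slides.
# Assume 1,000 sampled students: 500 from private schools and 500 from public
# schools, although private schools hold only 20% of the student population.

population = {"private": 20_000, "public": 80_000}   # assumed population counts
sample     = {"private": 500,    "public": 500}      # assumed sample counts

# Weight per sampled student = number of population units represented by one sampled unit
weights = {group: population[group] / sample[group] for group in sample}

# Unweighted estimate: share of private-school students in the sample
unweighted = sample["private"] / sum(sample.values())

# Weighted estimate: each student counts as many times as their weight
weighted = (weights["private"] * sample["private"]) / sum(
    weights[g] * sample[g] for g in sample
)

print(f"Unweighted proportion of private-school students: {unweighted:.2f}")  # 0.50
print(f"Weighted proportion of private-school students:   {weighted:.2f}")    # 0.20
```

Weighting the sampled students recovers the correct population proportion of 20%, while the unweighted estimate simply reproduces the sample composition.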

Analyzing weighted data – a simple example with weights of 1:10 and 1:1

Un-weighted mean
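The formula on this slide was an image; as a sketch, the unweighted mean of n sampled values x_1, ..., x_n is:

```latex
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
```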

Weighted mean
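Again the slide's formula was an image; the standard weighted mean, with weights w_i, is:

```latex
\bar{x}_w = \frac{\sum_{i=1}^{n} w_i \, x_i}{\sum_{i=1}^{n} w_i}
```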

Weights in ICCS The ICCS data contain several weight variables: Total Student Weight (TOTWGTS), Total Teacher Weight (TOTWGTT), and Total School Weight (TOTWGTC). The IDB Analyzer automatically selects the correct weight.
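A minimal sketch (not the IDB Analyzer itself) of how such a weighted mean could be computed in Python with pandas. The file name and the achievement variable PV1CIV are assumptions about the ICCS 2009 student file layout; only TOTWGTS is taken from the slide:

```python
# Sketch: weighted vs. unweighted mean from an ICCS-style student file.
# "iccs2009_student_chl.sav" and "PV1CIV" are hypothetical placeholders.
import pandas as pd

# Requires the pyreadstat package for reading SPSS files.
df = pd.read_spss("iccs2009_student_chl.sav")   # hypothetical national student file

value  = df["PV1CIV"]    # assumed: first plausible value of civic knowledge
weight = df["TOTWGTS"]   # total student weight (named on the slide)

unweighted_mean = value.mean()
weighted_mean   = (value * weight).sum() / weight.sum()

print(f"Unweighted mean: {unweighted_mean:.1f}")
print(f"Weighted mean:   {weighted_mean:.1f}")
```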

ICCS example: civic knowledge in Chile Unweighted average versus weighted average of the civic knowledge score.

ICCS example (cont.) Difference: 10.8 score points. Reason for the difference: over-sampling of students in private schools, who make up 13.7% of the tested students but only 5.9% of the sum of weights.

What are standard errors? The standard error of a statistic is the standard deviation of the sampling distribution of that statistic. The sampling distribution is the distribution of the statistic over all possible samples of the same size drawn with the same method. Since no one can select all possible samples, the standard error can only be estimated.

Why do we need standard errors? The ICCS results are based on samples. All ICCS results are therefore estimates of unknown population values. Standard errors can be used to measure how close these estimates are to the real values.

Standard errors and confidence intervals Let ε stand for any statistic of interest (mean, percentage, ...). A 95% confidence interval is defined as the estimate plus or minus 1.96 standard errors. This is the black bar in Table 3.10 of the ICCS International Report. With a confidence of 95%, the true value lies within this interval. Take rounding into account!
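In formula form (the slide's own expression was an image; this is the standard 95% interval based on the normal approximation):

```latex
\hat{\varepsilon} - 1.96 \cdot SE(\hat{\varepsilon})
\;\le\; \varepsilon \;\le\;
\hat{\varepsilon} + 1.96 \cdot SE(\hat{\varepsilon})
```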

Standard errors and significance tests Whenever you see a funny little triangle in the ICCS international report, standard errors are hidden in the background.
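As an illustration of the kind of test behind those triangles (a generic comparison of two independent estimates, not necessarily the report's exact procedure), a difference is flagged as significant at the 5% level when:

```latex
\frac{\left|\hat{\varepsilon}_1 - \hat{\varepsilon}_2\right|}
     {\sqrt{SE(\hat{\varepsilon}_1)^2 + SE(\hat{\varepsilon}_2)^2}} > 1.96
```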

Estimating standard errors In a simple random sample, estimating the standard error of a statistic ε is easy: just divide the standard deviation of the sample (s) by the square root of the sample size (n). In a complex sample design like the one used in ICCS, it is not as easy to estimate the standard error as in a simple random sample.
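In formula form (the slide's own expression was an image; this is the standard simple-random-sample estimate described in the text):

```latex
\widehat{SE}(\hat{\varepsilon}) = \frac{s}{\sqrt{n}}
```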

Complex sample design – clustering Clustered sample: students within a school are likely to be more similar to each other than students from different schools, and similarly for teachers. Clustering usually decreases sampling precision.
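One common way to quantify this loss of precision, not given on the slide itself, is the design effect of a clustered sample, where b is the average number of sampled students per school and ρ the intraclass correlation; the effective sample size shrinks accordingly:

```latex
\text{deff} \approx 1 + (b - 1)\,\rho,
\qquad
n_{\text{eff}} = \frac{n}{\text{deff}}
```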

Complex sample design – strata and weights Stratification: students within a stratum are more similar to each other than students from different strata; the same holds for schools and teachers. Stratification usually increases sampling precision. Weights: using weights usually decreases sampling precision, and they complicate the calculations.

Why not just use SPSS? Standard software packages like SPSS will not provide correct estimates of standard errors. The software assumes that the data come from a simple random sample and therefore uses the wrong formula. In general, the resulting estimate will be too small.

Jackknife Repeated Replication Solution: Jackknife Repeated Replication (JRR), used for estimating standard errors in complex designs. Basic idea: systematically re-compute a statistic on a set of replicated samples, by setting the weights to zero for one school at a time while doubling the weights of another school. The standard error of the statistic is then estimated from the variability of that statistic between the full sample and the replicates.

The basics of the JRR method Jackknife variance estimation in ICCS: participating schools are paired according to the order in which they were sampled; these school pairs are called jackknife zones (JKZONES, JKZONET, JKZONEC). One school in each zone is randomly assigned an indicator of 1, the other school an indicator of 0 (JKREPS, JKREPT, JKREPC). This indicator decides whether a school gets its replicate weight doubled or set to zero.
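A minimal sketch of the JRR computation described above, assuming a pandas DataFrame with the weight, zone, and indicator variables named on the slide. The statistic here is a weighted mean, and the variance is the sum of squared deviations of the replicate estimates from the full-sample estimate (the formula commonly used in IEA studies; treat the details as an assumption rather than the official ICCS specification):

```python
import numpy as np
import pandas as pd

def weighted_mean(values, weights):
    """Weighted mean of a variable."""
    return np.sum(values * weights) / np.sum(weights)

def jrr_standard_error(df, var, weight="TOTWGTS", zone="JKZONES", rep="JKREPS"):
    """Jackknife Repeated Replication standard error of a weighted mean.

    For each jackknife zone, one replicate is formed by doubling the weights
    of the schools flagged with indicator 1 and zeroing the weights of the
    schools flagged with 0 in that zone. The sampling variance is the sum of
    squared differences between the replicate estimates and the full-sample
    estimate.
    """
    full_estimate = weighted_mean(df[var], df[weight])

    variance = 0.0
    for z in df[zone].unique():
        in_zone = df[zone] == z
        # Replicate weights: unchanged outside the zone; inside the zone,
        # doubled where the indicator is 1 and zeroed where it is 0.
        rep_weights = df[weight].where(~in_zone, df[weight] * 2 * df[rep])
        replicate_estimate = weighted_mean(df[var], rep_weights)
        variance += (replicate_estimate - full_estimate) ** 2

    return np.sqrt(variance)
```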

A look inside the IDB Analyzer

Standard error: SE_JRR(ε̂) = sqrt( Σ_{h=1}^{H} (ε̂_h − ε̂)² ), where ε̂_h is the statistic computed with the replicate weights of jackknife zone h, ε̂ is the full-sample estimate, and H is the number of jackknife zones.

ICCS example: teacher age in Chile The JRR standard error of the average teacher age in Chile – SPSS alone just can't do that.

Standard errors and plausible values For ICCS achievement data, the standard error consists of two components. Sampling error: this is what we just discussed. Additionally, measurement error: resulting from the use of plausible values. This is the topic of the next presentation.
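As a preview of how the two components are typically combined when M plausible values are used (the exact ICCS procedure is covered in the next presentation; this is the usual combination rule, stated here as an assumption):

```latex
V_{\text{total}} = V_{\text{sampling}} + \left(1 + \frac{1}{M}\right) V_{\text{measurement}},
\qquad
SE_{\text{total}} = \sqrt{V_{\text{total}}}
```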

Conclusion Use sampling weights! Compute standard errors using the JRR!

Thank you for your attention!