# Constructing Confidence Intervals Based on Register Statistics


Thomas Laitila, Statistics Sweden and Örebro University (thomas.laitila@scb.se). Presentation at Q2014, Vienna, June 2-5, 2014.

Outline

- Why: Why confidence intervals?
- Criteria: Criteria on measures of uncertainty of register statistics
- CIm: Confidence Image, a new tool
- Example
- Discussion

Why: Chatterjee (2003)

- There are two methods for deriving statements: deduction and induction.
- Statistics provides a method for inductive inference.

Why: Induction

[Diagram with the labels: Assumptions, Evidence (Statistics), Area of concern, Statement.]

Why: Induction and Evidence

- All evidence comes with uncertainty about the general case.
- Statements derived by induction are uncertain.
- Example:
  - Inductive statement: A man will inevitably die.
  - Evidence: No man born more than 150 years ago is still alive.

Why: Why is statistical inference so special?

- Statistics is the only theory that provides objective measures of the uncertainty of inductive inference.
- Objective measures of uncertainty are essential in official statistics.

Why: Summing up

- Register statistics are uncertain.
- Statistical inference provides objective measures of uncertainty.
- Inference on register statistics should be founded in statistical inference theory.
- Do we have appropriate statistical tools? Yes, and no.
- No existing tool fulfills reasonable criteria.

Criteria: Criteria on a measure

- a) Founded within statistical inference theory: interpretable and objective measures.
- b) Easy for users to interpret: how easy is the interpretation of an ordinary confidence interval?
- c) Of low cost.
- d) Comparable with measures in sample surveys: comparability/coherency.

Confidence Image (CIm): Laitila (2014)

- Idea: Use external information to restrict the potential values of the study variables (y_1, y_2, …, y_N).
  - This restricts the potential values of the population parameter of interest, t = f(y_1, y_2, …, y_N).
  - The more information, the more t is restricted.
- Information can come in any form, as long as it comes with a measure of uncertainty.
- We can use registers, sample surveys, old statistics, big data, Google, Facebook, whatever!
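One way to write this idea down is sketched below. It is an illustrative reading, not the formal definition in Laitila (2014): the set $\Omega$, the level $\alpha$, and the min/max enclosure are symbols assumed here. Suppose the available information places the vector of study variables in a set $\Omega$ with confidence at least $1-\alpha$ (the probability refers to the procedure that produced $\Omega$). Then t lies in the image of $\Omega$ under f with at least the same confidence, and that image is enclosed by the smallest and largest values of f over $\Omega$, assuming both exist.

```latex
\[
\Pr\bigl( (y_1,\dots,y_N) \in \Omega \bigr) \ge 1-\alpha
\;\Longrightarrow\;
\Pr\bigl( t \in f(\Omega) \bigr) \ge 1-\alpha,
\qquad
f(\Omega) \subseteq \Bigl[\, \min_{y\in\Omega} f(y),\ \max_{y\in\Omega} f(y) \Bigr].
\]
```

With information that is certain, $\alpha = 0$ and the enclosure holds with 100% confidence. Adding information shrinks $\Omega$ and therefore can only narrow the interval.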

Example: Estimation of the total number of cattle on Swedish farms

| County | No. of units | No. of missing values | Sum of y_k |
|---|---|---|---|
| 1 | 18713 | 3817 | 393797 |
| 2 | 14321 | 2918 | 296944 |
| 3 | 12281 | 2475 | 261832 |
| 4 | 10836 | 2213 | 216535 |
| 5 | 8646 | 1763 | 185285 |
| 6 | 7233 | 1485 | 148029 |
| Total | 72030 | 14671 | 1502422 |

Table 1: Information¹ in the available register on farms (N = 72030)

¹ No measurement or coverage errors in the register.

Problem: Estimate the total number of cattle with an interval estimate using the information in the register, which contains missing values.

Example: Pieces of information

- A1: Available data in the register.
- A2: The 100 largest farms are in the register, and the number of cattle on the 100th largest farm is 553.
- A3: Number of farms with ≥ 100 cattle (Table 2).
- A4: A 95% CI for the proportion of farms with zero cattle: 0.60 to 0.71.

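Example: Table 2

| County | In register: y_k=0 | In register: y_k>=553 | In register: y_k>=100 | In population: y_k>=100 |
|---|---|---|---|---|
| 1 | 9108 | 29 | 1252 | 1288 |
| 2 | 6989 | 17 | 931 | 959 |
| 3 | 5960 | 21 | 784 | 800 |
| 4 | 5329 | 12 | 677 | 701 |
| 5 | 4196 | 10 | 581 | 601 |
| 6 | 3565 | 11 | 467 | 477 |
| Total | 35147 | 100 | 4692 | 4826 |

Table 2: Additional information (number of units)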

Example: Calculated CIms

| Information used | Confidence level | Lower bound | Upper bound |
|---|---|---|---|
| A1 - A2 | 100% | 1502 | 9615 |
| A1 - A3 | 100% | 1516 | 3016 |
| A1 - A4 | 95% | 1516 | 2217 |

Table 3: Confidence intervals for the total number of cattle based on information sets A1 - A4. (Thousands of cattle; true value 1.56 million.)
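The slides report the bounds in Table 3 without the arithmetic behind them. The Python sketch below is one reading that reproduces the reported figures from Tables 1 and 2; it is not taken from Laitila (2014), and all function and constant names are hypothetical. It assumes that cattle counts are integers (so a farm with fewer than 100 cattle has at most 99), that every farm with a missing value has at most 553 cattle because the 100 largest farms are all observed (A2), and that the end points of the A4 interval can be multiplied by N to bound the number of zero-cattle farms.

```python
import math
from fractions import Fraction

# Figures taken from Tables 1 and 2 and from A2 and A4.
N = 72030                 # farms in the register
N_MISSING = 14671         # farms with a missing cattle count
OBSERVED_TOTAL = 1502422  # sum of the reported counts
CAP = 553                 # cattle on the 100th largest farm (A2)
GE100_POP = 4826          # farms with >= 100 cattle, population (A3)
GE100_REG = 4692          # farms with >= 100 cattle, observed in register
ZERO_REG = 35147          # farms with zero cattle, observed in register
P_ZERO = (Fraction(60, 100), Fraction(71, 100))  # 95% CI from A4

# Assumption: the missing farms with >= 100 cattle are the population
# count minus the count observed in the register.
K_LARGE = GE100_POP - GE100_REG
N_SMALL = N_MISSING - K_LARGE  # missing farms with at most 99 cattle


def bounds_a1_a2():
    """Each missing value lies between 0 and CAP."""
    return OBSERVED_TOTAL, OBSERVED_TOTAL + N_MISSING * CAP


def bounds_a1_a3():
    """K_LARGE missing farms lie in [100, CAP]; the rest in [0, 99]."""
    lower = OBSERVED_TOTAL + K_LARGE * 100
    upper = OBSERVED_TOTAL + K_LARGE * CAP + N_SMALL * 99
    return lower, upper


def bounds_a1_a4():
    """Additionally bound how many small missing farms have zero cattle."""
    min_zero = min(N_SMALL, max(0, math.ceil(P_ZERO[0] * N) - ZERO_REG))
    max_zero = min(N_SMALL, max(0, math.floor(P_ZERO[1] * N) - ZERO_REG))
    # Small farms that are not zero hold between 1 and 99 cattle.
    lower = OBSERVED_TOTAL + K_LARGE * 100 + (N_SMALL - max_zero) * 1
    upper = OBSERVED_TOTAL + K_LARGE * CAP + (N_SMALL - min_zero) * 99
    return lower, upper


def in_thousands(bounds):
    """Round a (lower, upper) pair to thousands of cattle."""
    return tuple(round(b / 1000) for b in bounds)


if __name__ == "__main__":
    for label, fn in [("A1 - A2", bounds_a1_a2),
                      ("A1 - A3", bounds_a1_a3),
                      ("A1 - A4", bounds_a1_a4)]:
        print(label, in_thousands(fn()))
```

Under these assumptions the rounded results agree with Table 3: (1502, 9615), (1516, 3016) and (1516, 2217). The first two intervals use only information stated as certain, which matches their 100% level; the third inherits the 95% level of the A4 interval.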

Discussion

- The CIm can be generalized directly to multivariate cases.
- The CIm fulfills all four criteria listed above.
- Traditional confidence intervals are special cases of CIms.
- Any kind of information (data) can be used, as long as there is a probability measure of its certainty.
- The CIm is a theory; methodological development is still needed.

Thanks for your attention! To request the paper Laitila (2014), contact thomas.laitila@scb.se.
