# Module B-4: Processing ICT survey data (Training course on the production of statistics on the information economy)


Module B-4: Processing ICT survey data. Training course on the production of statistics on the information economy. UNCTAD Manual, Chapter 7.

Module B4: Processing ICT survey data, UNCTAD

Objectives

After completing this module you will know how to do:

- Data processing
- Data weighting (grossing-up)
- Data editing
- Data analysis

Contents of this module

4. Data processing and analysis
   - 4.1 Data editing
   - 4.2 Data weighting
   - 4.3 Estimating ICT indicators

Data editing

What is editing?

Statistical information provided by businesses can contain errors such as:

- wrong or missing data,
- incorrect classifications,
- inconsistent or illogical responses.

Solutions to minimize such errors:

- Ex ante: optimize the effectiveness of data capture instruments and collection procedures.
- Ex post: apply robust data editing techniques.

B4.1. Data editing, page 82

Phases of data processing

- Raw data
- Quality controls during data collection and entry
- Clean data file
- Data editing, ranging from micro-editing (input) to macro-editing (output):
  - Treatment of internal errors and inconsistencies
  - Estimation of missing data
  - Outlier analysis
  - Re-weighting procedures
  - Editing of aggregates

B4.1. Data editing

Internal inconsistencies and errors

Validity control of an individual data item requires:

1. Defining a valid set of responses (in general, gender should be 0 or 1, age should not be 110 years, etc.; for ICT, use of the Internet by a business should be 0 or 1).
2. Checking questions against valid responses, with rules based on relationships between questions (see Box 15 of the Manual: some logical tests).
3. Running arithmetic checks during data entry or in batch mode (totals, subtotals, frequencies).

B4.3. Estimating ICT indicators, page 82
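The three kinds of check listed above can be sketched in a few lines of Python. This is only an illustration: the field names (`uses_internet`, `has_website`, `employees_male`, `employees_female`, `employees_total`) and the logical rule "a website implies Internet use" are assumptions made for the example, not items taken from the Manual or from Box 15.

```python
# Hypothetical questionnaire layout: all field names are illustrative.
VALID_VALUES = {
    "uses_internet": {0, 1},
    "has_website": {0, 1},
}


def check_record(record):
    """Return a list of problems found in one questionnaire (empty if clean)."""
    problems = []

    # 1. Validity: each coded answer must belong to its valid set.
    for field, allowed in VALID_VALUES.items():
        if record.get(field) not in allowed:
            problems.append(f"{field}: invalid value {record.get(field)!r}")

    # 2. Logical rule between questions (assumed rule, for illustration):
    #    a business that reports a website should also report Internet use.
    if record.get("has_website") == 1 and record.get("uses_internet") == 0:
        problems.append("has_website=1 but uses_internet=0")

    # 3. Arithmetic check: the subtotals must add up to the reported total.
    parts = record.get("employees_male", 0) + record.get("employees_female", 0)
    if parts != record.get("employees_total"):
        problems.append("employee subtotals do not sum to employees_total")

    return problems
```

Run at data entry, a function like this can flag a questionnaire immediately; run in batch mode over the whole file, it gives a list of records to send back for verification.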

Treatment of missing data

Final non-response (missing data) should be treated to avoid biased estimates.

Unit non-response treatment: corrective weighting.

- Sample-based methods: the original weights are modified with sample information.
- Population-based methods: the weights are modified with population information (the classical post-stratification procedure).

B4.3. Estimating ICT indicators, page 84
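A minimal sketch of the two families of corrective weighting follows. It assumes one common variant of each: a response-rate adjustment within strata for the sample-based method, and a rescaling to known population counts for post-stratification. The function names and data layout are hypothetical, and the Manual may describe other variants.

```python
from collections import Counter


def nonresponse_adjusted_weights(design_weight, sampled, responded):
    """Sample-based: inflate each stratum's design weight by sampled/responded.

    All three arguments are dicts keyed by stratum.
    """
    return {
        h: design_weight[h] * sampled[h] / responded[h]
        for h in design_weight
    }


def poststratified_weights(respondents, population_counts):
    """Population-based: rescale weights so that, in each post-stratum,
    the weights of the respondents add up to the known population count.

    respondents: list of (post_stratum, weight) tuples.
    population_counts: dict of known population size per post-stratum.
    """
    weighted = Counter()
    for stratum, weight in respondents:
        weighted[stratum] += weight
    return [
        (stratum, weight * population_counts[stratum] / weighted[stratum])
        for stratum, weight in respondents
    ]
```

For example, a stratum with design weight 10 in which 40 of 50 sampled businesses responded gets an adjusted weight of 12.5, so the 40 respondents still represent the 500 businesses in the stratum.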

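Treatment of missing data (cont.)

Final non-response (missing data) should be treated in order to avoid biased estimates.

Item non-response treatment: imputation.

- Deterministic imputation: the missing value follows from a rule applied to the unit's other answers.
- Hot deck imputation: the missing value is borrowed from a similar respondent in the current survey.
- Cold deck imputation: the missing value is taken from other information (external sources, models, econometrics).
- Mean or modal value imputation: the missing value is replaced by the mean or the mode of the observed values.
- Historical imputation: the missing value is taken from the unit's earlier responses, which requires long series.

B4.3. Estimating ICT indicators, page 151, Annex 5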
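Two of the imputation methods above, mean imputation and hot deck imputation, are simple enough to sketch. The version below is an illustration under stated assumptions: missing answers are coded as `None`, hot deck donors are drawn at random within an imputation class, and the field names used in the usage note are hypothetical.

```python
import random
from statistics import mean


def mean_impute(values):
    """Replace each missing value (None) by the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def hot_deck_impute(records, field, class_field, rng=None):
    """Fill missing `field` values with the value of a randomly chosen donor
    from the same imputation class (same value of `class_field`).

    Returns new records; the input list is left unchanged.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible

    # Collect the observed values available as donors in each class.
    donors = {}
    for rec in records:
        if rec[field] is not None:
            donors.setdefault(rec[class_field], []).append(rec[field])

    imputed = []
    for rec in records:
        rec = dict(rec)  # copy, so the original record is not modified
        if rec[field] is None:
            rec[field] = rng.choice(donors[rec[class_field]])
        imputed.append(rec)
    return imputed
```

With records such as `{"size": "small", "pcs": None}`, calling `hot_deck_impute(records, "pcs", "size")` fills the gap with the `pcs` value of another small business. A class with no donor at all raises a `KeyError`, which is a reasonable signal that the imputation classes need to be coarser.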

Misclassified units

Two cases of misclassification:

- An ineligible unit is erroneously included. This will reduce the effective sample size unless a reserve list is prepared.
- An eligible unit is included in the wrong stratum or omitted from the frame altogether. The technical solution consists of recalculating sample weights (see Box 17).

B4.3. Estimating ICT indicators, page 86

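Some simple weighting methods

The sample average in stratum h is defined as

$$\bar{y}_h = \frac{1}{n_h} \sum_{i=1}^{n_h} y_{hi},$$

where $n_h$ is the sample size in stratum h and $y_{hi}$ is the value observed for business i of that stratum.

The estimate for the total for stratum h can be obtained by multiplying the stratum average by the total number of businesses in the stratum ($N_h$):

$$\hat{Y}_h = N_h \, \bar{y}_h$$

B4.2. Data weighting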

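Some simple weighting methods (cont.)

The estimate for the total in the population is just the sum of the stratum totals,

$$\hat{Y} = \sum_{h=1}^{L} N_h \, \bar{y}_h,$$

or, equivalently, the weighted sum of the sample values,

$$\hat{Y} = \sum_{h=1}^{L} \sum_{i=1}^{n_h} \frac{N_h}{n_h} \, y_{hi},$$

where L is the number of strata, $N_h$ the number of businesses in stratum h, $n_h$ the sample size in stratum h, $y_{hi}$ the value observed for business i of stratum h, and $\bar{y}_h$ the sample average in stratum h.

See Boxes 18 and 19, page 89.

B4.3. Estimating ICT indicators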
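As a sketch of the grossing-up just described, the two functions below estimate a population total from a stratified sample in two equivalent ways: stratum average times stratum size, and a weighted sum with weight N_h / n_h. The function names, the dict-based layout and the figures in the usage note are assumptions made for the example.

```python
def stratum_mean(values):
    """Sample average of the values observed in one stratum."""
    return sum(values) / len(values)


def estimate_total(sample_by_stratum, population_sizes):
    """Sum over strata of N_h times the stratum sample average."""
    total = 0.0
    for h, values in sample_by_stratum.items():
        total += population_sizes[h] * stratum_mean(values)
    return total


def estimate_total_weighted(sample_by_stratum, population_sizes):
    """Same estimate, written as a weighted sum with weight N_h / n_h."""
    total = 0.0
    for h, values in sample_by_stratum.items():
        weight = population_sizes[h] / len(values)
        total += sum(weight * y for y in values)
    return total
```

With a 0/1 variable such as Internet use, a sample of `{"small": [1, 0, 1, 0], "large": [1, 1]}` and stratum sizes `{"small": 100, "large": 10}`, both functions estimate 60 businesses using the Internet: 50 in the small stratum and 10 in the large one.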

Estimating proportions and ratios

ICT indicators are mainly proportions and ratios.

Four types of estimates are very common:

- Simple random sampling of a non-stratified population
- Stratified random sampling, with one or several strata exhaustively investigated
- Ratio estimates with simple random sampling
- Ratio estimates with stratified random sampling

B4.3. Estimating ICT indicators

CASE 1: Simple random sampling of a non-stratified population

- The indicator can be expressed as the sample proportion $\hat{p}$: the share of the n sampled businesses that have the characteristic of interest.
- The standard error (SE) of the sample proportion is estimated from $\hat{p}$ and n.
- The SE expression is valid with a sampling fraction of 10% or less.

B4.3. Estimating ICT indicators
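A minimal sketch of Case 1 in Python is given below. It assumes the common textbook form SE = sqrt(p(1 - p) / n), without a finite population correction, which fits the remark about small sampling fractions; the slide's own expression is not reproduced here and may differ slightly (for example, n - 1 instead of n in the denominator).

```python
import math


def srs_proportion(responses):
    """Estimate a proportion and its standard error from 0/1 responses
    collected by simple random sampling.

    Assumed SE form: sqrt(p * (1 - p) / n), with no finite population
    correction (reasonable when the sampling fraction is small).
    """
    n = len(responses)
    p = sum(responses) / n
    se = math.sqrt(p * (1 - p) / n)
    return p, se
```

For a sample of 100 businesses of which 40 use the Internet, this gives a proportion of 0.4 with a standard error of about 0.049.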

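CASE 2: Stratified random sampling

An unbiased estimate of p is the weighted average of the stratum sample proportions:

$$\hat{p} = \sum_{h=1}^{L} \frac{N_h}{N} \, \hat{p}_h$$

where

- L: the number of strata
- $N_h$: the population in stratum h (h = 1, 2, ..., L), with $N = \sum_h N_h$
- $n_h$: the sample size in stratum h (h = 1, 2, ..., L)
- $\hat{p}_h$: the sample proportion among the $n_h$ units sampled in stratum h

The SE of $\hat{p}$ is estimated from the stratum-level sampling variances. See Annex 4 of the Manual for more details.

B4.3. Estimating ICT indicators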
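Case 2 can be sketched as follows. The estimate is the population-weighted average of the stratum sample proportions. For the standard error, the sketch assumes the usual textbook variance for stratified random sampling, with a finite population correction in each stratum; Annex 4 of the Manual may write it in a different but related form. The tuple layout `(N_h, n_h, a_h)` is a convention chosen for this example, with `a_h` the number of sampled units having the characteristic.

```python
import math


def stratified_proportion(strata):
    """Estimate a proportion and its SE under stratified random sampling.

    strata: list of (N_h, n_h, a_h) tuples, one per stratum:
        N_h = population size, n_h = sample size (at least 2),
        a_h = number of sampled units with the characteristic.
    """
    N = sum(N_h for N_h, _, _ in strata)
    p = 0.0
    variance = 0.0
    for N_h, n_h, a_h in strata:
        p_h = a_h / n_h          # stratum sample proportion
        W_h = N_h / N            # stratum share of the population
        fpc = 1 - n_h / N_h      # zero for an exhaustively surveyed stratum
        p += W_h * p_h
        variance += W_h ** 2 * fpc * p_h * (1 - p_h) / (n_h - 1)
    return p, math.sqrt(variance)
```

With this form, a stratum that is exhaustively investigated (n_h = N_h) contributes to the estimate but adds nothing to the standard error, since its correction factor is zero.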

CASE 3: Ratio estimates with simple random sampling

- The indicator to estimate is a ratio p between the totals of two variables, Y and X.
- The natural estimate of the ratio p is the ratio of the corresponding sample averages.
- Finally, one approximation of the SE uses $\bar{x}$, the sample average of the n observations of X. This is a reference outside the scope of our course.

B4.3. Estimating ICT indicators
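For readers who want to see Case 3 in concrete terms, here is a sketch under stated assumptions. The estimate is the ratio of sample totals, which equals the ratio of sample averages. The standard error uses the usual linearization approximation from sampling textbooks, without a finite population correction; it is not necessarily the exact expression shown on the slide.

```python
import math


def ratio_estimate(y, x):
    """Estimate the ratio of two totals, and an approximate SE, from paired
    observations (y_i, x_i) collected by simple random sampling.

    Assumed SE approximation (linearization, no finite population correction):
        SE = (1 / x_bar) * sqrt( sum((y_i - r * x_i)^2) / (n * (n - 1)) )
    """
    n = len(y)
    r = sum(y) / sum(x)          # same as y_bar / x_bar
    x_bar = sum(x) / n
    residual_ss = sum((yi - r * xi) ** 2 for yi, xi in zip(y, x))
    se = math.sqrt(residual_ss / (n * (n - 1))) / x_bar
    return r, se
```

When every business has exactly the same ratio y_i / x_i, the residuals vanish and the approximate SE is zero; the more the individual ratios scatter around the estimate, the larger the SE.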
