unido.org/statistics

Effect of the Cut-off Size on International Comparability of Industrial Statistics
Shyam Upadhyaya
International workshop on industrial statistics, 8 – 10 July, Beijing

Outline

- Why cut-off sampling
- Various thresholds and cut-off point
- Cut-off size and problems of data comparability
- UNIDO databases and estimation of cut-off portion

Characteristics of the industrial survey

- Target population highly heterogeneous by kinds of activities
  There are more than 150 activity groups in manufacturing at the 4-digit level
- Highly skewed distribution
  A few large companies account for a significant share of the major variables of interest (employment, output, capital formation)
- Statistical units are not uniform
  They vary from hardly identifiable family businesses to huge industrial enterprises with complex corporate structures
- Specific geographical distribution
  Designated industrial estates, export processing zones, areas of mineral resources or other raw materials

Example of the distribution pattern (figure)

Cut-off sampling: theoretical framework

Use of cut-off in two ways:
- Designate a cut-off size to divide the population into two sub-populations - W. Edwards Deming (1960)
- Exclude smaller units from the survey

“Cut-off is a sampling procedure in which a predetermined threshold is established with all units in the universe at or above the threshold being included in the sample and all units below the threshold being excluded… In the case of establishments, size is usually defined in terms of employment or output.” - OECD Glossary of statistical terms

Design considerations for cut-off sampling

Decision on cut-off point

An indicative cut-off point can be determined by

$$m_j \ge \frac{1}{n} \sum_i m_i$$

where $m_j$ is the size measure and $n$ is the sample size.

For an economy with 150 000 employees, a sample of 500 establishments would mean that establishments with 300 or more employees can be selected with certainty.

More precise cut-off size for certainty - Vijaya Verma (1991), SIAP

Here $\pi_j$ is the selection probability and $M_j$ the cumulative value of the size measure of the $j$-th unit. Application of this method shows that units larger than a particular size get a selection probability exceeding 1, which determines the cut-off point for certainty.
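The indicative rule can be sketched in Python as follows. The function names and the establishment sizes in the usage note are hypothetical. The iterative step (set aside a certainty unit, then recompute the threshold for the remaining units and the remaining sample size) is a common practice in probability-proportional-to-size sampling and is included here as an assumption; it is not the formula from the cited SIAP source.

```python
def indicative_cutoff(total_size, n):
    """Indicative certainty cut-off: total size measure divided by sample size."""
    return total_size / n


def certainty_units(sizes, n):
    """Split units into (certain, remaining) for a sample of size n.

    Assumption: the selection probability of unit j is n * m_j / sum(m),
    so a unit is taken with certainty when m_j >= sum(m) / n. After each
    certainty unit is set aside, the threshold is recomputed on the rest.
    """
    remaining = sorted(sizes, reverse=True)
    certain = []
    n_rem = n
    while n_rem > 0 and remaining:
        threshold = sum(remaining) / n_rem
        if remaining[0] < threshold:
            break  # largest remaining unit has selection probability below 1
        certain.append(remaining.pop(0))
        n_rem -= 1
    return certain, remaining


# Slide example: 150 000 employees in total, sample of 500 establishments.
print(indicative_cutoff(150_000, 500))  # 300.0
```

For a hypothetical population with employment sizes `[500, 200, 100, 100, 50, 50]` and a sample of 3, `certainty_units` sets aside only the unit with 500 employees: the first threshold is 1000 / 3, and after removing that unit the next threshold is 500 / 2 = 250, which the unit with 200 employees does not reach.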

Illustration of two-sided cut-off...

Two-sided cut-off:
1. For selection of units with certainty
2. For exclusion of units from sample

Estimation of cut-off portion - Särndal, Swensson, Wretman (1991)

Survey thresholds in practice

The lower the cut-off point, the better the coverage and the more accurate the survey estimates.

Cut-off as a threshold for two independent surveys …

Survey of larger units
- Based on the list frame from the business register
- Information needed for stratification normally available
- A detailed questionnaire on a large range of data items is implemented
- Data collection through mail or web-based questionnaire

Survey of smaller units
- Area-cum-list frame
- Limited information for stratification
- A smaller version of the questionnaire with basic data items
- Extensive field survey through direct interview

A threshold may divide the population into sub-populations for independent surveys.

Pros and cons of two independent surveys

Cons:
- Single estimates of entire industry cannot be produced
- Larger establishments survey gets priority and survey of smaller units is ignored
- Results of larger establishment survey are de facto presented as the estimates for total industry

Pros:
- Design flexibility
- Better planning
- More applicable survey instruments

Cut-off size and problems of international data comparability

1. Size measure applied to different statistical units
   - Enterprise - in European countries
   - Establishment - in North America, Japan and most of the developing countries
2. Cut-off size varies by country
   - Bangladesh, Nepal - 10 persons engaged
   - India - units using power and employing 10 or more workers; all others employing 20 or more workers
   - Sri Lanka - 25 persons or more engaged
3. Cut-off size changes over time in the same country
   - Switch over from employment to output criteria

Number of countries by type of cut-off size
Mostly developed statistical system
Source: UNIDO Metadata

Problems and solutions

Problems:
- No standard cut-off point across the countries (not a solution!)
- Data represent varying portions of industry if no estimation is made for the cut-off portion
- Detailed information about the cut-off size is missing in international reporting
- Results without estimation for the cut-off portion become incomplete and incomparable

Solutions:
- A lower cut-off size produces more comparable estimates - the US cut-off for the ASM is 1 paid employee
- Cut-off based on the contribution of units to total value added - a 1% cut-off means that the smallest units with a total combined contribution of less than 1 percent are excluded
- An established estimation procedure for establishments below the cut-off point
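A contribution-based cut-off of this kind can be sketched in Python. This is a minimal illustration under stated assumptions: the function name and the value-added figures are hypothetical, and units are excluded from the smallest upward for as long as their combined value added stays strictly below the chosen share of the total.

```python
def contribution_cutoff(values_added, share=0.01):
    """Return (excluded, included) lists of value added.

    The smallest units are excluded while their combined value added
    remains strictly below `share` of the total for all units.
    """
    ordered = sorted(values_added)
    limit = share * sum(ordered)
    cumulative = 0.0
    excluded = []
    for i, va in enumerate(ordered):
        if cumulative + va >= limit:
            # Adding this unit would reach the limit: it stays in the survey.
            return excluded, ordered[i:]
        cumulative += va
        excluded.append(va)
    return excluded, []


# Hypothetical value added of six units (total 1000, so the 1% limit is 10).
excluded, included = contribution_cutoff([900, 89, 5, 3, 2, 1], share=0.01)
print(excluded)  # [1, 2, 3]
print(included)  # [5, 89, 900]
```

The size of the largest excluded unit then gives the cut-off size implied by the chosen share, so the cut-off adapts to each country's size distribution instead of being fixed in advance.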

Estimation at international level

Additional data on the ratio from other survey data for the entire population are rarely available to UNIDO. Instead, UNIDO uses its size-class database to obtain a ratio such as value added (VA) per employee and uses it to estimate the value for the cut-off portion. A regression model is used to predict the VA per employee for smaller establishments.

Source: UNIDO Size-class database
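A minimal sketch of such an estimate, assuming an ordinary least-squares line of VA per employee on the logarithm of the size-class midpoint. The model form, the function names and every number below are hypothetical illustrations; the slide does not specify the regression that UNIDO actually uses.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def estimate_cutoff_va(size_midpoints, va_per_employee,
                       cutoff_midpoint, cutoff_employment):
    """Estimate total VA of the cut-off portion.

    Assumption: VA per employee is linear in log(size). The fitted line
    is extrapolated to the size class below the cut-off and multiplied
    by the employment of that class.
    """
    xs = [math.log(s) for s in size_midpoints]
    a, b = fit_line(xs, va_per_employee)
    predicted_va_per_employee = a + b * math.log(cutoff_midpoint)
    return predicted_va_per_employee * cutoff_employment


# Hypothetical size-class data above a cut-off of 10 persons engaged.
midpoints = [15, 35, 75, 150, 350]      # persons engaged, class midpoints
va_pe = [8.0, 10.0, 12.5, 14.0, 17.0]   # VA per employee, thousand USD
estimate = estimate_cutoff_va(midpoints, va_pe,
                              cutoff_midpoint=5, cutoff_employment=20_000)
print(round(estimate, 1))
```

The employment of the cut-off portion has to come from another source, such as a census or a labour force survey; here it is simply passed in as `cutoff_employment`.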

In conclusion:

Designation of a threshold or cut-off in industrial surveys is a convenient approach, but not an ideal one. It involves a trade-off between time and cost on the one hand and the precision of results on the other.

Cut-off sampling should be applied with some precaution:
- whether the cut-off portion is really negligible
- if not, whether data are available for estimation of the cut-off portion and how the estimates will be produced for it

For international organizations like UNIDO, it is necessary that any cut-off point applied to a survey be reported so that any necessary estimation can be made.

