Modeling approaches for the allocation of costs


Modeling approaches for the allocation of costs
Short Training Course on Agricultural Cost of Production Statistics

1 – Introduction and rationale
• Models (statistical, econometric, etc.) generally make more efficient use of auxiliary information than rule-based procedures (which are themselves models, only more rudimentary)
• They can improve the precision and accuracy of the estimates if:
  - They are used appropriately
  - The data required for the modeling are available
  - The assumptions are plausible (compare with the assumptions made by other allocation methods)
• Models are generally more flexible than rule-based procedures
• The results are less dependent on the analysts’ subjectivity

2 – Types of models
• Statistical imputation techniques: nearest neighbor imputation, interpolation, etc.
  - These methods can be used if a sufficient pool of questionnaires with detailed data on costs by commodity exists
• Econometric models:
  - In general, their objective is to estimate commodity-specific technical coefficients
  - They are based on a number of assumptions
  - In the most sophisticated models, some of the most restrictive assumptions can be relaxed

3 – Statistical imputation techniques (1/3)
• Statistical methods that use data from a subset of the sample, or from other surveys/censuses for which detailed activity data are available, to estimate commodity-specific CoP
• Nearest neighbor imputation is the most common method:
  - Hot-deck imputation: missing records (activity-specific CoP) are estimated from “similar” records from the same survey
  - Cold-deck or donor-based imputation: data from other surveys are used
• The main difficulty lies in identifying the matching records (the nearest neighbors)

3 – Statistical imputation techniques (2/3)
[Diagram] Hot-deck: missing records in the CoP survey are estimated from records of the same survey. Cold-deck: missing records are estimated from records of another survey.

3 – Statistical imputation techniques (3/3)
• Step 1: Obtain commodity-specific data on CoP
• Step 2: Identify the variables on which the matching will be carried out (size, region, etc.). They should be correlated with CoP but not with one another
• Step 3: Choose the distance measure used to identify the nearest neighbors
• Step 4: Define the procedure (function) used for the imputation: simple average, median, min, max, etc.
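To make these steps concrete, here is a minimal Python sketch of hot-deck nearest neighbor imputation. All figures, the column names (size_ha, region, cop_wheat) and the choice of two donors are illustrative assumptions, not part of the training material.

```python
import numpy as np
import pandas as pd

# Hypothetical data: donor farms with observed commodity-specific CoP,
# recipient farms for which only the matching variables are available.
donors = pd.DataFrame({
    "size_ha":   [12, 45, 80, 23, 60],
    "region":    [1, 1, 2, 2, 1],
    "cop_wheat": [310, 295, 270, 305, 285],   # cost of production per tonne
})
recipients = pd.DataFrame({
    "size_ha": [30, 70],
    "region":  [1, 2],
})

match_vars = ["size_ha", "region"]            # Step 2: matching variables
scale = donors[match_vars].std()              # standardise so units are comparable

def impute_cop(row, k=2):
    # Step 3: Euclidean distance on standardised matching variables
    # (in practice, exact matching within region is often preferred)
    diff = (donors[match_vars] - row[match_vars]) / scale
    dist = np.sqrt((diff ** 2).sum(axis=1))
    nearest = dist.nsmallest(k).index         # k nearest donor records
    # Step 4: imputation function = simple average over the nearest donors
    return donors.loc[nearest, "cop_wheat"].mean()

recipients["cop_wheat_imputed"] = recipients.apply(impute_cop, axis=1)
print(recipients)
```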

4 – The standard econometric model (1/2)
• The model estimates commodity-specific CoP under the assumption that input use depends linearly on the quantities produced and that inputs are not substitutable
• The model simply formalizes the fact that:
  - The total amount of each input consumed by a farm should equal the sum of the input uses across all the activities of the farm
  - This relationship holds only up to measurement errors and gaps in the data
• In equation form:

  x_{ij} = \sum_k a_{jk} \, y_{ik} + e_{ij}

  where x_{ij} is the quantity of input j used by farm i, y_{ik} is the output of commodity k produced by farm i, a_{jk} is the amount of input j used to produce one unit of commodity k (the technical coefficient), and e_{ij} is an error term.

4 – The standard econometric model (2/2)
• Data requirements:
  - Survey data on farm-level input costs and output by commodity
  - A sufficient number of farms and of input-output combinations is required to estimate the technical coefficients
• Estimation techniques (depending on how the data are structured):
  - Ordinary least squares (OLS) or generalized least squares
  - Methods adapted to panel data (between, within estimators, etc.)
• Limitations:
  - The technique may produce obvious errors: negative technical coefficients, estimates outside reasonable bounds, etc.
  - More sophisticated models, such as entropy-based approaches, can eliminate some of these errors
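As an illustration of the estimation step, the following Python sketch simulates farm-level data for one input and two commodities and recovers the technical coefficients by OLS without intercept. The sample size, coefficient values and noise level are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical survey: 200 farms, observed total use of one input (e.g. fertiliser)
# and output quantities of two commodities.
n_farms = 200
Y = rng.uniform(0, 50, size=(n_farms, 2))      # outputs of commodity 1 and 2 per farm
a_true = np.array([1.2, 0.8])                  # "true" technical coefficients
x = Y @ a_true + rng.normal(0, 5, n_farms)     # total input use, with measurement error

# OLS without intercept: x_i = a_1 * y_i1 + a_2 * y_i2 + e_i
a_hat, *_ = np.linalg.lstsq(Y, x, rcond=None)
print("estimated technical coefficients:", a_hat)

# The estimates can then be used to allocate each farm's input cost across
# commodities in proportion to a_k * y_ik.
alloc_shares = (Y * a_hat) / (Y * a_hat).sum(axis=1, keepdims=True)
```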

5 – Entropy-based regressions (1/2)
• Rationale: make use of prior information on:
  - The technical coefficients: minimum and maximum bounds, etc.
  - Existing constraints: accounting equations, etc.
  => to improve the quality of the estimates
• Main assumptions:
  - The unknown technical coefficients are treated as random variables
  - Auxiliary information can be used to bound the technical coefficients and provide a set of plausible values
  - Additional prior information is available (accounting equations, non-negativity constraints, etc.)
• Estimation procedure (idea): the estimated coefficients minimize the distance between the probability distribution of the technical coefficients and the prior probability distribution
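A compact way to state this idea (the notation below is an assumption for illustration, not taken from the slides): for a single input (dropping the index j for readability), write each technical coefficient as a probability-weighted average of support points spanning its prior bounds, and choose the probabilities that are closest, in the cross-entropy sense, to the prior probabilities while satisfying the data:

\min_{p} \; \sum_{k,m} p_{km} \, \ln \frac{p_{km}}{q_{km}}
\quad \text{s.t.} \quad
x_i = \sum_k \Big( \sum_m z_{km} \, p_{km} \Big) y_{ik} + e_i,
\qquad \sum_m p_{km} = 1, \;\; p_{km} \ge 0

where z_{km} are support points covering the plausible range of coefficient a_k, q_{km} are the prior probabilities attached to them, and x_i, y_{ik} are defined as in section 4.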

5 – Entropy-based regressions (2/2)
• Data requirements (minimal):
  - Survey data on farm-level inputs and outputs
  - Information on plausible values for the technical coefficients
• Advantages: a cost-effective and statistically sound way to estimate commodity-specific CoP
• Limitations:
  - Complexity: implementing this method requires advanced statistical knowledge and experience
  - R is one of the few software environments offering ready-to-use packages for entropy-based estimation
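The sketch below illustrates the idea on simulated data using a generalized maximum entropy (GME) formulation: each coefficient is a probability-weighted average of support points spanning its prior bounds, and the entropy of those probabilities is maximized subject to the data constraints. The data, the bounds and the use of SciPy's generic SLSQP solver are assumptions made for the example; in practice dedicated packages (for instance in R, as noted above) would normally be used.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Simulated farms: total input use x explained by two outputs (true coefficients 2 and 5)
N, K, M = 15, 2, 3                              # farms, coefficients, support points
Y = rng.uniform(1, 10, size=(N, K))             # output quantities per farm
x = Y @ np.array([2.0, 5.0]) + rng.normal(0, 1.0, N)

Z = np.array([[0.0, 3.0, 6.0],                  # prior support for a_1 (bounds 0 to 6)
              [0.0, 5.0, 10.0]])                # prior support for a_2 (bounds 0 to 10)
v = np.array([-3.0, 0.0, 3.0]) * x.std()        # error support points

def unpack(theta):
    p = theta[:K * M].reshape(K, M)             # probabilities on coefficient supports
    w = theta[K * M:].reshape(N, M)             # probabilities on error supports
    return p, w

def neg_entropy(theta):                         # minimizing this maximizes entropy
    return np.sum(theta * np.log(theta + 1e-12))

def data_fit(theta):                            # x_i = sum_k a_k y_ik + e_i must hold
    p, w = unpack(theta)
    return Y @ (p * Z).sum(axis=1) + w @ v - x

def adding_up(theta):                           # probabilities must each sum to one
    p, w = unpack(theta)
    return np.concatenate([p.sum(axis=1) - 1, w.sum(axis=1) - 1])

theta0 = np.full((K + N) * M, 1.0 / M)          # uniform (maximum entropy) starting point
res = minimize(neg_entropy, theta0, method="SLSQP",
               bounds=[(1e-10, 1.0)] * theta0.size,
               constraints=[{"type": "eq", "fun": data_fit},
                            {"type": "eq", "fun": adding_up}],
               options={"maxiter": 500})
p_hat, _ = unpack(res.x)
print("estimated technical coefficients:", (p_hat * Z).sum(axis=1))
```

Because every candidate coefficient lies inside its prior support, negative or implausibly large estimates are ruled out by construction, which is the main advantage over unrestricted OLS noted in section 4.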

6 – Examples of model use
• Several studies have applied model-based methods to estimate commodity-specific CoP:
  - Fragoso and Carvalho (2011): estimation of commodity-specific technical coefficients from the 2004 FADN database for a Portuguese region; prior information on the cost structure considerably improved the estimates
  - Peeters and Surry (2002): a similar model used to allocate costs for a small number of dairy-beef farms in Brittany (France)
• The results clearly show the superiority of these methods over more rudimentary (rule-based, etc.) approaches
• However, these modeling tools are rarely used to produce national-level official statistics. This may be due to:
  - A lack of data and of prior information on the parameters and variables
  - A lack of capacity to implement these methods, and/or
  - A distrust of modeling tools in general

7 – References
• The FACEPA project, European Union (7th Framework Programme). The FACEPA project develops tools and methods to analyze production costs in European agriculture using FADN (Farm Accountancy Data Network) data
• Fragoso and Carvalho (2011), Estimation of Cost Allocation Coefficients at the Farm Level Using an Entropy Approach
• Desbois (2006), Méthodologie d’estimation des coûts de production agricole : comparaison de deux méthodes sur la base du RICA, Revue MODULAD, no. 35
• Peeters and Surry (2002), Generalized Cross Entropy Estimation of a Varying-Coefficients Model of Cost Allocation in Multi-Product Farming, Symposium on maximum entropy approaches to modeling agricultural diversity in European agriculture
• CEoptim: a cross-entropy R package for optimization