# Efficiency and Productivity Measurement: Stochastic Frontier Analysis

## Presentation on theme: "Efficiency and Productivity Measurement: Stochastic Frontier Analysis"— Presentation transcript:

Efficiency and Productivity Measurement: Stochastic Frontier Analysis
D.S. Prasada Rao School of Economics The University of Queensland, Australia

Stochastic Frontier Analysis
It is a parametric technique that uses standard production function methodology. The approach explicitly recognises that the production function represents the technically maximum feasible output level for a given level of inputs. The Stochastic Frontier Analysis (SFA) technique may be used in modelling functional relationships where you have theoretical bounds: estimation of cost functions and the study of cost efficiency; estimation of revenue functions and revenue efficiency. This technique is also used in the estimation of multi-output and multi-input distance functions, and there is potential for applications in other disciplines. Fare et al 1994, OECD and MPI (regional concept) Coelli and Rao (2005) Ag Economics 95 countries Ag productivity using MPI technical efficiency, which reflects the ability of a firm to obtain maximal output from a given set of inputs.

Some history: much of the work on stochastic frontiers began in the 1970s, with major contributions from Aigner, Schmidt, Lovell, Battese and Coelli, and Kumbhakar. Ordinary least squares (OLS) regression production functions fit a function through the centre of the data and assume all firms are efficient. Deterministic production frontiers fit a frontier function over the data and assume there is no data noise. SFA production frontiers are a “mix” of these two. OLS assumes that all deviations from the estimated line are due to noise (e.g. measurement error, missing variables, etc.). Deterministic frontiers assume all deviations are due to inefficiency, hence data noise could bias estimation, and we cannot conduct standard hypothesis tests. SFA attempts to strike a balance by having two error terms, one for noise and one for inefficiency (see slide 6), and one can use standard hypothesis tests.

Production Function: it is a relationship between output and a set of input quantities. We use this when we have a single output. In the case of multiple outputs, people often use revenue (adjusted for price differences) as an output measure; it is also possible to use multi-output distance functions to study production technology. The functional relationship is usually written in the form $q = f(x_1, x_2, \dots, x_N)$, where $q$ is output and $x_1, \dots, x_N$ are the input quantities.

Production Function Specification
A number of different functional forms are used in the literature to model production functions: Cobb-Douglas (linear in logs of output and inputs); quadratic (in inputs); normalised quadratic; and translog. The translog function is very commonly used: it is a generalisation of the Cobb-Douglas function and a flexible functional form providing a second-order approximation. Cobb-Douglas and translog functions are linear in parameters and can be estimated using least squares methods. It is possible to impose restrictions on the parameters (homogeneity conditions).

Cobb-Douglas Functional form
Linear in logs. Advantages: easy to estimate and interpret; requires estimation of few parameters: K+3. Disadvantages: simplistic, as it assumes all firms have the same production elasticities and that substitution elasticities equal 1. K = number of inputs (plus 1 if a time trend is included in a panel data model). For example, 2 inputs require 5 parameters and 5 inputs require 8 parameters.

Translog Functional form
Quadratic in logs. Advantages: flexible functional form, with fewer restrictions on production elasticities and substitution elasticities. Disadvantages: more difficult to interpret; requires estimation of many parameters: K+3+K(K+1)/2; can suffer from curvature violations. K = number of inputs (plus 1 if a time trend is included in a panel data model). For example, 2 inputs require 8 parameters and 5 inputs require 23 parameters.
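The two parameter counts quoted on these slides can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function names are made up for illustration, and the reading of the "+3" as an intercept plus two variance parameters is an assumption, not something the slides state.

```python
def cobb_douglas_param_count(k: int) -> int:
    """K first-order coefficients plus 3 further parameters
    (assumed here: intercept and two variance parameters)."""
    return k + 3


def translog_param_count(k: int) -> int:
    """Cobb-Douglas count plus K(K+1)/2 second-order coefficients."""
    return k + 3 + k * (k + 1) // 2


for k in (2, 5):
    print(k, cobb_douglas_param_count(k), translog_param_count(k))
```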

Functional forms

Cobb-Douglas: $\ln q_i = \beta_0 + \beta_1 \ln x_{1i} + \beta_2 \ln x_{2i} + v_i - u_i$

Translog: $\ln q_i = \beta_0 + \beta_1 \ln x_{1i} + \beta_2 \ln x_{2i} + 0.5\beta_{11}(\ln x_{1i})^2 + 0.5\beta_{22}(\ln x_{2i})^2 + \beta_{12} \ln x_{1i} \ln x_{2i} + v_i - u_i$

Recall that lower case y and x represent logged variables. There are many other functional forms, but these two are the most popular. We can use an LR test to choose between these models. In this example, there are 3 restrictions (the three second-order parameters are set to 0).

Interpretation of estimated parameters
Cobb-Douglas: the production elasticity for the j-th input is $E_j = \beta_j$, and the scale elasticity is $\varepsilon = E_1 + E_2$. Translog: the production elasticity for the i-th firm and j-th input is $E_{ji} = \beta_j + \beta_{j1} \ln x_{1i} + \beta_{j2} \ln x_{2i}$, and the scale elasticity for the i-th firm is $\varepsilon_i = E_{1i} + E_{2i}$. Note: if we use transformed data where inputs are measured relative to their means, then the translog elasticities at the means would simply be $\beta_j$.
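As a rough illustration of the translog elasticity formulas for the two-input case, the sketch below evaluates them for one firm. The function name and the coefficient values are hypothetical, chosen only to show the calculation; with mean-scaled inputs (x = 1, so ln x = 0) the elasticities collapse to the first-order coefficients.

```python
import math


def translog_elasticities(x1, x2, b1, b2, b11, b22, b12):
    """Return (E1, E2, scale elasticity) for a two-input translog.

    E_j = b_j + b_j1 * ln(x1) + b_j2 * ln(x2), with b_21 = b_12.
    """
    lx1, lx2 = math.log(x1), math.log(x2)
    e1 = b1 + b11 * lx1 + b12 * lx2
    e2 = b2 + b12 * lx1 + b22 * lx2
    return e1, e2, e1 + e2


# Hypothetical coefficients, for illustration only.
coeffs = dict(b1=0.6, b2=0.3, b11=-0.05, b22=-0.04, b12=0.02)

# Inputs measured relative to their means: at the mean, x1 = x2 = 1.
at_mean = translog_elasticities(1.0, 1.0, **coeffs)
# A firm using twice the mean of input 1 and half the mean of input 2.
other_firm = translog_elasticities(2.0, 0.5, **coeffs)
print(at_mean, other_firm)
```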

Test for Cobb-Douglas versus Translog
Using the sample data file which comes with the FRONTIER program: H0: $\beta_{11} = \beta_{22} = \beta_{12} = 0$; H1: H0 false. Compute LR = -2[LLF0 - LLF1], which is distributed as chi-square(r) under H0, where r is the number of restrictions. For example, if LLF1 = -14.43 and LLF0 = -17.03, then LR = -2[-17.03 - (-14.43)] = 5.20. Since the $\chi^2_3$ 5% table value is 7.81 and 5.20 < 7.81 => do not reject H0.
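The arithmetic of this test is easy to reproduce. The sketch below uses the two log-likelihood values and the 5% table value quoted on the slide; the function and constant names are illustrative only.

```python
def likelihood_ratio(llf_restricted: float, llf_unrestricted: float) -> float:
    """LR = -2 [LLF0 - LLF1]."""
    return -2.0 * (llf_restricted - llf_unrestricted)


LLF0 = -17.03  # restricted model (Cobb-Douglas)
LLF1 = -14.43  # unrestricted model (translog)
CHI2_3_CRITICAL_5PCT = 7.81  # table value for 3 restrictions, from the slide

lr = likelihood_ratio(LLF0, LLF1)
reject_h0 = lr > CHI2_3_CRITICAL_5PCT
print(round(lr, 2), reject_h0)
```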

Deterministic Frontier models
In this model all the errors are assumed to be due to technical inefficiency; no account is taken of noise. Consider the following simple specification: $\ln y_i = x_i\beta - u_i$, where the $x_i$'s are in logs and include a constant, and $u_i$ is a non-negative random variable. Therefore $\ln y_i \le x_i\beta$. Given the frontier nature of the model, we can measure technical efficiency using $TE_i = y_i / \exp(x_i\beta) = \exp(-u_i)$.

Deterministic Frontier models
We have $y_i = \exp(x_i\beta)\exp(-u_i)$. The frontier is deterministic since it is given by $\exp(x_i\beta)$, which is non-random. Estimation of parameters: the linear programming approach chooses $\beta$ to minimise $\sum_i u_i$ subject to $u_i = x_i\beta - \ln y_i \ge 0$.

Deterministic Frontier models
Aigner and Chu suggested a quadratic programming approach. Afriat (1972) suggested the use of a Gamma distribution for $u_i$ and the use of maximum likelihood estimation. Corrected ordinary least squares (COLS): if we apply OLS, the intercept estimate is biased downwards, while all other parameters are unbiased.

Deterministic Frontier models
So COLS suggests that the OLS estimate of the intercept be corrected. If we do not wish to make use of any probability distribution for $u_i$, then the intercept is shifted up by the largest residual, $\hat\beta_0^{COLS} = \hat\beta_0 + \max_i \hat e_i$, where $\hat e_i$ is the OLS residual for the i-th firm. If we assume that $u_i$ is distributed as Gamma, the correction is derived from that distribution instead; it is a bit more complicated if $u_i$ follows a half-normal distribution.
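A minimal sketch of the distribution-free COLS correction, assuming a single logged input and shifting the OLS intercept up by the largest OLS residual. The data values and the function name are hypothetical, and the regression is written out by hand so the block needs only the standard library.

```python
import math


def cols_frontier(x, y):
    """Fit OLS of y on x (both in logs), then shift the intercept
    up by the largest residual so that no firm lies above the frontier."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    ols_intercept = mean_y - slope * mean_x
    residuals = [yi - ols_intercept - slope * xi for xi, yi in zip(x, y)]
    shift = max(residuals)
    cols_intercept = ols_intercept + shift
    # Technical efficiency: exp(-u_i), with u_i = shift - residual_i >= 0.
    te = [math.exp(r - shift) for r in residuals]
    return slope, ols_intercept, cols_intercept, te


# Hypothetical logged input and output for five firms.
ln_x = [1.0, 2.0, 3.0, 4.0, 5.0]
ln_y = [0.9, 1.6, 2.5, 2.9, 3.8]
slope, b0_ols, b0_cols, te_scores = cols_frontier(ln_x, ln_y)
print(slope, b0_ols, b0_cols, te_scores)
```

By construction at least one firm has a TE score of exactly 1 under this correction, which is one way the deterministic approach differs from SFA.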

Production functions/frontiers
[Figure: output q plotted against input x, comparing the OLS, deterministic and SFA fitted lines.]

Production functions/frontiers
OLS: $q_i = \beta_0 + \beta_1 x_i + v_i$. Deterministic: $q_i = \beta_0 + \beta_1 x_i - u_i$. SFA: $q_i = \beta_0 + \beta_1 x_i + v_i - u_i$, where $v_i$ is the “noise” error term, which is symmetric (e.g. normal distribution), and $u_i$ is the “inefficiency” error term, which is non-negative (e.g. half-normal distribution). For example, if the Cobb-Douglas functional form is used, then $q_i$ is the log of output and $x_i$ is the log of input, and the Greek letters are parameters to be estimated. If there were 2 inputs then the function would be $q_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + v_i - u_i$. We can only estimate SFA if we specify particular distributions for the two error terms. The ability of the method to correctly separate the total error into the two components of noise and inefficiency depends on the assumed distributions being reasonable.
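To see what the composed error looks like, the sketch below simulates it under the example distributions named on the slide (normal noise, half-normal inefficiency). The sample size, the standard deviations and the seed are arbitrary choices for illustration. Because the inefficiency term is non-negative, the composed error has a negative mean of about $-\sigma_u\sqrt{2/\pi}$.

```python
import math
import random


def simulate_composed_errors(n, sigma_v, sigma_u, seed=0):
    """Draw n values of e = v - u with v ~ N(0, sigma_v^2) and
    u ~ |N(0, sigma_u^2)| (half-normal)."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, sigma_v) for _ in range(n)]
    u = [abs(rng.gauss(0.0, sigma_u)) for _ in range(n)]
    return [vi - ui for vi, ui in zip(v, u)]


SIGMA_V, SIGMA_U = 0.1, 0.3  # illustrative values only
errors = simulate_composed_errors(20000, SIGMA_V, SIGMA_U)
mean_error = sum(errors) / len(errors)
expected_mean = -SIGMA_U * math.sqrt(2.0 / math.pi)
print(mean_error, expected_mean)
```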

Stochastic Frontier: Model Specification
We start with the general production function as before and add a new term that represents technical inefficiency. This means that actual output is less than what is postulated by the production function specified before. We achieve this by subtracting $u$ from the production function. Then we have $\ln q_i = x_i'\beta + v_i - u_i$. In the Cobb-Douglas production function with one input, we can write the stochastic frontier function for the i-th firm as $\ln q_i = \beta_0 + \beta_1 \ln x_i + v_i - u_i$, or equivalently $q_i = \exp(\beta_0 + \beta_1 \ln x_i) \times \exp(v_i) \times \exp(-u_i)$.

Stochastic frontiers

Stochastic Frontier: Model Specification
In general, we write the stochastic frontier model with several inputs and a general functional form (which is linear in parameters) as $\ln q_i = x_i'\beta + v_i - u_i$, where $x_i'\beta$ is the deterministic component, $v_i$ is noise and $u_i$ is inefficiency. We stipulate that $u_i$ is a non-negative random variable, so that $\exp(-u_i)$ always lies between 0 and 1 by construction. This means that if a firm is inefficient, then it produces less than what is expected from the inputs used by the firm at the given technology. We can define technical efficiency as the ratio of “observed” or “realised” output to the stochastic frontier output: $TE_i = q_i / \exp(x_i'\beta + v_i) = \exp(-u_i)$.

Stochastic Specifications
The SF model is specified as $\ln q_i = x_i'\beta + v_i - u_i$. The following assumptions are made on the distributions of $v$ and $u$. The standard assumptions of zero mean, homoskedasticity and independence are made for the elements of $v_i$. We assume that the $u_i$'s are independently and identically distributed non-negative random variables. Further, we assume that $v_i$ and $u_i$ are independently distributed. The distributional assumptions are crucial to the estimation of the parameters. Standard distributions used for $u_i$ are: half-normal (truncated at zero), exponential and gamma.

Truncated normal distribution for u
Distribution of $u$: $u_i \sim |N(0, \sigma_u^2)|$, a normal distribution with mean zero truncated at zero. We note that, as $u$ is truncated from a normal distribution with mean equal to 0, E(u) is “towards” zero and therefore technical efficiency tends to be high just by model construction.
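For reference, the standard density and mean of a half-normal variable are given below. The slide's own equations were not preserved in the transcript, so this is a reconstruction from the textbook half-normal results rather than a copy of the slide; $\sigma_u$ denotes the scale parameter of the underlying normal.

$$
f(u) = \frac{2}{\sigma_u\sqrt{2\pi}} \exp\left(-\frac{u^2}{2\sigma_u^2}\right), \quad u \ge 0,
\qquad
E(u) = \sigma_u \sqrt{\frac{2}{\pi}} \approx 0.798\,\sigma_u .
$$

The density is highest at $u = 0$ and falls away as $u$ increases, which is why most firms are predicted to sit close to the frontier under this assumption.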

Truncated normal with non-zero means
A more general specification: $u_i \sim N^+(\mu, \sigma_u^2)$, a normal distribution with mean $\mu$ truncated at zero. This forms the basis for the inefficiency effects model, where $\mu_i = z_i'\delta$.

Estimation of SF Models
Parameters to be estimated in a standard SF model are $\beta$, $\sigma_v^2$ and $\sigma_u^2$. Likelihood methods are used in estimating the unknown parameters. Coelli's (1995) Monte Carlo study shows that in large samples MLE is better than COLS. Usually the variance parameters are reparametrized in one of the following forms: Aigner, Lovell and Schmidt use $\sigma^2 = \sigma_v^2 + \sigma_u^2$ and $\lambda = \sigma_u / \sigma_v$; Battese and Corra use $\sigma^2 = \sigma_v^2 + \sigma_u^2$ and $\gamma = \sigma_u^2 / \sigma^2$. Testing for the presence of technical inefficiency depends upon the parametrization used.
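The two reparametrizations are simple transformations of the variance parameters. The sketch below converts between them, assuming the usual definitions ($\sigma^2 = \sigma_v^2 + \sigma_u^2$, $\lambda = \sigma_u/\sigma_v$, $\gamma = \sigma_u^2/\sigma^2$); the function names and numeric values are illustrative.

```python
import math


def to_lambda_form(sigma_v2: float, sigma_u2: float):
    """Aigner, Lovell and Schmidt form: (sigma^2, lambda)."""
    return sigma_v2 + sigma_u2, math.sqrt(sigma_u2 / sigma_v2)


def to_gamma_form(sigma_v2: float, sigma_u2: float):
    """Battese and Corra form: (sigma^2, gamma), with 0 <= gamma <= 1."""
    sigma2 = sigma_v2 + sigma_u2
    return sigma2, sigma_u2 / sigma2


def from_gamma_form(sigma2: float, gamma: float):
    """Recover (sigma_v^2, sigma_u^2) from (sigma^2, gamma)."""
    return (1.0 - gamma) * sigma2, gamma * sigma2


sigma2, gamma = to_gamma_form(sigma_v2=0.01, sigma_u2=0.09)
_, lam = to_lambda_form(sigma_v2=0.01, sigma_u2=0.09)
print(sigma2, gamma, lam)
```

A value of gamma near 0 means deviations from the frontier are mostly noise, while a value near 1 means they are mostly inefficiency.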

Estimation of SFA Models
In the case of the translog model, it is a good idea to transform the data by dividing each observation by its mean; the coefficients of $\ln X_i$ can then be interpreted as elasticities at the sample means. Most standard packages, such as SHAZAM and LIMDEP, can be used for estimation. FRONTIER by Coelli is a specialised program for estimating SF models, available as a free download from the CEPA website.

FRONTIER Instruction File
Here MU refers to inefficiency effects models and ETA refers to time-varying inefficiency effects (we will come to this shortly). The program uses the ratio of variances as the transformation. It allows for the use of single cross-sections as well as panel data sets.

FRONTIER output the final mle estimates are :
[FRONTIER output table with columns coefficient, standard-error and t-ratio, and rows for eleven beta coefficients, sigma-squared and gamma, followed by the log likelihood function value; the numeric values are not recoverable.] mu is restricted to be zero. eta is restricted to be zero.

SF Models - continued Predicting Firm Level Efficiencies:
Once the SF model is estimated using the MLE method, estimates of the unknown parameters are substituted into the prediction equations to compute the best predictor of technical efficiency for each firm i. We use standard normal density and distribution functions to evaluate technical efficiency.
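The slide's prediction equations were not preserved. As one commonly used possibility, the sketch below implements the Battese and Coelli (1988) conditional-expectation predictor $E[\exp(-u_i) \mid e_i]$ for the normal/half-normal production frontier, where $e_i = v_i - u_i$ is the composed residual. Whether this is exactly the formula shown on the slide is an assumption; the function names and the numeric inputs are illustrative.

```python
import math


def norm_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def predict_te(residual: float, sigma_u2: float, sigma_v2: float) -> float:
    """E[exp(-u_i) | e_i] for a normal/half-normal production frontier.

    residual is e_i = ln q_i - x_i' beta_hat.
    """
    sigma2 = sigma_u2 + sigma_v2
    mu_star = -residual * sigma_u2 / sigma2
    sigma_star = math.sqrt(sigma_u2 * sigma_v2 / sigma2)
    a = mu_star / sigma_star
    ratio = norm_cdf(a - sigma_star) / norm_cdf(a)
    return ratio * math.exp(-mu_star + 0.5 * sigma_star ** 2)


# Illustrative variance estimates and residuals.
SIGMA_U2, SIGMA_V2 = 0.09, 0.01
te_low = predict_te(-0.5, SIGMA_U2, SIGMA_V2)   # far below the frontier
te_mid = predict_te(0.0, SIGMA_U2, SIGMA_V2)
te_high = predict_te(0.3, SIGMA_U2, SIGMA_V2)   # above the fitted frontier
print(te_low, te_mid, te_high)
```

Note that even a firm with a positive residual gets a predicted TE below 1, since part of that residual is attributed to favourable noise.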

SF Models - continued Industry efficiency:
Industry efficiency can be computed as the average of the technical efficiencies of the firms in the sample. Industry efficiency can be seen as the expected technical efficiency of a randomly selected firm from the industry. Confidence intervals for technical efficiency scores (for the firms and the industry as a whole) can also be computed. We note that, unlike in DEA, there are no firms with a TE score of 1. No concept of peers exists in the case of SFA.
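Both views of industry efficiency can be sketched in a few lines: the sample average of firm-level scores, and the expected value of $\exp(-u)$ for a randomly drawn firm. The closed form used for the second, $2[1 - \Phi(\sigma_u)]\exp(\sigma_u^2/2)$, holds for the half-normal case and is a standard result, not necessarily the expression on the slide; the scores and $\sigma_u$ below are hypothetical.

```python
import math


def norm_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def sample_mean_efficiency(te_scores):
    """Industry efficiency as the average of firm-level TE scores."""
    return sum(te_scores) / len(te_scores)


def expected_efficiency_half_normal(sigma_u: float) -> float:
    """E[exp(-u)] when u ~ |N(0, sigma_u^2)|."""
    return 2.0 * (1.0 - norm_cdf(sigma_u)) * math.exp(0.5 * sigma_u ** 2)


industry_from_scores = sample_mean_efficiency([0.90, 0.80, 0.70])
industry_from_model = expected_efficiency_half_normal(0.3)
print(industry_from_scores, industry_from_model)
```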

technical efficiency estimates :
FRONTIER output, technical efficiency estimates: a table listing each firm and its eff.-est., followed by the mean efficiency. Mean efficiency can be interpreted as the “industry efficiency”.

Tests of hypotheses e.g., Is there significant technical inefficiency?
H0: $\gamma = 0$ versus H1: $\gamma > 0$. Test options: a t-test, with t-ratio = (parameter estimate) / (standard error), or a likelihood ratio (LR) test. Note that the above hypothesis is one-sided, therefore we must use Kodde and Palm critical values (not chi-square) for the LR test. The LR test is “safer”: standard errors are only approximate because of the numerical method used in MLE, thus the LR test is more reliable.

Likelihood ratio (LR) tests
Steps: 1) Estimate the unrestricted model (LLF1). 2) Estimate the restricted model (LLF0) (e.g. set $\gamma = 0$). 3) Calculate LR = -2(LLF0 - LLF1). 4) Reject H0 if LR > $\chi^2_R$ table value, where R = number of restrictions. (Note: Kodde and Palm tables must be used if the test is one-sided.) The LR test can also be used to test various other hypotheses, for example: a test for technical change (if panel data are involved), or a test among functional forms such as Cobb-Douglas versus translog. See CRB for more on one-sided tests.

The LR statistic has mixed Chi-square distribution
Example: estimate a translog production function using the sample data file which comes with the FRONTIER program. The t-ratio for $\gamma$ is 24.36, which exceeds the N(0,1) critical value at 5% => reject H0. Alternatively, the LR statistic exceeds the Kodde and Palm critical value at 5% of 2.71 => reject H0. Standard errors are only approximate because of the numerical method used in MLE, thus the LR test is more reliable.
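For a single one-sided restriction such as $\gamma = 0$, the mixed chi-square distribution is an equal mixture of a point mass at zero and a chi-square with one degree of freedom. Under that assumption the p-value is half the usual chi-square(1) tail probability, which reproduces the 5% critical value of about 2.71. The sketch below is limited to this one-restriction case, and the function name is made up for illustration.

```python
import math


def mixed_chi2_pvalue_one_restriction(lr: float) -> float:
    """P-value for LR under the mixture 0.5*chi2(0) + 0.5*chi2(1).

    Uses P(chi2_1 > x) = erfc(sqrt(x / 2)).
    """
    if lr <= 0.0:
        return 1.0
    return 0.5 * math.erfc(math.sqrt(lr / 2.0))


# At the tabulated 5% critical value the p-value is about 0.05.
p_at_critical = mixed_chi2_pvalue_one_restriction(2.706)
# The ordinary chi-square(1) 5% value of 3.84 would be too conservative here.
p_at_chi2_value = mixed_chi2_pvalue_one_restriction(3.84)
print(p_at_critical, p_at_chi2_value)
```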

Distributional assumptions – the truncated normal distribution
$N(\mu, \sigma^2)$ truncated at zero. It allows more general patterns. We can test the hypothesis that $\mu = 0$ using a t-test or LR test. The restriction $\mu = 0$ produces the half-normal distribution: $|N(0, \sigma^2)|$.

Scale efficiency For a Translog Production Function (Ray, 1998)
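An output-orientated scale efficiency measure is $SE_i = \exp[(1 - \varepsilon_i)^2 / (2\beta)]$, where $\varepsilon_i$ is the scale elasticity of the i-th firm and $\beta = \sum_j \sum_k \beta_{jk}$ is the sum of the second-order translog coefficients. If the frontier is concave in inputs then $\beta < 0$, and SE lies in the range 0 to 1.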
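A small sketch of the scale efficiency formula, assuming the second-order coefficients have already been summed into a single negative number. The function name and the input values are hypothetical.

```python
import math


def scale_efficiency(scale_elasticity: float, beta_sum: float) -> float:
    """SE_i = exp[(1 - eps_i)^2 / (2 * beta)], with beta < 0."""
    if beta_sum >= 0.0:
        raise ValueError("beta_sum must be negative for a concave frontier")
    return math.exp((1.0 - scale_elasticity) ** 2 / (2.0 * beta_sum))


# A firm at constant returns to scale is fully scale efficient.
se_crs = scale_efficiency(1.0, beta_sum=-0.2)
# A firm with increasing returns (elasticity 1.3) is not.
se_irs = scale_efficiency(1.3, beta_sum=-0.2)
print(se_crs, se_irs)
```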

We note the following points with respect to SFA models. It is important to check the regularity conditions associated with the estimated functions (local and global properties). This may require the use of a Bayesian approach to impose the inequality restrictions needed for convexity and concavity conditions. We need to estimate distance functions directly in the case of multi-output and multi-input production technologies. It is possible to estimate scale efficiency in the case of translog and Cobb-Douglas specifications.

Panel data models Data on N firms over T time periods
Investigate technical efficiency change (TEC). Investigate technical change (TC). More data = better quality estimates. Less chance of a one-off event (e.g. climatic) influencing results. We can use standard panel data models: there is no need to make a distributional assumption, but we must assume TE is fixed over time. The model: $\ln q_{it} = x_{it}'\beta + v_{it} - u_{it}$, with $i = 1, 2, \dots, N$ (cross-section of firms) and $t = 1, 2, \dots, T$ (time points).

Panel data models Some Special cases:
Firm-specific effects are time invariant: $u_{it} = u_i$. Time-varying effects, Kumbhakar (1990): $u_{it} = [1 + \exp(bt + ct^2)]^{-1} u_i$. Time-varying effects with convergence, Battese and Coelli (1992): $u_{it} = \exp[-\eta(t - T)]\, u_i$. The sign of $\eta$ is important. As $t$ goes to $T$, $u_{it}$ goes to $u_i$. In the FRONTIER program, this is under the Error Components Model.
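The role of the sign of $\eta$ can be seen by tracing the Battese and Coelli (1992) specification $u_{it} = \exp[-\eta(t - T)]\,u_i$ over time. The sketch below uses hypothetical values of $u_i$, $\eta$ and $T$; with $\eta > 0$ inefficiency falls towards $u_i$ as $t$ approaches $T$, so technical efficiency rises.

```python
import math


def bc92_inefficiency(u_i: float, eta: float, t: int, T: int) -> float:
    """u_it = exp[-eta * (t - T)] * u_i."""
    return math.exp(-eta * (t - T)) * u_i


def te_profile(u_i: float, eta: float, T: int):
    """Technical efficiency exp(-u_it) for t = 1, ..., T."""
    return [math.exp(-bc92_inefficiency(u_i, eta, t, T)) for t in range(1, T + 1)]


# Hypothetical firm: final-period inefficiency 0.2, five periods.
improving = te_profile(u_i=0.2, eta=0.1, T=5)    # eta > 0: TE rises over time
worsening = te_profile(u_i=0.2, eta=-0.1, T=5)   # eta < 0: TE falls over time
print(improving, worsening)
```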

Time profiles of efficiencies
Note: these are all smooth trend functions of technical efficiency over time. These trends are also independent of any other data on the firms. There is scope for further work in this area.

Accounting for Production Environment
Technical efficiency is influenced by exogenous factors that characterise the environment in which production takes place: government regulation, ownership, education level of the farmer, etc. Non-stochastic environmental variables: in this case the predicted firm-level technical efficiency levels will vary with traditional inputs and environmental variables. Inefficiency effects model (Battese and Coelli, 1995): $u_i \sim N^+(z_i'\delta, \sigma_u^2)$, where $z_i$ is a vector of environmental variables and $\delta$ is a vector of parameters to be estimated. In the FRONTIER program, this is the TE effects model.

Current research: there is scope to conduct a lot of research in this area. Some areas where work is being done, and still needs to be done, are: modelling movements in inefficiency over time, incorporating exogenous factors driving inefficiency effects; the possibility of covariance between the random disturbance and the inefficiency term; endogeneity due to possible effects of inefficiency and technical change on the choice of input variables; and modelling risk in efficiency estimation and interpretation.

Current research
We have seen how technical efficiency can be computed, but it is difficult to compute standard errors. Peter Schmidt and his colleagues have been working on a number of related topics here: bootstrap estimators and confidence intervals for efficiency levels in SF models with panel data; testing whether technical inefficiency depends on firm characteristics; and the distribution of inefficiency effects under different assumptions. Bayesian estimation of stochastic frontier models: posterior distribution of technical efficiencies; estimation of distance functions.

Application to Residential Aged Care Facilities
Residential aged care is a multi-billion dollar industry in Australia, considered even more important in view of an ageing population. Aged care facilities are funded by the Commonwealth government and are run by local government, community/religious and private organisations. Efficiency of residential aged care facilities is considered quite important in view of reducing costs. CEPA conducted a study for the Commonwealth Department of Health and Aged Care. Data were collected by the Department from the residential aged care facilities.

Application to Residential Aged Care Facilities
Data: 912 aged care facilities; 30% response rate (response bias); data validation; actual observations used: 787. Methods: DEA and SFA methods; peeled DEA.

Application to Residential Aged Care Facilities
Preferred model. Output variables: high care weighted bed days; low care weighted bed days. Input variables: floor area in square metres; labour costs; other costs.

General findings: average technical efficiency was calculated, along with the percentage cost savings that could be achieved if all ACFs operated on the frontier. Results were consistent between the SFA and DEA models. Variation was found across ACFs in different states (from a low of 0.79 in Victoria to 0.87 in NSW/ACT). Privately run ACFs have a higher mean TE of 0.89. An average scale efficiency of 0.93 was found. An average size of 30 to 60 beds was found to be near optimal scale. A second-stage Tobit model was run to see the factors driving inefficiency. Potential cost savings were calculated to be \$316 m for the number of firms in the sample, with greater savings for the industry as a whole. Economies of scope were examined using a newly developed methodology, but no significant economies were found.
