Model Risk – sources and some examples Tony Bellotti Department of Mathematics Imperial College London
Model development A highly simplified model development framework: Model development, then Model Use. In this framework, once the model is developed, we then think of it as correct. However, the model is only an approximation to reality.
Thinking about model risk Do you factor in the uncertainty of your model when you use it? [Diagram labels: Model development, Model Risk, Measure, Assessment, Model Use] Firstly, we need to understand the sources of model risk and how to measure those risks. Secondly, the consequences of using the model need to be assessed in light of the model risks, prior to use.
Does model risk matter? But… does model risk really matter? Does it make a substantial difference in the real world? “The reliance on models to handle risk carries its own risk” * In securities markets, where complex pricing models are used, there is such a thing as model arbitrage, where a trader will take advantage of known errors in model structure or implementation to make money. So there is a genuine effect. * If this happens in retail credit, perhaps it could lead to adverse selection (eg pricing a loan below the true risk level of the borrower). * Emanuel Derman (1996), Model Risk, Goldman Sachs Quantitative Strategies Research Notes
What about model risk in retail credit? But retail credit employs relatively simple models, so perhaps there is no problem?.... But model complexity is not the only source of model risk (although it is an important one for pricing models). In the following slides I will consider several possible sources of model risk. Note: This is not an exhaustive list and also there is some overlap between the various categories. Later, I give some examples from retail finance to illustrate when there could be model risk issues.
Sources of model risk Statistical:- »Model misspecification »Model efficiency/inefficiency »Data problems and selection bias »Robustness over time »Inappropriate use Other/management:- »Model development resources (analysts/time) »Publication, implementation and software error We consider only the statistical sources of model risk.
Model misspecification (1) Model structure »Do we have the correct general model structure to model the data? »In the past, it was common to use OLS. Now it is standard to use logistic regression. Perhaps now we can ask if logit is the correct link function? »Is the basic linear scorecard correct? Is a nonlinear structure more appropriate? Model assumptions: what are they and are we breaking them? »Distributions on error terms (eg normality for OLS). »Independence for observations in standard logistic regression. Is this really true in retail credit?
Model misspecification (2) Inclusion of variables. »Too few variables may lead to biassed estimates. »Too many will lead to less efficient estimates and, hence, less robust models. Variable transformations (to log or not to log?). »With some variables like income, it is “standard” to take log. »What about others? Age, eg? »Some modellers use all weights-of-evidence – is this appropriate? Multicollinearity. »Where predictor variables are themselves highly correlated, this can lead to inefficient or wrong estimates (in particular, it can lead to the wrong sign).
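The multicollinearity point can be illustrated with a variance inflation factor (VIF), a common diagnostic that the slides do not themselves mention. This is a minimal sketch for the two-predictor case only, where VIF reduces to 1 / (1 - r^2) with r the Pearson correlation. All variable names and data values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x, y):
    """VIF when a model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)

# Hypothetical applicant data (illustrative values, not from the talk).
income = [20, 25, 31, 38, 44, 52, 60, 75]
credit_limit = [2.1, 2.4, 3.2, 3.7, 4.6, 5.0, 6.3, 7.4]  # tracks income closely
age = [23, 45, 31, 52, 28, 39, 61, 35]                   # weakly related to income

vif_income_limit = vif_two_predictors(income, credit_limit)
vif_income_age = vif_two_predictors(income, age)
print(f"VIF income vs credit limit: {vif_income_limit:.1f}")
print(f"VIF income vs age:          {vif_income_age:.2f}")
```

A large VIF for a pair of predictors signals that their coefficient estimates will have inflated variance, which is one route to the unstable or wrong-signed estimates mentioned above. A commonly quoted rule of thumb flags values above roughly 5 to 10.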
Model efficiency/inefficiency Every model is inaccurate and every estimate is just that: an estimate. Fortunately, most statistical models provide a measure of the accuracy of estimates (ie the standard errors). »This is not true of all models (eg standard linear discriminant analysis and machine learning algorithms) – although it’s always possible to bootstrap. »Remember though that the accuracy of the standard errors themselves can be suspect and is dependent on following model assumptions (or relying on model robustness).
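The remark that it is always possible to bootstrap can be sketched as follows. This is a minimal illustration, assuming a simple nonparametric bootstrap of a portfolio default rate. The data (60 defaults in 1,000 accounts), the function names and the number of resamples are all hypothetical choices.

```python
import random
import statistics

def default_rate(outcomes):
    """Fraction of accounts that defaulted (1 = default, 0 = no default)."""
    return sum(outcomes) / len(outcomes)

def bootstrap_se(sample, estimator, n_boot=1000, seed=0):
    """Standard error of `estimator`, estimated by resampling with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(n_boot):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    return statistics.stdev(estimates)

# Hypothetical portfolio: 60 defaults among 1,000 accounts.
outcomes = [1] * 60 + [0] * 940

se_boot = bootstrap_se(outcomes, default_rate)
# Analytic binomial standard error, for comparison.
se_analytic = (0.06 * 0.94 / 1000) ** 0.5
print(f"bootstrap SE: {se_boot:.4f}, analytic SE: {se_analytic:.4f}")
```

For a simple proportion the analytic formula is available, so the bootstrap adds nothing new here. The point is that the same `bootstrap_se` function works for estimators that come with no closed-form standard error.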
Data problems and selection bias Is the data appropriate for the modelling task? »Reliability in data collection; eg how reliable is a self-assessment of income? »Or, eg, based on an existing portfolio of predominantly older customers, build a model for a card targeting young customers. »A data set of accepted loan applications, to build a scorecard across all new applications. Of course, the last example is the problem of selection bias. »It is a fairly well understood model risk issue in retail credit. »Several reject inference techniques to handle it: eg parcelling and augmentation.
Robustness over time (1) There are some problem domains where risk factors and distributions on variables are stable over time. In such domains, models remain stable. »For example, mortality scoring models based on physiology of hospital in-patients (eg Apache III) are stable since human physiology does not change much over time. However, consumer credit does not remain stable over time. »Credit risk changes over the business cycle. »Credit usage behaviour changes over time. »Banks’ risk appetite changes over time. »Innovations in technology and product development change risk. All of these time-varying factors affect the applicability of credit risk models over time.
Robustness over time (2) Changes in the effect size of risk factors will have an obvious effect on the applicability of a model. Population drift: Changes in the distribution of predictor or outcome variables can also affect the robustness of the model. Slow versus sudden change (eg economic crisis) can have different effects on the applicability of a model. Possible approaches to dealing with this problem:- »Rebuild models regularly and Champion/challenger environment. »Dynamic models (ie including time-varying factors in the risk model). »Adaptive models.
Model robustness, in general The problem of model robustness over time generalizes to different domains: eg geographic or product type. For example, if we have a credit card product operating in UK, does the same scorecard model apply to Ireland? »How different will it be?
Inappropriate use “In terms of risk control, you’re worse off thinking you have a model and relying on it than in simply realizing there isn’t one.” * A model may be built correctly. However, it may be used for the wrong task. For example, using a default model as the basis of a strategy on customer retention…. Better to build a new model focussed on retention. * Emanuel Derman (1996), Model Risk, Goldman Sachs Quantitative Strategies Research Notes
Consequences of model risk (1) What are the consequences of model risk? Need to measure the effect of model risk on model use: (1) Explanatory model If it is important that the model is used as an explanatory model, then bias and inefficiency in model estimation will be important. Eg for discussion with management and regulators. (2) Forecasting Individual / account level; Aggregate / loss forecasting; Does the flat maximum effect provide some robustness against model bias and inefficiency?
Consequences of model risk (2) (3) Stress testing Predictions of outcome for extreme values. Typically, value-at-risk, expected shortfall, or scenarios. Effects of model risk on stress testing are likely to be different to the effect on standard forecasts. I now give some quick examples of model risk, looking at usage, measurement issues and consequences….
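The two tail measures named above can be sketched empirically. This is a minimal illustration, assuming one common convention: value-at-risk is the empirical quantile of a loss sample, and expected shortfall is the mean of losses at or beyond that quantile. Other quantile conventions exist, and the loss values are hypothetical.

```python
import math

def value_at_risk(losses, level):
    """Empirical VaR: smallest loss with at least `level` of the sample at or below it."""
    ordered = sorted(losses)
    n = len(ordered)
    # Small tolerance guards against floating-point error in level * n.
    k = max(0, math.ceil(level * n - 1e-9) - 1)
    return ordered[k]

def expected_shortfall(losses, level):
    """Mean of the losses that are at or beyond the VaR at the same level."""
    var = value_at_risk(losses, level)
    tail = [x for x in losses if x >= var]
    return sum(tail) / len(tail)

# Hypothetical simulated portfolio losses: 1, 2, ..., 100.
losses = list(range(1, 101))

var_95 = value_at_risk(losses, 0.95)
es_95 = expected_shortfall(losses, 0.95)
print(f"VaR(95%) = {var_95}, ES(95%) = {es_95}")
```

Both measures depend only on the tail of the loss distribution, which is why model error that barely moves an expected-value forecast can still move them substantially.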
Example 1: Misspecification / Misapplication Performance of models for extreme cases * Models work well at estimating expected values for “typical” cases from the population. However, how do they fare when predicting default rates (DR) for extreme cases? In this experiment, a logistic regression model is built for credit card data. DR is then predicted for an independent test set of extreme cases (with respect to variables such as age and job) and compared with observed DR. * Work conducted by Alice Wang as part of her third year undergraduate project.
Example 1: Results We see that these models tend to under- or over-estimate DR for extreme cases. Interestingly, the parsimonious model gives better forecast results. Note: all extreme criteria represent 2% of the test data (N=600).
Variable                          DR          Full model   Parsimonious
Age > 67                          Observed    0.0664
                                  Predicted   0.0722       0.0698
                                  Error       +8.8%        +5.2%
Years in current job > 24         Observed    0.110
                                  Predicted   0.0717       0.0724
                                  Error       -34.6%       -33.9%
Income (log) > 7.84               Observed    0.1034
                                  Predicted   0.1180       0.1148
                                  Error       +14.1%       +11.0%
Years in current residence > 41   Observed    0.1146
                                  Predicted   0.1121       0.1147
                                  Error       -2.5%        +0.1%
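The Error rows look like relative errors of predicted against observed DR. The sketch below assumes the formula (predicted - observed) / observed, which the slides do not state. It reproduces the Income row from the displayed rates; other rows may differ in the last digit because the displayed rates are rounded.

```python
def relative_error(predicted, observed):
    """Signed relative error of a predicted rate against the observed rate."""
    return (predicted - observed) / observed

# Income (log) > 7.84 row, values as displayed on the slide.
observed_dr = 0.1034
predicted_dr = {"Full model": 0.1180, "Parsimonious": 0.1148}

errors_pct = {
    name: round(100 * relative_error(pred, observed_dr), 1)
    for name, pred in predicted_dr.items()
}
for name, err in errors_pct.items():
    print(f"{name}: {err:+.1f}%")
```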
Example 2: Selection bias Simulation study The problem of selection bias in application models is well known and several reject inference methods have been proposed. Unfortunately, in a real world context it is not usually possible to accurately evaluate the extent of the bias, or the effectiveness of a reject inference method, since outcomes for rejects are unknown. However, simulation studies can be used to show the effect. These are valuable to demonstrate the extent of the problem. Here is the result of a simulation study using an augmentation method. In a nutshell, augmentation is a method that weights observations from the accepts; usually according to how typical they are of being accepts, based on an Accept-Reject model.
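One simple form of augmentation can be sketched with bands of applicants. This is an illustration under stated assumptions, not the method used in the simulation study: each accepted account is weighted by the inverse of the acceptance rate in its band, and accepts and rejects in the same band are assumed to default at the same rate. The bands and counts are hypothetical.

```python
# Hypothetical bands: (applicants, accepted, defaults among accepted).
bands = {
    "0 delinquencies": (1000, 900, 27),
    "1-2 delinquencies": (500, 250, 25),
    "3+ delinquencies": (200, 20, 6),
}

def naive_default_rate(bands):
    """Default rate estimated from accepted accounts only (selection-biased)."""
    accepted = sum(acc for _, acc, _ in bands.values())
    defaults = sum(dflt for _, _, dflt in bands.values())
    return defaults / accepted

def augmented_default_rate(bands):
    """Default rate with each accept weighted by 1 / P(accept | band)."""
    weighted_defaults = 0.0
    weighted_accepts = 0.0
    for applicants, accepted, defaults in bands.values():
        weight = applicants / accepted  # inverse of the band's acceptance rate
        weighted_defaults += weight * defaults
        weighted_accepts += weight * accepted
    return weighted_defaults / weighted_accepts

naive = naive_default_rate(bands)
augmented = augmented_default_rate(bands)
print(f"naive: {naive:.4f}, augmented: {augmented:.4f}")
```

Because high-delinquency applicants are rarely accepted, the accepts-only estimate understates the default rate of the full applicant population, and the weighting pulls it back up. In practice the acceptance probabilities would come from a fitted Accept-Reject model rather than raw band counts.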
Example 2 continued This graph shows the distribution is not the same for the accepted population, compared to all. Those with high numbers of delinquencies are under-represented. This affects the model estimation.
Example 2 continued
Model                AUC
S1 (unbiassed)       0.844
S2 (biassed)         0.832
S3 (augmentation)    0.840
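For readers unfamiliar with the metric in this table, AUC is the probability that a randomly chosen defaulter receives a higher risk score than a randomly chosen non-defaulter, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The scores and labels are hypothetical and unrelated to the simulation study.

```python
def auc(scores, labels):
    """AUC by pairwise comparison; labels are 1 (default) or 0 (no default)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical predicted default probabilities and observed outcomes.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]

example_auc = auc(scores, labels)
print(f"AUC = {example_auc}")
```

The pairwise loop is quadratic in the sample size, so it suits small illustrations. Rank-based formulas give the same value more efficiently on large test sets.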
Example 4: Misspecification Using Logit versus Poisson link function In the context of large defaultable bond portfolios, Lucas and Verhoef* experiment with Logit and Poisson link functions. Note: there is a good rationale for using a Poisson link function since default time can be modelled as a Poisson process. How do the models perform in estimating expected loss? * Lucas A and Verhoef B (2012), Aggregating Credit and Market Risk: the Impact of Model Specification, working paper, Tinbergen Institute, VU University Amsterdam
Example 4 continued For two segments, they report these results: hardly any model misspecification problem for Expected Loss estimates… But, importantly, for VaR, Logit underestimates (relative to Poisson). “model specification matters … This is surprising, as the shape of the link function is deemed to be less important for computing capital requirements.” *
Columns: Low quality (Logit, Poisson); Medium quality (Logit, Poisson)
Expected Loss: -2.6 -5.3
VaR (99.9%): 184.108.40.206.9
Example 5: Robustness over time Use of time-varying risk factors for loss forecasting One approach to dealing with changing risk levels over time is to include macroeconomic time series. Survival models are a good way to do this since macroeconomic and behavioural data can be included as time-varying covariates (TVCs). Model time to default as a failure event. Experiment on portfolio of UK credit card data: * »Training data: 400,000 credit cards over period 1999 to 2004. »Forecast for 150,000 credit cards from 2005 to mid-2006. * Bellotti and Crook (2009), Forecasting and stress testing credit card default using dynamic models, working paper, Credit Research Centre, Edinburgh
Example 5: Results Interest rate and unemployment rate are both statistically significant when included in the model. We compare default rate (DR) forecasts between models with application variables (AV) only (eg age, income, employment status, housing status, at application), behavioural variables (BV) and macroeconomic variables (MV). MAD = mean absolute difference between estimated and observed DR. This shows an improvement in aggregate forecasts when macroeconomic data is included in the model.
Model       MAD
AV          0.087
AV+BV       0.058
AV+BV+MV    0.049
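The MAD measure defined above can be written out directly. This is a minimal sketch, assuming the mean is taken over forecast periods. The slides do not state the averaging unit, and the monthly default rates below are hypothetical, not taken from the study.

```python
def mean_absolute_difference(estimated, observed):
    """Mean of |estimated - observed| over paired forecast periods."""
    if len(estimated) != len(observed):
        raise ValueError("estimated and observed must have the same length")
    return sum(abs(e - o) for e, o in zip(estimated, observed)) / len(estimated)

# Hypothetical monthly default rates over four forecast months.
estimated_dr = [0.020, 0.022, 0.025, 0.030]
observed_dr = [0.018, 0.024, 0.028, 0.026]

mad = mean_absolute_difference(estimated_dr, observed_dr)
print(f"MAD = {mad:.5f}")
```

Because the differences are taken in absolute value, over- and under-forecasts do not cancel, so a lower MAD means forecasts that track observed DR more closely period by period.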
Conclusion There is a genuine problem of model risk. We have seen some suggestive examples. We need to understand the sources of model risk. We need to know the consequences of model risk and how to measure it. We need to find ways to manage model risk: develop methods to reduce or control it, and incorporate model risk in our decision making.