Download presentation

Presentation is loading. Please wait.

Published byKorey Ransford Modified about 1 year ago

1
© Deloitte Consulting, 2004 Loss Reserving Using Policy-Level Data James Guszcza, FCAS, MAAA Jan Lommele, FCAS, MAAA Frank Zizzamia CLRS Las Vegas September, 2004

2
© Deloitte Consulting, Agenda Motivations for Reserving at the Policy Level Outline one possible modeling framework Sample Results

3
© Deloitte Consulting, 2004 Motivation Why do Reserving at the Policy Level?

4
© Deloitte Consulting, Basic Motivations 1. Better reserve estimates for a changing book of business Do triangles “summarize away” important patterns? Could adding predictive variables help? 2. More accurate estimate of reserve variability Summarized data require more sophisticated models to “recover” heterogeneity. Is a loss triangle a “sufficient statistic” for ultimate losses & variability?

5
© Deloitte Consulting, (1) Better Reserve Estimates Key idea: use predictive variables to supplement loss development patterns Most reserving approaches analyze summarized loss/claim triangles. Does not allow the use of covariates to predict ultimate losses (other than time-indicators). Actuaries use predictive variables to construct rating plans & underwriting models. Why not loss reserving too?

6
© Deloitte Consulting, Why Use Predictive Variables? Suppose a company’s book of business has been deteriorating for the past few years. This decline might not be reflected in a summarized loss development triangle. However: The resulting change in distributions of certain predictive variables might allow us to refine our ultimate loss estimates.

7
© Deloitte Consulting, Examples of Predictive Variables Claim detail info Type of claim Time between claim and reporting Policy’s historical loss experience Information about agent who wrote the policy Exposure Premium, # vehicles, # buildings/employees… Other specifics Policy age, Business/policyholder age, credit…..

8
© Deloitte Consulting, More Data Points Typical reserving projects use claim data summarized to the year/quarter level. Probably an artifact of the era of pencil-and- paper statistics. In certain cases important patterns might be “summarized away”. In the computer age, why restrict ourselves? More data points less chance of over- fitting the model.

9
© Deloitte Consulting, Danger of over-fitting One well known example Overdispersed Poisson GLM fit to loss triangle. Stochastic analog of the chain-ladder 55 data points 20 parameters estimated parameters have high standard error. How do we know the model will generalize well on future development? Policy-level data: 1000’s of data points

10
© Deloitte Consulting, Out-of-Sample Testing Policy-level dataset has 1,000’s of data points Rather than 55 data points. Provides more flexibility for various out-of- sample testing strategies. Use of holdout samples Cross-validation Uses: Model selection Model evaluation

11
© Deloitte Consulting, (2) Reserve Variability Variability Components Process risk Parameter risk Model specification risk Predictive error = process + parameter risk Both quantifiable What we will focus on Reserve variability should also consider model risk. Harder to quantify

12
© Deloitte Consulting, Reserve Variability Can policy-level data give us a more accurate view of reserve variability? Process risk: we are not summarizing away variability in the data. Parameter risk: more data points should lead to less estimation error. Prediction variance: brute force “bootstrapping” easily combines Process & Parameter variance. Leaves us more time to focus on model risk.

13
© Deloitte Consulting, Disadvantages Expensive to gather, prepare claim-detail information. Still more expensive to combine this with policy-level covariates. More open-ended universe of modeling options (both good and bad). Requires more analyst time, computer power, and specialist software. Less interactive than working in Excel.

14
© Deloitte Consulting, 2004 Modeling Approach Sample Model Design

15
© Deloitte Consulting, Philosophy Provide an example of how reserving might be done at the policy level To keep things simple: consider a generalization of the chain-ladder Just one possible model Analysis is suggestive rather than definitive No consideration of superimposed inflation No consideration of calendar year effects Model risk not estimated etc…

16
© Deloitte Consulting, Notation L j = {L 12, L 24, …, L 120 } Losses developed at 12, 24,… months Developed from policy inception date PY i = {PY 1, PY 2, …, PY 10 } Policy Years 1, 2, …, 10 {X k } = covariates used to predict losses Assumption: covariates are measured at or before policy inception

17
© Deloitte Consulting, Model Design Build 9 successive GLM models Regress L 24 on L 12 ; L 36 on L 24 … etc Each GLM analogous to a link ratio. The L j L j+1 model is applied to either Actual j Predicted values from the L j-1 L j model Predict L j+1 using covariates along with L j.

18
© Deloitte Consulting, Model Design Idea: model each policy’s loss development from period L j to L j+1 as a function of a linear combination of several covariates. Policy-level generalization of the chain-ladder idea. Consider case where there are no covariates

19
© Deloitte Consulting, Model Design Over-dispersed Poisson GLM: Log link function Variance of L j+1 is proportional to mean Treat log(L j ) as the offset term Allows us to model rate of loss development

20
© Deloitte Consulting, Using Policy-Level Data Note: we are using policy-level data. Therefore the data contains many zeros Poisson assumption places a point mass at zero How to handle IBNR Include dummy variable indicating $0 mo Interact this indicator with other covariates. The model will allocate a piece of the IBNR to each policy with $0 loss as of 12 months.

21
© Deloitte Consulting, 2004 Sample Results

22
© Deloitte Consulting, Data Policy Year 1991 – 1 st quarter 2000 Workers Comp 1 record per policy, per year Each record has multiple loss 12, 24, …,120 months j months” means: j months from the policy inception date. Losses coded as “missing” where appropriate e.g., PY 1998 losses at 96 months

23
© Deloitte Consulting, Covariates Historical LR and claim frequency variables $0 12 month indicator Credit Score Log premium Age of Business New/renewal indicator Selected policy year dummy variables Using a PY dummy variable is analogous to leaving that PY out of a link ratio calculation Use sparingly

24
© Deloitte Consulting, Covariates Interaction terms between covariates and the $0-indicator Most covariates only used for the 12 24 GLM For other GLMs only use selected PY indicators These GLMs give very similar results to the chain ladder

25
© Deloitte Consulting, Results

26
© Deloitte Consulting, Comments Policy-Level model produces results very close to chain-ladder. is a proper generalization of the chain-ladder The model covariates are all statistically significant, have parameters of correct sign. In this case, the covariates seem to have little influence on the predictions. Might play more of a role in a book where quality of business changes over time.

27
© Deloitte Consulting, 2004 Model Evaluation Treat Recent Diagonals as Holdout 10-fold Cross-Validation

28
© Deloitte Consulting, Test Model by Holding Out Most Recent 2 Calendar Years

29
© Deloitte Consulting, Cross-Validation Methodology Randomly break data into 10 pieces Fit the 9 GLM models on pieces 1…9 Apply it to Piece 10 Therefore Piece 10 is treated as out-of-sample data Now use pieces 1…8,10 to fit the nine models; apply to piece 9 Cycle through 8 other cases

30
© Deloitte Consulting, Cross-Validation Methodology Fit 90 GLMs in all 10 cross-validation iterations Each involving 9 GLMs Each of the 10 “predicted” pieces will be a 10x10 matrix consisting entirely of out-of- sample predicted values Can compare actuals to predicteds on upper half of the matrix Each cell of the triangle is treated as out-of-sample data

31
© Deloitte Consulting, Cross-Validation Results

32
© Deloitte Consulting, 2004 Reserve Variability Using the Bootstrap to estimate the probability distribution of one’s outstanding loss estimate

33
© Deloitte Consulting, The Bootstrap The Statistician Brad Efron proposed a very simple and clever idea for mechanically estimating confidence intervals: The Bootstrap. The idea is to take multiple resamples of your original dataset. Compute the statistic of interest on each resample you thereby estimate the distribution of this statistic!

34
© Deloitte Consulting, Motivating Example Suppose we take 1000 draws from the normal(500,100) distribution Sample mean ≈ 500 what we expect a point estimate of the “true” mean From theory we know that:

35
© Deloitte Consulting, Sampling with Replacement Draw a data point at random from the data set. Then throw it back in Draw a second data point. Then throw it back in… Keep going until we’ve got 1000 data points. You might call this a “pseudo” data set. This is not merely re-sorting the data. Some of the original data points will appear more than once; others won’t appear at all.

36
© Deloitte Consulting, Sampling with Replacement In fact, there is a chance of (1-1/1000) 1000 ≈ 1/e ≈.368 that any one of the original data points won’t appear at all if we sample with replacement 1000 times. any data point is included with Prob ≈.632 Intuitively, we treat the original sample as the “true population in the sky”. Each resample simulates the process of taking a sample from the “true” distribution.

37
© Deloitte Consulting, Resampling Sample with replacement 1000 data points from the original dataset S Call this S*1 Now do this 399 more times! S*1, S*2,…, S*400 Compute X-bar on each of these 400 samples

38
© Deloitte Consulting, The Result The green bars are a histogram of the sample means of S*1,…, S*400 The blue curve is a normal distribution with the sample mean and s.d. The red curve is a kernel density estimate of the distribution underlying the histogram Intuitively, a smoothed histogram

39
© Deloitte Consulting, The Result The result is an estimate of the distribution of X-bar. Notice that it is normal with mean≈500 and s.d.≈3.2 The purely mechanical bootstrapping procedure produces what theory tells us to expect. Can we use resampling to estimate the distribution of outstanding liabilities?

40
© Deloitte Consulting, Bootstrapping Reserves S = our database of policies Sample with replacement all policies in S Call this S* 1 Same size as S Now do this 199 more times! S* 1, S* 2,…, S* 200 Estimate o/s reserves on each sample Get a distribution of reserve estimates

41
© Deloitte Consulting, Bootstrapping Reserves Compute your favorite reserve estimate on each S* k These 200 reserve estimates constitute an estimate of the distribution of outstanding losses Notice that we did this by resampling our original dataset S of policies. This differs from other analyses which bootstrap the residuals of a model. Perhaps more theoretically intuitive. But relies on assumption that your model is correct!

42
© Deloitte Consulting, Bootstrapping Results Standard Deviation ≈ 5% of total o/s losses 95% confidence interval ≈ (-10%, +10%) Tighter interval than typically seen in the literature. Result of not summarizing away variability info?

43
© Deloitte Consulting, Reserve Dist: All Years

44
© Deloitte Consulting, Reserve Dist: 1992

45
© Deloitte Consulting, Reserve Dist: 1993

46
© Deloitte Consulting, Reserve Dist: 1994

47
© Deloitte Consulting, Reserve Dist: 1995

48
© Deloitte Consulting, Reserve Dist: 1996

49
© Deloitte Consulting, Reserve Dist: 1997

50
© Deloitte Consulting, Reserve Dist: 1998

51
© Deloitte Consulting, Reserve Dist: 1999

52
© Deloitte Consulting, Reserve Dist: 2000

53
© Deloitte Consulting, Interpretation This result suggests: With 95% confidence, the total o/s losses will be within +/- 10% of our estimate. Assumes model is correctly specified. Too good to be true? Yes: doesn’t include model risk! Tighter confidence than often seen in the literature. Bootstrapping a model can be tricky. However: we are using 1000s of data points! We’re not throwing away heterogeneity info.

54
© Deloitte Consulting, Closing Thoughts The +/- 10% result is only suggestive Applies to this specific data set. Bootstrapping methodology can be refined. Suggestion: using policy-level data can yield tighter confidence intervals Doesn’t throw away information pertaining to process & parameter risk. Bootstrapping is conceptually simple. Requires more computer power than brain power! Leaves us more time to focus on model risk.

Similar presentations

© 2016 SlidePlayer.com Inc.

All rights reserved.

Ads by Google