Presentation on theme: "Predictive Modeling for Property-Casualty Insurance"— Presentation transcript:
1 Predictive Modeling for Property-Casualty Insurance
James Guszcza, FCAS, MAAA
Peter Wu, FCAS, MAAA
SoCal Actuarial Club, LAX
September 22, 2004
2 Predictive Modeling: 3 Levels of Discussion
Strategy: profitable growth; retain the most profitable policyholders
Methodology: model design (actuarial); the modeling process
Technique: GLM vs. decision trees vs. neural nets…
3 Methodology vs. Technique
How does data mining need actuarial science? Variable creation, model design, model evaluation.
How does actuarial science need data mining? Advances in computing and modeling techniques; ideas from other fields can be applied to insurance problems.
4 Semantics: DM vs. PM
One connotation: Data Mining (DM) is about knowledge discovery in large industrial databases.
Data exploration techniques (some brute force), e.g. discover the strength of credit variables.
Predictive Modeling (PM) applies statistical techniques (like regression) after the knowledge discovery phase is completed.
Quantify & synthesize relationships found during knowledge discovery, e.g. build a credit model.
6 Bay Area Baseball
In 1999 Billy Beane (general manager of the Oakland Athletics) found a novel use of data mining.
Not a wealthy team: ranked 12th (out of 14) in payroll. How to compete with rich teams?
Beane hired a statistics whiz to analyze statistics advocated by baseball guru Bill James.
Beane was able to hire excellent players undervalued by the market.
A year after Beane took over, the A's ranked 2nd!
7 Implication
Beane quantified how well a player would do: not perfectly, just better than his peers.
Implication: be on the lookout for fields where an expert is required to reach a decision by judgmentally synthesizing quantifiable information across many dimensions. (Sound like insurance underwriting?)
Maybe a predictive model can beat the pro.
8 Example
Who is worse?... and by how much?
A 20-year-old driver with 1 minor violation who pays his bills on time and was written by your best agent
A mature driver with a recent accident who has paid his bills late a few times
Unlike the human, the algorithm knows how much weight to give each dimension…
Classic PM strategy: build underwriting models to achieve profitable growth.
9 Keeping Score
Billy Beane ↔ CEO who wants to run the next Progressive
Beane's scouts ↔ Underwriter
Potential team member ↔ Potential insured
Bill James' stats ↔ Predictive variables, old or new (e.g. credit)
Billy Beane's number cruncher ↔ You! (or people on your team)
11 Three Concepts
Scoring engines: a "predictive model" by any other name…
Lift curves: how much worse than average are the policies with the worst scores?
Out-of-sample tests: how well will the model work in the real world? An unbiased estimate of predictive power.
12 Classic Application: Scoring Engines
Scoring engine: a formula that classifies or separates policies (or risks, accounts, agents…) into profitable vs. unprofitable, retaining vs. non-retaining, …
A (non-)linear function f( ) of several predictive variables
Produces a continuous range of scores: score = f(X1, X2, …, XN)
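As a minimal sketch of the scoring-engine idea, the function f( ) below is a simple linear combination; the variable names and weights are illustrative assumptions, not values from the presentation.

```python
# A scoring engine: score = f(X1, X2, ..., XN).
# Here f is a plain linear combination; in practice f may be non-linear.

def score(x, weights, intercept=0.0):
    """Combine predictive variables into a single continuous score."""
    return intercept + sum(w * xi for w, xi in zip(weights, x))

# Hypothetical policy: [driver_age, prior_violations, credit_percentile]
policy = [20, 1, 0.85]
weights = [-0.02, 0.50, -1.00]   # illustrative weights, not fitted values

s = score(policy, weights, intercept=1.0)
```

The point of the slide survives the simplification: the engine maps each policy to one number on a continuous scale, so policies can be ranked and segmented.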
13 What "Powers" a Scoring Engine?
score = f(X1, X2, …, XN)
The X1, X2, …, XN are as important as the f( )! This is why actuarial expertise is necessary.
A large part of the modeling process consists of variable creation and selection; it is usually possible to generate 100's of variables.
This is the steepest part of the learning curve.
14 Model Evaluation: Lift Curves
Sort the data by score and break the dataset into 10 equal pieces.
Best "decile": lowest score → lowest LR
Worst "decile": highest score → highest LR
Difference: "lift"
Lift = segmentation power; lift → ROI of the modeling project
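The decile procedure above can be sketched in a few lines; the data below is synthetic, constructed so that the score happens to rank risk perfectly.

```python
# Decile lift: sort policies by score, cut into 10 equal pieces,
# and compute the loss ratio (losses / premiums) of each piece.

def decile_loss_ratios(scores, losses, premiums, n_bins=10):
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    size = len(order) // n_bins
    ratios = []
    for d in range(n_bins):
        idx = order[d * size:(d + 1) * size]
        ratios.append(sum(losses[i] for i in idx) / sum(premiums[i] for i in idx))
    return ratios

# Synthetic book of 100 policies: higher score, higher losses.
scores   = [i / 100 for i in range(100)]
losses   = [50 + i for i in range(100)]
premiums = [100.0] * 100

ratios = decile_loss_ratios(scores, losses, premiums)
lift = ratios[-1] - ratios[0]   # worst-decile LR minus best-decile LR
```

On this synthetic data the best decile runs a 54.5% loss ratio and the worst 144.5%, so the "lift" is 90 points of loss ratio: exactly the kind of spread the slide equates with segmentation power.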
15 Out-of-Sample Testing
Randomly divide the data into 3 pieces: training data, test data, validation data.
Use the training data to fit models; score the test data to create a lift curve.
Perform the train/test steps iteratively until you have a model you're happy with.
During this iterative phase, the validation data is set aside in a "lock box."
Once the model has been finalized, score the validation data and produce a lift curve: an unbiased estimate of future performance.
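The random three-way split can be sketched with the standard library alone; the 60/20/20 proportions are an assumption, since the presentation does not specify sizes.

```python
import random

def train_test_validate_split(data, seed=0, frac=(0.6, 0.2, 0.2)):
    """Randomly divide data into training, test, and validation pieces."""
    rng = random.Random(seed)        # fixed seed for reproducibility
    shuffled = data[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(frac[0] * n)
    n_test = int(frac[1] * n)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validate = shuffled[n_train + n_test:]   # the "lock box"
    return train, test, validate

policies = list(range(1000))
train, test, validate = train_test_validate_split(policies)
```

The validation slice is produced once and then left untouched while models are iterated on the train/test pieces; only the finalized model is ever scored against it.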
16 Comparison of Techniques
Models built to detect whether an email message is really spam.
"Gains charts" from several models, analogous to lift curves; good for a binary target.
All techniques work OK!
Good variable creation is at least as important as modeling technique.
17 Credit Scoring is an Example
All of these concepts apply to credit scoring:
Knowledge discovery in databases (KDD)
Scoring engine
Lift curve evaluation translates to LR improvement → ROI
Blind-test validation
Credit scoring has been the insurance industry's segue into data mining.
19 Data Sources
Company's internal data: policy-level records, loss & premium transactions, agent database, billing, VIN, …
Externally purchased data: credit, CLUE, MVR, census, …
20 The Predictive Modeling Process
Early: variable creation
Middle: data exploration & modeling
Late: analysis & implementation
21 Variable Creation
Research possible data sources; extract/purchase the data.
Check the data for quality (QA): messy! (still deep in the mines)
Create predictive and target variables: an opportunity to quantify tribal wisdom… and come up with new ideas.
Can be a very big task, and the steepest part of the learning curve.
22 Types of Predictive Variables
Behavioral: historical claim, billing, credit, …
Policyholder: age/gender, # employees, …
Policy specifics: vehicle age, construction type, …
Territorial: census, weather, …
23 Data Exploration & Variable Transformation
1-way analyses of the predictive variables: exploratory data analysis (EDA), data visualization
Use EDA to cap / transform predictive variables: extreme values, missing values, etc.
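The capping and missing-value transformations mentioned above can be sketched as below; the cap point (25) and the median fill rule are illustrative choices, not from the presentation.

```python
# EDA-driven variable transformations: fill missing values, then
# cap extreme values into a plausible range.

def fill_missing(values, fill):
    """Replace None (missing) entries with a chosen fill value."""
    return [fill if v is None else v for v in values]

def cap(values, lower, upper):
    """Clip extreme values into [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]

# Hypothetical vehicle-age field: 45 looks like a data error,
# and two records are missing the value entirely.
vehicle_ages = [1, 3, None, 45, 7, None, 2]

filled = fill_missing(vehicle_ages, fill=3)   # 3 = rough median, illustrative
clean = cap(filled, lower=0, upper=25)
```

Filling before capping matters here: `min`/`max` would fail on a `None`, and in general the order of transformations is itself a modeling decision informed by the 1-way analyses.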
24 Multivariate Modeling
Examine correlations among the variables; weed out redundant, weak, or poorly distributed variables.
Model design
Build candidate models: regression/GLM, decision trees/MARS, neural networks
Select the final model
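The "weed out redundant variables" step can be sketched as a simple correlation screen; the 0.95 threshold and the variable names are illustrative assumptions.

```python
# Correlation screen: keep variables in order of preference and drop
# any variable highly correlated with one already kept.

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def drop_redundant(variables, threshold=0.95):
    kept = {}
    for name, values in variables.items():
        if all(abs(pearson(values, kv)) < threshold for kv in kept.values()):
            kept[name] = values
    return list(kept)

variables = {
    "credit_score":      [1, 2, 3, 4, 5],
    "credit_score_copy": [2, 4, 6, 8, 10],  # perfectly correlated duplicate
    "driver_age":        [5, 1, 4, 2, 3],
}
selected = drop_redundant(variables)
```

A real screen would also look at weakness and distribution (the slide's other criteria), but pairwise correlation is the standard first pass at redundancy.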
25 Building the Model
Pare down the collection of predictive variables to a manageable set.
Iterative process: build candidate models on "training data," evaluate on "test data."
Many things to tweak: different target variables, different predictive variables, different modeling techniques, # of NN nodes and hidden layers, tree splitting rules…
26 Considerations
Do the signs/magnitudes of the parameters make sense? Are they statistically significant?
Is the model biased for/against certain types of policies? States? Policy sizes?
Does predictive power hold up for large policies?
Continuity: are there small changes in input values that might produce large swings in scores?
Make sure that an agent can't game the system.
27 Model Analysis & Implementation
Perform model analytics: necessary for the client to gain comfort with the model.
Calibrate models: create a user-friendly "scale" (the client dictates).
Implement models: programming skills are critical here.
Monitor performance: distribution of scores over time, predictiveness, usage of the model...
Plan model maintenance.
28 Where Actuarial Science Needs Data Mining: Modeling Techniques
29 The Greatest Hits
Unsupervised (no target variable): clustering; principal components (dimension reduction)
Supervised (predict a target variable): regression; GLM; neural networks; MARS (Multivariate Adaptive Regression Splines); CART (Classification And Regression Trees)
30 Regression and its Relations
GLM relaxes regression's distributional assumptions: logistic regression (binary target), Poisson regression (count target).
MARS & NN: clever ways of automatically transforming and interacting the input variables.
Why: sometimes the "true" relationships aren't linear.
Universal approximators: can model any functional form.
CART is simplified MARS.
31 Neural Net Motivation
Let X1, X2, X3 be three predictive variables: policy age, historical LR, driver age.
Let Y be the target variable: loss ratio.
A NNET model is a complicated, non-linear function φ such that: φ(X1, X2, X3) ≈ Y
35 In more detail…
The NNET model results from substituting the expressions for Z1 and Z2 into the expression for Y.
36 In more detail…
Notice that the expression for Y has the form of a logistic regression. Similarly with Z1 and Z2.
37 In more detail…
You can therefore think of a NNET as a set of logistic regressions embedded in another logistic regression.
38 Universal Approximators
The essential idea: by layering several logistic regressions in this way, we can model any functional form, no matter how many non-linearities or interactions between the variables X1, X2, …, by varying only the # of nodes and training cycles.
NNETs are sometimes called "universal function approximators."
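The "logistic regressions embedded in a logistic regression" picture can be written out directly: two hidden nodes Z1 and Z2, each a logistic function of the inputs, feeding a logistic output. All weights below are illustrative, not trained values.

```python
import math

def logistic(t):
    """The logistic (sigmoid) function used at every node."""
    return 1.0 / (1.0 + math.exp(-t))

def nnet(x1, x2, x3):
    # Hidden layer: each node is itself a logistic regression on X1..X3
    z1 = logistic(0.5 + 1.0 * x1 - 0.5 * x2 + 0.2 * x3)
    z2 = logistic(-0.3 + 0.4 * x1 + 0.9 * x2 - 1.1 * x3)
    # Output layer: a logistic regression on Z1, Z2
    return logistic(-1.0 + 2.0 * z1 + 1.5 * z2)

y = nnet(0.2, 0.7, 0.1)
```

Substituting the Z1 and Z2 expressions into the output line gives the full NNET formula in one shot, which is exactly the substitution described on slide 35; adding nodes or layers is what buys the universal-approximation property.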
39 MARS / CART Motivation
NNETs use the logistic function to combine variables and automatically model any functional form.
MARS uses an analogous clever idea to do the same work: MARS "basis functions."
CART can be viewed as simplified MARS: its basis functions are horizontal step functions.
NNETs, MARS, and CART are all cousins of classic regression analysis.
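The contrast between MARS hinge basis functions and CART's step functions can be sketched as below; the knot location (age 30) and the loss-ratio curve are illustrative assumptions.

```python
# MARS builds models from hinge functions; CART effectively uses
# horizontal step functions at the same kind of split points.

def hinge(x, knot):
    """MARS basis function: zero before the knot, linear after it."""
    return max(0.0, x - knot)

def step(x, knot):
    """CART-style basis function: a horizontal step at the knot."""
    return 1.0 if x > knot else 0.0

# A piecewise-linear fit built from one hinge, e.g. a loss ratio that
# is flat until age 30 and then rises:
def fitted_lr(age):
    return 0.6 + 0.01 * hinge(age, 30)
```

A sum of hinges gives a continuous piecewise-linear surface, while a sum of steps gives the piecewise-constant surface of a regression tree: this is the precise sense in which CART is "simplified MARS."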
40 References
For beginners: Data Mining Techniques, by Michael Berry & Gordon Linoff
For mavens: The Elements of Statistical Learning, by Trevor Hastie, Robert Tibshirani, & Jerome Friedman