
1 Types of Scoring

2 Credit Scoring
Estimate whether the customer will pay or not pay based upon various information
–Applicant characteristics
–Credit bureau information
–Repayment behaviour of other customers
Develop models (scorecards) estimating the probability of default, p(Default)
Typically assign points to each piece of information, add up all the points and compare the total with a threshold

3 Judgemental vs. Statistical
Judgemental
–Based upon experience
–The 5 C's: Character, Capital, Collateral, Capacity, Condition
–ID, Ability, Intent
Statistical
–Based upon multivariate correlations between the inputs and the risk of default
Both assume that the future will resemble the past

4 Types of Credit Scoring
Judgemental (qualitative credit scoring)
Application scoring
Behavioural scoring
Profit scoring
Bankruptcy prediction
Fraud prediction

5 Application Scoring
Estimate the probability of default at the time the customer applies for the loan
Use a predetermined definition of default
–E.g. 90 days past due
Application variables versus credit reference agency variables
Snapshot to snapshot
Static

6 Example Application Scorecard
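The scorecard image from this slide is not in the transcript, so the following stands in for it: a hedged, hypothetical sketch of a points-based application scorecard in which each attribute carries points, the points are added up and the total is compared with a cut-off (as described on slide 2). Every characteristic, point value and the cut-off below is invented for illustration.

    # Hypothetical points-based application scorecard; all values are made up.
    scorecard = {
        "age": {"18 to 22": 10, "23 to 29": 18, "30 to 44": 30, "44 plus": 38},
        "residential_status": {"renter": 12, "living with parents": 15, "owner": 28},
        "purpose_of_loan": {"car": 20, "furniture": 14, "other": 10},
    }
    CUTOFF = 65  # assumed accept/reject threshold

    def score(applicant: dict) -> tuple:
        """Add up the points for each characteristic and compare with the cut-off."""
        total = sum(scorecard[char][value] for char, value in applicant.items())
        return total, total >= CUTOFF

    print(score({"age": "30 to 44", "residential_status": "owner", "purpose_of_loan": "car"}))
    # -> (78, True): 30 + 28 + 20 = 78 points, which clears the assumed cut-off of 65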

7 Behavioural Scoring
Existing customers
–Already have the credit
–Already have products (hybrid)
Update the risk assessment to take into account the customer's recent behaviour
Uses include
–Capital
–Credit limits for revolving credit
Video clip to snapshot
Dynamic

8 Behavioural Scoring (Cont.)
Estimate future defaults of a given portfolio of customers
Debt provisioning and profit scoring
Many, many variables
–Input selection
Behavioural scoring can be used for
–Authorising accounts for 'special' treatment
–Setting credit limits
–Renewals / revivals
–Collections strategies

9 Bankruptcy Prediction
Binary approach
–Predict bankruptcy versus non-bankruptcy given ratios describing the financial status of companies
Multiclass approach
–Assign ratings (e.g. AAA, AA, A, BBB, …) to reflect creditworthiness
–Each rating corresponds to a default probability (PD)

10 Developing a Rating-Based System
Application and behavioural scoring models provide a ranking of customers according to risk
This was 'OK' in the past (e.g. for loan approvals by banks), but Basel II required 'well calibrated default probabilities'
Map the scores (or probabilities) of customers to a number of distinct borrower grades / pools (see the sketch below)
Decide upon the number of classes and their definition
Impact upon regulatory capital!
Classes should be sufficiently discriminatory and stable (migration matrix)
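As a sketch of the mapping step, the snippet below bins estimated default probabilities into a small number of borrower grades and reports a pool-level average PD per grade. The grade boundaries, labels and example PDs are assumptions made up for illustration, not values from the slides or from Basel II.

    import pandas as pd

    # Illustrative PD cut-offs for borrower grades (hypothetical values).
    grade_bounds = [0.0, 0.005, 0.01, 0.03, 0.07, 0.15, 1.0]
    grade_labels = ["A", "B", "C", "D", "E", "F"]

    pds = pd.Series([0.002, 0.008, 0.02, 0.05, 0.12, 0.30])  # model PDs for six customers
    grades = pd.cut(pds, bins=grade_bounds, labels=grade_labels, include_lowest=True)

    # Pool-level PD = average estimated PD of the exposures assigned to each grade.
    pools = pd.DataFrame({"pd": pds, "grade": grades})
    print(pools.groupby("grade", observed=False)["pd"].mean())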

11 Developing a Rating System
For retail: "For each pool identified, the bank must be able to provide quantitative measures of loss characteristics (PD, LGD and EAD) for that pool. The level of differentiation for IRB purposes must ensure that the number of exposures for a given pool is sufficient to allow for meaningful quantification and validation of the loss characteristics at the pool level. There must be a meaningful distribution of borrowers and exposures across pools. A single pool must not include an undue concentration of the bank's total retail exposure." (paragraph 409 of the Basel Capital Accord)

12 So What?
Basel II rewards with lower regulatory capital requirements (big carrot)
Basel II sets high standards for scorecard development and monitoring (big stick)
Inevitably this means more accurate models and data within the credit industry
–Opportunity for debt agencies to benefit from these improvements
–Challenge for debt agencies to gain the benefits first (and sustain them)

13 Pre-processing Data For Credit Scoring

14
Types of Variables
Sampling
Missing Values
Outlier Detection
Feature Construction and Transformation
Discretisation and Grouping of Attributes
Coding Nominal and Ordinal Variables
Segmentation
Definition of the Target Variable

15 Types of Variables
Continuous
–E.g. income, amount of savings, …
Discrete
–Nominal: e.g. purpose of loan, marital status
–Ordinal: e.g. age encoded as young, middle-aged and old
–Binary: e.g. gender

16 Sampling
The sample of past customers needs to be as similar as possible to current customers
Stratified sampling
Timing of sampling
–How far back do I go?
–Trade-off: more data vs. recent data
Number of 'Bads' versus number of 'Goods'
–Undersampling versus oversampling? (see the sketch below)
Make sure the performance window is long enough for the bad rate to stabilise!
Reject inference
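A minimal sketch of random undersampling of the goods, assuming a pandas DataFrame with a binary bad flag; the column name, the 1:1 ratio and the helper function are illustrative assumptions, not something the slides prescribe.

    import pandas as pd

    def undersample_goods(df: pd.DataFrame, bad_col: str = "bad",
                          ratio: float = 1.0, seed: int = 42) -> pd.DataFrame:
        """Keep all bads and draw `ratio` times as many goods at random."""
        bads = df[df[bad_col] == 1]
        goods = df[df[bad_col] == 0].sample(n=int(len(bads) * ratio), random_state=seed)
        return pd.concat([bads, goods]).sample(frac=1.0, random_state=seed)  # shuffle rows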

17

18 Statistical Methods (Reject Inference)
–Hard cut-off augmentation
–Parcelling
–Nearest neighbour methods
Gain extra information
–CRA: apply to everybody!
Withdrawal inference?

19 Missing Values
Keep, delete or replace
Keep
–The fact that a variable is missing can be important information!
–Encode the variable in a special way
Delete
–When too many values are missing, removing the variable or the observation may be an option
–Horizontally versus vertically missing values
Replace
–Estimate the missing value using imputation procedures
–Mean versus median
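A small sketch of the keep and replace options, assuming a pandas DataFrame with an income column; the column name and the missing-value indicator are illustrative assumptions.

    import pandas as pd

    df = pd.DataFrame({"income": [1000, None, 1300, 2000, None, 1400]})

    # Keep: the fact that the value is missing can itself be informative, so encode it.
    df["income_missing"] = df["income"].isna().astype(int)

    # Replace: impute with the median, which is more robust to outliers than the mean.
    df["income"] = df["income"].fillna(df["income"].median())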

20 Outliers
Typically due to recording / data entry errors (noise)
Types of outliers
–Valid observations (e.g. salary of directors)
–Invalid observations (e.g. Age = -2003)
–Univariate versus multivariate outliers
Detection versus treatment
Detection
–'Orientation reports'
–Histogram, box plot, max, min
Treatment
–Treat as missing
–Truncation (see the sketch below)
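A brief sketch of truncation (clipping extreme values back to chosen percentiles); the 1st and 99th percentile cut-offs are assumptions for illustration.

    import pandas as pd

    def truncate(values: pd.Series, lower_q: float = 0.01, upper_q: float = 0.99) -> pd.Series:
        """Clip values lying outside the chosen percentiles back to the boundary."""
        lo, hi = values.quantile([lower_q, upper_q])
        return values.clip(lower=lo, upper=hi)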

21 Discretisation and Grouping of Attribute Values
Motivation
–Group values of discrete variables for robust analysis
–Interpretation (e.g. a points-based scorecard)
–Create concept hierarchies (group low-level concepts, e.g. raw age data, into higher-level concepts such as 'young', 'middle-aged', 'old')
Also called 'coarse classing' or 'classing'
Methods
–Equal-interval binning
–Equal-frequency binning (histogram equalisation)
–Chi-squared analysis
–Entropy-based discretisation (decision tree)

22 The Binning Method
E.g. consider the attribute 'Income': 1000, 1200, 1300, 2000, 1800, 1400
Equal-interval binning
–Bin width = 500
–Bin 1 (1000 – 1500): 1000, 1200, 1300, 1400
–Bin 2 (1500 – 2000): 1800, 2000
Equal-frequency binning (histogram equalisation)
–2 bins
–Bin 1: 1000, 1200, 1300
–Bin 2: 1400, 1800, 2000
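The same income example can be reproduced with pandas; this is only an illustrative sketch, not code from the original slides.

    import pandas as pd

    income = pd.Series([1000, 1200, 1300, 2000, 1800, 1400])

    # Equal-interval binning: fixed bin width of 500.
    equal_width = pd.cut(income, bins=[1000, 1500, 2000], include_lowest=True)

    # Equal-frequency binning: two bins with (roughly) the same number of observations.
    equal_freq = pd.qcut(income, q=2)

    print(pd.DataFrame({"income": income, "equal_width": equal_width, "equal_freq": equal_freq}))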

23 Recoding Nominal / Ordinal Variables
Discrete variables
–Dummy encoding
–Weights of Evidence (WoE) coding

24 Dummy Coding
Drop one dummy (use it as the reference category) because the full set of dummies is perfectly correlated
Purpose, job, marital status, residential status, …
Many dummies! Explosion of the data set!
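A short sketch of dummy coding with pandas; the categories are made up, and drop_first removes the reference dummy to avoid the perfect correlation mentioned above.

    import pandas as pd

    df = pd.DataFrame({"purpose": ["car", "furniture", "education", "car"]})

    # One column per category, minus the dropped reference category.
    dummies = pd.get_dummies(df["purpose"], prefix="purpose", drop_first=True)
    print(dummies)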

25 Weights of Evidence (WoE)
Measures the strength of each (grouped) attribute in separating goods and bads
The higher the WoE, the less risky the attribute
WoE(attribute) = ln( p_good(attribute) / p_bad(attribute) )
–where p_good(attribute) = number of goods with the attribute / total number of goods
–and p_bad(attribute) = number of bads with the attribute / total number of bads
If p_good(attribute) > p_bad(attribute) then WoE > 0
If p_good(attribute) < p_bad(attribute) then WoE < 0

26 Weight of Evidence
Age Interval | Obs   | % Obs  | Goods | % Good | Bads | % Bad  | Bad Rate | WoE
Missing      |    50 |  2.50% |    42 |  2.33% |    8 |  4.12% | 0.16     |  -57.28
18 to 22     |   200 | 10.00% |   152 |  8.42% |   48 | 24.74% | 0.24     | -107.83
23 to 26     |   300 | 15.00% |   246 | 13.62% |   54 | 27.84% | 0.18     |  -71.47
27 to 29     |   450 | 22.50% |   405 | 22.43% |   45 | 23.20% | 0.10     |   -3.38
30 to 35     |   500 | 25.00% |   475 | 26.30% |   25 | 12.89% | 0.05     |   71.34
35 to 44     |   350 | 17.50% |   339 | 18.77% |   11 |  5.67% | 0.03     |  119.71
44 plus      |   150 |  7.50% |   147 |  8.14% |    3 |  1.55% | 0.02     |  166.08
Total        | 2,000 |        | 1,806 |        |  194 |        | 0.10     |
Information Value: 0.650

27 Information Value (IV)
The Information Value is a measure of the predictive value of a characteristic. It is used to
–Judge the appropriateness of the classing
–Select predictive characteristics
The IV is similar to entropy:
IV = Σ ( p_good(attribute) − p_bad(attribute) ) × WoE(attribute)
In practice an arbitrary cut-off is used to filter out variables
It is a univariate analysis
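As a check on the age example above, the sketch below recomputes WoE and IV from the counts in the table (the slide's table reports WoE scaled by 100, while the IV uses the unscaled natural-log values). This reconstruction is illustrative and not code from the original deck.

    import numpy as np
    import pandas as pd

    age = pd.DataFrame({
        "interval": ["Missing", "18 to 22", "23 to 26", "27 to 29", "30 to 35", "35 to 44", "44 plus"],
        "goods":    [42, 152, 246, 405, 475, 339, 147],
        "bads":     [8, 48, 54, 45, 25, 11, 3],
    })

    p_good = age["goods"] / age["goods"].sum()   # distribution of goods over the attributes
    p_bad = age["bads"] / age["bads"].sum()      # distribution of bads over the attributes

    age["woe"] = np.log(p_good / p_bad) * 100    # scaled by 100, as in the slide's table
    iv = ((p_good - p_bad) * np.log(p_good / p_bad)).sum()

    print(age)           # WoE runs from about -108 (18 to 22) to +166 (44 plus)
    print(round(iv, 3))  # ≈ 0.650, matching the slide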

28 Segmentation
When more than one scorecard is required
Build a scorecard for each segment separately
Based upon expert knowledge or based upon statistics
Beware not to over-segment!

29 Definitions of Bad
30 / 60 / 90 days past due
Charge-off / write-off
Bankrupt
Claim over £x
Profit based
Negative NPV
Less than x% of the amount owed collected
Fraud over £500
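A minimal sketch of turning one such definition (ever 90+ days past due within the performance window) into a binary target; the DataFrame and column names are assumptions for illustration.

    import pandas as pd

    # Hypothetical performance data: worst delinquency (days) per account in the window.
    perf = pd.DataFrame({"account_id": [1, 2, 3], "max_days_past_due": [0, 45, 120]})

    # Bad = ever 90 or more days past due within the performance window.
    perf["bad"] = (perf["max_days_past_due"] >= 90).astype(int)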

30 Building Regression Scorecards

31 Linear Regression for Classification
Use Ordinary Least Squares (OLS) regression to estimate Y, where Y = 1 if good payer and Y = 0 if bad payer
Statistical problems
–Residuals are not normally distributed
–Residuals have unequal variances
Predicted P can be > 1 and < 0! (see the sketch below)
Regression ≈ discriminant analysis
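A small sketch illustrating the last problem: fitting OLS to a 0/1 target can give fitted values below 0 and above 1. The numbers are made up purely for illustration.

    import numpy as np

    # Made-up score-like feature and 0/1 repayment outcome (1 = good payer).
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

    # OLS fit of y = a + b*x via least squares.
    X = np.column_stack([np.ones_like(x), x])
    a, b = np.linalg.lstsq(X, y, rcond=None)[0]

    print(a + b * np.array([0.0, 10.0]))  # fitted "probabilities" of about -0.4 and 2.2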

32 Logistic Regression
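The transcript stops at this slide title. Purely as a hedged sketch of what fitting a logistic-regression scorecard might look like (scikit-learn, with made-up WoE-style features; none of the data or settings come from the original deck):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Made-up WoE-coded characteristics for 1,000 applicants; y = 1 flags a bad payer.
    X = rng.normal(size=(1000, 3))
    true_logit = -2.0 + X @ np.array([-1.0, -0.5, 0.8])
    y = (rng.random(1000) < 1.0 / (1.0 + np.exp(-true_logit))).astype(int)

    model = LogisticRegression()
    model.fit(X, y)

    # Estimated p(Default) is bounded in (0, 1), unlike the OLS fit on the previous slide.
    pd_estimates = model.predict_proba(X)[:, 1]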

