Lecture 7: Operational Risk Measurement Created by Bambang Hermanto,Ph.D.

1 Lecture 7: Operational Risk Measurement Created by Bambang Hermanto,Ph.D

2 Operational Risk Measurement: Aggregate Loss Models; Convolution Concepts and Application in Loss Distribution; Comparison of Convolution Methods; TVaR for Aggregate Losses

3 Aggregate Loss Models

4 The purpose of this session is to develop models of aggregate losses, the total amount of all losses occurring in a fixed time period. We can represent the aggregate losses as the sum, \(S\), of a random number, \(N\), of individual loss amounts \((X_1, X_2, \ldots, X_N)\). Hence, \(S = X_1 + X_2 + \cdots + X_N\), \(N = 0, 1, 2, \ldots\), where it is understood that \(S = 0\) when \(N = 0\). The distribution of \(S\) is obtained from the distribution of \(N\) and the distribution of the \(X_j\)s. Using this approach, the frequency and the severity of losses are modeled separately.

5 Aggregate Loss Models The information about these distributions is used to obtain information about \(S\). An alternative to this approach is to simply gather information about \(S\) (e.g., total losses each month for a period of months) and to use some continuous distribution to model the distribution of \(S\). Modeling the distribution of \(N\) and the distribution of the \(X_j\)s separately has some distinct advantages: (1) When the expected number of operational risk losses changes as the company grows, growth needs to be accounted for in forecasting the number of operational risk losses in future years based on past years' data. (2) The effects of general economic inflation may need to be reflected in the sizes of losses that are subject to inflationary pressures.

6 Aggregate Loss Models (3) The impact of changing limits that result from covering excess losses with insurance as a risk mitigation strategy is more easily studied. This is done by changing the specification of the severity distribution. (4) The impact on loss frequencies of changing thresholds for small losses is better understood. (5) Data that are heterogeneous in terms of thresholds and limits can be combined to obtain the hypothetical loss size distribution. This is useful when data from several years are combined.

7 Aggregate Loss Models (6) The shape of the distribution of \(S\) depends on the shapes of the distributions of both \(N\) and \(X\). For example, if the severity distribution has a much heavier tail than the frequency distribution, the shape of the tail of the distribution of aggregate losses will be determined by the severity distribution and will be insensitive to the choice of frequency distribution. In summary, a more accurate and flexible model can be constructed by examining frequency and severity separately.

8 Aggregate Loss Models Note: We will refer to \(N\) as the loss count (or frequency) random variable and to its distribution as the loss count (or frequency) distribution. The expression "number of losses" will also be used. The \(X_j\)s are the individual or single loss (or severity) random variables. Finally, \(S\) is the aggregate loss random variable or the total loss random variable.

9 Operational Risk Measurement: Aggregate Loss Models; Convolution Concepts and Application in Loss Distribution; Comparison of Convolution Methods; TVaR for Aggregate Losses

10 The Compound Model For Aggregate Losses Let \(S\) denote aggregate losses associated with a set of \(N\) observed losses \(X_1, X_2, \ldots, X_N\) satisfying the following independence assumption: given that there are \(n\) losses, the loss sizes are mutually independent random variables whose common distribution does not depend on \(n\). The approach is to: 1. Develop a model for the distribution of \(N\) based on data. 2. Develop a model for the common distribution of the \(X_j\)s based on data. 3. Using these two models, carry out the necessary calculations to obtain the distribution of \(S\).

11 The Compound Model For Aggregate Losses Completion of the first two steps follows the ideas in earlier chapters. We now presume that these two models are developed and that we only need to carry out numerical work in obtaining solutions to problems associated with the distribution of S.

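12 The Compound Model For Aggregate Losses The random sum \(S = X_1 + X_2 + \cdots + X_N\) (where \(N\) has a counting distribution) has distribution function \(F_S(x) = \Pr(S \le x) = \sum_{n=0}^{\infty} p_n \Pr(S \le x \mid N = n) = \sum_{n=0}^{\infty} p_n F_X^{*n}(x)\) (1), where \(F_X(x) = \Pr(X \le x)\) is the common distribution function of the \(X_j\)s and \(p_n = \Pr(N = n)\).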

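13 The Compound Model For Aggregate Losses In equation 1, \(F_X^{*n}(x)\) is the "n-fold convolution" of the cdf of \(X\). It can be obtained as: \(F_X^{*0}(x) = 0\) for \(x < 0\) and \(F_X^{*0}(x) = 1\) for \(x \ge 0\), and: \(F_X^{*k}(x) = \int_{-\infty}^{\infty} F_X^{*(k-1)}(x - y)\,dF_X(y)\) for \(k = 1, 2, \ldots\) (2)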

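14 The Compound Model For Aggregate Losses If \(X\) is a continuous random variable with probability zero on negative values, equation 2 reduces to \(F_X^{*k}(x) = \int_0^x F_X^{*(k-1)}(x - y) f_X(y)\,dy\) for \(k = 2, 3, \ldots\). For \(k = 1\) this equation reduces to \(F_X^{*1}(x) = F_X(x)\). By differentiating, the pdf is \(f_X^{*k}(x) = \int_0^x f_X^{*(k-1)}(x - y) f_X(y)\,dy\) for \(k = 2, 3, \ldots\).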

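15 The Compound Model For Aggregate Losses In the case of discrete random variables with positive probabilities at 0, 1, 2, ..., equation 2 reduces to: \(F_X^{*k}(x) = \sum_{y=0}^{x} F_X^{*(k-1)}(x - y) f_X(y)\) for \(x = 0, 1, \ldots\) and \(k = 2, 3, \ldots\). The corresponding pf is: \(f_X^{*k}(x) = \sum_{y=0}^{x} f_X^{*(k-1)}(x - y) f_X(y)\) for \(x = 0, 1, \ldots\) and \(k = 2, 3, \ldots\).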

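16 The Compound Model For Aggregate Losses The distribution (in equation 1) is called a compound distribution, and the pf for the distribution of aggregate losses is \(f_S(x) = \sum_{n=0}^{\infty} p_n f_X^{*n}(x)\). The pgf of \(S\) is: \(P_S(z) = E[z^S] = \sum_{n=0}^{\infty} E[z^{X_1 + \cdots + X_n} \mid N = n]\Pr(N = n) = \sum_{n=0}^{\infty} \Pr(N = n)[P_X(z)]^n = P_N[P_X(z)]\) (3), because of the independence of \(X_1, \ldots, X_n\) for fixed \(n\).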

17 The Compound Model For Aggregate Losses The pgf of \(S\) can be written as: \(P_S(z) = P_N[P_M(z)]\), where \(P_N(z)\) and \(P_M(z)\) are the pgfs of what are called the primary and secondary distributions, respectively. The "secondary" distribution plays the role of the loss size distribution. In the case where \(P_N(z) = P_1[P_2(z)]\) (that is, \(N\) itself has a compound distribution), the pgf of aggregate losses is: \(P_S(z) = P_1\{P_2[P_M(z)]\}\).

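18 The Compound Model For Aggregate Losses From equation 3, the moments of \(S\) can be obtained in terms of the moments of \(N\) and the \(X_j\)s. The first three moments are: \(E(S) = \mu'_{S1} = \mu'_{N1}\mu'_{X1}\); \(\mathrm{Var}(S) = \mu_{S2} = \mu'_{N1}\mu_{X2} + \mu_{N2}(\mu'_{X1})^2\); \(E\{[S - E(S)]^3\} = \mu_{S3} = \mu'_{N1}\mu_{X3} + 3\mu_{N2}\mu'_{X1}\mu_{X2} + \mu_{N3}(\mu'_{X1})^3\) (4). Here, the first subscript indicates the appropriate random variable, the second subscript indicates the order of the moment, and the superscript is a prime (') for raw moments (moments about the origin) and is unprimed for central moments (moments about the mean).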

19 The Compound Model For Aggregate Losses The moments can be used on their own to provide approximations for probabilities of aggregate losses by matching the first few model and sample moments. Example 1: The observed mean (and standard deviation) of the number of losses and the individual losses over the past 10 months are 6.7 (2.3) and 179,247 (52,141), respectively. Determine the mean and variance of aggregate losses per month.
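
Example 1 can be checked with a few lines of Python. This is a minimal sketch that assumes the mean and variance formulas of equation 4 take their standard form, E(S) = E(N)E(X) and Var(S) = E(N)Var(X) + Var(N)E(X)^2; the function name `compound_mean_var` is a hypothetical helper, not part of the lecture.

```python
def compound_mean_var(mean_n, var_n, mean_x, var_x):
    """Mean and variance of S = X_1 + ... + X_N (assumed form of equation 4)."""
    mean_s = mean_n * mean_x
    var_s = mean_n * var_x + var_n * mean_x ** 2
    return mean_s, var_s

# Example 1 inputs: frequency 6.7 (sd 2.3), severity 179,247 (sd 52,141)
mean_s, var_s = compound_mean_var(6.7, 2.3 ** 2, 179_247, 52_141 ** 2)
print(f"E(S)   = {mean_s:,.0f}")
print(f"Var(S) = {var_s:.5e}")
```

The resulting values agree with the figures 1.200955 x 10^6 and 1.88180 x 10^11 that the later approximating-distribution example uses.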

20 The Compound Model For Aggregate Losses Example 2 (Illustration of convolution calculations): Suppose individual losses follow the distribution given in the table below (in units of $1,000).

21 The Compound Model For Aggregate Losses Example 2 (cont'd): The probability that the aggregate loss is \(x\) thousand dollars is \(f_S(x) = \sum_{n=0}^{\infty} p_n f_X^{*n}(x)\). Determine the pf of \(S\) up to $21,000. Determine the mean and standard deviation of total losses. The distribution up to amounts of $21,000 is given in the table below.

22

23 The Compound Model For Aggregate Losses Example 2 (cont’d): Formula explanation from the table

24 The Compound Model For Aggregate Losses Example 2 (cont'd): To obtain \(f_S(x)\), each row of the matrix of convolutions of \(f_X(x)\) is multiplied by the probabilities from the row below the table and the products are summed. Using equation 4, the mean and variance of the distribution \(f_S(x)\) can be computed from the moments of \(N\) and \(X\). Hence the aggregate loss has mean $12,580 and standard deviation $7,664. (Why can't the calculations be done from the table above?)
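
The convolution calculation in Example 2 can be sketched in Python. The two tables are not reproduced in this transcript, so the severity and frequency probabilities below are assumed inputs, chosen because they reproduce the stated mean of $12,580 and standard deviation of $7,664; they should not be read as the lecture's actual table. The helper names are hypothetical.

```python
# ASSUMED severity pf on x = 0..10 (units of $1,000); not taken from the transcript
f_x = [0.0, 0.150, 0.200, 0.250, 0.125, 0.075, 0.050, 0.050, 0.050, 0.025, 0.025]
# ASSUMED frequency pf p_n for n = 0..8; not taken from the transcript
p_n = [0.05, 0.10, 0.15, 0.20, 0.25, 0.15, 0.06, 0.03, 0.01]

def convolve(a, b):
    """Discrete convolution of two probability vectors indexed from 0."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def aggregate_pf(f_x, p_n):
    """f_S(x) = sum_n p_n * f_X^{*n}(x), over the full support of S."""
    max_s = (len(f_x) - 1) * (len(p_n) - 1)
    f_s = [0.0] * (max_s + 1)
    conv = [1.0]  # 0-fold convolution: point mass at zero
    for p in p_n:
        for x, c in enumerate(conv):
            f_s[x] += p * c
        conv = convolve(conv, f_x)  # next n-fold convolution
    return f_s

f_s = aggregate_pf(f_x, p_n)
mean_s = sum(x * p for x, p in enumerate(f_s))
sd_s = (sum(x * x * p for x, p in enumerate(f_s)) - mean_s ** 2) ** 0.5
print(round(mean_s, 2), round(sd_s, 3))
```

Under these assumed inputs the support of S extends well beyond 21, which is worth keeping in mind when thinking about the question posed on the slide.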

25 Evaluation of The Aggregate Loss Distribution The computation of the compound distribution function \(F_S(x) = \sum_{n=0}^{\infty} p_n F_X^{*n}(x)\) (5), or the corresponding probability (density) function, is generally not an easy task, even in the simplest of cases. In this section we discuss a number of approaches to numerical evaluation of the right-hand side of equation 5 for specific choices of the frequency and severity distributions as well as for arbitrary choices of one or both distributions.

26 Evaluation of The Aggregate Loss Distribution The approaches are: (1) the approximating distribution method; (2) the direct calculation method; (3) the recursive method; (4) the inversion method (e.g., Fourier transform). The approximating distribution method: this approach is used in the example below, where the method of moments is used to estimate the parameters of the approximating distribution. The advantage of this method is that it is simple and easy to apply.

27 The approximating distribution method However, the disadvantages are significant. First, there is no way of knowing how good the approximation is. Choosing different approximating distributions can result in very different results, particularly in the right-hand tail of the distribution. Of course, the approximation should improve as more moments are used; but after four moments, we quickly run out of distributions! The approximating distribution may also fail to accommodate special features of the true distribution. For example, when the loss distribution is of the continuous type and there is a maximum possible loss (for example, when there is insurance in place that covers any losses in excess of a threshold), the severity distribution may have a point mass (“atom” or “spike”) at the maximum.

28 The approximating distribution method Example: The observed mean (and standard deviation) of the number of losses and the individual losses over the past 10 months are 6.7 (2.3) and 179,247 (52,141), respectively. Using normal and lognormal distributions as approximating distributions for aggregate losses, calculate the probability that losses will exceed 140% of expected costs. That is, \(\Pr[S > 1.40\,E(S)] = \Pr(S > 1{,}681{,}337)\).

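29 The approximating distribution method Example (cont'd): For the normal distribution, \(\Pr(S > 1{,}681{,}337) = \Pr\!\left(Z > \frac{1{,}681{,}337 - 1{,}200{,}955}{\sqrt{1.88180 \times 10^{11}}}\right) = \Pr(Z > 1.107) = 1 - \Phi(1.107) = 0.134\). The mean and second raw moment of the lognormal distribution are \(E(S) = \exp(\mu + \tfrac{1}{2}\sigma^2)\) and \(E(S^2) = \exp(2\mu + 2\sigma^2)\).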

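30 The approximating distribution method Example (cont'd): Equating these to \(1.200955 \times 10^6\) and \(1.88180 \times 10^{11} + (1.200955 \times 10^6)^2 = 1.63047 \times 10^{12}\) and taking logarithms results in the following two equations in two unknowns: \(\mu + \tfrac{1}{2}\sigma^2 = \ln(1.200955 \times 10^6) = 13.99863\) and \(2\mu + 2\sigma^2 = \ln(1.63047 \times 10^{12}) = 28.11989\). Solving gives \(\mu = 13.93731\) and \(\sigma^2 = 0.1226361\). Then \(\Pr(S > 1{,}681{,}337) = 1 - \Phi\!\left(\frac{\ln 1{,}681{,}337 - 13.93731}{(0.1226361)^{0.5}}\right) = 1 - \Phi(1.1359) = 0.128\).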
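
The two approximations in this example can be reproduced with the Python standard library. This is a sketch under the assumptions stated in the example (moment matching on the mean and variance of S); `norm_sf` is a hypothetical helper for the standard normal upper tail, built on `math.erfc`.

```python
import math

def norm_sf(z):
    """Upper tail 1 - Phi(z) of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Moments of S from the frequency and severity moments given in the example
mean_s = 6.7 * 179_247
var_s = 6.7 * 52_141 ** 2 + 2.3 ** 2 * 179_247 ** 2
threshold = 1.40 * mean_s

# Normal approximation: match mean and variance
p_normal = norm_sf((threshold - mean_s) / math.sqrt(var_s))

# Lognormal approximation: match E(S) = exp(mu + sigma^2/2), E(S^2) = exp(2mu + 2sigma^2)
sigma2 = math.log(var_s + mean_s ** 2) - 2.0 * math.log(mean_s)
mu = math.log(mean_s) - 0.5 * sigma2
p_lognormal = norm_sf((math.log(threshold) - mu) / math.sqrt(sigma2))

print(f"mu = {mu:.5f}, sigma^2 = {sigma2:.5f}")
print(f"normal: {p_normal:.3f}, lognormal: {p_lognormal:.3f}")
```

The two tail probabilities differ even though both distributions share the same first two moments, which illustrates the sensitivity to the choice of approximating distribution.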

31 The approximating distribution method Example (cont’d): The normal distribution provides a good approximation when E(N) is large. In particular, if N has the Poisson, binomial, or negative binomial distribution, a version of the central limit theorem indicates that, as λ, m, or r, respectively, goes to infinity, the distribution of S becomes normal. In this example, E(N) is small so the distribution of S is likely to be skewed. In this case the lognormal distribution may provide a good approximation, although there is no theory to support this choice.

32 Direct Calculation method The most difficult (or computer-intensive) part is the evaluation of the n-fold convolutions of the severity distribution for n = 2, 3, 4, .... In some situations there is an analytic form, for example, when the severity distribution is closed under convolution. Otherwise the convolutions must be evaluated numerically using: \(F_X^{*k}(x) = \int_{-\infty}^{\infty} F_X^{*(k-1)}(x - y)\,dF_X(y)\) (6)

33 Direct Calculation method When the losses are limited to nonnegative values (as is usually the case), the range of integration becomes finite, reducing formula 6 to: \(F_X^{*k}(x) = \int_0^x F_X^{*(k-1)}(x - y)\,dF_X(y)\) (7). These integrals are written in Lebesgue-Stieltjes form because of possible jumps in the cdf \(F_X(x)\) at zero and at other points. Without going into the formal definition of the Lebesgue-Stieltjes integral, it suffices to interpret \(\int g(y)\,dF_X(y)\) as being evaluated by integrating \(g(y) f_X(y)\) over those \(y\) values for which \(X\) has a continuous distribution and then adding \(g(y_i)\Pr(X = y_i)\) over those points where \(\Pr(X = y_i) > 0\). This allows a single notation to be used for continuous, discrete, and mixed random variables.

34 Direct Calculation method Numerical evaluation of formula 7 requires numerical integration methods. Because of the first term inside the integral, the right-hand side of formula 7 needs to be evaluated for all possible values of \(x\) and all values of \(k\). This can quickly become technically overpowering! A simple way to avoid these technical problems is to replace the severity distribution by a discrete distribution defined at multiples 0, 1, 2, ... of some convenient monetary unit such as $1,000.

35 Direct Calculation method This reduces formula 7 to (in terms of the new monetary unit): \(F_X^{*k}(x) = \sum_{y=0}^{x} F_X^{*(k-1)}(x - y) f_X(y)\). The corresponding pf is: \(f_X^{*k}(x) = \sum_{y=0}^{x} f_X^{*(k-1)}(x - y) f_X(y)\). In practice, the monetary unit can be made sufficiently small to accommodate spikes at maximum loss amounts. One needs only the maximum to be a multiple of the monetary unit to have it located at exactly the right point. As the monetary unit of measurement becomes smaller, the discrete distribution function will approach the true distribution function. For example, round all losses to the nearest $1,000.

36 Direct Calculation method When the severity distribution is defined on nonnegative integers 0, 1, 2, ..., calculating \(f_X^{*k}(x)\) for integral \(x\) requires \(x + 1\) multiplications. Carrying out these calculations for all possible values of \(k\) and \(x\) up to \(m\) then requires a number of multiplications of order \(m^3\), written as \(O(m^3)\), to obtain the distribution (in equation 5) for \(x = 0\) to \(x = m\). When the maximum value, \(m\), for which the aggregate loss distribution is calculated is large, the number of computations quickly becomes prohibitive, even for fast computers. Further, if \(\Pr(X = 0) > 0\), an infinite number of calculations is required to obtain any single probability exactly. This is because \(F_X^{*n}(x) > 0\) for all \(n\) and all \(x\), and so the sum in equation 5 contains an infinite number of terms.

37 The Recursive Method The recursive method reduces the number of computations discussed above to \(O(m^2)\), a considerable saving in computer time: a reduction of about 99.9% when \(m = 1000\) compared to direct calculation. However, the method is limited to certain frequency distributions. Fortunately, it includes all frequency distributions discussed in the previous session (Lecture 6). Suppose that the severity distribution \(f_X(x)\) is defined on 0, 1, 2, ..., \(m\), representing multiples of some convenient monetary unit. The number \(m\) represents the largest possible loss and could be infinite.

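38 The Recursive Method Further, suppose that the frequency distribution \(p_k\) is a member of the (a,b,1) class and therefore satisfies: \(p_k = \left(a + \frac{b}{k}\right)p_{k-1}\) for \(k = 2, 3, 4, \ldots\). Then the following result holds. Theorem 1 (Extended Panjer recursion) For the (a,b,1) class, \(f_S(x) = \frac{[p_1 - (a + b)p_0]\,f_X(x) + \sum_{y=1}^{x \wedge m}\left(a + \frac{by}{x}\right)f_X(y)\,f_S(x - y)}{1 - a f_X(0)}\) (8), noting that \(x \wedge m\) is notation for \(\min(x, m)\).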

39 The Recursive Method Corollary 1 (Panjer recursion) For the (a,b,0) class, the result (equation 8) reduces to: \(f_S(x) = \frac{\sum_{y=1}^{x \wedge m}\left(a + \frac{by}{x}\right)f_X(y)\,f_S(x - y)}{1 - a f_X(0)}\) (9). Note that when the severity distribution has no probability at zero, the denominators of equations (8) and (9) are equal to 1. The recursive formula (9) has become known as the Panjer formula in recognition of its introduction to the actuarial literature by Panjer (1981). The recursive formula (8) is an extension of the original Panjer formula. It was first proposed by Sundt and Jewell (1981).

40 The Recursive Method In the case of the Poisson distribution, equation (9) reduces to: \(f_S(x) = \frac{\lambda}{x}\sum_{y=1}^{x \wedge m} y\,f_X(y)\,f_S(x - y)\) for \(x = 1, 2, \ldots\). The starting value of the recursive schemes (equation 8) and (equation 9) is \(f_S(0) = P_N[f_X(0)]\), following Theorem 2 with an appropriate change of notation. Theorem 2: For any compound distribution, \(g_0 = P_N(f_0)\), where \(P_N(z)\) is the pgf of the primary distribution and \(f_0\) is the probability that the secondary distribution takes on the value zero.

41 The Recursive Method In the case of the Poisson distribution, we have: \(f_S(0) = P_N[f_X(0)] = e^{-\lambda[1 - f_X(0)]}\). The table below gives the corresponding initial values for all distributions in the (a,b,1) class using the convenient simplifying notation \(f_0 = f_X(0)\).
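
The recursion for the (a,b,0) class is short enough to sketch directly. The code below assumes the standard form of the Panjer recursion, f_S(x) = sum over y from 1 to min(x, m) of (a + b*y/x) f_X(y) f_S(x - y), divided by 1 - a*f_X(0), started at f_S(0) = P_N[f_X(0)]. The function name `panjer_ab0` and the illustrative severity vector are assumptions, not taken from the lecture's tables.

```python
import math

def panjer_ab0(a, b, f_s0, f_x, n_points):
    """Aggregate pf f_S(0..n_points-1) for an (a,b,0) frequency distribution.

    f_x[y] is the severity pf on 0..m and f_s0 = P_N[f_X(0)] is the starting value.
    """
    m = len(f_x) - 1
    denom = 1.0 - a * f_x[0]
    f_s = [f_s0] + [0.0] * (n_points - 1)
    for x in range(1, n_points):
        total = 0.0
        for y in range(1, min(x, m) + 1):
            total += (a + b * y / x) * f_x[y] * f_s[x - y]
        f_s[x] = total / denom
    return f_s

# Poisson frequency: a = 0, b = lambda, f_S(0) = exp(-lambda * (1 - f_X(0)))
lam = 3.0
f_x = [0.0, 0.5, 0.4, 0.1]  # illustrative severity on 0..3 (assumed values)
f_s = panjer_ab0(0.0, lam, math.exp(-lam * (1.0 - f_x[0])), f_x, 60)
print([round(p, 5) for p in f_s[:5]])
```

Because the inner loop has at most m terms, the work grows linearly in the number of points when the severity support is bounded, which is the saving the method is known for.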

42

43 The Inversion (Fast Fourier Transform, FFT) Method The inversion method numerically inverts a transform, such as the characteristic function or Fourier transform, using general or specialized inversion software. The inversion methods discussed in this section are used to obtain the probability function numerically from a known expression for a transform, such as the pgf, mgf, or cf of the desired function. Compound distributions lend themselves naturally to this approach because their transforms are compound functions and are easily evaluated when both frequency and severity components are known.

44 The Inversion (Fast Fourier Transform, FFT) Method The pgf and cf of the aggregate loss distribution are: \(P_S(z) = P_N[P_X(z)]\) and \(\varphi_S(z) = E[e^{iSz}] = P_N[\varphi_X(z)]\) (10). The characteristic function always exists and is unique. Conversely, for a given characteristic function, there always exists a unique distribution. The objective of inversion methods is to obtain the distribution numerically from the characteristic function (equation 10).

45 The Inversion (Fast Fourier Transform, FFT) Method The FFT is an algorithm that can be used for inverting characteristic functions to obtain densities of discrete random variables. The FFT comes from the field of signal processing. It was first used for the inversion of characteristic functions of compound distributions by Bertram (1981) and is explained in detail with applications to aggregate loss calculation by Robertson (1992).

46 The Inversion (Fast Fourier Transform, FFT) Method For any continuous function \(f(x)\), the Fourier transform is the mapping: \(\tilde f(z) = \int_{-\infty}^{\infty} f(x)\,e^{izx}\,dx\) (11). The original function can be recovered from its Fourier transform as: \(f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \tilde f(z)\,e^{-izx}\,dz\).

47 The Inversion (Fast Fourier Transform, FFT) Method When \(f(x)\) is a probability density function, \(\tilde f(z)\) is its characteristic function. For our applications, \(f(x)\) will be real valued. From formula (11), \(\tilde f(z)\) is complex valued. When \(f(x)\) is a probability function of a discrete (or mixed) distribution, the definitions can be easily generalized (see, for example, Fisz (1963)).

48 The Inversion (Fast Fourier Transform, FFT) Method Let \(f_x\) denote a function defined for all integer values of \(x\) that is periodic with period length \(n\) (that is, \(f_{x+n} = f_x\) for all \(x\)). For the vector \((f_0, f_1, \ldots, f_{n-1})\), the discrete Fourier transform is the mapping \(\tilde f_k\), \(k = \ldots, -1, 0, 1, \ldots\), defined by: \(\tilde f_k = \sum_{j=0}^{n-1} f_j \exp\!\left(\frac{2\pi i}{n}jk\right)\) (12)

49 The Inversion (Fast Fourier Transform, FFT) Method This mapping is bijective. In addition, \(\tilde f_k\) is also periodic with period length \(n\). The inverse mapping is: \(f_j = \frac{1}{n}\sum_{k=0}^{n-1} \tilde f_k \exp\!\left(-\frac{2\pi i}{n}kj\right)\), \(j = \ldots, -1, 0, 1, \ldots\) (13). This inverse mapping recovers the values of the original function. Because of the periodic nature of \(f\) and \(\tilde f\), we can think of the discrete Fourier transform as a bijective mapping of \(n\) points into \(n\) points. From formula (12), it is clear that, in order to obtain \(n\) values of \(\tilde f_k\), the number of terms that need to be evaluated is of order \(n^2\), that is, \(O(n^2)\).

50 The Inversion (Fast Fourier Transform, FFT) Method The Fast Fourier Transform (FFT) is an algorithm that reduces the number of computations required to be of order \(O(n \log_2 n)\). This can be a dramatic reduction in computations when \(n\) is large. The algorithm exploits the property that a discrete Fourier transform of length \(n\) can be rewritten as the sum of two discrete transforms, each of length \(n/2\), the first consisting of the even-numbered points and the second consisting of the odd-numbered points.

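51 The Inversion (Fast Fourier Transform, FFT) Method \(\tilde f_k = \sum_{j=0}^{n-1} f_j \exp\!\left(\frac{2\pi i}{n}jk\right) = \sum_{j=0}^{m-1} f_{2j} \exp\!\left(\frac{2\pi i}{m}jk\right) + \exp\!\left(\frac{2\pi i}{n}k\right)\sum_{j=0}^{m-1} f_{2j+1} \exp\!\left(\frac{2\pi i}{m}jk\right)\), where \(m = n/2\). Hence: \(\tilde f_k = \tilde f_k^{a} + \exp\!\left(\frac{2\pi i}{n}k\right)\tilde f_k^{b}\) (14), where \(\tilde f_k^{a}\) and \(\tilde f_k^{b}\) are the length-\(m\) transforms of the even-numbered and odd-numbered points, respectively.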

52 The Inversion (Fast Fourier Transform, FFT) Method These can, in turn, be written as the sum of two transforms of length \(m/2\). This can be continued successively. For the lengths \(n/2, m/2, \ldots\) to be integers, the FFT algorithm begins with a vector of length \(n = 2^r\). The successive writing of the transforms into transforms of half the length will result, after \(r\) times, in transforms of length 1. Knowing the transform of length 1 will allow us to successively compose the transforms of length \(2, 2^2, 2^3, \ldots, 2^r\) by simple addition using formula (14). Details of the methodology are found in Press et al. (1988).
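
The even/odd splitting can be sketched as a recursive function and checked against the direct O(n^2) sum. This is an illustrative sketch only: it assumes the transform uses the positive exponent exp(2*pi*i*j*k/n), as is usual for characteristic functions, and that the input length is a power of two. The function names are hypothetical.

```python
import cmath

def naive_dft(f):
    """Direct evaluation of the discrete Fourier transform: O(n^2) terms."""
    n = len(f)
    return [
        sum(f[a] * cmath.exp(2j * cmath.pi * a * k / n) for a in range(n))
        for k in range(n)
    ]

def fft(f):
    """Recursive radix-2 FFT; len(f) must be a power of two."""
    n = len(f)
    if n == 1:
        return list(f)  # a transform of length 1 is the value itself
    even = fft(f[0::2])  # transform of the even-numbered points
    odd = fft(f[1::2])   # transform of the odd-numbered points
    m = n // 2
    out = [0j] * n
    for k in range(m):
        twiddle = cmath.exp(2j * cmath.pi * k / n)
        out[k] = even[k] + twiddle * odd[k]
        # the half-length transforms have period m, and the factor flips sign at k + m
        out[k + m] = even[k] - twiddle * odd[k]
    return out

vector = [0.0, 0.5, 0.4, 0.1, 0.0, 0.0, 0.0, 0.0]
max_diff = max(abs(u - v) for u, v in zip(fft(vector), naive_dft(vector)))
print(max_diff)
```

Production code would normally call a library FFT routine; the recursion here is written for readability rather than speed.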

53 The Inversion (Fast Fourier Transform, FFT) Method In our applications, we use the FFT to invert the characteristic function when discretization of the severity distribution is done. This is carried out as follows: (1) Discretize the severity distribution, obtaining the discretized severity distribution \(f_X(0), f_X(1), \ldots, f_X(n - 1)\), where \(n = 2^r\) for some integer \(r\) and \(n\) is the number of points desired in the distribution \(f_S(x)\) of aggregate losses.

54 The Inversion (Fast Fourier Transform, FFT) Method (2) Apply the FFT to this vector of values, obtaining \(\varphi_X(z)\), the characteristic function of the discretized distribution. The result is also a vector of \(n = 2^r\) values. (3) Transform this vector using the pgf transformation of the loss frequency distribution, obtaining \(\varphi_S(z) = P_N[\varphi_X(z)]\), which is the characteristic function, that is, the discrete Fourier transform of the aggregate loss distribution, a vector of \(n = 2^r\) values.

55 The Inversion (Fast Fourier Transform, FFT) Method (4) Apply the Inverse Fast Fourier Transform (IFFT), which is identical to the FFT except for a sign change and a division by \(n\) [see formula (13)]. This gives a vector of \(n = 2^r\) values representing the exact distribution of aggregate losses for the discretized severity model.

56 The Inversion (Fast Fourier Transform, FFT) Method The FFT procedure requires a discretization of the severity distribution. When the number of points in the severity distribution is less than \(n = 2^r\), the severity distribution vector must be padded with zeros until it is of length \(n\). When the severity distribution places probability on values beyond \(x = n\), the probability that is missed in the right-hand tail beyond \(n\) can introduce some minor error in the final solution because the function and its transform are both assumed to be periodic with period \(n\), when in reality they are not.

57 The Inversion (Fast Fourier Transform, FFT) Method We should therefore put all the remaining probability at the final point at \(x = n\) so that the probabilities add up to 1 exactly. This allows periodicity to be used for the severity distribution in the FFT algorithm and ensures that the final set of aggregate probabilities will sum to 1. However, it is imperative that \(n\) be selected large enough that almost all the aggregate probability occurs by the \(n\)th point.

58 The Inversion (Fast Fourier Transform, FFT) Method Example: Suppose the random variable \(X\) takes on the values 1, 2, and 3 with probabilities 0.5, 0.4, and 0.1, respectively. Further suppose the number of losses has the Poisson distribution with parameter \(\lambda = 3\). Use the FFT to obtain the distribution of \(S\) using \(n = 8\) and \(n = 4096\).
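
The four FFT steps can be sketched for this example in pure Python. The sketch assumes the Poisson pgf P_N(z) = exp(lambda*(z - 1)) and uses a small hand-written radix-2 FFT so that it has no external dependencies; in practice a library routine would replace `fft`. All function names are hypothetical.

```python
import cmath
import math

def fft(f, sign=1):
    """Recursive radix-2 transform; sign=+1 forward, sign=-1 inverse (unscaled)."""
    n = len(f)
    if n == 1:
        return list(f)
    even, odd = fft(f[0::2], sign), fft(f[1::2], sign)
    m = n // 2
    out = [0j] * n
    for k in range(m):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + m] = even[k] + t, even[k] - t
    return out

def compound_poisson_fft(lam, severity, n):
    """Aggregate pf on 0..n-1 for a Poisson frequency; n must be a power of two."""
    f_x = list(severity) + [0.0] * (n - len(severity))   # step 1: pad with zeros
    phi_x = fft(f_x)                                      # step 2: transform severity
    phi_s = [cmath.exp(lam * (z - 1.0)) for z in phi_x]   # step 3: apply Poisson pgf
    f_s = fft(phi_s, sign=-1)                             # step 4: inverse transform
    return [v.real / n for v in f_s]                      # divide by n, drop rounding noise

severity = [0.0, 0.5, 0.4, 0.1]  # Pr(X = 0), Pr(X = 1), Pr(X = 2), Pr(X = 3)
f_s_8 = compound_poisson_fft(3.0, severity, 8)
f_s_4096 = compound_poisson_fft(3.0, severity, 4096)
print(f_s_4096[0], math.exp(-3.0))
print(f_s_8[0])
```

With n = 8 the aggregate probability beyond 7 wraps around onto the first points, so the n = 8 values overstate the true probabilities, while n = 4096 is large enough for the wrap-around to be negligible.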

59 Operational Risk Measurement: Aggregate Loss Models; Convolution Concepts and Application in Loss Distribution; Comparison of Convolution Methods; TVaR for Aggregate Losses

60 Comparison Of Convolution Methods The recursive method has some significant advantages over the direct method using convolutions. The time required to compute an entire distribution of \(n\) points is reduced from \(O(n^3)\) for the direct convolution method to \(O(n^2)\) when the support of the severity distribution is unlimited, and to \(O(n)\) when its support is limited. Furthermore, it provides exact values when the severity distribution is itself discrete (arithmetic); otherwise, the only source of error is the discretization of the severity distribution. Except for binomial models, the calculations are guaranteed to be numerically stable.

61 Comparison Of Convolution Methods The recursive method is very easy to program in a few lines of computer code. However, it has a few disadvantages. It only works for certain classes of frequency distributions. Using distributions not based on the (a,b,0) and (a,b,1) classes requires modification of the formula or development of a new recursion. Numerous other recursions have been developed in the actuarial and statistical literature recently (see Panjer, 2006).

62 Comparison Of Convolution Methods The FFT method is easy to use in that it uses standard routines available with many software packages. It is faster than the recursive method when \(n\) is large because it requires calculations of order \(n \log_2 n\) rather than \(n^2\). However, if the severity distribution has a fixed (and not too large) number of points, the recursive method will require fewer computations because the sum in formula (8) will have at most \(m\) terms, reducing the required computations to order \(n\), rather than \(n^2\) as in the case of no upper limit on the severity. The FFT method can be extended to the case where the severity distribution can take on negative values. Like the recursive method, it produces the entire distribution.

63 Operational Risk Measurement: Aggregate Loss Models; Convolution Concepts and Application in Loss Distribution; Comparison of Convolution Methods; TVaR for Aggregate Losses

64 TVaR for Aggregate Losses We now discuss the tail-value-at-risk (TVaR). If distributions of gains or losses are restricted to the normal distribution (or, more generally, to elliptical distributions), VaR satisfies all coherence requirements. However, in operational risk and other risk types, most loss distributions are not normal and have considerable skewness; consequently, VaR can lack subadditivity. Let \(X\) denote a loss random variable. The TVaR of \(X\) at the 100p% confidence level, denoted \(\mathrm{TVaR}_p(X)\), is the expected loss given that the loss exceeds the 100p percentile (or quantile) of the distribution of \(X\).

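65 TVaR for Aggregate Losses We can write \(\mathrm{TVaR}_p(X)\) for a random variable \(X\) as \(\mathrm{TVaR}_p(X) = E(X \mid X > x_p) = \frac{\int_{x_p}^{\infty} x\,dF(x)}{1 - F(x_p)}\), where \(F(x)\) is the cdf of \(X\) and \(x_p\) is the 100p percentile. Furthermore, for continuous distributions, if the above quantity is finite, we can use integration by parts and substitution to rewrite this as \(\mathrm{TVaR}_p(X) = \frac{\int_p^1 \mathrm{VaR}_u(X)\,du}{1 - p}\). Thus TVaR can be seen to average all VaR values above confidence level \(p\). This means that TVaR tells us much more about the tail of the distribution than VaR alone.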

66 TVaR for Aggregate Losses Finally, TVaR can also be written as \(\mathrm{TVaR}_p(X) = x_p + \frac{\int_{x_p}^{\infty}(x - x_p)\,dF(x)}{1 - p} = \mathrm{VaR}_p(X) + e(x_p)\), where \(e(x_p)\) is the mean excess loss function. Thus TVaR is larger than the corresponding VaR by the average excess of all losses that exceed VaR. Terminology: in the insurance field TVaR is called Conditional Tail Expectation (CTE), the name widely used in North America (see Wirch (1999)); in Europe it is called Expected Shortfall (ES) (see Tasche (2002) and Acerbi and Tasche (2002)). TVaR is a coherent risk measure (see Artzner et al. (1997)).

67 TVaR for Aggregate Losses Clearly, the shape of the aggregate loss distribution depends on the shape of both the discrete frequency distribution and the continuous (or possibly discrete) severity distribution. If the severity distribution is light-tailed and the frequency distribution is not, then one could expect the tail of the aggregate loss distribution to be largely determined by the frequency distribution. Indeed, in the extreme case where all losses are of equal size, the shape of the aggregate loss distribution is completely determined by the frequency distribution.

68 TVaR for Aggregate Losses On the other hand, if the severity distribution is heavy-tailed and the frequency distribution is not, then one could expect the shape of the tail of the aggregate loss distribution to be determined by the shape of the severity distribution, because extreme outcomes will be determined with high probability by a single, or at least very few, large losses. In practice, if both the frequency and severity distributions are specified, it is easy to compute the TVaR at a specified quantile.

69 TVaR for Discrete Aggregate Loss Distributions As discussed in earlier sections, the numerical evaluation of the aggregate loss distribution requires a discretization of the severity distribution, resulting in a discretized aggregate loss distribution. We therefore give formulas for the discrete case. Consider the random variable \(S\) representing the aggregate losses. The overall mean is the product of the means of the frequency and severity distributions: \(E(S) = E(N)\,E(X)\). Then the TVaR at quantile \(x_p\) for this distribution is: \(\mathrm{TVaR}_p(S) = E(S \mid S > x_p) = x_p + \frac{\sum_x (x - x_p)_+\,f_S(x)}{1 - F_S(x_p)}\) (15)

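70 TVaR for Discrete Aggregate Loss Distributions Noting that: \(\sum_x (x - x_p)_+\,f_S(x) = E(S) - x_p + \sum_x (x_p - x)_+\,f_S(x) = E(S) - x_p + \sum_{x < x_p}(x_p - x)\,f_S(x)\) (16), we see that, because \(S \ge 0\), the last sum in equation (16) is taken over a finite number of points, the points of support up to the quantile \(x_p\).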
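
The finite-sum form lends itself to a short computation. The sketch below assumes equation (15) has the standard form TVaR = x_p + sum over x of (x - x_p)+ f_S(x), divided by 1 - F_S(x_p), with the numerator evaluated through equation (16). The function name `tvar_discrete` and the small pf used to exercise it are hypothetical illustrations, not data from the lecture.

```python
def tvar_discrete(f_s, mean_s, p):
    """Return (x_p, TVaR_p) for an aggregate pf f_s on 0, 1, 2, ...

    mean_s is E(S) = E(N) * E(X), supplied separately so that only the
    probabilities up to the quantile x_p are needed.
    Assumes F_S(x_p) < 1; otherwise the tail is empty and the ratio is undefined.
    """
    cdf = 0.0
    for x_p, prob in enumerate(f_s):
        cdf += prob
        if cdf >= p:  # smallest point with F_S(x_p) >= p
            break
    lower = sum((x_p - x) * f_s[x] for x in range(x_p))  # finite sum in (16)
    excess = mean_s - x_p + lower                        # sum of (x - x_p)+ f_S(x)
    return x_p, x_p + excess / (1.0 - cdf)

# Toy aggregate pf on 0..3 (hypothetical values chosen for easy checking)
f_s = [0.5, 0.3, 0.1, 0.1]
mean_s = sum(x * q for x, q in enumerate(f_s))
x_p, tvar = tvar_discrete(f_s, mean_s, 0.75)
print(x_p, tvar)
```

Passing E(S) in from the frequency and severity means is a deliberate design choice: it avoids summing over the right-hand tail, which may be truncated in a numerically computed distribution.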

71 TVaR for Discrete Aggregate Loss Distributions The result of equation (16) can then be substituted into equation (15) to obtain the value of the TVaR. The value of the TVaR at high quantiles (as are required in operational risk) depends on the shape of the aggregate loss distribution. For certain distributions, we have analytic results that can give us very good estimates of the TVaR. To do this we first need to give some results on the extreme tail behavior of the aggregate loss distribution.

72 Thanks For Your Attention

