Estimation of Truncated Data Samples in Operational Risk Modeling

Journal of Risk and Insurance (article excerpt)

INTRODUCTION

Available databases of operational losses usually do not store records below a data collection threshold at the event level. Having a data collection threshold helps to avoid the difficulties of recording and storing too many small loss events. However, omitting the data that fall below the threshold makes the already difficult problem of modeling operational risk accurately even more challenging. An important question is whether one needs to take into account the fact that the data are truncated from below at the data collection threshold. Theoretically, it would be more appropriate to account for the unrecorded losses by explicitly acknowledging the truncated nature of the data samples when modeling both the severity and frequency distributions under the widely accepted loss distribution approach (LDA).

However, the operational risk literature reports that attempts to fit operational losses with truncated severity distributions by maximum likelihood estimation are not always successful. In many cases, the likelihood surface is nearly flat or even ascending with no global maximum, which causes standard optimization algorithms to run into numerical problems. In addition, unconditional frequency estimates are quite high, reflecting the large number of unrecorded small losses, which makes convolution algorithms (such as Panjer recursion, Fourier transform, or Monte Carlo simulation) computationally intensive.

To avoid these difficulties, some researchers suggest the so-called shifting approach: the loss sample is shifted to the left by the threshold value, the shifted sample is fitted with a non-truncated distribution, and the resulting distribution is shifted back to the right to obtain the estimated loss distribution. Proponents of the shifting approach offer, among others, the following arguments. First, it eliminates the numerical difficulties with fitting that the truncation approach often encounters. Second, fitting results are more stable and convolution algorithms are more efficient. Third, for very heavy-tailed severity distributions, omitting the losses below the threshold leads to negligible changes in capital estimates.

Supporters of the truncation approach counter that the shifting approach buys stability of estimates at the expense of adding significant bias to parameter as well as capital estimates. So far, the operational risk literature has not presented clear evidence favoring one approach over the other. For instance, using a simulation study, Shevchenko (2010) shows that for light-tailed lognormal severity distributions the shifting approach can induce significant bias relative to the truncation approach, but that this bias becomes insignificant for heavy-tailed lognormal distributions. Meanwhile, simulation studies performed by Cavallo et al. (2012) under the shifting approach reveal that whether the severity of an individual loss in the extreme right tail is overstated or understated depends on the characteristics of the data sample. In addition, the literature does not explain why one sample generates stable and reasonable fitting results under the truncation approach while another sample with similar characteristics may lead to unstable and unreasonable results. Nor is it clear whether the trade-off between stability and bias under the shifting approach is tolerable.
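To make the two estimation approaches concrete, the following minimal sketch (in Python with NumPy and SciPy; the threshold H, the sample size, and the parameter values are illustrative assumptions, not taken from the article) fits the same simulated sample both ways.

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(1)
H = 10_000.0                    # hypothetical data collection threshold
mu_true, sigma_true = 9.0, 2.0  # illustrative lognormal parameters

# Simulate ground-up losses, then discard everything below the threshold,
# mimicking a data collection process that truncates from below.
all_losses = rng.lognormal(mu_true, sigma_true, size=5_000)
sample = all_losses[all_losses > H]

# --- Truncation approach: maximize the conditional (truncated) likelihood,
#     i.e., the lognormal density renormalized to the region above H.
def neg_loglik(params, x, h):
    mu, sigma = params
    if sigma <= 0:
        return np.inf
    dist = stats.lognorm(s=sigma, scale=np.exp(mu))
    return -(dist.logpdf(x) - dist.logsf(h)).sum()

res = optimize.minimize(neg_loglik, x0=[np.log(np.median(sample)), 1.0],
                        args=(sample, H), method="Nelder-Mead")
mu_trunc, sigma_trunc = res.x

# --- Shifting approach: shift the sample left by H, fit a plain
#     (non-truncated) lognormal, and shift the fitted distribution back
#     to the right, so the estimated severity is H plus a lognormal.
shape, loc, scale = stats.lognorm.fit(sample - H, floc=0)
mu_shift, sigma_shift = np.log(scale), shape

print(f"truncation approach: mu={mu_trunc:.3f}, sigma={sigma_trunc:.3f}")
print(f"shifting approach:   mu={mu_shift:.3f}, sigma={sigma_shift:.3f}")
```

On samples for which the truncated likelihood surface is nearly flat, the direct maximization above may terminate far from any sensible optimum or fail to converge, which is precisely the instability described above; the shifting fit always converges, but to the parameters of a different, shifted model.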
In this article, we focus on the challenges of estimating the parameters of the lognormal severity and Poisson frequency distributions under the truncation approach, and we derive a necessary and sufficient condition for the existence of the global solution to the severity parameter estimation problem. In the sequel, we refer to this condition as the regularity condition. An important implication of this result is that if the regularity condition is not satisfied, the maximum likelihood estimate does not exist; in that case, the loss data sample under consideration does not support the lognormal model and a different model needs to be used. …
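On the frequency side, the truncation approach recovers the unconditional Poisson rate by dividing the observed rate of exceedances by the fitted probability of exceeding the threshold. The sketch below (again illustrative; the threshold, severity parameters, and observed rate are assumed inputs, not values from the article) shows this adjustment and a simple Monte Carlo convolution of the resulting frequency and severity distributions.

```python
import numpy as np
from scipy import stats

H = 10_000.0                  # hypothetical data collection threshold
mu_hat, sigma_hat = 9.0, 2.0  # fitted truncated-lognormal severity parameters
lam_obs = 25.0                # observed annual frequency of losses above H

sev = stats.lognorm(s=sigma_hat, scale=np.exp(mu_hat))

# Observed exceedances form a thinned Poisson process, so the
# unconditional (ground-up) rate is lam_obs / P(X > H); a small
# exceedance probability inflates the rate sharply.
p_exceed = sev.sf(H)
lam_uncond = lam_obs / p_exceed

# Monte Carlo convolution of frequency and severity: simulate annual
# counts, then sum that many severity draws per scenario. The scenario
# count here is kept modest for illustration; production runs need many
# more, which is where the large unconditional rate becomes costly.
rng = np.random.default_rng(2)
counts = rng.poisson(lam_uncond, size=10_000)
agg = np.array([sev.rvs(size=n, random_state=rng).sum() for n in counts])

print(f"P(X > H) = {p_exceed:.4f}, unconditional rate = {lam_uncond:.1f}")
print(f"99.9% quantile of aggregate annual loss: {np.quantile(agg, 0.999):,.0f}")
```

The smaller the exceedance probability, the larger the unconditional rate and the number of severity draws per scenario, which is the computational burden of the truncation approach noted in the introduction.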
