Statistical Techniques for Analyzing Defaults
While the option-based approach provides a consistent way of thinking about default probabilities and prices of corporate bonds, it seems implausible that a single value, the value of the firm’s assets, is the sole determinant of default probabilities. We saw, in fact, that the liquidity of assets and restrictions on asset sales were key factors as well. It is not always easy to build full structural models that include all the variables that empirically influence estimated default probabilities. The intensity models that we will turn to later try to include more variables in the default pricing, typically (but not necessarily) at the cost of specifying their influence exogenously. Before turning to these models, it is natural to look at some of the dominant methods used for default probability estimation. As we will see, the most natural statistical framework for analyzing defaults is also the natural framework for linking credit scoring with pricing. The focus of this section is on model structure; no properties of the estimators are proved.
Firm default is an example of a qualitative response, at least if we do not look at the severity of the default in terms of recoveries. We simply observe a firm’s characteristics and whether or not it defaults. Logistic regression views the probability of default as depending on a set of firm characteristics (or covariates). The actual response depends on a noise variable, just as the deviation of a response from its mean in ordinary regression depends on a noise variable.
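Specifically, let Y be the observed status of the firm at the end of some predetermined time horizon, with Y = 1 if the firm has defaulted and Y = 0 otherwise. The assumption in a logistic regression is that for each firm,
\[ P(Y = 1 \mid x_1, \ldots, x_k) = p(x_1, \ldots, x_k), \]
where x_1, ..., x_k denote the firm’s characteristics.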
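The function p must take values in (0, 1), which is typically achieved by using either a logit specification or a probit specification. In the logit specification,
\[ \log \frac{p(x)}{1 - p(x)} = \alpha + \beta' x, \]
where x = (x_1, ..., x_k) is the vector of firm characteristics, \alpha is a constant and \beta is a vector of coefficients, so that p(x) = 1 / (1 + \exp(-(\alpha + \beta' x))).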
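As a rough illustration of how the two specifications map a linear score into (0, 1), the following Python sketch may help. It assumes the usual forms: the logit specification sets the log-odds of default equal to a linear score alpha + beta'x, and the probit specification passes the same score through the standard normal distribution function. The function names, the coefficient values and the two covariates (a leverage ratio and a profitability ratio) are hypothetical and chosen only for illustration; they are not estimates from any data set.

```python
import math


def linear_score(alpha, beta, x):
    """Return alpha + beta'x for one firm's covariate vector x."""
    if len(beta) != len(x):
        raise ValueError("beta and x must have the same length")
    return alpha + sum(b * xi for b, xi in zip(beta, x))


def logit_pd(alpha, beta, x):
    """Logit specification: p(x) = 1 / (1 + exp(-(alpha + beta'x)))."""
    s = linear_score(alpha, beta, x)
    # Evaluate in a numerically stable way for large |s|.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def probit_pd(alpha, beta, x):
    """Probit specification: p(x) = Phi(alpha + beta'x), Phi = standard normal CDF."""
    s = linear_score(alpha, beta, x)
    return 0.5 * (1.0 + math.erf(s / math.sqrt(2.0)))


# Hypothetical coefficients and covariates, for illustration only.
alpha = -3.0
beta = [2.0, -1.5]   # assumed loadings on leverage and profitability
firm = [0.8, 0.1]    # assumed leverage ratio and profitability ratio

p_logit = logit_pd(alpha, beta, firm)
p_probit = probit_pd(alpha, beta, firm)
print(f"logit  default probability: {p_logit:.4f}")
print(f"probit default probability: {p_probit:.4f}")
```

Both functions return a value strictly between 0 and 1 for any finite score, which is the property required of p. The same coefficients give different probabilities under the two specifications, so estimated coefficients from a logit and a probit fit are not directly comparable.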