In our previous article, we presented a heuristic for optimal clustering of risk ranking scores in time series data, to achieve robust PD estimates under grade-level (discrete) estimation of PDs. Whilst attractive from a pure compliance perspective, discrete estimation discards information about the relative risk between adjacent bins, and tends to be less useful than continuous estimation for internal risk management purposes.

In this article, we investigate optimising a mapping function to derive a “direct estimation” calibration, using a polynomial in log-odds space.

We found that the standard calibration assumptions of an intercept shift, or a slope-and-intercept adjustment, in log-odds space generally fail to provide a sufficiently accurate calibration in individual bins. However, we also found that a cubic mapping curve not only provides sufficiently accurate calibration at bin level, but is also reasonably robust to the choice of binning schema: the accuracy of mapped PDs is usually retained when the mapping is tested under different binning schemas.

Introduction

The CRR allows firms to adopt “direct” (continuous) or “discrete” (binned pools or grades) estimation of PDs. Direct estimation is generally seen as more useful to banks’ internal processes than discrete estimation, as it facilitates granular risk differentiation between customers when operating close to pricing or credit acceptance decision boundaries. It also avoids incentivising adverse selection biases that can arise if a bank lends to riskier customers who can be charged a higher margin for the same capital requirement.

A particular challenge, however, lies in the choice of calibration function. A common approach is to apply an intercept adjustment in log-odds space; in many portfolios there may be insufficient default data to parameterise anything of higher order. Indeed, paragraph 92 of EBA/GL/2017/16 explicitly positions calibrating to the “default rate at the level of the calibration segment” as a compliant approach, albeit subject to checking that the resulting grade-level estimates are sufficiently accurate.
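As a rough sketch of the intercept-adjustment approach (our own illustration, not code from the article), the constant shift in log-odds space can be solved so that the weighted average of shifted PDs matches the segment-level default rate; `intercept_shift` and its inputs are hypothetical names:

```python
import numpy as np
from scipy.optimize import brentq

def log_odds(p):
    return np.log(p / (1 - p))

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def intercept_shift(raw_pds, weights, target_dr):
    # Solve for the constant c such that the weighted average of
    # sigmoid(log_odds(raw_pd) + c) equals the segment-level default rate.
    def gap(c):
        return np.average(sigmoid(log_odds(raw_pds) + c), weights=weights) - target_dr
    return brentq(gap, -10.0, 10.0)
```

A shift of this kind preserves the rank ordering of bins; a slope-and-intercept adjustment would additionally apply a multiplier to the raw log-odds.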

Having established a discrete calibration, we realised that we had, in effect, discarded available information about the nature of the calibration curve. Discarding information sits uncomfortably with the CRR Article 179 principle to include “all relevant data, information and methods”, and paragraph 99 of EBA/GL/2017/16 establishes that the curve should be monotonic. If, for example, the PDs for bins 6, 7, 9 and 10 were revealed, how comfortable would we be in guessing the PD for bin 8? Few experienced credit professionals would decline to make an estimate until a large sample study could be completed.

A further disadvantage of discrete calibrations is that they are vulnerable to over-fitting. A non-parametric specification of N bins and PDs requires 2N-1 free parameters (N-1 bin boundaries plus N PDs); 13 bins and PDs therefore require 25 free parameters. This invites more than a little perception of over-fitting when compared with a cubic function’s four free parameters.

We therefore investigated the accuracy and stability of mapping functions, as a means of generalising a discrete calibration.

Approach

We first parameterised a prior probability for the observed average of one-year default rates in each bin, using an interval estimate derived from the year with the fewest defaults. The prior was specified using a standard Beta distribution, allowing us to generate the Jeffreys Interval as well as to assign log-likelihoods in the optimisation of the mapping function.
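A minimal sketch of this step, assuming the prior is the Jeffreys-style Beta(d + 0.5, n − d + 0.5) formed from the sparsest year’s d defaults out of n obligors (function names are ours, not from the article):

```python
from scipy.stats import beta

def bin_prior(defaults, n):
    # Beta(d + 0.5, n - d + 0.5): the posterior for a default rate under
    # the Jeffreys prior, given d defaults observed out of n obligors.
    return beta(defaults + 0.5, n - defaults + 0.5)

def jeffreys_interval(defaults, n, level=0.95):
    # Central interval of the Beta above; usable as a pass/fail band
    # for a (two-sided) check on a candidate PD.
    prior = bin_prior(defaults, n)
    a = (1 - level) / 2
    return prior.ppf(a), prior.ppf(1 - a)
```

The same `bin_prior` object supplies `logpdf` values, serving as the per-bin log-likelihoods in the optimisation of the mapping function.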

We then optimised a polynomial to map from risk ranking score to calibrated PD. We refer to these estimates as “mapped PDs”. The optimisation used maximum likelihood. Compared with an MSE optimisation (typically performed in log-odds space to avoid “weighting” the fit towards the bin with the highest PD), this has the advantage of introducing a “weighting” that reflects the amount of default data in each bin, acknowledging that each bin’s point estimate may not be perfect.
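The optimisation can be sketched as follows, under our own illustrative assumptions: a cubic in log-odds space, per-bin Beta(d + 0.5, n − d + 0.5) likelihoods, and hypothetical function names.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import beta

def fit_cubic_mapping(scores, defaults, counts):
    # Fit score -> PD as sigmoid(cubic(score)), maximising the summed
    # Beta(d + 0.5, n - d + 0.5) log-density of the mapped PDs. Bins with
    # more default data have tighter Betas, so they pull the fit harder --
    # the "weighting" discussed in the text.
    a_post = defaults + 0.5
    b_post = counts - defaults + 0.5

    def neg_log_lik(coefs):
        pds = 1.0 / (1.0 + np.exp(-np.polyval(coefs, scores)))
        return -beta.logpdf(pds, a_post, b_post).sum()

    # Start from a straight line through the (smoothed) observed log-odds.
    obs = (defaults + 0.5) / (counts + 1.0)
    line = np.polyfit(scores, np.log(obs / (1.0 - obs)), 1)
    x0 = np.concatenate([[0.0, 0.0], line])  # cubic and quadratic terms start at zero
    return minimize(neg_log_lik, x0, method="Nelder-Mead").x
```

Mapped PDs for any score are then `1 / (1 + np.exp(-np.polyval(coefs, score)))`.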

Alignment between mapped PDs and the observed long-run average was checked by counting the number of bins where the Jeffreys Test passed.
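The bin-level check might look like the following sketch, interpreting a “pass” as the mapped PD falling inside the central 95% Jeffreys interval implied by that bin’s observed defaults (the regulatory Jeffreys Test is often framed one-sided; names here are ours):

```python
import numpy as np
from scipy.stats import beta

def jeffreys_pass_count(mapped_pds, defaults, counts, level=0.95):
    # Count bins whose mapped PD lies inside the central Jeffreys interval
    # implied by that bin's observed defaults and obligor counts.
    a = (1 - level) / 2
    lo = beta.ppf(a, defaults + 0.5, counts - defaults + 0.5)
    hi = beta.ppf(1 - a, defaults + 0.5, counts - defaults + 0.5)
    return int(((mapped_pds >= lo) & (mapped_pds <= hi)).sum())
```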

For all analysis, we used the same binning heuristic and simulated dataset described in our previous article. To recap, “Threshold value” refers to the minimum acceptable number of defaults in any bin’s most-sparsely-populated year. We focused on the Medium (“realistic”) noise scenario.

Results

We found that in the general case it is not possible to optimise an intercept adjustment or linear calibration function that would result in the Jeffreys Test passing in all bins.

When we increased the polynomial order to a cubic function, results were considerably more promising.

We initially set the threshold value to 40 (the minimum value that returns zero rank order reversals in PIT ODR over time – see previous article) and optimised a cubic function. We then sensitised the threshold value to 20 and 80, but did not re-optimise the cubic function. Under both sensitivities, the Jeffreys Test continued to pass in all bins.

The charts below illustrate the accuracy of the mapping function, as the initial threshold value of 40 (13 bins) is sensitised to 20 (19 bins) without re-optimising the cubic curve.

We repeated the experiment, with the initial threshold value reduced to 20 and then 10. We found that the Jeffreys Test only failed in the relatively contrived situation where the cubic is optimised across 28 sparsely-populated bins (resulting in wide Jeffreys Intervals) and tested across 13 bins. By contrast, firms are more likely to be training calibrations on fewer bins and increasing the granularity of their binning as time goes by and more data becomes available.

The results are summarised in the table below.

| Threshold value for binning used to fit cubic function | Threshold value sensitivity | Bins for sensitivity test | Bins in sensitivity test with Jeffreys Test passing | Rank Order Reversals in PIT ODRs |
|---|---|---|---|---|
| 40 | 20 | 19 | 19 | 3 |
| 40 | 40 | 13 | 13 | 0 |
| 40 | 80 | 8 | 8 | 0 |
| 20 | 20 | 19 | 19 | 3 |
| 20 | 40 | 13 | 13 | 0 |
| 20 | 80 | 8 | 8 | 0 |
| 10 | 10 | 28 | 28 | 40 |
| 10 | 20 | 19 | 19 | 3 |
| 10 | 40 | 13 | 1 | 0 |


Discussion

The results suggest that a satisfactory mapping from discrete to continuous estimates of PD may be achieved using a polynomial function. The cubic function appears reasonably robust to the threshold value, the number of bins, the minimum number of defaults in a bin over time, and the presence of rank order reversals.

Whether the continuous mapping produces higher-quality PD estimates depends on one’s prior beliefs about over-fitting of non-parametric models, and about the relationship between adjacent bins’ PDs. This is illustrated in the figure above, where the blue and yellow curves appear to diverge in the higher grades. In practice, however, a truly neutral prior belief is difficult to achieve when knowledge of firm-specific and whole-market capital impacts exists.