In the quest to develop the best possible underwriting model, we believe there is a need for an objective, quantifiable way to express the quality of a model as a single number. Such a metric exists. Many people are familiar with the Gini coefficient as a measure of inequality. It can also be used to measure the quality of an underwriting model.
The Gini coefficient is most widely used by economists as a measure of statistical dispersion intended to represent the income distribution of a nation’s residents. It is the most commonly used measure of inequality.
A Gini coefficient of zero expresses perfect equality, where all values are the same (for example, where everyone has the same income). A Gini coefficient of one (or 100%) expresses maximal inequality among values (for example, where only one person has all the income or consumption, and all others have none).
Another use for the Gini coefficient, and one more relevant to our readers here at Lending Times, is evaluating the predictive power of credit scoring models. While calculating a Gini for a lender is fairly complex, understanding it is somewhat simpler.
The Gini coefficient for measuring credit models also takes values between 0 and 1. A higher value means that a particular credit model can better discriminate between good and risky borrowers. A value of 1 means that the model predicts perfectly, and with certainty, which borrowers will repay and which borrowers will default. A value of 0 means that the model has no discriminating power: it separates borrowers who will repay from borrowers who will default no better than a coin toss would.
A Gini coefficient can help a lender (or investor) understand how good the lender’s credit model is at predicting who will repay and who will default on a loan.
The Gini coefficient compares the “Lorenz” curve (the cumulative distribution) with the line of perfect randomness. Graphically, the Gini is the ratio of the area between the curve and the line of perfect randomness (A) to the entire area above the line of perfect randomness (A+B). See graphic below.
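To make the area ratio concrete, here is a minimal Python sketch of one way to compute it. It rests on assumptions of our own that the description above does not spell out: loans are ranked from riskiest to safest by a model score, the curve plots the cumulative share of defaults captured against the cumulative share of repaid loans, and the total area above the line of perfect randomness (A+B) is therefore 0.5. The function name and the toy data are hypothetical.

```python
def gini_from_curve(scores, defaulted):
    """Gini as A / (A + B), where A is the area between the curve and
    the diagonal and A + B = 0.5 is the whole area above the diagonal.

    scores:    model output, higher = riskier (an assumption of this sketch)
    defaulted: 1 if the loan defaulted, 0 if it was repaid
    """
    n_bad = sum(defaulted)
    n_good = len(defaulted) - n_bad
    if n_bad == 0 or n_good == 0:
        raise ValueError("need at least one default and one repaid loan")

    # Walk through the loans from the riskiest score to the safest.
    ranked = sorted(zip(scores, defaulted), key=lambda p: p[0], reverse=True)

    area = 0.0
    x = y = 0.0  # cumulative share of repaid loans (x) and of defaults (y)
    i = 0
    while i < len(ranked):
        # Loans with the same score are handled as one straight segment.
        j = i
        bad = good = 0
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            bad += ranked[j][1]
            good += 1 - ranked[j][1]
            j += 1
        new_x = x + good / n_good
        new_y = y + bad / n_bad
        area += (new_x - x) * (y + new_y) / 2  # trapezoid under the curve
        x, y = new_x, new_y
        i = j

    a = area - 0.5   # area between the curve and the diagonal (A)
    return a / 0.5   # A / (A + B)


# Toy data, made up for illustration only.
print(gini_from_curve([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))
print(gini_from_curve([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))
```

In this sketch a model that ranks every defaulter above every repaid loan scores 1, and a model that gives every loan the same score scores 0. A model that ranks borrowers worse than chance would come out negative, which falls outside the 0 to 1 range discussed above.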
Why do underwriting models matter?
Underwriting models are the key to the lending business for many obvious reasons. Beyond that, with many originators in the small business and consumer loan space, and even more new originators in the pipeline, it is becoming increasingly hard to differentiate among lenders. One clear way for an originator to stand out is to build strong, proven, and accurate credit models. And this is where the Gini becomes very useful.
While the Gini is hard to measure directly, and only the most sophisticated investors have the resources to calculate it for a particular originator, there is plenty of data available for the retail investor to analyze.
Let’s look at Lendingclub as one example. As a very simple way of analyzing its credit scoring model, let’s look at Lendingclub’s default rates by loan grade. Lendingclub grades its loans on a letter scale of A through G, and each letter has a subscale of 1 through 5 (A1 being the least risky and G5 being the most risky). We would expect the proportion of defaults to increase as the buckets increase in risk. We can easily illustrate this by plotting the percentage of loan defaults (measured by charged-off loans) for each bucket. Using Lendingclub’s loan data from 2009 through 2013, we can see that this generally holds true. Overall, for each year, each bucket of increasing risk has generally resulted in a slightly larger percentage of charged-off loans.
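The bucket-by-bucket check described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the input is assumed to be a list of (sub-grade, charged-off) pairs, and the sample loans are made up rather than taken from Lendingclub’s data.

```python
from collections import defaultdict


def charge_off_rate_by_grade(loans):
    """loans: iterable of (sub_grade, charged_off) pairs, e.g. ("B3", True)."""
    totals = defaultdict(int)
    bad = defaultdict(int)
    for grade, charged_off in loans:
        totals[grade] += 1
        bad[grade] += 1 if charged_off else 0
    # "A1" < "A2" < ... < "G5" already sorts correctly as plain strings.
    return {g: bad[g] / totals[g] for g in sorted(totals)}


def rates_increase_with_risk(rates):
    """True if no riskier bucket has a lower charge-off rate than a safer one."""
    values = list(rates.values())
    return all(a <= b for a, b in zip(values, values[1:]))


# Made-up sample: 10 loans in each of three sub-grades.
sample = (
    [("A1", False)] * 9 + [("A1", True)] * 1
    + [("C3", False)] * 8 + [("C3", True)] * 2
    + [("G5", False)] * 6 + [("G5", True)] * 4
)
rates = charge_off_rate_by_grade(sample)
print(rates)
print(rates_increase_with_risk(rates))
```

A check like this only shows that the ordering of the grades is sensible. It does not say how strongly the grades separate good from bad loans, which is what the Gini adds.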
Lendingclub’s risk model appears to be pretty good at measuring risk. We would therefore expect Lendingclub’s Gini coefficient to be correspondingly high.
Two caveats apply. First, the Gini coefficient is fairly hard to calculate. There is no single formula to plug a few numbers into; it requires a more involved calculation on loan-level data.
Second, the Gini coefficient does have its limitations. The only real way to compare two underwriting models is to run both models on exactly the same data and see how the results compare. In other words, the exact way to compare Gini coefficients is to calculate the Gini coefficient for two different models using the same training and test data. While this is a limitation of any model comparison, we believe that, to a first approximation, it is not a major limitation on the use of the Gini coefficient.
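As a rough illustration of a same-data comparison, the Python sketch below scores one set of loan outcomes with two hypothetical models and computes a Gini for each. It uses a pairwise shortcut that is an assumption on our part rather than something stated above: across every (defaulted, repaid) pair of loans, count how often the defaulter received the riskier score, subtract how often it received the safer one, and divide by the number of pairs. All names and numbers are invented for the example.

```python
def gini(scores, defaulted):
    """Pairwise Gini: (concordant - discordant) / all (default, repaid) pairs.

    scores:    model output, higher = riskier (an assumption of this sketch)
    defaulted: 1 if the loan defaulted, 0 if it was repaid
    """
    bad = [s for s, d in zip(scores, defaulted) if d]
    good = [s for s, d in zip(scores, defaulted) if not d]
    if not bad or not good:
        raise ValueError("need at least one default and one repaid loan")
    concordant = sum(b > g for b in bad for g in good)
    discordant = sum(b < g for b in bad for g in good)
    # Tied pairs count for neither side.
    return (concordant - discordant) / (len(bad) * len(good))


# One shared set of outcomes, scored by two made-up models.
outcomes = [1, 0, 1, 0, 0, 1, 0, 0]
model_a = [0.9, 0.2, 0.8, 0.3, 0.1, 0.7, 0.4, 0.2]
model_b = [0.9, 0.8, 0.3, 0.3, 0.1, 0.7, 0.4, 0.2]

gini_a = gini(model_a, outcomes)
gini_b = gini(model_b, outcomes)
print(f"model A: {gini_a:.3f}, model B: {gini_b:.3f}")
```

Because both scores are computed against the same outcomes list, the difference between the two numbers reflects the models rather than the loan populations. With loops over every pair this approach is only practical for small samples.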
Lending Times would like to encourage originators to publish the Gini coefficient of their underwriting models. We would be ready to publish such information and present it in an organized fashion.
By publishing and comparing the Gini coefficients of different companies, each originator will get a better idea of how it compares to the industry average. This will allow originators to improve or to identify what changes are needed. The entire industry will gain as the overall quality of underwriting improves. It will be easier to attract funding and investors from major established institutions, and regulators will be more comfortable that the industry is stable and reliable and that, contrary to what Lord Turner claimed, marketplace, p2p and online alternative lending is here to stay for a long time.
Authors: Mark Smith and George Popescu