Using Point Biserial Correlation in Credit Analysis
October 3, 2006
Suppose a credit analyst working for an investment manager has categorized a group of 100 companies in five sectors as either "good credit" or "bad credit", and now wants to find the correlation between the credit quality of the companies and their gross income. Can such a correlation coefficient be computed? And if so, how should he estimate it?
It is an interesting, though pretty old, problem. In engineering one often comes across such issues where one variable is binary and the other is continuous. In the above problem the credit quality is actually a binary variable - the credit of a company is either "good" or "bad" (in one recent training session, a very well informed and quantitatively savvy analyst referred to such a variable as a "Boolean" variable, but we would rather not use that term and instead use either "binary" or "digital"). A "good" credit can therefore be represented by the digit "1" and a "bad" credit can be represented by the digit "0". Thus zero and one are the only two values that the credit of a company can take in the analyst's framework. Hence it is a binary variable.
The gross income of the companies can, in theory, take any value from minus infinity to plus infinity and is therefore a continuous variable. The analyst will thus have two data series across the companies, one binary and one continuous, similar to the one shown below.
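As an illustration of how the two series line up, here is a minimal Python sketch. The company names, ratings and gross income figures are hypothetical, invented purely for the example; only the coding of "good" as 1 and "bad" as 0 comes from the text.

```python
# Hypothetical sample: (company, analyst's rating, gross income in millions).
# None of these figures come from the article; they only illustrate the layout.
sample = [
    ("Company A", "good", 120.5),
    ("Company B", "bad", -14.2),
    ("Company C", "good", 87.0),
    ("Company D", "bad", 9.8),
    ("Company E", "good", 45.3),
]

# Binary series: "good" credit is coded 1, "bad" credit is coded 0.
credit_quality = [1 if rating == "good" else 0 for _, rating, _ in sample]

# Continuous series: gross income, which may be negative.
gross_income = [income for _, _, income in sample]

for (name, _, _), q, g in zip(sample, credit_quality, gross_income):
    print(f"{name}: credit={q}, gross income={g}")
```

Each position in `credit_quality` and `gross_income` refers to the same company, which is what lets the two series be correlated.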
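The correlation between the above two series is given by what is known as the Point Biserial Correlation, referred to here simply as the Biserial Correlation (strictly speaking, the biserial coefficient proper is a related but distinct statistic that treats the binary variable as a dichotomized continuous one). In its standard textbook form, the Point Biserial Correlation is expressed as:

$$ r_{pb} = \frac{M_1 - M_0}{s_n} \sqrt{\frac{n_1 n_0}{n^2}} $$

In the above expression:
M_1 is the mean of the continuous variable (gross income) over the companies coded 1 ("good" credit);
M_0 is the mean of the continuous variable over the companies coded 0 ("bad" credit);
s_n is the standard deviation of the continuous variable over all n companies, computed with n (not n - 1) in the denominator;
n_1 and n_0 are the numbers of companies coded 1 and 0 respectively, and n = n_1 + n_0.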
The above formula is very simple and the analyst can thus easily compute the Biserial Correlation between the Credit Quality of the companies and their Gross Income. Biserial correlation can be applied in many areas of quantitative finance where we have binary time series.
The above problem was posed to us by one of the trainees in a Credit Analysis training session. It is easy to create a simple VBA user-defined function for the Biserial correlation and use it in Excel.
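The computation itself is short in any language. The following is a rough Python sketch, not the VBA function mentioned above. It assumes the standard textbook form of the point biserial coefficient, (M1 - M0) / s * sqrt(n1 * n0 / n^2), with s the population standard deviation of the continuous series. The function name `point_biserial` and the sample figures are hypothetical.

```python
import math


def point_biserial(binary, continuous):
    """Point biserial correlation between a 0/1 series and a continuous series.

    Assumed form: (M1 - M0) / s * sqrt(n1 * n0 / n**2), where s is the
    population standard deviation (denominator n) of the continuous series.
    """
    if len(binary) != len(continuous):
        raise ValueError("both series must have the same length")

    # Split the continuous values by the binary code of the same observation.
    ones = [x for b, x in zip(binary, continuous) if b == 1]
    zeros = [x for b, x in zip(binary, continuous) if b == 0]
    n1, n0 = len(ones), len(zeros)
    n = len(binary)

    if n1 + n0 != n:
        raise ValueError("binary series must contain only 0 and 1")
    if n1 == 0 or n0 == 0:
        raise ValueError("both groups must contain at least one observation")

    # Population variance of the continuous series over all observations.
    mean_all = sum(continuous) / n
    variance = sum((x - mean_all) ** 2 for x in continuous) / n
    if variance == 0:
        raise ValueError("continuous series has zero variance")

    m1 = sum(ones) / n1    # mean of the group coded 1
    m0 = sum(zeros) / n0   # mean of the group coded 0
    return (m1 - m0) / math.sqrt(variance) * math.sqrt(n1 * n0 / n ** 2)


# Hypothetical usage: four companies, two "good" (1) and two "bad" (0).
credit = [1, 1, 0, 0]
income = [4.0, 6.0, 1.0, 3.0]
r = point_biserial(credit, income)
print(round(r, 4))
```

With this form the result equals the ordinary Pearson correlation computed on the 0/1 codes and the continuous values, which gives a convenient cross-check against a spreadsheet's built-in correlation function.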
Reference: An excellent book on advanced statistical concepts is Operational Risk by Nigel Da Costa Lewis (John Wiley & Sons).