Academic journal article Australasian Accounting Business & Finance Journal

A Mathematical Model for Predicting Debt Repayment: A Technical Note

Article excerpt

(ProQuest: ... denotes formula omitted.)


Many industries around the globe have been plagued with bad debt, and the ever-increasing cost of debt collection has led many companies to outsource it to collection agencies. The debt collection industry is enormous in countries such as the USA. In the U.S., third-party debt collection agencies employ more than 140,000 people and recover more than $50 billion each year, mostly from consumers (Fedaseyeu and Hunt 2015), and nearly 14% of American consumers had an account with a collection agency as of 2011 (2014 Annual Report & Form 10-K).

Creditors possess legal and informational advantages, but the information available is limited. Predicting whether a particular customer is likely to repay a debt is therefore a complicated and inherently tedious exercise. The difficulty is amplified because many accounts are forwarded to a collection agency from the healthcare sector, where, owing to the nature of the industry, the records are incomplete and lack financial information. Hence the ability to predict accurately whether a debt will be repaid would be hugely beneficial to a collection agency.

This research project concerns predicting debt repayments using historical data from a US-based debt collection agency. We use the data to develop mathematical and data mining models to classify and predict whether a debt can be recovered. The ultimate objective of the study is to build a model, fitted on the training data set, that accurately classifies new data.

This report mainly focuses on data mining and knowledge discovery tools, namely logistic regression, artificial neural networks and market basket analysis, to classify and predict. Knowledge discovery is defined as the process of identifying valid, novel, and potentially useful patterns, rules, relationships, rare events, correlations, and deviations in data (Fayyad 1996). Mathematical and data mining tools are an integral part of the knowledge discovery process, as they can be used to identify hidden patterns and underlying structures in otherwise unstructured data.

There is ample evidence in the literature of data mining methods being applied to classification problems such as debt scoring, credit scoring and bankruptcy prediction. Logistic regression and neural networks were used to model credit scoring (Desai, Crook and Overstreet, 1996). Zurada and Lonial compared the performance of neural networks, logistic regression, memory-based reasoning and a combined model in predicting debt recovery in the healthcare industry (Zurada and Lonial 2005). Hensher and Jones examined a range of classification techniques, such as logit and probit models and neural networks, in Advances in Credit Risk Modeling and Corporate Bankruptcy Prediction (UK: Cambridge Press 2008). Ho Ha and Krishnan used Cox's hazard model in addition to neural networks to predict credit card debt recovery (Ho Ha and Krishnan 2012).


The debt collection data set consists of thirty-seven variables, which include six continuous variables, seven binary variables, five date variables, thirteen categorical variables and four identifiers. The data set contains over two hundred thousand transactions. The input data set was a worked data set, and hence adjustments were made to take it back to its original phase. The main adjustment was to take the current balance back to the original phase (i.e. the current balance before the payment). The important change was made to the total net balance. The original variables themselves are not sufficient for the analysis. To overcome this issue, several variables were derived from the original data set.
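The excerpt does not give the formula used for this adjustment. As a minimal sketch of one plausible reading, the balance before a payment can be restored by adding the payment back to the recorded balance. The field names (`account_id`, `current_balance`, `payment_amount`), the derived variable `repaid_fraction` and the sample values are all hypothetical and are not taken from the study's data set.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    # Hypothetical fields; the study's actual variable names are not given.
    account_id: str
    current_balance: float  # balance as recorded after the payment
    payment_amount: float   # amount paid in this transaction


def balance_before_payment(tx: Transaction) -> float:
    """Assumed adjustment: add the payment back to the recorded balance."""
    return tx.current_balance + tx.payment_amount


def repaid_fraction(tx: Transaction) -> float:
    """Hypothetical derived variable: share of the restored balance that was paid."""
    original = balance_before_payment(tx)
    return tx.payment_amount / original if original > 0 else 0.0


# Illustrative values only, not data from the study.
tx = Transaction("A-001", current_balance=150.0, payment_amount=50.0)
restored = balance_before_payment(tx)
fraction = repaid_fraction(tx)
```

Restoring the balance first means that any derived ratio is computed against the amount owed before the payment rather than after it.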

2.1 Logistic Regression Model

Logistic regression is a regression method used when the response variable is dichotomous. The purpose of a logit model is to derive a mathematical equation that predicts the membership of a given case. …
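To illustrate how a logit model assigns membership to a dichotomous response, the following is a minimal sketch in plain Python. It is not the authors' model: the two predictors (balance in thousands of dollars and a prior-payment flag), the toy data, the learning rate and the number of epochs are all assumptions made for illustration. The model is fitted by batch gradient descent on the log-loss.

```python
import math


def sigmoid(z):
    """Logistic function: maps a real-valued score to a probability in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, x):
    """Probability of class 1 (debt repaid); weights[0] is the intercept."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))


def fit_logit(rows, labels, lr=0.1, epochs=2000):
    """Estimate intercept and coefficients by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict_proba(weights, x) - y
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


# Toy data, NOT from the study: [balance in $1000s, prior-payment flag]
X = [[0.5, 1], [1.0, 1], [1.5, 1], [2.0, 0], [4.0, 0], [5.0, 0], [6.0, 0], [0.8, 0]]
y = [1, 1, 1, 1, 0, 0, 0, 1]  # 1 = repaid, 0 = not repaid

w = fit_logit(X, y)
p_small = predict_proba(w, [0.7, 1])  # small balance, has paid before
p_large = predict_proba(w, [5.5, 0])  # large balance, no prior payment
```

A case is assigned to the "repaid" class when its predicted probability exceeds a chosen cut-off, commonly 0.5. In this toy example the small-balance account receives the higher probability.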
