# Predicting Customers Churn in a Relational Database

By Cimpoeru, Catalin; Andreescu, Anca | Informatica Economica, July 1, 2014

(ProQuest: ... denotes formulae omitted.)

1 Introduction

Nowadays, predictive analytics is one of the most common buzzwords. Its methods and concepts are scientific, grounded in computer programming, mathematics and statistics, yet interest in the topic has spread well beyond academia and research to the business world and, more recently, to the general public. In 2012, Nate Silver, editor-in-chief of the FiveThirtyEight blog and author of the book The Signal and the Noise, became famous for correctly predicting the winner of the United States presidential election in all 50 states and the District of Columbia. In the same year, Harvard Business Review published a frequently cited article that described data scientist as "the sexiest job of the 21st century" [1]. In 2014, Goldman Sachs, one of the most prestigious financial institutions in the world, published a substantial paper presenting its predictions for the Football World Cup held in Brazil [2]. Nate Silver and his team at FiveThirtyEight also offered alternative predictions for the main football event of the year. In the same year, Warren Buffett, the well-known investor and one of the richest people on Earth, pushed the prediction challenge further by announcing that he would pay one billion US dollars to anyone who correctly predicted the outcome of all 63 games in that year's NCAA men's college basketball tournament [3].

In most cases, however, predictive analysis is aimed at solving more "serious" problems, such as identifying customers with a propensity to churn, detecting fraud or online spam, assessing the risk of an investment, forecasting sales, predicting a medical diagnosis, etc.

Data science libraries are continuously being created around open source programming languages such as R and Python. They are free and quickly integrate new algorithms. At the same time, the main commercial database systems (SQL Server, Oracle etc.) have introduced their own data mining modules, which offer ease of use for analysts and the possibility of building data models and integrating them with the relational database system. Starting from such data, available in a relational database, the objective of the current paper is to show and explain a few models for predicting which customers have a high propensity to churn once their products expire. Identifying these cases is highly important for every business built on the subscription model (telecom, antivirus, cable television etc.), as reducing the churn rate among customers can improve both the retention rate and the profitability of the business. According to The Chartered Institute of Marketing [4], in some cases preventing a customer from churning costs 5 to 20 times less than acquiring a new customer.

2 The Data in the Relational Database

The starting point of this project is a relational database consisting of 12 tables, designed as a snowflake schema and built with Microsoft SQL Server 2012. We deal with the software industry, where a company sells its products online using the subscription model mentioned above. The data refer to online transactions, products, customer details, offers, licenses, customer care activities involving employees and customers, online incidents, regions etc. Based on the existing data, we can determine whether each customer is still a customer or not. However, there remain a few thousand customers (both new and renewing) whose behavior we would like to predict, so that the company can apply a commercial "treatment" to those with a high probability of churning.

In order to build a complete profile of all the customers (with as many relevant variables as possible), we created two SQL views so that we can bring together:

* the data describing the "customers", basically including attributes like gender, marital status, income etc. …
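As a rough illustration of how a view can gather customer attributes into one profile row per customer, the sketch below uses Python's built-in `sqlite3` module rather than SQL Server 2012, purely so that it runs anywhere. The table names (`customers`, `licenses`), the column names and the sample rows are hypothetical and are not taken from the paper's schema. Only the customer-side attributes named above (gender, marital status, income) are modeled, together with a simple license count as an example of a derived variable.

```python
import sqlite3

# In-memory database; every table, column and value here is a hypothetical
# stand-in for the paper's SQL Server schema, which the text does not show.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (
        customer_id    INTEGER PRIMARY KEY,
        gender         TEXT,
        marital_status TEXT,
        income         REAL
    );
    CREATE TABLE licenses (
        license_id  INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id)
    );
    -- One row per customer: descriptive attributes plus a derived count.
    CREATE VIEW customer_profile AS
    SELECT c.customer_id,
           c.gender,
           c.marital_status,
           c.income,
           COUNT(l.license_id) AS license_count
    FROM customers AS c
    LEFT JOIN licenses AS l ON l.customer_id = c.customer_id
    GROUP BY c.customer_id, c.gender, c.marital_status, c.income;
    """
)

conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [(1, "F", "married", 52000.0), (2, "M", "single", 38000.0)],
)
conn.executemany(
    "INSERT INTO licenses VALUES (?, ?)",
    [(10, 1), (11, 1)],
)

# Query the view like an ordinary table.
rows = conn.execute(
    "SELECT customer_id, gender, marital_status, income, license_count "
    "FROM customer_profile ORDER BY customer_id"
).fetchall()
for row in rows:
    print(row)
```

The `LEFT JOIN` keeps customers who have no license rows, so they appear in the profile with a count of zero instead of being dropped from the data set.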
