Target Marketing with Logit Regression

Article excerpt

Target marketing requires selecting customers on the basis of sociodemographic characteristics ... logit regression can help to identify the right characteristics... demonstrates with the help of an example how this method can be used.

As competition increases and budgets grow tighter, companies are looking for cheaper and more effective ways to sell their products and services. Greater emphasis is placed on identifying those customers who are most likely to purchase. In the past, targeting those customers may have been done in an ad hoc manner, depending on how historical sales data looked in tables and cross-tabs.

Now the movement is toward a statistical modeling framework which can be implemented at product introduction or any time during the product's life cycle.


Although the ad hoc analysis of tables and cross-tabs for targeting customers is very informative, there are some drawbacks. First, it is difficult to examine more than two or three dimensions of the data at once.

What you would like to do is account for as many customer characteristics (factors) as possible while trying to answer a particular question. For example, do family size, household income, credit card debt, and consumer age have an impact on the decision to purchase a product? These types of questions can be answered only with a statistically based model.

Second, conclusions drawn from cross-tab analysis may not take sampling error into account. Constructing a statistically based model allows the analyst to quantify behavior for a number of factors simultaneously and to draw conclusions that are statistically valid. A statistically based framework also has the added flexibility of testing numerous specifications. As a result, a best solution can be selected from a set of plausible solutions. Benchmarks can then be established for comparison with future analyses as more of the product is sold.


Standard econometric methods like ordinary least squares were designed for evaluating variables that can assume any value within a range, i.e., continuous variables. These methods are usually appropriate for examining data that have been accumulated over time into totals representing aggregate market response.

When the outcome variable is not continuous, for example, the decision to buy or not to buy a product, other techniques are needed to properly measure and evaluate the decision-making process. These techniques are called Discrete Choice Models.

The discrete choice technique discussed here is Logit Regression. Numerous statistical packages are available to handle these types of models (SAS, SPSS, SHAZAM, LIMDEP). In the simplest of cases, the consumer is faced with only two choices: (1) to purchase a product or (2) not to purchase. The consumer is assumed, in general, to make the decision in such a way as to maximize his or her utility.

One of the advantages of discrete choice methods is that they treat the decision-making process in a probabilistic manner. Once the equation is estimated, we can project the probability that a consumer will purchase the product based upon a set of explanatory criteria (education, income, age, etc.). This probability (ranging from 0 to 100%) can be interpreted as a score and used to rank customers from those most likely to purchase the product to those least likely.

Setting up the data for estimation looks very similar to ordinary least squares. The main difference is that the dependent variable (whether or not the product was purchased) is coded as a zero or one. Moreover, the logit procedure is non-linear: it models the log of the odds of purchase, rather than the probability itself, as a linear function of the explanatory variables.
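The workflow described above (code the outcome as zero or one, estimate the equation, score each customer, rank by score) might be sketched as follows. This is a minimal illustration in Python, not the procedure used by the packages named above. The data are synthetic, the two standardized customer characteristics and the coefficients in TRUE_BETA are hypothetical, and the plain gradient-ascent fitter is an assumption standing in for the maximum likelihood routine a statistical package would supply.

```python
import math
import random


def sigmoid(z):
    """Logistic function: converts log-odds z into a probability in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def score(beta, row):
    """Predicted purchase probability for one customer (beta[0] is the intercept)."""
    xs = [1.0] + list(row)
    return sigmoid(sum(b * x for b, x in zip(beta, xs)))


def fit_logit(X, y, lr=0.5, epochs=500):
    """Estimate logit coefficients by gradient ascent on the mean log-likelihood.

    X is a list of feature rows (without intercept), y a list of 0/1 outcomes.
    Returns [intercept, b1, b2, ...].
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for row, target in zip(X, y):
            xs = [1.0] + list(row)
            p = sigmoid(sum(b * x for b, x in zip(beta, xs)))
            for j in range(k + 1):
                grad[j] += (target - p) * xs[j]  # gradient of the log-likelihood
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


# Synthetic customers: two standardized characteristics (hypothetical, e.g.
# income and age), with purchases drawn from a known logit model.
random.seed(0)
TRUE_BETA = [-0.5, 1.2, -0.8]  # hypothetical intercept and slopes
customers = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(400)]
purchased = [1 if random.random() < score(TRUE_BETA, c) else 0 for c in customers]

# Estimate the equation, then score and rank every customer.
beta_hat = fit_logit(customers, purchased)
scores = [score(beta_hat, c) for c in customers]
ranking = sorted(range(len(customers)), key=lambda i: scores[i], reverse=True)

# Compare the purchase rate among the top-scored 10% with the overall rate.
top_decile = ranking[: len(ranking) // 10]
top_rate = sum(purchased[i] for i in top_decile) / len(top_decile)
overall_rate = sum(purchased) / len(purchased)
print("estimated coefficients:", [round(b, 2) for b in beta_hat])
print("purchase rate, top decile vs. overall:", top_rate, overall_rate)
```

In a targeting application, the customers at the head of `ranking` would be the first candidates for a mailing or sales call. The comparison of the top-decile purchase rate with the overall rate is one simple check that the scores separate likely buyers from unlikely ones.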

Once estimated, the equation can be used to show how the probability of purchase varies across different values of an explanatory variable. Since logit regression is a non-linear technique, it is able to capture certain curvilinear relationships that may exist. …
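To make the curvilinear behavior concrete, the sketch below evaluates a single-variable logit equation over a range of incomes. The intercept and slope are hypothetical values chosen for illustration, not estimates from any data set, and income in thousands is an assumed unit. The log-odds rise by the same amount for every step in income, but the resulting change in probability is small at the extremes and largest near 50%, since the slope of the probability curve is slope × p × (1 − p).

```python
import math


def purchase_probability(income, intercept=-4.0, slope=0.08):
    """Probability of purchase from a one-variable logit equation.

    income is in thousands (assumed unit); intercept and slope are
    hypothetical coefficients used only for illustration.
    """
    log_odds = intercept + slope * income
    return 1.0 / (1.0 + math.exp(-log_odds))


incomes = list(range(10, 101, 10))
probs = [purchase_probability(x) for x in incomes]

for x, p in zip(incomes, probs):
    marginal = 0.08 * p * (1.0 - p)  # change in probability per unit of income
    print(f"income={x:>3}  probability={p:.3f}  marginal effect={marginal:.4f}")

# The same 10-unit step in income moves the probability by different amounts.
low_step = purchase_probability(20) - purchase_probability(10)
mid_step = purchase_probability(50) - purchase_probability(40)
print("change from 10 to 20:", round(low_step, 3))
print("change from 40 to 50:", round(mid_step, 3))
```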