Academic journal article Applied Health Economics and Health Policy

Estimating Incremental Costs with Skew

Article excerpt

Key points for decision makers

* Healthcare costs are often skewed, and in response, researchers have used log-based transformations

* When using log-based models, the incremental effects differ at different levels of the covariates, and this can dramatically affect predicted cost


Healthcare costs are often skewed: some patients incur costs far above the average (e.g. patients with rare complications). As a result, researchers estimating the incremental cost associated with a specific treatment, condition, or patient or provider characteristic must consider the implications of a skewed dependent variable.

A traditional approach to skewed costs is to perform ordinary least squares (OLS) estimation on the natural logarithm of cost, which reduces the influence of observations with extreme costs.[1-3] This approach was popularized by papers describing results of the RAND Health Insurance Experiment, whose authors found that the log transformation solved their skewed data problem.[4-6] A more recent approach is to use generalized linear models (GLMs), which allow more general transformations of the dependent variable and more general error distributions.[7-10] In particular, many researchers have used a log link or log transformation,[12-16] following Manning and Mullahy,[11] who examined the relative merits of log-transformed models and GLM alternatives with a log link. While these transformation methods mitigate the problems associated with extreme residuals, they introduce another: log-transformed OLS models and GLMs with a log link force the estimated incremental effect of each independent variable on cost to vary with the levels of the other independent variables in the model. As a result, the estimated interrelationships among the independent variables may not coincide with the underlying data-generating process, and estimation approaches designed to deal with skewed residuals may produce misleading inferences about the incremental costs associated with independent variables. Policymakers may conclude that the incremental cost associated with a specific treatment, condition, or patient or provider characteristic varies with the other covariates in the model when it may not. Whether it does needs to be determined empirically.
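
The dependence described in the paragraph above can be illustrated with a small numerical sketch. It assumes a log-link mean model of the form E[cost] = exp(b0 + b_treat*treat + b_age*age); the covariate (age) and all coefficient values are hypothetical and are not taken from the article.

```python
import math

# Hypothetical coefficients for an assumed log-link mean model:
#   E[cost | treat, age] = exp(B0 + B_TREAT * treat + B_AGE * age)
B0, B_TREAT, B_AGE = 7.0, 0.2, 0.03


def expected_cost(treat, age):
    """Expected cost implied by the assumed log-link model."""
    return math.exp(B0 + B_TREAT * treat + B_AGE * age)


def incremental_cost(age):
    """Incremental cost of treatment at a given age: E[cost|1] - E[cost|0]."""
    return expected_cost(1, age) - expected_cost(0, age)


# The treatment coefficient is constant, yet the incremental cost in
# currency units grows with age because the effect is multiplicative.
for age in (50, 65, 80):
    print(age, round(incremental_cost(age), 2))
```

In this sketch the incremental cost at age 80 exceeds that at age 50 by a factor of exp(B_AGE * 30), even though no interaction between treatment and age was specified. Under an identity link, the same comparison would return the same incremental cost at every age.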

The goal of this paper is to assess the properties of various estimators of incremental cost. We compare four commonly used estimators: a GLM with a Gaussian family and an identity link (the OLS model), GLMs with gamma and Gaussian families and log links, and the extended estimating equations (EEE) estimator.[17,18] We assess how well each estimates the incremental cost changes associated with a randomized behavioural intervention in pain treatment for hospitalized patients with hip fractures.[19,20]

We also simulated data. Specifically, we estimate the incremental cost of a discrete independent variable, defined as a 'treatment', when the treatment is linearly related to cost and a portion of the population has excessive cost. Previous literature comparing the properties of log-transformed OLS models with various GLM specifications used underlying non-linear simulation models as the basis of the comparisons.[11,21-23] However, to the best of our knowledge, no one has examined the properties of various estimators when the underlying cost model is linear in the independent variables and the skew is caused by a percentage of patients having excessive costs not attributable to measured variables. In this study, we performed a series of simulations in which we varied the size of the excessive cost and the percentage of the population with excessive cost. In addition, a portion of the simulated patients in these models receive a 'treatment' that is linearly related to cost. Our goal is to estimate the incremental cost of this treatment from the simulated data. Simulations were performed with and without an additional measured covariate to assess the effect of covariates on the incremental treatment cost estimates from the various estimators. …
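
A data-generating process of the kind described above might be sketched as follows. This is not the authors' simulation design: the baseline cost (5000), noise standard deviation (500), 50% treatment share, and all function and parameter names are assumptions made for illustration. Without an additional covariate, the OLS coefficient on a binary treatment indicator equals the difference in group means, so that is what the sketch computes.

```python
import random
import statistics


def simulate_costs(n, treat_effect, excess_cost, excess_share, seed=0):
    """Simulate (treat, cost) pairs from a linear cost model with skew.

    A share `excess_share` of patients receives an additional
    `excess_cost` that is unrelated to treatment (the source of skew).
    The baseline cost and noise level are hypothetical values.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        treat = 1 if rng.random() < 0.5 else 0
        excess = excess_cost if rng.random() < excess_share else 0.0
        cost = 5000.0 + treat_effect * treat + rng.gauss(0.0, 500.0) + excess
        rows.append((treat, cost))
    return rows


def ols_treatment_effect(rows):
    """OLS estimate of incremental cost with only a treatment indicator.

    With an intercept and one binary regressor, the OLS slope is the
    difference between the treated and control group means.
    """
    treated = [cost for treat, cost in rows if treat == 1]
    control = [cost for treat, cost in rows if treat == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


rows = simulate_costs(n=20000, treat_effect=1000.0,
                      excess_cost=20000.0, excess_share=0.05)
estimate = ols_treatment_effect(rows)
costs = [cost for _, cost in rows]
print("estimated incremental cost:", round(estimate, 1))
print("mean cost:", round(statistics.fmean(costs), 1),
      "median cost:", round(statistics.median(costs), 1))
```

Varying `excess_cost` and `excess_share` mirrors the two dimensions the simulations are said to vary. The gap between mean and median cost gives a quick check that the simulated costs are right-skewed.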
