Coauthored with Timothy Carter, Christopher Kam and Kalina Popova
All imputation methods depend on assumptions about how data come to be missing. Denote partially observed variable(s) as Y, fully observed variables as X, a missing/observed indicator matrix as M, and let the subscripts “obs” and “miss” signify observed and missing data respectively. The implications of various missingness assumptions for using listwise deletion (LD) and multiple imputation (MI) are summarized in Table B.1.
The goal of imputation is not to optimize any objective function but, rather, to generate imputations that reflect as accurately as possible the process that generated the original data. Consider incomplete data generated by flipping an unbalanced coin that lands “heads” with probability .6. One could minimize the error between the “real” (but unobserved) data and any imputations by setting every missing value equal to “heads.” However, this would be inferior to an imputation technique that emulates the original data-generating process by imputing “heads” with probability .6 and “tails” with probability .4, because setting all the missing values to “heads” would bias the point estimates.1
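The coin example can be checked with a short simulation. The following is a minimal sketch, not part of the original analysis: the sample size, the 30 percent missingness rate, the assumption that flips go missing completely at random, and the function name `simulate` are all illustrative assumptions. It compares the "all heads" rule with draws from the estimated heads probability, reporting both the per-value error on the missing flips and the resulting estimate of the proportion of heads.

```python
import random


def simulate(n=100_000, p_heads=0.6, p_missing=0.3, seed=0):
    """Compare two imputation rules for an unbalanced coin (illustrative only)."""
    rng = random.Random(seed)

    # True data: 1 = heads, 0 = tails.
    truth = [1 if rng.random() < p_heads else 0 for _ in range(n)]
    # Assumption: each flip is missing completely at random.
    missing = [rng.random() < p_missing for _ in range(n)]

    # Estimate the heads probability from the observed flips only.
    observed = [y for y, m in zip(truth, missing) if not m]
    p_hat = sum(observed) / len(observed)

    # Rule A: set every missing value to heads (minimizes per-value error).
    all_heads = [1 if m else y for y, m in zip(truth, missing)]
    # Rule B: draw each missing value from the estimated process.
    drawn = [
        (1 if rng.random() < p_hat else 0) if m else y
        for y, m in zip(truth, missing)
    ]

    n_missing = sum(missing)

    def error_rate(imputed):
        # Share of missing flips whose imputed value differs from the truth.
        wrong = sum(
            1 for y, yi, m in zip(truth, imputed, missing) if m and y != yi
        )
        return wrong / n_missing

    return {
        "error_all_heads": error_rate(all_heads),
        "error_drawn": error_rate(drawn),
        "mean_all_heads": sum(all_heads) / n,
        "mean_drawn": sum(drawn) / n,
    }


results = simulate()
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the "all heads" rule should misclassify fewer missing flips, yet it pushes the estimated proportion of heads well above .6, whereas the random draws leave that estimate close to .6.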
Rubin (1977) uses a simple regression approach to motivate MI. Imagine a simple linear relationship between Y and X:____________________