Estimation and Inference for Logistic Regression with Covariate Misclassification and Measurement Error in Main Study/Validation Study Designs

By Spiegelman, Donna; Rosner, Bernard et al. | Journal of the American Statistical Association, March 2000 | Go to article overview

Estimation and Inference for Logistic Regression with Covariate Misclassification and Measurement Error in Main Study/Validation Study Designs


Spiegelman, Donna, Rosner, Bernard, Logan, Roger, Journal of the American Statistical Association


In epidemiological studies, continuous covariates often are measured with error and categorical covariates often are misclassified. Using the logistic regression model to represent the relationship between the binary outcome and the perfectly measured and classified covariates, the model for the observed main study data is derived. This derivation relies on the assumption that the error in the continuous covariates is multivariate normally distributed and uses a chain of logistic regression models to describe the misclassification processes. These model assumptions are empirically verified in the validation study, where the misclassified and mismeasured covariates are validated using perfectly measured and classified data. The full data likelihood, including contributions from both the main study and the validation study, is maximized to obtain the maximum likelihood estimates for the parameters of the underlying logistic regression model and of the measurement error model and reclassification models simulta neously. Standard asymptotic theory is applied. An example of this methodology is presented from the Nurses' Health Study investigating the relationship between cumulative incidence of breast cancer and saturated fat, total energy, and alcohol intake. A detailed simulation study was conducted to investigate the small-sample properties of these likelihood-based estimates and inferential quantities. No single estimation/inference option performed satisfactorily when the main study/validation study size was representative of that typically encountered in practice. When the validation size was twice or larger than from the usual one, features of asymptotic optimality were more apparent. By example and through simulation, the procedures appeared to be robust to misspecification of the order of the chain of conditional measurement error/reclassification models.

KEY WORDS: Breast cancer; Epidemiology; Logistic regression; Maximum likelihood; Measurement error; Misclassification; Regression calibration.

1. INTRODUCTION

Numerous statistical investigators have proposed methods for estimation and, to a lesser extent, inference for binary response data with covariate measurement error in one or more continuous exposure variables. (See Carroll, Ruppert, and Stefanski 1995 for a recent comprehensive review.) Computational barriers have impeded the use of maximum likelihood (ML) methods for routine data analysis of binary response models with measurement error, usually taken to be Gaussian, and/or misclassification in model covariates. Because of these barriers, investigators have devised other estimators that are more computationally feasible but still inconsistent and inefficient (Armstrong 1985; Stefanski 1989; Stefanski and Carroll 1985). These estimators are approximately consistent under additional assumptions, involving small measurement error or small relative risk. One inconsistent estimator has enjoyed widespread popularity in applications, because of its intuitive appeal and computational simplicity (Rosner, Spiegelman , and Willett 1990, 1992). Because the maximum likelihood estimator (MLE) is both consistent and efficient, in the absence of obstacles to its implementation it is a desirable choice.

In this article we present efficient computational strategies to facilitate routine maximum likelihood calculations for binary response data with covariate measurement error and reclassification processes of arbitrary complexity in one or more model covariates. Likelihood-based estimation and inference in this setting has the additional advantage of offering full flexibility in model choices. In particular, the model that we present accommodate both measurement error in continuous model covariates and misclassification in categorical ones. These models arise in the analysis of data obtained to study the epidemiology of chronic diseases in relation to nutritional, environmental, and other determinants. …

The rest of this article is only available to active members of Questia

Already a member? Log in now.

Notes for this article

Add a new note
If you are trying to select text to create highlights or citations, remember that you must now click or tap on the first word, and then click or tap on the last word.
One moment ...
Default project is now your active project.
Project items

Items saved from this article

This article has been saved
Highlights (0)
Some of your highlights are legacy items.

Highlights saved before July 30, 2012 will not be displayed on their respective source pages.

You can easily re-create the highlights by opening the book page or article, selecting the text, and clicking “Highlight.”

Citations (0)
Some of your citations are legacy items.

Any citation created before July 30, 2012 will labeled as a “Cited page.” New citations will be saved as cited passages, pages or articles.

We also added the ability to view new citations from your projects or the book or article where you created them.

Notes (0)
Bookmarks (0)

You have no saved items from this article

Project items include:
  • Saved book/article
  • Highlights
  • Quotes/citations
  • Notes
  • Bookmarks
Notes
Cite this article

Cited article

Style
Citations are available only to our active members.
Buy instant access to cite pages or passages in MLA, APA and Chicago citation styles.

(Einhorn, 1992, p. 25)

(Einhorn 25)

1. Lois J. Einhorn, Abraham Lincoln, the Orator: Penetrating the Lincoln Legend (Westport, CT: Greenwood Press, 1992), 25, http://www.questia.com/read/27419298.

Cited article

Estimation and Inference for Logistic Regression with Covariate Misclassification and Measurement Error in Main Study/Validation Study Designs
Settings

Settings

Typeface
Text size Smaller Larger Reset View mode
Search within

Search within this article

Look up

Look up a word

  • Dictionary
  • Thesaurus
Please submit a word or phrase above.
Print this page

Print this page

Why can't I print more than one page at a time?

Help
Full screen

matching results for page

    Questia reader help

    How to highlight and cite specific passages

    1. Click or tap the first word you want to select.
    2. Click or tap the last word you want to select, and you’ll see everything in between get selected.
    3. You’ll then get a menu of options like creating a highlight or a citation from that passage of text.

    OK, got it!

    Cited passage

    Style
    Citations are available only to our active members.
    Buy instant access to cite pages or passages in MLA, APA and Chicago citation styles.

    "Portraying himself as an honest, ordinary person helped Lincoln identify with his audiences." (Einhorn, 1992, p. 25).

    "Portraying himself as an honest, ordinary person helped Lincoln identify with his audiences." (Einhorn 25)

    "Portraying himself as an honest, ordinary person helped Lincoln identify with his audiences."1

    1. Lois J. Einhorn, Abraham Lincoln, the Orator: Penetrating the Lincoln Legend (Westport, CT: Greenwood Press, 1992), 25, http://www.questia.com/read/27419298.

    Cited passage

    Thanks for trying Questia!

    Please continue trying out our research tools, but please note, full functionality is available only to our active members.

    Your work will be lost once you leave this Web page.

    Buy instant access to save your work.

    Already a member? Log in now.

    Oops!

    An unknown error has occurred. Please click the button below to reload the page. If the problem persists, please try again in a little while.