Academic journal article Iranian Journal of Public Health

Multiple Imputation to Deal with Missing Clinical Data in Rheumatologic Surveys: An Application in the WHO-ILAR COPCORD Study in Iran

Article excerpt

Abstract

Background: This article demonstrates an application of multiple imputation (MI) for handling missing clinical data in rheumatologic surveys, using data from 10291 people who participated in the first phase of the Community Oriented Program for Control of Rheumatic Disorders (COPCORD) in Iran.

Methods: Five data subsets were produced from the original data set. Certain demographics were selected as complete variables. In each subset, we created a univariate pattern of missingness for knee osteoarthritis status, the outcome variable (disease), using different mechanisms and percentages. The crude disease proportion and its standard error were estimated separately for each complete data set and used as true (baseline) values for calculating percent bias. The same parameters were also estimated for each incomplete data subset using two approaches to missing data: complete case analysis (CCA) and MI with various numbers of imputations. The two approaches were compared using an appropriate analysis of variance.

Results: With CCA, the percent bias associated with missing data was 8.67 (95% CI: 7.81-9.53) for the proportion and 13.67 (95% CI: 12.60-14.74) for the standard error. With MI (M=15), the corresponding values were 6.42 (95% CI: 5.56-7.29) and 10.04 (95% CI: 8.97-11.11). Percent bias in estimating the disease proportion and its standard error was significantly lower with MI than with CCA (P<0.05).

Conclusion: To estimate the prevalence of rheumatic disorders such as knee osteoarthritis, applying MI using available demographics is superior to CCA.

Keywords: Rheumatology, Osteoarthritis, Missing Data, Imputation, COPCORD
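The comparison summarized in the Methods can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the article: the data are synthetic (not COPCORD data), a single binary demographic `older` stands in for the complete variables, percent bias is taken as 100 × |estimate − true| / true, and the imputation model is a simple stratified Beta-Bernoulli draw pooled with Rubin's rules. The excerpt does not state the authors' actual imputation model or bias formula, so all function names and parameters here are hypothetical.

```python
import math
import random


def proportion_and_se(values):
    """Crude proportion of a 0/1 list and its binomial standard error."""
    n = len(values)
    p = sum(values) / n
    return p, math.sqrt(p * (1 - p) / n)


def percent_bias(estimate, true_value):
    """Assumed definition: absolute relative deviation from the baseline, in percent."""
    return 100.0 * abs(estimate - true_value) / true_value


def complete_case(records):
    """Complete case analysis: drop records whose outcome is missing."""
    observed = [r["disease"] for r in records if r["disease"] is not None]
    return proportion_and_se(observed)


def multiple_imputation(records, m=15, seed=0):
    """Illustrative MI: impute within strata of a complete demographic."""
    rng = random.Random(seed)
    # Observed [healthy, diseased] counts per stratum of the complete variable.
    counts = {}
    for r in records:
        c = counts.setdefault(r["older"], [0, 0])
        if r["disease"] is not None:
            c[r["disease"]] += 1
    estimates, variances = [], []
    for _ in range(m):
        # Draw each stratum's disease probability from its Beta posterior
        # (uniform prior), so that parameter uncertainty is propagated.
        p_star = {k: rng.betavariate(1 + c[1], 1 + c[0]) for k, c in counts.items()}
        completed = [
            r["disease"] if r["disease"] is not None
            else int(rng.random() < p_star[r["older"]])
            for r in records
        ]
        p, se = proportion_and_se(completed)
        estimates.append(p)
        variances.append(se ** 2)
    # Rubin's rules: pooled estimate, within- and between-imputation variance.
    q_bar = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)


# Synthetic complete data: disease is more common in the "older" stratum.
rng = random.Random(42)
records = []
for _ in range(20000):
    older = rng.random() < 0.5
    disease = int(rng.random() < (0.30 if older else 0.10))
    records.append({"older": older, "disease": disease})
true_p, true_se = proportion_and_se([r["disease"] for r in records])

# Missingness in the outcome depends only on the complete demographic (MAR).
incomplete = [
    {"older": r["older"],
     "disease": None if rng.random() < (0.45 if r["older"] else 0.15) else r["disease"]}
    for r in records
]

cca_p, cca_se = complete_case(incomplete)
mi_p, mi_se = multiple_imputation(incomplete, m=15)
print("CCA percent bias (proportion):", round(percent_bias(cca_p, true_p), 2))
print("MI  percent bias (proportion):", round(percent_bias(mi_p, true_p), 2))
print("CCA percent bias (SE):", round(percent_bias(cca_se, true_se), 2))
print("MI  percent bias (SE):", round(percent_bias(mi_se, true_se), 2))
```

In this toy setting the demographic fully explains the missingness, so the imputation model is correctly specified by construction; real survey data would rarely be this favorable, and the sketch should not be read as reproducing the reported percentages.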

Introduction

Missing data are an unavoidable challenge in most epidemiologic research and occur under various mechanisms (1). Missing completely at random (MCAR) refers to a condition where missingness is not related to any of the studied variables. In missing at random (MAR), missingness is random only conditionally: it is unrelated to the value of the variable of interest once other study variables are taken into account, but it is related to those other variables. Missing not at random (MNAR) is the case where missingness depends on the values of the variable of interest itself (2-3).
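The three mechanisms can be made concrete with a small simulation. The sketch below is illustrative only: the population is synthetic, the binary covariate `older` and the missingness probabilities are assumptions chosen for the example, and none of the names come from the COPCORD study.

```python
import random


def simulate_population(n=20000, seed=1):
    """Synthetic people with one complete covariate and a binary disease status."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        older = rng.random() < 0.5
        disease = int(rng.random() < (0.30 if older else 0.10))
        people.append({"older": older, "disease": disease})
    return people


def make_missing(people, mechanism, rate=0.3, seed=2):
    """Return a copy with 'disease' set to None under the chosen mechanism."""
    rng = random.Random(seed)
    result = []
    for person in people:
        if mechanism == "MCAR":
            # Same probability for everyone, unrelated to any variable.
            p_missing = rate
        elif mechanism == "MAR":
            # Depends only on the fully observed covariate.
            p_missing = rate * 1.5 if person["older"] else rate * 0.5
        elif mechanism == "MNAR":
            # Depends on the value that goes missing.
            p_missing = rate * 1.5 if person["disease"] == 1 else rate * 0.5
        else:
            raise ValueError("mechanism must be 'MCAR', 'MAR' or 'MNAR'")
        masked = dict(person)
        if rng.random() < p_missing:
            masked["disease"] = None
        result.append(masked)
    return result


def observed_proportion(people):
    """Disease proportion among records whose status is observed."""
    observed = [p["disease"] for p in people if p["disease"] is not None]
    return sum(observed) / len(observed)


population = simulate_population()
true_proportion = observed_proportion(population)
for mechanism in ("MCAR", "MAR", "MNAR"):
    incomplete = make_missing(population, mechanism)
    print(mechanism, round(observed_proportion(incomplete), 3),
          "vs complete-data", round(true_proportion, 3))
```

Under these assumed settings, the proportion among observed cases stays close to the complete-data value for MCAR and drifts away from it for MAR and MNAR, which is why the mechanism matters when choosing how to handle missing data.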

In cross-sectional surveys, as in any other type of observational study, missing data can result from incomplete responses and a low rate of respondent cooperation (4-5). However, the probability of missingness is not equal for all variables; those collected by methods that are less costly and less reliant on participant cooperation are also less likely to have missing data. For example, demographics can be collected through simple approaches that depend little on subject participation, while clinical data such as disease status require at least taking a medical history and performing a physical exam. In some cases, they can be obtained only through expensive, invasive or time-consuming diagnostic procedures, together with subject consent and participation at every stage.

Rheumatologic studies are not exempt either. A typical example is the first phase of the Community Oriented Program for Control of Rheumatic Disorders (COPCORD), performed in 2005 in Tehran, the capital of Iran, by the Rheumatology Research Center of Tehran University of Medical Sciences in collaboration with the World Health Organization (WHO) and the International League of Associations for Rheumatology (ILAR) to determine the pattern of rheumatic complaints and disorders in this region. As the first step of data gathering, trained health care providers conducted a short preliminary interview to find eligible individuals in each randomly selected household, considering their demographic characteristics. Selected participants were then approached at their homes to gather the main clinical data on their rheumatic complaints and disorders through verbal interview, and consenting participants underwent a physical exam and diagnostic tests performed by trained physicians and clinicians. …
