The FU-PLOT: A Graphical Method for Visualizing the Timing of Follow-Up in Longitudinal Studies
Lesser, Martin L., Kohn, Nina E., Napolitano, Barbara A., Pahwa, Savita, The American Statistician
The longitudinal follow-up of patients is a feature of many clinical studies. To facilitate data collection and eventual statistical analysis, the investigator may try to collect follow-up data at scheduled times. However, irregular or incomplete follow-up is common, creating "holes" in a patient's vector of data points, which can complicate data analysis.
The reasons for incomplete data in such studies include patient death or dropout, failure to undergo some follow-up evaluations, evaluations made at times other than those scheduled, and patients entering the study at different times or ages, with no data available prior to the date of entry.
From a statistical perspective, incomplete or irregularly collected data present problems for analysis. For example, missing data can complicate repeated measures analyses or analyses involving time-varying covariates.
In order to carry out longitudinal data analysis, the statistician, as well as the other investigators, must have some idea as to how regular or irregular the data collection has been in the set of patients to be studied. While this can be done by printing out a tabular summary of data, sorted by patient and by time within patient, a graphical display of this information may enhance the investigator's ability to judge the frequency and patterns of missing data so that an analytic strategy can be developed. This is particularly true when there is a large volume of data that may be more concisely displayed in a graph as opposed to a data listing.
In this article, we describe a graphical method for displaying data collection events over time that may help to visualize the volume and patterns of data and assist the statistician in preparing for statistical analysis in the presence of missing data or nonuniform data collection. We have coined the term FU-PLOT (for Follow-Up PLOT) to describe this class of graphical techniques for use with follow-up data. We have applied this technique to a longitudinal follow-up study of young children born to intravenous drug-abusing mothers who have human immunodeficiency virus (HIV).
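As a rough illustration of the idea of plotting data collection events over time, the following sketch prints a text-only display with one row per patient and one column per month of age. It is not the authors' FU-PLOT design, which is not specified in this section. The function name `text_fu_plot`, the patient IDs, and the visit ages are all hypothetical, made up for the example.

```python
def text_fu_plot(visits, max_month, mark="x", blank="."):
    """Return one text row per patient; column j is month j of age.

    `visits` maps a patient ID to the ages (in months) at which data
    were collected. Hypothetical layout, not the published FU-PLOT.
    """
    rows = []
    for pid in sorted(visits):
        cells = [blank] * (max_month + 1)
        for month in visits[pid]:
            # Ignore events outside the plotted time range.
            if 0 <= month <= max_month:
                cells[month] = mark
        rows.append(f"{pid:>4} " + "".join(cells))
    return rows


# Made-up illustrative data: age in months at each blood test.
example_visits = {
    "P01": [0, 6, 12, 18, 24],  # followed on schedule from birth
    "P02": [9, 11, 20],         # entered after birth, irregular follow-up
    "P03": [0, 5],              # no data after 5 months
}

for line in text_fu_plot(example_visits, max_month=24):
    print(line)
```

Even in this crude form, late entry (leading blanks), early termination (trailing blanks), and irregular spacing are visible at a glance, which is the kind of pattern a data listing tends to obscure.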
A related technique, the "EVENTCHART," has been published by Goldman (1992). Her graphical method applies mainly to plotting single or multiple outcome events (e.g., death, graft-versus-host disease, infection), whereas our method was designed to display data collection events (e.g., blood tests, X-rays, surveys) over time.
2. DESCRIPTION OF THE HIV STUDY
The data set was generated for a longitudinal follow-up study of children born to intravenous drug-abusing (IVDA) mothers who are positive for HIV. It is entitled "HIV Infections in Children of IV Drug Abusers" and is funded by the National Institutes of Health, National Institute on Drug Abuse. Although the study was originally designed to follow newborns, it also includes eligible children who entered the study at some time after birth. In this longitudinal study, a wealth of data is collected over time. Such data consist of basic clinical information, signs and symptoms of AIDS, and laboratory data.
The initial plan called for blood tests at six-month intervals; in fact, blood tests are performed more frequently in some patients and less frequently in others. Also, not all patients enter the study at birth, because they are often referred from other hospitals when they exhibit signs or symptoms. There are also some patients who die or are lost to follow-up, for whom no further data are collected.
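The departures from the six-month schedule described above can be summarized numerically before any plotting. The sketch below is a minimal illustration, assuming each patient's blood tests are recorded as ages in months and that each patient has at least one test. The function names, the one-month tolerance, and the sample data are assumptions made for the example, not part of the study protocol.

```python
def visit_gaps(ages_months):
    """Return the intervals (in months) between consecutive tests."""
    ages = sorted(ages_months)
    return [later - earlier for earlier, later in zip(ages, ages[1:])]


def summarize_follow_up(visits, planned_gap=6, tolerance=1):
    """Summarize entry age, test count, and schedule departures per patient.

    `tolerance` (months) is an assumed allowance around the planned gap.
    """
    summary = {}
    for pid, ages in visits.items():
        gaps = visit_gaps(ages)
        summary[pid] = {
            "entry_age": min(ages),  # 0 means entry at birth
            "n_tests": len(ages),
            "gaps": gaps,
            # Count gaps that differ from the plan by more than the tolerance.
            "off_schedule": sum(abs(g - planned_gap) > tolerance for g in gaps),
        }
    return summary


# Made-up illustrative data: age in months at each blood test.
sample_visits = {
    "P01": [0, 6, 12, 18],  # on schedule
    "P02": [9, 11, 20],     # referred at 9 months, tested irregularly
}

for pid, info in summarize_follow_up(sample_visits).items():
    print(pid, info)
```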
One of the objectives of the study is to see how CD4+ T-cell subset levels and changes in levels correlate with disease progression and survival. Such questions can be addressed through simple univariate analyses, repeated measures analyses, autoregressive modeling, and proportional hazards modeling with time-dependent covariates. Each of these analyses can be carried out only if certain patterns of data availability exist. …
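One simple way to check a data availability pattern is to ask, for each patient, whether a measurement exists near each scheduled time. The sketch below is a minimal illustration, assuming scheduled ages of 0, 6, 12, 18, and 24 months and a one-month window around each. These values, the function names, and the sample data are hypothetical. Requiring a measurement at every scheduled time is only one possible criterion (for example, for a balanced repeated measures analysis); other analyses may need different patterns.

```python
def availability_matrix(visits, scheduled=(0, 6, 12, 18, 24), window=1):
    """Map each patient to a list of booleans, one per scheduled time.

    An entry is True if any measurement falls within `window` months
    of that scheduled time (assumed matching rule).
    """
    matrix = {}
    for pid, ages in visits.items():
        matrix[pid] = [
            any(abs(age - target) <= window for age in ages)
            for target in scheduled
        ]
    return matrix


def complete_cases(matrix):
    """Return the patients with a measurement at every scheduled time."""
    return sorted(pid for pid, row in matrix.items() if all(row))


# Made-up illustrative data: age in months at each CD4+ measurement.
cd4_visits = {
    "P01": [0, 6, 12, 18, 24],
    "P02": [9, 11, 20],
    "P03": [0, 5],
}

pattern = availability_matrix(cd4_visits)
for pid in sorted(pattern):
    print(pid, "".join("x" if present else "." for present in pattern[pid]))
print("complete cases:", complete_cases(pattern))
```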