Administrative data are attractive sources of information for research and evaluation studies for numerous reasons, including relatively low cost and the availability of longitudinal information and large subject pools. While many professional organizations set standards for their members, researchers have only a patchwork of practices to follow when performing research with such data. The purpose of this paper is to outline standards and practices for researchers, and to discuss common analysis issues related to the proper use of administrative data. The discussion focuses on data from the two largest United States government-funded health care programs, Medicare and Medicaid. This focus is chosen due to the wide use of such data and the sensitive nature of health care information.
In a recent paper, Safran et al. (2007) discuss the increasing secondary use of health data for research and other purposes. The authors note that the "lack of coherent policies and standard good practices for secondary use of health data impedes efforts to transform the U.S. health care system" (p. 1). This paper seeks to contribute to this important discussion in two ways. First, a set of standards and practices for researchers to follow is proposed for the acquisition and proper use of administrative data. Second, the literature is reviewed that relates to specific shortcomings with administrative databases and methods to address the problems. The paper is geared towards students with an interest in health economics, but may also be useful to other students and established researchers given the increasing use of administrative data (both health-related and otherwise). The goal is to help researchers use administrative data correctly so that policy makers can have greater confidence in findings, and consequently research can have a greater effect on public policy.
Public health care programs in the U.S. such as Medicare and Medicaid finance health care for millions of people. The information collected as a result of health care delivery, enrolling members, and reimbursing for services is referred to as administrative data (Iezzoni, 1997). Despite widespread use for research purposes, there exist limited standards and practices for researchers to adhere to in using administrative data (Retchin & Ballard, 1998; Safran et al., 2007). In addition, while undergraduate and graduate students in economics (and other social sciences) encounter a wide array of courses during their education, few academic programs teach students how to acquire and properly use data.
This paper focuses on data from the two largest government-funded health care programs, Medicare and Medicaid, but the issues discussed in this paper apply to all types of administrative records. The focus was chosen because of the sensitive nature and yet widespread use of such data, the increased vulnerability of the subjects, and the evolving U.S. federal regulatory landscape for healthcare information in general. Examples are discussed based on experiences during the lead author's five years at the Centers for Medicare & Medicaid Services (CMS), the government agency that oversees the programs.
ADVANTAGES AND DISADVANTAGES TO ADMINISTRATIVE DATA
First, let's review a few of the advantages and shortcomings of using administrative data for research. There are a number of advantages to administrative data (Iezzoni, 2002; Pandiani & Banks, 2003; Roos, Menec, & Currie, 2004; Roos et al., 2008). It is conceivable to study (almost) all individuals age 65 and above with Medicare enrollment and claims data. The use of population based data enables questions to be considered that could not be addressed with a sample. However, due to cost considerations and the sheer size of the databases, almost all studies use a sample. For example, as discussed in more detail later in the paper, much research uses a 5% sample of Medicare beneficiaries, which is approximately 800,000 people. Despite being a small proportion of beneficiaries, the sample size remains substantial and limits concerns about the generalizability of results found in small sample studies. In addition, the large size also allows for adequate numbers of minorities for statistical analysis.
The records are not limited to specific types of setting (e.g., hospitals). Information can be longitudinal, covering individuals and institutions across many years. Confidentiality can be maintained due to the large sample sizes. The data already exist, and thus are relatively inexpensive to acquire compared to primary data collection; the low cost also facilitates replication of previous studies. Survey attrition due to a loss of contact or refusal to participate is also minimized.
There are, however, many potential problems with the use of administrative data (Retchin & Ballard, 1998; Drake & McHugo, 2003). Such problems include a lack of information on the reliability or accuracy of the data. Public use files may not be available for several years, reducing usefulness for current policy questions. Large samples can lead to statistically significant results that are not very meaningful, as even very small effects are precisely measured. Similarly, researchers may look for questions to fit the data, rather than forming questions and then looking for the appropriate data. Medicaid and Medicare enrollment and claims records include protected health information under the Health Insurance Portability and Accountability Act (HIPAA) and therefore require stringent privacy protection measures.
Due to such potential problems, users should adhere to standards on data use. While most professional organizations establish standards for members, there are no clear standards and practices for users of administrative data to follow. However, appropriate use is crucial in order to increase public confidence in the use of sensitive health care information for research purposes, and for federal agencies to continue to allow access to the data (Safran et al., 2007).
THE RESEARCH PROTOCOL: DATA ADEQUACY AND ACQUISITION
Acquisition of administrative data typically begins with the development of a detailed research protocol. The protocol is assessed by the data owners to determine whether access should be granted to Medicaid or Medicare enrollment and claims records. A useful resource for researchers developing a protocol, although involving data for Canada, is provided at The Manitoba Centre for Health Policy (MCHP) web site (http://www.umanitoba.ca/centres/mchp/). Some of the information is specific to the MCHP mission and data on Manitoba residents, but much of the information applies to administrative data in general.
The protocol should detail the research questions and explain why they are important to the mission of the Medicare and/or Medicaid programs. Given the inherent concern in releasing sensitive information, research questions need to be of sufficient interest to the data owners to warrant release of the data. The protocol must also identify the specific dataset(s) and justify that the source is appropriate for the proposed analysis. van Eijk, Krist, Avorn, Porsius, & de Boer (2001) created checklist guidelines for determining whether available data are adequate to answer the research questions. Important considerations include sample size, whether the claims contain sufficient detail for the study (e.g., diagnoses, procedures, drug and dosing information), accuracy, continuity of variables over time, the ability to link databases, and adequate security and accessibility. In the following sections, several of these considerations are discussed, as well as others, as they relate to secondary use of health data.
An important early step is to understand the process for data acquisition. Most data available from CMS are acquired through the Research Data Assistance Center (ResDAC), a CMS contractor that provides assistance to researchers using Medicare enrollment and claims records. Their web site (www.resdac.umn.edu) contains much information on the process for acquisition and the associated cost, but provides limited guidance on the proper use of the data. Together, CMS and ResDAC act as gatekeepers and determine who gains access to CMS data. ResDAC and CMS also make available national Medicaid data, referred to as the Medicaid Analytic eXtract (MAX) files. The MAX files are a combination of the Medicaid enrollment and claims data compiled by each state. Some states make their Medicaid data available to researchers; others do not. Researchers who wish to use Medicaid data from a specific state should contact that state's Medicaid authority to determine whether the data are available and what the process for data acquisition entails.
Consult with data owners
Users should consult with data owners to understand what the data represent and ensure the proposed questions can be appropriately answered with the data. For Medicare data, this may involve discussions with ResDAC personnel and also individuals at CMS who work with the data. There are several reasons for users to seek such consultation. Administrative data are usually compiled for a specific purpose, often related to payment or program monitoring and evaluation. Users need to understand why the administrative database was created. The reason(s) for collecting the data can have an important impact on the universe covered, data elements, variable definitions, frequency and timeliness, quality, and stability over time. A lack of understanding of what the data represent and how they may be used has led researchers to propose research questions for which the data are poorly suited (Medi-Cal Policy Institute, 2001).
In addition, given that administrative data are often compiled for internal use by the data owners, documentation is often scant compared to survey data primarily produced for research purposes. Even with proper documentation, owners are a valuable resource for understanding technical details and should be consulted by users. The data owners have knowledge of the issues involved in working with the files, problems with specific variables, are aware of other issues not apparent from reading documentation or examining the data, and can verify that the project design is appropriate.
Such discussions also provide opportunities to clarify variable definitions. For example, Medicare enrollment files note when Medicare is a secondary payer. This occurs primarily when the beneficiary has health insurance coverage through a spouse. The person is labeled as "working aged" even though the beneficiary is not employed. Consequently, users should not assume the variable name necessarily describes the variable clearly.
Data users should always consider the likely quality of the data for the proposed research questions. The accuracy of data is extremely important, particularly for analyses to inform public policy (Robinson & Tataryn, 1997). While the available quantity of information is often large, the accuracy and completeness is sometimes questioned. The Medi-Cal Policy Institute (2001) reported that California's Medicaid managed care data system could not be "used to make sound policy decisions" because data were inaccurate and incomplete. Most administrative data rely on reporting by individuals or firms and the information respondents provide can cause gains or losses to individuals or businesses (Wolf & Helminiak, 1998). In other cases, information can be underreported if unrelated to the gains or losses of individuals or businesses. As such, there may be biases in the information supplied.
Even if the overall database is considered complete and accurate, specific variables may differ in accuracy. Administrative files used to make payment often have fields that are checked for completeness and reasonableness. As such, these fields are relatively accurate. Other variables may not be checked or edited, especially those that do not affect payment. Users should learn the editing rules used by the owners. Users should determine the likely extent of measurement error and decide whether it should be addressed in the research plan.
One potential benefit of administrative data is the ability to perform population based research. In theory, Medicare data may be available for all individuals age 65 and above. The analysis of population based data avoids many of the concerns with analyzing samples, whether small or large. All statistics are population values rather than sample estimates. Thus, conclusions can be drawn without concerns about type I or type II errors.
In practice, the Medicare program does not cover everyone age 65 and above. Individuals must qualify for Medicare based on work history (either their own or a spouse's). Some individuals never establish a sufficient work history to qualify for Medicare. For example, certain immigrant groups are less likely to qualify for Medicare because work histories were not established with the Social Security Administration. Thus, even with a database as large as the Medicare enrollment and claims data, users must be aware of who may not be adequately represented in the data and potential biases this may introduce. In addition, given the size of some administrative databases, users should consider whether they have sufficient resources (both computer and financial) to acquire, store, and analyze the data. For example, there are over a billion Medicare claims in a single year.
In almost all cases, researchers use a sample. A five or ten percent sample from a very large database is sufficient for the majority of studies. For example, many researchers use the CMS 5% Medicare Standard Analytical Files (SAF). The standard analytical files contain all enrollment and claims data for 5% of Medicare beneficiaries (approximately 800,000 people) and are created annually by CMS. Because these files are used by many researchers, the cost of acquiring the data is lower than if a researcher requests a special data pull. The SAFs are created by selecting all enrollment and claims records for individuals with 05, 20, 45, 70, or 95 in positions 8 and 9 of the health insurance claim number (i.e., the last two digits of the Medicare identification number). The sample selection criteria for the SAFs allow individuals to be followed over time, which would not be possible with a true random sample. At the same time, this could be problematic if the last two digits of the Medicare IDs differed across individuals in a systematic manner. However, the Medicare ID is typically the person's Social Security Number (plus characters in the 10th and 11th places to denote the reason for eligibility). The last two digits of a Social Security Number are not systematically assigned based on characteristics of the individual, and thus the SAFs are generally considered to be equivalent to a random sample. If, for example, the sample were pulled based on the first three digits (which are assigned based on geographic location), then the sample would be geographically biased and not representative of the population.
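The suffix-based selection rule described above can be sketched in a few lines of code. This is an illustrative sketch only: the IDs below are synthetic, and a real health insurance claim number is a nine-digit Social Security Number followed by one or more suffix characters.

```python
# Sketch of the 5% SAF selection rule: keep beneficiaries whose SSN
# portion (the first nine digits of the HIC number) ends in one of five
# two-digit values. The IDs below are synthetic examples.
SAF_SUFFIXES = {"05", "20", "45", "70", "95"}

def in_saf_sample(hic: str) -> bool:
    """Return True if positions 8-9 of the SSN portion match a SAF suffix."""
    ssn_digits = "".join(ch for ch in hic if ch.isdigit())[:9]
    return ssn_digits[7:9] in SAF_SUFFIXES

ids = ["123456705A", "987654321A", "123456745B6"]
sample = [i for i in ids if in_saf_sample(i)]
```

Because selection depends only on the ID, the same beneficiaries are selected in every year's file, which is what makes longitudinal follow-up possible.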
While generally not a concern with large administrative databases, users must consider if the expected number of observations is sufficient to generate meaningful results. In general, power tests should be performed to determine the sample size necessary to have reasonable confidence that statistically significant results can be detected. This step is particularly important if studying a rare disease or treatment. At the same time, given the typical large sample size, users have to interpret the economic significance of their results and not simply rely on statistical significance (Drake & McHugo, 2003).
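As a rough illustration of such a power test, the standard two-proportion sample size formula can be computed directly. The prevalence figures below are assumptions chosen for illustration, not values from any of the cited studies.

```python
import math

def n_per_group(p1: float, p2: float,
                z_alpha: float = 1.96,      # two-sided alpha = 0.05
                z_beta: float = 0.8416) -> int:  # power = 0.80
    """Approximate sample size per group to detect proportions p1 vs. p2."""
    p_bar = (p1 + p2) / 2
    term1 = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    term2 = z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return math.ceil(((term1 + term2) / (p1 - p2)) ** 2)

# Detecting a change from 1.0% to 1.5% in a rare-condition rate requires
# several thousand subjects per group; larger effects need far fewer.
n = n_per_group(0.01, 0.015)
```

The calculation illustrates why rare diseases or treatments demand particularly careful attention to sample size even when drawing from large administrative files.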
Researchers must also know the decision rules used to pull the data. For example, studies interested in the frequency of services should know if claims are "final action", or if they include denials, interim bills, or adjustments. The inclusion of interim bills and adjustments will lead to an over count of service frequency, and thus should be excluded during the analysis.
Research questions often focus on specific subgroups of individuals with specific diagnoses (e.g., asthma or diabetes). Claims data contain codes that identify specific diseases using the International Classification of Diseases (ICD). ICD codes contain up to five digits and can be used to identify individuals with a broad class of diseases or a very specific disease. The first three digits tend to identify a general class (e.g., 250 for diabetes), with the fourth and fifth digits being more specific (e.g., 250.41 denotes type I diabetes with renal manifestations).
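Identifying a disease cohort then reduces to filtering claims on the code prefix. The claim records and field names below are synthetic illustrations, not an actual claims layout.

```python
# Synthetic claims with ICD-9 diagnosis codes; field names are made up.
claims = [
    {"bene_id": "A", "dx": "250.41"},  # diabetes with renal manifestations
    {"bene_id": "B", "dx": "493.20"},  # chronic obstructive asthma
    {"bene_id": "C", "dx": "250.00"},  # diabetes, uncomplicated
]

def has_dx_class(dx: str, prefix: str) -> bool:
    """Match on the broad three-digit class (e.g., '250' for diabetes)."""
    return dx.split(".")[0] == prefix

diabetes_cohort = {c["bene_id"] for c in claims if has_dx_class(c["dx"], "250")}
```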
Among the issues to consider is whether two years or more of data should be used to identify cases. Dombkowski, Wasilevich, and Lyon-Callo (2005) found that a diagnosis of a chronic disease (asthma) was not observed in every year. Thus, selection of cases based on diagnoses in a single year would undercount the prevalence of a disease. People still have the disease but it did not show up in the claims data during a year for some reason. Consequently, the identification of individuals with chronic diseases may require multiple years of data.
In addition to diagnoses, prescription drug use might also identify people (e.g., insulin or perhaps metformin use for diabetes). Gilmer et al. (2001) find that the use of prescription drug records substantially increases the estimated prevalence of specific diseases. Caution must be used though since many medications are used to treat multiple conditions, and thus might not indicate a specific disease.
On the other hand, consideration might also be given to whether an individual should be included only if there are at least two records with the diagnosis of interest, to rule out incorrect or miscoded chronic diagnoses. The presence of a consistent diagnosis over time provides evidence that the diagnosis is correct. Such concerns arise from studies that compare diagnoses in medical charts and claims. For example, Schwartz et al. (1980) find a relatively poor match between medical charts and claims for Medicaid enrollees: 29% of chart diagnoses of private practitioners, 37% of chart diagnoses in freestanding outpatient clinics, and 46% of diagnoses from outpatient clinics of general hospitals do not match with Medicaid claims. Interested readers should see Virnig & McBean (2001) for a more thorough discussion of studies that assess reliability by comparing diagnostic data located in charts to claims in the database.
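A minimal case-finding screen combining both ideas — pooling multiple years of claims and requiring at least two claims carrying the diagnosis — might look like the following sketch. The records and the two-claim threshold are illustrative assumptions.

```python
from collections import Counter

# Synthetic (beneficiary, year) records, each carrying the diagnosis of
# interest, pooled across several years of claims.
asthma_claims = [("A", 2001), ("A", 2002), ("B", 2001), ("C", 2001), ("C", 2001)]

claim_counts = Counter(bene for bene, _ in asthma_claims)

# Require at least two claims before counting a person as a case, to
# screen out one-off miscoded diagnoses.
confirmed_cases = {bene for bene, n in claim_counts.items() if n >= 2}
single_claim = {bene for bene, n in claim_counts.items() if n == 1}
```

Beneficiary "A" qualifies through claims in different years, "C" through repeated claims in one year, while "B" would be excluded (or flagged for further review) under this rule.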
Researchers are responsible for data security, and should have a plan for ensuring that the files cannot be accessed by unauthorized users. Some obvious steps include using automatic screen savers that can only be turned off with a password if the data reside on an office or personal computer. If storage is on a network, only authorized users should have access, and the data should be behind a firewall if the network is connected to the internet. Email is not a safe way to transmit individually identifiable information unless adequate encryption is used. In addition, user responsibility for the data does not end when the project ends. The data use agreement (DUA) typically specifies whether the data have to be returned to the agency or destroyed.
THE PROTOCOL - DATA ANALYSIS
The protocol must also detail the analysis plan. This section provides an overview of some common analysis issues related to using administrative data. Such analytical issues include the need to empirically assess quality, differentiate between time trends and program effects, and use medical encounters to account for the differing health status of treatment and comparison groups (Ray, 1997). Much depends on the study questions and design for the specific project. The proposed analysis should meet the standards for institutional review boards and peer reviewed publications.
Studies are discussed below that relate to the analysis issues and the solutions employed by researchers. The studies are not an exhaustive overview of questions that can be analyzed with administrative data. Readers interested in a broader discussion of health care topics that can be addressed with administrative data should see a paper by Roos, Menec, & Currie (2004), and for a broader discussion of how administrative data can be used to answer an array of social research questions see Roos et al. (2008).
Users will often need to merge several different data files. Examples of such linkages include combining records from inpatient, outpatient, and physician claims, supplementing claims data with survey data such as the Medicare Current Beneficiary Survey, or matching individuals across years. Privacy concerns may arise when administrative records are linked to other sources and researchers should verify that the data use agreement allows such linkage (Clark, 2004).
Linking may be based on shared identifiers, deterministic matching, or probabilistic matching (Victor & Mera, 2001; Clark, 2004; Roos et al., 2008). Matching records by shared identifiers occurs when the same identifiers appear in both data sets (e.g., Social Security Number or Health Insurance Claim number). Most data available from CMS can be matched using individual identifiers. However, researchers may also encounter situations when unique individual identifiers are not available. Deterministic matching examines a subset of variables and matches records that agree on this subset (e.g., name, date of birth, sex). Individuals can have the same name or date of birth or sex, but it is far less likely that different individuals in two datasets will have the same name and date of birth and sex. Probabilistic linking matches based on the probability that records refer to the same person. Matching with individual identifiers or deterministic matching is typically used when attempting to draw conclusions about individuals. Probabilistic matching is used when there is limited information on which to base matches (e.g., name, date of birth, sex). Given the difficulty in precisely matching individuals, probabilistic matching is more appropriate when drawing conclusions about populations.
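A deterministic match can be implemented as an exact join on the chosen subset of fields. The records and field names below are synthetic illustrations of a claims file being linked to a survey file that shares no unique identifier.

```python
# Claims-side and survey-side records sharing no unique ID; match
# deterministically on (name, date of birth, sex). Data are synthetic.
claims = [
    {"name": "Smith", "dob": "1940-03-01", "sex": "F", "spending": 1200},
    {"name": "Jones", "dob": "1938-07-15", "sex": "M", "spending": 800},
]
survey = [
    {"name": "Smith", "dob": "1940-03-01", "sex": "F", "self_rated_health": "good"},
]

def match_key(rec):
    """All three fields must agree for two records to be linked."""
    return (rec["name"], rec["dob"], rec["sex"])

survey_index = {match_key(r): r for r in survey}
linked = [{**c, **survey_index[match_key(c)]}
          for c in claims if match_key(c) in survey_index]
```

Requiring agreement on the full key is what distinguishes this from probabilistic linking, which would instead score partial agreement across fields.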
The use of probabilistic matching is illustrated by Banks & Pandiani (1998). The authors derive estimates of the number of people receiving psychiatric care in state hospitals and general medical settings. Typically, the data sets would be merged based on individual identifiers or deterministic matching to avoid double counting patients who receive care in both sectors. Banks and Pandiani use probabilistic matching based only on gender and birth date to derive estimates of sample overlap, and as a result are able to estimate the number of people receiving psychiatric care. The use of probabilistic matching is likely to increase as concerns with patient privacy lead data owners to restrict the release of information that enables direct or deterministic matching to other data sources.
When records from more than one administrative source are combined, it is important to be aware of potential differences in concepts, definitions, reference dates, coverage, and quality. For example, recent attention has focused on merging Medicare and Medicaid claims to study dual eligible beneficiaries (e.g., Liu, Wissoker, & Swett, 2007; Yip, Nishita, Crimmins, & Wilber, 2007). These data originate from different sources that may use different definitions, certainly differ in coverage, and may differ in quality. While one might expect data within the Medicare program to have consistent standards, even this may not necessarily be the case. For example, the quality of inpatient hospital claims is generally considered better than that of physician claims (Retchin & Ballard, 1998).
Empirically assess data quality
While data quality should be assessed for expected accuracy prior to acquisition, quality should also be assessed empirically. Once the data are linked and the sample constructed, users should examine descriptive statistics. Users should check the results for reasonableness, and if possible, compare results with alternative data sources or prior research and attempt to explain differences. Many studies have been published using Medicare and Medicaid data providing researchers with a substantial literature for comparison.
Assessing quality is particularly important when data are hand entered because data errors may be more prevalent. An example of such data entry errors occurs with beneficiary location (SSA state and county codes) in Medicare claims. Research often looks at Medicare utilization across counties in the United States. Analyzing claims, there are approximately 5,000 SSA state/county codes that appear in the data, far greater than the 3,100 actual counties in the US. What accounts for the erroneous counties? State and county codes are often hand entered, there is no payment issue involved (payments are based on provider location, not beneficiary residence), and the field is not checked for accuracy. Such miscoding may be important for sparsely populated counties where a few miscoded observations can make a difference to the results.
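A simple empirical check is to compare the codes appearing on claims against the official list of valid SSA codes and measure how much of the file is affected. Both code sets below are made up for illustration.

```python
# Hypothetical list of valid SSA state/county codes and the codes
# observed on a batch of claims; both are synthetic examples.
valid_ssa_codes = {"01000", "01010", "05440"}
claim_codes = ["01000", "01010", "01010", "99999", "0101O"]  # last two are bad

# Flag codes that do not appear in the official list (including the
# hand-entry typo "0101O", where a letter O replaced a zero).
invalid = [c for c in claim_codes if c not in valid_ssa_codes]
share_invalid = len(invalid) / len(claim_codes)
```

Reporting the share of invalid codes, and examining whether errors cluster in particular counties, helps a reader judge whether miscoding could drive results in sparsely populated areas.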
Two approaches are used in the literature to address potential problems with examining geographic variation in prevalence rates across counties. For example, Cooper, Yuan, Jethva, & Rimm (2002) examine county level variation in breast cancer rates using Medicare data. The authors attempt to confirm their findings by comparing prevalence rates to the National Cancer Institute's cancer tracking database (the Surveillance, Epidemiology, and End Results, SEER, program) which tracks approximately 10-15 percent of the U.S. population. While a valid test for large counties, comparing prevalence rates in small counties could be problematic due to the (relatively) small sample size of the SEER database. Holcomb & Lin (2006) examine geographic variation of macular disease in Kansas. Because of the potential for unstable prevalence rates in small counties, the authors aggregated sparsely populated counties into larger geographic units.
Researchers should document their findings regarding quality to enable other researchers to understand why certain observations or variables were included or excluded based on data quality considerations. Documentation can also help researchers compare and reconcile studies so that others understand why decisions were made and potential implications of those choices. These are basic steps that all researchers should perform, but studies are often unable to replicate research because such steps are not taken (Dewald, Thursby, & Anderson, 1986). New users of a data set should review the literature to see how others have handled problems with the data.
Time series analysis
Over time, the Medicare and Medicaid programs have moved toward managed care, case management, and provision of prescription drugs. Consequently, it is increasingly important to track people over time to determine how participation in case management or the provision of certain prescription drugs affects health over time. For example, there has been much discussion about creating a database to track outcomes from prescription drug use after the well documented problems with Vioxx (e.g., Lohr, 2007). Multiple years of the CMS Standard Analytical Files are often linked to examine changes over time. This is possible because, as discussed earlier, the 5% sample contains all enrollees with HIC numbers that end in specific digits. Thus, with some exceptions (some people die and new enrollees enter the data), the sample contains the same people over time.
A substantial literature using time series analysis considers the changing prevalence of specific diseases over time. For example, Lakshminarayan, Solid, Collins, Anderson, & Herzog (2006) find an increasing prevalence of atrial fibrillation diagnoses between 1992 and 2002, while Salm, Belsky, & Sloan (2006) find an increasing prevalence of eye diseases between 1991 and 2000. Lakshminarayan et al. (2006) partially address diagnostic quality by requiring at least one inpatient claim or two outpatient claims with an atrial fibrillation diagnosis. However, both studies may overstate the increasing prevalence of such diseases. Physicians were required to report diagnostic data on Medicare claims beginning in the early 1990s. Physician payments are typically based on procedures, not diagnoses, and a diagnosis is often not necessary to justify a procedure. Over time, physicians have reported more thorough diagnostic data. Indeed, diagnostic reporting continues to improve more than a decade later as physicians implement electronic medical records. The point is that if one examines time trends in the prevalence of a disease, one needs to be cautious in interpreting diagnostic trends in Medicare claims data. Simply looking at the increased reporting of a diagnosis is likely to overstate the increasing prevalence of a disease. While Salm et al. (2006) at least note this possibility, neither study attempted to account for it in their analysis.
When records from different time periods are linked, they become a very rich source of information for researchers. However, users should understand whether the data are consistent across time, and why changes may occur. The reasons for collecting the information may change over time, variable definitions may change, or reporting practices may change. Coverage changes occur on a regular basis in Medicare based on CMS decisions and Congressional mandates. Such changes can have a substantial effect on services provided.
Accounting for individual heterogeneity
Perhaps the biggest challenges in using administrative data are to create a comparison group and decide on the appropriate analytical techniques. In clinical research, randomized controlled trials allow researchers to assign individuals to treatment and control groups in a random manner. Administrative data do not typically allow for this type of assignment, and there are often non-random differences between individuals who choose a treatment versus no treatment (or an alternative treatment). Pre-treatment differences may bias the results (typically referred to as sample selection bias) if such differences also correlate with the outcome.
Selection issues are common in research using administrative data, requiring researchers to account for differences between individuals. For example, administrative claims are often used to assess quality of care and examine outcomes from patient care. Hospital quality has been considered by many researchers because hospital administrative data are generally considered to be of relatively high quality (e.g., Krumholz et al., 2006; Ross et al., 2007). Quality of care by physicians is also considered by Schatz et al. (2005). However, hospitals and physicians that have the most complex cases are more likely to have the highest complication and mortality rates. Consequently, accounting for case-mix is crucial to comparing the care provided across medical care settings or outcomes from alternative treatments.
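One common way to account for case-mix is indirect standardization: compare a provider's observed outcome count with the count expected given its mix of patients, using population-wide rates. The rates and cases below are synthetic assumptions, not figures from the cited studies.

```python
# Population mortality rates by severity group (synthetic).
population_rate = {"low": 0.02, "high": 0.10}

# One hospital's cases as (severity, died) pairs, also synthetic.
hospital_cases = [("high", 1), ("high", 0), ("low", 0), ("high", 0)]

observed = sum(died for _, died in hospital_cases)
expected = sum(population_rate[sev] for sev, _ in hospital_cases)

# An observed/expected ratio above 1 means more deaths than the
# hospital's case mix alone would predict.
oe_ratio = observed / expected
```

A raw mortality rate would make this hospital look poor; the O/E ratio at least adjusts the benchmark for its heavy share of high-severity cases before drawing that conclusion.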
There are several methods used to account for pre-treatment differences. The first two methods focus on accounting for observed differences between individuals. Many studies use risk-adjusted models in which control variables thought to be correlated with the outcome and the independent variable of interest (e.g., hospitals, physicians, treatment, gender, race, etc.) are included in a regression specification. Popescu, Vaughan-Sarrazin, and Rosenthal (2007) examine racial differences in mortality after acute myocardial infarction. The authors control for sociodemographics, comorbidity, and illness severity to account for factors potentially correlated with the outcome (mortality) and the variable of interest (race).
A variant on this approach is to use a diagnosis-based risk score as a measure of health (e.g., Ross et al., 2007). The score represents a measure of overall health status based on demographics and diagnoses. CMS and many states use diagnosis-based risk scores to determine compensation for managed care plans (e.g., Pope et al., 2004). While a useful measure, many researchers do not compute the scores correctly because the models were developed using diagnoses from specific provider types (e.g., physician specialty). While this detail is contained in the technical instructions for managed care plans to submit data, it is not included in the risk adjustment publications or software published by CMS. Since most researchers do not discuss their research with people who work on risk adjustment at CMS, they often include too much diagnostic data when computing individual risk scores and consequently overstate them. As pointed out earlier, existing documentation may not provide all needed information, but such information can be learned by consulting with knowledgeable individuals. The example suggests that users should initiate such discussions even when they are not aware of any gap in the documentation.
A potential problem with risk adjusted regression models, regardless of whether specific risk characteristics or an overall risk score is used, is that the comparison groups may have little overlap in the control variables. For example, if the treatment group is primarily old and the control group is primarily young, then conclusions regarding the effect of treatment may be biased because they rest on the linearity assumptions of the regression model.
Propensity score matching has become a popular alternative to regression methods in social science research for addressing selection issues when analyzing administrative files (Rosenbaum & Rubin, 1983; Imbens, 2000). Matching techniques mimic a random experiment by matching individuals in the treatment and control groups based on observed characteristics. The observed characteristics are used to estimate the probability of receiving treatment. Individuals with similar probabilities of treatment, some of whom do and some of whom do not receive it, are then compared to determine the effect of treatment. Using the age example, young people in the treatment group would be matched to young people in the control group, and older individuals would likewise be matched across groups. Outcomes are then compared only for similar individuals.
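A minimal sketch of the matching step, assuming the propensity scores have already been estimated (for example, from a logistic regression of treatment status on observed characteristics); the identifiers and score values below are hypothetical.

```python
# Greedy nearest-neighbor matching on the propensity score with a
# caliper. Treated units with no control within the caliper are dropped.

treated = [("t1", 0.31), ("t2", 0.62), ("t3", 0.90)]    # (id, score)
controls = [("c1", 0.30), ("c2", 0.58), ("c3", 0.35), ("c4", 0.10)]

def match(treated, controls, caliper=0.1):
    """Match each treated unit to the nearest unmatched control
    whose score lies within the caliper."""
    available = dict(controls)
    pairs = []
    for tid, tscore in treated:
        if not available:
            break
        cid = min(available, key=lambda c: abs(available[c] - tscore))
        if abs(available[cid] - tscore) <= caliper:
            pairs.append((tid, cid))
            del available[cid]  # match without replacement
    return pairs

print(match(treated, controls))  # t3 has no control within the caliper
```

Note that greedy matching is order-dependent; optimal matching and matching with replacement are common alternatives in the applied literature.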
Numerous articles use propensity score methods to examine treatment effects when using administrative data. For example, Berg and Wadhwa (2007) examine the effects of a disease management program for elderly patients with diabetes. Propensity score methods are used to match observations in the treatment group with people in a control group who did not participate in the disease management program. Similarly, Krupski et al. (2007) examine the effects of androgen deprivation therapy on skeletal complications among individuals with prostate cancer. Individuals receiving therapy are matched to individuals not receiving therapy by age, geographic region, insurance plan, and index year.
There is, however, debate about whether matching actually mimics a random experiment (Agodini & Dynarski, 2004; Smith & Todd, 2005). Research attempting to validate propensity score matching takes experimental data and attempts to replicate the experimental results by reanalyzing the data using matching techniques. In other words, the data are analyzed under the assumption that assignment was not random and may be subject to selection biases. The majority of such studies find that the matching results do not replicate the experimental results. Thus, while matching methods may be useful, they should not be viewed as a perfect solution to problems of sample selection.
One potential problem with each of the above methods is the reliance on observed data. As such, the development of risk scores and propensity scores is challenging with administrative claims that often lack key clinical detail (Iezzoni, 1997). This issue is particularly salient for research on provider quality. Iezzoni (1997) suggests that administrative data be used as a screening tool to highlight areas for further investigation, not to draw conclusions about quality. Information on the process and appropriateness of care may not be adequate to provide accurate measures of provider quality. In general, all studies involve some degree of unobserved data. Instrumental variables methods may be appropriate if unobservable characteristics are thought to be important to the analysis. Of course, it can be extremely challenging to find suitable instruments. In conclusion, controlling for differences between treatment and control groups, between patients seen at different hospitals, or between any two comparison groups is crucial to drawing proper conclusions.
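The logic of instrumental variables can be illustrated with simulated data and the simple Wald estimator, beta_IV = cov(z, y) / cov(z, x). In the sketch below, an unobserved confounder biases ordinary least squares upward, while a valid instrument (one that affects the outcome only through the treatment) recovers an estimate near the true effect. All data and parameter values are simulated for illustration.

```python
# Simulated example: an unobserved confounder u affects both treatment x
# and outcome y, while instrument z affects y only through x. The true
# treatment effect is 2.0.

import random

random.seed(0)
n = 20000
z = [random.random() < 0.5 for _ in range(n)]          # binary instrument
u = [random.gauss(0, 1) for _ in range(n)]             # unobserved confounder
x = [1.0 * zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [2.0 * xi + 3.0 * ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

def cov(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

beta_ols = cov(x, y) / cov(x, x)   # biased upward by the confounder
beta_iv = cov(z, y) / cov(z, x)    # Wald estimator, close to 2.0
print(round(beta_ols, 2), round(beta_iv, 2))
```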
Much research requires manipulation of the data to create the analysis files and measures needed to answer the research questions. Many tools for this purpose are available on the internet, and it can be useful to utilize publicly available programs and modules that enable accurate creation of health measures such as the Charlson Index. Using such modules increases consistency across studies, and publicly available code is typically tested by numerous users and is therefore likely to be accurate. The Manitoba Centre for Health Policy (MCHP) web site provides a web-based repository of useful tools for conducting research using administrative data (Roos, Soodeen, Bond, & Burchill, 2003). Some of the modules apply specifically to data available from the MCHP, but a number of the statistical tools can be applied to a variety of administrative claims sources.
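As an illustration of what such modules do, the sketch below hand-codes a tiny Charlson-style comorbidity index. The diagnosis-code mappings are a small illustrative subset only; in practice, researchers should rely on tested, publicly available implementations rather than hand-coding the full mapping.

```python
# Toy Charlson-style index: map diagnosis codes to conditions, count
# each condition once, and sum the condition weights. The ICD-9 prefix
# mapping below is an illustrative fragment, not a complete crosswalk.

CHARLSON_WEIGHTS = {
    "myocardial_infarction": 1,
    "congestive_heart_failure": 1,
    "diabetes_with_complications": 2,
    "metastatic_tumor": 6,
}
ICD9_PREFIXES = {
    "410": "myocardial_infarction",
    "428": "congestive_heart_failure",
    "2504": "diabetes_with_complications",
    "196": "metastatic_tumor",
}

def charlson_index(diagnosis_codes):
    conditions = set()
    for code in diagnosis_codes:
        for prefix, condition in ICD9_PREFIXES.items():
            if code.startswith(prefix):
                conditions.add(condition)   # each condition counted once
    return sum(CHARLSON_WEIGHTS[c] for c in conditions)

print(charlson_index(["41071", "4280", "41001"]))  # MI + CHF, MI deduplicated
```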
This paper has outlined some practice guidelines for the use of administrative data. While administrative data have great potential, there are also many pitfalls. Research using secondary data will benefit the health care of Americans only if the data are appropriately used. The growing use of such records in research and evaluation necessitates that guidelines be developed and discussed such that the conclusions from research are valid. We hope the guidelines presented in this paper generate further discussion of the appropriate use of such data.
In summary, users of administrative data should develop a research protocol that:
- presents the research questions, including a justification of why they are important to the data owners;
- assesses whether the data are appropriate for the research questions (i.e., quality, sample size, available variables, and ability to link records) through reviews of the literature and discussions with the data owners;
- details the security plan, including where the data will be stored and how access will be controlled;
- presents the analysis plan, including an empirical assessment of data quality and the statistical techniques that will be used to answer the research questions;
- discusses how potential data shortcomings will be addressed; and
- describes steps that will enable replication by other researchers.
Clearly, there is a need for such standards and practices in the use of administrative data given the continued increase in use. Haux (2005) outlines some of the current trends and his views on upcoming changes in health information systems. The trend continues to be towards using administrative data to inform patient care, strategic management, and clinical and epidemiological research. The future is likely to move towards the development of comprehensive electronic medical records that include information from multiple or all payers. As administrative data become more comprehensive and complex, developing and utilizing standards and practices will become even more important.
Agodini, R., & Dynarski, M. (2004). Are experiments the only option? A look at dropout prevention programs. Review of Economics and Statistics, 86, 180-194.
Banks, S.M., & Pandiani, J.A. (1998). The use of state and general hospitals for inpatient psychiatric care. American Journal of Public Health, 88, 448-451.
Berg, G.D., & Wadhwa, S. (2007). Health services outcomes for a diabetes disease management program for the elderly. Disease Management, 10, 226-234.
Clark, D.E. (2004). Practical introduction to record linkage for injury research. Injury Prevention, 10, 186-191.
Cooper, G.S., Yuan, Z., Jethva, R.N., & Rimm, A.A. (2002). Use of Medicare claims data to measure county-level variation in breast carcinoma incidence and mammography rates. Cancer Detection and Prevention, 26, 197-202.
Dewald, W.G., Thursby, J.G., & Anderson, R.G. (1986). Replication in empirical economics: The Journal of Money, Credit, and Banking project. American Economic Review, 76, 587-603.
Dombkowski, K.J., Wasilevich, E.A., & Lyon-Callo, S.K. (2005). Pediatric asthma surveillance using Medicaid claims. Public Health Reports, 120, 515-524.
Drake, R., & McHugo, G. (2003). Large data sets can be dangerous. Psychiatric Services, 54, 133.
Gilmer, T., Kronick, R., Fishman, P., et al. (2001). The Medicaid Rx model: Pharmacy-based risk adjustment for public programs. Medical Care, 39, 1188-1202.
Haux, R. (2005). Health information systems - past, present, future. International Journal of Medical Informatics, September 15, 2005.
Holcomb, C.A., & Lin, M.C. (2005). Geographic variation in the prevalence of macular disease among elderly Medicare beneficiaries in Kansas. American Journal of Public Health, 95, 75-77.
Iezzoni, L.I. (2002). Using administrative data to study persons with disabilities. The Milbank Quarterly, 80, 347-378.
Iezzoni, L.I. (1997). Assessing quality using administrative data. Annals of Internal Medicine, 127, 666-674.
Imbens, G.W. (2000). The role of the propensity score in estimating dose-response functions. Biometrika, 87, 706-710.
Krumholz, H.M., Wang, Y., Mattera, J.A., Wang, Y.F., Han, L.F., Ingber, M.J., Roman, S., & Normand, S.L.T. (2006). An administrative claims model suitable for profiling hospital performance based on 30-day mortality rates among patients with an acute myocardial infarction. Circulation, 113, 1683-1692.
Krupski, T.L., Foley, K.A., Baser, O., Long, S., Macarios, D., & Litwin, M.S. (2007). Health care cost associated with prostate cancer, androgen deprivation therapy and bone complications. Journal of Urology, 178, 1423-1428.
Lakshminarayan, K., Solid, C.A., Collins, A.J., Anderson, D.C., & Herzog, C.A. (2006). Atrial fibrillation and stroke in the general Medicare population: A 10-year perspective, 1992-2002. Stroke, 37, 1969-1974.
Liu, K., Wissoker, D., & Swett, A. (2007). Nursing home use by dual-eligible beneficiaries in the last year of life. Inquiry, 44, 88-103.
Lohr, K.N. (2007). Emerging methods in comparative effectiveness and safety: Symposium overview and summary. Medical Care, 45, S5-S8.
Medi-Cal Policy Institute. (2001). From Provider to Policymaker: The Rocky Path of Medi-Cal Managed Care Data.
Pandiani, J. & Banks, S. (2003). Large data sets are powerful. Psychiatric Services, 54, 745.
Pope, G.C., Kautter, J., Ellis, R.P., Ash, A.S., Ayanian, J.Z., Iezzoni, L.I., Ingber, M.J., Levy, J.M., & Robst, J. (2004). Risk adjustment of Medicare capitation payments using the CMS-HCC model. Health Care Financing Review, 25(4), 119-141.
Popescu, I., Vaughan-Sarrazin, M.S., & Rosenthal, G.E. (2007). Differences in mortality and use of revascularization in black and white patients with acute MI admitted to hospitals with and without revascularization services. Journal of the American Medical Association, 297, 2489-2495.
Ray, W.A. (1997). Policy and program analysis using administrative databases. Annals of Internal Medicine, 127, 712-718.
Retchin, S.M., & Ballard, D.J. (1998). Establishing standards for the utility of administrative claims data. Health Services Research, 32, 861-866.
Robinson, J., & Tataryn, D. (1997). Reliability of the Manitoba mental health management information system for research. Canadian Journal of Psychiatry, 42, 744-749.
Roos, L., Brownell, M., Lix, L., Roos, N., Walld, R., & MacWilliam, L. (2008). From health research to social policy: Privacy, methods, approaches. Social Science & Medicine, 66, 117-129.
Roos, L.L., Menec, V., & Currie, R.J. (2004). Policy analysis in an information-rich environment. Social Science and Medicine, 58, 2231-2241.
Roos, L.L., Soodeen, R.A., Bond, R., & Burchill, C. (2003). Working more productively: Tools for administrative data. Health Services Research, 38, 1339-1357.
Ross, J., Cha, S., Epstein, A., Wang, Y., Bradley, E., Herrin, J., Lichtman, J., Normand, S., Masoudi, F., & Krumholz, H. (2007). Quality of care for acute myocardial infarction at urban safety-net hospitals. Health Affairs, 26, 238-248.
Rosenbaum, P.R., & Rubin, D.B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 41-55.
Safran, C., Bloomrosen, M., Hammond, W.E., Labkoff, S., Markel-Fox, S., Tang, P.C., & Detmer, D.E. (2007). Toward a framework for the secondary use of health data: An American Medical Informatics Association white paper. Journal of the American Medical Informatics Association, 14, 1-9.
Salm, M., Belsky, D., & Sloan, F.A. (2006). Trends in cost of major eye diseases to Medicare, 1991-2000. American Journal of Ophthalmology, 142, 976-982.
Schatz, M., Nakahiro, R., Crawford, W., Mendoza, G., Mosen, D., & Stibolt, T.B. (2005). Asthma quality-of-care markers using administrative data. Chest, 128, 1968-1973.
Schwartz, A.H., Perlman, B.B., Paris, M., Schmidt, K., & Thornton, J.C. (1980). Psychiatric diagnoses as reported to Medicaid and as recorded in patient charts. American Journal of Public Health, 70, 406-408.
Smith, J., & Todd, P.E. (2005). Does matching overcome LaLonde's critique of nonexperimental estimators? Journal of Econometrics, 125, 305-353.
van Eijk, M., Krist, L., Avorn, J., Porsius, A., & de Boer, A. (2001). Do the research goal and databases match? A checklist for a systematic approach. Health Policy, 58, 263-274.
Victor, T. W., & Mera, R.M. (2001). Record linkage of health care insurance claims. Journal of the American Medical Informatics Association, 8, 281-288.
Virnig B.A., & McBean A.M. (2001). Using administrative data for public health surveillance and planning. Annual Review of Public Health, 22, 213-230.
Wolf, N., & Helminiak, T.W. (1998). Nonsampling measurement error in administrative data: Implications for economic evaluations. Health Economics, 5, 501-512.
Yip, J., Nishita, C.M., Crimmins, E.M., & Wilber, K.H. (2007). High-cost users among dual eligibles in three care settings. Journal of Health Care for the Poor and Underserved, 18, 950-965.
John Robst, University of South Florida
Roger Boothroyd, University of South Florida
Paul Stiles, University of South Florida