Linkages With Data From Social Security Administrative Records in the Health and Retirement Study*
The Health and Retirement Study (HRS) is a major longitudinal study designed to support scientific and policy research on the economics, health, and demography of retirement and aging. The primary HRS sponsor is the National Institute on Aging, and the project is being conducted by the Survey Research Center of the Institute for Social Research (ISR) at the University of Michigan. Several agencies, including the Social Security Administration (SSA), are supporting the project. This is the second note describing SSA's data support for the HRS. The conditions under which SSA has been able to provide data and the respondent consent procedures developed for the release of SSA data are discussed in Olson (1996). This note describes the data from SSA records that have been released for linking to HRS data, the linkage rates resulting from the consent process, and subgroup patterns in those linkage rates.
HRS Sample and Design Features
The original HRS is a panel survey of a nationally representative sample of households containing at least one person born in the 1931-41 period. The name "Health and Retirement Study" has come to mean two things: (1) the original cohort, which began in 1992, and (2) the larger project, which includes other survey cohorts that began in 1993 or 1998 or are planned for future years. The several cohorts in the full study are summarized in chart 1. This note is generally restricted to the original cohort first interviewed in 1992, although SSA has agreed to provide administrative data for all four cohorts described in the chart.
Members of the cohorts born in the 1931-41 period were aged 51-61 in 1992, and in HRS terms, they are "age-eligible persons." They constituted 9,824 persons, or 78 percent of the 12,652 persons in the initial 1992 interview. Spouses or partners were also interviewed, regardless of their year of birth, but spouses or partners born in years other than 1931-41 are not representative of other age groups, and their HRS weights were set to zero.
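The age-eligibility rule and the zero weighting of other respondents can be illustrated with a minimal Python sketch. It is not the HRS weighting procedure itself: the `Respondent` record, its field names, and the `base_weight` value are hypothetical and are introduced here only for illustration. The birth-year window and the person counts are taken from the paragraph above.

```python
from dataclasses import dataclass

# Birth-year window that defines an "age-eligible person" in the original cohort.
BIRTH_YEAR_MIN = 1931
BIRTH_YEAR_MAX = 1941


@dataclass
class Respondent:
    # Hypothetical record layout; these are not actual HRS variable names.
    person_id: str
    birth_year: int
    base_weight: float  # assumed weight before the eligibility adjustment


def is_age_eligible(birth_year: int) -> bool:
    """Return True if the person was born in the 1931-41 period."""
    return BIRTH_YEAR_MIN <= birth_year <= BIRTH_YEAR_MAX


def analysis_weight(respondent: Respondent) -> float:
    """Keep the weight for age-eligible persons; set it to zero otherwise."""
    if is_age_eligible(respondent.birth_year):
        return respondent.base_weight
    return 0.0


def percent_share(part: int, total: int) -> float:
    """Express part as a percentage of total."""
    return 100.0 * part / total


if __name__ == "__main__":
    spouse = Respondent(person_id="example", birth_year=1945, base_weight=1.5)
    print(analysis_weight(spouse))                  # spouse born outside 1931-41
    print(round(percent_share(9_824, 12_652)))      # age-eligible share, in percent
```

Under these assumptions, a spouse born in 1945 is kept in the file but contributes nothing to weighted estimates, and the age-eligible share of the 1992 interview rounds to the 78 percent reported above.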
From the outset, the HRS design recognized the benefits to be gained from augmenting survey information with data from SSA's administrative files and from other sources. Examples from non-SSA sources include detailed pension plan data, developed from summary pension plan documents obtained from employers of survey respondents and used to estimate pension wealth. Data from the National Death Index of the National Center for Health Statistics on the fact and date of death have been added, as relevant, and information is being developed from Medicare records about diagnoses, procedures, hospital stays, and medical use and costs.
Data derived from SSA records and currently available for HRS research are of three types: earnings histories, Social Security benefit histories, and Supplemental Security Income (SSI) payment histories. Of primary interest to most users of HRS restricted data are the earnings histories. They are primarily used to develop estimates of Social Security benefits, Social Security wealth, and pension wealth, which in turn are used to study retirement behavior, preparedness for retirement, economic well-being, and related topics.
Two restricted files containing earnings data are currently available. One includes annual taxable earnings up to the Social Security taxable maximum and annual quarters of coverage (or coverage credits) for the 1951-91 period. For the early years, a summary taxable earnings amount for the 1937-50 period and total quarters of coverage for years 1947-50 are also available. An earnings record can exist for persons with no covered earnings over their working lives, although the overwhelming majority of persons in the HRS cohort have some earnings. A second file, the Summary Earnings and Projected Benefits File (SEPBF), was created from the annual covered earnings data (Mitchell, Olson, and Steinmeier forthcoming). …