The Statistics Corner: The Statistics of Income Program of the Internal Revenue Service
Petska, Tom; Scheuren, Fritz; Wilson, Robert R. Business Economics
FEDERAL TAX RETURN information is an integral part of the statistical infrastructure enabling analysis of the U.S. economy. Most of this information is compiled by a relatively obscure organization, the Statistics of Income (SOI) Division of the Internal Revenue Service (IRS). In spite of this obscurity, SOI data are part of the bedrock of the U.S. statistical system and central to the understanding of the economy as a whole.
This article is the first of two that provide a brief overview of the SOI program -- its history, products, and services. This first article gives background information on the statutory origins and statistical processing of tax return information, then summarizes the major SOI programs and their principal customers. A later article describes ongoing innovations in the functional structure and technologies in SOI. Finally, issues of access to confidential tax return information and services to users are discussed.
BACKGROUND AND HISTORY
The compilation of economic and financial information from tax returns came into being after the adoption of the Sixteenth Amendment to the Constitution and the subsequent enactment of the first modern U.S. income tax law, the Revenue Act of 1916. This Act specifically called for the annual publication of statistics. Despite many revisions to the tax law, this function remains in the current Internal Revenue Code, which is based on the Revenue Act of 1986 and includes the requirement to "... prepare and publish not less than annually statistics reasonably available with respect to the operations of the internal revenue laws, including classifications of taxpayers and of income, the amounts claimed or allowed as deductions, exemptions, and credits."(1)
Like other federal statistical agencies, the SOI Division has a mission to collect and process data so that they become meaningful information. The mission of SOI differs from that of many other federal statistical agencies in two respects:
1. Unlike many statistical agencies that collect data through surveys, SOI collects data from the administrative records created from processing tax and information returns.
2. Although the IRS is a user of SOI data, the primary uses for SOI data are outside the IRS, in policy analyses of the effects of new or proposed tax laws and in evaluations of the functioning of the U.S. economy.
To accomplish its statutory responsibilities, the SOI program presently requires an annual budget of about $25 million. If revenues from reimbursable projects are also considered, the total amounts to nearly $28 million. The SOI national office staff in Washington consists of about 200 people -- mostly economists, computer specialists, and statisticians -- and accounts for about 40 percent of total SOI staffing. This staff, working closely with customers, determines the content of each project, designs the samples used, and develops field processing instructions, then works with its field processing staff to carry out the program.
Data capture operations in SOI are largely conducted by paraprofessionals at five of the ten IRS service centers located throughout the country. Programming is done mainly by staffs of computer specialists at two "hub" service centers. Together these centers account for about 50 percent of SOI staffing. The remainder of SOI staff is based at the Detroit Computing Center where activities such as final data file creation are performed.
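The budget and staffing figures quoted above imply a few rough magnitudes that the text does not state directly. The short Python sketch below works them out. The variable names are illustrative only, and because every input is an "about" or "nearly" figure, the results should be read as approximations rather than official counts.

```python
# Approximate figures quoted in the text (all are "about" values).
base_budget_millions = 25        # annual SOI budget, in millions of dollars
total_budget_millions = 28       # budget including reimbursable projects
national_office_staff = 200      # national office headcount in Washington
national_office_pct = 40         # national office share of total staffing
service_center_pct = 50          # service center share of total staffing

# Reimbursable revenue is the difference between the two budget figures.
reimbursable_millions = total_budget_millions - base_budget_millions

# If 200 people are 40 percent of the staff, the implied total follows.
total_staff = national_office_staff * 100 / national_office_pct
service_center_staff = total_staff * service_center_pct / 100

# The remainder of the staff is based at the Detroit Computing Center.
detroit_pct = 100 - national_office_pct - service_center_pct
detroit_staff = total_staff * detroit_pct / 100

print(f"Reimbursable revenue: about ${reimbursable_millions} million")
print(f"Implied total staff: about {total_staff:.0f}")
print(f"Service centers: about {service_center_staff:.0f}")
print(f"Detroit Computing Center: about {detroit_staff:.0f}")
```

On these figures, reimbursable projects account for roughly $3 million, and total SOI staffing comes to roughly 500 people, of whom about 250 work at the service centers and about 50 at the Detroit Computing Center.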
SOI's statistical processing of tax return data has historically been separate from the mainline processing of tax returns for administrative purposes. SOI operations begin by sampling from tax or information returns in the basic tax administration (or Master File) system. The Master File offers a sampling frame that enables efficient and sophisticated sample designs to be used. After the returns are sampled, data elements already captured for administrative purposes are used as a starting point in statistical processing. …
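To make the idea of sampling from a frame concrete, the following Python sketch draws a sample from a small, made-up frame of return records. It is an illustration under stated assumptions, not a description of the actual SOI sample designs, which are not detailed here. The record layout, the two stratum names, the selection rates, and the function name `sample_from_frame` are all hypothetical. The technique shown is simple stratified selection in which each record is chosen independently with its stratum's rate and carries a weight equal to the inverse of that rate.

```python
import random


def sample_from_frame(frame, rates, seed=0):
    """Select records from a frame using stratum-specific selection rates.

    frame: list of dicts, each with a 'stratum' key (hypothetical layout).
    rates: dict mapping stratum name -> selection probability in (0, 1].
    Returns the selected records, each with a 'weight' equal to 1 / rate.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    selected = []
    for record in frame:
        rate = rates[record["stratum"]]
        # Independent selection: keep the record with probability `rate`.
        if rng.random() < rate:
            selected.append({**record, "weight": 1.0 / rate})
    return selected


# Hypothetical frame: many small returns and a few large ones.
frame = (
    [{"id": i, "stratum": "small"} for i in range(10_000)]
    + [{"id": 10_000 + i, "stratum": "large"} for i in range(100)]
)

# Assumed rates: a 1 percent sample of small returns, all large returns.
rates = {"small": 0.01, "large": 1.0}

sample = sample_from_frame(frame, rates)

# Summing the weights estimates the number of records in the frame.
estimated_frame_size = sum(r["weight"] for r in sample)
print(len(sample), estimated_frame_size)
```

Because the frame lists every record together with its stratum, rare but influential records can be selected with certainty while common ones are sampled lightly. That is one sense in which a complete frame supports efficient designs.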