Anthropology of an Idea: Big Data

HUMANS HAVE BEEN whining about being bombarded with too much information since the advent of clay tablets. The complaint in Ecclesiastes that "of making many books there is no end" resonated in the Renaissance, when the invention of the printing press flooded Western Europe with what an alarmed Erasmus called "swarms of new books." But the digital revolution--with its ever-growing horde of sensors, digital devices, corporate databases, and social media sites--has been a game-changer, with 90 percent of the data in the world today created in the last two years alone. In response, everyone from marketers to policymakers has begun embracing a loosely defined term for today's massive data sets and the challenges they present: Big Data. While today's information deluge has enabled governments to improve security and public services, it has also sowed fears that Big Data is just another euphemism for Big Brother.

1887-1890 American statistician Herman Hollerith invents an electric machine that reads holes punched into paper cards to tabulate 1890 census data, revolutionizing the concept of a national head count, which had originated with the Babylonians in 3800 B.C. The device, which enables the United States to complete its census in one year instead of eight, spreads globally as the age of modern data processing begins.

1935-1937 President Franklin D. Roosevelt's Social Security Act launches the U.S. government on its most ambitious data-gathering project ever, as IBM wins a government contract to keep employment records on 26 million working Americans and 3 million employers. "Imagine the vast army of clerks which will be necessary to keep these records," Republican presidential candidate Alf Landon scoffs. "Another army of field investigators will be necessary to check up on the people whose records are not clear."

1943 At Bletchley Park, a British facility dedicated to breaking Nazi codes during World War II, engineers develop a series of groundbreaking mass data-processing machines, culminating in the first programmable electronic computer. The device, named "Colossus," searches for patterns in intercepted messages by reading paper tape at 5,000 characters per second--reducing a process that had previously taken weeks to a matter of hours. Deciphered information on German troop formations later helps the Allies during their D-Day invasion.

1961 The U.S. National Security Agency (NSA), a nine-year-old intelligence agency with more than 12,000 cryptologists, confronts information overload during the espionage-saturated Cold War, as it begins collecting and processing signals intelligence automatically with computers while struggling to digitize a backlog of records stored on analog magnetic tape in warehouses. (In July 1961 alone, the agency receives 17,000 reels of tape.)

1965-1966 The U.S. government secretly studies a plan to transfer all government records--including 742 million tax returns and 175 million sets of fingerprints--to magnetic computer tape at a single national data center, though the plan is later scrapped amid public concern about bringing "Orwell's '1984' at least as close as 1970," as one report puts it. The outcry inspires the 1974 Privacy Act, which places limits on federal agencies' sharing of personal information.

1989 British computer scientist Tim Berners-Lee proposes leveraging the Internet, pioneered by the U.S. government in the 1960s, to share information globally through a "hypertext" system called the World Wide Web. "The information contained would grow past a critical threshold," he writes, "so that the usefulness [of] the scheme would in turn encourage its increased use."

1997 NASA researchers Michael Cox and David Ellsworth use the term "big data" for the first time to describe a familiar challenge in the 1990s: supercomputers generating massive amounts of information--in Cox and Ellsworth's case, simulations of airflow around aircraft--that cannot be processed and visualized. …