The US government is developing a massive computer system that
can collect huge amounts of data and, by linking far-flung
information from blogs and e-mail to government records and
intelligence reports, search for patterns of terrorist activity.
The system - parts of which are operational, parts of which are
still under development - is already credited with helping to foil
some plots. It is the federal government's latest attempt to use
broad data-collection and powerful analysis in the fight against
terrorism. But by delving deeply into the digital minutiae of
American life, the program is also raising concerns that the
government is intruding too far into citizens' privacy.
"We don't realize that, as we live our lives and make little
choices, like buying groceries, buying on Amazon, Googling, we're
leaving traces everywhere," says Lee Tien, a staff attorney with the
Electronic Frontier Foundation. "We have an attitude that no one
will connect all those dots. But these programs are about connecting
those dots - analyzing and aggregating them - in a way that we
haven't thought about. It's one of the underlying fundamental issues
we have yet to come to grips with."
The core of this effort is a little-known system called Analysis,
Dissemination, Visualization, Insight, and Semantic Enhancement
(ADVISE). Only a few public documents mention it. ADVISE is a
research and development program within the Department of Homeland
Security (DHS), part of its three-year-old "Threat and
Vulnerability, Testing and Assessment" portfolio. The TVTA received
nearly $50 million in federal funding this year.
DHS officials are circumspect when talking about ADVISE. "I've
heard of it," says Peter Sand, director of privacy technology. "I
don't know the actual status right now. But if it's a system that's
been discussed, then it's something we're involved in at some …
Data-mining is a key technology
A major part of ADVISE involves data-mining - or "dataveillance,"
as some call it. It means sifting through data to look for patterns.
If a supermarket finds that customers who buy cider also tend to buy
fresh-baked bread, it might group the two together. To prevent
fraud, credit-card issuers use data-mining to look for patterns of …
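The supermarket example can be made concrete with a minimal sketch of the standard association-rule measures (support, confidence, and lift). The basket data and the function name `association_metrics` are invented for illustration; nothing here is drawn from ADVISE or any retailer's actual system.

```python
from typing import Dict, List, Set


def association_metrics(
    baskets: List[Set[str]], antecedent: str, consequent: str
) -> Dict[str, float]:
    """Measure how strongly buying `antecedent` goes with buying `consequent`."""
    n = len(baskets)
    has_a = sum(1 for b in baskets if antecedent in b)
    has_c = sum(1 for b in baskets if consequent in b)
    has_both = sum(1 for b in baskets if antecedent in b and consequent in b)

    # Support: share of all baskets that contain both items.
    support = has_both / n if n else 0.0
    # Confidence: share of antecedent baskets that also contain the consequent.
    confidence = has_both / has_a if has_a else 0.0
    # Lift: confidence relative to the consequent's overall popularity.
    # A lift above 1 means the pair occurs together more often than chance.
    lift = confidence / (has_c / n) if has_c else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical shopping baskets, made up for this example.
baskets = [
    {"cider", "bread", "cheese"},
    {"cider", "bread"},
    {"milk", "eggs"},
    {"cider", "bread", "apples"},
    {"bread", "milk"},
    {"cider", "apples"},
]

metrics = association_metrics(baskets, "cider", "bread")
print(metrics)
```

A store would look for pairs whose lift is well above 1 across many thousands of real baskets; a sample of six is far too small to support any conclusion.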
What sets ADVISE apart is its scope. It would collect a vast
array of corporate and public online information - from financial
records to CNN news stories - and cross-reference it against US
intelligence and law-enforcement records. The system would then
store it as "entities" - linked data about people, places, things,
organizations, and events, according to a report summarizing a 2004
DHS conference in Alexandria, Va. The storage requirements alone are
huge - enough to retain information about 1 quadrillion entities,
the report estimated. If each entity were a penny, they would
collectively form a cube a half-mile high - roughly double the
height of the Empire State Building.
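The penny comparison can be checked with a few lines of arithmetic. This sketch assumes standard US one-cent dimensions (19.05 mm across, 1.52 mm thick), treats each coin as filling a square box when stacked in a grid, and uses the Empire State Building's roof height of about 381 m. Those figures are assumptions added here, not taken from the DHS report.

```python
# Assumed US penny dimensions; not stated in the report.
PENNY_DIAMETER_MM = 19.05
PENNY_THICKNESS_MM = 1.52

ENTITY_COUNT = 10**15          # 1 quadrillion entities, per the report
METERS_PER_MILE = 1609.344
EMPIRE_STATE_ROOF_M = 381.0    # assumed roof height, excluding the antenna


def cube_side_meters(count: int) -> float:
    """Side of a cube holding `count` pennies stacked in a square grid."""
    # Each coin occupies a diameter x diameter x thickness box.
    box_volume_mm3 = PENNY_DIAMETER_MM**2 * PENNY_THICKNESS_MM
    total_volume_m3 = count * box_volume_mm3 * 1e-9  # mm^3 to m^3
    return total_volume_m3 ** (1.0 / 3.0)


side_m = cube_side_meters(ENTITY_COUNT)
side_miles = side_m / METERS_PER_MILE

print(f"cube side: {side_m:.0f} m ({side_miles:.2f} miles)")
print(f"ratio to Empire State Building roof: {side_m / EMPIRE_STATE_ROOF_M:.1f}x")
```

Under these assumptions the cube comes out at roughly half a mile on a side and a little over twice the building's roof height, which agrees with the report's comparison.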
But ADVISE and related DHS technologies aim to do much more,
according to Joseph Kielman, manager of the TVTA portfolio. The key
is not merely to identify terrorists, or sift for key words, but to
identify critical patterns in data that illumine their motives and
intentions, he wrote in a presentation at a November conference in …
For example: Is a burst of Internet traffic between a few people
the plotting of terrorists, or just bloggers arguing? ADVISE
algorithms would try to determine that before flagging the data
pattern for a human analyst's review.
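How ADVISE would make that call is not public. As a purely illustrative sketch of the first step, spotting the burst at all, the code below flags hours in which message volume jumps well above a trailing baseline. The function `flag_bursts`, its parameters, and the sample counts are all hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def flag_bursts(
    counts: List[int], window: int = 6, threshold: float = 3.0
) -> List[int]:
    """Return indexes of hours whose count far exceeds the recent baseline."""
    flagged = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        mu = mean(baseline)
        # Floor the spread at 1.0 so a very quiet baseline does not turn
        # every small uptick into a "burst".
        spread = max(pstdev(baseline), 1.0)
        if (counts[i] - mu) / spread > threshold:
            flagged.append(i)
    return flagged


# Hypothetical messages per hour exchanged within a small group.
hourly_messages = [2, 3, 1, 2, 4, 3, 2, 3, 40, 38, 3, 2]

bursts = flag_bursts(hourly_messages)
print("hours flagged for review:", bursts)
```

A detector like this only surfaces that something unusual happened. Telling plotters from arguing bloggers would require looking at content and context, which is the harder problem the article describes.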
At least a few pieces of ADVISE are already operational. Consider
Starlight, which along with other "visualization" software tools can
give human analysts a graphical view of data. Viewing data in this
way could reveal patterns not obvious in text or number form.
Understanding the relationships among people, organizations, places,
and things - using social-behavior analysis and other techniques -
is essential to going beyond mere data-mining to comprehensive
"knowledge discovery in databases," Dr. …