Agency explores new tool to connect intelligence dots

Document for prospective contractors offers blunt assessment of flaws in efforts to profile terrorists, including inability to move beyond "guilt-by-association" models.

The government's top intelligence agency is building a computerized system to search very large stores of information for patterns of activity that look like terrorist planning.

The system, which is run by the Office of the Director of National Intelligence, is in the early research phases and is being tested, in part, with government intelligence that may contain information on U.S. citizens and other people inside the country. It encompasses existing profiling and detection systems, including those that create "suspicion scores" for suspected terrorists by analyzing very large databases of government intelligence, as well as records of individuals' private communications, financial transactions, and other everyday activities.

The details of the program, called Tangram, are contained in an unclassified document that National Journal obtained from a government contracting Web site. The document, called a "proposer's information packet," is a technical description of Tangram written for potential contractors who would help design and test the system.

The document was written by officials in the research-and-development section of the national intelligence office. A tangram is an old Chinese puzzle that takes seven geometric shapes -- five triangles, a square, and a parallelogram -- and rearranges them into different pictures.

In addition to descriptions of Tangram, the document offers a rare and surprisingly candid analysis of intelligence agencies' fits and starts -- and failures -- in other efforts to profile terrorists through data mining: Researchers, for example, haven't moved beyond "guilt-by-association models" that link suspected terrorists to other, potentially innocent people, and then rank the suspects by level of suspicion.

"To date, the predominant approaches have used a guilt-by-association model to derive suspicion scores," the Tangram document states. "In the cases where we have knowledge of a seed entity [a known person] in an unknown group, we have been very successful at detecting the entire group. However, in the absence of a known seed entity, how do we score a person if nothing is known about their associates? In such an instance, guilt-by-association fails."
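The limitation the document describes can be illustrated with a toy example. The sketch below is not taken from the Tangram document; it assumes a hypothetical contact graph and an invented scoring rule in which suspicion starts at 1.0 for a known seed and is halved with each hop away from it. The names `guilt_by_association`, `contacts`, and `decay` are illustrative only.

```python
from collections import deque

def guilt_by_association(graph, seeds, decay=0.5):
    """Score entities by proximity to a seed (illustrative rule, not Tangram's).

    A seed scores 1.0; each hop away multiplies the score by `decay`.
    Entities with no path to any seed keep a score of 0.0.
    """
    scores = {node: 0.0 for node in graph}
    queue = deque()
    for seed in seeds:
        scores[seed] = 1.0
        queue.append(seed)
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            candidate = scores[node] * decay
            # Keep only the highest score reachable from any seed.
            if candidate > scores[neighbor]:
                scores[neighbor] = candidate
                queue.append(neighbor)
    return scores

# Hypothetical contact graph: keys are entities, values are direct contacts.
contacts = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B"],
    "E": [],
}

with_seed = guilt_by_association(contacts, seeds=["A"])
without_seed = guilt_by_association(contacts, seeds=[])
print(with_seed)
print(without_seed)
```

With "A" as the seed, every entity connected to A receives a nonzero score whether or not the contact was innocuous. With no seed at all, every score is zero, which is the failure the document's authors describe.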

Intelligence and privacy experts who reviewed the document said that it reaffirms their long-held belief that many computerized terrorist-profiling methods are largely ineffective. The experts said the program also raises significant privacy concerns, because to distinguish terrorists from innocent people, a system as broad as Tangram purports to be would require access to many databases that contain private information about Americans, including credit card transactions, communications records, and even Internet purchases.

"There is no other way that they could do this," said David Holtzman, former chief technology officer of Network Solutions, the company that runs the Internet's domain-naming system, and author of the book Privacy Lost. "They want to investigate real-time ways of spotting patterns" that might indicate terrorist activity, he said. "Telephone calls, for instance, would be an obvious thing you'd feed into this."

The Tangram document doesn't mention privacy protections or a process for monitoring the system's use to guard against abuse. In an interview, Tim Edgar, the deputy civil-liberties protection officer for the national intelligence director, said that Tangram "is a research-and-development program. We have been assured that it's not deployed for operational use."

Asked whether the intelligence used to test Tangram contains information about U.S. persons, defined as U.S. citizens and permanent resident aliens, Edgar said, "It's not being tested with any data that has unminimized information about U.S. persons in it." Minimization procedures are used by intelligence agencies to expunge people's names from official reports and replace them with an anonymous designation, such as U.S. Person No. 1.

Tangram is being tested "only with synthetic data or foreign-intelligence data already being used by analysts that meet Defense Department guidelines for handling of U.S. person information," Edgar said. The Office of the Director of National Intelligence "has not funded and is not planning to fund any contracts for the Tangram program using unminimized data with U.S. persons in it," he said.

Tangram drew skeptical reviews from technology and privacy experts because of its links to Total Information Awareness, a controversial research program started by the Pentagon in 2002. TIA also aimed to detect patterns of terrorist behavior. Congress ended all public funding for the program in 2003, but allowed research to continue through the classified intelligence budget.

In February, National Journal revealed that component TIA programs were simply renamed and transferred to a research-and-development unit principally overseen by the National Security Agency. The unit, now under the control of the Office of the Director of National Intelligence, also runs Tangram.

The Tangram document cites several TIA programs -- by their new names -- as forming the latest phase of research upon which Tangram will build. In a prepared statement, the intelligence director's office said, "Tangram is addressing the problem that the intelligence community receives vast amounts of data a day and there are a wide variety of algorithms -- mathematical procedures -- for figuring out what is relevant. Different algorithms serve different purposes, but we believe that combining them will provide us new insights in detecting terrorist plans and activities. The project will allow analysts to mix and match various methods to connect the dots."

TIA was similarly envisioned as a vast combination of detection methods. In Tangram, "I see the system of systems that is essentially TIA about to be born," said Tim Sparapani, the legislative counsel on privacy issues for the American Civil Liberties Union. "TIA was designed to be one unified system," he said. "This is the vision, I think, made practical."

Robert Popp, who was the TIA program's deputy director, also saw parallels to Tangram. "They seem to be doing something very similar in concept," Popp said. "Taking data, doing all the sense-making and path-finding, and turning it into a form which a decision maker can act upon."

According to the document, Tangram "takes a systematic view of the [terrorist-detection] process, applying what is now a set of disjointed, cumbersome-to-configure technologies that are difficult for nontechnical users to apply, into a self-configuring, continuously operating intelligence analysis support system."

Tangram will be "aware" of the various patterns, relationships, and contexts expressed in data, and will automatically configure itself to choose the best algorithm for exploiting that data, the document explains. As envisioned, the system "can reason about how best to produce an answer" on its own.

"Conceptually, the approach would be to perform a succession of automated 'what if' scenarios that compute the expected value of acquiring additional information," the document states. The system would, effectively, suggest other questions for the analyst to ask, and perhaps where to look for answers.
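The document does not say how such a calculation would be carried out. One textbook way to compute the expected value of additional information is sketched below, as an assumption rather than a description of Tangram: compare the best expected payoff an analyst can get now with the average best payoff after seeing what a candidate source reports. All hypotheses, actions, payoffs, and probabilities here are invented for illustration.

```python
def best_expected_payoff(belief, payoffs):
    """Expected payoff of the best action under the current belief."""
    return max(
        sum(belief[h] * payoff[h] for h in belief)
        for payoff in payoffs.values()
    )

def value_of_information(prior, payoffs, likelihood):
    """Expected gain from consulting a source before acting.

    likelihood[observation][hypothesis] = P(observation | hypothesis).
    """
    baseline = best_expected_payoff(prior, payoffs)
    expected_after = 0.0
    for given in likelihood.values():
        p_obs = sum(given[h] * prior[h] for h in prior)
        if p_obs == 0:
            continue  # This observation can never occur.
        # Bayes' rule: update the belief for this observation.
        posterior = {h: given[h] * prior[h] / p_obs for h in prior}
        expected_after += p_obs * best_expected_payoff(posterior, payoffs)
    return expected_after - baseline

# Hypothetical numbers; payoffs are in arbitrary units.
prior = {"relevant": 0.2, "noise": 0.8}
payoffs = {
    "follow_up": {"relevant": 10.0, "noise": -2.0},
    "set_aside": {"relevant": -10.0, "noise": 0.0},
}
informative_source = {
    "hit": {"relevant": 0.9, "noise": 0.1},
    "miss": {"relevant": 0.1, "noise": 0.9},
}
uninformative_source = {
    "hit": {"relevant": 0.5, "noise": 0.5},
    "miss": {"relevant": 0.5, "noise": 0.5},
}

voi_informative = value_of_information(prior, payoffs, informative_source)
voi_uninformative = value_of_information(prior, payoffs, uninformative_source)
print(voi_informative, voi_uninformative)
```

Under these made-up numbers, the source whose reports actually discriminate between the two hypotheses is worth consulting, while the one that reports "hit" half the time regardless is worth nothing. Ranking candidate sources by this value is one plausible way a system could suggest where an analyst should look next.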

Last month, the government awarded three contracts for Tangram research and design totaling almost $12 million. Total funding for the program is approximately $49 million. Two of the firms receiving awards -- Booz Allen Hamilton and 21st Century Technologies -- were principal contractors on the TIA program. The third company, SRI International, worked on one of TIA's predecessors, the Genoa program.

Spokeswomen for Booz Allen Hamilton and SRI declined to comment for this article. Repeated calls and e-mails to the Austin offices of 21st Century Technologies went unanswered.

The apparent lack of privacy protections in Tangram dismayed some experts. "Given the history of TIA and other programs, one would expect the proponents of a system like this would at least pay lip service to privacy issues," said David Sobel, senior counsel for the Electronic Frontier Foundation, a privacy watchdog. "The absence of that is a bit surprising."

The TIA program devoted more than $4 million to research aimed at ways to protect privacy while it was sifting databases, and former officials have said that although it was admittedly controversial, TIA was being designed all along with privacy protection and auditable logs to track those who used it. The privacy research, however, was abandoned when the program moved into the classified budget in the NSA.

Administration officials have singled out the importance of new technologies in the war on terrorism. President Bush said that the NSA's warrantless surveillance and analysis of phone calls and e-mails protects Americans from attack. Gen. Michael Hayden, the former NSA director, said that had such a system been in place before the September 11 attacks, "we would have detected some of the 9/11 Al Qaeda operatives in the United States, and we would have identified them as such."

But the Tangram document presents a more pessimistic assessment of the state of terrorist detection. For instance, researchers want to find ways to distinguish individuals' innocuous activity from that which might appear normal but is really indicative of terrorist plotting. However, the document states that, in large measure, terrorism researchers "cannot readily distinguish the absolute scale of normal behaviors" either for innocent people or for terrorists.

The ACLU's Sparapani called that admission "a bombshell," because the government is acknowledging that current detection systems aren't sophisticated enough to separate terrorists from everyday people. Other outside experts were troubled that such shortcomings also mean that individuals intent on doing harm could be mistaken for innocent people.

Popp said that attempts to separate terrorists' activities from those of normal people are perilous. "When you try to capture what is normal behavior, and then determine non-normal, that's highly intractable," he said.

Several times, Popp said, TIA researchers discussed how to characterize nonterrorist behavior. "We avoided it. It was too hard. We had no idea how on God's earth you would characterize and capture normal behavior. We wouldn't know where to start." Instead, TIA researchers proposed looking for specific indicators of terrorist planning -- people purchasing airline tickets at the last minute with cash, for instance, or other transactions that fit the narrative of an attack.

Current detection techniques have raised the specter of what the Tangram document calls "runaway false detections." If analysts tie a terrorist suspect to five other individuals, say through phone calls, how can they be certain that these five people constitute a terrorist network and aren't simply people with whom the suspect has had innocuous, everyday interactions? The document says that research has been conducted on "the sensitivities of guilt-by-association models to runaway false detections."

Researchers have made other attempts to move beyond the guilt-by-association model, the document states. One technique, an obscure methodology known as "collective inferencing," in which the suspicion score of an entire network of people is computed at once, has apparently garnered some interest. But "existing techniques are far too simple" for real-world problems, the document acknowledges.

The Tangram document states that gaps in current detection techniques also stem from the difficulty of tracking terrorist behaviors, which are constantly changing.

"The underlying assumption of existing approaches is that behaviors are constant," the authors write. "Yet, behaviors are not constant.... How can we profile dynamic behavior well enough to be able to identify, with more-or-less confidence, entities who want to remain anonymous?" The answer to that question apparently eludes the researchers, who hope that Tangram might provide it.