Total Information Awareness official responds to criticism

By Shane Harris

January 31, 2003

The second-ranking official on the Defense Department's Total Information Awareness Project to predict terrorist attacks says critics of the effort have misinterpreted its goals and the nature of the technology it will use.

The TIA project has its roots in previous Pentagon programs to build machines that think like human beings. For more than a decade, the department has studied computers that could help the military predict enemy troop movements and better wage war. TIA would build on these efforts with a system that would inspect private information about people living in the United States to try to uncover patterns of terrorist activity. The project is under attack by privacy advocates and lawmakers who say it's tantamount to domestic spying.

Officials at the Defense Advanced Research Projects Agency (DARPA), which is running the TIA project, have been tight-lipped about it; in fact, they've been sued in federal court for refusing to release records. But in recent interviews with Government Executive, TIA Deputy Director Robert Popp showed how TIA would rely on the artificial intelligence work of earlier projects as well as the inspection of databases that has inflamed TIA's critics.

TIA's goal is to predict terrorist attacks before they happen. The system would scan private and public databases, as well as the Internet, for transactions that might be associated with a terrorist attack. Those transactions might be credit card purchases, airline reservations, purchases of chemicals or rental car records. TIA also would translate foreign Web pages into English to find key words and phrases that might suggest an attack.

Critics charge that TIA would search thousands of databases and collect records about mostly innocent people. They say this so-called data mining practice is of dubious scientific value because it would yield many false positive results, incorrectly fingering people as terrorists. Also, if TIA stores the data it collects, it would become a prime target for computer hackers and American adversaries.

But Popp said TIA would rely less on data mining than many opponents think. Terrorism experts, using studies of past attacks and confiscated terrorist writings, are developing a series of templates, or likely attack scenarios. Those templates would help TIA determine what databases to investigate. TIA would not cast a wide net, he said. Rather, using the templates as a guide for specific information, it might only search dozens of databases, he said.

As an example, the Sept. 11 template might look like this: Hijackers use civil aircraft to destroy an office building. TIA would use data mining to scan an "infospace," looking for, perhaps, airline ticket purchases, hotel reservations, visa applications and money transfers between bank accounts.

Popp's comments are the first of any DARPA official to counter the opposition to TIA. The project's director, John Poindexter, whose conviction for lying to Congress during the Iran-Contra investigation was later overturned, refuses to conduct interviews. Popp's response, as well as the history of the TIA project and its ancestors, shows the system is predicated on complex and controversial theories that DARPA has investigated for years. The officials who lead TIA also can be traced to the projects that gave rise to it.

Thinking Machines

In 1989, DARPA started working with the Air Force Research Laboratory in Rome, N.Y., to develop "automatic decision-making" practices to aid the military in times of crisis and planning, documents show. It was among the first in a series of DARPA projects aimed at teaching computers to think more like people, and to make analyses and decisions on their own. Doug Dyer, the project's manager, now works in the Information Awareness Office, the umbrella group for TIA and other projects Poindexter directs.

The Rome lab is a frequent contractor for DARPA research, a spokesman said, and is currently performing work on the TIA project. Officials at the lab wouldn't comment on the nature of their research. Popp was a visiting scientist at the lab several years after Dyer's project began.

In the late 1990s, DARPA's work on artificial intelligence picked up steam with a project called the High Performance Knowledge Base (HPKB). Its goal was to create technology to build databases of tens of thousands of rules and observations about a variety of subjects, and it focused on warfighting. Documents show that DARPA engineers believed that powerful data mining technologies already in use to detect credit card fraud and predict consumer purchasing behavior might be applicable to the project. An evaluation paper by Defense scientists used the phrase "knowledge is power" to describe HPKB's philosophy. That phrase became the official slogan of Poindexter's Total Information Awareness Office.

Several companies helped test the project, including software firm Alphatech, one of Popp's former employers. DARPA initiated a follow-on project to HPKB called Rapid Knowledge Formation (RKF), which tried to improve the interaction between thinking machines and the human beings who teach them.

A main component of RKF research was helping human database builders correct observations the machines made, in order to avoid errors and expand the machines' knowledge. "Our vision is that scientific, technical and military experts would encode massive amounts of knowledge into reusable knowledge bases," Murray Burk, the RKF project manager, said at a DARPA conference in 1999. Burk, now assigned to DARPA's Information Exploitation Office, was a project manager at the Rome lab for 14 years.

Burk said RKF engineers wanted "reusable theories" that could be used "for answering a broad range of questions and solving a large number of problems." The terrorism templates TIA envisions also are reusable theories that try to answer questions about future attacks.

The RKF project is ongoing and is part of the Information Exploitation Office. It was previously housed in the Information Systems Office, which was closed in October 2001, after the Sept. 11 attacks, a DARPA spokeswoman said. Other programs in that office also looked at artificial intelligence and knowledge databases. Some of them were moved into Poindexter's group after the Sept. 11 attacks, when that division became DARPA's focal point for counterterrorism research. Some of them also are now TIA components.

TIA predates Poindexter's arrival at DARPA. At the 1999 conference where Burk described RKF, J. Brian Sharkey, then a DARPA program manager, delivered a presentation entitled "Total Information Awareness." He said the effort was not an approved program, but rather "a technology focus… and a starting point for reshaping the direction of existing programs and launching new efforts in the future."

The description of TIA's scope jibes with the goals of the other DARPA projects, according to Sharkey: "The primary question is: 'How far can we push automation through the development of intelligent search and inference agents, to take the burden of finding relevant evidence that the human can reason about?'" TIA was also part of the Information Systems Office.

Sharkey also created the Genoa project, a decision-making tool to be used by the Defense Intelligence Agency. Poindexter says he has worked closely with the project for the past seven years. Genoa II, its heir, is now part of TIA. Prior to working on Genoa, Sharkey was a manager at Bolt, Beranek and Newman, a pioneering network technology firm. Popp was also a senior scientist at the firm overseeing DARPA projects and research.

DARPA rotates its projects on a regular basis, and reorganizes its different offices often. But many of the same companies and entities work on several agency projects. DARPA critics point to these connections and label the agency "an old boy's network." However, there are few scientists in the country with the expertise these projects require. Many of them have worked for DARPA. Much of that work now finds a home in TIA.

Beer and Diapers

Most of the scientific skepticism about TIA concerns data mining. The term is ill-defined, but is well illustrated by an often-cited case.

A number of convenience store clerks, the story goes, noticed that men often bought beer at the same time they bought diapers. The store mined its receipts and proved the clerks' observations correct. So, the store began stocking diapers next to the beer coolers, and sales skyrocketed.

The story is a myth, but it shows how data mining seeks to understand the relationship between different actions. It's used to detect credit card fraud by looking for anomalies, transactions that don't match a customer's usual habits.
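The anomaly-based fraud check described above can be sketched in a few lines. This is a minimal illustration only, not any card issuer's or DARPA's actual method: the function name `flag_anomalies`, the three-standard-deviation threshold and the sample purchase amounts are all assumptions made for the example.

```python
from statistics import mean, pstdev


def flag_anomalies(history, new_amounts, threshold=3.0):
    """Return the new amounts that stray far from a customer's usual spending.

    history     -- past purchase amounts for one customer (hypothetical data)
    new_amounts -- incoming purchase amounts to check
    threshold   -- how many standard deviations counts as unusual (an assumption)
    """
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # Every past purchase was identical, so any different amount is unusual.
        return [a for a in new_amounts if a != mu]
    return [a for a in new_amounts if abs(a - mu) / sigma > threshold]


# Hypothetical customer who usually spends about $20 to $30 per purchase.
usual = [20, 25, 22, 18, 30, 24, 21]
suspicious = flag_anomalies(usual, [23, 950])
```

Here the $950 purchase is flagged and the $23 purchase is not, because only the former falls well outside the customer's established range.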

TIA critics say data mining is an imperfect science, and that the more transactions that are scanned, the higher the probability there will be false positives.
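The critics' arithmetic can be made concrete with a short worked example. The figures below (one million innocent records, 10 genuine cases, a 0.1 percent error rate) are invented purely for illustration and do not come from DARPA or from any critic quoted here; the function names are likewise hypothetical.

```python
def expected_false_positives(innocent_records, false_positive_rate):
    """Expected number of innocent records wrongly flagged."""
    return innocent_records * false_positive_rate


def share_of_flags_that_are_real(true_cases, innocent_records,
                                 detection_rate, false_positive_rate):
    """Fraction of all flagged records that are genuine cases."""
    true_hits = true_cases * detection_rate
    false_hits = expected_false_positives(innocent_records, false_positive_rate)
    total = true_hits + false_hits
    return true_hits / total if total else 0.0


# Hypothetical scan: even a 0.1 percent error rate swamps 10 genuine cases.
false_hits_small = expected_false_positives(1_000_000, 0.001)
false_hits_large = expected_false_positives(100_000_000, 0.001)
real_share = share_of_flags_that_are_real(10, 1_000_000, 1.0, 0.001)
```

Under these assumed numbers, roughly 1,000 innocent records are flagged alongside the 10 genuine ones, so fewer than 1 in 100 flags is real, and scanning 100 times as many records multiplies the false flags by the same factor.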

Popp said the notion that false positives would impair TIA is a "red herring."

"We're not trying to define a norm and then find anomalous behaviors or outliers to that norm" as in fraud detection, Popp said. "Our approach in TIA is to not rely on data mining to look for anomalous patterns."

Instead, the attack templates narrow down the data to be mined. TIA would search a "sweet spot" of maybe a few dozen databases, not hundreds or thousands. Research by DARPA's terrorism experts, as well as a study being conducted by the Rand Corp., suggests this focused approach is possible, Popp said. TIA would rank its findings with designations such as possible, likely or certain terrorist activity, he said.

A computer scientist who specializes in data analysis, and who asked to remain anonymous, said this approach is flawed. It's important to remove human biases and use mainly data to make the templates, he said. Otherwise, the system "recognizes only those scenarios the [experts] had the creativity to come up with."

Because there is so little knowledge about how terrorists operate, TIA could only spot clusters of events that resemble what people think an attack looks like, the computer scientist said. "I am skeptical of the idea that anyone can paint a useful, predictive picture of the general class of terrorist attacks," he said.

Popp said new templates would be added as experts learn more about terrorists' tactics. "We hope to see over time…an 80/20 phenomena, where roughly 20 percent of the templates will provide 80 percent of the value," Popp said. Of course, that requires knowing what the terrorist behavior patterns are.

One technologist familiar with TIA's work thinks that's impossible. When it comes to terrorism, "there is no pattern," said Chris Westphal, chief executive officer of Visual Analytics, a data mining software maker that worked with DARPA on TIA-related technologies in the late 1990s.

Westphal said data mining can spot patterns as long as the templates of behavior consist of stable variables, such as numbers. The approach works well for determining what people are at risk for certain diseases, he said.

But terrorists are dynamic human beings. Westphal's company created behavior templates to help the U.S. Customs Service catch drug smugglers at the Mexican border. The processes yielded "a ton of false positives," he said, because human behavior is hard to measure.

Westphal, who said his company trademarked the phrase "total information awareness" before DARPA appropriated it, said Visual Analytics won't work with the agency anymore because TIA has "blurred the lines" of data mining.

Seeing the Future

The TIA project is now in jeopardy. Sen. Ron Wyden, D-Ore., has introduced legislation to stop funding for the program until the administration makes a full report on its scope and budget.

DARPA won't build a TIA machine, but rather will hand over its design to some other agency. Defense's inspector general has reported that TIA officials have briefed the Defense, Homeland Security and Justice departments on the project. A spokesman at the FBI said officials also met with Poindexter's group, and that the bureau is open to creating a "memorandum of understanding" about the use of TIA.

A number of computer scientists contacted for this story doubted TIA's ability to pick out terrorists from the population, but noted the system might be very good at analyzing data about a specific person or group of people. For that reason, some fear TIA might be used by law enforcement agencies to monitor people deemed suspicious by the government.

TIA research is three years from completion. In the meantime, federal agencies such as the FBI, the Transportation Security Administration and the CIA have seen their powers to mine data and investigate individuals widened. Those agencies are using technology to conduct their work.

The argument over privacy and the science of TIA may never be settled. But as homeland security, law enforcement and intelligence agencies become more dependent on technology, the debate will grow more ferocious.



http://www.govexec.com/defense/2003/01/total-information-awareness-official-responds-to-criticism/13355/