Emergency preparedness exercises remain an imperfect science

The government runs scores of drills each year, but critics say responses to real-life events often show the tests weren’t adequate.

It is now an infamous hurricane. Category 3 upon landfall, it ripped through the Gulf Coast region, fulfilling worst fears by hitting low-lying New Orleans.

The storm "brought sustained winds of 120 [miles per hour], up to 20 inches of rain in parts of southeast Louisiana and a storm surge that topped levees in the New Orleans area," according to a press release from the Federal Emergency Management Agency. But Hurricane Pam existed only on paper as part of a training exercise conducted by FEMA and local officials in July 2004, a year and one month before Hurricane Katrina.

The real-life version of The Big One exposed widespread flaws in the government's ability to respond to disasters, making Hurricane Pam a symbol, in many minds, of the questionable effectiveness of preparedness exercises. The simulation illuminated some lessons later applied with successful results, but as a whole, the system seemed unready, despite the rehearsal.

Pam is one of scores of simulations, drills and exercises that take place every year at all levels of government and in the private sector with the intention of testing and preparing everyone from first responders to senior-level decision-makers for crises. Critics say real-life events often show the practices don't adequately prepare these critical players.

Michael J. Hopmeier, a former adviser to the surgeon general and now president of the consulting firm Unconventional Concepts Inc. in Mary Esther, Fla., points to two examples: joy-riding teenagers who drove past the guard posts in September at the Tampa, Fla., military base that houses U.S. Central Command, and a retired police officer who used fake identification to enter Homeland Security Department headquarters in June. "When it wasn't a test, when it wasn't the [inspector general] going through, when it wasn't an assessment; when it was a real-world event, it turned out the security failed," he says.

Hopmeier and others say many preparedness exercises are not nearly as valuable as advertised, or as they could be. Simply holding them is not sufficient; the exercises must be evaluated to ensure they are testing the system enough to expose vulnerabilities and problems that must then be repaired.

"Exercises are not all created equal," says Michael Wermuth, director of homeland security programs at the nonprofit RAND Corp. "There are a lot of different kinds of exercises, a lot of different methodologies used to conduct exercises. There are exercises that sometimes seem to be destined to ensure success or at least a successful outcome in the exercise."

Big Business

Though police and fire departments have been running drills for years, historically there have been few disaster exercises involving those in the public health system, which has many nonemergency roles and has seen its funding levels decline substantially in recent decades. The terrorist attacks of Sept. 11, 2001, changed that, focusing more attention on the entire response spectrum.

The July 2002 National Strategy for Homeland Security instructed the Homeland Security Department to consolidate and expand existing systems and to create the National Exercise Program to train and evaluate response officials at all levels of government. And there has been a proliferation of simulations and drills from the federal level down to city and county offices, often using funding from DHS and elsewhere. The high-profile examples often sound like the names of post-apocalyptic football teams (Atlantic Storm, Dark Winter). Some are "tabletop exercises," which put hypothetical crises before government leaders, who must work together to make response decisions; others are field exercises, simulations conducted outside the office, often on mock sets, and meant to test the emergency response system's ability to handle a disaster.

In September, for example, first responders in Washington simulated a rescue of passengers, daubed with fake blood, from a real subway train stopped in a tunnel under the Potomac River; in July, volunteers in Mankato, Minn., counted and bagged Smartie candies (a substitute for pharmaceuticals) to see how quickly drugs could be distributed in a crisis.

The most high-profile exercise series began before Sept. 11. It is the Top Officials program, known as TOPOFF, a collection of seminars, planning events and large-scale national exercises designed to train and drill government leaders and others on preventing, responding to and recovering from terrorist attacks.

The scenarios are frightening to consider: pneumonic plague in Denver and a mustard gas release in New Hampshire for the first TOPOFF in May 2000; plague-struck Chicago and a dirty bomb detonation in Seattle three years later in TOPOFF 2; and releases of plague in New Jersey and mustard gas in Connecticut for TOPOFF 3 in April 2005. DHS' Office of Grants and Training, which now oversees the exercise series, has watched the scope expand sharply, from 18 federal agencies involved in the first drill to 27 in TOPOFF 3, plus dozens of state and local departments and 156 private sector organizations. The cost has escalated similarly, from $3 million in 2000 to more than $16 million for TOPOFF 2 and $21 million for TOPOFF 3.

June's TOPOFF 4 Command Post Exercise, a precursor to next year's full-scale TOPOFF 4 field exercise, centered on a weapons of mass destruction threat to Washington and a detonation in a fictitious West Coast city. More than 4,000 people from all levels of government and other organizations took part in the simulation. The cost of the three-day event, a test of communications for the weeklong field simulation next year, was $3.5 million.

The price tag is one of the many reasons TOPOFF is a lightning rod for criticism.

Wermuth says exercises could be done in 30 cities for the price of a two-city TOPOFF. "If that is its main purpose, to send a signal . . . that we're putting a lot of resources behind [preparedness], that's a policy decision, and I'd be the last to argue," says Wermuth of RAND, which has helped run preparedness exercises for government agencies and is cataloging exercises conducted nationwide. "If the main purpose is to really engage what the name is supposed to indicate, to engage top officials in sitting down and having to make hard decisions about desperate situations, I'd say it's unnecessary to have this level of attention, this scale of an exercise and particularly the expenditure of many, many millions of dollars."

Disaster planning can be a big business. TOPOFF critics speak with derision about the consulting companies often contracted to run and evaluate disaster exercises, saying the simulations and the resulting plans can be cookie-cutter products from a central office, with little tailoring to a specific location. "If you're a consultant, it's pretty easy to go anywhere with these templates on [Microsoft] Word and scratch out 'Boise, Idaho,' and put in 'Orlando, Fla.,' " says Eric Noji, retired associate director of the Bioterrorism Preparedness and Response program at the Centers for Disease Control and Prevention and now director of the Pandemic Avian Influenza Preparedness program at the Uniformed Services University.

Noji and others complain that TOPOFF and other simulations don't seem to be designed to strenuously test the system. All but the first TOPOFF have occurred with advance warning. Critics say emergency responders learn more valuable information for free from real-life false alarms, such as a nerve gas scare in the Senate in February 2006 or a student pilot flying within three miles of the White House in May 2005.

"If you know there's going to be a test of everything, then you've already bought the answer, or you've excluded a lot of the problems," says William Bicknell, a professor at Boston University's School of Public Health and a former Massachusetts public health director.

No Follow-Up

Problems also can be excluded or hidden if participants are less than honest. Some say that's the result when response personnel are put through "tests" with cameras watching instead of in private, where they can fail without embarrassment.

That was a problem during Hurricane Pam, though the tabletop was conducted away from cameras. FEMA turned out to be unable to deliver on many of the promises it made during the simulation. "It happened a lot-the conversation would stop over something like generators or ice, and a FEMA guy would say, 'Look, don't worry about that, we've got contracts in place, you'll get your million gallons of water a day or whatever,' " recounted one Pam participant in Disaster: Hurricane Katrina and the Failure of Homeland Security by Christopher Cooper and Robert Block (Times Books, 2006).

Just funding and requiring agencies to perform crisis simulations isn't enough, critics say. Rigorous and independent evaluation is needed to ensure that exercises provide an accurate portrait of response capabilities and deficiencies. Hopmeier uses the analogy of someone taking his car to a mechanic who examines it extensively, runs exhaustive tests, provides documents detailing all the work that was done, and returns the car to the owner. Then on the way home, the brakes fail and the motorist hits a child.

"It doesn't matter how much money was spent, how many tests were done; they were wrong," says Hopmeier. "All the diagnostics don't matter. You've got a dead body."

Many say government agencies often don't do a good job of following up on lessons or taking corrective action. That information is supposed to be contained in a document called the after-action report, which in many ways is more important than the simulation.

"All these exercises don't mean anything unless there is some type of after-action report, [but] some people in some agencies see the exercise as the end in itself rather than a means to an end," says Carl Osaki, a clinical associate professor of environmental and occupational health at the University of Washington, who has designed several simulations. "A lot of times the findings of the after-action reports require additional training or policy. Sometimes [producing the reports is] time-consuming, or they're costly. So once they hit some of those barriers, the after-action report is sometimes seen as an academic exercise."

After-action reports typically are at least partially made public, so it's perhaps understandable that they omit some weaknesses identified in the exercise (to avoid broadcasting vulnerabilities that terrorists could exploit). "In good organizations, those things get taken care of," says Kerry Fosher, a research associate at the New England Center for Emergency Preparedness in Hanover, N.H. "In bad organizations, those things get swept under the rug."

That appears to have been a large problem with Hurricane Pam. Though many envisioned the tabletop simulation as the beginning of a conversation, according to Cooper and Block, FEMA canceled much of the follow-up work, including answering questions about moving emergency evacuees from short-term housing at the Superdome, for lack of funds.

Meet the Responders

Some vigorously dispute the criticisms. And government does appear to be improving its evaluations. RAND, under contract with Health and Human Services, has created the online Public Health Preparedness Database, including a searchable listing of nearly 40 simulations of terrorist attacks and infectious disease outbreaks. Each drill is graded on five elements. Wermuth says DHS has hired a contractor to evaluate its exercise program and HHS is testing emergency drills for Georgia and other states.

The DHS-funded nonprofit National Memorial Institute for the Prevention of Terrorism established the Lessons Learned Information Sharing Web site (www.llis.gov), which gives registered users access to substantial preparedness information, including after-action reports on various exercises.

DHS runs the Homeland Security Exercise and Evaluation Program (www.hseep.dhs.gov), which attempts to provide a standardized policy, methodology and language for designing, conducting and evaluating exercises. DHS, HHS and the CDC reportedly are beginning to require use of this model before funding exercises. "If all three of them are requiring the same exercise format, that's an enormous shift," Fosher says. "It forces people to build a standardized cohort of folks who can build exercises in their state using a common language."

DHS conducted five hurricane tabletop exercises this spring for regions along the Gulf Coast and East Coast to validate changes to preparedness plans after last year's hurricane season and to identify changes that still need to be made.

Despite the notorious response failures during Hurricane Katrina, Hurricane Pam had some success. According to Cooper and Block, contraflow schemes (reversing one side of a divided highway so all lanes lead out of the city) historically had worked poorly in Louisiana, but evacuations improved markedly after Pam.

Lynn Eden, associate director for research at Stanford University's Center for International Security and Cooperation, sat in on TOPOFF 2 and credits the exercise with DHS' recent practice of raising the security alert level for targeted sectors or areas, rather than uniformly across the country. The leaders during the simulation "really didn't want to go to a national state of red alert because it was completely paralyzing the country," she says. "I think that was a lesson learned."

Eden and others say many of the criticisms of TOPOFF and other simulations are off-base. Providing advance notice to participants is a practical necessity, they say. "The reality is if you're going to run an exercise that's going to involve 10,000, 15,000 people covering multiple states and facilities, there's no practical way to make that go off with no notice," says Paul E. Speer, vice president and director of domestic research at CNA Corp. in Alexandria, Va., which helped evaluate TOPOFF 2 and TOPOFF 3. Speer also disputes criticisms of the evaluations, saying the methodology is detailed and rigorous.

And others say that in addition to raising general awareness of potential threats, these exercises are successful because enduring a simulated crisis is a more effective way of learning than reading a lengthy evaluation or report. The most important benefit, they say, is the opportunity for first responders and public health workers to meet and get to know those with whom they would be working during a crisis. "Ninety percent of what you learn in an exercise you learn before the day of the exercise," says Fosher. "Sitting down with sister agencies and going through each other's plans, expectations and assumptions: 'I thought you were bringing the N95 [respirator] masks.' 'No, it says you bring the N95 masks.' "

Those relationships are paramount, because the official plan often gets jettisoned in the heat of a disaster, Noji says. Of the 82 disasters he was involved in, he never saw the formal plan used. "I was the disaster director at [Johns Hopkins University Hospital], and I didn't even know it," he says. "These things are the size of phone books."