How the CDC's Trailblazing Use of Analytics Fights Disease

Data helps detect foodborne illnesses faster than ever.

The Partnership for Public Service and IBM Center for The Business of Government this month issued “From Data to Decisions III: Lessons from Early Analytics Programs,” which examines successful early government users of data to see how they got started, what sustained them and how the data was used to improve mission-critical programs.

The report identifies lessons learned to help federal leaders and managers avoid pitfalls, instill analytics faster and move more efficiently and effectively to create data-driven cultures.

The following case study, the first in a series of three excerpted from the report, examines how the Centers for Disease Control and Prevention uses data to more quickly detect foodborne illness outbreaks.

CDC and State Health Teams Use DNA Fingerprints to Collar Bad Bacteria

Between Nov. 1992 and Feb. 1993, four children died and 732 people became ill after eating E. coli–contaminated hamburgers served at Jack in the Box restaurants in Washington, Idaho, California and Nevada. The outbreak inspired a national effort to speed detection of foodborne illnesses.

In 1994, the Centers for Disease Control and Prevention (CDC) and the Association of Public Health Laboratories (APHL) began work on a database containing the DNA fingerprints of the E. coli bacterium.

By 1996, the database, known as PulseNet—the same name as the network of state laboratories working with CDC—was up and running, processing 154 bacterial DNA samples and identifying several multistate outbreaks in its first year.

At launch, PulseNet worked with four public health laboratories and tracked a single pathogen. By 2012, it included 87 labs and was tracking eight pathogens.

Using equipment purchased with CDC grants, the labs process fecal, blood or urine samples from sick patients and extract cultures of bacteria for DNA fingerprinting. The prints go to PulseNet for analysis. If samples from different patients match, they form a cluster, which indicates a possible outbreak. The goal is to identify the source of the bacteria and stop the outbreak.
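The matching step described above can be illustrated with a minimal sketch. It is not PulseNet's actual software: the function name, the sample labels and the use of exact string equality to decide that two fingerprints "match" are all assumptions made for illustration, and real fingerprint comparison is more involved. The sketch groups samples by fingerprint and flags any pattern shared by two or more different patients as a possible cluster.

```python
from collections import defaultdict


def find_clusters(samples, min_patients=2):
    """Return fingerprint patterns shared by at least `min_patients` distinct patients.

    `samples` is an iterable of (patient_id, pattern) pairs. Both fields are
    hypothetical labels; exact equality stands in for a real fingerprint match.
    """
    patients_by_pattern = defaultdict(set)
    for patient_id, pattern in samples:
        # A set ignores repeat samples from the same patient.
        patients_by_pattern[pattern].add(patient_id)

    return {
        pattern: sorted(patient_ids)
        for pattern, patient_ids in patients_by_pattern.items()
        if len(patient_ids) >= min_patients
    }


# Made-up example data, not real surveillance records.
samples = [
    ("patient-1", "PATTERN-A"),
    ("patient-2", "PATTERN-B"),
    ("patient-3", "PATTERN-A"),
    ("patient-1", "PATTERN-A"),  # second sample from the same patient
]

clusters = find_clusters(samples)
print(clusters)
```

Counting distinct patients rather than raw samples reflects the article's wording that a cluster forms when samples "from different patients" match. A flagged cluster only signals a possible outbreak; investigators still have to trace the source.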

PulseNet-certified Food and Drug Administration (FDA) and Agriculture Department (USDA) laboratories also use PulseNet to track pathogens collected from food or animals in an attempt to catch illness-causing bacteria earlier, before they infect people.

These partnerships flourish because CDC meets regularly with FDA and USDA to discuss data standards. CDC branch chief Ian Williams reflected: “Like any good relationship or marriage, it requires working together to identify and resolve problems as they come up, and we’re good at that.”

In 2011, officials traced Listeria to cantaloupes from a farm in Colorado. It was the deadliest foodborne disease outbreak in the United States in almost 90 years—causing 29 deaths. In just 10 days, officials spotted an unusual increase in Listeria cases in local hospitals, identified contaminated cantaloupes as the source and issued a national consumer warning.

“Up to twice as many would have been infected had officials not had the tools, people and systems in place,” estimated CDC deputy director Robert Tauxe.

The CDC estimates that 47.8 million people a year get foodborne illnesses, resulting in 127,839 hospitalizations and more than 3,000 deaths. The annual economic burden of foodborne illness ranges from $51 billion to more than $77 billion. Ohio State University’s Robert Scharff estimated that PulseNet costs about $10 million a year and saves $291 million.

In an era of continuing fiscal uncertainty, PulseNet must continue to demonstrate its value. The CDC is working on developing sequencing technology for genetic material that doesn’t require a pure culture of bacteria.

Whole genome sequencing—mapping an entire strand of DNA—offers PulseNet a tantalizing opportunity. Genome sequencing produces more DNA data than the current testing, which would increase the speed and accuracy of PulseNet’s bacteria identification.

As sequencing becomes further automated and problems with transmitting and storing its large images are solved, it will allow CDC to include in PulseNet’s database all known pathogens instead of the eight currently tracked.

Lara Shane is VP for Research and Communications at the nonprofit Partnership for Public Service. To download the From Data to Decisions III: Lessons from Early Analytics Programs report, visit