Taking a Flier on Big Data

Airline passengers might soon be subjected to probes more controversial than body scans if the Transportation Security Administration pursues plans to profile passengers based on commercial analytics.

TSA is considering letting private data brokers calculate the threat level of fliers. Agency officials say they expect to finish exploring this approach by the end of this year. Passengers whose digital footprints check out clean wouldn’t have to strip off shoes, overcoats and belts, or unpack laptops and liquids.

Set aside privacy fears about TSA peering into citizens’ gun shopping receipts, pharmacy purchases or online dating activities. Are commercial data aggregations even accurate? No one knows. Not the Federal Trade Commission, nor the Justice Department. Not the data brokers. And not the people being tracked—who typically can’t even see their own records.

Personal information gathered in commercial forums could prove valuable for public safety, authorities and privacy advocates agree. But they also say judging citizens based on outdated or inaccurate underlying data could do more harm than good for society.

TSA’s thinking is that a company would aggregate biographic and biometric “nongovernmental data elements to generate an assessment of the risk to the aviation transportation system that may be posed by a specific individual,” states a Jan. 8 request for strategy suggestions.

The system would have to provide a “reliable method that effectively identifies known travelers, based on a sound analysis and the application of an algorithm that produces dependable results,” the work requirements state.

Fliers and most TSA officials would be in the dark about the data those algorithms are munching on. “The specific sources and types of information employed for pre-screening purposes under this initiative may not be publicly disclosed,” agency documents state, adding that the data will not be disclosed to TSA except during audits. The quality requirements are vague: The vendor must use “specific sources of current, accurate and complete nongovernmental data.”

Spotty Track Records

Increasingly, big data taps the same kinds of digital evidence for authorities as it does for marketers: social media posts, voter registrations, credit reports and clickstreams (Web browsing histories), to name a few.

The FTC in December 2012 ordered nine data brokers to report whether their company “monitors, audits, or evaluates the accuracy of personal data” used to target advertising. Commission officials, however, say they are not inquiring about the accuracy of personal data used to track criminals. “Our focus is on consumer privacy and commercial data practices—rather than the use of commercial data for law enforcement purposes,” says Peder Magee, senior attorney for FTC’s Division of Privacy and Identity Protection. 

It’s too late anyway. Justice’s Bureau of Alcohol, Tobacco, Firearms and Explosives already uses big data to predict gun violence. Justice officials would not comment on how they measure the integrity of this information.

The consequences of relying on dubious statistics and computations can vary. Some researchers suggest that a few mistakes won’t affect results because the scope of these analyses is so huge. “We can accept some messiness in return for scale,” Viktor Mayer-Schönberger and Kenneth Cukier write in their book, Big Data (Eamon Dolan/Houghton Mifflin Harcourt, March 2013). “We’re willing to sacrifice a bit of accuracy in return for knowing the general trend. Big data transforms figures into something more probabilistic than precise.”

However, the level of precision that satisfies marketers is very different from the exactitude required by government agencies, says Jennifer Granick, director of civil liberties at Stanford University’s Center for Internet and Society. “You can have 15 percent accuracy for advertising,” which might be better than other forms of behavioral analyses, “but if you are getting 85 percent of it wrong when you are denying people government benefits or sending out police to interview them, that would be completely wasteful and dangerous,” she says.

One major concern among some law enforcement experts is that most data warehouses store obsolete records. “The biggest problem is they don’t update,” says Paul Wormeli of the Integrated Justice Information Systems Institute, a federally funded organization. A citizen’s profile is not automatically adjusted if a credit report or human resources form turns out to have been mistyped. 

The Data Police

There’s no easy answer to the potential accuracy problem with big data. 

Directing a government agency, or even a bunch of agencies, to regulate data quality would be nearly impossible and futile, information management experts say. Plus, the private sector has a financial incentive to tidy up a person’s entry: the aggregator market competes on the sharpness of its databases. People should have the ability “to correct it and to remove it if the info is sensitive,” says Craig Wills, a computer science professor at Worcester Polytechnic Institute. Still, fixes made to one database don’t always carry over to other systems relying on the same information, Granick notes. “There’s no right to access the profile that whatever advertisers of the world have compiled on me,” she says. “Amazon has a profile on me and what they think I like, and I can refine it, but I can’t get a copy.”

Marketing firms argue that new industry guidelines letting Internet users opt out of online tracking address many of these problems. Principles adopted by the Digital Advertising Alliance, whose members include Datalogix, Acxiom and other data wholesalers, prohibit browsing histories from being used to determine eligibility for employment, health care treatments and insurance coverage. “To date it’s proven to work. We have very broad reach and people are following it,” says Stuart Ingis, counsel for the alliance.

Unlike the data mining industry, credit bureaus are required by law to correct commercial data. 

Credit information is updated every 30 days, or each payment cycle, according to the Consumer Data Industry Association. Citizens are responsible for communicating name and address changes to lenders, who furnish those modifications to the bureaus. “The furnisher may have more up-to-date address information than the post office,” says Norm Magnuson, the association’s vice president of public affairs.  

So for TSA and other agencies, vetting the accuracy of big data will be nothing short of a big challenge.
