Government faces a data deluge
The federal government is awash in data. And it's expanding at rates faster than chief information officers can count. No one knows exactly how much information agencies have stored in their far-flung databases, but experts say it's a lot. Consider this: By 2015, the world will generate the equivalent of almost 93 million Libraries of Congress in just one year, according to Cisco's Internet Business Solutions Group.
The government is a big player in that information explosion, although how big is not certain. The cost to store and manage the growing mound of data is rising and eating up scarce information technology resources. It's no surprise that the next big IT investment agencies will make in the coming years, if they haven't already started, is in something called virtualized storage. The technology uses software to connect multiple devices into what looks like a single pool of storage capacity, controlled from a central console. The console makes it easier to back up, archive and retrieve data.
With agencies creating more data, storage virtualization is an inevitable part of their IT future. Many operations, from the Congressional Budget Office to the State Department to the U.S. military, are looking for ways to squeeze more efficiency out of their storage systems and drive down costs.
The Census Bureau is looking to virtualize storage to help it manage the 2.5 petabytes of data that ebbs and flows as it conducts the decennial census and vast economic surveys. The data, which amounts to more than the entire collection in all U.S. academic research libraries, is spread across a variety of storage platforms from multiple vendors. But maintaining so many disparate systems is driving up the cost of operating the data centers that house the information.
"We have a very diverse storage architecture, and that diversity doesn't lend itself nicely to be highly efficient from a cost perspective," says Brian McGrath, CIO at the Census Bureau. He says virtualization would create storage platforms that could be shared throughout the bureau to minimize unused capacity and lower operating costs.
Five years ago, the first wave of data center efficiency began with server virtualization. Agencies were able to consolidate 10 or more servers into one, increasing use of available computing power from about 30 percent to as much as 80 percent. But that put new demands on storage and backup systems, which themselves require a lot of server capacity.
"Backing up a virtual server infrastructure becomes a big burden on data centers and their resources," says Fadi Albatal, vice president of marketing with FalconStor Software. "When server utilization rates were 20 percent, servers still had 80 percent available for heavy load processing such as backups. Now that servers have utilization rates of 80 percent it means there's only 20 percent left for all my backup processes."
In the Aug. 1 issue of Government Executive, Carolyn Duffy Marsan looks at the challenge of storing growing piles of federal data.