Let a Thousand Flowers Bloom
But until a rose is a rose in every database, government agencies won't be able to share pruning tips -- or much else.
What's in a name? A lot, it turns out. At least when it comes to sharing data across different computer systems unable to know that gender is the same as sex. Examples abound. I like defendant, you prefer offender. No calling the whole thing off, though -- information sharing's a must.
The problem of differing data definitions stored within different database structures isn't going to be solved soon. Conceptually, an answer exists, and has for a few years now. It's called the National Information Exchange Model.
Built on a Justice Department data sharing project for use by federal, state and local law enforcement officers, NIEM is more ambitious in scope. A joint project between the Homeland Security Department and Justice since 2005, it aims to enable information sharing between entities as far apart as first responders on Indian land and the federal intelligence community.
The how is through metadata, "an external description of a distinct data resource," according to metadata guru Michael Daconta. He had a large role in creating NIEM at the Homeland Security Department and now serves as vice president of enterprise data management at Manassas, Va.-based Oberon Associates.
Metadata provides a context for data, describing its content and adding information about its creation and its properties. A song, for example, has melody and harmony, rhythm and instrumentation, characteristics that can be described by metadata.
Look online, Daconta says, at www.pandora.com for an accessible example of how metadata can work. Enter your favorite song into the Pandora engine and out will come similar sounds, based on a search of its song library, which is cataloged by up to 400 distinct song attributes. Good metadata allows search queries to connect dots that otherwise would slip by each other.
Cataloging all government data is too enormous a project even to contemplate. Nor does NIEM attempt it. Instead, it divides the world into seven information domains, such as "emergency management" or "international trade." The heart of NIEM is where the domains intersect, the Venn diagramlike sweet spot at the center of overlap. There lies a set of common data components shared by at least two information domains. Deeper in is a universal set of data components, data so basic that every domain makes use of them.
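To make the idea of shared data components concrete, here is a minimal sketch in Python of two agencies' records being renamed into one common vocabulary. The field names and the SHARED_TERMS table are invented for illustration; they are not actual NIEM element names, and a real exchange would rely on published schemas rather than a hand-built dictionary.

```python
# Hypothetical shared vocabulary. These names are illustrative only,
# not actual NIEM element names.
SHARED_TERMS = {
    "gender": "person_sex",
    "sex": "person_sex",
    "defendant": "subject_name",
    "offender": "subject_name",
}


def to_shared(record):
    """Rename local field names to shared ones; unknown fields pass through."""
    return {SHARED_TERMS.get(key.lower(), key): value
            for key, value in record.items()}


# Two systems describing the same person with different labels.
court_record = {"defendant": "J. Doe", "gender": "F"}
prison_record = {"offender": "J. Doe", "sex": "F"}

print(to_shared(court_record) == to_shared(prison_record))  # prints True
```

Once both records use the same labels, a search written against the shared names can match them, which is the point of agreeing on common components in the first place.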
"We think it's pretty cutting edge stuff," says Kshemendra Paul, Justice's chief enterprise architect and NIEM program manager. The idea is that domains voluntarily will use standardized metadata schemas to transmit information. Grants from Justice and DHS will be contingent on state and local adoption of NIEM metadata.
In practice, NIEM is easier discussed than done. The concept is great, but "there's a lot of concerns on the implementation," says Samuel Ceccola, federal chief technology officer for McLean, Va.-based BEA Systems.
A recent analysis by RABA Technologies, based in Columbia, Md., found software challenges ahead for NIEM and its predecessor program, the Global Justice Extensible Markup Language Data Model, which is known as GJXDM and is being folded into NIEM. Information sharing across government organizations requires an approach known as federated query.
In part, that's because restrictions exist against downloading and caching other organizations' data. Federated query allows users to distribute data searches across multiple databases, as opposed to logging on and manually searching each data source or crafting painful point-to-point interfaces. It also sidesteps the need to construct a massive centralized database brimming with storage, synchronization and security issues. But some of the federated query middleware that RABA tested had difficulty loading GJXDM metadata subsets.
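In rough terms, a federated query sends one search to every participating source and merges what comes back, leaving the data where it lives. The Python sketch below shows only that pattern. The three sources, their contents and the function names are hypothetical stand-ins, and it is not a description of the middleware RABA tested.

```python
# Hypothetical in-memory stand-ins for three agencies' databases.
SOURCES = {
    "county_court": [{"name": "J. Doe", "case": "C-101"}],
    "state_police": [{"name": "J. Doe", "case": "P-202"},
                     {"name": "A. Roe", "case": "P-303"}],
    "corrections": [{"name": "A. Roe", "case": "K-404"}],
}


def search_source(source, name):
    """Query one source in place and tag each hit with where it came from."""
    return [dict(row, source=source)
            for row in SOURCES[source] if row["name"] == name]


def federated_query(name):
    """Send the same search to every source and merge the hits."""
    results = []
    for source in SOURCES:
        results.extend(search_source(source, name))
    return results


for hit in federated_query("J. Doe"):
    print(hit)
```

Nothing is copied into a central store here; each source answers from its own records, and only the matching rows travel.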
The study found that implementing middleware known as an enterprise information integration query broker will not be enough. EII brokers have great difficulty converting the metadata format used by NIEM into the format on which most EIIs are based. What's more, EII brokers don't allow for asynchronous searches, a must for widespread information sharing. If a query only can return all the search results at once (i.e., in a synchronous manner), blockages will build up.
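The difference between the two styles can be sketched with Python's standard asyncio library. In this hypothetical example, three sources answer at different speeds, and the asynchronous search records each answer the moment it arrives instead of holding everything until the slowest source finishes. The source names and delays are invented for illustration.

```python
import asyncio

# Hypothetical sources with simulated response delays, in seconds.
DELAYS = {"fast_source": 0.01, "medium_source": 0.05, "slow_source": 0.1}


async def query_source(name, delay):
    """Pretend to query one source; the sleep stands in for network time."""
    await asyncio.sleep(delay)
    return name


async def asynchronous_search():
    """Collect each source's answer as soon as it arrives."""
    tasks = [asyncio.create_task(query_source(name, delay))
             for name, delay in DELAYS.items()]
    arrived = []
    for finished in asyncio.as_completed(tasks):
        arrived.append(await finished)
    return arrived


order = asyncio.run(asynchronous_search())
print(order)  # sources listed in the order they responded
```

A synchronous broker would hand back nothing until the slowest source replied. Here the first answer is usable while the others are still in flight.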
Solving the information sharing problem through NIEM, the study concludes, requires an enterprise service bus -- service-oriented architecture middleware. It's a surprising finding, RABA acknowledges.
Usually ESBs aren't associated directly with information sharing but rather with Web services, which are modular software programs, albeit ones based on metadata exchanges. But that's precisely the level at which data aggregation from differently structured databases best takes place. Web services make asynchronous returns possible, according to the study. Justice's Paul says Web services aren't an absolute necessity, but he acknowledges that NIEM "works very well" with them.
Technology probably is the least of the implementation challenges ahead for NIEM. The technology is doable. But ugly issues such as funding raise their heads, doubly so when it comes to sharing resources across agencies.
It's against federal law for one department to augment another's budget, so what happens when Justice builds an enormously popular NIEM service, only to be saddled with other agencies' demands on its infrastructure? Technologists argue that some degree of Web service redundancy, reciprocal funding agreements and other dodges can effectively make service-oriented architecture work within government. But if NIEM founders, people, not bits and bytes, likely will be to blame.