Mix and Match Your Data

nferris@govexec.com

As everyone knows, a picture is worth a thousand words. But the picture won't go into your ordinary relational database, where it could replace or enhance the words and numbers. That's why software developers are turning to a newer generation of technology called object-oriented systems.

To a software expert, an object is almost any digital item: a map, a fingerprint, a video clip or a software program. All of these are readily stored in a computer system, but conventional databases do not allow users to store multiple types of data in a file where they can be retrieved with a single command. When you have digital photos of top agency officials in your database, what happens when you sort the database numerically?

"A picture is a new data type," says C. Forbes Dewey Jr., a Massachusetts Institute of Technology professor who is working on medical records storage for the Defense Department. Medical records are one of the applications causing mainstream database specialists to take a longer look at object technology. The technology has been in use for more than a quarter-century, but only recently have mainstream database software suppliers such as Oracle Corp., Informix Corp., IBM Corp. and Computer Associates International Inc. embraced it.

In contemporary medicine, patient records consist of much more than pieces of paper. There are X-ray films, reports from other diagnostic tools such as ultrasound and electrocardiograms (some of which are best recorded as video), voice records from various procedures and conferences, and so on. As a rule, each kind of record is stored in a separate system. Pulling together the data on a single patient involves checking several types of storage systems.

Because military men and women move often, their medical records typically reside in many hospitals and clinics worldwide. With multiple locations and kinds of systems, it's easy for the records to become lost or inaccessible. DoD is trying to create a single online repository of military medical records that can be accessed anytime, at a battlefield hospital or anywhere else they are needed.

Object technology is the key to a workable system, Dewey says, but purely object-oriented databases are not ready for prime time. Computer Associates, the first large software company to take the object database plunge, released its Jasmine product in late 1997. It's being tested in many organizations but has yet to be applied in a large-scale production environment. A small Defense Advanced Research Projects Agency contractor, InfoPike Inc. of Norwich, Conn., is using Jasmine to store software performance information for an intricate modeling project.

Dewey is a fan of Informix Dynamic Server with the Universal Data Option, widely regarded as the most robust example of object database technology available for full-scale application today. That technology is a hybrid called the object-relational database. It uses the relational model and standard relational tools, such as Structured Query Language, but allows users to extend the database by plugging in new data types. With Informix, modules called DataBlades are the means of adding data types.
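The idea of plugging a new data type into a relational database can be illustrated with a small sketch. It does not use Informix or DataBlades. It assumes Python's built-in sqlite3 module as a stand-in, and the Reading type, the patient table and the sample values are all hypothetical, invented for illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Hypothetical non-standard data type: a labelled series of samples."""
    kind: str
    samples: tuple


def adapt_reading(reading):
    # Python object -> a text value the database engine can store.
    return f"{reading.kind}|{','.join(str(s) for s in reading.samples)}"


def convert_reading(raw):
    # Stored bytes -> Python object (sqlite3 hands converters a bytes value).
    kind, _, body = raw.decode().partition("|")
    samples = tuple(float(s) for s in body.split(",")) if body else ()
    return Reading(kind, samples)


# Register the new type with the database layer, loosely analogous to
# installing a type-extension module.
sqlite3.register_adapter(Reading, adapt_reading)
sqlite3.register_converter("reading", convert_reading)


def demo():
    # PARSE_DECLTYPES tells sqlite3 to apply converters by declared column type.
    conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
    conn.execute(
        "CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT, ecg reading)"
    )
    conn.execute(
        "INSERT INTO patient (name, ecg) VALUES (?, ?)",
        ("Doe", Reading("ecg", (0.1, 0.9, 0.2))),
    )
    # Ordinary SQL retrieves the conventional column and the new type together.
    row = conn.execute(
        "SELECT name, ecg FROM patient WHERE name = ?", ("Doe",)
    ).fetchone()
    conn.close()
    return row
```

Calling demo() returns the patient's name alongside a reconstructed Reading object from a single SQL query. A commercial object-relational server goes much further, for example by letting the engine index and search inside the new type, which this sketch does not attempt.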

Diagnostic Breakthroughs

Although Dewey is using Informix, it cannot yet support all the kinds of analyses he envisions for medical data. For example, he says, suppose you could record a baby's crying over a period of minutes or hours and match the tone and intensity with other medical data, such as the baby's temperature. The result could be a diagnostic breakthrough for infants who are too young to speak with the doctor about their symptoms.

That kind of database is not yet practical, Dewey says, but soon half the data in a patient's medical records will consist of images kept online with the text and numerical records. Managing and retrieving the records won't be accomplished easily with the kinds of commercial software available today. Object-oriented technology will be the answer, he believes.

One of the most compelling reasons to find tools for handling new kinds of data is the World Wide Web. As Web sites get larger and audio and video clips become common there, webmasters need tools for tracking contents. Linking a Web site to a database engine no longer is rare, but keeping multiple data types in the same database still is not common.

Another likely application is map-based information. Geographic information systems are nothing new, but they usually cannot handle all the applications federal, state and local officials dream up. As a result, they often work at the periphery of the organization, rather than being a core information resource. Using a mainstream relational database and adding in geospatial data, along with images and other data, may increase the utility of geographic information, some experts say.

Until now, the cost of storing image data online has deterred agencies from including images and sound in their databases. But storage costs have dropped precipitously, making the prospect of storing big files online more attractive.

Big Databases at Work

Once this unconventional information is online, it's merely a matter of time until users want to link it with their everyday data, as DoD is aiming to do with its medical records. The fact that agencies already depend on their relational databases is one argument for sticking with a hybrid object-relational model, rather than converting to a purely object-oriented system.

"Once you build those terabyte databases," says Jackie McAlexander, a systems engineer with Informix Federal Systems, "you don't want to be changing them." She's referring to systems that hold a trillion letters and/or numbers-an order of magnitude that once seemed likely only at the Central Intelligence Agency but is not so farfetched today.

With the object-relational hybrid, new types of data can be added as needed to a conventional database. An example: NASA gathers and stores a lot of scientific data in a special format called Hierarchical Data Format (HDF). The agency developed an Informix DataBlade that lets it plug HDF data into its relational databases, according to McAlexander.

McAlexander is among those who believe that the hybrid object-relational database is here to stay. "We don't really foresee a pure object environment for the majority of business databases," she says. Purely object-oriented databases so far don't let the users search the data in ways not anticipated by the system designers, she explains. This feature, called ad hoc query capability, was one of the most important advantages offered by relational databases when they arrived on the scene in the early 1980s.
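Ad hoc query capability is easier to appreciate with a concrete case. The sketch below again assumes Python's built-in sqlite3 module as a stand-in for a relational server. The visit table, its columns and every row in it are hypothetical. The point is that neither question was planned for when the table was designed, yet each is answered with a single SQL statement.

```python
import sqlite3


def build_db():
    # A plain relational table; all names and values are made up.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE visit (patient TEXT, clinic TEXT, year INTEGER, temp_c REAL)"
    )
    conn.executemany(
        "INSERT INTO visit VALUES (?, ?, ?, ?)",
        [
            ("Adams", "Norfolk", 1996, 37.0),
            ("Adams", "Ramstein", 1997, 38.6),
            ("Baker", "Norfolk", 1997, 39.1),
            ("Clark", "Okinawa", 1998, 36.8),
        ],
    )
    return conn


def ad_hoc(conn, sql, params=()):
    # Run whatever question the user thinks of; no code change is needed.
    return conn.execute(sql, params).fetchall()


conn = build_db()

# Question 1: how many fever readings did each clinic record?
fevers_by_clinic = ad_hoc(
    conn,
    "SELECT clinic, COUNT(*) FROM visit WHERE temp_c >= ? "
    "GROUP BY clinic ORDER BY clinic",
    (38.0,),
)

# Question 2: which patients have records at more than one clinic?
multi_clinic = ad_hoc(
    conn,
    "SELECT patient FROM visit GROUP BY patient "
    "HAVING COUNT(DISTINCT clinic) > 1",
)
```

A system that supports only the retrieval paths its designers built in would need new programming for each of these questions.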

Besides Informix, IBM Corp. is, to the surprise of many, a leader of the move to object-relational technology. At this spring's FOSE trade show in Washington, IBM was demonstrating how its DB2 database, which originally ran only on mainframes, is competing to handle multiple data types in a modern client-server computing environment.

In the demonstration, a Structured Query Language search found the same phrase in several songs stored in a DB2 Universal Database system. The software is available for computers ranging from Windows 95 notebooks to large-scale multiprocessor Unix systems, and it can connect PC users with mainframe data repositories.

Sybase Inc., another well-known database software maker, has taken a slightly different approach. Instead of working directly with customers to import their unconventional data types into the Sybase Adaptive Server, Sybase has partnered with smaller specialty companies to build modules that handle special data types. Network Imaging Corp. offers a Sybase add-on that handles video files, for instance, and Verity Inc. supplies it with full-text search capabilities. Sybase itself, meanwhile, touts its undivided commitment to improving the power and flexibility of the core database engine.

Where does Oracle fit in this picture? Oracle, by most accounts the world's third largest software company, made its fortune with its relational database and has been a little slow to update it with object capabilities. At least, that's the industry buzz, but company officials like Timothy G. Hoechst, director of the Oracle Government Technology Group, say the company is giving its customers what they want.

"We made a conscious decision not to oversolve the problem" of multiple data types, Hoechst says, insisting Oracle is interested in delivering practical capabilities, "not someone's Ph.D. thesis." Oracle 8, the newest version of the database, preserves the strengths of the relational software while allowing users to add on Data Cartridges to handle spatial, image and video data, as well as functions such as credit card charges and verification. Like Sybase, Oracle has enlisted other companies as its cartridge-creating partners.

Hoechst distinguishes between Oracle's Data Cartridges and other vendors' approaches, saying that inserting new data types too far into the core of the relational database can compromise its integrity. "We've walked that useful line" between leading edge and bleeding edge, he says, adding that most Oracle customers aren't looking to use data objects in their databases. Except for users in research labs and other advanced environments, Hoechst says, "they haven't seen the great value of it."

Building Bridges

Nonetheless, Hoechst readily agrees that there's a need to get systems with dissimilar data types and architectures exchanging and aggregating information. That's why Oracle has embraced middleware, the object-oriented technology that establishes bridges between objects in otherwise incompatible systems. Middleware tends to operate at the level where the information is created, modified and moved, rather than deep within the repository.

That application level "is where we're focusing our energies," Hoechst explains. Instead of asking its customers to rebuild their information systems, he says, Oracle wants to teach those systems to communicate with each other and with new applications. This approach is less elegant technically than embedding the objects within the relational database, and performance may suffer somewhat. But it has the flexibility of modular design. New objects can be added and old ones updated easily.
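The modular bridging approach described here can be sketched in a few lines. This is not Oracle's middleware or any vendor's product. It is a minimal Python illustration, and the Bridge class, the two "legacy" data sets and the fetch functions are all hypothetical.

```python
from typing import Callable, Dict, List

# Two hypothetical legacy systems with incompatible record layouts.
PERSONNEL = [{"EMP_NO": "17", "LNAME": "Adams"}]
MEDICAL = [("17", "x-ray", "1997-03-02")]


class Bridge:
    """Sits between applications and existing systems; stores nothing itself."""

    def __init__(self) -> None:
        self._sources: Dict[str, Callable[[str], List[dict]]] = {}

    def register(self, name: str, fetch: Callable[[str], List[dict]]) -> None:
        # Adding or replacing a source does not touch the other sources.
        self._sources[name] = fetch

    def lookup(self, person_id: str) -> Dict[str, List[dict]]:
        # Ask every registered system and return the answers in one shape.
        return {name: fetch(person_id) for name, fetch in self._sources.items()}


def fetch_personnel(person_id: str) -> List[dict]:
    # Translate the personnel layout into a common dictionary form.
    return [
        {"id": r["EMP_NO"], "name": r["LNAME"]}
        for r in PERSONNEL
        if r["EMP_NO"] == person_id
    ]


def fetch_medical(person_id: str) -> List[dict]:
    # Translate the medical layout into the same common form.
    return [
        {"id": pid, "record": kind, "date": date}
        for pid, kind, date in MEDICAL
        if pid == person_id
    ]


bridge = Bridge()
bridge.register("personnel", fetch_personnel)
bridge.register("medical", fetch_medical)
combined = bridge.lookup("17")
```

Neither underlying system is rebuilt. Each is wrapped by a small translator, which is where the flexibility comes from, and also where the extra layer can cost some performance.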

Database marketers are hoping that the new object-oriented capabilities will lure customers to enhance their systems. "If you have an application that just does straightforward letters and numbers, relational technology still does the job," says Don LeClair, a Computer Associates vice president. But multimedia is becoming commonplace because it adds value in terms of marketing appeal and ease of use.

Marry multimedia with the Web and you get a new environment that is natural, intuitive and ready for the 21st century. "I think it's going to be 10 years of miraculous developments" in new kinds of information systems, MIT's Dewey predicts.