Search Tools

January 1997


Automated probes find data lost in cyberspace.

Although an exact count is impossible, experts estimate that there are at least several hundred million Web pages out in cyberspace, with thousands more being created every month. This abundance of information is useless unless Net surfers can find what they need quickly and easily. Fortunately, several search technologies are available for locating large amounts of information in very little time.

One of the best-known technologies in this young market is the search engine, an automated program designed to explore and catalog Web sites and to answer simple queries against that catalog. Users can access search engines on the Net via Web browsers and, once there, type in words or phrases on specified topics. Software "spiders" crawl through the Web ahead of time to build the catalog, and algorithm-based search logic then retrieves the requested data from it within a couple of seconds.
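
The catalog-and-query idea can be illustrated with a small inverted index, which maps each word to the pages that contain it. This is only a minimal sketch, not how any particular engine works: the page set, the URLs and the function names `build_index` and `search` are all hypothetical, and a real engine would add ranking, stemming and far larger storage.

```python
from collections import defaultdict

# Hypothetical pages a spider might have fetched ahead of time.
PAGES = {
    "http://example.gov/budget": "Federal budget tables and agency spending",
    "http://example.gov/space": "Shuttle launch schedule and agency news",
    "http://example.org/recipes": "Quick dinner recipes",
}

def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for raw in text.lower().split():
            word = raw.strip('.,;:!?"()')  # drop surrounding punctuation
            if word:
                index[word].add(url)
    return index

def search(index, query):
    """Return the sorted URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())  # keep only pages with this word too
    return sorted(hits)

INDEX = build_index(PAGES)
print(search(INDEX, "agency"))
print(search(INDEX, "agency budget"))
```

Because the index is built before any query arrives, answering a query is a matter of a few set lookups rather than a fresh crawl, which is one plausible reason results can come back in seconds.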

Some engines do exhaustive searches of thousands of Web sites, while others go only to sites that have the most hypertext links pointing to them. Data content also can vary. Several search engines provide full text from every relevant Web page, while others supply summaries, site titles or URLs (Uniform Resource Locators, which serve as Web addresses). Most engines take simple plain-English queries, but a few require users to express search conditions in the more cryptic language of Boolean logic.
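
For readers unfamiliar with Boolean queries, the sketch below shows roughly what the operators AND, OR and NOT do to a set of matching pages. It is an illustration under stated assumptions: the tiny index, the page labels and the function `boolean_search` are hypothetical, the query is read strictly left to right with no parentheses, and NOT is treated as "and not".

```python
# Hypothetical index: each word maps to the set of pages that contain it.
INDEX = {
    "nasa": {"page1", "page2"},
    "shuttle": {"page2", "page3"},
    "budget": {"page1"},
}

def boolean_search(index, query):
    """Evaluate 'term OP term OP term ...' left to right, OP in AND/OR/NOT."""
    tokens = query.split()
    if not tokens:
        return []
    result = set(index.get(tokens[0].lower(), set()))
    i = 1
    while i + 1 < len(tokens):
        operator = tokens[i].upper()
        matches = index.get(tokens[i + 1].lower(), set())
        if operator == "AND":
            result &= matches   # pages containing both terms
        elif operator == "OR":
            result |= matches   # pages containing either term
        elif operator == "NOT":
            result -= matches   # drop pages containing the next term
        else:
            raise ValueError(f"unknown operator: {tokens[i]}")
        i += 2
    return sorted(result)

print(boolean_search(INDEX, "nasa AND shuttle"))
print(boolean_search(INDEX, "nasa OR shuttle"))
print(boolean_search(INDEX, "nasa NOT budget"))
```

Each operator is just a set operation on the lists of matching pages, which is why Boolean queries are precise but less forgiving than plain-English ones.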

Several search engines, such as Excite and InfoSeek, employ a type of artificial intelligence known as fuzzy logic to find Web pages related to keywords even if those exact words are not located on the pages. Since search-engine services are so diverse, the best advice is to sample as many as possible to find the most suitable one. Most are subsidized by advertising, so the services are usually free for the surfing, provided users can avoid the increasing number of busy signals caused by Net congestion.
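
The general idea of matching a keyword that does not appear verbatim on a page can be hinted at with approximate string matching. To be clear, this is a swapped-in, much simpler technique, not the method Excite or InfoSeek uses: it only catches near-spellings and word-form variants, not conceptually related pages. The pages and the function `approximate_find` are hypothetical; `difflib.get_close_matches` is part of Python's standard library.

```python
import difflib

# Hypothetical page texts, keyed by an illustrative page id.
PAGES = {
    "page_a": "search engines catalog sites",
    "page_b": "weather report",
}

def approximate_find(keyword, pages, cutoff=0.8):
    """Return sorted page ids with at least one word close to the keyword."""
    hits = []
    for page_id, text in pages.items():
        words = text.lower().split()
        # get_close_matches returns words whose similarity ratio >= cutoff
        if difflib.get_close_matches(keyword.lower(), words, n=1, cutoff=cutoff):
            hits.append(page_id)
    return sorted(hits)

print(approximate_find("engine", PAGES))  # "engine" is close to "engines"
print(approximate_find("serch", PAGES))   # misspelling of "search"
```

Raising `cutoff` toward 1.0 demands closer matches, while lowering it admits looser ones at the cost of more false hits.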

A more expensive but thorough alternative is to use commercial software packages that run Web searches simultaneously on several engines. The idea behind multiple searches is to pick up sites from one database that another may have left behind. WebSeeker from the Forefront Group, for instance, compiles results from 20 search-engine databases, eliminates duplicate listings and indexes the results. Similar programs include Blue Squirrel's Squrl, Iconovex's EchoSearch and Quarterdeck's WebCompass. The software runs between $100 and $400, depending on the level of sophistication.
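
The compile-and-deduplicate step described above can be sketched in a few lines. This is a minimal illustration, not the logic of WebSeeker or any other product: the engine names, the URLs and the function `merge_results` are hypothetical, and duplicates are detected only by ignoring surrounding whitespace and a trailing slash.

```python
def merge_results(results_by_engine):
    """Combine per-engine URL lists, keeping one entry per distinct URL.

    Returns a dict mapping each URL (in first-seen order) to the list of
    engines that reported it.
    """
    merged = {}
    for engine, urls in results_by_engine.items():
        for url in urls:
            key = url.strip().rstrip("/")  # crude normalization (assumption)
            merged.setdefault(key, []).append(engine)
    return merged

# Hypothetical results from two engines for the same query.
RESULTS = {
    "engine_a": ["http://example.gov/", "http://example.org/a"],
    "engine_b": ["http://example.gov", "http://example.net"],
}

merged = merge_results(RESULTS)
for url, engines in merged.items():
    print(url, engines)
```

Recording which engines returned each site also shows the benefit the paragraph describes: any URL listed under only one engine is a site the other database left behind.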

Another type of search tool is the Net directory, which is essentially an electronic Yellow Pages of Web sites. Directories such as Yahoo categorize Web sites based on descriptions submitted by organizations when the sites are registered. As with search engines, the services offered by Net directories vary widely. Some simply list URLs under categories and subcategories, while others also include some text in their listings. A few even rank sites according to reports on the number of hits they receive. And one, the Four11 Directory, lists nothing but Internet e-mail addresses.

Search technology is also being employed by those building internal enterprise networks known as intranets. Agencies with bulging Web sites, such as NASA, are using database search software to help surfers find information quickly by typing in keywords. Commercial utility packages from companies such as Architext Software, Fulcrum and Verity can be loaded on Web servers to make intranet explorations as swift as those on the Web. Once text and images are obtained, they can be stored in electronic filing systems, such as Excalibur's EFS Webfile, for easy access.

Alta Vista
Digital Equipment Corp.'s Alta Vista search engine features a database of more than 30 million Web pages.
