Part of The Alchemist's Lair Web Site
Maintained by Harry E. Pence, Professor of Chemistry, SUNY Oneonta, for the use of his students. Any opinions are totally coincidental and have no official endorsement, including from the people who sign my paychecks. Comments and suggestions are welcome (firstname.lastname@example.org).
Last Revised March 17, 2000
Following WWW Search Engine Changes, Harry E. Pence, SUNY Oneonta, Oneonta, NY, email@example.com
It is becoming increasingly common for students to use the World Wide Web as a source of information, even in science courses. This creates several challenges for science faculty members. In addition to teaching students how to effectively use information to create papers and talks, it has become much more important to help students learn how to evaluate data. The WWW is a confusing combination of truth and falsehood, and it can require a discerning eye to distinguish between the two.
In addition, faculty are finding that it is difficult to direct students to the best search engine. Since the search engine is the main tool for finding information on the web, search engine selection can be a key decision in student research. The most widely advertised engines are often relatively poor for scientific searches. Indeed, it sometimes seems that there is an inverse relationship between the amount of advertising and the effectiveness of the engine.
As discussed in an earlier article, there are at least three important criteria that should be used to evaluate search engines: comprehensiveness, currency, and efficiency. Comprehensiveness is a measure of what fraction of the total web sites the search engine actually reviews. Currency measures how often the search engine revisits sites to determine whether or not there have been any changes. Efficiency is determined by whether the most useful sites are not just included but listed early in the search results. Comparing engines based on these criteria is problematic, because the characteristics of engines seem to be constantly changing.
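To make the three criteria concrete, here is a minimal Python sketch of how each one might be reduced to a number. The function names, the choice of "mean days since last visit" as a currency measure, and the top-10 cutoff for efficiency are all illustrative assumptions; they are not taken from the earlier article or from any of the evaluation sites discussed here.

```python
# Hypothetical metrics for comparing search engines; all names and
# definitions below are assumptions made for illustration only.

def comprehensiveness(indexed_pages, estimated_total_pages):
    """Fraction of the estimated web that the engine has indexed."""
    return indexed_pages / estimated_total_pages


def currency_days(days_since_last_visit):
    """Mean number of days since the engine last revisited each sampled page.

    A smaller value means the engine's index is more current.
    """
    return sum(days_since_last_visit) / len(days_since_last_visit)


def efficiency(ranked_results, useful_sites, top_n=10):
    """Fraction of the known useful sites that appear in the first top_n results."""
    top = set(ranked_results[:top_n])
    return len(top & set(useful_sites)) / len(useful_sites)


if __name__ == "__main__":
    # Made-up numbers, chosen only to show how the functions are called.
    print(comprehensiveness(200, 800))
    print(currency_days([10, 20, 30]))
    print(efficiency(["a", "b", "c", "d"], ["a", "d", "z"], top_n=2))
```

Because index sizes, revisit schedules, and ranking methods keep changing, any such numbers would have to be recomputed regularly to stay meaningful.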
During the past few months, several of the main search engines have been competing to attract more traffic to their sites by making claims about the effectiveness of their product. The mode of competition has varied from increasing the size of the engine index (a generally beneficial effort) to publicizing rationalizations of why an engine that has poorer metrics is still preferable (including blatant deception). The purpose of this article is to suggest where an individual can go to obtain unbiased and up-to-date information about search engines.
PRIMARY SOURCES FOR SEARCH ENGINE EVALUATION
A major source of data about the accessibility of science information on the Web is provided by Steve Lawrence and C. Lee Giles from the NEC Research Institute in Princeton, NJ. Their three studies of web search engines are especially valuable because they are mainly concerned with scientific searches. All are available on the web, including the most recent. Since this is apparently an ongoing study, it would be a good idea to check back from time to time to learn if more recent results have become available.
Danny Sullivan's Search Engine Watch seems to be the most extensive source of general information about engines. This site includes a current listing of index sizes. Page down to see that even the largest index covers less than half of all web sites. Sullivan's site also links to many on-line and print evaluations. The only problem with this site is that there is so much information that it is easy to find that you have just clicked away from something that you can no longer find. Either drop bread crumbs as you surf or else use the excellent site-specific search engine that is provided. Sullivan also offers a subscription service that promises even more information, but the part of the site that is open to the general public is an excellent starting point for anyone who is trying to keep up with recent search engine developments.
These sections only scratch the surface of what Sullivan offers. Be sure to look at the informative tutorial on how search engines work, as well as Web Searching Tips, which, as the name suggests, explains how the major engines work and how to maximize the possibility that a given engine will give the best possible results. The tutorials are aimed at all levels of experience, ranging from novice through power searching and Boolean algebra.
The third major resource on index size and number of dead links is Search Engine Showdown, a site maintained by Greg Notess. He factors the dead links data into the index size charts and gets results that look somewhat different from the numbers that are often cited by the search engines themselves. Another feature of Greg's site is Search Engine Inconsistencies. These pages list search engine problems, both temporary and long-term, for four of the most popular engines: AltaVista, Google, HotBot, and Northern Light.
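As a rough illustration of why factoring in dead links changes the picture, the following Python sketch discounts a reported index size by the fraction of sampled links that turn out to be dead. This is only one plausible way to make such an adjustment; the function name and the formula are assumptions, not a description of the method actually used at Search Engine Showdown.

```python
# Hypothetical adjustment of a reported index size for dead links.
# The formula is an assumption for illustration, not Notess's method.

def effective_index_size(reported_size, sampled_links, dead_links):
    """Scale the reported index size by the share of sampled links that are live."""
    if sampled_links <= 0:
        raise ValueError("sampled_links must be positive")
    if not 0 <= dead_links <= sampled_links:
        raise ValueError("dead_links must be between 0 and sampled_links")
    live_links = sampled_links - dead_links
    return reported_size * live_links / sampled_links


if __name__ == "__main__":
    # Made-up numbers: an engine claiming 1000 pages, with 5 dead links
    # found in a sample of 100 results.
    print(effective_index_size(1000, 100, 5))
```

Under this kind of adjustment, an engine with a large advertised index but many dead links can end up with a smaller effective index than a competitor that reports fewer pages but keeps them current.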
SECONDARY SOURCES FOR SEARCH ENGINE EVALUATION
After the big three, there are a number of sites that may not be as generally useful, but do have some interesting features. Search IQ has a listing of more engines than you probably dreamed existed, complete with a rating number (the IQ) and some very frank criticisms.
The portal concept is still a hot topic on the net, and if you wish to compare the features of the various portals (which often appear to be search engines to those of us who think the net is primarily an information source instead of a place to catch suckers) there is a site devoted to these comparisons, called Traffick.com. Even for those who don't care about portals, the set of articles under the general title "Andrew's Metaguide" offers good analysis of topics related to searching.
About.com uses live guides, who write brief reviews, with appropriate links, about topics of general interest. There is a special section dedicated to web searching, and Chris Sherman, the guide, has gathered an impressive variety of useful articles and links.
It seems inevitable that the WWW will continue to evolve rapidly and in unpredictable directions. Search techniques for printed materials may remain relatively constant for years and even decades, but the internet world can change within months or even days. The only way to remain current in the new information environment is to use the environment itself to keep track of new developments. It is hoped that these net references will make this job a little easier.