Re: Economic effects of link-based search engines on e-journals

From: David Goodman <dgoodman_at_PHOENIX.PRINCETON.EDU>
Date: Sun, 8 Oct 2000 23:45:06 -0400

Bill, I think you are taking a perhaps excessively rosy view of this.
People looking for information look in whatever source they can most
easily access, that they are accustomed to, and that they believe will
find at least something (in that order of importance). General web engines
fill these roles admirably. They are rarely the best source--it can often
take examining many more than 50 items to find really relevant material,
and distinguishing the material worth following up from the rest is often
not at all easy. They certainly have their place, but their effective use
requires knowing what that place is, and knowing how to search efficiently.
These are teachable skills, and good librarians have by now learned them.
At least in my experience, some patrons are more willing to learn these
skills from us than they are to learn the conventional library skills.

This is not a plea for a return to conventional indexes--this is a plea
for better web indexes. Except for citation indexing, automated
approaches to the construction of indexes have proven not particularly
useful, and the manual assignment of metadata (o.k.a. indexing terms) has
proven very unreliable and inconsistent except in special fields (e.g.
geographic coordinates, chemical names). This applies to both conventional
and web indexes. The web, of course, possesses the greater potential for
experimentation. Until better web indexes exist, the human searchers'
skills will remain necessary. (They are, to oversimplify a little, the
combination of a knowledge of what's out there to be found, empathy with
the user to determine what is wanted, and common sense and experience in
finding it.)

David Goodman, Princeton University Biology Library
dgoodman_at_princeton.edu 609-258-3235

On Mon, 2 Oct 2000, William Y. Arms wrote:

> Eric,
>
> I found your posting about web links very interesting.
>
> My observation (based on conversations with my colleagues and questioning
> the students in my Cornell classes) is that most of them use general Web
> search engines (notably Google) as their first choice way of looking for
> information. I can only hypothesize about their motivation, but here are
> some possible reasons:
>
> 1. Instant gratification -- If something is identified through Google, a
> single click brings the actual item.
>
> 2. Recall is more important than precision -- Users do not mind scanning
> 10-50 items (many of which are clearly irrelevant), so long as they find
> something. Low precision does not matter with a good ranking algorithm.
>
> 3. Two-step coverage -- The facts that (a) Google indexes a billion items
> and (b) its ranking algorithm emphasizes general materials mean that it
> usually finds a good introduction or a good overview of a topic, which
> often acts as a guide to more detailed information.
>
> However, perhaps the most instructive insight came from a senior member of
> Springer-Verlag. As a good marketing firm, Springer has observed that
> their potential customers are heavy users of web search services.
> Therefore, they are setting up web materials explicitly designed for the
> web crawlers to index.
>
> The potential advantage of additional metadata is to improve the precision
> of searching. This will be increasingly important as the volume of online
> information grows. For this reason, I am an advocate of the Open Archives
> approach. The Open Archives approach also provides a good way to provide
> access to materials that cannot be found by web crawlers (e.g., materials in
> formats other than text, dynamic, held in databases, or under restricted access).
>
> Bill
>
>
> ==================================================================
> William Y. Arms
> Professor of Computer Science email: wya_at_cs.cornell.edu
> Cornell University web: http://www.cs.cornell.edu/wya
> 5159 Upson Hall telephone: 607-255-3046
> Ithaca, NY 14853 fax: 607-255-4428
> ==================================================================
>
Received on Mon Jan 24 2000 - 19:17:43 GMT