
Strategic Reading, Ontologies, and the Future of Scientific Publishing

Allen H. Renear and Carole L. Palmer


Science 325, 828 (2009);
DOI: 10.1126/science.1157784


Downloaded from www.sciencemag.org on July 31, 2012



The following resources related to this article are available online at www.sciencemag.org (this information is current as of July 31, 2012):

A correction has been published for this article at:
http://www.sciencemag.org/content/326/5950/230.1.full.html
Updated information and services, including high-resolution figures, can be found in the online
version of this article at:
http://www.sciencemag.org/content/325/5942/828.full.html
A list of selected additional articles on the Science Web sites related to this article can be
found at:
http://www.sciencemag.org/content/325/5942/828.full.html#related
This article cites 28 articles, 4 of which can be accessed free:
http://www.sciencemag.org/content/325/5942/828.full.html#ref-list-1
This article has been cited by 4 article(s) on the ISI Web of Science
This article has been cited by 10 articles hosted by HighWire Press; see:
http://www.sciencemag.org/content/325/5942/828.full.html#related-urls
This article appears in the following subject collections:
Scientific Community
http://www.sciencemag.org/cgi/collection/sci_commun

Science (print ISSN 0036-8075; online ISSN 1095-9203) is published weekly, except the last week in December, by the
American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. Copyright
2009 by the American Association for the Advancement of Science; all rights reserved. The title Science is a
registered trademark of AAAS.
REVIEW

Strategic Reading, Ontologies, and the Future of Scientific Publishing

Allen H. Renear* and Carole L. Palmer

Center for Informatics Research in Science and Scholarship, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, Champaign, IL 61820, USA.
*To whom correspondence should be addressed. E-mail: renear@illinois.edu

The revolution in scientific publishing that has been promised since the 1980s is about to take place. Scientists have always read strategically, working with many articles simultaneously to search, filter, scan, link, annotate, and analyze fragments of content. An observed recent increase in strategic reading in the online environment will soon be further intensified by two current trends: (i) the widespread use of digital indexing, retrieval, and navigation resources and (ii) the emergence within many scientific disciplines of interoperable ontologies. Accelerated and enhanced by reading tools that take advantage of ontologies, reading practices will become even more rapid and indirect, transforming the ways in which scientists engage the literature and shaping the evolution of scientific publishing.

The 1980s abounded in descriptions of a coming new world of scholarly communication, predicting functionality that we knew was possible and would soon be technologically feasible. This imagined world, which was never fully realized, predicted advanced navigation; discipline-specific intelligent tools for searching, browsing, and analysis; reader-initiated hypertext linking; “live” data-driven diagrams; computationally available information objects; searchable indexed annotations; thorough-going interoperability; and so on. Substantial improvements in hardware and software and an infrastructure of networked communications now make this anticipated functionality possible. Lying at the heart of the changes taking place is an escalation of strategic reading practices.

Scientists have always read strategically, working with many articles simultaneously to search, filter, compare, arrange, link, annotate, and analyze fragments of content. Now, however, two important trends are interacting to support and intensify the effectiveness of these practices. The first is the wide-scale use by scientists of digital indexing, retrieval, and navigation resources (such as PubMed, Web of Science, the ACM Digital Library, NASA’s Astrophysics Data System, CiteSeer, Scopus, and Google Scholar) to exploit large quantities of relevant information without reading individual articles. The second is the emergence within many scientific disciplines of ontologies for representing and linking scientific data. This convergence of digital resources and data-linking ontologies will result in even more rapid and indirect use of the literature, supported not only by text mining (1) and literature-based discovery applications (2), but by “ontology-aware” strategic reading tools as well.

Why Will the Revolution Happen Now?

When it was launched in 1992, the Online Journal of Current Clinical Trials, jointly designed by the American Association for the Advancement of Science and the OCLC Online Computer Library Center, was seen by some as the beginning of the long-awaited world of advanced digital publishing. However, the journal failed to flourish, and the new world did not materialize. In retrospect, we can see that in the early 1990s, none of the basic conditions required for an advanced scientific publishing system existed. Not only were the basic technology and infrastructure inadequate, but the entire publishing system also would have required extensive coordinated changes.

Although there was no revolution, an important transformation did take place in the 1990s. In 1993, very few scientific, technical, and medical (STM) journals had an electronic version, and yet by 2003, virtually all of them did. For the daily work routines of most scientists, that new format had already become more important than print. The system of digital publishing that emerged from 1993 to 2003 was impressive in some respects, but was still largely another case of new technology compromised by imitation of the old.

Fig. 1. Increase in number of papers published each year in biomedicine and in one specialized topic, the cell cycle. Plotted series: total MEDLINE abstracts; papers published on cell cycle. Axes: Count (vertical), Year (horizontal, 1950 to 2000). [Adapted with permission from MacMillan Publishers, Nature (5), copyright 2006]

The reasons a more radical change failed to occur are understandable in retrospect, and they also suggest why we are now on the cusp of a larger change. None of the developments during this period required costly or uncertain changes in workflow and production processes, software tools, user behavior, or business models. STM publishers were already creating Adobe PostScript files for print production. These could be automatically converted to the Adobe Portable Document Format (PDF), which was suitable for distribution over the existing Internet and could be browsed with existing free software applications. Users, in turn, were presented with a printlike experience that was at once familiar and yet had additional advantages, including Internet delivery, digital storage, full-text searching, and local printing, all of which were easily realized with existing technologies. Hence, as its value became apparent, PDF-based digital STM publishing emerged relatively quickly, with few changes in production or existing infrastructure.

Since 1992, processor speeds, memory, storage, and bandwidth to the desktop have undergone enormous improvements, as have connectivity and costs. Standard protocols for network communication have been adopted, new software tools and software engineering strategies have emerged, and there is now a supporting infrastructure of information professions and institutions. The pervasive use of the World Wide Web via intuitive Web browsers is an especially visible change, and the widespread use of Extensible Markup Language (XML), with its associated standards and technologies, provides a foundational framework for storing, processing, and presenting information on the Web. In addition, an important recent development is the convergence within the STM publishing community on a single XML schema for the representation of scientific articles: the National Library of Medicine (NLM)’s Journal Archiving and Interchange Tag Suite (3).

828 14 AUGUST 2009 VOL 325 SCIENCE www.sciencemag.org
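As a rough illustration of what a shared XML encoding of articles makes possible, the sketch below parses a small article fragment with Python's standard library and lists its title and section headings. It rests on several assumptions: the element names follow the general pattern of the NLM tag suite, but the fragment is heavily simplified and has not been validated against the actual schema, and `ARTICLE_XML` and `outline` are names invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified article fragment. The element names follow
# the general pattern of the NLM tag suite; this is NOT a validated instance.
ARTICLE_XML = """
<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>Strategic Reading, Ontologies, and the Future of Scientific Publishing</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <sec><title>Why Will the Revolution Happen Now?</title><p>(text omitted)</p></sec>
    <sec><title>How Are Scientists Working with the Literature?</title><p>(text omitted)</p></sec>
  </body>
</article>
"""


def outline(xml_text):
    """Return (article title, list of section headings) for an article fragment."""
    # Parsing needs no schema: the tree structure is recoverable from the markup.
    root = ET.fromstring(xml_text.strip())
    title = root.findtext("front/article-meta/title-group/article-title")
    sections = [sec.findtext("title") for sec in root.findall("body/sec")]
    return title, sections


if __name__ == "__main__":
    title, sections = outline(ARTICLE_XML)
    print(title)
    for heading in sections:
        print(" -", heading)
```

Because the structure is explicit in the markup, a reading tool could in principle jump straight to a section of interest rather than treating the article as an undifferentiated page image.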


The driving force for change remains the same: the growing quantity and complexity of information in combination with limited time for reading. But in some disciplines, we seem to be past the point where any further specialization of research focus or elaboration of collaborative relationships is effective (4, 5) (Fig. 1). Just as the increased quantity of information and general intensity of scientific activity is reaching the point where it cannot be sustained with current practices, technology and user behavior are making new practices feasible, and research scenarios that a decade ago were utopian are now widely anticipated by practicing scientists. P. Bourne, a Public Library of Science journal editor, offers this vision of the near future:

“the scientific literature will seamlessly provide annotation of records in the biological databases. Imagine reading a description of an active site of a biological molecule in a paper, being able to access immediately the atomic coordinates specifically for that active site, and then using a tool to explore the intricate set of hydrogen-bonding interactions described in the paper.… Alternatively, if you are starting with the data … viewing the chromosome location of a human single-nucleotide polymorphism associated with a neurological disorder, … immediately access a variety of papers ranked in order of relevance to your profile … pinpointing the reference to the single-nucleotide polymorphism in the full-text article” (6), p. 179.

This sort of information gathering goes well beyond conventional digital publishing and reveals why the current state of affairs has failed to meet some expectations. As chemists P. Murray-Rust and H. S. Rzepa remarked in 2004, “The current transition to [PDF-based] e-journals seems to be welcomed by many – but not us … a cultural change in our approach to information is needed” (7).

How Are Scientists Working with the Literature?

Scientists have always strived to avoid unnecessary reading. Like all researchers, they use indexing and citations as indicators of relevance, abstracts and literature reviews as surrogates for full papers, and social networks of colleagues and graduate students as personal alerting services. The aim is to move rapidly through the literature to assess and exploit content with as little actual reading as possible. As indexing, recommending, and navigation have become more sophisticated in the online environment, these strategic reading practices have intensified.

Now, as scientists search and browse, they are making queries and selecting information in much tighter iterations and with many different kinds of objectives in mind, almost as if they were playing a fast-paced video game. They sweep through resources, changing search strings, chaining references backward and citations forward, dodging integrator and publisher sites to find open-access copies, continually working to reduce the number of clicks required for access. By note-taking or cutting and pasting, scientists often extract and accumulate bits of specific information, such as findings, equations, protocols, and data. In this process, rapid judgments are made (such as assessments of relevance, impact, and quality) while search queries are being formulated and refined (Fig. 3).

The goal often seems to be undifferentiated assimilation of information about a domain or a problem at hand, and the online experience may be highly valuable, even though no clear aim is met and no articles to read are located. In a compelling analogy, Nicholas et al. (8) describe a “slightly irritated” father watching his young daughter flick from channel to channel while watching television:

“[the] father asks … why she cannot make up her mind and she answers that she is not attempting to make up her mind but is watching all the channels. … gathering information horizontally, not vertically” (8), p. 40.

And they conclude:

“Now we see what the migration from traditional to electronic sources has meant in information seeking terms. We are all bouncers and flickers, and the success of Google is a testament to that, with its marvelous ability to enhance and amplify this flicking and bouncing (like a really good remote)…. In the past, information seeking was seen to be the first step to creating knowledge. Now … it is a continuous process” (8), pp. 41–42.

Just as the aim of channel surfing is not to find a program to watch, the goal of literature surfing is not to find an article to read, but rather to find, assess, and exploit a range of information by scanning portions of many articles. This behavior is common among scientists (9).

Longitudinal studies of e-journal use confirm that scientists are indeed “reading” more papers at a faster pace (10). That is, the total time spent reading journal articles has risen only a little, whereas the number of journal articles read per year has gone up much faster and appears to be growing still. The number of articles read (as distinguished from those merely browsed) by scientists was ~50% higher in 2005 than in the mid-1990s. Furthermore, though the average reading time per article did not change much from 1977 to the mid-1990s (48 versus 47 min), it started falling in the mid-1990s and is now just over 30 min per article (Fig. 2). At the same time, identifying papers by searching online increased more than fourfold between 1977 and 2005. These changes in journal use are far greater in STM disciplines than the averages over all disciplines, suggesting that as work with the literature has moved online, scientists are scanning more and reading less.

Fig. 2. Increase in the number of papers read by scientists per year and decrease in minutes spent reading each paper, trends based on a series of survey studies conducted by Tenopir et al. between 1977 and 2005 (10, 34, 35). Plotted series: average number of articles read per year; average minutes spent reading per article. Axes: Count (vertical, 0 to 300), Year (horizontal, 1977 to 2005).

Early digital library research also showed how scientists scan individual printed journal articles to identify key components (such as tables of contents, references, figures, formatted lists, equations, and scientific names) for quick review and absorption of information (11, 12). More recent studies of the research process have emphasized the varied ways in which scientists work with information (13, 14). The literature is scanned not only to position new findings in cognate fields and learn about collaborators’ domains, but also to monitor the progress of peers and competitors. Information is collated to compare measurement and instrumentation details; it is also used to compile personal collections in evolving areas of interest and to extract the facts and evidence needed to build databases. These are all aspects of strategic reading, a robust, well-entrenched behavior that is vastly more efficient in the digital realm and is thus a promising target for digital support.

How Is Scientific Information Being Represented?

Structured terminologies for representing scientific data, along with standard XML-based techniques for defining and using these terminologies, are forming the basis for new types of scientific publishing. Although computer-processible scientific terminologies range from simple standardized vocabularies to sophisticated formal systems with logical axioms, we have called all of them ontologies.

Ontologies are particularly prominent in the biological sciences (15, 16). One example of rapid adoption is the Gene Ontology (GO) (17), which started in 1998 to support the annotation of genes and gene products and is now very widely used, containing more than 25,000 terms and 3.3 million annotations. Although many biological ontologies were originally developed independently, the need for interoperability has driven collaboration, a good example being the Open Biomedical Ontologies (OBO), which currently has 54 participating projects (18), including Microarray Gene Expression Data (MGED), BioPAX, for biological pathways data, and Foundational Model of Anatomy (FMA).

Although the size, complexity, and logical design of scientific ontologies may vary, a partial description of GO, drawing on examples from the GO introductory material, will illustrate some of their general features (19, 20). GO consists of three separate ontologies: (i) molecular function, (ii) biological process, and (iii) cellular component. Within each of these, terms are uniquely identified, defined, and related in a network of “is a” relationships (e.g., a nuclear chromosome is a chromosome). GO also contains the relationship “part of” (e.g., periplasmic flagellum is part of periplasmic space), and recently, the relationship “regulates” and subtype relationships “positively regulates” and “negatively regulates” were added. These relationships have logical features; for instance, “is a” and “part of” are transitive (e.g., if X is part of Y and Y is part of Z, then X is part of Z). It is easy to see not only how the controlled vocabulary of a shared ontology can facilitate the integration of data from multiple sources, but also how relationships such as “is a”, “part of”, and “regulates” can support other information management tasks as well, including information retrieval and text mining, error checking, and automated inferencing.

Neither controlled vocabularies nor even logic-based ontologies are entirely new, although the enormous increase in the amount and complexity of biological data makes such organizational strategies increasingly urgent. Now, however, we can make ontologies and their applications computationally available and interoperable through well-supported standards associated with the Internet and World Wide Web.

In 1998, as work began on GO, the World Wide Web Consortium (W3C) released XML (21), a metalanguage for defining markup languages (22) for representing information on the World Wide Web. Originally designed for document-oriented languages, XML was soon used for other kinds of information as well. XML languages are defined by a computer-readable schema, which specifies, among other things, the terms of the markup language and the ways those terms can be arranged in valid documents. XML organizes information as a hierarchical structure (an “ordered tree”) of labeled nodes and attribute/value pairs and represents that structure in a linear format readable by both humans and computers. Software can read data in this format and construct the correct tree structure, even without the schema that defines the language (this is a virtue of XML). If a schema is available, additional processing is possible, such as verifying that the data are complete and correctly organized; a schema can also configure editing tools so that human coders are only offered legal coding options, making coding easier and syntax errors impossible. In just 10 years, XML and related supporting software and standards have come to dominate information representation in networked environments: all popular Web browsers support XML, and most major database systems import and export XML-formatted data.

Although using XML to declare and apply a terminological vocabulary improves interoperability and access to software applications, it does have some limitations. XML schemas specify syntax, not semantics (23). An XML schema does not itself indicate how to interpret portions of a particular XML tree structure in terms of scientific assertions, nor is it, alone, suitable for defining logical relationships among terms. That information must be recorded in the natural language documentation for the schema, but then it is unavailable for computer processing. To address this problem, the semantic Web languages Resource Description Framework (RDF), Resource Description Framework Schema (RDFS), Web Ontology Language (OWL), and Semantic Web Rule Language (SWRL) were developed (24). These are computer-processible knowledge representation languages that provide a standard technique for defining ontologies and expressing assertions that use terms from those ontologies. Although technically independent of any particular computer-encoding format, RDFS and OWL each have a standard XML syntax that is now well-supported by software applications and widely used for ontology representation.
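The transitivity of “is a” and “part of”, and the idea of machine-processible assertions built from ontology terms, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it stores assertions as plain (subject, relation, object) tuples in the spirit of RDF triples rather than using any actual RDF or OWL toolkit. The first two assertions are the GO examples quoted in the text; all the others, along with the names `TRIPLES`, `TRANSITIVE`, and `infer`, are hypothetical and chosen only so that there is something to infer.

```python
# Assertions as (subject, relation, object) tuples, in the spirit of RDF triples.
TRIPLES = [
    ("nuclear chromosome", "is_a", "chromosome"),                # GO example from the text
    ("periplasmic flagellum", "part_of", "periplasmic space"),   # GO example from the text
    ("chromosome", "is_a", "cellular component"),                # hypothetical, for illustration
    ("periplasmic space", "part_of", "cell envelope"),           # hypothetical, for illustration
    ("process A", "regulates", "process B"),                     # hypothetical, for illustration
    ("process B", "regulates", "process C"),                     # hypothetical, for illustration
]

# Only these relations are treated as transitive in this sketch.
TRANSITIVE = {"is_a", "part_of"}


def infer(triples):
    """Return the given triples plus every triple implied by transitivity."""
    known = set(triples)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for (a, rel, b) in list(known):
            if rel not in TRANSITIVE:
                continue
            for (c, rel2, d) in list(known):
                # a rel b and b rel d together imply a rel d
                if rel2 == rel and c == b and (a, rel, d) not in known:
                    known.add((a, rel, d))
                    changed = True
    return known


if __name__ == "__main__":
    for triple in sorted(infer(TRIPLES) - set(TRIPLES)):
        print("inferred:", triple)
```

The “regulates” assertions are deliberately left unchained, since the sketch treats only “is a” and “part of” as transitive. A production system would more plausibly declare such properties in the ontology itself and leave the inference to a reasoner.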
Today, an emerging infrastructure of education, research, conferences, organizations, and software tools is sustaining the development and adoption of scientific ontologies and providing opportunities for coordination to improve interoperability and share best practices. Particularly important for biology are the National Center for Biomedical Ontology, OBO, and the International Society for Biocuration, as well as more broadly defined organizations such as the National Center for Biotechnology Information and the European Bioinformatics Institute. One notable software application for ontology development is the widely used and well-supported Protégé ontology editor.

Fig. 3. Current work with digital resources. Diagram labels: Searching, browsing, chaining, linking; Scanning, assessing; Scanning, gathering; Filtering; Extracting, comparing, arranging, analyzing, annotating.

How Can Ontologies Help Scientific Publishing?

Originally motivated by the need for data integration, scientific ontologies are now being explored for STM publishing to support information retrieval and text mining, with applications for hypothesis generation and knowledge discovery well underway. Nevertheless, reading-like engagement with scientific articles is not likely to disappear entirely: The natural language prose of scientific articles provides too much valuable nuance and context to be treated only as data (25). Scientists may have moved well beyond traditional reading, but they still remain engaged with the narrative of scientific articles and need tools to help them read, and not only mine, that narrative.

The integration of ontologies into the scientific literature has been recommended by leading scientists (26–28), and the current generation of ontology-based text mining and retrieval tools in the biomedical sciences is already taking advantage of natural language processing and databases of annotations (5, 29, 30). One example is Textpresso, an ontology-based mining and retrieval system that works with prepared collections of articles, split into sentences and annotated with terms from 33 ontology categories, three of which correspond to the GO ontologies (31). Results screens present a ranked list of sentences within a ranked list of articles, with term highlighting, and links to articles and external databases (Fig. 4, top). Reading the sentences of an article in relevance order rather than narrative order is an example of strategic reading within an article. An example of strategic reading across a collection is provided by Information Hyperlinked over Proteins (iHOP), which uses genes and proteins to create a network of sentences and abstracts for searching and navigating MEDLINE abstracts (32). The iHOP database processes abstract sentences using National Center for Biotechnology Information taxonomy identifiers and the Medical Subject Headings (MeSH) thesaurus and supplies pages of configurable results, in ranked lists of sentences retrieved from many abstracts (Fig. 4, bottom).

Fig. 4. Two examples of ontology-aware text mining/retrieval systems that support strategic reading. (Top) Textpresso (www.textpresso.org/) and (Bottom) iHOP (www.ihop-net.org/).

Unlike similar explorations in the 1980s and 1990s, these are not computer science experiments or pilot projects requiring substantial investment and large upfront changes in infrastructure and practices to scale them up for general use. These are projects that are already producing practical and widely used tools.

How Do We Support and Shape These Changes?

The infrastructures and services to support strategic reading practices will no doubt be promoted by open access and alternative publishing models, which are already being widely discussed in the academic community. However, research on information behavior and the use of ontologies is also needed.

Traditional approaches to evaluating information systems, such as precision, recall, and satisfaction measures, offer limited guidance for further development of strategic reading technologies. Finer-grained methods that analyze what scientists actually do and value are required if we want to understand the nearly subconscious tactics that govern second-by-second interactions with the literature and the nuances of intention and use. We know, for instance, that scientists often have trouble locating very problem-specific information (on methods and protocols, for instance) and that the occasional exploration of results from another discipline can have considerable impact on progress or the direction of research. These are the kinds of information behaviors that we need to understand more fully to design tools that go beyond search and retrieval to support creative strategic reading.

For ontology-aware reading tools to function well, terminological annotations must be included in, or mapped to, the XML encoding of articles during the publishing production process, to connect names and phrases in narrative text with appropriate standard terminology. The emergence of the NLM schema as a standard XML encoding for scientific articles provides a promising shared context for terminological annotation; however, we also need specific strategies that are economically sustainable within the current context of STM publishing workflows, as well as remedies for “legacy data,” the articles already published and stored in repositories. To exploit terminological annotations across the Internet, reading tools will have to operate in real time to take advantage of the ontologies that define and relate terms and connect terms with relevant databases with the use of “service-oriented architectures” (33). Finally, the development of ontology languages with additional expressive power is needed, as well as continued support for evolving, coordinating, and harmonizing ontologies.

How Will Scientists Work with the Literature in 2019?

Scientists will still read narrative prose, even as text mining and automated processing become common; however, these reading practices will become increasingly strategic, supported by enhanced literature and ontology-aware tools. As part of the publishing workflow, scientific terminology will be indexed routinely against rich ontologies. More importantly, formalized assertions, perhaps maintained in specialized “structured abstracts” (27), will provide indexing and browsing tools with computational access to causal and ontological relationships. Hypertext linking will be extensive, generated both automatically and by readers providing commentary on blogs and through shared annotation databases. At the same time, more tools for enhanced searching, scanning, and analyzing will appear and exploit the increasingly rich layer of indexing, linking, and annotation information.

There are no technical obstacles to this trajectory, and it is already under way. The changes, as always, will be incremental: Scientists, who today already make extensive use of existing indexing and retrieval services, will encounter a steady stream of new enhancements and adopt those that allow rapid and productive engagement with the literature. The new functionality will sometimes be provided as part of the application interface (new features in PubMed, for instance) or as shared external tools that users can add to their Web browsers. These developments chart a middle course between the already obsolete activity of finding an article to read on the one hand, and the narrower objectives of text mining on the other, responding directly to the entrenched necessity and value of strategic reading in the daily work of today’s scientists.

References and Notes

1. I. Spasic, S. Ananiadou, J. McNaught, A. Kumar, Brief. Bioinform. 6, 239 (2005).
2. D. R. Swanson, N. R. Smalheiser, Artif. Intell. 91, 183 (1997).
3. National Library of Medicine, http://dtd.nlm.nih.gov/.
4. B. Mons, BMC Bioinform. 6, 142 (2005).
5. L. J. Jensen, J. Saric, P. Bork, Nat. Rev. Genet. 7, 119 (2006).
6. P. Bourne, PLoS Comput. Biol. 1, e34 (2005).
7. P. Murray-Rust, H. S. Rzepa, J. Digit. Inform. 5, issue 1 (2004).
8. D. Nicholas, P. Huntington, P. Williams, T. Dobrowolski, J. Doc. 60, 24 (2004).
9. D. Nicholas, P. Huntington, H. R. Jamali, T. Dobrowolski, Inf. Process. Manage. 43, 1085 (2007).
10. C. Tenopir, D. W. King, D-Lib Mag. 14, issue 11/12 (2008).
11. B. Schatz et al., Computer 32, 51 (1999).
12. A. P. Bishop, Inf. Process. Manage. 35, 255 (1999).
13. C. L. Palmer, Work at the Boundaries of Science: Information and the Interdisciplinary Research Process (Kluwer, Dordrecht, Netherlands, 2001).
14. C. L. Palmer, M. H. Cragin, T. P. Hogan, Inf. Process. Manage. 43, 808 (2007).
15. O. Bodenreider, R. Stevens, Brief. Bioinform. 7, 256 (2006).
16. L. Strömbäck, D. Hall, P. Lambrix, Proteomics 7, 857 (2007).
17. M. Ashburner et al., Nat. Genet. 25, 25 (2000).
18. B. Smith et al., Nat. Biotechnol. 25, 1251 (2007).
19. Gene Ontology, www.geneontology.org/.
20. The examples are from “An Introduction to the Gene Ontology” www.geneontology.org/GO.doc.shtml.
21. World Wide Web Consortium, www.w3.org/XML/.
22. J. H. Coombs, A. H. Renear, S. J. DeRose, Commun. ACM 30, 933 (1987).
23. A. Renear, D. Dubin, C. M. Sperberg-McQueen, C. Huitfeldt, in Proceedings of the 2002 ACM Symposium on Document Engineering, R. Furuta, J. I. Maletic, E. Munson, Eds. (Association for Computing Machinery Press, New York, 2002), pp. 119–126.
24. World Wide Web Consortium, www.w3.org/2001/sw/.
25. J. A. Blake, C. J. Bult, J. Biomed. Inform. 39, 314 (2006).
26. J. Blake, Nat. Biotechnol. 22, 773 (2004).
27. M. R. Seringhaus, M. B. Gerstein, BMC Bioinform. 8, 17 (2007).
28. D. Shotton, Learn. Publ. 22, 85 (2009).
29. M. Krallinger, A. Valencia, Genome Biol. 6, 224 (2005).
30. J.-j. Kim, D. Rebholz-Schuhmann, Brief. Bioinform. 9, 452 (2008).
31. H. M. Müller, E. E. Kenny, P. W. Sternberg, PLoS Biol. 2, e309 (2004).
32. R. Hoffmann, A. Valencia, Nat. Genet. 36, 664 (2004).
33. I. Foster, Science 308, 814 (2005).
34. P. Boyce, D. W. King, C. Montgomery, C. Tenopir, Ser. Libr. 46, 121 (2004).
35. D. W. King, C. Tenopir, C. Montgomery, S. E. Aerni, D-Lib Mag. 9, issue 10 (2003).
36. We thank G. Bilder (Crossref), M. Sperberg-McQueen (Black Mesa Technologies), M. Smith (MIT Libraries), and W. J. MacMullen and L. C. Smith (University of Illinois) for helpful comments on an earlier version of this paper and L. Teffeau (University of Illinois) for assistance with the examples and illustrations. Earlier versions were presented at the 2006 STM Innovations Seminar (London), the 2007 Annual Meeting of the American Chemical Society (Chicago), the 2007 STM Spring Conference (Cambridge, MA), and the 2007 annual meeting of the American Library Association (Washington, DC). This research was supported in part by NSF grant 0222848.

10.1126/science.1157784