Google Scholar will soon be 10 years old. It is amazing how time flies. Seems like it was just yesterday that Alex and I were scrambling to put everything in place for the launch. To help celebrate the anniversary, we have invited friends and colleagues in scholarly communication to share their thoughts about Scholar, about scholarly communication and about future directions. These will appear in a 10th anniversary blog post series. The first post in the series is by John Sack, the founding director of HighWire Press. - Anurag Acharya
Helping Researchers See Farther Faster
John Sack, Founding Director, HighWire Press
HighWire
Press started at Stanford University almost 20 years ago -- we
launched the Journal of Biological Chemistry Online in early 1995 --
about the same time that Google's founders were working in the same
Stanford Quadrangle on the foundations for Google. It took
until 2002 to get our two efforts together and index HighWire-hosted
scholarly articles in Google. This project increased usage of
the articles by one to two orders of magnitude, even though their
abstracts had been fully indexed in PubMed right from the start.
Two years later, in 2004, Google Scholar arrived.
In the twenty years since HighWire began, and in the ten years since
Google Scholar beat a path to the door of scholarship, what have we
achieved? We know the answer to that question from interviews
we did in 2002 and again in 2012-2014 with over sixty researchers.
Back in 2002, people still used the word "e-Journal" to describe
the electronic version of a "print journal". Researchers
told us they needed better ways to locate content across all the
different sources of full-text – publisher sites each had their own
separate search engines, and PubMed searched only abstracts.
We collectively solved that problem -- publishers took a big leap in
providing the Google indexer with access to subscriber-only content.
So when HighWire asked Stanford researchers in 2012 interviews
about the challenges of searching, they said:
"Finding is easy...
...but reading is hard."
We had solved the search problem so well that people found more than
they could handle. This wasn't just a relevance-ranking problem
-- useless stuff showing up in search results. There was important
material in those results and it needed to be evaluated to satisfy a
researcher's sense of thoroughness.
Reading Faster
To “read” many articles in a short period of time, researchers want
to be able to absorb the gist of an article quickly, and be able to
judge its quality and relevance. In our interviews with
researchers, we heard strong support for adding visual abstracts to
articles (as the American Chemical Society has been doing for years
in all of its journals); for adding "take home messages" to
articles indicating the significance of an article in the context of
what is known and what the article adds (often found in clinical
journals, like the BMJ, but now also appearing in basic-science
journals such as PNAS and the JBC); and for a contextualized 'figure
reading' experience (such as is found in the Lens viewer introduced
in eLife).
All of these help researchers take in an article faster. None of these
aids is available from Scholar search results, so readers must
visit the sites where the full text is found. This “pogo-sticking”
from search result to article and back and forth may seem normal and
natural to us in the publishing industry. But as consumers we rely
on Google showing augmented search results: if Google stopped
showing movie and restaurant “star” ratings and restaurant price
ranges (“$$$”) in its search results, we’d think there was a bug!
How can Google Scholar meet this "read faster" challenge? How
search evolves on this front will affect how researchers and
publishers do their work of finding audiences.
Contextualization of References
One way to speed scholarly literature research would be to improve the
“directedness” of search results -- don't just give me a list of
articles, but give me or get me to paragraphs in context. Clearly Scholar knows the context for matching a query's criteria
since it shows a snippet from the text. Why not have Scholar
and publisher sites collaborate a bit more to help readers get
quickly from a result list to the first paragraph that matches a
search, then on to the next matching paragraph, and so on?
And if Scholar can do that with search results, perhaps it can also help
us with the too-arduous task of going from a citation embedded in an
article, to the specific part of the cited article that is being
referenced. Book references contain page numbers; why should journal
articles be less specific?
Perhaps we can see how unhelpful this is by stepping out of our
scholarly-publishing tradition and shifting to the consumer context:
Imagine if a Google search provided you with a link to only the web
site (i.e., home page) rather than to the specific page on a site
that matched your search! That's what we settle for with
scholarly journal references.
Searching For Images
We know from researcher interviews that in some fields people don't
start by reading the article text per se; they "read" the
images and then look at the narrative around the images for context.
In some fields, figures tell the story -- just as in graphic
novels and comic books, I suppose! -- and an article is figures woven
together by text. This isn’t true only for disciplines that are
visual in the traditional sense; it is perhaps equally true for equations in
a physics article, structures in a chemistry article, or tables in a
clinical-trial article.
So why not make it possible to search images by searching the figure
legend, or text in a figure or table, or closed captions in a video?
Google already provides a basic image search. Perhaps if publishers
would provide Scholar with rights to display low-resolution article
images – the visual equivalent of a snippet – we could have a
scholarly version of image search.
There are great opportunities for innovation ahead of us. We will need to take some
risks, build experiments and collaborate across boundaries between
stakeholders. That’s what we have done for the past decade, and
look how far we have come --
“finding is easy”!