A Quick Review of Google Scholar

Just as with Google Web Search, Google Scholar orders your search results by how relevant they are to your query, so the most useful references should appear at the top of the page. This relevance ranking takes into account the full text of each article as well as the article’s author, the publication in which the article appeared and how often it has been cited in scholarly literature. Google Scholar also automatically analyzes and extracts citations and presents them as separate results, even if the documents they refer to are not online. This means your search results may include citations of older works and seminal articles that appear only in books or other offline publications. (Source: About Google Scholar)

Posted on Jerz’s Literacy Weblog.

Google Scholar just went live yesterday. Thanks to my former student, Matt Hoy, for sending me the link. Here are my immediate thoughts.

First Usenet archives, then Blogger, then Gmail, then desktop search, and now academic research. [Update, 19 Nov: How could I forget Google News? –DGJ] Google Scholar is just as simple as Google; the familiar brand name will probably be irresistible once the word gets out. One often hears laments from faculty who note that students are reluctant to go to the library. Some are just as reluctant to use the library’s official research databases, preferring instead the simplicity of findarticles.com, so of course they will be attracted to Google Scholar.

Google Scholar Looks Promising

In general, I like Google’s new service. While it offers only some of what you would expect from a library database, the convenience of the one-box-fits-all interface is very attractive. Like most Google services, what really makes the best impression is the added features that you never thought would be useful, but that soon make you wonder how you lived without them. Google scrapes the content of online articles, which means that if the online articles mention offline resources (such as old-fashioned books or anything else that is not a web page), Google will learn about them.

It looks like Google’s bots are capable of reading bibliographical information in many different formats and generating database entries for resources that are cited in web-accessible documents, but that are not accessible on the web directly. This may help make up for the fact that regular Google is naturally skewed towards serving up information that is readily available online. For some search topics, the highest-ranked hits are older academic books that are not available online.
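To make the idea concrete, here is a minimal Python sketch of how reference strings in two different formats could be reduced to one key and tallied, so that a work with no web page of its own still accumulates a citation count. This is not Google's actual method, which I can only guess at from the results. The two regular expressions, the function names, and the sample references are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Two hypothetical reference styles (assumptions, not Google's real parsers):
#   Smith, J. (1990). Medieval Drama.
#   Smith, Jane. "Medieval Drama." Example Press, 1990.
APA_LIKE = re.compile(
    r"^(?P<author>[^()]+?)\s*\((?P<year>\d{4})\)\.\s*(?P<title>[^.]+)\."
)
MLA_LIKE = re.compile(
    r'^(?P<author>[^"]+?)\.\s*"(?P<title>[^"]+?)\.?"\s*.*?(?P<year>\d{4})'
)


def parse_reference(ref):
    """Return a (surname, title, year) key, or None if no pattern matches."""
    for pattern in (APA_LIKE, MLA_LIKE):
        match = pattern.match(ref.strip())
        if match:
            surname = match.group("author").split(",")[0].strip().lower()
            title = re.sub(r"\W+", " ", match.group("title")).strip().lower()
            return (surname, title, match.group("year"))
    return None


def tally_citations(documents):
    """Count how many documents cite each work (one vote per document)."""
    counts = Counter()
    for references in documents:
        keys = {parse_reference(ref) for ref in references}
        keys.discard(None)  # skip references we could not parse
        counts.update(keys)
    return counts


# Invented sample data: two online papers cite the same offline book.
documents = [
    ["Smith, J. (1990). Medieval Drama."],
    ['Smith, Jane. "Medieval Drama." Example Press, 1990.'],
    ["an unparseable note"],
]
counts = tally_citations(documents)
```

In this toy example the book ends up with a count of two even though no copy of it is online. A real system would need far more patterns and much fuzzier matching than this.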

A search for “Grand Theft Auto” brings up references to the game itself, not just articles about the game. So new media objects that are not web pages and books that are not online can also accumulate page rank. This is good.

Google’s academic research database isn’t perfect.

Nothing is perfect, of course. Naturally I first checked to see which of my own publications were indexed. Not all of them are, but I did find the full text of one of my articles that I thought wasn’t available online. Not bad.

My first academic article, which has been online since 1997, isn’t in the database. I did a search for the York Corpus Christi play, and followed a link to what was supposed to be an article on “Signifying God” in the play, but turned out to be the abstract of an article on the trial of Charles I (the full text of which was not available).

Search results also include references to offline books, with a link to a service that permits you to find nearby libraries that own copies. It gave me a pleasant little Thursday morning ego boost to see how many copies of my book there are in university libraries in Pennsylvania (for example). More important, the tally of the number of links to a particular source offers a quick guide to texts that one assumes to be influential.

Google’s bots can’t improve on the accuracy of the information they index, and since human researchers have been known to cite sources without reading them, the page-ranking algorithm is not a substitute for real peer-review.

Scholars in technical fields refer to me as “Jerz, DG”, while scholarship in humanities fields uses my full name. A human editor would catch that kind of thing, but a bot cannot.
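As a rough illustration of the problem, the Python sketch below reduces an author name to a surname plus initials, so that "Jerz, DG" and a spelled-out form land on the same key. The function name and the rules it applies are my own assumptions, not a description of anything Google does. A heuristic this crude would also merge two different people who share a surname and initials, which is exactly where a human editor still earns their keep.

```python
import re
from typing import Tuple


def author_key(name: str) -> Tuple[str, str]:
    """Reduce an author name to (lowercase surname, initials).

    Hypothetical helper for illustration only. Handles "Surname, Given"
    and "Given Surname" forms; anything more exotic is out of scope.
    """
    name = name.strip()
    if "," in name:
        surname, given = [part.strip() for part in name.split(",", 1)]
    else:
        parts = name.split()
        surname, given = parts[-1], " ".join(parts[:-1])

    initials = ""
    for token in re.findall(r"[A-Za-z]+", given):
        if token.isupper() and len(token) <= 3:
            initials += token  # a run of initials such as "DG"
        else:
            initials += token[0].upper()  # a spelled-out given name
    return surname.lower(), initials


# Variants of the same name collapse to one key.
variants = ["Jerz, DG", "Jerz, D. G.", "Dennis G. Jerz"]
keys = {author_key(v) for v in variants}
```

Here all three variants collapse to a single key, so their citation counts could be added together.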

On my second query I saw results taken from what look like papers published on a course website. One assumes that the countless undergraduate papers that will be indexed by the service won’t themselves be cited by anyone other than other undergraduates, but the sheer number of students citing other student work is going to skew the search results. In this area, Google Scholar is no worse than regular Google, and somewhat better (since even those authors whose work is mistaken for an academic paper will have done some filtering of their source material).

In a regular Google search for “internet addiction,” the top hits are commercial sites selling books and tapes that offer cures. The Google Scholar search returns the publications of the leading proponent of “internet addiction disorder,” who also happens to be the same person behind the top commercial site (but that’s a different story). And fairly high on the list of the Google Scholar results is an article that questions the validity of internet addiction disorder.

Students and Google Scholar

I just got back from proctoring a “Research Skills Quiz,” where I brought my freshman comp students to a computer room, gave them a random topic (such as “health issues on university campuses”) and told them they could leave the room after they showed me three peer-reviewed academic articles. A few finished the exercise within minutes, most finished it by the end of the period, and a few will have to do a make-up homework assignment. Several were unable to distinguish between letters to the editor that appeared in academic journals, editorials and position statements, and full-length articles in which scholars present their original research.

Google Scholar is not going to help those students develop research strategies. Students who are determined to muddle through without actually learning may be distracted slightly less, but that will probably give them a false sense of relief that will only delay the inevitable jolt that will shake them out of the high school mindset.

One of the features that I like (the fact that Google Scholar indexes offline materials) is likely to frustrate the entry-level student using Google for an assignment with a pressing deadline. Having been conditioned to think that Google is the easy way to access all information known to humankind, they may be confused by the frequent dead ends and links to printed books that are only available in libraries.

Conclusion

I think Google is a step or two away from extending its reach to yet another area of information technology… and Google is now that much closer to fulfilling Vannevar Bush’s dream of the Memex.

When it comes to the kind of searching that I might do in order to refresh my lecture notes as I prepare to teach a class (when I’m looking for fresh ideas, rather than any particular answer), as I feel my way through a subject that I don’t know terribly well, or for the casual academic research that I do when pursuing inspiration, backtracking serendipity, or just kicking around a topic for a new proposal, I’ll probably turn to Google Scholar first – but that’s because I’m confident that I will be able to filter the search results as I see them. Students who don’t have that skill won’t be served by Google’s coolness.

Librarians and teachers who don’t educate themselves about how to use Google Scholar’s strengths, and make a strong case for when and where alternatives are preferable, will do their students a disservice.