Blog Archive: 2009

The many faces of PubMed search

by Gene Golovchinsky on September 24, 2009 Comments (2)

The number of third-party tools for searching PubMed data seems to be increasing recently. As the NLM is about to roll out a new search interface, companies are starting to offer alternative interfaces for searching this important collection. The attraction is obvious: a large, motivated group of searchers, an important information need, and a manageable collection size. A decade ago, over 20 million searches were done monthly through the NLM site, and the numbers are surely higher today; the collection is large but not huge — currently over 17 million entries (some with full text), occupying somewhat more than 60GB of disk space. Thus we see an increasing number of sites offering search over this collection, including PubGet, GoPubMed, TexMed, and HubMed. The offerings range from basic to flashy, and appear to be aiming at different groups of searchers.

Search is not Magic

by Gene Golovchinsky on September 10, 2009 Comments (7)

A discussion among commenters on a post about PubMed search strategies raised the issue of how people need to make sense of the results that a search engine provides. For precision-oriented searches a “black box” approach may make sense because as long as the system manages to identify a useful document, it doesn’t matter much how it does that. For exploratory search, which may be more recall-oriented, having a comprehensible representation of the system’s computations is important to assess coverage of your results. This suggests the need to foster useful mental models, rather than relying on the system to divine your intent and magically produce the “right” result.

Have queries, want answers

by Gene Golovchinsky on September 2, 2009 Comments (9)

Sarah Vogel’s comment on yesterday’s post got me thinking about recall-oriented search. She wrote about preferring Boolean queries for complex searches because they gave her a sense of when she really had exhausted a particular topic, something that’s often required for medical literature reviews. But we really have two problems here that may be useful to decouple: one is coverage (did we find all there was to find?) and the other is ranking (the order in which documents are shown).

Open-source queries

by Gene Golovchinsky on September 1, 2009 Comments (7)

Every once in a while a Twitter query turns up something completely unexpected. I suppose that’s one reason for having them. My query on all things PubMed recently turned up the following gem: a blog entitled PubMed Search Strategies. What is it? A list of queries. What? PubMed queries, in all their Boolean glory. The latest pair of posts are pharmacoepidemiology — keywords, and its fraternal twin, pharmacoepidemiology — MeSH. The queries run for 39 and 13 terms, respectively. No average 2.3-word Web searches these.

Updating PubMed

by Gene Golovchinsky on August 28, 2009 Comments (5)

I just watched an interesting webcast by David Gillikin, Chief of NLM’s Bibliographic Services, about the upcoming changes to the PubMed interface, followed by extensive Q&A. There was some confusion about how existing functionality would be mapped to the new interface, and understandable concern that the familiar interface would become dramatically less so. From an outsider’s perspective, the changes that were implemented looked reasonable, reducing the clutter of the existing design with some simplified controls and a more modern look and feel.

What a tangled MeSH we weave

by Gene Golovchinsky on August 26, 2009 Comments (14)

William Webber recently wrote an interesting analysis of the reports of the original Cranfield experiments that were so influential in establishing the primacy of evaluation in information seeking, and in particular a certain kind of evaluation methodology around recall and precision based on a ground truth. One reason that the experiments were so influential was that they provided strong evidence that previously held assumptions about the effectiveness of various indexing techniques were unfounded. Specifically, the experiments showed that full-text indexing outperformed controlled vocabularies. While this result was shocking in the 1950s, 50 years later it seems banal. Or almost.
