Blog Archive: 2009

What were we thinking?

Preservation is a branch of library science dedicated to the maintenance of physical artifacts. Digital preservation, its modern offspring, concerns itself with the preservation of digital artifacts such as documents, movies, and audio recordings. But the challenges of digital preservation are complicated by the interactivity characteristic of many digital artifacts. It’s not enough to save the bits if the goal is to understand the experience of using something in its original form. I have in mind such things as interactive fiction, video and computer games, and similar artifacts.

Have queries, want answers

Sarah Vogel’s comment on yesterday’s post got me thinking about recall-oriented search. She wrote about preferring Boolean queries for complex searches because they gave her a sense of when she really had exhausted a particular topic, something that’s often required for medical literature reviews. But we really have two problems here that it may be useful to decouple: one is coverage (did we find all there was to find?) and the other is ranking (the order in which documents are shown).

Open-source queries

Every once in a while a Twitter query turns up something completely unexpected. I suppose that’s one reason for having them. My query on all things PubMed recently turned up the following gem: a blog entitled PubMed Search Strategies. What is it? A list of queries. What? PubMed queries, in all their Boolean glory. The latest pair of posts is pharmacoepidemiology — keywords, and its fraternal twin, pharmacoepidemiology — MeSH. The queries run for 39 and 13 terms, respectively. No average 2.3-word Web searches, these.

The Library of Google

In “The Library of Babel,” Jorge Luis Borges describes a library “…composed of an indefinite, perhaps an infinite, number of hexagonal galleries…” lined with shelves of books. Unfortunately, the books are not organized in any predictable manner, causing librarians to travel “…in search of a book, perhaps of the catalogue of catalogues…” The searches, though, are in vain, given the improbability of finding what you seek in an infinite collection.

Updating PubMed

I just watched an interesting webcast by David Gillikin, Chief of NLM’s Bibliographic Services, about the upcoming changes to the PubMed interface, followed by extensive Q&A. There was some confusion about how existing functionality would be mapped to the new interface, and understandable concern that the familiar interface would become dramatically less so. From an outsider’s perspective, the changes that were implemented looked reasonable, reducing the clutter of the existing design with some simplified controls and a more modern look and feel.

JoDI is a teenager

Well, almost. JoDI, the Journal of Digital Information, founded by Wendy Hall and Gary Marchionini, has been publishing papers online since 1997 with Cliff McKnight as the Editor-in-Chief. JoDI is a peer-reviewed online journal organized into several themes, including digital libraries, hypermedia systems, hypertext criticism, information discovery, information management, social issues of digital information, and usability of digital information.

What a tangled MeSH we weave

William Webber recently wrote an interesting analysis of the reports of the original Cranfield experiments that were so influential in establishing the primacy of evaluation in information seeking, and in particular a certain kind of evaluation methodology around recall and precision based on a ground truth. One reason that the experiments were so influential was that they provided strong evidence that previously-held assumptions about the effectiveness of various indexing techniques were unfounded. Specifically, the experiments showed that full-text indexing outperformed controlled vocabularies. While this result was shocking in the 1950s, 50 years later it seems banal. Or almost.

In search of data

Having seen the recent news of gun-toting protesters at health reform meetings, I got into a discussion with my wife about gun control, and you know where that can lead. Yes, that’s right, to exploratory search. I had some hypotheses about the relationship between gun control and crime, and wanted to find some data to test them. I needed to find crime statistics by state, and to cross-reference them with other characteristics of each state, including the degree of urbanization, population density, laws, etc. While I thought the odds of finding a canned analysis of my hypotheses were small given the amount of time I was willing to devote to the problem, I did try a few obvious queries. No luck.

WebNC, a VNC for Web Applications

Recently, we presented our work on WebNC at several venues, including WWW 2009 in Madrid, Hypertext 2009 in Turin, Italy, and at a very interesting SF Bay Area Google App Engine Developers meetup in Palo Alto, CA.

WebNC is a tool for sharing your browser window in real time with someone else. It’s similar to screen sharing tools like VNC or WebEx, except it’s built for sharing only web pages. This sounds limiting, but since a lot of work is done inside web browsers these days (browsing, editing documents, watching videos, booking reservations and vacations, reading email), we thought it would be useful. For example, my wife always calls me when she rents a car online: what car model should she pick? With WebNC, she can easily show me her browser window, and we can talk more efficiently because I can see what she sees on her screen.

Musings on spam

I get a fair bit of spam. Every day I delete about 400 messages that my spam filter catches; this blog has amassed over 7,000 spam comments in six months or so; and now, Twitter is getting spammy too. I’ve noticed a rash of Twitter spam-bot followers recently, and am quite confused as to what they are trying to achieve.
