Blog

Updating the Wikipedia

by Gene Golovchinsky on July 30, 2009

I’ve made some changes and additions to the Collaborative Search Engine page in the Wikipedia to expand the entry and add more references. The entry is by no means complete! Please contribute more prose and references to make it a useful resource. In particular, more prose would be useful around the task vs. trait distinction and the division of labor vs. the sharing of knowledge.

Building the Ivory Tower

by Gene Golovchinsky on July 29, 2009

I recently read on Jeff Dalton’s blog that a new open-source search engine, called Ivory, has been released by Jimmy Lin. Ivory is based on Hadoop, and is designed to handle terabyte-sized collections. Unlike Lucene, this is a research project. In Jimmy Lin’s words, it is

aimed at information retrieval researchers who need access to low-level data structures and who generally know their way around retrieval algorithms. As a result, a lot of “niceties” are simply missing—for example, fancy interfaces or ingestion support for different file types. It goes without saying that Ivory is a bit rough around the edges, but our philosophy is to release early and release often. In short, Ivory is experimental!

Living Laboratory

by Gene Golovchinsky on July 28, 2009 Comments (1)

In her talk at the IR Eval workshop at SIGIR 09, Sue Dumais called for an experimental platform for conducting research in information seeking (thanks Sakai-san!). She called it a Living Laboratory. This is a tremendous idea, the high tide that lifts all boats. Whether you’re interested in doing log analysis, interface design evaluation, building new indexing algorithms, or other kinds of research, having real data sets with real users and real information needs can move the field forward in ways that Cranfield-style experiments do not.

Can’t find that symbol?

by Eleanor Rieffel on July 26, 2009

Via Dave Bacon’s blog, I came across Detexify, a cool tool that enables you to find the LaTeX command for a symbol by drawing the symbol. LaTeX is the standard typesetting system for researchers in the mathematical sciences. One indication of its popularity is that Scott Aaronson lists “The authors don’t use TeX” as the first of his “Ten Signs a Claimed Mathematical Breakthrough is Wrong.” Unfair, I know, but so it is.

SIGIR Twitter Archives

by Gene Golovchinsky on July 24, 2009

We’ve created some archives of Twitter conversations for SIGIR 2009 and for some of the workshops associated with the conference. These archives are useful because Twitter messages tend to evaporate after a while.

I know of the following archives:

SIGIR09 proper, based on the #sigir09 and #sigir hashtags
The IREval workshop based on the #ireval09 hashtag
The SSM workshop based on the #ssm09 hashtag

If the other workshops had significant traffic, I am happy to archive them and update the list above. TwapperKeeper is a service that archives Twitter searches based on a specified hashtag. The data is then available through the web site and for download in tab- or semicolon-separated format. Saving your own copy means that you can refer to it later, and also makes it easier to do data mining or other research on the use of Twitter. I encourage people to download archives (although the archives on TwapperKeeper will keep updating as new tweets come in) to make sure they persist even if TwapperKeeper doesn’t. Archive early, archive often.
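As a rough illustration of the kind of data mining a downloaded archive makes possible, here is a minimal Python sketch that counts tweets per user in a tab- or semicolon-separated file. The column names (`from_user`, `text`) and the presence of a header row are assumptions made for the example, not TwapperKeeper's documented export layout, so check them against the file you actually download.

```python
import csv
import io
from collections import Counter


def detect_delimiter(header_line):
    """Pick tab or semicolon, whichever appears more often in the header."""
    return "\t" if header_line.count("\t") >= header_line.count(";") else ";"


def tweets_per_user(archive_text, user_column="from_user"):
    """Count rows per user in a delimited archive that has a header row.

    `user_column` is a hypothetical column name; adjust it to the real file.
    """
    lines = archive_text.splitlines()
    if not lines:
        return Counter()
    reader = csv.DictReader(
        io.StringIO(archive_text), delimiter=detect_delimiter(lines[0])
    )
    return Counter(row[user_column] for row in reader if row.get(user_column))


# Made-up sample data in the assumed semicolon-separated layout.
sample = (
    "from_user;text\n"
    "alice;Great keynote #sigir09\n"
    "bob;Slides? #sigir09\n"
    "alice;Poster session next #sigir09\n"
)
counts = tweets_per_user(sample)
print(counts.most_common())
```

One caveat: tweet text can itself contain semicolons, so the semicolon-separated format only parses reliably if such fields are quoted. The tab-separated download is probably the safer choice for analysis.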

SIGIR09: An aspectual interface for supporting complex search tasks

by Gene Golovchinsky on July 23, 2009 Comments (2)

Faceted search interfaces for metadata-rich datasets such as product information have been around for a while. eBay and Amazon are two obvious examples. Faceted search for textual data is only slowly making its way into the commercial realm (see NewsSift, for example) but has been receiving increasing attention in research. Villa et al. presented an interesting paper at SIGIR09 in which they compared different interface layouts for handling aspects, and compared the effectiveness of aspectual search with a conventional interface for different tasks.

File under: inconceivable

by Gene Golovchinsky on July 22, 2009 Comments (5)

Why doesn’t the SIGIR web site have a search box?

Query suggestion vs. term suggestion

by Gene Golovchinsky on July 22, 2009 Comments (2)

Diane Kelly presented an interesting (and much tweeted-about) paper at SIGIR this week. The paper, “A Comparison of Query and Term Suggestion Features for Interactive Searching,” co-written with Karl Gyllstrom and Earl Bailey, looks at the effects that query and term suggestions have on users’ performance and preferences. These are important topics for interactive information seeking, both for known-item and exploratory search.

Sue Dumais, HCIR Poster Child

by Gene Golovchinsky on July 21, 2009 Comments (2)

At CHI 2007, in a workshop on exploratory search, we had a long discussion of the definition of exploratory search, during which Sue Dumais kept challenging the room to look broadly, bringing in examples and counter-examples not only from full text search, but from more structured datasets that were also fair game.

Exploratory search is just one part of HCIR; her work on adapting systems to users’ vocabulary (not vice-versa) that led to LSI, innovative search interfaces (“If in 10 years we are still using a rectangular box and a list of results, I should be fired.”), finding and re-finding information on your personal computer, and personalization of search results all fit squarely into the HCIR space.

Those who attended the HCIR’08 workshop organized by Daniel Tunkelang (Endeca), Ryen White (MSR), and Bill Kules (CUA) got a great overview of Sue’s research. This week, during her opening keynote at SIGIR (see notes from Jeff Dalton and Jonathan Elsas, who, unlike me, were actually there!), Sue described the course of her career as an IR researcher, first at Bellcore and then at Microsoft Research. Throughout her career, she has consistently focused on the user, both as a source of inspiration for design and as a means of evaluating systems.

“If you have an operational system and you don’t use what your users are doing to improve, you should have your head examined” (from Jeff Dalton)

I expect we’ll be seeing more interesting and innovative results from her group, both at SIGIR and at the HCIR workshop series.

Which future of search?

by Gene Golovchinsky on July 19, 2009

Alex Iskold recently wrote on ReadWriteWeb about potential improvements in search that could be derived from incorporating evidence from one’s social network to affect the ranking of documents. The idea is that people you know, people with similar interests, friends-of-friends, authorities, and “the crowd” could all contribute to changing the ranking of documents that a search engine delivers to you, because the opinions or interests of all these people can provide some information to help disambiguate queries.
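To make the idea concrete, here is a small Python sketch of one way social evidence could be folded into a ranking. It is only an illustration: the weights, the `mix` parameter, and the additive scoring rule are all invented for the example, and nothing here is taken from Iskold's article or from any real search engine.

```python
from dataclasses import dataclass, field

# Hypothetical weights: closer social ties count for more than "the crowd".
SOURCE_WEIGHTS = {
    "friend": 1.0,
    "similar_interest": 0.6,
    "authority": 0.5,
    "friend_of_friend": 0.4,
    "crowd": 0.1,
}


@dataclass
class Document:
    doc_id: str
    base_score: float  # score from the ordinary, query-only ranker
    endorsements: dict = field(default_factory=dict)  # source type -> count


def social_score(doc, weights=SOURCE_WEIGHTS, mix=0.5):
    """Add a weighted sum of endorsements to the base retrieval score."""
    social = sum(weights.get(src, 0.0) * n for src, n in doc.endorsements.items())
    return doc.base_score + mix * social


def rerank(docs):
    """Order documents by the combined score, highest first."""
    return sorted(docs, key=social_score, reverse=True)


# An ambiguous query ("jaguar"): the text-only ranker prefers the car page,
# but endorsements from a friend and an authority favor the animal page.
docs = [
    Document("jaguar-car", 0.9, {"crowd": 2}),
    Document("jaguar-animal", 0.7, {"friend": 1, "authority": 1}),
]
ranked = [d.doc_id for d in rerank(docs)]
print(ranked)
```

In this toy setup the endorsements act as the disambiguating signal: the animal page overtakes the car page even though its query-only score is lower.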
