Blog

Is TREC good for Information Retrieval research?

by Gene Golovchinsky on July 6, 2009

In his comment on an earlier post, Miles Efron reiterated the usefulness of the various TREC competitions in fostering IR research. I agree with him (and with others) that TREC has certainly been a good incubator, both in its annual competition and in follow-on studies that use its data in other ways. And, as Miles points out, we have seen a proliferation of collections: everything from the original newspaper articles to blogs, video, large corpora, etc.

On the Science of IR

by Gene Golovchinsky on July 1, 2009

Miles Efron recently posted his take on the progress of the IR field, in response to a question posed by Andrew Dillon at the last ASIST conference. Miles’ take was that progress was indeed being made, for two reasons: the SIGIR conference has become more competitive over the years, and the diversity of corpora under the TREC umbrella has increased. Unfortunately, I wasn’t there to hear the question or the subsequent discussion, but my guess is that Andrew Dillon was asking not about statistical significance, but about magnitude.

If you build it, they will spam

by Gene Golovchinsky on June 30, 2009

Tim O’Reilly and John Battelle have written an interesting opinion piece on recent trends in collective intelligence on the Web, something they and others have called Web 2.0. The article covers a lot of ground, touching on everything from medical imaging to politics to Twitter. It is a vision, and one that isn’t so far off: we can see the technological dots forming recognizable patterns. Emboldened by the success of Google, Twitter and Mechanical Turk, the authors call for similar engagement in healthcare, energy policy, and financial regulation, among others.

While what they describe is not exactly a technological Utopia, their picture is somewhat rosy.

Search User Interfaces

by Gene Golovchinsky on June 29, 2009

Marti Hearst’s new book, Search User Interfaces, is out, as Daniel Tunkelang reported earlier. The book covers a range of topics related to interaction around information seeking, including design, evaluation, models of information seeking, query reformulation, etc. It also discusses emerging trends: Mobile Search Interfaces, Multimedia (although this field has arguably been around long enough to no longer be emerging), Social Search, and natural-language queries. The Social Search section discusses collaborative filtering, recommendation systems, and collaborative search, describing several systems along the full range of depth of mediation.

Pattern matching

by Gene Golovchinsky on June 26, 2009

Once a month I drive up to Oakland to attend the SF Bay Groovy and Grails Meetup organized by Chris Richardson. It’s a fun group of people and conversation covers a lot of ground. During Monday’s meeting we chatted about Scala, among other things, and how it was good for pattern matching in exactly the way that object-oriented solutions weren’t.

Chris gave the example of dispatching requests in a web server by matching URI patterns to discover which internal methods to call. Scala’s match expression (its counterpart to a switch statement) makes it easy to match against the list of tokens obtained by splitting a URI, even if the list is not homogeneous; the splitting itself is something many other languages can do. It furthermore binds the matched values to names that are available to the code handling each case, making it more expressive than, for example, Java’s switch statement.
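To make this concrete, here is a minimal sketch of that kind of dispatch. It is not Chris’s code: the Dispatcher object, the routes, and the handler names returned as strings are all hypothetical, chosen only to show a URI being split into tokens and matched as a list, with the matched pieces bound to names.

```scala
// Hypothetical dispatcher: the routes and handler names are illustrative only.
object Dispatcher {
  // Split a URI path into its non-empty segments, e.g. "/users/42" -> List("users", "42").
  def tokens(path: String): List[String] =
    path.split("/").toList.filter(_.nonEmpty)

  // Match the token list against patterns; names such as `id` and `rest`
  // are bound by the pattern and usable in the body of that case.
  def dispatch(path: String): String = tokens(path) match {
    case Nil =>
      "home"
    case "users" :: Nil =>
      "listUsers"
    case "users" :: id :: Nil if id.forall(_.isDigit) =>
      s"showUser($id)"
    case "users" :: id :: "posts" :: rest =>
      val remainder = rest.mkString("/")
      s"userPosts($id, $remainder)"
    case _ =>
      "notFound"
  }
}
```

Calling Dispatcher.dispatch("/users/42") picks the third case and binds id to "42", while a path such as "/users/42/posts/2009/06" falls to the fourth case, with the trailing segments collected in rest. The cases are tried in order, which is the sequential examination of patterns discussed in the next paragraph.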

This made me reflect on when pattern matching is used versus more explicit representations of structure. For example, when making a function call, we rely on an explicit (simple) declared structure rather than on a sequential examination of possible patterns to decide what function to call. (Overloaded methods don’t really require serious pattern matching.) So pattern matching is typically used when the input is poorly structured, natural language processing being a good example. While the utility of using plain text URIs as a lingua franca among heterogeneous systems is compelling, I found it ironic that the URI, the basis of internet communication, winds up being treated, in a sense, more like natural language than like a function call.

Ubuntu. Ugh.

by Gene Golovchinsky on June 25, 2009

[“ubuntu”] describes humanity as “being-with-others” and prescribes what “being-with-others” should be all about. Ubuntu emphasises sharing, consensus, and togetherness.

according to ubuntu.com. Over the last two days I have experienced a palpable lack of togetherness with this beast in an attempt to explore the world beyond Windows. I have some Grails/Java code that I wanted to try running on a Linux box rather than on Windows (on which I had no problem getting it to work). Grails is built on Groovy, a dynamic language that requires run-time compiler support, so it must use the JDK rather than the JRE.

Call center collaboration

by Gene Golovchinsky on June 24, 2009

In their JCDL 2009 paper titled “Cost and Benefit Analysis of Mediated Enterprise Search,” Wu et al. described a cost-benefit analysis of call center activity. The goal was to understand when experts should help the “consultants” who are handling phone calls from customers. The idea was that experts could improve the results of queries run by consultants by identifying useful documents; the challenge is to make effective use of the more expensive experts’ time.

This seems like a great opportunity to implement a collaborative search interface that would mediate the collaboration between the people handling the phone calls and the technical experts. In addition to screen sharing (to help the expert understand the problem), the system might provide the expert with additional tools to facilitate searches and to reuse previously-found results.

Tangibles Day at FXPAL

by Maribeth Back on June 23, 2009

Last week we had two interesting visitors who each gave talks in the area of tangible computing. (Briefly, tangible computing explores ways of interacting with computers using real-world physical objects; much more info can be found online, including at the Tangible Media Group at the MIT Media Lab.) FXPAL has done a number of tangible interface projects over the years, including the PostBits project, the Convertible Podium, and others.

Expanding query expansion

by Gene Golovchinsky on June 22, 2009

Looks like I missed a good paper at JCDL 2009: A Polyrepresentational Approach to Interactive Query Expansion by Diriye, Blandford and Tombros. As with many good ideas, this paper describes an approach that is obviously useful once described, but one I had not come across before.

Manual query expansion can be useful when relevance feedback fails because the system doesn’t know why a person found a document relevant, but people are often reluctant to use the suggestions offered by information seeking systems. This paper offers a new twist on these recommended terms: when suggesting terms for expanding a user’s queries, the authors show each term with some representation of the context in which it occurs. Evaluation showed that this contextual information allowed users to understand query terms better, and that it improved their ability to make relevance judgments with respect to documents that contained the suggested terms.

In Cerchiamo, we offered users term suggestions based on relevance judgments made by search partners. While the suggested terms were useful for identifying other relevant documents, they weren’t always used. It’s likely that term recommendation in collaborative search would benefit from these techniques even more than in standalone search, because in the collaborative case term recommendations may be based on documents that a searcher has never seen.

Tweeting at JCDL

by Gene Golovchinsky on June 19, 2009

I attended JCDL 2009 this week, and had the opportunity to do some live tweeting of several papers and panel sessions. It was an interesting experience that I thought was worth summarizing here. Overall, it was difficult to get the messages right, it was a challenge to listen and type at the same time, the 140-character constraint was an issue some of the time, and my tweeting had a couple of effects on my Twitter network. And of course there is the question of the utility of this endeavor.
