Blog Category: Research

Tracking

Imagine the (legitimate) outcry if a local municipality, a state government, or the federal government in the US deployed an infrastructure that systematically identified and tracked people as they went about their daily lives, with no viable way to opt out. While the US has laws that govern when and how data about individuals may be used, the mere availability of such data would create temptations that would be irresistible in practice, yet are not necessary for the functioning of society.

Collaborative search on the rise?

I am seeing an interesting not-quite-yet-a-trend: the emergence of collaborative search tools. I am not talking about research tools such as SearchTogether or Coagmento, but about real companies started for the purpose of putting out a search tool that supports explicit collaboration. The two recent entries in this category of which I am aware are SearchTeam and Searcheeze. While they share some similarities, they are actually quite different tools.

A quick study of Scholar-ly Citation

Google recently unveiled Citations, its extension to Google Scholar that helps people to organize the papers and patents they wrote and to keep track of citations to them. You can edit metadata that wasn’t parsed correctly, merge or split references, connect to co-authors’ citation pages, etc. Cool stuff. When it comes to using this tool for information seeking, however, we’re back to that ol’ Google command line. Sigh.

Recall vs. Precision

Stephen Robertson’s talk at the CIKM 2011 Industry event caused me to think about recall and precision again. Over the last decade, precision-oriented searches have become synonymous with web search, while recall has been relegated to narrow verticals. But is precision@5 or NDCG@1 really the right way to measure the effectiveness of interactive search? If you’re doing a known-item search, looking up a common factoid, etc., then perhaps it is. But for most searches, even ones that might be classified as precision-oriented, the searcher might wind up making several attempts to get at the answer. Dan Russell’s A Google a Day lists exactly those kinds of challenges: find a fact that’s hard to find.
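To make the contrast concrete, here is a minimal sketch (my own illustration, not from the talk) of per-query precision@k next to a crude session-level notion of success, where a session succeeds if any of its queries surfaces a relevant result:

```python
def precision_at_k(relevances, k):
    """Fraction of the top-k results that are relevant.
    `relevances` is a list of 0/1 judgments in rank order."""
    return sum(relevances[:k]) / k

# Single-query view: precision@5 for one result list.
print(precision_at_k([1, 0, 1, 1, 0, 1], 5))  # 0.6

def session_success(sessions):
    """A toy session-based measure: `sessions` is a list of
    per-query relevance lists; the session succeeds if any query
    retrieved at least one relevant document."""
    return any(sum(r) > 0 for r in sessions)

# First query fails, second finds the answer: the session succeeds,
# even though the first query's precision@k was zero.
print(session_success([[0, 0, 0], [0, 1, 0]]))  # True
```

The point of the sketch is that a metric computed over a single result list cannot see the second query at all; a session-based measure can.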

So how should we think about evaluating the kinds of searches that take more than one query, ones we might term session-based searches?

HCIR 2011 keynote

HCIR 2011 took place almost three weeks ago, but I am just getting caught up after a week at CIKM 2011 and an actual almost-no-internet-access vacation. I wanted to start off my reflections on HCIR with a summary of Gary Marchionini’s keynote, titled “HCIR: Now the Tricky Part.” Gary coined the term “HCIR” and has been a persuasive advocate of the concepts represented by the term. The talk used three case studies of HCIR projects as a lens to focus the audience’s attention on one of the main challenges of HCIR: how to evaluate the systems we build.

Looking for volunteers for collaborative search study

We are about to deploy an experimental system for searching through CiteSeer data. The system, Querium, is designed to support collaborative, session-based search. This means that it will keep track of your searches, help you make sense of what you’ve already seen, and help you to collaborate with your colleagues. The short video shown below (recorded on a slightly older version of the system) will give you a hint about what it’s like to use Querium.

What made you (continue to) want to write a book?

Many people have asked me why I decided to write a book. A better question is: “When you realized that writing the book was going to be orders of magnitude harder and take much longer than you thought it would, what made you decide to continue writing the book?”

My co-author, Wolfgang Polak, and I recently received a book review of the sort that is the dream of every author. A dream review is, of course, positive. But more importantly, it praises the aspects of the book that were most important to the author – the reasons the author kept going after other books on the subject came out and the author had a more reasonable (but still too optimistic) estimate of the vast amount of effort it would take to finish it. (The review appeared in Computing Reviews, but is behind a paywall. Excerpts appear on the book’s Amazon and MIT Press web pages.)

In our case, one of the things that kept us going …

CFP: 3rd Workshop on Collaborative Information Retrieval

We are organizing a third workshop on collaborative information retrieval, this time in conjunction with CIKM 2011. The first workshop, held in conjunction with JCDL 2008, focused on definitional issues, models for collaboration, and use cases. The second workshop, held in conjunction with CSCW 2010, explored communication and awareness as related to collaborative search. This third workshop will focus on system building, algorithms, and user interfaces for collaboration.

A Gentle Introduction

Our book, Quantum Computing: A Gentle Introduction (by Eleanor Rieffel and Wolfgang Polak), has been out for a little over a month. So far, it has received as much attention from weaving blogs as from science blogs, due to the card-woven bands on the cover.

MIT Press takes pride in their cover designs, but warns authors that “schedules rarely allow for individual consultation between designers and authors.” They do, however, ask authors to fill out a detailed questionnaire that includes questions asking for the authors’ thoughts with respect to a cover. It was the third question, “What would you like the viewer to think or feel when they see the cover?”, that prompted me to think that a fabric with abstract, colorful designs would suggest a “gentle” introduction to an abstract and colorful subject.

How much does time weigh?

As Miles wrote yesterday, our paper was accepted to SIGIR 2011. The idea that time has an impact on ranking documents is not new; the problem seems to be knowing when to take it into consideration. For example, while Li and Croft showed improvements in ranking when incorporating the notion of recency, we found that the algorithm degrades performance on non-temporal queries. (This is obvious, in a sense: if a ranking algorithm is biased toward more recent documents, and recency is not important for a given query, it will de-emphasize otherwise well-matching documents, thereby reducing MAP.)
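To see why the parenthetical holds, consider a toy sketch of a recency bias (my own illustration, not Li and Croft’s actual model; the exponential decay and its rate are assumptions) in which a document’s relevance score is multiplied by a factor that shrinks with its age:

```python
import math

def recency_score(base_score, age_days, decay=0.01):
    """Decay a relevance score exponentially with document age.
    A toy recency bias: `decay` controls how fast old documents fade."""
    return base_score * math.exp(-decay * age_days)

# A slightly weaker but fresh match now outranks a stronger,
# year-old match -- helpful for temporal queries, harmful otherwise.
fresh = recency_score(0.8, age_days=1)
stale = recency_score(0.9, age_days=365)
print(fresh > stale)  # True
```

If the query is non-temporal, the year-old document may be exactly what the searcher wants, and demoting it is precisely the MAP loss described above.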
