Blog Archive: 2010

Listen to the students

by Gene Golovchinsky on July 12, 2010 Comments (4)

Recently, I came across an interesting article on students’ attitudes to reading online vs. in textbooks. The article appeared in the Nieman Reports, published by the Nieman Foundation for Journalism at Harvard. Esther Wojcicki, a teacher, relates her students’ reactions to being asked to read online. She reports that

…early in the school year many of these students had written a fiery editorial about e-textbooks in their social studies classes. In part it read, “… online textbooks hinder study habits and force the use of computers. … and are detrimental to learning and inconvenient.” The editorial concluded with these words: “If the school wishes to cultivate the use of e-books, it should at the very least offer students the option to continue using the old, hardcover books.”

The teacher thought that six months of use of online reading devices (she doesn’t say which, but I am assuming that a Kindle device was involved, since she says that this happened before the iPad was released) would accustom students to the new medium. She was wrong.

Facebook UX, an analogy

by Gene Golovchinsky on July 9, 2010 Comments (1)

This may be old news to some of the true social media junkies, but thanks to Gentry Underwood’s PARC forum today, I saw a great video analogy for the Facebook interaction style. Enjoy.

The video is made by a British comedy group called Idiots of Ants; the pun becomes evident when the group’s name is pronounced with a British accent.

Session-based search

by Gene Golovchinsky on July 8, 2010 Comments (14)

Exploratory search often takes place over time. Searchers may run multiple queries to understand the collection, to refine their information needs, or to explore various aspects of the topic of interest. Many web search engines keep a history of a user’s actions: Bing makes that history readily available for backtracking, and all major search engines presumably use the click-through history of search results to affect subsequent searches. Yahoo Search Pad diagnoses exploratory search situations and switches to a more elaborate note-taking mode to help users manage the found information.

But none of these approaches makes it easy for a searcher to manage an on-going exploratory search. So what could be done differently? We explore this topic in a paper we’ll be presenting at the IIiX 2010 conference this August. Our paper reviews the literature on session-based search, and proposes a framework for designing interactions around information seeking. This framework uses the structure of the process of exploratory search to help searchers reflect on their actions and on the retrieved results. It treats queries, terms, metadata, documents, sets of queries, and sets of documents as first-class objects that the user can manipulate, and describes how information seeking context can be preserved across these transitions.
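One way to picture queries, documents, and sets of each as first-class objects is the minimal sketch below. It is not the framework from the paper: the names Document, Query, Session, seen_before, and new_in are hypothetical, chosen only to illustrate how a session might keep its history so that a searcher can see which results a later query merely repeats.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Document:
    """A retrieved document (hypothetical, illustration only)."""
    doc_id: str
    title: str


@dataclass
class Query:
    """A query together with the documents it retrieved."""
    text: str
    results: List[Document] = field(default_factory=list)


@dataclass
class Session:
    """An exploratory search session that remembers every query in order."""
    queries: List[Query] = field(default_factory=list)

    def run(self, text, results):
        # Record the query and its results; a real system would call a
        # search engine here instead of taking the results as an argument.
        query = Query(text, list(results))
        self.queries.append(query)
        return query

    def seen_before(self, query):
        """Documents in `query` that an earlier query in the session already retrieved."""
        earlier = set()
        for q in self.queries:
            if q is query:
                break
            earlier.update(q.results)
        return [d for d in query.results if d in earlier]

    def new_in(self, query):
        """Documents in `query` that no earlier query in the session retrieved."""
        seen = set(self.seen_before(query))
        return [d for d in query.results if d not in seen]


a = Document("d1", "A")
b = Document("d2", "B")
c = Document("d3", "C")

session = Session()
q1 = session.run("patent search", [a, b])
q2 = session.run("prior art search", [b, c])
repeated = session.seen_before(q2)
fresh = session.new_in(q2)
```

Because each query stays attached to its results, the session itself becomes something the searcher can inspect and compare, rather than a flat list of past query strings.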

Parsing patents, take 2

by Gene Golovchinsky on July 7, 2010 Comments (8)

Working on parsing and indexing the patent collection that Google made available has been an interesting education in just how noisy allegedly clean data really is, and in the scale of the collection. I am by no means done; in fact, I’ve had to start over a couple of times. I have learned a few things so far, in addition to my earlier observations.


Smooth ink on the iPad

by Gene Golovchinsky on July 6, 2010 Comments (4)

To try to understand the software limitations of inking on the iPad, I had earlier described an ad hoc writing experiment I had conducted on some free iPad applications designed for drawing. The goal was to understand whether the software imposed any fundamental limitations on marking on an iPad using a finger or a stylus. Because the device is designed to be operated with a finger, there seem to be some hardware-based limitations on the size of the tip of the stylus that prevent the kind of fine-grained visual feedback one needs to write. My conclusion at the time was that there was something wrong with the way applications got stroke data from the device that made all of them track so poorly.

It appears that I was over-generalizing. First, given the capabilities of the iPad platform to download and render video, it seems unlikely that the hardware is not capable of providing events fast enough; the question was really about the software. A reader of this blog pointed out that I had missed the Penultimate app, and this app was apparently quite good at handling ink. I had indeed not tested it because at the time I was testing only free apps.

CACM

by Gene Golovchinsky on July 1, 2010 Comments (1)

Recently I joined the editorial board for Communications of the ACM (CACM), ACM’s flagship magazine. While ACM members are certainly familiar with the glossy, printed copies delivered through the mail, some of you might not know that issues (and their individual articles) are also available digitally in several different formats.

In the list that follows, I link to the resource home pages when those are available, and to exemplar articles when that’s more convenient.

Good Hypertext

by Gene Golovchinsky on June 30, 2010 Comments (3)

J. Nathan Matias pointed me to Mark Bernstein’s paper (‘paper’ is an inadequate label for the work) on literary criticism and hypertext, which Mark presented at the recent Hypertext 2010 conference. It’s a great piece of writing that ably defends the literary tradition from the Barbarians of mechanical evaluation. My summary of the paper cannot do it justice, what with its 93(!) references, quotes from Pope’s An Essay on Criticism, and Mark’s typical wit (“We cannot make feature films about vertebrate paleontology or test-driven software development; too few people are interested. The same audiences profitably support numerous books.”) The paper has a visual companion in the form of a SlideShare deck; the one aspect that appears not to have been preserved (for posterity and for those with limited travel budgets) is a recording of the actual presentation.

The comment that led me to the paper seemed to offer it as a counter-example to Andrew Dillon’s thesis about methodological failures of the hypertext community to assess its impact on education. I don’t see it.

ai

by Matt Cooper on June 29, 2010 Comments (1)

Artificial intelligence has always struck me as a fittingly modest name, as I emphasize the artifice over the intelligence. Watson, a question-answering system, has recently been playing Jeopardy against humans to test the “DeepQA hypothesis”:

The DeepQA hypothesis is that by complementing classic knowledge-based approaches with recent advances in NLP, Information Retrieval, and Machine Learning to interpret and reason over huge volumes of widely accessible naturally encoded knowledge (or “unstructured knowledge”) we can build effective and adaptable open-domain QA systems. While they may not be able to formally prove an answer is correct in purely logical terms, they can build confidence based on a combination of reasoning methods that operate directly on a combination of the raw natural language, automatically extracted entities, relations and available structured and semi-structured knowledge available from for example the Semantic Web.

As a researcher, I’m excited at the milestone this represents.

Reading on Papers

by Gene Golovchinsky on June 28, 2010 Comments (2)

I am trying to understand the capabilities of existing iPad applications with respect to active reading. In this spirit, I have reviewed iAnnotate, and have written about e-books in general. Mekentosj Papers is a Mac application for managing academic papers; a version of it has been ported to the iPad. The idea is that you can use it to find papers you need to read, read them, and also manage their re-finding. The app fails on all counts.


Patent Search workshop at CIKM 2010

by Gene Golovchinsky on June 25, 2010

The 3rd workshop on Patent Information Retrieval (PAIR 2010) will be held in conjunction with CIKM 2010 on October 26th. Patents pose specific challenges with respect to information retrieval, and thus it’s unsurprising that the topic should receive focused attention in a series of workshops. What’s particularly interesting about this workshop is that rather than focusing solely on technical issues, its CFP specifically invites participation from patent retrieval practitioners:

We encourage IP professionals to present their special information needs and IR&KM researchers to present relevant technical ideas, for example for high recall search in prior art searching.

I really like this grounded approach to a complex problem space. Bringing together researchers and domain experts should benefit both groups: researchers should be able to draw on specific use cases and get a better understanding of searchers’ information needs, while patent search domain experts can get exposure to new tools and interfaces. I would love to see this approach repeated for other domains that involve information seeking, such as medicine, law, and intelligence analysis.

Now all I have to do is figure out how to attend it and the BooksOnline’10 workshop at the same time.
