Blog Category: Research

Patent Search workshop at CIKM 2010

by Gene Golovchinsky on June 25, 2010

The 3rd workshop on Patent Information Retrieval (PAIR 2010) will be held in conjunction with CIKM 2010 on October 26th. Patents pose specific challenges with respect to information retrieval, and thus it’s unsurprising that the topic should receive focused attention in a series of workshops. What’s particularly interesting about this workshop is that rather than focusing solely on technical issues, its CFP specifically invites participation from patent retrieval practitioners:

We encourage IP professionals to present their special information needs and IR&KM researchers to present relevant technical ideas, for example for high recall search in prior art searching.

I really like this grounded approach to a complex problem space. Bringing together researchers and domain experts should benefit both groups: researchers should be able to draw on specific use cases and get a better understanding of searchers’ information needs, while patent search domain experts can get exposure to new tools and interfaces. I would love to see this approach repeated for other domains that involve information seeking, such as medicine, law, and intelligence analysis.

Now all I have to do is figure out how to attend it and the BooksOnline’10 workshop at the same time.

Inking Renaissance?

by Gene Golovchinsky on June 24, 2010

In a recent post, James Landay compared Dan Bricklin’s note-taking app with a research project called NotePals done at FXPAL during a summer internship by Richard Davis, James’ student. The idea behind both is that writing on a small device (or with poor spatial resolution) is hard, but if you write large and then scale down the ink, you get much more legible results.

Dan’s iPad app works great for this purpose, and with only a little practice one can get really proficient with it. I’ve used it as my primary sketching tool on the iPad, including for sketching interface designs. I wish I could import background images into it for sketching on, but otherwise it’s a nice basic tool. The same idea (write on a zoomed out image and then shrink the ink) works great on the iAnnotate app as well, although the interaction is not really optimized for that the way that Bricklin’s app is.

Link & Learn

by Gene Golovchinsky on June 23, 2010 Comments (10)

The Memex concept

One of the highlights of this year’s Hypertext conference (which I missed) was Andrew Dillon’s opening keynote. He is a great speaker (the Irish accent doesn’t hurt), and it would have been great to see the talk. Perhaps a recording will materialize eventually. In the meantime, there is the written version that reviews the state of Hypertext research 65 years after some of its tenets were articulated by Vannevar Bush in the famous “As We May Think” article in The Atlantic.

Achieving impact

by Gene Golovchinsky on June 22, 2010 Comments (4)

The impact of academic computer-human interaction research on the real world has been debated repeatedly over the last few years. The criticism is that HCI research isn’t that relevant, and that really innovative interfaces (such as Apple’s iPhone) are designed by outsiders, without input from HCI researchers. My sense is that things are not so dire, that there is a trickle-down effect, and that practitioners do pay attention to research results when those results are packaged effectively.

But the criticism is not completely without merit, and only a few systems described in the CHI and UIST literature (to take two academic examples) actually make it into products. On the other hand, one finds examples of transformative work (e.g., Tim Berners-Lee’s framework for the World Wide Web) being rejected by top-tier conferences.

Thus it gives me great pleasure to point to an academic success that is also succeeding in the real world. I am talking about ShapeWriter.

Google’s Patent Search “feature”

by Gene Golovchinsky on June 16, 2010 Comments (1)

While poking around on the USPTO and Google to try to figure out how to get single PDF documents for my indexing project, I discovered that the Google advanced search interface won’t retrieve any documents based on the inventor field. I ran the searches three ways: by typing an inventor’s name into the Google patent search box, by typing it into the advanced search form on Google, and by entering it into the USPTO’s advanced search form. I expected the first set of results to be the largest, as it may include hits where the inventor is referenced by some other patent, but the latter two should have returned the same number of hits. The results for a couple of searches are shown below; you can run your vanity search yourself.

Inventor	Google	Google advanced	USPTO
Gene Golovchinsky	41	0	21
Andreas Girgensohn	52	0	29
Daniel Tunkelang	9	0	8

I don’t know if this is a metadata problem (along the lines of the metadata issues that came up in the context of Google Books), or if it is a UI/front-end issue. In any case, it seems odd that testing didn’t catch this bug.

CFP: BooksOnline ’10

by Gene Golovchinsky on June 15, 2010 Comments (3)

The BooksOnline ’10 workshop will be held on October 26, 2010 in Toronto, Ontario, Canada, in conjunction with the CIKM 2010 conference. The goal of the workshop is to bring together researchers with interests related to various aspects of online reading, including digital collections, user experience, and design and technology. See the Call for Papers for a more detailed description of relevant topics. The workshop is organized by Gabriella Kazai (Microsoft Research, UK) and Peter Brusilovsky (University of Pittsburgh).

Parsing patents

by Gene Golovchinsky on June 14, 2010 Comments (5)

Since Google announced its distribution of patents, I have been poking around the data trying to understand what’s in there and starting to index it for retrieval. The first challenge I’ve had to deal with is data formats. The second is how to display documents to users efficiently.

The full text of the patents is available in ZIP files, one file per week, based on the date patents were granted. The files cover patents issued from 1976 to (as of this writing) the first week of 2010. In addition to the text, they contain all manner of metadata such as when the patent was filed, who the inventors and assignees were, etc. Interestingly, the zipped up files are in two different formats: patents from 2001 on are in XML, while earlier ones are in a funky ad hoc text format.
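
To make the format split concrete, here is a minimal Python sketch of how one might route a weekly archive to the right parser. It assumes only what is described above: one ZIP per week, XML for patents from 2001 on, and the ad hoc text format before that. The function names (`patent_format`, `read_weekly_archive`) are hypothetical, and the UTF-8 decoding is my assumption rather than something the data guarantees.

```python
import io
import zipfile


def patent_format(grant_year: int) -> str:
    """Return the assumed full-text format for a weekly file from the given year."""
    if grant_year < 1976:
        raise ValueError("the full-text files start with patents issued in 1976")
    # Patents from 2001 on are in XML; earlier ones use the ad hoc text format.
    return "xml" if grant_year >= 2001 else "text"


def read_weekly_archive(zip_bytes: bytes, grant_year: int):
    """Yield (member name, format, decoded text) for each file in one weekly ZIP."""
    fmt = patent_format(grant_year)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            # Assumption: UTF-8 is good enough; undecodable bytes are replaced.
            text = archive.read(name).decode("utf-8", errors="replace")
            yield name, fmt, text
```

Each yielded tuple can then be handed to an XML parser or to a custom parser for the older text format, depending on the format label.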

Intended to deceive

by Gene Golovchinsky on June 10, 2010 Comments (2)

The ‘sphere is a-twitter about BP’s buying keywords (e.g., “oil spill”, “BP”, “gulf disaster”) to place links to their versions of the story at the top of the search results. ABC News writes:

According to Kevin Ryan, the CEO of California-based Motivity Marketing, research shows that most people can’t tell the difference between a paid result pages, like the ones BP have, and actual news pages.

So we have two issues: one related to BP, and one related to the search engines.

Searching for a Houzz

by Gene Golovchinsky on June 9, 2010

Miles Efron and I have written about micro-IR in the past (see here, here, and here), and I recently came across another interesting example in the form of the Houzz App for the iPad. Houzz is an interface that fronts a collection of photographs of house interiors, the kind of stuff you might find in magazines and interior design/decoration books. It provides an (imperfect) browsing and search interface to find images by geographic area, by room function, etc. It also has a mode which brings together sets of images on a theme, curated by a designer with a blog. Each set of images comes with an introduction by the blogger, a bit of background on the person, commentary on each image, and even blog-like discussions among readers and designers associated with each theme.

Kindle’s fate

by Gene Golovchinsky on June 7, 2010 Comments (13)

Last week I made a handshake bet that Amazon will stop selling the Kindle device in a year’s time. Today I am putting it in writing. Amazon will stop selling its devices for several reasons: because the margins are higher on books, because ultimately people won’t want to have multiple, specialized devices with significantly overlapping functions, and because the devices themselves are quite limited.