Blog Archive: 2010

Links and chains

Comments (1)

Hereford Cathedral Chained Library, Hereford, England

I came across an article in The Chronicle of Higher Education on St. Leo’s University, whose library is investing heavily in electronic titles for its students. This makes sense for them because a large number of their students are off-campus (and perhaps even off-continent). The article didn’t go into much detail on how students would actually read these books (other than to mention “computers, smartphones, and iPads”). I expect that most of the interaction with the books will consist of clicking on links in a browser, without the benefit of interfaces for active reading.

What intrigued me more were the comments, particularly the one by zenbrarian, who pointed out that these e-libraries are typically implemented by the library obtaining electronic access to titles without actually hosting the books itself. That makes sense if a library doesn’t want to get too deeply into the IT business, but it does mean that the publisher not only retains the right to jack up the fees at will, but also maintains control over who gets to read the books.

Maintaining relevance

Comments (14)

Large companies often find it difficult to innovate, but not for lack of trying. Most major corporations have regular means by which new product or product line ideas are vetted. Unfortunately, such processes are designed to select incremental improvements to existing products and services, rather than to introduce radically new offerings. One significant reason for this is that when senior management considers a new idea, they often look at the revenue stream, and compare it, implicitly, with revenue streams from existing (successful) products. Proposed products that are not immediately comparable in their revenue streams with existing product lines are often not approved.

This is the trap of success. A company learns how to do something well, and then gets stuck in that rut. When the market changes, few companies are able to systematically get out of that rut and regain their past levels of success in new areas. IBM, with its reliance on mainframe computing, Xerox, with its xerography patents, and Microsoft, which ignored the Internet for a long time, are examples of companies that actually managed to survive these painful transitions; many other companies did not.

This analogy applies to research as well.

Pivot

Comments (6)

Not having gone to SIGIR 2010, I missed Gary Flake’s keynote address, in which he described and demonstrated Microsoft Pivot, a zoomable, faceted search interface that his group built. Jeff Dalton has a good summary of the talk, which parallels Gary’s previous presentations, including a TED talk (video below). The demos are pretty slick, and the scale at which the system operates is impressive.

In some ways, his emphasis on rich clients and interactive control over large, pre-computed data sets is a great illustration of HCIR principles. The user is encouraged to explore by performing fluid, immediate, reversible operations over large data sets with the goal of finding useful information.

Exploring diversity of SIGIR

Comments (1)

I have been curious about the evolution of research interests in the IR community for a while, and have recently decided to do something quantitative about it. My plan is to track how different aspects of the field wax and wane throughout the conference series. To start off, I decided to compare SIGIR 2010 with SIGIR 2000. This is an arbitrary starting point, but I wanted to do something topical (relevant?) to start.

A concept by any name

Comments (1)

Miles Efron wrote about a research project he is starting on statistical processing of 17th and 18th century English texts, with the goal of establishing similarities between passages written with different spelling and vocabulary. This is a problem that humanities scholars might have when applying modern information retrieval tools to historical texts, as accepted English spelling and vocabulary were considerably more varied than they are now. (For a fun read about some of the issues, see Bill Bryson’s The Mother Tongue on the history of the English language.)

Promoting TunkRank

Comments (3)

Early last year, Daniel Tunkelang proposed a way to measure people’s influence on Twitter; this metric was dubbed TunkRank, and Jason Adams put up an implementation of it that people could use to calculate their (and others’) scores. The site has been evolving, and getting slicker. It even has an API for incorporating these scores into other applications.

The basic premise of the algorithm is that it’s not how many followers you have, but how influential they are. Your influence flows from them. For those interested in more details and rationale about the algorithm, Daniel’s slides from a recent talk offer a nice overview. What’s also interesting, as pointed out in the comments on his post, is that this model, proposed on the blog and never published in a peer-reviewed forum, has become quite influential.
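To make the "influence flows from your followers" idea concrete, here is a minimal sketch. It assumes the commonly cited formulation of TunkRank, in which each follower Y contributes (1 + p * Influence(Y)) / |Following(Y)| to the person being followed, with p the probability that a follower passes a tweet along. The function name, the value of p, the fixed iteration count, and the toy graph are all illustrative assumptions, not taken from the TunkRank site or its API.

```python
from collections import defaultdict


def tunkrank(following, p=0.05, iterations=50):
    """Iteratively estimate TunkRank-style influence scores.

    `following` maps each user to the set of accounts that user follows.
    `p` is an assumed retweet probability (hypothetical value).
    """
    # Collect every user mentioned, whether following or followed.
    users = set(following)
    for followed in following.values():
        users.update(followed)

    # Invert the graph: followers[x] is the set of users who follow x.
    followers = defaultdict(set)
    for user, followed in following.items():
        for target in followed:
            followers[target].add(user)

    influence = {u: 0.0 for u in users}
    for _ in range(iterations):
        # Each follower's attention is split evenly among everyone they follow.
        influence = {
            x: sum(
                (1.0 + p * influence[y]) / len(following[y])
                for y in followers[x]
            )
            for x in users
        }
    return influence


# Hypothetical toy graph: carol and dave each have exactly two followers.
toy_graph = {
    "alice": {"carol"},
    "bob": {"carol", "dave"},
    "carol": {"dave"},
    "dave": set(),
}
scores = tunkrank(toy_graph)
```

In this toy graph carol and dave have the same number of followers, but dave ends up with the higher score because one of his followers (carol) is herself influential, which is the point of the premise above.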

Streaming media and users

Just a short note to point at two articles on Facebook that discuss issues relating to streaming media and the home. It is a continuing frustration that the vendors are not building the open environment we all want. No surprise there. But it is interesting that even when a vendor (Apple) has many of the required pieces it does not put them together well.

First note: my posting about the Sonos system I have installed at home. I am a big fan of Sonos – we now use our iPad sitting by the TV to control it. Now, if we could just get control and inter-operation between more devices.

Second note: Surendar Chandra has an interesting take on how Apple has all the pieces needed to make for a better environment but they don’t seem to do it.

Can you patent a page turn?

Comments (5)

In a recent Bits column, Nick Bilton wrote about a Microsoft patent application that claims a curling page transition when flipping pages on a touch display. Very much the sort of thing you find on the iBooks app on the iPad, and on other applications. Very much the sort of thing that Ian Witten’s group has been writing about for years. I am not an expert on patents, but it seems to me that various aspects claimed by the Microsoft patent can be found in the following papers:

  • Chu, Y., Witten, I. H., Lobb, R., and Bainbridge, D. 2003. How to turn the page. In Proceedings of the 3rd ACM/IEEE-CS Joint Conference on Digital Libraries (Houston, Texas, May 27 – 31, 2003). International Conference on Digital Libraries. IEEE Computer Society, Washington, DC, 186-188.
  • Liesaputra, V., Witten, I. H., and Bainbridge, D. 2007. Lightweight realistic books: the greenstone connection. In Proceedings of the 7th ACM/IEEE-CS Joint Conference on Digital Libraries (Vancouver, BC, Canada, June 18 – 23, 2007). JCDL ’07. ACM, New York, NY, 502-502.

Computing with Secrets

Comments (1)

Tom Simonite of Technology Review interviewed me about the breakthrough in fully homomorphic encryption that I blogged about here. I very much enjoyed talking with him, and was pleased to see that he wrote a good article on the subject: Computing with Secrets, but Keeping them Safe: A cryptographic method could see cloud services work with sensitive data without ever decrypting it. He quotes me a couple of times on the second page of the article and generously gives me the last word.

I’ve been surprised at how little has been written about this breakthrough, little enough that my blog post continues to be among the top 20 hits for a number of related queries. The field is definitely hot, with DARPA recently announcing two related solicitations, DARPA-RA-10-80 and DARPA-BAA-10-81, on PROgramming Computation on EncryptEd Data (PROCEED). The first solicits research proposals for development of new mathematical foundations for efficient computation on encrypted data via fully homomorphic encryption. The second solicitation is broader, with the goal of developing practical methods for computation on encrypted data without decrypting the data and modern programming languages to describe these computations.
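For readers who have not seen computation on encrypted data before, a much older and much weaker property gives the flavor. The sketch below is not the fully homomorphic scheme discussed in the article; it is unpadded "textbook" RSA, which happens to be homomorphic for multiplication only. The tiny primes, the function names, and the sample values are illustrative assumptions, and nothing here is secure.

```python
# Textbook RSA with tiny, insecure parameters (illustration only).
p, q = 61, 53
n = p * q                          # public modulus
e = 17                             # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (needs Python 3.8+)


def encrypt(m):
    """Encrypt an integer 0 <= m < n with the public key."""
    return pow(m, e, n)


def decrypt(c):
    """Decrypt a ciphertext with the private key."""
    return pow(c, d, n)


# The "server" multiplies two ciphertexts without ever seeing 7 or 6.
a, b = 7, 6
c_product = (encrypt(a) * encrypt(b)) % n

# Only the key holder can decrypt, and recovers the product of the plaintexts.
result = decrypt(c_product)
```

Here `result` equals `a * b` (as long as the product stays below `n`). A fully homomorphic scheme supports both addition and multiplication on ciphertexts, and therefore arbitrary computations, which is what makes the breakthrough significant.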

Computing with Secrets, but Keeping them Safe

Boolean illogic

Comments (5)

I am trying to understand how Google patent search works, and am encountering some quite odd behavior. I am not talking about the inventor search bug (which is still un-fixed), but about Boolean logic.

If I run the query [“information retrieval”], the system retrieves 323 documents. Similarly, [“dynamic hypertext”] retrieves 368 documents. The combination, [“information retrieval” “dynamic hypertext”] yields 16. Putting a plus in front of either quoted phrase does not affect the results. So far, this seems reasonable.
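As a baseline for what "reasonable" means here, the following sketch shows the set semantics one would expect from strict Boolean retrieval. The toy collection and the `phrase_postings` helper are hypothetical, and this says nothing about how Google patent search is actually implemented. The last lines also assume the reported hit counts are exact set sizes, which may not hold for a real search engine.

```python
# Hypothetical toy collection: document id -> text.
docs = {
    1: "Information retrieval with dynamic hypertext links",
    2: "A dynamic hypertext system",
    3: "Evaluating information retrieval systems",
    4: "Page turning in digital libraries",
}


def phrase_postings(phrase, collection):
    """Return the ids of documents containing the exact phrase (case-insensitive)."""
    phrase = phrase.lower()
    return {doc_id for doc_id, text in collection.items() if phrase in text.lower()}


ir = phrase_postings("information retrieval", docs)
dh = phrase_postings("dynamic hypertext", docs)

both = ir & dh    # implicit AND: documents matching both phrases
either = ir | dh  # OR: documents matching at least one phrase

# Invariants that strict Boolean semantics guarantee.
assert len(both) <= min(len(ir), len(dh))
assert len(either) == len(ir) + len(dh) - len(both)

# Applying the same identity to the counts reported above (323, 368, 16):
expected_or = 323 + 368 - 16
```

Under these assumptions, an OR of the two phrases should retrieve `expected_or` documents, that is, 675, and any AND result should be no larger than the smaller of the two individual result sets.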
