Blog Category: Research

How to compute without knowing anything

In my post on quantum-inspired classical results, I gave as one example Gentry’s recent discovery of a fully homomorphic encryption scheme. His beautiful work deserves its own blog post. Initially I approached his work with trepidation, worried that it would be so technical I would not understand anything without a lot of work. Others have mentioned not having looked at his work for the same reason. That is a shame! While the details are technical, the key idea, bootstrappable encryption, is both a non-obvious approach and an easily understandable concept. I remember smiling as I read the first couple of pages of his paper, struck by the elegance and surprising simplicity of his approach.

Exploring workplace communication

Modern work is a collaborative enterprise. As such, it depends on communication among the collaborators to reach successful outcomes. An increasing number of communication tools are based on somewhat recent computer technologies, such as email, blogs, wikis, social networking, and Twitter. While there have been many studies of single communication tools in the workplace (IM, wikis, blogging, etc.), we believe that we are among the first to take a broad view of the communication landscape since the introduction of these new technologies.

In our paper, to be presented at CHI 2010, we explored the communication ecology of a small business. We examined the work communication practices of our participants, including what methods people used to communicate and why, how they viewed the various methods and how they adopted them.

What do we mean by “Search in Social Media”?

Jeremy and I have been busy preparing for the Search in Social Media (SSM2010) workshop. We thought we would start at the beginning and ask what people understood by the term “search in social media.” Workshops often spend a bunch of time on definitions, and we thought we’d jump in early. We’ve talked about social search before, but that was without reference to social media.

We think the phrase ‘search in social media’ has been used to refer to both the information being searched, and to the process for doing so. The information is standard user-generated content — tweets, blog posts, comment threads, tags, etc. The process, however, seems less well understood.

Finding facets

I’ve been messing around with Twitter search, which (on a small scale) led me to store structured tweet, people and document data. I used a relational database to store the data I got from Twitter, and everything worked just fine. (That is, performance was limited by the Twitter API and Twitter search API, not by my database.) But say you have lots of data, and it includes text and structure, and you want to search it. What if you’re Twitter or LinkedIn? Can you still use MySQL or Oracle or whatever to store your data and serve up search results?

At a recent SDForum talk on the search capabilities of LinkedIn, John Wang described how LinkedIn handles its faceted search. The talk covered a wide range of topics around managing scalability that are undoubtedly shared by many web companies: how to handle real-time updates, how to scale to millions of users, etc. LinkedIn uses Lucene and other related tools, and to their credit has made contributions to the Lucene open source tool set, including Bobo and Zoie.
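
To make the idea of faceted search concrete, here is a minimal in-memory sketch in Python. It is not how LinkedIn, Lucene, Bobo, or Zoie implement it; the profile records, field names, and the `search` and `facet_counts` helpers are all hypothetical, made up for illustration. It shows the two halves of the problem: matching free text, and counting how the matches break down by structured fields so a user can narrow the results.

```python
from collections import Counter

# Hypothetical profile records; names and fields are illustrative only.
PROFILES = [
    {"name": "Ana", "industry": "Software", "location": "Bay Area",
     "bio": "search engineer working on Lucene"},
    {"name": "Bo", "industry": "Software", "location": "New York",
     "bio": "builds search infrastructure"},
    {"name": "Cy", "industry": "Finance", "location": "Bay Area",
     "bio": "analyst interested in search and data"},
    {"name": "Di", "industry": "Finance", "location": "New York",
     "bio": "portfolio manager"},
]


def search(profiles, text, filters=None):
    """Return profiles whose bio contains `text` and match every facet filter."""
    filters = filters or {}
    needle = text.lower()
    return [
        p for p in profiles
        if needle in p["bio"].lower()
        and all(p.get(field) == value for field, value in filters.items())
    ]


def facet_counts(results, fields):
    """Count how many results fall under each value of each facet field."""
    return {field: Counter(p[field] for p in results) for field in fields}


# Text query first, then the facet breakdown shown alongside the results.
hits = search(PROFILES, "search")
counts = facet_counts(hits, ["industry", "location"])

# Selecting a facet value narrows the same text query.
narrowed = search(PROFILES, "search", {"location": "Bay Area"})
```

A linear scan like this is fine for a handful of records. The scalability questions raised in the talk (real-time updates, millions of users) are exactly where it stops being adequate and an inverted index becomes necessary.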

What If Everyone Were Number One?

I’ve been doing a bit of thinking lately about search engines, algorithmic openness, and spammers. I suppose this was all prompted by a recent blog post on the Meaning of Open: http://googleblog.blogspot.com/2009/12/meaning-of-open.html

In this post, it is claimed that openness is good: open systems, open source, open data. This claim is held forth as true…for everything except for search algorithms. In the case of algorithms, the secret sauce must be kept exactly that: secret. Spammers would otherwise have too much power.

That claim makes me want to play around with a little thought experiment. What if the search algorithm were indeed fully open? What if everyone in the world knew exactly how rankings were done and could modify their web pages to adapt to whatever the ranking function is? In short, what if everyone were number one?

SSM2010 panel: Research Directions for Search in Social Media

The third workshop on Search in Social Media (SSM2010) will be held in conjunction with WSDM 2010 in early February. The workshop, organized this year by Eugene Agichtein (Emory University), Marti Hearst (University of California, Berkeley), Ian Soboroff (NIST), and Daniel Tunkelang (Google), will bring together academics and people from industry (including the major search engines). The keynote will be given by Jan Pedersen, who is now Chief Scientist for Core Search at Microsoft. It will address issues of what the big players are doing, what the more specialized social media companies are up to, and will also tackle important research problems in the field.

#Google #search for #Twitter? #fail!

For a while now, Google has been serving up tweets related to searches as part of its real-time search effort. Now they are making it possible to search the Twitter stream in exactly the way Twitter doesn’t allow — that is, to search for tweets older than a few days. A query like

cyberwarfare site:twitter.com

will return a bunch of tweets, formatted as Google search results. As of the time I ran this query, it identified 1,380 hits from Twitter. Twitter’s search yielded about 250 tweets, going back to no more than 10 days ago. So far, so good.

Position papers for Collab Info Seeking workshop

We had a record crop of position papers for the Collaborative Information Seeking (CIS) workshop we’re organizing at CSCW 2010. Underscoring the ubiquity of collaboration in information seeking, the position papers address everything from health care to emergency response to Second Life to the information seeking ecology within the enterprise. The papers clustered into several broad categories, although some papers could easily have been classified in more than one way.

CFP: IIiX 2010

If you are doing research in interactive information retrieval, information seeking, collaborative search, and the like (that is, you’re concerned with what users do when they look for information), you might consider submitting a paper to IIiX 2010.

IIiX will explore the relationships between the contexts that affect information retrieval and information seeking, how these contexts impact information behavior, and how knowledge of information contexts and information behaviors can help design truly interactive information systems.

Summer Intern Position in HCIR

This is one in a series of posts advertising internship positions at FXPAL for the summer of 2010. A listing of all internship positions currently posted is available here.

The focus of Human-Computer Information Retrieval (HCIR) is to help people find and make sense of the information that satisfies their evolving information needs, and to do so with an emphasis on interaction and not just on clever algorithms that attempt to approximate users’ intent. Over the past couple of years, we have developed some novel information retrieval algorithms such as collaborative search. While we have evaluated the work in various ways (e.g., evaluating algorithms offline and testing with people on artificial information needs), we have not tested the algorithms with people who have real information needs.
