Blog Archive: 2010

Exploring workplace communication

on Comments (1)

Modern work is a collaborative enterprise. As such, it depends on communication among the collaborators to reach successful outcomes. An increasing number of communication tools are based on somewhat recent computer technologies, such as email, blogs, wikis, social networking, and Twitter. While there have been many studies of single communication tools in the workplace (IM, wikis, blogging, etc.), we believe that we are among the first to take a broad view of the communication landscape since the introduction of these new technologies.

In our paper, to be presented at CHI 2010, we explored the communication ecology of a small business. We examined the work communication practices of our participants, including what methods people used to communicate and why, how they viewed the various methods and how they adopted them.

Continue Reading

What do we mean by “Search in Social Media”?

on Comments (3)

Jeremy and I have been busy preparing for the Search in Social Media (SSM2010) workshop. We thought we would start at the beginning and ask what people understood by the term “search in social media.” Workshops often spend a bunch of time on definitions, and we thought we’d jump in early. We’ve talked about social search before, but that was without reference to social media.

We think the phrase ‘search in social media’ has been used to refer to both the information being searched, and to the process for doing so. The information is standard user-generated content — tweets, blog posts, comment threads, tags, etc. The process, however, seems less well understood.

Continue Reading

Finding facets

on Comments (6)

I’ve been messing around with Twitter search, which (on a small scale) led me to store structured tweet, people and document data. I used a relational database to store the data I got from Twitter, and everything worked just fine. (That is, performance was limited by the Twitter API and Twitter search API, not by my database.) But say you have lots of data, and it includes text and structure, and you want to search it. What if you’re Twitter or LinkedIn? Can you still use MySQL or Oracle or whatever to store your data and serve up search results?

At a recent SDForum talk on the search capabilities of LinkedIn, John Wang described how LinkedIn handles its faceted search. The talk covered a wide range of topics around managing scalability that are undoubtedly shared by many web companies: how to handle real-time updates, how to scale to millions of users, etc. LinkedIn uses Lucene and other related tools, and to their credit has made contributions to the Lucene open source tool set, including Bobo and Zoie.
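To make the idea of faceted search concrete, here is a minimal in-memory sketch in Python. It is not how LinkedIn, Lucene, Bobo, or Zoie work internally; the records, field names, and helper functions are all hypothetical, and a real system would use an inverted index rather than scanning a list. It only shows the basic pattern: run a text query, count the matching results per facet value, then narrow by a chosen value.

```python
from collections import Counter

# Hypothetical records: structured fields (facets) plus free text.
PROFILES = [
    {"industry": "software", "location": "San Francisco",
     "summary": "search engineer working on Lucene"},
    {"industry": "software", "location": "New York",
     "summary": "search quality analyst"},
    {"industry": "research", "location": "San Francisco",
     "summary": "information retrieval and search researcher"},
    {"industry": "finance", "location": "New York",
     "summary": "portfolio manager"},
]


def search(records, query):
    """Return records whose summary contains every query term as a whole word."""
    terms = query.lower().split()
    return [
        r for r in records
        if all(t in r["summary"].lower().split() for t in terms)
    ]


def facet_counts(records, fields):
    """Count how many of the given records fall under each value of each facet field."""
    return {f: Counter(r[f] for r in records) for f in fields}


def refine(records, field, value):
    """Narrow a result set to the records with the selected facet value."""
    return [r for r in records if r[field] == value]


hits = search(PROFILES, "search")
counts = facet_counts(hits, ["industry", "location"])
software_hits = refine(hits, "industry", "software")
```

A linear scan like this is fine for a few thousand rows, which matches the small-scale case described above; the scalability questions raised in the talk begin where this approach stops being fast enough.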

Continue Reading

Does IP matter?

on Comments (1)

Panos Ipeirotis recently wrote about the confusing state of affairs with respect to intellectual property at his University. In some sense, this is ironic, since the whole point of a University is to produce intellectual property. But I suppose the question isn’t really one of production, but rather of distribution and of consumption. It’s clear that the faculty and students who develop the ideas should own (i.e., receive credit for) those ideas. But once an idea is published, how it gets used is a different story.

With others (e.g., Christopher Browne) I have often wondered why a public university (or a private one that receives significant federal funding for research) has any rights to patent the results of its research. After all, government employees are not allowed to patent the results of their work done for the government; why should government-funded work at universities be different?

Furthermore, does it matter to a University to hold patents, particularly software patents?

Continue Reading

What If Everyone Were Number One?

I’ve been doing a bit of thinking lately about search engines, algorithmic openness, and spammers. I suppose this was all prompted by a recent blog post on the Meaning of Open: http://googleblog.blogspot.com/2009/12/meaning-of-open.html

In this post, it is claimed that openness is good: open systems, open source, open data.  This claim is held forth as true…for everything except for search algorithms.   In the case of algorithms, the secret sauce must be kept exactly that: secret.  Spammers would otherwise have too much power.

That claim makes me want to play around with a little thought experiment. What if the search algorithm were indeed fully open? What if everyone in the world knew exactly how rankings were done, and could modify their web pages to adapt to whatever the ranking function is? In short, what if everyone were number one?

Continue Reading

SSM2010 panel: Research Directions for Search in Social Media

on Comments (3)

The third workshop on Search in Social Media (SSM2010) will be held in conjunction with WSDM 2010 in early February. The workshop, organized this year by Eugene Agichtein (Emory University), Marti Hearst (University of California, Berkeley), Ian Soboroff (NIST), and Daniel Tunkelang (Google), will bring together academics and people from industry (including the major search engines). The keynote will be given by Jan Pedersen, who is now Chief Scientist for Core Search at Microsoft. It will address issues of what the big players are doing, what the more specialized social media companies are up to, and will also tackle important research problems in the field.

Continue Reading

#Google #search for #Twitter? #fail!

on Comments (9)

For a while now, Google has been serving up tweets related to searches as part of its real-time search effort. Now they are making it possible to search the Twitter stream in exactly the way Twitter doesn’t allow — that is, to search for tweets older than a few days. A query like

cyberwarfare site:twitter.com

will return a bunch of tweets, formatted as Google search results. As of the time I ran this query, it identified 1,380 hits from Twitter. Twitter’s search yielded about 250 tweets, going back to no more than 10 days ago. So far, so good.

Continue Reading

Position papers for Collab Info Seeking workshop

We had a record crop of position papers for the Collaborative Information Seeking (CIS) workshop we’re organizing at CSCW 2010. Underscoring the ubiquity of collaboration in information seeking, the position papers address everything from health care to emergency response to Second Life to the information seeking ecology within the enterprise. The papers clustered into several broad categories, although some papers could easily have been classified in more than one way.

Continue Reading

Summer Internship in Virtualization (IT Group)

on Comments (1)

This is one in a series of posts advertising internship positions at FXPAL for the summer of 2010. A listing of all internship positions currently posted is available here.

FXPAL’s information technology team is looking for an intern who can work with the team to help evaluate and develop a number of virtualization technologies for deployment within the lab as well as in select research prototypes.

We currently have two projects: one to evaluate VMware View Desktop Virtualization, and the other to develop user-facing VM Management & Monitoring applications. (You can apply for one or for both positions.)

Continue Reading

Twitter and disasters waiting to happen

on Comments (5)

The recent earthquake in Haiti has attracted attention from Twitter users and researchers. Twitter has been used to collect donations, to contact people on the ground, to coordinate relief efforts, etc. Recently, U. Colorado’s EPIC Group proposed a hash-tag-based syntax on top of Twitter messages to help automate the parsing of actionable messages, and to do so effectively and reliably. This is a noble effort, but as Manas Tungare points out, the proposed syntax is too complex for its intended users, who have more pressing issues than dealing with hash tags.

Continue Reading