{"id":2948,"date":"2010-02-10T07:41:48","date_gmt":"2010-02-10T15:41:48","guid":{"rendered":"http:\/\/palblog.fxpal.com\/?p=2948"},"modified":"2010-02-08T11:42:05","modified_gmt":"2010-02-08T19:42:05","slug":"making-sense-of-twitter-search","status":"publish","type":"post","link":"https:\/\/blog.fxpal.net\/?p=2948","title":{"rendered":"Making sense of Twitter search"},"content":{"rendered":"<p>Last week Jeremy and I attended the SSM2010 workshop held in conjunction with WSDM2010. In addition to <a title=\"SSM2010 | FXPAL Blog\" href=\"http:\/\/palblog.fxpal.com\/?p=2927\" target=\"_blank\">chairing<\/a> one of the panels, I got an opportunity to demonstrate an interface that I built to browse Twitter search results, to which Daniel alluded in his <a title=\"Third Workshop on Search and Social Media (SSM 2010) | CACM Blog\" href=\"http:\/\/cacm.acm.org\/blogs\/blog-cacm\/71444-third-workshop-on-search-and-social-media-ssm-2010\/fulltext\" target=\"_blank\">summary<\/a> of the workshop. The system is described in a <a title=\"Golovchinsky, G. and Efron, M. (2010) Making sense of Twitter Search. In Proc. CHI2010 Workshop on Microbloggging. April 2010.\" href=\"http:\/\/www.fxpal.com\/?p=abstract&amp;abstractID=555\" target=\"_blank\">position paper<\/a> (co-authored with Miles Efron) that has been accepted to the <a title=\"Microblogging: What and How Can We Learn From It? Workshop held at CHI 2010. \" href=\"http:\/\/www.cs.unc.edu\/~julia\/chi2010.html\" target=\"_blank\">Microblogging workshop<\/a> held in conjunction with <a title=\"CHI 2010 Conference\" href=\"http:\/\/www.chi2010.org\/\" target=\"_blank\">CHI 2010<\/a>.<\/p>\n<p>The idea behind this interface is that Twitter displays its search results only by date, thereby making it difficult to understand anything about the result set other than what the last few tweets were. 
But tweets are structurally rich: they carry metadata such as the identity of the tweeter, a possible threaded conversation, and mentioned documents. The system we built is an attempt to explore how to bring <a title=\"Human-Computer information retrieval | Wikipedia\" href=\"http:\/\/en.wikipedia.org\/wiki\/Human%E2%80%93computer_information_retrieval\" target=\"_blank\">HCIR<\/a> techniques to this task.<\/p>\n<p><!--more-->Each tweet is classified as an &#8220;original&#8221; tweet or a retweet. Retweets and replies are grouped into conversations. Retweets are detected through a number of heuristics, including whether they share URLs, include patterns such as RT @xx or via @xx (and a few other variants), contain similar text, etc. Detecting retweets is non-trivial in some cases because retweets lack a structural representation. The new retweet API may mitigate this problem, but it is not widely adopted\u00a0 because it does not allow the person doing the retweeting to <a title=\"RT done wrong | FXPAL Blog\" href=\"http:\/\/palblog.fxpal.com\/?p=2270\" target=\"_blank\">comment on the original tweet<\/a>.<\/p>\n<p>The system organizes the results into people, tweets, and documents, each displayed in a separate tab.\u00a0 The people view is further split into people who tweeted and those who retweeted; a person may appear in both panes.<\/p>\n<div id=\"attachment_2950\" style=\"width: 494px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/palblog.fxpal.com\/wp-content\/uploads\/2010\/02\/tweet-analysis-people.bmp\"><img aria-describedby=\"caption-attachment-2950\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-2950 \" title=\"Tweet Analysis UI showing people view\" src=\"http:\/\/palblog.fxpal.com\/wp-content\/uploads\/2010\/02\/tweet-analysis-people.bmp\" alt=\"Tweet Analysis UI showing people view\" width=\"484\" height=\"313\" \/><\/a><p id=\"caption-attachment-2950\" class=\"wp-caption-text\">People view 
with tweeters (left) and retweeters (right) and tweets for @jeremyhylton (tweets) and @dtunkelang (retweets)<\/p><\/div>\n<p style=\"text-align: left;\">People in the view are currently sorted by the number of tweets they contributed to the result set; other sort orders, such as the number of followers, recency of tweet, or <a title=\"A Twitter analog to PageRank | The Noisy Channel\" href=\"http:\/\/thenoisychannel.com\/2009\/01\/13\/a-twitter-analog-to-pagerank\/\" target=\"_blank\">TunkRank<\/a>, are possible.<\/p>\n<p>The tweet view (shown partially below) groups tweets by the size of the conversation.<\/p>\n<div id=\"attachment_2952\" style=\"width: 423px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/palblog.fxpal.com\/wp-content\/uploads\/2010\/02\/tweet-analysis-tweets.bmp\"><img aria-describedby=\"caption-attachment-2952\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-2952 \" title=\"Tweets grouped by conversation\" src=\"http:\/\/palblog.fxpal.com\/wp-content\/uploads\/2010\/02\/tweet-analysis-tweets.bmp\" alt=\"Tweets grouped by conversation\" width=\"413\" height=\"354\" \/><\/a><p id=\"caption-attachment-2952\" class=\"wp-caption-text\">Partial shot of the tweet view showing tweet conversations<\/p><\/div>\n<p style=\"text-align: center;\">\n<p style=\"text-align: left;\">Finally, the document view shows documents mentioned by the tweets in the search results. Documents can be sorted by the number of mentions, by the first mention, or by the last mention. Mentions are identified by comparing URLs; shortened URLs are expanded prior to comparison, and some attributes tacked on by Twitter clients are stripped out to determine canonical URLs.<\/p>\n<p>Ordering by the number of mentions allows the discovery of important documents through the tweeters&#8217; consensus. Drilling into each document shows the tweets (and people) that mentioned the document. Clicking on the document name opens the document in an adjacent pane. 
This interface allows the user to explore the documents and tweets that comment on them in an integrated way.<\/p>\n<div id=\"attachment_2953\" style=\"width: 411px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/palblog.fxpal.com\/wp-content\/uploads\/2010\/02\/tweet-analysis-documents.bmp\"><img aria-describedby=\"caption-attachment-2953\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-2953 \" title=\"Document view\" src=\"http:\/\/palblog.fxpal.com\/wp-content\/uploads\/2010\/02\/tweet-analysis-documents.bmp\" alt=\"Document view showing popular tweets\" width=\"401\" height=\"442\" \/><\/a><p id=\"caption-attachment-2953\" class=\"wp-caption-text\">Partial shot of the document view showing popular tweets and a document fragment. The timeline is not yet fully debugged!<\/p><\/div>\n<h3 style=\"text-align: left;\">More to come<\/h3>\n<p style=\"text-align: left;\">The system we built just scratches the surface with respect to potentially useful ways to browse Twitter search (and other) results. For small and medium-sized query sets, it makes sense to display all results and let the user browse them directly; for larger collections that may contain thousands of tweets, a hierarchical browsing interface may be more appropriate. It should be possible to group tweets topically based on content, or geographically when geocoded. People can be grouped based on their location, based on the strength of relations as determined by social network analysis, or based on <em>ad hoc<\/em> categories created by the user.\u00a0 Results may also be filtered to remove people who only contributed a single tweet, etc. 
The exact criteria for grouping will obviously depend on the specific tasks that the interface is designed to support, but the above set of criteria seems reasonably general.<\/p>\n<p style=\"text-align: left;\">Another useful direction to explore is to use full-text search on the collection of tweets and on the document set referred to by them to help browse and filter the results. Using this technique, we could, for example, find people based on the contents of the documents they tweet about. In reasonably large collections, this may be a viable means of finding key people for a particular topic.<\/p>\n<p style=\"text-align: left;\">In the months to come, Miles and I will continue to explore the possibilities suggested by this interface, and will try to deploy it in a more public way. Some of the challenges to overcome include making the system sufficiently responsive. The current prototype is hampered by certain <a title=\"Talking with Twitter | FXPAL Blog\" href=\"http:\/\/palblog.fxpal.com\/?p=2846\" target=\"_blank\">inefficiencies<\/a> in the Twitter API.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Last week Jeremy and I attended the SSM2010 workshop held in conjunction with WSDM2010. In addition to chairing one of the panels, I got an opportunity to demonstrate an interface that I built to browse Twitter search results, to which Daniel alluded in his summary of the workshop. 
The system is described in a position [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[24,31,15],"tags":[105,94,166,82],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/posts\/2948"}],"collection":[{"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2948"}],"version-history":[{"count":12,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/posts\/2948\/revisions"}],"predecessor-version":[{"id":2963,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/posts\/2948\/revisions\/2963"}],"wp:attachment":[{"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2948"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2948"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2948"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}