{"id":3001,"date":"2010-02-18T07:24:51","date_gmt":"2010-02-18T15:24:51","guid":{"rendered":"http:\/\/palblog.fxpal.com\/?p=3001"},"modified":"2010-02-18T07:38:33","modified_gmt":"2010-02-18T15:38:33","slug":"whats-private-on-the-web","status":"publish","type":"post","link":"https:\/\/blog.fxpal.net\/?p=3001","title":{"rendered":"What&#8217;s private on the Web?"},"content":{"rendered":"<p>Hillary Mason of <a title=\"bit.ly, a simple url shortener\" href=\"http:\/\/bit.ly\" target=\"_blank\">bit.ly<\/a> wrote a nice <a title=\"Conference: Search and Social Media 2010 | Hi. I'm Hillary Mason.\" href=\"http:\/\/www.hilarymason.com\/blog\/conference-search-and-social-media-2010\/\" target=\"_blank\">summary<\/a> of some key issues raised in the recent <a title=\"Search in Social Media 2010 Workshop\" href=\"http:\/\/ir.mathcs.emory.edu\/SSM2010\/\" target=\"_blank\">Search in Social Media 2010<\/a> workshop. (For other commentary, see Daniel Tunkelang&#8217;s <a title=\"Third Workshop on Search and Social Media (SSM 2010) | ACM Blogs\" href=\"http:\/\/cacm.acm.org\/blogs\/blog-cacm\/71444-third-workshop-on-search-and-social-media-ssm-2010\/fulltext\" target=\"_blank\">post<\/a> and our pre-workshop <a title=\"What do we mean by \u201cSearch in Social Media\u201d? | FXPAL Blog\" href=\"http:\/\/palblog.fxpal.com\/?p=2814\" target=\"_blank\">comments<\/a>.) Hillary asked several important questions that break out into two main topics: what we can compute from social data and how, on the one hand, and what the implications of those computations are, on the other. Questions such as how to compute relevance, how to architect social search engines, and how to represent users&#8217; information needs appropriately all fall into the what-and-how category. 
We can be sure that adequate engineering solutions will be found for these problems.<\/p>\n<p>The second topic, however, is more problematic because it deals with the impact that technology has on the individual and on society, rather than with technology <em>per se<\/em>.<\/p>\n<p><!--more-->Hillary asks:<\/p>\n<blockquote><p>What data is available to social search? There are many kinds of  social data, from e-mail (private) to blogs (public) and tweets (mostly  public) \u2014 what is and should be searchable? How do we handle issues of  privacy and identity management?<\/p>\n<p>How do we evaluate accuracy and <em>truthiness<\/em> of social data?<\/p>\n<p>How do we characterize social connections, around concepts like  strong vs weak ties, and friend-of-a-friend vs  friend-of-a-friend\u2019s-friend? Can we converge on a single social graph  representation?<\/p>\n<p>Finally, how do we deal with the chasm between the industry  participants (who have LOTS of data) and the academic participants, who  suffer from a lack of public (and publishable) data?<\/p><\/blockquote>\n<p>This is a fascinating list permeated by issues of privacy. Despite <a title=\"Google CEO On Privacy (VIDEO): 'If You Have Something You Don't Want Anyone To Know, Maybe You Shouldn't Be Doing It' | Huffington Post\" href=\"http:\/\/www.huffingtonpost.com\/2009\/12\/07\/google-ceo-on-privacy-if_n_383105.html\" target=\"_blank\">assertions<\/a> that privacy is a thing of the past and we should get over it, the public <a title=\"Google's social side hopes to catch some Buzz | CNET News\" href=\"http:\/\/news.cnet.com\/8301-30684_3-10449662-265.html\" target=\"_blank\">reaction<\/a> to Google Buzz&#8217;s fizzy debut argues against that position. In fact, privacy may be a particularly thorny problem for searching and aggregating social media. 
People leave extensive traces of their online activity on social sites (and on <a title=\"What I saw during the Superbowl | FXPAL Blog\" href=\"http:\/\/palblog.fxpal.com\/?p=2965\" target=\"_blank\">search engines in general<\/a>), and a range of Social Network Analysis algorithms originally developed by sociologists to analyze research populations can be brought to bear at web scale on the problems of federating partially overlapping social networks.<\/p>\n<p>The danger, of course, lies in doing it too well! We have seen cases where the release of public data has had serious consequences for the vulnerable, including <a title=\"fuck you, google | Fugitivus\" href=\"http:\/\/fugitivus.wordpress.com\/2010\/02\/11\/fuck-you-google\/\" target=\"_blank\">women<\/a> and <a title=\"Wrong kind of buzz around Google Buzz |  Foreign Policy\" href=\"http:\/\/neteffect.foreignpolicy.com\/posts\/2010\/02\/11\/wrong_kind_of_buzz_around_google_buzz\" target=\"_blank\">political dissidents<\/a>.<\/p>\n<p>While the tools we create are neutral, they enable both positive and negative activities. <a title=\"Twiangulate - Making Connections\" href=\"http:\/\/twiangulate.com\/search\/\" target=\"_blank\">Twiangulate.com<\/a> can be used to find prospective clients and like-minded individuals, or it can be used to piece together networks of political dissidents. Google Buzz can make it easy to keep track of your friends, or to engage in verbal abuse and sexual harassment.<\/p>\n<p>Finally, the issue of what constitutes legitimate use of public social network data (such as Pete Warden&#8217;s <a title=\"The Man Who Looked Into Facebook's Soul | ReadWriteWeb\" href=\"http:\/\/www.readwriteweb.com\/archives\/facebook_user_data_analysis.php\" target=\"_blank\">Facebook crawl<\/a>) needs to be discussed and understood. 
Last week the CSCW2010 conference saw a lively debate (ironically <a title=\"CSCW2010 | Twapper Keeper\" href=\"http:\/\/twapperkeeper.com\/cscw2010\/\" target=\"_blank\">captured<\/a> through Twitter) on the role of <a title=\"Institutional Review Board | Wikipedia\" href=\"http:\/\/en.wikipedia.org\/wiki\/Institutional_review_board\" target=\"_blank\">IRB<\/a>s in collecting public data for research. There are interesting points on both sides. We as researchers need to work through the issues and understand what constitutes appropriate and inappropriate use of this data, and the public at large needs to better understand the implications of making this data publicly available. At the moment, I don&#8217;t think we have a good grasp on either aspect.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hillary Mason of bit.ly wrote a nice summary of some key issues raised in the recent Search in Social Media 2010 workshop. (For other commentary, see Daniel Tunkelang&#8217;s post and our pre-workshop comments.) 
Hillary asked several important questions that break out into two main topics: what and how can we compute from social data on [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[48,58,122],"tags":[119,166],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/posts\/3001"}],"collection":[{"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3001"}],"version-history":[{"count":11,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/posts\/3001\/revisions"}],"predecessor-version":[{"id":3014,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/posts\/3001\/revisions\/3014"}],"wp:attachment":[{"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3001"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3001"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3001"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}