{"id":418,"date":"2009-03-18T08:24:21","date_gmt":"2009-03-18T15:24:21","guid":{"rendered":"http:\/\/palblog.fxpal.com\/?p=418"},"modified":"2009-03-17T15:26:56","modified_gmt":"2009-03-17T22:26:56","slug":"recall-and-precision-revisited","status":"publish","type":"post","link":"https:\/\/blog.fxpal.net\/?p=418","title":{"rendered":"Recall and precision, revisited"},"content":{"rendered":"<p>In his recent post, Daniel Tunkelang issued a <a title=\"Precision and Recall | The Noisy Channel\" href=\"http:\/\/thenoisychannel.com\/2009\/03\/17\/precision-and-recall\/\" target=\"_blank\">call<\/a> for renewed interest in recall as a measure of performance of information retrieval systems, particularly for exploratory search tasks. There are several possible ways to measure recall and precision for interactive tasks, and which measure you should use depends on what aspect of the entire human-computer system you are interested in.<\/p>\n<p><!--more-->Consider a (ranked) set of documents identified by a search engine in response to a query. The set could contain hundreds, thousands, or even millions of results, which could (given some ground truth) be used to compute recall and precision. But is that a sensible thing to do? If you are interested in how well the search algorithm worked, the answer is probably yes. If, however, you&#8217;re interested in how well the user interface worked, this measure is flawed because so many (quite often most) documents identified in response to a query are never shown to the user, and thus have no bearing on how the interface performed. In my <a title=\"From Information Retrieval to Hypertext and Back Again: The Role of Interaction in the Information Exploration Interface | FXPAL\" href=\"http:\/\/www.fxpal.com\/?p=genethesis\" target=\"_blank\">PhD thesis<\/a>, published in part <a title=\"Golovchinsky, G. 1997. Queries? Links? Is there a difference?. In Proc. CHI97, ACM Press pp. 
407-414 | FXPAL\" href=\"http:\/\/www.fxpal.com\/?p=abstract&amp;abstractID=84\" target=\"_blank\">here<\/a>, I proposed modified measures of recall and precision to reflect the interactive search experience.<\/p>\n<p>I modified recall and precision measures by normalizing not by the total number of documents <em>retrieved<\/em>, but by the total number of documents <em>viewed<\/em> by the user. While at first blush this may seem like equating precision with recall, this is not the case when documents are presented as clusters or in other non-linear ways appropriate for sets.<\/p>\n<p>Furthermore, if you are interested in measuring something about the user&#8217;s effectiveness in using a system, you might also want to measure <em>selected<\/em> recall and precision, where these scores are normalized by the number of documents the person marked or judged as relevant.<\/p>\n<p>For example, in an information-seeking <a title=\"Golovchinsky, G. and Chignell, M.H. (1997) The Newspaper as an Information Exploration Metaphor, IP&amp;M 33(5), pp. 663-683\" href=\"http:\/\/www.fxpal.com\/?p=abstract&amp;abstractID=155\" target=\"_blank\">experiment<\/a> that varied the presentation style of search results, I found that viewed recall and precision increased with the number of documents whose contents were displayed automatically in response to queries.<\/p>\n<p>Thus we can unpack the recall and precision measures into three distinct measurements: retrieved (traditional) recall and precision for system performance, viewed recall and precision for interface performance, and selected recall and precision for user behavior. 
Choose your measure wisely!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We can unpack recall and precision into three measures that capture system performance, interface effectiveness, and user behavior.<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[15,7],"tags":[],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/posts\/418"}],"collection":[{"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=418"}],"version-history":[{"count":8,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/posts\/418\/revisions"}],"predecessor-version":[{"id":425,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/posts\/418\/revisions\/425"}],"wp:attachment":[{"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=418"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=418"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=418"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}