{"id":3277,"date":"2010-03-26T10:15:30","date_gmt":"2010-03-26T17:15:30","guid":{"rendered":"http:\/\/palblog.fxpal.com\/?p=3277"},"modified":"2010-03-26T08:24:17","modified_gmt":"2010-03-26T15:24:17","slug":"sigir-reviews-as-pseudo-relevance-feedback","status":"publish","type":"post","link":"https:\/\/blog.fxpal.net\/?p=3277","title":{"rendered":"SIGIR Reviews as Pseudo-Relevance Feedback"},"content":{"rendered":"<p>Some ACM conferences such as CHI offer authors an opportunity to flag  material misconceptions in reviewers&#8217; perceptions of submitted papers  prior to rendering a final accept\/reject decision. SIGIR is not one of  them. Its reviewers are free from any checks on their accuracy from the  authors, and, to judge by the reviews of our submission, from the  program committee as well.<\/p>\n<p>Consider this: We wrote a paper on a novel IR framework which we  believe has the potential to greatly increase the efficacy of  interactive Information Retrieval systems. The topic we tackled is (not surprisingly) related to issues we often discuss on this and on the <a href=\"http:\/\/irgupf.com\/\">IRGupf blog<\/a>, including HCIR, Interactive IR,  Exploratory Search, and Collaborative Search.\u00a0 In short, these are all  areas that could be well served by an algorithmic framework  that supports greater interactivity.<\/p>\n<p><!--more-->So in our paper, we chose to evaluate our framework through  experiments that involved relevance feedback.\u00a0 Relevance feedback is a  long-studied, traditionally well-accepted interaction paradigm.\u00a0 The  user runs a query, judges a few documents for relevance, and any  relevant documents that are found during this process are saved or  marked and fed back into the system to produce even better results on subsequent queries.\u00a0 Our  results showed that the proposed framework is not only more effective  than a robust, well-understood baseline, but that algorithms involved  are up to an order of magnitude more efficient than  traditional baselines.\u00a0 And speed is of utmost importance to interactive  IR systems!<\/p>\n<p>We received three reviews&#8230;<\/p>\n<h3><img decoding=\"async\" title=\"More...\" src=\"http:\/\/palblog.fxpal.com\/wp-includes\/js\/tinymce\/plugins\/wordpress\/img\/trans.gif\" alt=\"\" \/>Reviewer #1<\/h3>\n<p>The first review, after summarizing our contribution, read in its  entirety:<\/p>\n<blockquote><p>The paper is well written and  both the idea and the  experimental part are sound.<\/p><\/blockquote>\n<p>This was accompanied by a 4\/6 recommendation score.\u00a0 Not much help.<\/p>\n<h3>Reviewer #2<\/h3>\n<p>The second review&#8217;s worst criticism of the work was that the  evaluation was incomplete:<\/p>\n<blockquote><p>The idea is new &amp; interesting, especially that it can  make use of  non-text query logs. One drawback of the paper, in this  reviewer&#8217;s  opinion, is incompleteness: why pseudo-relevance-feedback  not considered  as well, which is easy to do?  Asking a user to judge  documents until  one gets 5 relevant may not be realistic. Even if PRF  does not work,  paper should present the results.<\/p>\n<p>&#8230;<\/p>\n<p>The idea &#8230; is new &amp; interesting, especially that it can make  use of non-text query logs. One drawback of the paper, in this  reviewer&#8217;s opinion, is incompleteness: no study of employing  pseudo-relevant docs. 
We received three reviews…

### Reviewer #1

The first review, after summarizing our contribution, read in its entirety:

> The paper is well written and both the idea and the experimental part are sound.

This was accompanied by a 4/6 recommendation score. Not much help.

### Reviewer #2

The second review's worst criticism of the work was that the evaluation was incomplete:

> The idea is new & interesting, especially that it can make use of non-text query logs. One drawback of the paper, in this reviewer's opinion, is incompleteness: why pseudo-relevance-feedback not considered as well, which is easy to do? Asking a user to judge documents until one gets 5 relevant may not be realistic. Even if PRF does not work, paper should present the results.
>
> …
>
> The idea … is new & interesting, especially that it can make use of non-text query logs. One drawback of the paper, in this reviewer's opinion, is incompleteness: no study of employing pseudo-relevant docs. The impact would be small if one requires judged relevant docs.

This criticism is flawed on three counts:

1. First, it was flat-out wrong. We were not asking people to find five *relevant documents*; we were asking them to make five *judgments of relevance*. This was made very clear in the paper. Furthermore, even if making explicit judgments is difficult, there are many techniques for eliciting implicit (but not pseudo!) judgments of relevance (e.g., see [Kelly and Belkin, 2001](http://doi.acm.org/10.1145/383952.384045)).
2. Second, pseudo-relevance feedback is more likely to introduce noise and topic drift.
3. Third, PRF is unnecessary for interactive systems. Indeed, if a user is reading or saving (marking) documents, i.e. giving explicit judgments of relevance, decades of research have already shown those judgments to be much more effective than pseudo-judgments.

The argument that an evaluation is somehow incomplete or meaningless, or that its impact is small, if it does not involve pseudo-relevance feedback is offensive in its narrow-mindedness. What it reflects, we believe, is the current bias of the field as a whole toward non-interactive, web search-like experiences. In the web IR world, the commonly held understanding is that users are too lazy to engage in explicit relevance feedback, or else are engaged in a type of information-seeking activity, such as navigation, that does not require any feedback, pseudo-relevant or otherwise. But web information retrieval is not all of information retrieval.

This second reviewer gave us a 3/6 recommendation score.

### [Reviewer #3](http://www.youtube.com/watch?v=-VRBWLpYCPY "Scientific Peer Review, ca. 1945 | YouTube")

The third reviewer, while stating that the work is novel, had two main concerns:

1. Our chosen query expansion technique (selection and weighting of terms) was not convincing, because many others were possible using our framework.
2. We did not evaluate using pseudo-relevance feedback.

On the first point: we intentionally chose the simplest implementation of our framework to show its strength and to make the fairest comparison possible. If our naive approach beats a quite reasonable baseline (a 20-30% increase in effectiveness and a 10x speedup in efficiency), that should be enough; it is beyond the scope of a conference paper to exhaustively demonstrate effectiveness for arbitrary schemes a reviewer might dream up. The naive approach worked. That's a publishable result.

That brings us again to the second point, which actually sounded like the stronger criticism: pseudo-relevance feedback. Reading between the lines of the reviews, one gets the impression that the reviewers are well versed in traditional web search: they mention log mining (a minor aside in our paper), and they are obsessed with pseudo-relevance feedback.
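For contrast, this is all PRF does with the same machinery: a hedged sketch, reusing the `rocchio` function above and assuming a hypothetical `search` function that returns ranked document vectors. No human judges anything; the top k results are blindly assumed relevant, which is precisely how off-topic documents get folded into the query and cause the noise and topic drift we noted above.

```python
def pseudo_relevance_feedback(query, search, k=5):
    """Blind feedback: assume the top-k results are relevant and expand.

    `search` is a hypothetical retrieval function returning ranked
    bag-of-words document vectors; no user judgment is involved.
    """
    top_k = search(query)[:k]   # top k assumed relevant, sight unseen
    return rocchio(query, relevant=top_k, nonrelevant=[])
```

Same update rule, but the "judgments" are a statistical guess about the result list rather than feedback from a person, which is why PRF belongs to a different, non-interactive problem setting.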
Those of us who were doing IR research before the late 1990s remember a time when intellectual efforts were not judged by standards applicable only to web search. The diversity of approaches, metrics, and applications of that era seems to have been reduced to the bleak outlines of precision-oriented, page-at-a-time results lists, where *interactivity is looked upon as a burden rather than an opportunity*.

It is ironic, then, to [note](http://palblog.fxpal.com/?p=3252 "Google Goes Explicitly Collaborative | FXPAL Blog") that just two days ago, *on the same day that our rejection reviews arrived*, Google rolled out an interface that allows people to make explicit relevance judgments through bookmarking, which Google's algorithms then use as a form of relevance feedback! The old web maxim of users being too lazy or unwilling or unengaged to mark documents for relevance, thus necessitating pseudo-relevance feedback at the expense of real relevance feedback, was busted by a major web search engine!

The third reviewer's score was 4/6.

### So what? What are we going to do about it?

Given the discussion on Twitter and in e-mail in the aftermath of this round of rejection decisions, I think it is safe to say that we are not alone in our dissatisfaction with the reviewing process. What will happen is what always happens: the paper will be resubmitted elsewhere, and life will move on.

But what about the SIGIR conference, and the community it represents? Are we unhappy with the misreadings and misunderstandings, with the reviewer who could not tell the difference between "5 judgments of relevance" and "5 relevant documents"? Yes, of course. And a conference review system that allows for anonymous feedback from the author to the reviewers, as does CHI, could go a long way toward rectifying these misunderstandings. Misconceptions of one's work are, to a certain extent, completely understandable. Even if a paper is written clearly, the reviewers have not grappled with the ideas anywhere near as much as the author(s) have.

But what about the more basic problem, the one of narrow thinking in the reviews themselves? The idea that a paper on interactivity and relevance feedback is not acceptable unless it also includes experiments on and evaluations of the non-interactive pseudo-relevance feedback approach is one that we have a difficult time accepting. Non-interactive approaches, and the web search world that thrives on them, are popular right now. Pseudo-relevance feedback epitomizes that non-interactivity, yet multiple reviewers suggested that the lack of PRF was the paper's biggest weakness. We feel that it isn't; PRF is a different problem, solving a different kind of need, in a different kind of scenario. Not all of information retrieval is web search. So what is one to do when a review not only misperceives a paper, but actively tries to impose its own values onto it, asking it to solve a different kind of problem than the one it is trying to solve?

Given the non-interactivity of the SIGIR review process, the inability to discuss and correct misperceptions and biases, one is tempted to label the reviewer comments themselves as a form of pseudo-relevance feedback: contrary to appearances, no explicit judgments of relevance to the conference had actually been made. ;-)
Comments and lively discussion are welcome.