{"id":4528,"date":"2010-09-07T07:53:44","date_gmt":"2010-09-07T14:53:44","guid":{"rendered":"http:\/\/palblog.fxpal.com\/?p=4528"},"modified":"2010-10-20T11:29:53","modified_gmt":"2010-10-20T18:29:53","slug":"talkminer","status":"publish","type":"post","link":"https:\/\/blog.fxpal.net\/?p=4528","title":{"rendered":"TalkMiner"},"content":{"rendered":"<p>While many of the systems we build at FXPAL are either deployed internally or transferred to our parent company, in some cases we get to deploy them in the real world. This week, we released <a title=\"TalkMiner\" href=\"http:\/\/www.talkminer.com\" target=\"_blank\">TalkMiner<\/a>, a system for indexing and searching video of lecture broadcasts. We&#8217;ve indexed broadcasts from a variety of sources, including the U.C. Berkeley <a title=\"Broadcasts from UC Berkeley | TalkMiner\" href=\"http:\/\/talkminer.com\/searcher.jsp?q=author%3A%22ucberkeley%22\" target=\"_blank\">webcast.berkeley<\/a> site, the <a title=\"Blip TV | TalkMiner\" href=\"http:\/\/talkminer.com\/searcher.jsp?q=author%3A%22blip.tv%22\" target=\"_blank\">blip.tv<\/a> site, and various channels on YouTube, including <a title=\"Google Tech Talks | TalkMiner\" href=\"http:\/\/talkminer.com\/searcher.jsp?q=author%3A%22googletechtalks%22\" target=\"_blank\">Google  Tech Talks<\/a>, <a title=\"Stanford University | TalkMiner\" href=\"http:\/\/talkminer.com\/searcher.jsp?q=author%3A%22stanforduniversity%22\" target=\"_blank\">Stanford University<\/a>, <a title=\"Massachusetts Institute of Technology | TalkMiner\" href=\"http:\/\/talkminer.com\/searcher.jsp?q=author%3A%22mit%22\" target=\"_blank\">MIT Open Courseware<\/a>, <a title=\"O'Reilly Media | TalkMiner\" href=\"http:\/\/talkminer.com\/searcher.jsp?q=author%3A%22oreillymedia%22\" target=\"_blank\">O\u2019Reilly Media<\/a>,  <a title=\"TED Talks | TalkMiner\" href=\"http:\/\/talkminer.com\/searcher.jsp?q=author%3A%22tedtalksdirector%22\" target=\"_blank\">TED Talks<\/a>, and <a title=\"NPTEL | 
TalkMiner\" href=\"http:\/\/talkminer.com\/searcher.jsp?q=author%3A%22nptelhrd%22\" target=\"_blank\">NPTEL Indian Institute of Technology<\/a>.<\/p>\n<p>But all of these videos are already indexed by web search engines, you say; why do we need TalkMiner?<\/p>\n<p><!--more-->While web search engines index the text of the page in which the video is embedded, TalkMiner indexes the contents of the slides in the video, making more fine-grained retrieval of video possible. Is this useful?<\/p>\n<p>Well, it turns out the deployment of the Berkeley webcasting system (developed by our president Larry Rowe while he was a professor there) showed that<\/p>\n<blockquote><p>&#8230; students almost always watched the lectures  on-demand rather than in real-time, and they rarely watched the entire  lecture.\u00a0 Students use the webcasts to study for exams \u2013 we could see  this clearly by patterns of usage \u2013 and, they primarily wanted to review  selected material covered by the instructor.\u00a0 In one class we  discovered that for over 50% of the lectures, students watched less than  10 minutes from a 50-minute lecture and students watched the entire  lecture only 10% of the time.\u00a0 Consequently, for using the system,  effective search is a big issue.<\/p><\/blockquote>\n<p>To solve this problem, TalkMiner recognizes images of presentation slides in lecture video and applies OCR to these regions to extract the slide text. This text is indexed along with the associated time codes and can then be used to search for specific content. The video is divided into segments corresponding to slides; thumbnails of slides are shown when a video is selected. The video can then be watched end-to-end, or you can skip to a particular slide and listen from there. To help find topics of interest, slides that contain keyword matches to the query are highlighted.<\/p>\n<p>The current index contains over 12,200 talks on a range of topics, and additional talks are indexed daily. 
Take a look at the system and let us know what you think!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This week, we released <a href=\"http:\/\/www.talkminer.com\" target=\"_blank\">TalkMiner<\/a>, a system for indexing and searching video of lecture broadcasts.<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[15],"tags":[256,274],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/posts\/4528"}],"collection":[{"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=4528"}],"version-history":[{"count":11,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/posts\/4528\/revisions"}],"predecessor-version":[{"id":4858,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=\/wp\/v2\/posts\/4528\/revisions\/4858"}],"wp:attachment":[{"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=4528"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=4528"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.fxpal.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=4528"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}