Blog

NSF-funded digital library?

by Gene Golovchinsky on October 6, 2009 Comments (2)

According to a press release from Johns Hopkins (via @pentcheff), its library “received a $300,000 grant from NSF to study the feasibility of developing, operating and sustaining an open access repository of articles from NSF-sponsored research.” This grant was inspired by NLM’s PubMedCentral repository of NIH-funded medical research. This is interesting from the perspective of HCIR because if the precedent holds, this collection will be publicly searchable and downloadable, making it a good candidate as a research collection.

Another interesting, and more controversial, implication is the issue of copyright: a large chunk of NSF-sponsored research is published by the ACM, which currently claims copyright over it. Will the NSF require these papers to be included? Will the ACM release the publications, or will it just provide metadata and keep the full papers hidden in the ACM DL? It will be interesting to see how JHU goes about identifying the stakeholders for this activity.

Workshop: Virtual Worlds in 2020

by Maribeth Back on October 5, 2009 Comments (2)

A quick pointer to a workshop sponsored by SDForum’s Virtual Worlds SIG (which I co-chair along with Bob Ketner of The Tech and Eilif Trondsen of SRI-BI):

The “Virtual Worlds in 2020” Workshop
Palo Alto, CA
Tuesday, Oct. 13, 2009

From the program description:
This is the 3rd annual “Future of Virtual Worlds” session – the Virtual Worlds in 2020 Workshop. This year it’s an interactive workshop where you can bring ideas, input, and questions for a rare, long term view of virtual worlds, at the Virtual Worlds SIG.

In just a few weeks we enter a new decade equipped with abilities that existed only in science fiction a few years ago. Although plans for using graphical, collaborative virtual worlds predate the internet itself by many years, many advances in productivity remain unclaimed. It’s time now to take a look ahead. This workshop will produce a set of inputs showing what might be possible – along with a list of challenges to be overcome along the way over the next decade.

Hacking in the Humanities

by Gene Golovchinsky on October 2, 2009 Comments (3)

Miles Efron’s latest blog post about humanities computing reminded me of a breakout discussion we had at the BooksOnline’08 workshop about expectations of humanities scholars with respect to computation. I don’t remember everyone who was at that table, but we talked about the need to build tools for specific analyses, and how that might take someone several months to do. My take is that while we cannot (and should not) expect researchers in the humanities to create complex systems (we don’t even expect some CS types to do it!), a certain proficiency with scripting should be a desirable (if not required) part of any Master’s program, alongside philosophy and ancient languages.

It doesn’t matter if students learn how to use Perl, Ruby, Groovy, or some other language du jour; what’s important is that they gain the problem-solving skills and the confidence to apply them to problems that interest them. Modern programming languages can be much more expressive, and modern computers are more forgiving of unoptimized code, making it easier to get stuff to work. Giving students the ability to express themselves in a new medium should improve both the scholar and the scholarship. And this applies to iSchools, too.

Lack of progress as an opportunity for progress

by Gene Golovchinsky on October 1, 2009 Comments (2)

Timothy G. Armstrong, Alistair Moffat, William Webber, and Justin Zobel have written what will undoubtedly be a controversy- and discussion-inspiring paper for the upcoming CIKM 2009 conference. The paper compares over 100 studies of information retrieval systems based on various TREC collections, and concludes that not much progress has been made over the last decade in terms of Mean Average Precision (MAP). They also found that studies that use the TREC data outside the TREC competition tend to pick poor baselines to show short-term improvement (which is publishable) without demonstrating long-term gains in system performance. This interesting analysis is summarized in a blog post by William Webber.
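For readers less familiar with the metric, here is a minimal sketch of how MAP is commonly computed: average precision is taken over the ranks at which relevant documents appear, and MAP is the mean of that value across queries. The function names, document ids, and toy rankings below are made up for illustration; this is not the evaluation code used in the paper or by TREC.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision(ranking: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against a set of relevant doc ids."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank: relevant documents seen so far / rank.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of average precision over (ranking, relevant set) pairs, one per query."""
    scores = [average_precision(ranking, relevant) for ranking, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two queries with hand-made rankings and judgments.
toy_runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),
    (["d5", "d6"], {"d6"}),
]
toy_map = mean_average_precision(toy_runs)
```

Because the score depends heavily on the collection and the relevance judgments, a number like this only means something relative to a baseline on the same data, which is why the choice of baseline matters so much.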

eBooks aren’t just for reading anymore

by Gene Golovchinsky on September 30, 2009 Comments (1)

There has been more news on the eBook hardware front recently. Microsoft is floating a two-screen device idea reminiscent of Nick Chen’s thesis work, part of which he published at CHI 2008. The video is worth watching. The rendering of the MS ‘Courier’ device is slick, but at this point no specs are available. A UX mockup video shows some nice ideas, but it is not clear how much of this will survive in the product. And of course it will need to compete with the Apple tablet, if that thing materializes.

More interestingly, IREX announced a digital reader that is a follow-on to the Iliad.

IREX DR 800 eBook reader

Getting a CLuE

by Gene Golovchinsky on September 29, 2009 Comments (1)

An NSF-funded cloud computing event is coming to the Bay Area.

In October 2007, Google and IBM announced the first pilot phase of the Academic Cloud Computing Initiative (ACCI), which granted several prominent U.S. universities access to a large computer cluster running Hadoop, an open source distributed computing platform inspired by Google’s file system and MapReduce programming model. In February 2008, the ACCI partnered with the National Science Foundation to provide grant funding to academic researchers interested in exploring large-data applications that could take advantage of this infrastructure. This resulted in the creation of the Cluster Exploratory (CLuE) program led by Dr. Jim French, which currently funds 14 projects. See this NSF Press Release for a short description of all the projects funded under the CLuE program.

The event will be held on October 5th in the Computer History Museum (the current home of the Babbage Difference Engine No2 Serial #2), and will feature a great lineup of researchers reporting on their accomplishments in a variety of disciplines, including indexing for search, data processing, machine translation, text processing, databases, visualization, and other cloud computing topics. You can get more details about the schedule and the speakers here, and click here to register.

CFP: 2nd Workshop on Collaborative Information Seeking

by Gene Golovchinsky on September 28, 2009 Comments (2)

Jeremy and I have been blogging about collaborative search for a while, and it is our pleasure to announce that, together with Merrie Morris, we are organizing another workshop on Collaborative Information Seeking. The first workshop was held in conjunction with the JCDL 2008 conference. We had many interesting presentations and a lot of discussion about systems, algorithms, and evaluation. You can find the proceedings from the workshop on arXiv.org (metadata and papers) and on the workshop web site.

It’s time to revisit this topic, this time in conjunction with the CSCW 2010 conference. The workshop call for participation is here. Our goal is

to bring together researchers with backgrounds in CSCW, social computing, information retrieval, library sciences and HCI to discuss the research challenges associated with the emerging field of collaborative information seeking.

To participate, please submit a 2-4 page position paper in the ACM format by November 20th. The workshop will take place in February, in Savannah, Georgia. Hope to see you there!

A tale of two islands

by Gene Golovchinsky on September 27, 2009 Comments (2)

ECDL 2009 is taking place this week, and those of us who could not make it to Corfu will have to settle for the island experience of the Second (Life) Kind. Just as JCDL 2009 did earlier this summer, the ECDL 2009 Poster Session is available for viewing online through SecondLife. The real Poster Session will take place Monday, September 28th (7-9pm EEST, 12:00-14:00 EDT, 9-11am PDT), with a parallel session in SecondLife that will continue long after the real one ends.

The complete list of posters is available here; I am looking forward to “Improving annotations in digital documents,” “Searching in a book,” and “Workspace narrative exploration: overcoming interruption-caused context loss in information seeking tasks.”

There are some interesting papers at ECDL as well, including

“Annotation search: the FAST way” by Nicola Ferro
“Chance encounters in the digital library” by Elaine Toms and Lori McCay-Peet
“Comparing Google to Ask-a-librarian service for answering factual and topical questions” by Pertti Vakkari and Mari Taneli
“Evaluation in context” by Jaap Kamps, Mounia Lalmas and Birger Larsen
“Exploratory web searching with dynamic taxonomies and results clustering” by Panagiotis Papadakos, Stella Kopidaki, Nikos Armenatzoglou and Yannis Tzitzikas

In pursuit of impact

by Gene Golovchinsky on September 25, 2009 Comments (3)

The impact of academic research is often measured through citation counts. Arguably, this is a more sensitive measure than just the number of publications, or even the number of publications in prestigious journals. Innovative work often gets published in venues with mixed reputations because prestigious journals and conferences may reject ideas that don’t fit well with the orthodoxy of the discipline. In its heyday, for example, the ACM Hypertext Conference rejected Tim Berners-Lee’s paper on the World Wide Web because (among perhaps other reasons) that work contradicted then-established standards of what makes interesting Hypertext research.

Thus it is useful to measure the citation counts of papers to understand their impact on the field. Traditionally, this has been the purview of librarians and citation indexes, but the proliferation of publication venues, and the desire to recognize work that was not published in the mainstream (or perhaps not officially published at all, as Daniel Lemire points out), make the task of collation difficult.

The many faces of PubMed search

by Gene Golovchinsky on September 24, 2009 Comments (2)

The number of third-party tools for searching PubMed data seems to be increasing recently. As the NLM is about to roll out a new search interface, companies are starting to offer alternative interfaces for searching this important collection. The attraction is obvious: a large, motivated group of searchers, an important information need, and a manageable collection size. A decade ago, over 20 million searches were done monthly through the NLM site, and the numbers are surely higher today; the collection is large but not huge — currently over 17 million entries (some with full text), occupying somewhat more than 60GB of disk space. Thus we see an increasing number of sites offering search over this collection, including PubGet, GoPubMed, TexMed, and HubMed. The offerings range from basic to flashy, and appear to be aiming at different groups of searchers.
