We propose to use MapReduce to quickly test new retrieval approaches on a cluster of machines by sequentially scanning all documents. We present a small case study in which we use...
Abstract. As the type of content available on the web is becoming increasingly diverse, a particular challenge is to properly determine the types of documents sought by a user, tha...
Shanu Sushmita, Benjamin Piwowarski, Mounia Lalmas
Abstract. This position paper describes the challenges related to federated enterprise search over heterogeneous product information sources. It focuses on the aspect of finding re...
Background: Biomedical ontologies are being widely used to annotate biological data in a computer-accessible, consistent and well-defined manner. However, due to their size and co...
Catherine Beauheim, Farrell Wymore, Michael Nitzbe...
Due to resource constraints, Web archiving systems and search engines usually have difficulties keeping the entire local repository synchronized with the Web. We advance the state...
Qingzhao Tan, Ziming Zhuang, Prasenjit Mitra, C. L...