The document-length normalization problem has been widely studied in the field of Information Retrieval. The Cosine Normalization [2], the Maximum tf Normalization [1] and the By...
Sylvain Lamprier, Tassadit Amghar, Bernard Levrat,...
In this work we compare different techniques to automatically find candidate web pages to substitute broken links. We extract information from the anchor text, the content of the p...
Abstract The explosion of content in distributed information retrieval (IR) systems requires new mechanisms to attain timely and accurate retrieval of unstructured text. In this pa...
In this paper we propose an attribute retrieval approach which extracts and ranks attributes from Web tables. We use simple heuristics to filter out improbable attributes and we ...
Many recommendation and retrieval tasks can be represented as proximity queries on a labeled directed graph, with typed nodes representing documents, terms, and metadata, and labe...