Information on the Web is not only abundant but also redundant. This redundancy has an important consequence for the relation between the recall of an information ga...
We describe an adaptive method for extracting records from web pages. Our algorithm combines a weighted tree matching metric with clustering for obtaining data extraction patterns...
We describe a method for predicting query difficulty in a precision-oriented web search task. Our approach uses visual features from retrieved surrogate document representations (...
Eric C. Jensen, Steven M. Beitzel, David A. Grossm...
The rapid growth of on-line information, including multimedia content, during the last decade has caused a major problem for Web users: there is too much information availabl...
Retrieving documents on the Web that are similar to a given document differs in many respects from information retrieval based on queries generated by regular search engine use...
Felipe Bravo-Marquez, Gaston L'Huillier, Sebasti&a...