Even in a massive corpus such as the Web, a substantial fraction of extractions appear infrequently. This paper shows how to assess the correctness of sparse extractions by utiliz...
Abstract-- In the age of Web 2.0 people organize large collections of web pages, articles, or emails in hierarchies of topics, or arrange a large body of knowledge in ontologies. T...
We propose two methods for constructing automated programs for extraction of information from a class of web pages that are very common and of high practical significance - varia...
With the growing use of distributed information networks, there is an increasing need for algorithmic and system solutions for data-driven knowledge acquisition using distributed,...
Doina Caragea, Jaime Reinoso, Adrian Silvescu, Vas...
1 In this article, we report our efforts in mining the information encoded as clickthrough data in the server logs to evaluate and monitor the relevance ranking quality of a commer...