Search engines present fixed-length passages from documents ranked by relevance against the query. In this paper, we present and compare novel, language-model-based methods for extr...
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
The PageRank algorithm, used in the Google search engine, greatly improves the results of Web search by taking into account the link structure of the Web. PageRank assigns to a pa...
Snippets are used by almost every text search engine to complement ranking schemes in order to effectively handle user keyword search. Despite the fact that XML is a standard repr...
Ranking microblogs, such as tweets, as search results for a query is challenging, among other things because of the sheer number of microblogs that are being generated in real time...