In a new model for answer retrieval, document collections are distilled offline into large repositories of facts. Each fact constitutes a potential direct answer to questions seek...
In the AllRight project, we are developing an algorithm for unsupervised table detection and segmentation that uses the visual rendition of a Web page rather than the HTML code. O...
Search engine results are usually presented in some form of text summary (e.g., document title, some snippets of the page's content, a URL, etc). Based on the information con...
We propose measuring "visualness" of concepts with images on the Web, that is, what extent concepts have visual characteristics. This is a new application of "Web i...
We present a system that gathers and analyzes online discussion as it relates to consumer products. Weblogs and online message boards provide forums that record the voice of the p...
Natalie S. Glance, Matthew Hurst, Kamal Nigam, Mat...