Many users and applications require the integration of semi-structured data from autonomous, heterogeneous Web sources. Over the last years mediator systems have emerged that use d...
What makes template content in the Web so special that we need to remove it? In this paper I present a large-scale aggregate analysis of textual Web content, corroborating statist...
Background: Large molecular sequence databases are fundamental resources for modern bioscientists. Whether for project-specific purposes or sharing data with colleagues, it is oft...
Scott A. Givan, Christopher M. Sullivan, James C. ...
We describe a new paradigm for performing search in context. In the IntelliZap system we developed, search is initiated from a text query marked by the user in a document she view...
Lev Finkelstein, Evgeniy Gabrilovich, Yossi Matias...
In this paper, we present our online summarization system of web topics. The user defines the topic by a set of keywords. Then the system searches the Web for the relevant documen...