Ranking a set of retrieved documents according to their relevance to a given query has become a popular problem at the intersection of web search, machine learning, and informatio...
We cast name discrimination as a problem in clustering short contexts. Each occurrence of an ambiguous name is treated independently, and represented using second?order context vec...
Weblog is increasingly important over time with researchers anxious to learn why millions of Internet users are so eager to post their own diary on the web everyday. This study co...
In this paper, we present a new co-training strategy that makes use of unlabelled data. It trains two predictors in parallel, with each predictor labelling the unlabelled data for...
Keyword search enables web users to easily access XML data without the need to learn a structured query language and to study possibly complex data schemas. Existing work has addr...