With over 800 million pages covering most areas of human endeavor, the World-wide Web is a fertile ground for data mining research to make a difference to the effectiveness of infor...
We consider the problem of segmenting a webpage into visually and semantically cohesive pieces. Our approach is based on formulating an appropriate optimization problem on weighte...
An author may have multiple names and multiple authors may share the same name simply due to name abbreviations, identical names, or name misspellings in publications or bibliogra...
Surprisingly, console logs rarely help operators detect problems in large-scale datacenter services, for they often consist of the voluminous intermixing of messages from many sof...
Wei Xu, Ling Huang, Armando Fox, David Patterson, ...
There are still no established methods for the evaluation of browsing and exploratory search tools. In the (multimedia) information retrieval community evaluations following the C...