This paper proposes two mechanisms for reducing the communication-related overheads of Web applications. One mechanism is user-level connection tracking, which allows an applicati...
In a traditional information retrieval system, it is assumed that queries can be posed about any topic. In reality, a large fraction of web queries are posed about a relatively sm...
We present experiments in automatic genre classification on web corpora, comparing a wide variety of features on several different genre-annotated datasets (HGC, I-EN, KI-04, KRYS...
Given the pairwise affinity relations associated with a set of data items, the goal of a clustering algorithm is to automatically partition the data into a small number of homogen...
We continue to advocate a methodology that we used earlier for pattern discovery through exhaustive search in selected small domains. This time we apply it to the problem of disco...