In active learning, a machine learning algorithm is given an unlabeled set of examples U, and is allowed to request labels for a relatively small subset of U to use for training. ...
Keyword3 generation for search engine advertising is an important problem for sponsored search or paidplacement advertising. A recent strategy in this area is bidding on nonobviou...
Abstract. The data stream model of computation is often used for analyzing huge volumes of continuously arriving data. In this paper, we present a novel algorithm called DUCstream ...
Abstract. Bayesian spam filters, in general, compute probability estimations for tokens either without considering the email areas of occurrences except the body or treating the s...
A novel approach to clustering co-occurrence data poses it as an optimization problem in information theory which minimizes the resulting loss in mutual information. A divisive cl...