We introduce a new type of graphical model called a `cumulative distribution network' (CDN), which expresses a joint cumulative distribution as a product of local functions. ...
A query speller is crucial to search engine in improving web search relevance. This paper describes novel methods for use of distributional similarity estimated from query logs in...
AdaBoost is a well known, effective technique for increasing the accuracy of learning algorithms. However, it has the potential to overfit the training set because its objective i...
Visualization research seeks to exploit the human user’s ability to interpret graphical representations of data in order to provide insight into properties of the data that may ...
Support vector machines (SVMs) are regularly used for classification of unbalanced data by weighting more heavily the error contribution from the rare class. This heuristic techn...