In many classification tasks, training data have missing feature values that can be acquired at a cost. For building accurate predictive models, acquiring all missing values is of...
Prem Melville, Foster J. Provost, Raymond J. Moone...
Most queries in web search are ambiguous and multifaceted. Identifying the major senses and facets of queries from search log data, referred to as query subtopic mining in this pa...
Yunhua Hu, Ya-nan Qian, Hang Li, Daxin Jiang, Jian...
Most data mining algorithms assume static behavior of the incoming data. In the real world, the situation is different and most continuously collected data streams are generated by...
Lior Cohen, Gil Avrahami, Mark Last, Abraham Kande...
In this paper we introduce a novel architecture for data processing, based on a functional fusion between a data and a computation layer. We show how such an architecture can be le...
Radu Sion, Ramesh Natarajan, Inderpal Narang, Wen-...
Two extensions of Wilson’s original editing method are introduced in this paper. These new algorithms are based on estimating probabilities from the k-nearest neigh...