We present a technique for augmenting annotated training data with hierarchical word clusters that are automatically derived from a large unannotated corpus. Cluster membership is...
The popular K-means clustering partitions a data set by minimizing a sum-of-squares cost function. A coordinate descent method is then used to find local minima. In this paper we sh...
Hongyuan Zha, Xiaofeng He, Chris H. Q. Ding, Ming ...
Most of the recently discussed and commercially introduced test stimulus data compression techniques are based on low care bit densities found in typical scan test vectors. Data ...
We propose a strategy to perform query processing on P2P similarity search systems based on peers and superpeers. We show that by approximating global but resumed inform...
We consider the problem of integrating a large number of interface schemas over the Deep Web. The scale of the problem and the diversity of the sources present serious challenges ...