In this paper, we propose a document clustering method that aims to achieve (1) high document clustering accuracy, and (2) the capability of estimating the number of clus...
A major obstacle that degrades the performance of text classifiers is the extremely high dimensionality of text data. To reduce the dimensionality, a number of approaches based on rou...
We present a new algorithm for a robust family of Earth Mover’s Distances: EMDs with thresholded ground distances. The algorithm transforms the flow network of the EMD so that t...
We look at the problem of location recognition in a large image dataset using a vocabulary tree. This entails finding the location of a query image in a large dataset containing 3...
In this paper, we introduce a novel Trademark Application Assistant (TAST) that aims to speed up the process of a successful trademark application. The core of TAST is a web-brows...
Paul Wing Hing Kwan, Kazuo Toraichi, Keisuke Kamey...