The general approach for automatically driving data collection using information from previously acquired data is called active learning. Traditional active learning addresses the...
Given a classification problem, our goal is to find a low-dimensional linear transformation of the feature vectors which retains information needed to predict the class labels. We...
Cross-language latent semantic indexing is a method that learns useful languageindependent vector representations of terms through a statistical analysis of a documentaligned text...
With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...
In this paper, we propose a new system extracting potentially copyright infringement texts from the Web, called EPCI. EPCI extracts them in the following way: (1) generating a set...
Takashi Tashiro, Takanori Ueda, Taisuke Hori, Yu H...