In this paper we consider distributed K-Nearest Neighbor (KNN) search and range query processing in high-dimensional data. Our approach is based on Locality Sensitive Hashing (LSH...
An important task in machine learning is determining which learning algorithm works best for a given data set. When the amount of data is small, the same data needs to be used repea...
Data de-duplication has become a commodity component in data-intensive systems, and it is required that these systems provide high reliability comparable to others. Unfortunately, b...
Chuanyi Liu, Yu Gu, Linchun Sun, Bin Yan, Dongshen...
The paper proposes an adaptive web system, that is, a website capable of changing its original design to fit user requirements. For the purpose of improving shortcomings o...
Background: Text mining has spurred huge interest in the domain of biology. The goal of the BioCreAtIvE exercise was to evaluate the performance of current text mining systems. We...