Abstract. In this paper we present static and dynamic studies of duplicate and near-duplicate documents in the Web. The static and dynamic studies involve the analysis of similar c...
Abstract. An important problem in biology is to understand correspondences between mRNA microarray levels and mass spectrometry peptide counts. Recently, a compendium of mRNA expre...
Abstract. So far, most methods for identifying sequences under selection based on comparative sequence data have either assumed selectional pressures are the same across all branch...
Adam C. Siepel, Katherine S. Pollard, David Haussl...
Abstract. Traditional nearest-neighbor (NN) search is based on two basic indexing approaches: object-based indexing and solution-based indexing. The former is constructed based on t...
Baihua Zheng, Jianliang Xu, Wang-Chien Lee, Dik Lu...
We consider a scenario where we want to query a large dataset that is stored in external memory and does not fit into main memory. The most constrained resources in such a situati...