A major challenge in document clustering is the extremely high dimensionality. For example, the vocabulary for a document set can easily be thousands of words. On the other hand, ...
The Peano Count Tree (P-tree) is a quadrant-based lossless tree representation of the original spatial data. The idea of the P-tree is to recursively divide the entire spatial data, s...
Qin Ding, Maleq Khan, Amalendu Roy, William Perriz...
To classify a new object in data mining, one commonly estimates its similarity to the given classes. The Function of Rival Similarity (FRiS) is intended to calculate quantitative mea...
Nikolay G. Zagoruiko, Irina V. Borisova, Vladimir ...
Background: With the biomedical literature continually expanding, searching PubMed for information about specific genes becomes increasingly difficult. Not only can thousands of r...
Catalina O. Tudor, Carl J. Schmidt, K. Vijay-Shank...
A number of medically important disease-causing bacteria (collectively called Gram-negative bacteria) are noted for the extra "outer" membrane that surrounds their cell....
Rong She, Fei Chen 0002, Ke Wang, Martin Ester, Je...