Most datasets in real applications come in from multiple sources. As a result, we often have attributes information about data objects and various pairwise relations (similarity) ...
Mean-Shift (MS) is a powerful non-parametric clustering method. Although good accuracy can be achieved, its computational cost is particularly expensive even on moderate data sets...
The annotation of words and phrases by ontology concepts is extremely helpful for semantic interpretation. However many ontologies, e.g. WordNet, are too fine-grained and even hu...
Latent Semantic Indexing (LSI) has been validated to be effective on many small scale text collections. However, little evidence has shown its effectiveness on unsampled large sca...
Many datasets can be described in the form of graphs or networks where nodes in the graph represent entities and edges represent relationships between pairs of entities. A common ...