— Clustering is grouping of patterns according to similarity or distance in different perspectives. Various data representations, similarity measurements and organization manners...
— Imputation of missing data is important in many areas, such as reducing non-response bias in surveys and maintaining medical documentation. Nearest neighbour (NN) imputation al...
The problem of identifying deviating patterns in XML repositories has important applications in data cleaning, fraud detection, and stock market analysis. Current methods determine...
We propose a novel extraction approach that exploits content redundancy on the web to extract structured data from template-based web sites. We start by populating a seed database...
Pankaj Gulhane, Rajeev Rastogi, Srinivasan H. Seng...
Similarity search in texts, notably in biological sequences, has received substantial attention in the last few years. Numerous filtration and indexing techniques have been create...