Many applications in text processing require significant human effort for either labeling large document collections (when learning statistical models) or extrapolating rules from...
In recent years, publish-subscribe (pub-sub) systems based on XML document filtering have received much attention. In a typical pubsub system, subscribed users specify their inte...
Discretization algorithms have played an important role in data mining and knowledge discovery. They not only produce a concise summarization of continuous attributes to help the ...
: In this article, we propose an efficient and effective method for finding arbitrarily oriented subspace clusters by mapping the data space to a parameter space defining the set o...
The problem of identifying approximately duplicate records in databases is an essential step for data cleaning and data integration processes. Most existing approaches have relied...