Previous work on mining transactional databases has focused primarily on mining frequent itemsets, association rules, and sequential patterns. However, interesting relationships be...
Sufficiently high data quality is crucial for almost every application. Nonetheless, data quality issues are nearly omnipresent. The reasons for poor quality cannot simply be bla...
As the amount of available data continues to increase, more and more effective means for discovering important patterns and relationships within that data are required. Although t...
Machine-learning algorithms are employed in a wide variety of applications to extract useful information from data sets, and many are known to suffer from superlinear increases in ...
Karthik Nagarajan, Brian Holland, Alan D. George, ...
Addressed in this paper is the issue of 'email data cleaning' for text mining. Many text mining applications need to take emails as input. Email data is usually noisy and thus i...