We describe methods for automatically identifying signature blocks and reply lines in plaintext email messages. This analysis has many potential applications, such as preprocessi...
An important class of queries is the LIKE predicate in SQL. In the absence of an index, LIKE queries are subject to performance degradation. The notion of indexing on substrings (...
This paper presents a unified utility framework for resource selection of distributed text information retrieval. This new framework shows an efficient and effective way to infer ...
Biosequences typically have a small alphabet, a long length, and patterns containing gaps (i.e., “don’t care”) of arbitrary size. Mining frequent patterns in such sequences ...
Despite the current practice of re-keying most documents placed in digital libraries, we continue to try to improve accuracy of automated recognition techniques for obtaining docum...