Sequential pattern mining has been an emerging problem in data mining. In this paper, we propose a new algorithm for mining frequent sequences. It processes only one scan of the da...
Stemming can improve retrieval accuracy, but stemmers are language-specific. Character n-gram tokenization achieves many of the benefits of stemming in a language independent way,...
The problem of storage or transmission of codevectors is an essential issue in vector quantization with custom codebook. The proposed technique for compression of codebooks relies ...
This paper presents a region-representation scheme and comparative analysis methods based on a tesseral addressing system. The proposed scheme is described in the context of perfo...
A path dictionary encodes the connections among objects in the aggregation hierarchy. It has been shown to be an efficient mechanism for supporting nested object queries [6]. In t...