The last decade has seen a huge interest in classification of time series. Most of this work assumes that the data resides in main memory and is processed offline. However, recent...
Shashwati Kasetty, Candice Stafford, Gregory P. Wa...
In this paper, we propose an algorithm and data structure for computing the term contributed frequency (tcf) for all N-grams in a text corpus. Although term frequency is one of th...
Detecting and eliminating fuzzy duplicates is a critical data cleaning task that is required by many applications. Fuzzy duplicates are multiple seemingly distinct tuples which re...
This paper presents an application of a hierarchical social (HS) metaheuristic to region-based segmentation. The original image is modelled as a simplified image graph, w...
In practice, one is often faced with incomplete phylogenetic data, such as a collection of partial trees or partial splits. This paper poses the problem of inferring a phylogene...
Daniel H. Huson, Tobias Dezulian, Tobias H. Klö...