The join is the most important, but also the most time-consuming, operation in relational database systems. We implemented the parallel Hybrid Hash Join algorithm on a PC-cluster a...
We present an approach to document clustering based on winnowing fingerprints that achieved good effectiveness with considerable savings in memory space and computation tim...
Background: Microarray technologies produce large amounts of data. Hierarchical clustering is commonly used to identify clusters of co-expressed genes. However, microarray dat...
Alexandre G. de Brevern, Serge A. Hazout, Alain Ma...
Classical clustering algorithms are based on the concept that a cluster center is a single point. Clusters which are not compact around a single point are not candidates for class...
Fuzzy/similarity joins have been widely studied in the research community and extensively used in real-world applications. This paper proposes and evaluates several algorithms f...
Foto N. Afrati, Anish Das Sarma, David Menestrina,...