Data de-duplication has become a commodity component in dataintensive systems and it is required that these systems provide high reliability comparable to others. Unfortunately, b...
Chuanyi Liu, Yu Gu, Linchun Sun, Bin Yan, Dongshen...
This paper investigates the problem of Partitioning Skew1 in MapReduce-based system. Our studies with Hadoop, a widely used MapReduce implementation, demonstrate that the presence ...
Shadi Ibrahim, Hai Jin, Lu Lu, Song Wu, Bingsheng ...
Background: Genome context methods have been introduced in the last decade as automatic methods to predict functional relatedness between genes in a target genome using the patter...
Background: In a high throughput setting, effective flow cytometry data analysis depends heavily on proper data preprocessing. While usual preprocessing steps of quality assessmen...
Greg Finak, Juan-Manuel Perez, Andrew Weng, Raphae...
Background: Outer membrane proteins (OMPs) are frequently found in the outer membranes of gram-negative bacteria, mitochondria and chloroplasts and have been found to play diverse...