Background: With the current technological advances in high-throughput biology, the necessity to develop tools that help to analyse the massive amount of data being generated is e...
Maciej Paszkowski-Rogacz, Mikolaj Slabicki, M. Ter...
Traditionally, research in identifying structured entities in documents has proceeded independently of document categorization research. In this paper, we observe that these two t...
The success of popular algorithms such as k-means clustering or nearest neighbor searches depends on the assumption that the underlying distance functions reflect domain-specific n...
Background: When conducting multiple hypothesis tests, it is important to control the number of false positives, or the False Discovery Rate (FDR). However, there is a tradeoff be...
The clustering algorithm DBSCAN relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape as well as to distinguish noise. In this paper,...