Sampling is a popular method of data collection when it is impossible or too costly to reach the entire population. For example, television show ratings in the United States are g...
We consider the problem of the binary classification on imbalanced data, in which nearly all the instances are labelled as one class, while far fewer instances are labelled as the...
Kaizhu Huang, Haiqin Yang, Irwin King, Michael R. ...
Background: Solexa/Illumina short-read ultra-high throughput DNA sequencing technology produces millions of short tags (up to 36 bases) by parallel sequencing-by-synthesis of DNA ...
Jacques Rougemont, Arnaud Amzallag, Christian Isel...
We present the architecture of an end-to-end semantic search engine that uses a graph data model to enable interactive query answering over structured and interlinked data collecte...
We consider the problem of specifying data structures with complex sharing in a manner that is both declarative and results in provably correct code. In our approach, abstract data...
Peter Hawkins, Alex Aiken, Kathleen Fisher, Martin...