Background: Sequencing of environmental DNA (often called metagenomics) has shown tremendous potential to uncover the vast number of unknown microbes that cannot be cultured and s...
We propose two fast algorithms for abrupt change detection in streaming data that can operate on arbitrary unknown data distributions before and after the change. The first algor...
Boosting is a general method for improving the accuracy of learning algorithms. We use boosting to construct improved privacy-preserving synopses of an input database. These are da...
Gradient Boosted Regression Trees (GBRT) are the current state-of-the-art learning paradigm for machine learned websearch ranking — a domain notorious for very large data sets. ...
Stephen Tyree, Kilian Q. Weinberger, Kunal Agrawal...
Expressing complex OLAP queries using group-by, aggregation, and joins can be extremely difficult. As a result of this, many alternate ways of expressing such queries have been de...