Sciweavers

2277 search results - page 294 / 456
» Clustering by pattern similarity in large data sets
VLDB
2004
ACM
16 years 3 days ago
Structures, Semantics and Statistics
At a fundamental level, the key challenge in data integration is to reconcile the semantics of disparate data sets, each expressed with a different database structure. I argue th...
Alon Y. Halevy
WWW
2011
ACM
15 years 1 months ago
Parallel boosted regression trees for web search ranking
Gradient Boosted Regression Trees (GBRT) are the current state-of-the-art learning paradigm for machine-learned web search ranking, a domain notorious for very large data sets. ...
Stephen Tyree, Kilian Q. Weinberger, Kunal Agrawal...
ICCS
2007
Springer
16 years 28 days ago
Searching and Updating Metric Space Databases Using the Parallel EGNAT
The Evolutionary Geometric Near-neighbor Access Tree (EGNAT) is a recently proposed data structure that is suitable for indexing large collections of complex objects. It ...
Mauricio Marín, Roberto Uribe, Ricardo J. B...
JCB
2007
15 years 6 months ago
Bayesian Inference of MicroRNA Targets from Sequence and Expression Data
MicroRNAs (miRNAs) regulate a large proportion of mammalian genes by hybridizing to targeted messenger RNAs (mRNAs) and down-regulating their translation into protein. Although mu...
Jim C. Huang, Quaid Morris, Brendan J. Frey
BMCBI
2011
14 years 10 months ago
PeakRanger: A cloud-enabled peak caller for ChIP-seq data
Background: Chromatin immunoprecipitation (ChIP), coupled with massively parallel short-read sequencing (seq) is used to probe chromatin dynamics. Although there are many algorith...
Xin Feng, Robert Grossman, Lincoln Stein