Sciweavers

» Data Set Balancing
BMCBI
2005
Feature selection and classification for microarray data analysis: Evolutionary methods for identifying predictive genes
Background: In the clinical context, samples assayed by microarray are often classified by cell line or tumour type and it is of interest to discover a set of genes that can be us...
Thanyaluk Jirapech-Umpai, J. Stuart Aitken
ICDE
2007
IEEE
SKYPEER: Efficient Subspace Skyline Computation over Distributed Data
Skyline query processing has received considerable attention in the recent past. Mainly, the skyline query is used to find a set of non dominated data points in a multidimensional...
Akrivi Vlachou, Christos Doulkeridis, Yannis Kotid...
ICPR
2002
IEEE
Prototype Selection for Finding Efficient Representations of Dissimilarity Data
The nearest neighbor (NN) rule is a simple and intuitive method for solving classification problems. Originally, it uses distances to the complete training set. It performs well, ...
Elzbieta Pekalska, Robert P. W. Duin
OSDI
2004
ACM
MapReduce: Simplified Data Processing on Large Clusters
MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to ge...
Jeffrey Dean, Sanjay Ghemawat
EDBT
2010
ACM
HARRA: fast iterative hashed record linkage for large-scale data collections
We study the performance issue of the “iterative” record linkage (RL) problem, where match and merge operations may occur together in iterations until convergence emerges. We ...
Hung-sik Kim, Dongwon Lee