Time of creation is one of the predominant (often implicit) clustering strategies found not only in Data Warehouse systems: line items are created together with their correspondin...
All Netflix Prize algorithms proposed so far are prohibitively costly for large-scale production systems. In this paper, we describe an efficient dataflow implementation of a coll...
Srivatsava Daruru, Nena M. Marin, Matt Walker, Joy...
We consider the problem of multiclass classification where both labeled and unlabeled data points are given. We introduce and demonstrate a new approach for estimating a distribut...
Developers of new image analysis algorithms typically require an interactive environment in which the image data can be passed through various operators, some of which may involve fe...
M. Stella Atkins, Torre Zuk, B. Johnston, T. Arden
XML has emerged as a common standard for data exchange over the World Wide Web. One way to manage XML data is to use the power of relational databases for storing and q...
Amir Jahangard Rafsanjani, Seyed-Hassan Mirian-Hos...