Bulk Synchronous Parallel ML is a high-level language for programming parallel algorithms. Built upon OCaml and using the BSP model, it provides a safe setting for their implementa...
Abstract--Loop tiling is an important compiler transformation used for enhancing data locality and exploiting coarsegrained parallelism. Tiled codes in which tile sizes are runtime...
Albert Hartono, Muthu Manikandan Baskaran, J. Rama...
We propose CCRank, the first parallel algorithm for learning to rank, targeting simultaneous improvement in learning accuracy and efficiency. CCRank is based on cooperative coev...
Shuaiqiang Wang, Byron J. Gao, Ke Wang, Hady Wiraw...
—Much of dense linear algebra has been successfully blocked to concentrate the majority of its time in the Level 3 BLAS, which are not only efficient for serial computation, but...
Because of the growing use of multimedia content over Internet, Content-Based Image Retrieval (CBIR) has recently received a lot of interest. While accurate search techniques base...