Matrix transpose in parallel systems typically involves costly all-to-all communications. In this paper, we provide a comparative characterization of various efficient algorithms f...
Similarity search is a fundamental operation for applications that deal with unstructured data sources. In this paper we propose a new pivot-based method for similarity search, ca...
We address the problem of efficient out-of-core code generation for a special class of imperfectly nested loops encoding tensor contractions arising in quantum chemistry computati...
The power density in high performance systems continues to rise with every process technology generation, thereby increasing the operating temperature and creating ‘hot spots’...
Giacomo Paci, Francesco Poletti, Luca Benini, Paul...
We present families of algorithms for operations related to the computation of the inverse of a Symmetric Positive Definite (SPD) matrix: Cholesky factorization, inversion of a tr...
Paolo Bientinesi, Brian C. Gunter, Robert A. van d...