In LAPACK, many matrix operations are cast as block algorithms, which iteratively process a panel using an unblocked algorithm and then update a remainder matrix using the high perf...
Optimization is a healthy field that aims to find efficient solutions and algorithms for problems in both academia and industry. As the solved problems became harde...
Abstract. This paper presents PLDA, our parallel implementation of Latent Dirichlet Allocation on MPI and MapReduce. PLDA smooths out storage and computation bottlenecks and provid...
Yi Wang, Hongjie Bai, Matt Stanton, Wen-Yen Chen, ...
We describe a novel load-balancing method for sort-first parallel graphics rendering systems. It gives up geometry data which could be very large and tends to cause unacceptable c...
We present a new approach to preconditioning for very large, sparse, non-symmetric, linear systems. We explicitly compute an approximate inverse to our original matrix that can be...