Many code analysis techniques for optimization, debugging, or parallelization need to perform runtime disambiguation of sets of addresses. Such operations can be supported efficie...
James Tuck, Wonsun Ahn, Luis Ceze, Josep Torrellas
Abstract--In this paper, we analyze the computational challenges in implementing particle filtering, especially to video sequences. Particle filtering is a technique used for filte...
Aswin C. Sankaranarayanan, Ankur Srivastava, Rama ...
Memory-intensive threads can hoard shared resources without making progress on a multithreading processor (SMT), thereby hindering the overall system performance. A recent promisi...
Abstract—The increasing number of cores per node in highperformance computing requires an efficient intra-node MPI communication subsystem. Most existing MPI implementations rel...
CC-NUMA is a widely adopted and deployed architecture of high performance computers. These machines are attractive for their transparent access to local and remote memory. However...