A key challenge in achieving high performance on software DSM systems is overcoming their relatively large communication latencies. In this paper, we consider two techniques which...
Many embedded systems contain resource constrained microcontrollers where applications, operating system components and device drivers reside within a single address space with no...
Ram Kumar, Akhilesh Singhania, Andrew Castner, Edd...
Abstract--One important bottleneck when visualizing large data sets is the data transfer between processor and memory. Cacheaware (CA) and cache-oblivious (CO) algorithms take into...
In recent years, GPUs have emerged as an extremely cost-effective means for achieving high performance. Many application developers, including those with no prior parallel program...
Mai Zheng, Vignesh T. Ravi, Feng Qin, Gagan Agrawa...
Memory trace analysis is an important technology for architecture research, system software (i.e., OS, compiler) optimization, and application performance improvements. Many appro...
Yungang Bao, Mingyu Chen, Yuan Ruan, Li Liu, Jianp...