We consider the setting of a multiprocessor where the speeds of the m processors can be individually scaled. Jobs arrive over time and have varying degrees of parallelizability. A...
Transactional memory (TM) eliminates many problems associated with lock-based synchronization. Over recent years, much progress has been made in software and hardware implementati...
Tatiana Shpeisman, Ali-Reza Adl-Tabatabai, Robert ...
We analyze the performance of protocols for load balancing in distributed systems based on no-regret algorithms from online learning theory. These protocols treat load balancing a...
As the last-level on-chip caches in chip-multiprocessors increase in size, the physical locality of on-chip data becomes important for delivering high performance. The non-uniform...
Linked data structure (LDS) accesses are critical to the performance of many large scale applications. Techniques have been proposed to prefetch such accesses. Unfortunately, many...