Software code caches are increasingly being used to amortize the runtime overhead of dynamic optimizers, simulators, emulators, dynamic translators, dynamic compilers, and other t...
Derek Bruening, Vladimir Kiriansky, Timothy Garnet...
It has been shown that a small number of FPGAs can significantly accelerate certain computing tasks by up to two or three orders of magnitude. However, particularly intensive lar...
Arun Patel, Christopher A. Madill, Manuel Salda&nt...
We present Grouped Distributed Queues (GDQ), the first proportional share scheduler for multiprocessor systems that scales well with a large number of processors and processes. G...
Computing and maintaining network structures for efficient data aggregation incurs high overhead for dynamic events where the set of nodes sensing an event changes with time. Mor...
Device scaling and large scale integration have led to growing concerns about soft errors in microprocessors. To date, in all but the most demanding applications, implementing par...