Modern chip multiprocessors (CMPs) are designed to exploit both instruction-level parallelism (ILP) within processors and thread-level parallelism (TLP) within and across processo...
Changkyu Kim, Simha Sethumadhavan, M. S. Govindan,...
Instruction set customization is an effective way to improve processor performance. Critical portions of application dataflow graphs are collapsed for accelerated execution on s...
Nathan Clark, Jason A. Blome, Michael L. Chu, Scot...
We address the problem of message ordering for reliable multicast communication. End-to-end multicast ordering is useful for ensuring the collective integrity and consistency of d...
One model of multithreading gaining popularity on multiprocessor systems is the message-driven model of computation. The message-driven model is a reactive model in which an arriv...
This paper introduces an analysis technique, commutativity analysis, for automatically parallelizing computations that manipulate dynamic, pointer-based data structures. Commutati...