The Cell processor is a typical example of a heterogeneous multiprocessor-on-chip architecture that uses several levels of parallelism to deliver high performance. Closing the gap ...
This paper presents our experience developing applications in Jade, a portable, implicitly parallel programming language designed for exploiting task-level concurrency. Jade progr...
The past 10 years have delivered two significant revolutions. (1) Microprocessor design has been transformed by the limits of chip power, wire latency, and Dennard scaling—leadi...
Hadi Esmaeilzadeh, Ting Cao, Xi Yang, Stephen Blac...
This paper explores collective personalized communication. For example, in all-to-all personalized communication (AAPC), each processor sends a distinct message to every other pro...
We introduce a new deterministic parallel sorting algorithm based on the regular sampling approach. The algorithm uses only two rounds of regular all-to-all personalized communica...