Global address space languages like UPC exhibit high performance and portability on a broad class of shared and distributed memory parallel architectures. The most scalable applic...
A parallel MPEG-4 Simple Profile encoder for FPGA-based multiprocessor System-on-Chip (SOC) is presented. The goal is a computationally scalable framework independent of platform....
Olli Lehtoranta, Erno Salminen, Ari Kulmala, Marko...
In response to the constant increase in wire delays, Non-Uniform Cache Architecture (NUCA) has been introduced as an effective memory model for dealing with growing memory latenci...
We describe the MPI/SX implementation of the MPI-2 standard for one-sided communication (Remote Memory Access) for the NEC SX-5 vector supercomputer. MPI/SX is a non-threaded impl...
In this paper, we present techniques and algorithms to improve the performance of various communication patterns on message-passing platforms where, for reasons of safety, user-le...