We propose a cooperative methodology for multithreaded software, where threads use traditional synchronization idioms such as locks, but additionally document each point of potent...
Recently a GPU has acquired programmability to perform general purpose computation fast by running ten thousands of threads concurrently. This paper presents a new algorithm for d...
The growing trend towards adoption of flexible and heterogeneous, parallel computing architectures has increased the challenges faced by the programming community. We propose a me...
Abstract. Programmability is an important capability that provides flexible computing devices, but it incurs significant performance and power penalties. We have proposed an archit...
Arthur Abnous, Katsunori Seno, Yuji Ichikawa, Marl...
Massively parallel processor array architectures can be used as hardware accelerators for a plenty of dataflow dominant applications. Bilateral filtering is an example of a stat...