In this paper, we study the performance improvements and trade-offs derived from an optimized mapping approach applied to a parametric coarse-grained reconfigurable array architect...
Grigoris Dimitroulakos, Michalis D. Galanis, Const...
Embedded systems have strict timing and code size requirements. Retiming is one of the most important optimization techniques to improve the execution time of loops by in...
Chun Xue, Zili Shao, Meilin Liu, Mei Kang Qiu, Edw...
All-to-all personalized exchange is one of the densest collective communication patterns and occurs in many important parallel computing and networking applications. In this paper,...
In this paper, we present a randomized algorithm for the multipacket (i.e., k − k) routing problem on an n × n mesh. The algorithm completes with high probability in at the mos...
One of the most fundamental problems confronting automatic parallelization tools is finding an optimal domain decomposition for a given application. For regular domain prob...