Data locality and synchronization overhead are two important factors that affect the performance of applications on multiprocessors. Loop fusion is an effective way to reduce s...
Edwin Hsing-Mean Sha, Chenhua Lang, Nelson L. Pass...
This paper presents an algorithm and a data structure for scalable dynamic synchronization in fine-grained parallelism. The algorithm supports the full generality of phasers with d...
Stefan Marr, Stijn Verhaegen, Bruno De Fraine, The...
Parallel programming models based on a mixture of task and data parallelism have been shown to be successful in addressing the increasing communication overhead of distributed memory p...
In order for parallel logic programming systems to become popular, they should serve the broadest range of applications. To achieve this goal, designers of parallel logic programm...
Most studies on adaptive partitioning policies for scheduling parallel jobs on distributed memory parallel computers ignore the constraints imposed by the memory requirem...