We present solutions to statically load-balance scatter operations in parallel codes run on Grids. Our loadbalancing strategy is based on the modification of the data distributio...
The characteristics of irregular algorithms make a parallel implementation difficult, especially for PC clusters or clusters of SMPs. These characteristics may include an unpredi...
Map- and fold-like skeletons are a suitable abstractions to guide parallel program execution in functional array processing. However, when it comes to achieving high performance, i...
In this paper, we present a new packaging architecture for chip-level optical interconnections based on imaging fiber bundles. Imaging fiber bundles consist of densely packed arra...
We study the problem of exploiting parallelism from search-based AI systems on distributed machines. We propose stack-splitting, a technique for implementing orparallelism, which ...