Sciweavers

421 search results - page 6 / 85
ICPP
1999
IEEE
15 years 10 months ago
Access Descriptor Based Locality Analysis for Distributed-Shared Memory Multiprocessors
Most of today's multiprocessors have a Distributed-Shared Memory (DSM) organization, which enables scalability while retaining the convenience of the shared-memory programming...
Angeles G. Navarro, Rafael Asenjo, Emilio L. Zapat...
LCPC
2009
Springer
15 years 10 months ago
Unrolling Loops Containing Task Parallelism
Classic loop unrolling increases the performance of sequential loops by reducing the overhead of the non-computational parts of the loop. Unfortunately, when the loop con...
Roger Ferrer, Alejandro Duran, Xavier Martorell, E...
IPPS
1997
IEEE
15 years 10 months ago
A Compile-Time Partitioning Strategy for Non-Rectangular Loop Nests
This paper presents a compile-time scheme for partitioning non-rectangular loop nests which consist of inner loops whose bounds depend on the index of the outermost, parallel loop...
Rizos Sakellariou
IPPS
1997
IEEE
15 years 10 months ago
A BSP Approach to the Scheduling of Tightly-Nested Loops
This paper addresses the scheduling of uniform-dependence loop nests within the framework of the bulk-synchronous parallel (BSP) model. Two broad classes of tightly-nested loops are...
Radu Calinescu
PLDI
1993
ACM
15 years 10 months ago
Global Optimizations for Parallelism and Locality on Scalable Parallel Machines
Data locality is critical to achieving high performance on large-scale parallel machines. Non-local data accesses result in communication that can greatly impact performance. Thus ...
Jennifer-Ann M. Anderson, Monica S. Lam