Sciweavers

1611 search results - page 220 / 323
» A Library for Self-Adjusting Computation
SC
2009
ACM
16 years 1 month ago
Automating the generation of composed linear algebra kernels
Memory bandwidth limits the performance of important kernels in many scientific applications. Such applications often use sequences of Basic Linear Algebra Subprograms (BLAS), an...
Geoffrey Belter, Elizabeth R. Jessup, Ian Karlin, ...
CLUSTER
2009
IEEE
16 years 29 days ago
Integrating software distributed shared memory and message passing programming
Software Distributed Shared Memory (SDSM) systems provide programmers with a shared memory programming environment across distributed memory architectures. In contrast t...
H'sien J. Wong, Alistair P. Rendell
ICPPW
2009
IEEE
16 years 25 days ago
Just-in-Time Renaming and Lazy Write-Back on the Cell/B.E.
Cell Superscalar (CellSs) provides a simple, flexible and easy programming approach for the Cell Broadband Engine (Cell/B.E.) that automatically exploits the inherent concurre...
Pieter Bellens, Josep M. Pérez, Rosa M. Bad...
IEEEPACT
2009
IEEE
16 years 25 days ago
Automatic Tuning of Discrete Fourier Transforms Driven by Analytical Modeling
Analytical models have been used to estimate optimal values for parameters such as tile sizes in the context of loop nests. However, important algorithms such as fast Fourier tr...
Basilio B. Fraguela, Yevgen Voronenko, Markus Pü...
PVM
2009
Springer
16 years 22 days ago
Processing MPI Datatypes Outside MPI
The MPI datatype functionality provides a powerful tool for describing structured memory and file regions in parallel applications, enabling noncontiguous data to be operated on b...
Robert B. Ross, Robert Latham, William Gropp, Ewin...