Sciweavers

» Optimizing the Use of High Performance Software Libraries
CODES
2008
IEEE
16 years 1 month ago
Static analysis for fast and accurate design space exploration of caches
Application-specific system-on-chip platforms create the opportunity to customize the cache configuration for optimal performance with minimal chip estate. Simulation, in partic...
Yun Liang, Tulika Mitra
HIPEAC
2011
Springer
14 years 6 months ago
NoC-aware cache design for multithreaded execution on tiled chip multiprocessors
In chip multiprocessors (CMPs), data access latency depends on the memory hierarchy organization, the on-chip interconnect (NoC), and the running workload. Reducing data access late...
Ahmed Abousamra, Alex K. Jones, Rami G. Melhem
IPPS
1998
IEEE
15 years 11 months ago
Update Protocols and Iterative Scientific Applications
Software DSMs have been a research topic for over a decade. While good performance has been achieved in some cases, consistent performance has continued to elude researchers. This...
Peter J. Keleher
SIGMOD
2005
ACM
16 years 6 months ago
Fast and Approximate Stream Mining of Quantiles and Frequencies Using Graphics Processors
We present algorithms for fast quantile and frequency estimation in large data streams using graphics processor units (GPUs). We exploit the high computational power and memory ba...
Naga K. Govindaraju, Nikunj Raghuvanshi, Dinesh Ma...
PPOPP
1999
ACM
15 years 11 months ago
MagPIe: MPI's Collective Communication Operations for Clustered Wide Area Systems
Writing parallel applications for computational grids is a challenging task. To achieve good performance, algorithms designed for local area networks must be adapted to the differ...
Thilo Kielmann, Rutger F. H. Hofman, Henri E. Bal,...