Sciweavers

» The Tau Parallel Performance System
IPPS
2010
IEEE
15 years 4 months ago
Servet: A benchmark suite for autotuning on multicore clusters
The growing complexity in computer system hierarchies due to the increase in the number of cores per processor, levels of cache (some of them shared) and the number of pr...
Jorge González-Domínguez, Guillermo ...
ICDCS
2010
IEEE
15 years 4 months ago
Versatile Stack Management for Multitasking Sensor Networks
The networked application environment has motivated the development of multitasking operating systems for sensor networks and other low-power electronic devices, but thei...
Rui Chu, Lin Gu, Yunhao Liu, Mo Li, Xicheng Lu
ISCA
2011
IEEE
14 years 10 months ago
Dark silicon and the end of multicore scaling
Since 2005, processor designers have increased core counts to exploit Moore’s Law scaling, rather than focusing on single-core performance. The failure of Dennard scaling, to wh...
Hadi Esmaeilzadeh, Emily R. Blem, Renée St....
TPDS
2002
15 years 6 months ago
Recursive Array Layouts and Fast Matrix Multiplication
The performance of both serial and parallel implementations of matrix multiplication is highly sensitive to memory system behavior. False sharing and cache conflicts cause traditi...
Siddhartha Chatterjee, Alvin R. Lebeck, Praveen K....
SAMOS
2010
Springer
15 years 5 months ago
Interleaving granularity on high bandwidth memory architecture for CMPs
Memory bandwidth has always been a critical factor for the performance of many data-intensive applications. The increasing processor performance and the advent of single chip m...
Felipe Cabarcas, Alejandro Rico, Yoav Etsion, Alex...