Sciweavers

» Improving memory hierarchy performance for irregular applica...
DEBS
2010
ACM
15 years 10 months ago
Evaluation of streaming aggregation on parallel hardware architectures
We present a case study parallelizing streaming aggregation on three different parallel hardware architectures. Aggregation is a performance-critical operation for data summarizat...
Scott Schneider, Henrique Andrade, Bugra Gedik, Ku...
SPE
1998
15 years 6 months ago
Timing Trials, or the Trials of Timing: Experiments with Scripting and User-Interface Languages
This paper describes some basic experiments to see how fast various popular scripting and user-interface languages run on a spectrum of representative tasks. We found enormous var...
Brian W. Kernighan, Christopher J. Van Wyk
IPPS
2009
IEEE
16 years 29 days ago
Designing multi-leader-based Allgather algorithms for multi-core clusters
The increasing demand for computational cycles is being met by the use of multi-core processors. Having a large number of cores per node necessitates multi-core aware designs to ext...
Krishna Chaitanya Kandalla, Hari Subramoni, Gopala...
ISPDC
2010
IEEE
15 years 4 months ago
Resource-Aware Compiler Prefetching for Many-Cores
Super-scalar, out-of-order processors that can have tens of read and write requests in the execution window place significant demands on Memory Level Parallelism (MLP). Multi- ...
George C. Caragea, Alexandros Tzannes, Fuat Keceli...
MICRO
2003
IEEE
15 years 11 months ago
Reducing Design Complexity of the Load/Store Queue
With faster CPU clocks and wider pipelines, all relevant microarchitecture components should scale accordingly. There have been many proposals for scaling the issue queue, registe...
Il Park, Chong-liang Ooi, T. N. Vijaykumar