Sciweavers

442 search results - page 69 / 89
» Parallel programming over ChinaGrid
Sort
View
PPOPP
2009
ACM
16 years 6 months ago
A compiler-directed data prefetching scheme for chip multiprocessors
Data prefetching has been widely used in the past as a technique for hiding memory access latencies. However, data prefetching in multi-threaded applications running on chip multi...
Dhruva Chakrabarti, Mahmut T. Kandemir, Mustafa Ka...
CLUSTER
2007
IEEE
16 years 6 days ago
Balancing productivity and performance on the cell broadband engine
— The Cell Broadband Engine (BE) is a heterogeneous multicore processor, combining a general-purpose POWER architecture core with eight independent single-instructionmultiple-dat...
Sadaf R. Alam, Jeremy S. Meredith, Jeffrey S. Vett...
CODES
2003
IEEE
15 years 11 months ago
Design space exploration of a hardware-software co-designed GF(2m) galois field processor for forward error correction and crypt
This paper describes a hardware-software co-design approach for flexible programmable Galois Field Processing for applications which require operations over GF(2m ), such as RS an...
Wei Ming Lim, Mohammed Benaissa
HPCA
1999
IEEE
15 years 10 months ago
Dynamically Exploiting Narrow Width Operands to Improve Processor Power and Performance
In general-purpose microprocessors, recent trends have pushed towards 64-bit word widths, primarily to accommodate the large addressing needs of some programs. Many integer proble...
David Brooks, Margaret Martonosi
PLDI
1995
ACM
15 years 9 months ago
Unifying Data and Control Transformations for Distributed Shared Memory Machines
We present a unified approach to locality optimization that employs both data and control transformations. Data transformations include changing the array layout in memory. Contr...
Michal Cierniak, Wei Li