Sciweavers

2020 search results - page 284 / 404
» Scalable Instruction-Level Parallelism.
IEEEPACT
2007
IEEE
16 years 22 days ago
A Flexible Heterogeneous Multi-Core Architecture
Multi-core processors naturally exploit thread-level parallelism (TLP). However, extracting instruction-level parallelism (ILP) from individual applications or threads is still a ...
Miquel Pericàs, Adrián Cristal, Fran...
MICRO
2007
IEEE
16 years 21 days ago
Revisiting the Sequential Programming Model for Multi-Core
Single-threaded programming is already considered a complicated task. The move to multi-threaded programming only increases the complexity and cost involved in software developmen...
Matthew J. Bridges, Neil Vachharajani, Yun Zhang, ...
CF
2004
ACM
15 years 12 months ago
Improving the execution time of global communication operations
Many parallel applications from scientific computing use MPI global communication operations to collect or distribute data. Since the execution times of these communication opera...
Matthias Kühnemann, Thomas Rauber, Gudula R...
IPPS
2002
IEEE
15 years 11 months ago
Can User-Level Protocols Take Advantage of Multi-CPU NICs?
Modern high speed interconnects such as Myrinet and Gigabit Ethernet have shifted the bottleneck in communication from the interconnect to the messaging software at the sending an...
Piyush Shivam, Pete Wyckoff, Dhabaleswar K. Panda
IPPS
2010
IEEE
15 years 4 months ago
Optimization of linked list prefix computations on multithreaded GPUs using CUDA
We present a number of optimization techniques to compute prefix sums on linked lists and implement them on multithreaded GPUs using CUDA. Prefix computations on linked structures ...
Zheng Wei, Joseph JáJá