Sciweavers

1075 search results - page 119 / 215
» Parallel Programming with Transactional Memory
Sort
View
IPPS
2009
IEEE
16 years 29 days ago
Scalable RDMA performance in PGAS languages
Partitioned Global Address Space (PGAS) languages provide a unique programming model that can span shared-memory multiprocessor (SMP) architectures, distributed memory machines, o...
Montse Farreras, George Almási, Calin Casca...
ASAP
2007
IEEE
136views Hardware» more  ASAP 2007»
16 years 20 days ago
0/1 Knapsack on Hardware: A Complete Solution
We present a memory efficient, practical, systolic, parallel architecture for the complete 0/1 knapsack dynamic programming problem, including backtracking. This problem was inte...
K. Nibbelink, S. Rajopadhye, R. McConnell
PPOPP
2011
ACM
14 years 9 months ago
GRace: a low-overhead mechanism for detecting data races in GPU programs
In recent years, GPUs have emerged as an extremely cost-effective means for achieving high performance. Many application developers, including those with no prior parallel program...
Mai Zheng, Vignesh T. Ravi, Feng Qin, Gagan Agrawa...
ICFP
2012
ACM
13 years 8 months ago
Nested data-parallelism on the gpu
Graphics processing units (GPUs) provide both memory bandwidth and arithmetic performance far greater than that available on CPUs but, because of their Single-Instruction-Multiple...
Lars Bergstrom, John H. Reppy
CF
2006
ACM
15 years 10 months ago
Landing openMP on cyclops-64: an efficient mapping of openMP to a many-core system-on-a-chip
This paper presents our experience mapping OpenMP parallel programming model to the IBM Cyclops-64 (C64) architecture. The C64 employs a many-core-on-a-chip design that integrates...
Juan del Cuvillo, Weirong Zhu, Guang R. Gao