Sciweavers

6781 search results - page 1071 / 1357
» Processors for Mobile Applications
Sort
View
PPOPP
2009
ACM
16 years 7 months ago
OpenMP to GPGPU: a compiler framework for automatic translation and optimization
GPGPUs have recently emerged as powerful vehicles for generalpurpose high-performance computing. Although a new Compute Unified Device Architecture (CUDA) programming model from N...
Seyong Lee, Seung-Jai Min, Rudolf Eigenmann
VLSID
2003
IEEE
147views VLSI» more  VLSID 2003»
16 years 7 months ago
SoC Synthesis with Automatic Hardware Software Interface Generation
Design of efficient System-on-Chips (SoCs) require thorough application analysis to identify various compute intensive parts. These compute intensive parts can be mapped to hardwa...
Amarjeet Singh 0002, Amit Chhabra, Anup Gangwar, B...
PPOPP
2010
ACM
16 years 1 months ago
Load balancing on speed
To fully exploit multicore processors, applications are expected to provide a large degree of thread-level parallelism. While adequate for low core counts and their typical worklo...
Steven Hofmeyr, Costin Iancu, Filip Blagojevic
SASP
2009
IEEE
291views Hardware» more  SASP 2009»
16 years 1 months ago
FCUDA: Enabling efficient compilation of CUDA kernels onto FPGAs
— As growing power dissipation and thermal effects disrupted the rising clock frequency trend and threatened to annul Moore’s law, the computing industry has switched its route...
Alexandros Papakonstantinou, Karthik Gururaj, John...
IISWC
2008
IEEE
16 years 1 months ago
Characterizing and improving the performance of Intel Threading Building Blocks
Abstract— The Intel Threading Building Blocks (TBB) runtime library [1] is a popular C++ parallelization environment [2][3] that offers a set of methods and templates for creatin...
Gilberto Contreras, Margaret Martonosi
« Prev « First page 1071 / 1357 Last » Next »