The design of application (-domain) specific instructionset processors (ASIPs), optimized for code size, has traditionally been accompanied by the necessity to program assembly, ...
Parallel bit stream algorithms exploit the SWAR (SIMD within a register) capabilities of commodity processors in high-performance text processing applications such as UTF8 to UTF-...
We define the reachability-bound problem to be the problem of finding a symbolic worst-case bound on the number of times a given control location inside a procedure is visited in ...
The performance benefits of GPU parallelism can be enormous, but unlocking this performance potential is challenging. The applicability and performance of GPU parallelizations is...
Thomas B. Jablin, Prakash Prabhu, James A. Jablin,...
In writing parallel programs, programmers expose parallelism and optimize it to meet a particular performance goal on a single platform under an assumed set of workload characteri...
Arun Raman, Hanjun Kim, Taewook Oh, Jae W. Lee, Da...