Sciweavers

4491 search results - page 702 / 899
» Algorithm Engineering
Sort
View
CASES
2004
ACM
15 years 10 months ago
Automatic data partitioning for the agere payload plus network processor
With the ever-increasing pervasiveness of the Internet and its stringent performance requirements, network system designers have begun utilizing specialized chips to increase the ...
Steve Carr, Philip H. Sweany
CGO
2004
IEEE
15 years 10 months ago
Code Generation for Single-Dimension Software Pipelining of Multi-Dimensional Loops
Traditionally, software pipelining is applied either to the innermost loop of a given loop nest or from the innermost loop to the outer loops. In a companion paper, we proposed a ...
Hongbo Rong, Alban Douillet, Ramaswamy Govindaraja...
CGO
2004
IEEE
15 years 10 months ago
Custom Data Layout for Memory Parallelism
In this paper, we describe a generalized approach to deriving a custom data layout in multiple memory banks for array-based computations, to facilitate high-bandwidth parallel mem...
Byoungro So, Mary W. Hall, Heidi E. Ziegler
CODES
2004
IEEE
15 years 10 months ago
A loop accelerator for low power embedded VLIW processors
The high transistor density afforded by modern VLSI processes have enabled the design of embedded processors that use clustered execution units to deliver high levels of performan...
Binu K. Mathew, Al Davis
CODES
2004
IEEE
15 years 10 months ago
Operation tables for scheduling in the presence of incomplete bypassing
Register bypassing is a powerful and widely used feature in modern processors to eliminate certain data hazards. Although complete bypassing is ideal for performance, bypassing ha...
Aviral Shrivastava, Eugene Earlie, Nikil D. Dutt, ...