On modern computers, the performance of programs is often limited by memory latency rather than by processor cycle time. To reduce the impact of memory latency, the restructuring ...
Induprakas Kodukula, Keshav Pingali, Robert Cox, D...
In the Finite Capacity Dial-a-Ride problem the input is a metric space, a set of objects {di}, each specifying a source si and a destination ti, and an integer k--the capacity of t...
A Zero Overhead Loop Buffer (ZOLB) is an architectural feature that is commonly found in DSP processors. This buffer can be viewed as a compiler managed cache that contains a sequ...
Gang-Ryung Uh, Yuhong Wang, David B. Whalley, Sanj...
Instruction scheduling methods based on the construction of state diagrams (or automata) have been used for architectures involving deeply pipelined function units. However, the s...
Ramaswamy Govindarajan, N. S. S. Narasimha Rao, Er...
We present a scalable algorithm for the parallel computation of inverted files for large text collections. The algorithm takes into account an environment of a high bandwidth netw...
Berthier A. Ribeiro-Neto, Joao Paulo Kitajima, Gon...