Code placement techniques have traditionally improved instruction fetch bandwidth by increasing instruction locality and decreasing the number of taken branches. However, traditio...
We are attacking the memory bottleneck by building a “smart” memory controller that improves effective memory bandwidth, bus utilization, and cache efficiency by letting appl...
Binu K. Mathew, Sally A. McKee, John B. Carter, Al...
This paper demonstrates how an Instruction Path Coprocessor (I-COP) can be efficiently implemented using the PipeRench reconfigurable architecture. An I-COP is a programmable on-c...
Yuan C. Chou, Pazhani Pillai, Herman Schmit, John ...
In this paper we examine the effect of receptive field designs on classification accuracy in the commonly adopted pipeline of image classification. While existing algorithms us...
Because of the growing importance of decimal floating-point (DFP) arithmetic, specifications for it are included in the IEEE Draft Standard for Floating-point Arithmetic (IEEE P75...
Charles Tsen, Sonia Gonzalez-Navarro, Michael J. S...