Computer architects are constantly faced with the need to improve performance and increase the efficiency of computation in their designs. To this end, it is increasingly common ...
Nathan Clark, Amir Hormati, Scott A. Mahlke, Sami ...
Bank locality can be defined as localizing the number of load/store accesses to a small set of memory banks at a given time. An optimizing compiler can modify a given input code t...
Guilin Chen, Mahmut T. Kandemir, Hendra Saputra, M...
Filtering based algorithms have become popular in tracking human body pose. Such algorithms can suffer the curse of dimensionality due to the high dimensionality of the pose state ...
Rui Li, Ming-Hsuan Yang, Stan Sclaroff, Tai-Peng T...
We present a domain-specific approach to representing datapaths for hardware implementations of linear signal transform algorithms. We extend the tensor structure for describing l...
This paper considers the efficient parallel implementation of control constructs and expressions written in a common software programming language and synthesised to FPGA platform...