Linear DSP kernels such as transforms and filters are comprised exclusively of additions and multiplications by constants. These multiplications may be realized as networks of ad...
Mobile video processing as defined in standards like MPEG-4 and H.263 contains a number of timeconsuming computations that cannot be efficiently executed on current hardware archi...
Sami Khawam, Sajid Baloch, Arjun Pai, Imran Ahmed,...
In this paper, an efficient Montgomery multiplier is introduced for the modular exponentiation operation, which is fundamental to numerous public-key cryptosystems. Four aspects a...
The IBM Cyclops-64 (C64) chip employs a multithreaded architecture that integrates a large number of hardware thread units on a single chip. A cellular supercomputer is being deve...
Restricted Boltzmann Machines (RBMs) — the building block for newly popular Deep Belief Networks (DBNs) — are a promising new tool for machine learning practitioners. However,...
Sang Kyun Kim, Lawrence C. McAfee, Peter L. McMaho...