This paper explains why parallel implementation of matrix multiplication--a seemingly simple algorithm that can be expressed as one statement and three nested loops--is complex: P...
John A. Gunnels, Calvin Lin, Greg Morrow, Robert A...
Exploiting instruction-level parallelism (ILP) is extremely important for achieving high performance in application specific instruction set processors (ASIPs) and embedded process...
Ramaswamy Govindarajan, Erik R. Altman, Guang R. G...
In this paper, we present a parallel algorithm for the minimization of deterministic finite state automata (DFAs) and discuss its implementation on a Connection Machine CM-5...
Evaluating the design of a distributed application is difficult but provides useful information for program development and maintenance. In distributed debugging, for example, proce...
A case study of performance and dependability evaluation of fault-tolerant multiprocessors is presented. Two specific architectures are analyzed taking into account system functio...