This paper presents a novel technique to perform global optimization of communication and preprocessing calls in the presence of array accesses with arbitrary subscripts. Our sche...
Programmers of message-passing codes for clusters of workstations face a daunting challenge in understanding the performance bottlenecks of their applications. This is largely due...
Distributed applications running on clusters may be composed of several components with very different performance requirements. The FlowVR middleware allows the develop...
Dependencies between loop iterations cannot always be determined at compile time because they may depend on input data that is known only at run time. A prime examp...
V. Prasad Krothapalli, Thulasiraman Jeyaraman, Mar...
The traditional technique for simulating physical systems modeled by partial differential equations is a time-stepped methodology where the state of the system ...
Homa Karimabadi, Jonathan Driscoll, Jagrut Dave, Y...