In prior work, we have proposed techniques to extend the ease of shared-memory parallel programming to distributed-memory platforms by automatic translation of OpenMP programs to ...
Performance analysis tools are critical for the effective use of large parallel computing resources, but existing tools have failed to address three problems that limit their scal...
Conventional processors use a fully-associative store queue (SQ) to implement store-load forwarding. Associative search latency does not scale well to capacities and bandwidths re...
Real-time rendering of massively textured 3D scenes usually involves two major problems: Large numbers of texture switches are a well-known performance bottleneck and the set of s...
This paper describes an integrated system for coordinating multiple rover behavior with the overall goal of collecting planetary surface data. The MISUS system combines techniques...
Tara A. Estlin, Daniel M. Gaines, Forest Fisher, R...