Modern scientific computing involves organizing, moving, visualizing, and analyzing massive amounts of data at multiple sites around the world. The technologies, the middleware se...
Brian Tierney, Jason Lee, Brian Crowley, Mason Hol...
In this paper, we describe the design and implementation of an integrated architecture for cache systems that scale to hundreds or thousands of caches with thousands to millions o...
Renu Tewari, Michael Dahlin, Harrick M. Vin, Jonat...
On-line transaction processing exhibits poor memory behavior in high-end multiprocessor servers because of complex sharing patterns and substantial interaction between the databas...
Source code transformations are a very effective method of parallelizing and improving the efficiency of programs. Unfortunately, most compiler systems require implementin...
Instance-based locality optimization [6] is a semi-automatic program restructuring method that reduces the number of cache misses. The method imitates the human approach of consideri...