The increasing number of cores, shared caches, and memory nodes within machines introduces a complex hardware topology. High-performance computing applications now have to carefull...
A framework for task assignment in heterogeneous computing systems is presented in this work. The framework is based on a learning automata model. The proposed model can be used f...
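To make the learning automata idea concrete, here is a minimal sketch of one automaton choosing a machine for a single task. The abstract does not say which update scheme the framework uses, so the linear reward-inaction rule below is an assumption, and the names `LearningAutomaton` and `assign_task`, the run times, and the reward criterion are all hypothetical.

```python
import random


class LearningAutomaton:
    """Hypothetical automaton for one task; its actions are candidate machines."""

    def __init__(self, n_machines, learning_rate=0.1):
        # Start with a uniform probability of choosing each machine.
        self.probs = [1.0 / n_machines] * n_machines
        self.learning_rate = learning_rate

    def choose(self, rng):
        # Sample a machine index according to the current probabilities.
        return rng.choices(range(len(self.probs)), weights=self.probs)[0]

    def reward(self, machine):
        # Linear reward-inaction (assumed scheme): shift probability mass
        # toward the rewarded machine; a penalty leaves the vector unchanged.
        a = self.learning_rate
        for j in range(len(self.probs)):
            if j == machine:
                self.probs[j] += a * (1.0 - self.probs[j])
            else:
                self.probs[j] *= 1.0 - a


def assign_task(run_times, steps=2000, learning_rate=0.05, seed=0):
    """Learn a machine preference for one task from made-up run times."""
    rng = random.Random(seed)
    automaton = LearningAutomaton(len(run_times), learning_rate)
    fastest = min(run_times)
    for _ in range(steps):
        machine = automaton.choose(rng)
        # Toy environment (assumption): faster machines are rewarded more often.
        if rng.random() < fastest / run_times[machine]:
            automaton.reward(machine)
    return automaton.probs


if __name__ == "__main__":
    # Three hypothetical machines; the second has the shortest run time.
    print(assign_task([4.0, 1.0, 2.0]))
```

A full assignment scheme would presumably keep one such automaton per task and derive the reward from the overall cost of the resulting schedule. That part is not sketched here because the abstract is truncated before it describes the cost criteria.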
We describe methods of identifying and exploiting sharing patterns in multi-threaded DSM applications. Active correlation tracking is used to determine the affinity, or amount of ...
Trace-level reuse is based on the observation that some traces (dynamic sequences of instructions) are frequently repeated during the execution of a program, and in many cases, th...
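The observation about repeated traces can be illustrated with a small software analogue of a reuse table. This is only a sketch under stated assumptions: it treats a trace as reusable when it starts at the same program counter with the same input values, which the truncated abstract does not confirm, and the names `TraceReuseTable`, `run`, `start_pc`, and `live_in` are hypothetical.

```python
class TraceReuseTable:
    """Hypothetical memo table mapping (start PC, inputs) to a trace's result."""

    def __init__(self):
        self.table = {}
        self.hits = 0
        self.misses = 0

    def run(self, start_pc, live_in, execute):
        # Key on the trace's starting point and its input register values
        # (assumed reuse condition); sorting makes the key order-independent.
        key = (start_pc, tuple(sorted(live_in.items())))
        if key in self.table:
            # Reuse: skip execution and return the recorded result.
            self.hits += 1
            return self.table[key]
        # First occurrence: execute the trace and record what it produced.
        self.misses += 1
        result = execute(live_in)
        self.table[key] = result
        return result


def add_trace(live_in):
    # Stand-in for a dynamic instruction sequence: returns the output
    # register values and the program counter that follows the trace.
    return {"r3": live_in["r1"] + live_in["r2"]}, 0x40


if __name__ == "__main__":
    table = TraceReuseTable()
    table.run(0x10, {"r1": 2, "r2": 3}, add_trace)  # executed
    table.run(0x10, {"r1": 2, "r2": 3}, add_trace)  # reused
    print(table.hits, table.misses)
```

The table here is unbounded for simplicity. A hardware mechanism would need a bounded structure with some replacement policy, which this sketch leaves out.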
Automatic parallelization of general-purpose programs remains infeasible in general when irregular data structures and complex control flow are present. One promising strate...