This paper explores the role of branch predictor organization in power/energy/performance tradeoffs for processor design. We find that as a general rule, to reduce overall energy ...
Dharmesh Parikh, Kevin Skadron, Yan Zhang, Marco B...
This paper advocates that cache coherence protocols use a bandwidth adaptive approach to adjust to varied system configurations (e.g., number of processors) and workload behaviors...
Milo M. K. Martin, Daniel J. Sorin, Mark D. Hill, ...
Several studies of speculative execution based on values have reported promising performance potential. However, virtually all microarchitectures in these studies were described i...
Thread-Level Speculation (TLS) allows us to automatically parallelize general-purpose programs by supporting parallel execution of threads that might not actually be independent. ...
J. Gregory Steffan, Christopher B. Colohan, Antoni...
We propose a low overhead, on-line memory monitoring scheme utilizing a set of novel hardware counters. The counters act like pressure gauges indicating the marginal gain in the n...