Data prefetching has been widely used in the past as a technique for hiding memory access latencies. However, data prefetching in multi-threaded applications running on chip multi...
Dhruva Chakrabarti, Mahmut T. Kandemir, Mustafa Ka...
The efficient mapping of program parallelism to multi-core processors is highly dependent on the underlying architecture. This paper proposes a portable and automatic compiler-bas...
Because data races represent a hard-to-manage class of errors in concurrent programs, numerous approaches to detect them have been proposed and evaluated. We specifically consider...
Paruj Ratanaworabhan, Martin Burtscher, Darko Kiro...
This paper introduces a new way to provide strong atomicity in an implementation of transactional memory. Strong atomicity lets us offer clear semantics to programs, even if they ...
Researchers in transactional memory (TM) have proposed open nesting as a methodology for increasing the concurrency of transactional programs. The idea is to ignore "low-leve...