The widespread use of multicore processors has dramatically increased the demand for high bandwidth and large capacity in memory systems. In a conventional DDR2/DDR3 DRAM memory...
Leveraging the power of scratchpad memories (SPMs) available in most embedded systems today is crucial to extract maximum performance from application programs. While regular a...
Taylan Yemliha, Shekhar Srikantaiah, Mahmut T. Kan...
As current trends in software development move toward more complex object-oriented programming, inlining has become a vital optimization that provides substantial performance impr...
The Kernighan-Lin algorithm and its variant, the Fiduccia-Mattheyses (FM) algorithm, remain in use at the core of many partitioning systems. To understand the FM...
Wray L. Buntine, Lixin Su, A. Richard Newton, Andr...
This paper presents an automated performance tuning solution, which partitions a program into a number of tuning sections and finds the best combination of compiler options for e...