Many techniques for increasing the amount of instruction-level parallelism (ILP) put increased pressure on the registers inside a CPU. These techniques allow for more operations t...
Jason Hiser, Steve Carr, Philip H. Sweany, Steven ...
As communication and I/O traffic increase on the interconnection network of high-performance systems, network contention becomes a critical problem, drastically reducing performan...
In a machine that follows the dynamically trace scheduled VLIW (DTSVLIW) architecture, VLIW instructions are built dynamically through an algorithm that can be implemented in hard...
All-to-all personalized exchange is one of the densest collective communication patterns and occurs in many important parallel computing/networking applications. In this paper,...
There have been many debates about the feasibility of providing guaranteed Quality of Service (QoS) when network traffic travels beyond the enterprise domain and into the vast unk...