This paper studies realization and performance comparison of the sequential and weak consistency models in the network-on-chip (NoC) based distributed shared memory (DSM) multi-cor...
Abdul Naeem, Xiaowen Chen, Zhonghai Lu, Axel Jants...
Abstract. Designing and tuning parallel applications with MPI, particularly at large scale, requires understanding the performance implications of different choices of algorithms ...
Torsten Hoefler, William Gropp, Rajeev Thakur, Jes...
Abstract. This paper presents TimeNET, a software tool for the modelling and performability evaluation using stochastic Petri nets. The tool has been designed especially for models...
This paper focuses on the Cyclops64 computer architecture and presents an analytical model and performance simulation results for the preloading and loop unrolling approaches to op...
Yanwei Niu, Ziang Hu, Kenneth E. Barner, Guang R. ...
Abstract—Performance is a key feature of large-scale computing systems. However, the achieved performance when a certain program is executed is significantly lower than the maxi...