Many large-scale parallel programs follow a bulk synchronous parallel (BSP) structure with distinct computation and communication phases. Although the communication phase in such ...
Torsten Hoefler, Christian Siebert, Andrew Lumsdai...
This paper presents an analytical model to predict the performance of general-purpose applications on a GPU architecture. The model is designed to provide performance information ...
Sara S. Baghsorkhi, Matthieu Delahaye, Sanjay J. P...
One-sided communication is important to enable asynchronous communication and data movement for Global Address Space (GAS) programming models. Such communication is typically re...
Xinyu Que, Weikuan Yu, Vinod Tipparaju, Jeffrey S....
s of titles and abstracts) as irrelevant to our focus. We read the remaining 519 papers in full to establish our final list. The 92 papers we chose were originally published in the...
Tracy Hall, Helen Sharp, Sarah Beecham, Nathan Bad...
Because of cost and resource constraints, sensor nodes do not have a complicated hardware architecture or operating system to protect program safety. Hence, the notorious buffer-o...