In multicomputer architectures where communication latency is distance-independent, thread placement is expected to have only a limited impact on application performance. This paper demonstrates that impact on a wormhole-routed multicomputer, the Intel Paragon. A communication-intensive synthetic workload is used to "stress test" the effects of placement-induced contention on communication latency. Experimentation and modeling show that thread placement patterns that minimize contention in the system's interconnection network improve performance. The analytic model and the experimental observations are in good agreement. Keywords:...
It is well known that the placement of threads and memory plays a crucial role for performance on NU...
With the introduction of multi-core processors, thread affinity has quickly ap...
The pursuit of high connectivity in communication network design for multicomputers is often complic...
Significant theoretical research was done on interconnect topologies and topology aware ...
In the early years of parallel computing research, significant theoretical studies were done on inte...
2D-mesh and torus networks have often been proposed as the interconnection pattern for parallel comp...
As the field of High Performance Computing (HPC) approaches the Exascale era we see larger systems c...
The performance evaluation of multiprocessor interconnects cannot be divorced from issues of traffic...
One of the most important contemporary issues in concurrent computing is network performance, for wi...
There is a clear trend in current processor design towards the combination of several threa...
Multithreading is a processor technique that can effectively hide long latencies that can occur due ...
Multithreaded architectures use the parallelism in programs to tolerate long latencies for communica...
Multithreaded architectures context switch to another instruction stream to hide the latency of memo...
This paper studies the influence that task placement may have on the performance of applica...