Barrier synchronization in shared-memory parallel machines has been widely implemented through busy-waiting on shared variables. However, typical implementations of barrier synchronization tend to produce hot-spots in terms of memory and network contention, thus creating performance bottlenecks that become markedly more pronounced as the number of cores or processors increases. To overcome such limitations, we present a novel hardware-based barrier mechanism in the context of many-core CMPs. Our proposal is based on global interconnection lines (G-lines) and the S-CSMA technique, which have been recently used to enhance a flow control mechanism (EVC) in the context of networks-on-chip. Based on this technology, we have designed a simple...
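For orientation, the software baseline that the abstract above criticizes can be sketched as a centralized sense-reversing barrier, in which every waiting thread polls the same shared flag. This is only an illustrative sketch of busy-waiting on shared variables, not the G-line/S-CSMA hardware mechanism the paper proposes. The names `SenseReversingBarrier` and `run_phases` are hypothetical, and a lock stands in for the atomic decrement that a real implementation would use.

```python
import threading


class SenseReversingBarrier:
    """Centralized barrier: one shared counter plus one shared 'sense' flag."""

    def __init__(self, n_threads):
        self.n_threads = n_threads
        self.count = n_threads          # threads still expected in this episode
        self.sense = False              # shared flag that all waiters poll
        self.lock = threading.Lock()    # assumption: stands in for an atomic decrement
        self.local = threading.local()  # holds each thread's private sense

    def wait(self):
        # Each thread flips its private sense once per barrier episode.
        my_sense = not getattr(self.local, "sense", False)
        self.local.sense = my_sense
        with self.lock:
            self.count -= 1
            last = self.count == 0
            if last:
                self.count = self.n_threads  # reset for the next episode
                self.sense = my_sense        # release every waiter
        if not last:
            # Busy-wait: all waiters read the same shared variable, which is
            # the source of the hot-spot contention described above.
            while self.sense != my_sense:
                pass


def run_phases(n_threads=4, n_phases=5):
    """Run n_phases barrier episodes and record any ordering violations."""
    barrier = SenseReversingBarrier(n_threads)
    arrived = [0] * n_phases
    arrived_lock = threading.Lock()
    violations = []

    def worker():
        for phase in range(n_phases):
            with arrived_lock:
                arrived[phase] += 1
            barrier.wait()
            # Past the barrier, every thread must already have arrived.
            if arrived[phase] != n_threads:
                violations.append(phase)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return arrived, violations
```

Calling `run_phases(4, 5)` runs four threads through five barrier episodes. The single `sense` flag is what every waiter spins on, so polling traffic on that one location grows with the number of participants.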
©1998 IEEE. In this paper, we consider a tree-based routing scheme for supporting barrier...
With the rise of multi-core processors with a large number of cores, the need ...
The Bulk Synchronous Parallel (BSP) model of computation can be used to develop efficient and portab...
We present in this work a novel hardware-based barrier mechanism for synchronization on many-core C...
Whereas efficient barrier implementations were once a concern only in high-performance compu...
This paper presents a novel mechanism for barrier synchronization on chip multi-processors (CMPs). B...
Barrier synchronization is a key programming primitive for shared memory embedded MPSoCs. A...
Interconnects based on Networks-on-Chip are an appealing solution to address future microprocessor d...
Barrier synchronization is a key programming primitive for shared memory embedded MPSoCs. As the cor...
Barrier synchronization is a commonly used primitive in parallel processing, but has traditionally b...
This paper investigates optimized synchronization techniques for shared memory on-chip multiprocesso...
Network-on-Chip (NoC) based many-cores are becoming popular due to their high scalabilit...
To simplify program development for the Single-chip Cloud Computer (SCC), it is desirable to have high...
Barrier synchronisation has been a widely studied topic since the supercomputer era due to its significant...
The MPI_Barrier() call can be crucial for several applications and has been the target of different opti...