Modern multicore systems have a large number of components operating in different clock domains and communicating through asynchronous interfaces. These interfaces use synchronizer circuits, which guard against metastability failures but add latency when processing the asynchronous input. We propose a speculative method that hides synchronization latency by overlapping it with computation cycles. We verify the correctness of our approach through a field-programmable gate array (FPGA) implementation and apply it to a number of synthesized benchmarks. Synthesis results reveal that our approach achieves average savings of 135% and 204% in area costs and nearly 100% in power costs compared to two similar speculative techniques.
A synchronization solution is developed in order to allow finer grained segmentation of clock domain...
a robust communication scheme between modules, it is possible to reduce the design effort of the glo...
For scalable-shared memory multiprocessor System-on-a-Chip implementations, synchronization overhead ...
Synchronizers were required when reading an asynchronous input. In a multi-clock system, s...
Multi-core processors are ubiquitous. Even embedded systems nowadays use processors with multiple co...
In multicores, performance-critical synchronization is increasingly performed in a lock-free manner ...
Metastability causes unpredictable behavior in circuits, and can cause circuit failure. Any binary v...
Asynchronous circuits have a number of potential performance advantages over their synchron...
Computing systems are now frequently composed of independently clocked subsystems that cooperate to ...
In this paper, we revisit the design of synchronization primitives, specifically barriers, mutexes,...
A new method for low-latency asynchronous circuit design uses a two-level architecture. It consists ...
We analyze an Alpha 21264-like Globally-Asynchronous, Locally-Synchronous (GALS) processor organized...
This thesis presents novel communication schemes between independent clock domains. The clock domai...
This paper proposes and evaluates new synchronization schemes for a simultaneous multithrea...
For many years, CMOS process scaling has allowed a steady increase in the operating frequency and in...