This paper presents parameterized first-order models for throughput, energy, and bandwidth. Models are developed for many common pipeline methodologies, including clocked flopped, clocked time-borrowing latch, asynchronous two-cycle and four-cycle, delay-insensitive, and source-synchronous protocols. The paper focuses on communication costs, which have the potential to throttle design performance as scaling continues. The models can also be applied to logic. The equations share common parameters to allow apples-to-apples comparisons across different design targets and pipeline methodologies. By applying the parameters to various design targets, one can determine when unclocked communication is superior at the physical level to clocked ...
In this paper, we adapt Gustafson-Barsis' law to evaluate the effect of communication on the pe...
Digital circuits operating in the sub-threshold regime are able to perform minimum energy operation ...
As technology scales, signals may reach proportionally less and less chip area within a single clock...
Journal Article: Parameterized first-order models for throughput, energy, and bandwidth are presented...
Journal Article: Communication costs, which have the potential to throttle design performance as scal...
As the complexity of parallel computers grows, constraints posed by the construction of larger syste...
Synchronous very large-scale integration (VLSI) design is approaching a critical point, with clock d...
The problem of implementing reliable message delivery using timing information is considered. Two im...
This paper presents a survey on high-throughput and ultra low-power asynchronous pipeline design met...
It is important to give a quick estimation of the performances of asynchronous pipelines dur...
This paper introduces a new methodology for evaluating performance of asynchronous linear-pipelines....
In this paper, a set of simple, general, yet practical performance models for RISC architectures are...
Thesis: This thesis presents the trade-offs between concurrency reduction, energy and performance acr...
Among the claims made concerning the advantages of asynchronous logic are that circuits can take adv...