Advances in parallel computing architectures (e.g., Graphics Processing Units (GPUs)) have had great success in helping meet the performance and energy-efficiency demands of many high-performance computing (HPC) applications. DRAM bandwidth is generally a critical performance bottleneck for many such applications. With advances in memory technology, the bottleneck is shifting from DRAM bandwidth towards other parts of the system hierarchy (e.g., interconnects). We identify neural network backpropagation as one application where the interconnect network is one of the biggest performance bottlenecks. We show that the interconnect bottleneck for backpropagation can be significantly alleviated if computing cores and caching units are carefull...
There has been a recent emergence of applications from the domain of machine learning, data mining, ...
As a throughput-oriented device, the Graphics Processing Unit (GPU) has already been integrated with cache, wh...
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy cons...
The computation power from graphics processing units (GPUs) has become prevalent in many fields of c...
Machine learning is a key application driver of new computing hardware. Designing high-performance m...
We explore the backtracking paradigm with properties seen as sub-optimal for GPU architect...
There is a well-known spectrum of computing hardware ranging from central processing units (CPUs) to...
This thesis presents the results of an architectural study on the design of FPGA-based architecture...
Although volunteer computing with a huge number of high-performance game consoles connected ...
Graphics processing units (GPUs) achieve high throughput with hundreds of cores for concurrent exec...
With the massive multithreading execution feature, graphics processing units (GPUs) have b...
Heterogeneous processors with accelerators provide an opportunity to improve performance within a...
This report evaluates two distinct methods of improving the performance of GPU memory systems. Over ...
Simulation is a third pillar next to experiment and theory in the study of complex dynamic systems s...
Graphics processing units (GPUs) contain a significant number of cores relative to central processin...