Tensor Cores (TCUs) are specialized units first introduced by NVIDIA in the Volta microarchitecture to accelerate matrix multiplications for deep learning and linear algebra workloads. While these units have proved capable of providing significant speedups for specific applications, programming them remains difficult for the average user. In this paper, we extend the Halide DSL and compiler with the ability to utilize these units when generating code for a CUDA-based NVIDIA GPGPU. To this end, we introduce a new scheduling directive along with custom lowering passes that automatically transform a Halide AST so that code can be generated for the TCUs. We evaluate the generated code and show that it can achieve over...
The future of computation is the GPU, i.e., the Graphics Processing Unit. The graphics cards have sh...
Thesis (Master's)--University of Washington, 2019. Previous work has developed a tool, the Tensor Temp...
Tensor algorithms are a rapidly growing field of research with applications in many scientific domai...
The Halide DSL and compiler have enabled high-performance code generation for image processing pipel...
There has been a surge in the demand for a Domain Specific Architecture due to wide ranging deep lea...
We introduce Harmonic CUDA, a dataflow programming model for GPUs that allows programmers to describ...
We present a new algorithm to automatically generate high-performance GPU implementations of complex...
Deep learning algorithms are gaining popularity in autonomous systems. These systems typically have ...
In this work, we evaluate OpenCL as a programming tool for developing performance-portable applicati...
2012-05-02. Graphics Processing Units (GPUs) have evolved to devices with teraflop-level performance p...
To respond to the intense computational load of deep neural networks, a plethora of domain-specific ...
General Matrix Multiplication or GEMM kernels take centre place in high performance computing and ma...
Graphics Processing Units (GPU) have been widely adopted to accelerate the execution of HPC workload...
Tensor Cores have been an important unit to accelerate Fused Matrix Multiplication Accumulation (MMA...