Convolution is a common operation in deep neural networks (DNNs) and is often responsible for performance bottlenecks during training and inference. Existing approaches to accelerating convolution aim to reduce computational complexity. However, these strategies often enlarge the memory footprint with extra memory accesses, leaving substantial room for performance improvement. This paper presents a novel approach to optimizing memory access for convolution operations, specifically targeting GPU execution. Our approach leverages two optimization techniques to reduce the number of memory operations for convolutions performed along the width and height dimensions. For convolution computations on the width dimen...
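The abstract above is truncated before it describes the two techniques, so the following is only a generic illustration of the memory-reuse idea behind such optimizations, not the paper's actual GPU kernels. The function names (`conv1d_naive`, `conv1d_reuse`) are hypothetical. A naive direct convolution re-reads each input element once per overlapping output window, while a reuse-oriented formulation reads each input element exactly once and scatters it into every output it contributes to — the same principle that register- or shared-memory tiling exploits on a GPU:

```python
import numpy as np

def conv1d_naive(x, w):
    """Direct 1-D correlation: each output re-reads its K-wide input window.

    Returns (output, number_of_input_reads). Reads grow as N * K.
    """
    K = len(w)
    N = len(x) - K + 1
    y = np.empty(N)
    reads = 0
    for i in range(N):
        acc = 0.0
        for k in range(K):
            acc += x[i + k] * w[k]  # one read of x per multiply-accumulate
            reads += 1
        y[i] = acc
    return y, reads

def conv1d_reuse(x, w):
    """Reuse-oriented formulation: each input element is read exactly once
    and contributes to all K outputs whose windows overlap it.

    Returns (output, number_of_input_reads). Reads grow only as len(x).
    """
    K = len(w)
    N = len(x) - K + 1
    y = np.zeros(N)
    reads = 0
    for j, xv in enumerate(x):
        reads += 1  # x[j] is read exactly once
        # scatter x[j] into every output position i with i <= j <= i + K - 1
        for k in range(K):
            i = j - k
            if 0 <= i < N:
                y[i] += xv * w[k]
    return y, reads
```

For a length-10 input and a 3-tap filter, the naive version performs 24 input reads while the reuse version performs 10, with identical results; on a GPU the analogous saving comes from keeping the overlapping window in registers or shared memory instead of re-fetching it from global memory.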
This work is focused on the pruning of some convolutional neural networks (CNNs) and improving their...
Transpose convolution has shown prominence in many deep learning applications. However, transpose co...
The focus of this paper is speeding up the application of convolutional neural networks. While deliv...
The depthwise separable convolution is widely used to reduce the computation overhead of multi-chann...
Convolutional neural network (CNN) is an important deep learning method. The convolution operation t...
Graphics processing units (GPUs) achieve high throughput with hundreds of cores for concurrent exec...
The main contribution of this paper is to show efficient implementations of the convolution-pooling ...
Convolution is the most computationally intensive task of the Convolutional Neural Network (CNN). It...
Convolutional neural networks (CNNs) have recently attracted considerable attention due to their out...
We present an implementation of the overlap-and-save method, a method for the convolution of very lo...
Recently, convolutional neural networks (CNN) have been widely used in image processing and computer...
Artificial intelligence has developed rapidly in recent ye...
The research domain of Multimedia Content Analysis (MMCA) considers all aspects of the automated ext...
Recently, machine learning, especially deep learning, has been a core algorithm to be widely used in...
With the increasing sophistication of image processing algorithms, and because of its low computatio...