[Abstract] Improving the performance of the convolution operation has become a key target for High Performance Computing (HPC) developers due to its prevalence in deep learning applied mainly to video processing. Progress is driven by both algorithmic and implementation innovations. Algorithmically, the convolution can be computed directly from its mathematical definition, but other methods transform it into a Fast Fourier Transform (FFT) or a GEneral Matrix Multiplication (GEMM). In this latter group, the Winograd algorithm is a state-of-the-art variant that is especially suitable for smaller convolutions. In this paper, we present openCNN, an optimized CUDA C++ implementation of the Winograd convolution algorithm. Our approach achi...
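To make the Winograd idea concrete, the following minimal sketch computes the textbook one-dimensional case F(2,3): two outputs of a 3-tap filter from four inputs, using four multiplications instead of the six a direct evaluation needs. It is an illustration only and is not taken from openCNN; the function names `direct_f23` and `winograd_f23` are hypothetical, and the code uses plain Python rather than CUDA C++.

```python
def direct_f23(d, g):
    """Direct evaluation: 2 outputs of a 3-tap filter, 6 multiplications."""
    y0 = d[0] * g[0] + d[1] * g[1] + d[2] * g[2]
    y1 = d[1] * g[0] + d[2] * g[1] + d[3] * g[2]
    return (y0, y1)


def winograd_f23(d, g):
    """Winograd minimal filtering F(2,3): same outputs, 4 multiplications.

    d: four consecutive input samples, g: three filter taps.
    """
    # Filter transform (can be precomputed once per filter).
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2.0
    u2 = (g[0] - g[1] + g[2]) / 2.0
    u3 = g[2]

    # Input transform (additions and subtractions only).
    v0 = d[0] - d[2]
    v1 = d[1] + d[2]
    v2 = d[2] - d[1]
    v3 = d[1] - d[3]

    # Element-wise products: the only four multiplications.
    m1, m2, m3, m4 = v0 * u0, v1 * u1, v2 * u2, v3 * u3

    # Output transform.
    return (m1 + m2 + m3, m2 - m3 - m4)


if __name__ == "__main__":
    d = [1.0, 2.0, 3.0, 4.0]
    g = [1.0, 0.0, -1.0]
    print(direct_f23(d, g), winograd_f23(d, g))
```

The saving comes from moving work into the input, filter, and output transforms, which use only additions and constant scalings. Two-dimensional variants nest the same transforms along rows and columns.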
The main contribution of this paper is to show efficient implementations of the convolution-pooling ...
This paper focuses on the use of GPGPU (General-Purpose computing on Graphics Processing Units) for ...
In this paper, we demonstrate the implementation of our ultra-efficient deep convolutional neural ne...
Convolutional neural networks (CNNs) have recently attracted considerable attention due to their out...
Convolution is the most computationally intensive task of the Convolutional Neural Network (CNN). It...
The convolutional neural network (CNN) is an important deep learning method. The convolution operation t...
Convolution computation is a common operation in deep neural networks (DNNs) and is often responsibl...
Current digital processing algorithms require more computing power to achieve more accurate results ...
The Convolutional Neural Network (CNN) has been widely used for the tasks of object recognition and faci...
The convolutional neural network (CNN) is the most widely used machine learning technique within the...
Convolutional neural networks (CNNs) are widely used in the field of recognition ...
Winograd's minimal filtering algorithm has been widely used in Convolutional Neural Networks (CNNs) ...
This thesis puts to the test the power of parallel computing on the GPU against the massive computat...
Most experts admit that the true behavior of a neural network is hard to predict. It is qui...
Convolutional neural networks (CNNs) have been extensively used in many areas, such as face and sp...