There are many successful applications to take advantages of massive parallelization on GPU for deep learning algorithm. In this project, I implemented a basic deep learning algorithm, i.e. Autoencoder. Core parts of this project are based on CUBLAS and CUDA kernels. I will first briefly introduce sparse autoencoder to make this report coherent, and to inspire th
Graphics processing units (GPUs) contain a significant number of cores relative to central processin...
The Graphics Processing Unit (GPU) parallel architecture is now being used not just for graphics but...
Deep convolutional neural networks achieve state-of-the-art performance in image classification. The...
Deep learning is an emerging workload in the field of HPC. This powerful method of resolution is abl...
Source-code for sparse matrix-dense matrix multiplication and sampled dense-dense matrix multiplicat...
The rise of deep-learning (DL) has been fuelled by the improvements in accelerators. Due to its uniq...
This work is focused on the pruning of some convolutional neural networks (CNNs) and improving their...
Open-source deep learning tools has been distributed numerously and has gain popularity in the past ...
We present a library that provides optimized implementations for deep learning primitives. Deep lear...
The aim of this project is to conduct a study of deep learning on multi-core processors. The study i...
The invention of deep belief network (DBN) provides a powerful tool for data modeling. The key advan...
This thesis presents a few methods to accelerate the inference of Deep Neural Networks that are lar...
Abstract In the next decade, the demands for computing in large scientific experimen...
Deep learning techniques have been gaining prominence in the research world in the past years, howev...
Over the last years, deep learning architectures have gained attention by winning important interna...
Graphics processing units (GPUs) contain a significant number of cores relative to central processin...
The Graphics Processing Unit (GPU) parallel architecture is now being used not just for graphics but...
Deep convolutional neural networks achieve state-of-the-art performance in image classification. The...
Deep learning is an emerging workload in the field of HPC. This powerful method of resolution is abl...
Source-code for sparse matrix-dense matrix multiplication and sampled dense-dense matrix multiplicat...
The rise of deep-learning (DL) has been fuelled by the improvements in accelerators. Due to its uniq...
This work is focused on the pruning of some convolutional neural networks (CNNs) and improving their...
Open-source deep learning tools has been distributed numerously and has gain popularity in the past ...
We present a library that provides optimized implementations for deep learning primitives. Deep lear...
The aim of this project is to conduct a study of deep learning on multi-core processors. The study i...
The invention of deep belief network (DBN) provides a powerful tool for data modeling. The key advan...
This thesis presents a few methods to accelerate the inference of Deep Neural Networks that are lar...
Abstract In the next decade, the demands for computing in large scientific experimen...
Deep learning techniques have been gaining prominence in the research world in the past years, howev...
Over the last years, deep learning architectures have gained attention by winning important interna...
Graphics processing units (GPUs) contain a significant number of cores relative to central processin...
The Graphics Processing Unit (GPU) parallel architecture is now being used not just for graphics but...
Deep convolutional neural networks achieve state-of-the-art performance in image classification. The...