A parallel Back-Propagation (BP) neural network training technique using the Compute Unified Device Architecture (CUDA) on multiple Graphics Processing Units (GPUs) is proposed. To exploit the maximum performance of the GPUs, we propose to implement batch-mode BP training by arranging the input neurons, hidden neurons, and output neurons in matrix form. The implementation uses CUDA Basic Linear Algebra Subroutines (cuBLAS) functions to perform the matrix and vector operations, together with custom CUDA kernels. The proposed technique utilizes multiple GPUs to achieve further acceleration. Each GPU holds the same neural network structure and weight parameters, and the training samples are distributed among the GPUs. Each GPU calculates the local training error and the gradi...
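To make the batch-mode formulation concrete, the following is a minimal sketch (not the authors' code) of a single-layer forward pass over a whole batch: the hidden activations are computed as one cublasSgemm matrix product, followed by a small CUDA kernel for the activation function. The layer sizes, variable names, and the sigmoid nonlinearity are illustrative assumptions, and error checking is omitted.

#include <cuda_runtime.h>
#include <cublas_v2.h>

__global__ void sigmoid_kernel(float *z, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) z[i] = 1.0f / (1.0f + expf(-z[i]));   // elementwise activation
}

// Forward pass for one layer over a whole batch:
//   H (n_hidden x batch) = sigmoid( W (n_hidden x n_input) * X (n_input x batch) )
// All matrices are stored column-major on the device, as cuBLAS expects.
void forward_layer(cublasHandle_t handle,
                   const float *d_W, const float *d_X, float *d_H,
                   int n_hidden, int n_input, int batch)
{
    const float alpha = 1.0f, beta = 0.0f;
    cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                n_hidden, batch, n_input,
                &alpha,
                d_W, n_hidden,    // W: n_hidden x n_input
                d_X, n_input,     // X: n_input  x batch
                &beta,
                d_H, n_hidden);   // H: n_hidden x batch

    int n = n_hidden * batch;
    sigmoid_kernel<<<(n + 255) / 256, 256>>>(d_H, n);  // assumed sigmoid activation
}

In the multi-GPU setting described in the abstract, each device would hold its own copy of the weights, run this kind of forward and backward pass on its slice of the training samples (selected with cudaSetDevice), and the per-device errors and gradients would then be reduced before the shared weight update.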
Neural networks become more difficult and take longer to train as their depth increases. As deep neur...
Convolutional neural networks (CNNs) have proven to be powerful classification tools in tasks th...
This paper presents some experimental results on the realization of a parallel simulation of an Arti...
This work presents the implementation of Feedforward Multi-Layer Perceptron (FFMLP) Neural...
The Graphics Processing Unit (GPU) parallel architecture is now being used not just for graphics but...
This paper presents two parallel implementations of the Back-propagation algor...
The block-based neural network (BbNN) was introduced to improve the training speed of artificial neural ...
Convolutional neural networks [3] have proven useful in many domains, including computer vision [1,...
This project presents a backpropagation neural network on an FPGA which can conduct inference and tra...
Training of Artificial Neural Networks for large data sets is a time-consuming task. Various...
We took the back-propagation algorithms of Werbos for recurrent and feed-forwar...
Graphics Processing Units (GPUs) have been used for accelerating graphics calculations as well as...
The paper deals with the application of CUDA technology to the software implementation of direct and rev...
I present a new way to parallelize the training of convolutional neural networks across multiple GPU...