We introduce the stochastic gradient descent algorithm used in the computational network toolkit (CNTK), a general-purpose machine learning toolkit written in C++ for training and using models that can be expressed as a computational network. We describe the algorithm used to compute the gradients automatically for a given network. We also propose a low-cost automatic learning rate selection algorithm and demonstrate that it works well in practice.

1 Computational Network Toolkit

A computational network (CN) is a directed graph in which each leaf represents an input value or a learnable parameter and each node represents an operator. Figure 1 illustrates an example CN of a log-linear model. Here, each node is identified by a {node name: o...
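To make the graph structure concrete, the following C++ sketch builds a log-linear model, Softmax(W x + b), as a directed graph whose leaves are the input X and the learnable parameters W and b, and whose interior nodes are operators. This is a minimal illustration only; the Node struct, the Leaf/Op helpers, and the operator names are hypothetical and do not reflect CNTK's actual classes or API.

#include <cstddef>
#include <iostream>
#include <memory>
#include <string>
#include <vector>

// A node is either a leaf (an input value or a learnable parameter) or an
// operator applied to the values produced by its child nodes.
struct Node {
    std::string name;                            // node name, e.g. "W", "P"
    std::string op;                              // "Input", "Parameter", or an operator type
    std::vector<std::shared_ptr<Node>> children; // directed edges to operands
};

using NodePtr = std::shared_ptr<Node>;

NodePtr Leaf(const std::string& name, const std::string& kind) {
    return std::make_shared<Node>(Node{name, kind, {}});
}

NodePtr Op(const std::string& name, const std::string& op, std::vector<NodePtr> children) {
    return std::make_shared<Node>(Node{name, op, std::move(children)});
}

// Print the graph in a simple parenthesized form, children in order.
void Dump(const NodePtr& n) {
    std::cout << n->name << ":" << n->op;
    if (!n->children.empty()) {
        std::cout << "(";
        for (std::size_t i = 0; i < n->children.size(); ++i) {
            if (i) std::cout << ", ";
            Dump(n->children[i]);
        }
        std::cout << ")";
    }
}

int main() {
    // Log-linear model as a computational network: Softmax(W * x + b).
    auto x = Leaf("X", "Input");        // input value
    auto W = Leaf("W", "Parameter");    // learnable weight matrix
    auto b = Leaf("b", "Parameter");    // learnable bias
    auto t = Op("T", "Times", {W, x});  // W * x
    auto p = Op("P", "Plus", {t, b});   // W * x + b
    auto o = Op("O", "Softmax", {p});   // output distribution

    Dump(o);
    std::cout << "\n";
    return 0;
}

Running the sketch prints O:Softmax(P:Plus(T:Times(W:Parameter, X:Input), b:Parameter)), i.e. the operator tree rooted at the output node, which mirrors how the leaves feed the operator nodes in Figure 1.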