We study on-line generalized linear regression with multidimensional outputs, i.e., neural networks with multiple output nodes but no hidden nodes. At the final layer we allow transfer functions, such as the softmax function, that need to consider the linear activations of all the output neurons. The weight vectors used to produce the linear activations are represented indirectly by maintaining separate parameter vectors. We obtain the weight vector by applying a particular parameterization function to the parameter vector. Updating the parameter vectors upon seeing new examples is done additively, as in the usual gradient descent update. However, by using a nonlinear parameterization function between the parameter vectors and the weight vectors...
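To make the mechanism described above concrete, here is a minimal illustrative sketch (not the paper's exact algorithm or notation): a separately maintained parameter vector theta is updated additively by gradient descent, while predictions use the weight vector w = f(theta) produced by a parameterization function f. All names (online_reparameterized_gd, param_fn, learning_rate, the toy data) are assumptions introduced for illustration; the example uses a single scalar output and the squared loss for simplicity.

```python
import numpy as np

def online_reparameterized_gd(examples, param_fn, param_fn_grad,
                              dim, learning_rate=0.1):
    """Additive updates on theta; predictions use w = param_fn(theta).

    examples      : iterable of (x, y) pairs, x of shape (dim,), y a scalar
    param_fn      : maps the parameter vector theta to the weight vector w
    param_fn_grad : elementwise derivative of param_fn, so that
                    d(loss)/d(theta) = d(loss)/d(w) * param_fn_grad(theta)
    """
    theta = np.zeros(dim)
    total_loss = 0.0
    for x, y in examples:
        w = param_fn(theta)                    # weight vector from parameters
        y_hat = w @ x                          # linear activation (identity transfer)
        total_loss += 0.5 * (y_hat - y) ** 2
        grad_w = (y_hat - y) * x               # gradient of squared loss w.r.t. w
        grad_theta = grad_w * param_fn_grad(theta)   # chain rule through param_fn
        theta -= learning_rate * grad_theta    # additive update on theta
    return param_fn(theta), total_loss

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_true = np.array([0.5, 0.2, 0.8])         # positive target, reachable by both runs
    data = [(x, w_true @ x) for x in rng.normal(size=(200, 3))]

    # Identity parameterization: w = theta, recovering ordinary gradient descent.
    w_gd, loss_gd = online_reparameterized_gd(
        data, param_fn=lambda t: t,
        param_fn_grad=lambda t: np.ones_like(t), dim=3)

    # Exponential parameterization: w = exp(theta), so the additive update on
    # theta acts multiplicatively on w (an EG-flavoured, positivity-constrained
    # update; illustrative only).
    w_exp, loss_exp = online_reparameterized_gd(
        data, param_fn=np.exp, param_fn_grad=np.exp, dim=3)

    print("identity  w:", np.round(w_gd, 3), " cumulative loss:", round(loss_gd, 3))
    print("exp       w:", np.round(w_exp, 3), " cumulative loss:", round(loss_exp, 3))
```

With the identity parameterization the update is the familiar linear gradient descent step, while the exponential parameterization shows how the same additive update on the parameters becomes a nonlinear (multiplicative) update on the weights.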