We explore the loss landscape of fully-connected and convolutional neural networks using random, low-dimensional hyperplanes and hyperspheres. Evaluating the Hessian, H, of the loss function on these hypersurfaces, we observe 1) an unusual excess of positive eigenvalues of H, and 2) a large value of Tr(H)/||H||, at a well-defined range of configuration-space radii corresponding to a thick, hollow, spherical shell we refer to as the Goldilocks zone. We observe this effect for fully-connected neural networks over a range of network widths and depths on the MNIST and CIFAR-10 datasets with the ReLU and tanh non-linearities, and a similar effect for convolutional networks. Using our observations, we demonstrate a close connection betw...
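The measurement described above can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's implementation: it draws a random d-dimensional hyperplane through a point at a chosen configuration-space radius, restricts a loss to that plane, and computes the two diagnostics named in the abstract. The names `goldilocks_diagnostics`, `loss_fn`, and `n_params` are hypothetical stand-ins, and the use of the Frobenius norm for ||H|| is an assumption.

```python
import torch

def goldilocks_diagnostics(loss_fn, n_params, d=10, radius=1.0, seed=0):
    g = torch.Generator().manual_seed(seed)
    # Random d-dimensional hyperplane: orthonormal basis A (n_params x d),
    # obtained from the QR decomposition of a Gaussian random matrix.
    A, _ = torch.linalg.qr(torch.randn(n_params, d, generator=g))
    # Center the plane on a random point at the chosen configuration-space
    # radius, so the diagnostics can be swept over radii to locate the shell.
    theta0 = torch.randn(n_params, generator=g)
    theta0 = radius * theta0 / theta0.norm()

    def restricted_loss(c):
        # c holds coordinates within the hyperplane; the full parameter
        # vector is theta0 + A @ c, so the restricted Hessian is A^T H A.
        return loss_fn(theta0 + A @ c)

    H = torch.autograd.functional.hessian(restricted_loss, torch.zeros(d))
    eigs = torch.linalg.eigvalsh(H)  # restricted Hessian is symmetric
    frac_positive = (eigs > 0).float().mean().item()
    # Tr(H)/||H||; Frobenius norm assumed here (the L2 norm of the
    # eigenvalues for a symmetric matrix) -- swap in another norm if needed.
    trace_over_norm = (eigs.sum() / eigs.norm()).item()
    return frac_positive, trace_over_norm

# Toy usage with a hypothetical quadratic loss standing in for a real network:
loss = lambda theta: (theta ** 2).sum()
print(goldilocks_diagnostics(loss, n_params=100, d=10, radius=2.0))
```

Sweeping `radius` over a grid and plotting both quantities is one way to probe for the shell-like region the abstract describes; for a real network, `loss_fn` would evaluate the training loss at a flattened parameter vector.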
Neural networks provide a rich class of high-dimensional, non-convex optimization problems. Despite ...
Despite the fact that the loss functions of deep neural networks are highly nonconvex, gradient-base...
Recent work has established clear links between the generalization performance of trained neural net...
Loss landscape analysis is extremely useful for a deeper understanding of the generalization ability...
Deep learning has been immensely successful at a variety of tasks, ranging from classification to ar...
We investigate the structure of the loss function landscape for neural networks subject to...