It was not until the last decade that machines finally matched human performance on essentially any task involving vision or natural language understanding. Most of these successes were achieved by neural networks (NNs), a class of algorithms for finding patterns in large swaths of data. The progress since 2020 in particular has been driven by (i) growing the number of parameters NNs use to make predictions, and (ii) increasing the amount of data used to optimise these parameters. The race for scale has been fuelled by the discovery of scaling laws: the empirical phenomenon that the errors a model makes decay as a power law in both the dataset size and the NN parameter count. This thesis is devoted to understanding how parameter count ...
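As a point of reference, a minimal sketch of the power-law form referred to above, in a parametrisation common in the scaling-laws literature (the symbols are illustrative, not the ones used in this thesis). Writing L for the test loss, N for the parameter count and D for the dataset size,

    L(N, D) = A N^{-\alpha} + B D^{-\beta} + L_{\infty},

where A, B, the exponents \alpha, \beta > 0 and the irreducible loss L_{\infty} are constants fitted to empirical measurements; the error decays when either N or D grows, but plateaus once the other becomes the bottleneck.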
This article studies the infinite-width limit of deep feedforward neural networks whose weights are ...
The successful application of ConvNets and other neural architectures to computer vision is central ...
Two distinct limits for deep learning have been derived as the network width h → ∞, dependin...
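A hedged sketch of the two scalings usually behind this dichotomy (the notation is illustrative, not taken from the article): for a one-hidden-layer network of width h with activation \phi and readout weights a_j, the output

    f(x) = \frac{1}{\sqrt{h}} \sum_{j=1}^{h} a_j \, \phi(w_j \cdot x)

converges to the kernel ("lazy", NTK) limit as h → ∞, whereas the stronger normalisation

    f(x) = \frac{1}{h} \sum_{j=1}^{h} a_j \, \phi(w_j \cdot x)

yields the mean-field ("feature learning") limit, so which limit is reached depends on how the output scale is chosen relative to the width.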
Deep neural networks have had tremendous success in a wide range of applications where they achieve ...
A remarkable characteristic of overparameterized deep neural networks (DNNs) is that their accuracy ...
In recent years, Deep Neural Networks (DNNs) have managed to succeed at tasks that previously ap...
Machine learning, and in particular neural network models, has revolutionized fields such as image,...
One of the most important aspects of any machine learning paradigm is how it scales according to pro...
Recent years have witnessed an increasing interest in the correspondence between infinitely wide net...
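To make this correspondence concrete, a minimal NumPy sketch (the function name and the hyperparameters sigma_w2, sigma_b2 are illustrative, not taken from the works referred to) of the standard kernel recursion for the Gaussian process that an infinitely wide, fully connected ReLU network defines at initialisation:

import numpy as np

def nngp_relu_kernel(X, depth=3, sigma_w2=2.0, sigma_b2=0.0):
    # NNGP kernel of an infinitely wide fully connected ReLU network with
    # i.i.d. Gaussian weights (variance sigma_w2 / fan-in) and biases
    # (variance sigma_b2) at initialisation. X: (n, d) array of inputs.
    d = X.shape[1]
    K = sigma_b2 + sigma_w2 * (X @ X.T) / d  # input-layer covariance
    for _ in range(depth):
        diag = np.sqrt(np.clip(np.diag(K), 1e-12, None))
        norm = np.outer(diag, diag)
        theta = np.arccos(np.clip(K / norm, -1.0, 1.0))
        # Closed form of E[relu(u) relu(v)] for a centred bivariate
        # Gaussian (u, v) with covariance K: the arc-cosine kernel.
        ev = norm * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2.0 * np.pi)
        K = sigma_b2 + sigma_w2 * ev
    return K

The returned matrix is an ordinary Gaussian-process kernel, so it can be plugged into exact GP regression; this tractability is what makes the infinite-width correspondence useful in practice.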
We study the compute-optimal trade-off between model and training data set sizes for large neural ne...
We investigate deep Bayesian neural networks with Gaussian priors on the weights and ReLU-like nonli...
Deep neural networks of sizes commonly encountered in practice are proven to c...
For many reasons, neural networks have become very popular machine learning models. Two of the mo...
Deep ResNets are recognized for achieving state-of-the-art results in complex machine learning tasks...