Rutgers University Department of Physics and Astronomy
Video File
Presentation Slides (Keynote)

Jared Kaplan
(Johns Hopkins University)

Title: Scaling Laws in Machine Learning and GPT-3

Abstract: A variety of recent works suggest that scaling laws are ubiquitous in machine learning. In particular, neural network performance obeys scaling laws with respect to the number of parameters, the dataset size, and the training compute budget. I will explain these scaling laws and argue that they are both precise and remarkably universal. Then I will explain how this line of thinking led to the GPT-3 language model, and what it suggests for the future.
