Power law covariance and a solvable model of the Kaplan scaling laws

Series
Stochastics Seminar
Time
Thursday, October 23, 2025 - 3:30pm for 1 hour (actually 50 minutes)
Location
Speaker
Elliot Paquette – McGill University
Organizer
Cheng Mao

One of the foundational ideas in modern machine learning is the scaling hypothesis: that machine learning models will improve in a predictable manner, with each doubling of resources leading to a commensurate improvement in capabilities. This hypothesis was formalized for large language models in the Kaplan et al. scaling laws.
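
For concreteness, the Kaplan et al. laws take a power-law form; one commonly cited version (the constants and exponents below are indicative only) is

L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C},

where L is the test loss, N the number of (non-embedding) parameters, D the dataset size, and C the training compute. The fitted exponents are small, on the order of 0.05 to 0.1, so each doubling of a resource removes a fixed fraction of the excess loss.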

These are almost entirely empirically observed laws, which motivates the development of probabilistic models that can explain them and ultimately inform how to answer fundamental questions, such as: What can improve these laws? What causes them to break?

In this talk I’ll focus on a simple random matrix model of these scaling laws, the power law random features model, which motivates new iterations of stochastic algorithms with the potential to change these scaling laws. This random matrix model is not fully solved, and many open questions, in both pure probability and machine learning, arise in its study.
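
As an illustration, here is a minimal numerical sketch of one common formulation of a power-law random features model: inputs whose covariance spectrum decays as a power law, a linear target, a Gaussian random feature map, and ridge regression, with the test loss tracked as resources are doubled. This is an illustrative setup with assumed parameter values (alpha, beta, dimensions, ridge penalty), not necessarily the exact model of the talk.

import numpy as np

rng = np.random.default_rng(0)

d = 2000       # ambient input dimension
alpha = 1.2    # power-law exponent of the data covariance spectrum
beta = 0.3     # extra decay of the target coefficients
lam = 1e-6     # ridge penalty (illustrative value)

# Covariance eigenvalues lambda_j ~ j^{-alpha}; target coefficients b_j decay as well
j = np.arange(1, d + 1)
spec = j ** (-alpha)
b = j ** (-(alpha + 2 * beta) / 2.0)

def test_loss(n_samples, n_features):
    """Fit ridge regression on random features; return the population test loss."""
    # Inputs with independent coordinates of variance spec[j]
    X = rng.standard_normal((n_samples, d)) * np.sqrt(spec)
    y = X @ b
    # Random feature map z = W^T x / sqrt(d), with i.i.d. Gaussian W
    W = rng.standard_normal((d, n_features)) / np.sqrt(d)
    Z = X @ W
    # Ridge solution in feature space
    A = Z.T @ Z / n_samples + lam * np.eye(n_features)
    w_hat = np.linalg.solve(A, Z.T @ y / n_samples)
    # The learned predictor acts on x through W @ w_hat, so the population loss
    # is the covariance-weighted squared norm of the residual coefficient vector
    resid = b - W @ w_hat
    return float(resid @ (spec * resid))

# Double the resources (features and, proportionally, samples) and watch the loss:
# in this regime it decays roughly as a power law in the resource budget.
for k in range(4, 9):
    n = 2 ** k
    print(n, test_loss(n_samples=4 * n, n_features=n))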