Learning Theory of Transformers -- An Operator Learning Viewpoint

Series
SIAM Student Seminar
Time
Friday, December 5, 2025 - 11:00am for 1 hour (actually 50 minutes)
Location
Clough 125
Speaker
Peilin Liu – University of Sydney – peilin.liu@sydney.edu.au – https://www.maths.usyd.edu.au/ut/people?who=P_Liu
Organizer
Wenjing Liao
To study the underlying mechanisms behind transformers and related techniques, we propose a transformer learning framework motivated by a two-stage sampling process in which the inputs are distributions, and we present a mathematical formulation of the attention mechanism as a kernel embedding. Our findings show that, through the attention operator, transformers can compress distributions into function representations without loss of information. We also demonstrate the in-context learning capabilities of efficient transformer structures through a rigorous generalization analysis.
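To give a concrete feel for reading attention as a kernel-weighted operation on an empirical distribution, below is a minimal NumPy sketch. It is not the speaker's formulation: the choice of the exponential kernel, the shapes, and all variable names are illustrative assumptions. It only shows that single-query softmax attention over sampled keys coincides with a kernel-weighted (Nadaraya-Watson style) average of the attached values.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D score vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)

# Empirical "input distribution": n sampled keys with attached values (illustrative sizes).
n, d = 32, 4
keys = rng.normal(size=(n, d))    # samples x_1, ..., x_n
values = rng.normal(size=(n, d))  # features v_i attached to each sample
query = rng.normal(size=d)        # a query point q

# Standard softmax attention: sum_i softmax(<q, x_i>) v_i.
scores = keys @ query
attention_out = softmax(scores) @ values

# Same output, read as a kernel-weighted average over the empirical distribution
# with the exponential kernel k(q, x) = exp(<q, x>) (shifted for numerical stability).
k = np.exp(scores - scores.max())
kernel_out = (k @ values) / k.sum()

print(np.allclose(attention_out, kernel_out))  # True: the two readings coincide
```

In this toy view, the attention output is a weighted summary of the sampled distribution; the talk's kernel-embedding formulation and the lossless-compression claim concern the operator-level version of this idea.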