Events
ML Seminar - Suvrat Raju - 10/04/26
Centre for Theoretical Physics and Astronomy
Date: 10 April 2026
Time: 14:30 - 15:30
Location: 114 GO Jones Building
Title: A model of errors in transformers
Abstract: We study the error rate of LLMs on tasks, such as arithmetic, that require a deterministic output and repetitive processing of tokens drawn from a small set of alternatives. By analyzing the accumulation of errors in the attention mechanism, we theoretically derive a quantitative two-parameter relationship between the accuracy and the complexity of the task. We empirically verify our formula across a range of tasks and state-of-the-art LLMs and find excellent agreement between the predicted and observed accuracy in many cases. We also identify deviations in some cases that lead us to interesting insights about the functioning of models. We show how this understanding helps to construct prompts that reduce the error rate.
(Work in collaboration with Praneeth Netrapalli.)
Updated by: Dimitrios Bachtis
