What I learned from CS336: A deep dive into the modern LLMs
Centre for Human-Centred Computing
For this week's CogSci Seminar, we return to an in-person format. Our own PhD student, Yuan Liang, will talk about what she learned from the Stanford CS336 module, including a deep dive into modern LLMs and the architectural changes made since the original transformer.
Title: What I learned from CS336: A deep dive into the modern LLMs
Speaker: Yuan Liang
Abstract:
In this talk, I share key technical insights from CS336 (Large Language Models). We'll cover how the transformer architecture has evolved since 2017, how to estimate training memory and compute cost from first principles, what standard benchmarks actually measure, and what scaling laws tell us about the limits of model improvement. The goal is to go beyond surface-level intuition and build a concrete, quantitative picture of how modern LLMs are built and trained.
| Contact: | Juntao Yu |
| Email: | juntao.yu@qmul.ac.uk |
Updated by: Paul Curzon