Fundamentals of AI Reading Group: Shared Representations for Speech and Music? Evidence from Frozen Transfer

Centre for Fundamentals of AI and Computational Theory

Date: 29 April 2026
Time: 10:30 - 13:00

Location: G2 Engineering Building

At this week's reading group, Farida Yusuf will present their work on:
Shared Representations for Speech and Music? Evidence from Frozen Transfer

Abstract: Recent work in deep auditory modelling has investigated the effect of task on model-brain alignment of neural representations. Outputs of this work include convolutional architectures narrowly trained on classification tasks that model basic functions of auditory perception, such as speaker identification, word recognition and environmental sound recognition. I leverage these pretrained models to ask whether representations developed from learning speech show any immediate transfer towards tasks modelling music perception, starting with the case of instrument sound recognition as classification.
Preliminary(!) results show success in frozen transfer for classifying instrument sounds and further suggest that architectural and stimulus factors modulate transfer. However, it remains to be determined whether this transfer effect is a) asymmetric between speech and music tasks, and b) not simply a domain-agnostic effect of pretraining. In this talk, I will present these preliminary results and discuss motives for further experiments. Overall, probing deep auditory models seems a promising way to investigate the affordances, limitations and modulators of speech-derived representations for music perception, and so could contribute novel insights to a longstanding debate on the shared neural bases, or lack thereof, of speech and music perception.

Contact:	Gabryel Thomas Mason-Williams
Email:	g.t.mason-williams@qmul.ac.uk

Updated by: Paul Curzon