
Seminar: Fast Structure-informed Positional Encoding for Music Generation

Centre for Multimodal AI

Date: 3 December 2024   Time: 14:00 - 15:00

Location: Room G2, Engineering Building, Mile End Campus and online on Zoom

You are invited to join us for a Centre for Digital Music / Centre for Multimodal AI seminar on Tuesday 3 December from 2 to 3 pm, where Manvi Agarwal (Télécom Paris) will present her work on "Fast Structure-informed Positional Encoding for Music Generation". Attend in person in ENG G2 or virtually via Zoom. Please see details below.


Topic: Fast Structure-informed Positional Encoding for Music Generation
Speaker: Manvi Agarwal (Télécom Paris)
Date/time: 3 December 2024, 2-3 pm
Location: G2, Engineering Building and online on Zoom: https://qmul-ac-uk.zoom.us/j/2387202947


Abstract:
Music generated by deep learning methods often suffers from a lack of coherence and long-term organization. Over the years, several solutions have been proposed to help generative music architectures capture multi-scale hierarchical structure, which is a distinctive feature of music signals. The focus of my talk is the use of musically-relevant structural information to improve music Transformers. Specifically, I will present structure-informed positional encoding as a way to achieve superior music generation performance with low resource requirements. I will put forward two perspectives: an empirical approach exploring different designs for incorporating structural information into positional encoding, and a theoretical approach using kernel approximations to improve the generative performance and computational complexity of such designs. In this way, I hope to underline the strengths of well-designed priors in addressing some of the challenges facing music generation systems.
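For readers unfamiliar with positional encoding, the short Python sketch below shows the standard sinusoidal encoding used in Transformers, together with a purely illustrative "bar-aware" variant that injects a simple structural signal (position within a bar plus bar index). The bar_aware_pe function, its beats_per_bar parameter, and the per-token beat indices are hypothetical and are not the method presented in the talk; they only indicate, at a very high level, where musically-relevant structure could enter a positional encoding.

import numpy as np

def sinusoidal_pe(num_positions: int, d_model: int) -> np.ndarray:
    """Standard sinusoidal positional encoding (Vaswani et al., 2017). d_model must be even."""
    assert d_model % 2 == 0
    positions = np.arange(num_positions)[:, None]        # (T, 1)
    dims = np.arange(0, d_model, 2)[None, :]              # (1, d_model/2)
    angles = positions / (10000.0 ** (dims / d_model))    # (T, d_model/2)
    pe = np.zeros((num_positions, d_model))
    pe[:, 0::2] = np.sin(angles)                           # even dimensions
    pe[:, 1::2] = np.cos(angles)                           # odd dimensions
    return pe

def bar_aware_pe(beat_per_token: np.ndarray, beats_per_bar: int, d_model: int) -> np.ndarray:
    """Hypothetical structure-aware variant (illustration only, not the talk's method):
    encode each token's position within its bar and its bar index, then concatenate."""
    within_bar = beat_per_token % beats_per_bar            # local position inside the bar
    bar_index = beat_per_token // beats_per_bar            # which bar the token falls in
    half = d_model // 2
    local_pe = sinusoidal_pe(int(within_bar.max()) + 1, half)[within_bar]
    global_pe = sinusoidal_pe(int(bar_index.max()) + 1, d_model - half)[bar_index]
    return np.concatenate([local_pe, global_pe], axis=-1)

# Example: 32 tokens, one per beat, 4 beats per bar, 64-dimensional model.
beats = np.arange(32)
print(bar_aware_pe(beats, beats_per_bar=4, d_model=64).shape)  # (32, 64)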


Bio:
Manvi Agarwal is pursuing her PhD under the supervision of Dr. Changhong Wang and Prof. Gaël Richard in the ADASP (Audio Data Analysis and Signal Processing) group at Télécom Paris, Institut Polytechnique de Paris, France, supported by the ERC-funded Hi-Audio project. Her PhD research investigates how inductive biases can be introduced into Transformers to improve their modelling capabilities on music data. More broadly, she is interested in how sequence-based learning works and how our understanding of this process in different deep learning architectures can help make these architectures perform better, especially in low-resource settings.

Arranged by:  Centre for Digital Music
Contact:  C4DM seminar organisers
Email:  c4dm-seminar-organisers@qmul.ac.uk
