<?xml version="1.0" encoding="ISO-8859-1"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <atom:link href="https://www.seresearch.qmul.ac.uk/cmai/news/" rel="self" type="application/rss+xml" />
        <title>QMUL Centre for Multimodal AI News</title>
        <description>Here's the latest news from The Centre for Multimodal AI at QMUL</description>
        <link>https://www.seresearch.qmul.ac.uk/cmai/news/</link>
        <lastBuildDate>Mon, 04 May 2026 16:57:22 +0100</lastBuildDate>
        <image>
            <url>https://www.seresearch.qmul.ac.uk/design_local/images/SITE_QMUL_square_logo.png</url>
            <title>QMUL Centre for Multimodal AI News</title>
            <link>https://www.seresearch.qmul.ac.uk/cmai/news/</link>
            <description>News from Centre for Multimodal AI - click to visit</description>
        </image>
        <webMaster>QMUL S&amp;E Research Centres Webmaster (m.m.knight@qmul.ac.uk)</webMaster>
        <item>
            <title>Queen Mary hosts inaugural event of new London Interdisciplinary Music Research Initiative</title>
            <link>https://www.seresearch.qmul.ac.uk/chcc/news/5478/queen-mary-hosts-inaugural-event-of-new-london-interdisciplinary-music-research-initiative/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/cb8592d163b4c82132cb1e3830078327.jpg&quot; /&gt;

&lt;br&gt;Yesterday (29th April 2026), the Centre for Digital Music (C4DM), part of the School of Electronic Engineering and Computer Science, hosted the inaugural event of the London Interdisciplinary Music Research Initiative (LIMRI) - bringing together leading researchers and practitioners from across London to explore the science, scholarship, and art of expert musical performance.

The afternoon workshop, titled Interdisciplinary Conversations on Expert Performance, took place at Queen Mary University of London's Mile End Campus and featured four invited speakers from King's College London, Imperial College London, City St George's, University of London, and the Royal College of Music.

LIMRI is a newly launched cross-London network designed to foster collaboration and dialogue between researchers working across music, technology, science and the arts. The initiative is co-led by Dr Charalampos Saitis, Lecturer in Digital Music Processing, alongside colleagues from Goldsmiths, Kingston University London, and King's College London.

LIMRI was formally launched in December 2025. Yesterday's workshop was the first in a series of themed research events that LIMRI plans to host at different institutions across London. Further details about the initiative can be found on the LIMRI website.</description>
            <category>Public news</category>
            <pubDate>Wed, 29 Apr 2026 23:00:00 +0100</pubDate>
            <guid>news5478</guid>
        </item>
        <item>
            <title>Centre for Multimodal AI at ICASSP 2026</title>
            <link>https://www.seresearch.qmul.ac.uk/cmai/news/5445/centre-for-multimodal-ai-at-icassp-2026/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/d77d146300eb645168c5479c238e08c6.jpg&quot; /&gt;

&lt;br&gt;On 4-8 May 2026, several CMAI researchers will participate in the 2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2026). ICASSP is the leading conference in the field of signal processing and the flagship event of the IEEE Signal Processing Society.

As in previous years, the Centre for Multimodal AI will have a strong presence at the conference, both in terms of numbers and overall impact. The following papers, authored or co-authored by CMAI members, will be presented in the main track of ICASSP 2026:


    Chain-of-Caption: Training-free improvement of multimodal large language model on referring expression comprehension, by Yik Lung Pang, Changjae Oh
    Consistency-aware learning for unbiased visual question answer, by Xinyu Jiang, Qiang Lu, Liang Zhao, Yunfei Long, Zhenfang Zhu, Jianyong Chai
    Making Dialogue Grounding Data Rich: A Three-Tier Data Synthesis Framework for Generalized Referring Expression Comprehension, by Juexi Shao, Siyou Li, Yujian Gan, Chris Madge, Vanja Karan, Massimo Poesio
    RAVE: Retrieval and Scoring Aware Verifiable Claim Detection, by Yufeng Li, Arkaitz Zubiaga
    Diffusion Timbre Transfer Via Mutual Information Guided Inpainting, by Ching Ho Lee, Javier Nistal, Stefan Lattner, Marco Pasini, George Fazekas
    Towards Effective Negation Modeling in Joint Audio-Text Models for Music, by Yannis Vasilakis, Rachel Bittner, Johan Pauwels
    Domain-Invariant Representation Learning of Bird Sounds, by Ilyass Moummad, Romain Serizel, Emmanouil Benetos, Nicolas Farrugia
    The MUSE Benchmark: Probing Music Perception and Auditory Relational Reasoning in Audio LLMs, by Brandon Carone, Iran Roman, Pablo Ripollés
    Beat and Downbeat Detection: A Reformulated Approach, by James Bolt, Johan Pauwels, George Fazekas
    Learning Vocal-Tract Area and Radiation with a Physics-Informed Webster Model, by Minhui Lu, Joshua D. Reiss
    Scalable Evaluation for Audio Identification via Synthetic Latent Fingerprint Generation, by Aditya Bhattacharjee, Marco Pasini, Emmanouil Benetos
    Audio-to-Score Jazz Solo Transcription with the Rhythm Perceiver, by Ivan Shanin, Xavier Riley, Simon Dixon


The following papers, which have been published in IEEE and EURASIP journals, will also be presented at the conference:


    Neural Audio Synthesis for Sound Effects: A Scope Review, by Mateo Cámara, Fernando Marcos, Anders Bargum, Cumhur Erkut, Joshua Reiss, José Luis Blanco
        Published in the IEEE Transactions on Audio, Speech and Language Processing
    Domain Adaptation of Few-Shot Bioacoustic Event Detection in Different Environments, by Yizhou Tan, Haojun Ai, Shengchen Li, György Fazekas
        Published in the IEEE Transactions on Audio, Speech and Language Processing
    Parameter optimisation for a physical model of the vocal system, by Mateo Cámara, José Luis Blanco, Joshua D. Reiss
        Published in the EURASIP Journal on Audio, Speech, and Music Processing
    Acoustic Prompt Tuning: Empowering Large Language Models With Audition Capabilities, by Jinhua Liang, Xubo Liu, Wenwu Wang, Mark D. Plumbley, Huy Phan, Emmanouil Benetos
        Published in the IEEE Transactions on Audio, Speech and Language Processing
    Velocity2DMs: A Contextual Modeling Approach to Dynamics Marking Prediction in Piano Performance, by Hyon Kim, Emmanouil Benetos, Xavier Serra
        Published in the IEEE Signal Processing Letters



See you in Barcelona!</description>
            <category>Public news</category>
            <pubDate>Sun, 19 Apr 2026 23:00:00 +0100</pubDate>
            <guid>news5445</guid>
        </item>
        <item>
            <title>Centre for Multimodal AI at ICLR 2026</title>
            <link>https://www.seresearch.qmul.ac.uk/cmai/news/5429/centre-for-multimodal-ai-at-iclr-2026/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/e1b6924083744f1e06cfc014164defcc.jpg&quot; /&gt;

&lt;br&gt;On 23-27 April, CMAI researchers will participate in the Fourteenth International Conference on Learning Representations (ICLR 2026), taking place in Rio de Janeiro, Brazil. ICLR is the premier gathering of professionals dedicated to advancing the branch of artificial intelligence known as representation learning, generally referred to as deep learning.

The following papers authored or co-authored by CMAI members will be presented at the main track of ICLR 2026:


    Spectral Attention Steering for Prompt Highlighting, by Weixian Waylon Li, Yuchen Niu, Yongxin Yang, Keshuang Li, Tiejun Ma, Shay B Cohen
    SCRAPL: scattering transform with random paths for machine learning, by Christopher Mitcheltree, Vincent Lostanlen, Emmanouil Benetos, Mathieu Lagrange
    ViMo: A Generative Visual GUI World Model for App Agents, by Dezhao Luo, Bohan Tang, Kang Li, Georgios Papoudakis, Jifei Song, Shaogang Gong, Jianye Hao, Jun Wang, Kun Shao
    Beyond Linear Probes: Dynamic Safety Monitoring for Language Models, by James Oldfield, Philip Torr, Ioannis Patras, Adel Bibi, Fazl Barez
    CASteer: Cross-Attention Steering for Controllable Concept Erasure, by Tatiana Gaintseva, Andreea-Maria Oncescu, Chengcheng Ma, Ziquan Liu, Martin Benning, Gregory Slabaugh, Jiankang Deng, Ismail Elezi
    OmniVideoBench: towards audio-visual understanding evaluation for omni MLLMs, by Caorui Li, Yu Chen, Yiyan Ji, Jin Xu, Zhenyu Cui, Shihao Li, Yuanxing Zhang, Zhenghao Song, Dingling Zhang, Heying, Haoxiang Liu, Yuxuan Wang, Qiufeng Wang, Jiafu Tang, Zhenhe Wu, Jiehui Luo, Zhiyu Pan, Weihao Xie, Chenchen Zhang, Zhaohui Wang, Jiayi Tian, Yanghai Wang, Zhe Cao, Minxin Dai, Ke Wang, Runzhe Wen, Yinghao Ma, Yaning Pan, Sungkyun Chang, Termeh Taheri, Haiwen Xia, Christos Plachouras, Emmanouil Benetos, Yizhi Li, Ge Zhang, Jian Yang, Tianhao Peng, Zili Wang, Minghao Liu, Junran Peng, Zhaoxiang Zhang, Jiaheng Liu
    YuE: scaling open foundation models for long-form music generation, by Ruibin Yuan, Hanfeng Lin, Shuyue Guo, Ge Zhang, Jiahao Pan, Yongyi Zang, Haohe Liu, Yiming Liang, Wenye Ma, Xingjian Du, Xeron Du, Zhen Ye, Tianyu Zheng, Zhengxuan Jiang, Yinghao Ma, Minghao Liu, Zeyue Tian, Ziya Zhou, Liumeng Xue, Xingwei Qu, Yizhi Li, Shangda Wu, Tianhao Shen, Ziyang Ma, Jun Zhan, Chunhui Wang, Yatian Wang, Xiaowei Chi, Xinyue Zhang, Zhenzhu Yang, Xiangzhou Wang, Shansong Liu, Lingrui Mei, Peng Li, Junjie Wang, Jianwei Yu, Guojian Pang, Xu Li, Zihao Wang, Xiaohuan Zhou, Lijun Yu, Emmanouil Benetos, Yong Chen, Chenghua Lin, Xie Chen, Gus Xia, Zhaoxiang Zhang, Chao Zhang, Wenhu Chen, Xinyu Zhou, Xipeng Qiu, Roger Dannenberg, Jiaheng Liu, Jian Yang, Wenhao Huang, Wei Xue, Xu Tan, Yike Guo



The following paper authored by CMAI members will be presented at the ICLR 2026 Workshop on Lifelong Agents: 


    Beyond syntax: Action semantics learning for app agents, by Bohan Tang, Dezhao Luo, Jianheng Liu, Jingxuan Chen, Shaogang Gong, Jianye Hao, Jun Wang, Kun Shao



See you all at ICLR!</description>
            <category>Public news</category>
            <pubDate>Wed, 08 Apr 2026 23:00:00 +0100</pubDate>
            <guid>news5429</guid>
        </item>
        <item>
            <title>CMAI PhD Student Completes Research Fellowship at UK Parliament</title>
            <link>https://www.seresearch.qmul.ac.uk/cmai/news/5420/cmai-phd-student-completes-research-fellowship-at-uk-parliament/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/ddd5e375a6f7fdb2cf6e412a46347e8d.jpg&quot; /&gt;

&lt;br&gt;CMAI PhD student Alexander Williams recently completed a three-month research fellowship at the Parliamentary Office of Science and Technology (POST) following a successful application to the UKRI Policy Internship Scheme.

POST is an impartial research and knowledge exchange service based in the UK Parliament. It works to ensure cutting-edge research evidence and expertise are available to members of both Houses of Parliament (the House of Commons and the House of Lords), covering emerging and complex science and social science topics.

During the fellowship, Alex worked closely with POST's Physical Sciences and Digital Lead, Simon Brawley, to research and write a POSTnote—an impartial, accurate, and peer-reviewed briefing tailored for UK parliamentarians—on data centres and their sustainability.

Data centres are crucial infrastructure that underpins many aspects of modern life, including artificial intelligence. The POSTnote, titled What are data centres and how sustainable are they?, discusses what data centres are, their presence in the UK, and their impact on different aspects of sustainability. This briefing was produced in consultation with experts and stakeholders from academia, industry, government, and beyond, including interviews with Google, techUK, and academics from the University of Oxford, Loughborough University, and the University of Manchester.

The POSTnote can be read in full here.

Well done Alex!</description>
            <category>Public news</category>
            <pubDate>Sun, 29 Mar 2026 23:00:00 +0100</pubDate>
            <guid>news5420</guid>
        </item>
        <item>
            <title>Reimagining music videos with AI: CMAI research breaks new ground</title>
            <link>https://www.seresearch.qmul.ac.uk/cmai/news/5275/reimagining-music-videos-with-ai-cmai-research-breaks-new-ground/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/1d9355dcb7ccbd2e8acf2f603acb6fc0.jpg&quot; /&gt;

&lt;br&gt;Yinghao Ma, a PhD candidate in the Centre for Multimodal AI at Queen Mary University of London, has helped develop AutoMV, the first open-source AI system capable of generating complete music videos directly from full-length songs.

Music-to-video generation remains a major challenge for generative AI. While recent video models can produce visually impressive short clips, they often struggle with long-form storytelling, musical alignment, and character consistency. AutoMV addresses these limitations by introducing a multi-agent AI system designed specifically for full-length music video production.

Developed through a collaboration between Queen Mary researchers and partners at Beijing University of Posts and Telecommunications, Nanjing University, Hong Kong University of Science and Technology, and the University of Manchester, AutoMV brings together expertise in music information retrieval, multimodal AI, and creative computing. The work was led by Dr Emmanouil Benetos, with contributions from Yinghao Ma as well as Dr Changjae Oh and Chaoran Zhu from the Centre for Intelligent Sensing.

AutoMV works like a virtual film production team. First, it analyses a song's musical structure, beats, and time-aligned lyrics. Then, a set of specialised AI agents—taking on roles such as screenwriter, director, and editor—collaborate to plan scenes, maintain character identity, and generate images and video clips. A final quality-control &quot;verifier&quot; agent checks for coherence and consistency, regenerating content where needed.
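
To make that pipeline concrete, below is a minimal, self-contained Python sketch of the verify-and-regenerate pattern described above. Every name in it (analyse_song, screenwriter, director, verifier) is a hypothetical stand-in invented for this illustration, not AutoMV's actual API; see the code repository linked below for the real implementation.

    # Minimal sketch of a multi-agent pipeline with a quality-control loop.
    # All names are hypothetical stand-ins, not AutoMV's actual API.
    import random
    from dataclasses import dataclass

    @dataclass
    class Scene:
        section: str   # e.g. 'verse' or 'chorus'
        prompt: str    # what the 'director' agent should render

    @dataclass
    class Clip:
        scene: Scene
        quality: float  # stand-in for the generated video content

    def analyse_song(audio_path):
        '''Stage 1 (stub): extract structure, beats and time-aligned lyrics.'''
        return {'structure': ['intro', 'verse', 'chorus'], 'beats': [0.0, 0.5, 1.0]}

    def screenwriter(analysis):
        '''Agent (stub): plan one scene per song section.'''
        return [Scene(s, f'scene for the {s}') for s in analysis['structure']]

    def director(scene, beats):
        '''Agent (stub): shoot a clip; quality varies between attempts.'''
        return Clip(scene, quality=random.random())

    def verifier(clip):
        '''Agent (stub): accept a clip only if it is coherent enough.'''
        return clip.quality > 0.4

    def make_video(audio_path, max_retries=3):
        analysis = analyse_song(audio_path)
        clips = []
        for scene in screenwriter(analysis):
            clip = director(scene, analysis['beats'])
            # Regenerate any clip the verifier rejects, up to max_retries times.
            for _ in range(max_retries):
                if verifier(clip):
                    break
                clip = director(scene, analysis['beats'])
            clips.append(clip)
        return clips  # the real system would also cut these to the beat grid

    clips = make_video('song.wav')
    print(f'{len(clips)} clips generated')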

This approach allows AutoMV to produce music videos that follow a song from beginning to end, maintaining narrative flow and visual identity throughout. Human expert evaluations show that AutoMV significantly outperforms existing commercial tools, narrowing the gap between AI-generated videos and professionally produced music videos.

By lowering the cost of music video production from tens of thousands of pounds to roughly the cost of an API call, AutoMV has the potential to empower independent musicians, educators, and creators who previously lacked access to professional video production. As an open-source project, it also supports transparent, reproducible research and encourages community collaboration.

The team is actively inviting researchers and students to contribute to the codebase, extend the benchmark, and explore future directions for long-form, multimodal AI systems.


    Code: https://github.com/multimodal-art-projection/AutoMV
    Paper: https://arxiv.org/abs/2512.12196
    Project website: https://m-a-p.ai/AutoMV/</description>
            <category>Public news</category>
            <pubDate>Tue, 06 Jan 2026 00:00:00 +0000</pubDate>
            <guid>news5275</guid>
        </item>
        <item>
            <title>Women in Higher Education Network plus grant</title>
            <link>https://www.seresearch.qmul.ac.uk/chcc/news/5458/women-in-higher-education-network-plus-grant/</link>
            <description>Ekaterina Ivanova and Anna Xambo Sedo have been awarded a grant from the QMUL Erica fund worth £13,200 to support the Women in Higher Education Network plus (WHEN+), until July 2027.


The Women in Higher Education Network (WHEN) was founded in 2023 at EECS with the aim of building a strong and sustainable community for individuals identifying as women, while contributing to advancing diversity, equity, and inclusion in STEM. Over the past two years, under the leadership of Ekaterina Ivanova and Anna Xambó Sedó, WHEN has established a solid foundation for community engagement through a dedicated website, mailing list, and LinkedIn group. The network has grown from 38 active participants in its first year to 88 in the second, and now exceeds 100 registered members with the inclusion of SBBS and collaboration with ITS Women in Tech. To date, WHEN has delivered 30 monthly events, including workshops, talks, and social gatherings. These activities have received consistently positive feedback (July 2024 survey) and have fostered strong peer support, with members actively proposing new initiatives.

With this new extension funding from ERICA, they aim to increase the project's impact through:


    widening inclusion by collaborating with other underrepresented groups, including LGBTQIA+ communities;
    expanding participation across the Faculty of Engineering; and
    extending the network geographically from QMUL to the wider London academic community, supporting interdisciplinary exchange.


Following their already established series of regular but distinctive events, the monthly activities will include social events and skill development events. Social events include coffee mornings, cinema screenings with discussion, tea parties, social lunches, sports (e.g. yoga, zumba), and so on. Skill development events include group coaching, presentations, workshops, meetups, and any relevant activity/idea proposed by and/or powered by network members.

See the WHEN website for past activities.</description>
            <category>Public news</category>
            <pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate>
            <guid>news5458</guid>
        </item>
        <item>
            <title>CMAI at NeurIPS 2025</title>
            <link>https://www.seresearch.qmul.ac.uk/cmai/news/5219/cmai-at-neurips-2025/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/a20d19fb2ed10997b9bc578028722b0a.jpg&quot; /&gt;

&lt;br&gt;On 2-7 December, several CMAI researchers will participate in the 39th Annual Conference on Neural Information Processing Systems (NeurIPS 2025), taking place in San Diego. NeurIPS is a prestigious annual academic conference, run by a non-profit foundation, that fosters the exchange of research in artificial intelligence (AI), machine learning (ML), and computational neuroscience.

CMAI members will be presenting the following papers at the main track of NeurIPS 2025:


    Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders
        James Oldfield, Shawn Im, Sharon Li, Mihalis Nicolaou, Ioannis Patras, Grigorios Chrysos
        https://openreview.net/forum?id=jcvX8XFNqX
    Compress &amp; Cache: Vision token compression for efficient generation and retrieval
        Adrian Bulat, Yassine Ouali, Georgios Tzimiropoulos
        https://openreview.net/forum?id=nGEq3D6FFX
    ViDAR: Video Diffusion-Aware 4D Reconstruction From Monocular Inputs
        Michal Nazarczuk, Sibi Catley-Chandar, Thomas Tanay, Zhensong Zhang, Gregory Slabaugh, Eduardo Pérez-Pellitero
        https://openreview.net/forum?id=mLVqiNH0aA
    Large language models can learn and generalize steganographic chain-of-thought under process supervision
        Robert McCarthy, Joey Skaf, Luis Ibanez-Lissen, Vasil Georgiev, Connor Watts, Hannes Whittingham, Lorena Gonzalez-Manzano, Cameron Tice, Edward James Young, Puria Radmard, David Lindner
        https://openreview.net/forum?id=2g5cJqX15Y
    Uncertainty-quantified Rollout Policy Adaptation for Unlabelled Cross-domain Video Temporal Grounding
        Jian Hu, Zixu Cheng, Shaogang Gong, Isabel Guan, Jianye Hao, Jun Wang, Kun Shao
        https://openreview.net/forum?id=RfNiN2rENM
    CALM: Culturally Self-Aware Language Models
        Lingzhi Shen, Xiaohao Cai, Yunfei Long, Imran Razzak, Guanming Chen, Shoaib Jameel
        https://openreview.net/forum?id=16QYhVFvrO
    λ-Orthogonality Regularization for Compatible Representation Learning
        Simone Ricci, Niccolò Biondi, Federico Pernici, Ioannis Patras, Alberto Del Bimbo
        https://openreview.net/forum?id=Due3iZPa6u


The following papers will be presented at the Datasets and Benchmarks track of NeurIPS 2025:


    OmniBench: Towards The Future of Universal Omni-Language Models
        Yizhi Li, Ge Zhang, Yinghao Ma, Ruibin Yuan, King Zhu, Hangyu Guo, Yiming Liang, Jiaheng Liu, Zekun Moore Wang, Jian Yang, Siwei Wu, Xingwei Qu, Jinjie Shi, Xinyue Zhang, Zhenzhu Yang, Yidan Wen, Yanghai Wang, Shihao Li, Zhaoxiang Zhang, Ruibo Liu, Emmanouil Benetos, Wenhao Huang, Chenghua Lin
        https://openreview.net/forum?id=SSF4qgsNYE
    MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix
        Ziyang Ma, Yinghao Ma, Yanqiao Zhu, Chen Yang, Yi-Wen Chao, Ruiyang Xu, Wenxi Chen, Yuanzhe Chen, Zhuo Chen, Jian Cong, Kai Li, Keliang Li, Siyou Li, Xinfeng Li, Xiquan Li, Zheng Lian, Yuzhe Liang, Minghao Liu, Zhikang Niu, Tianrui Wang, Yuping Wang, Yuxuan Wang, Yihao Wu, Guanrou Yang, Jianwei Yu, Ruibin Yuan, Zhisheng Zheng, Ziya Zhou, Haina Zhu, Wei Xue, Emmanouil Benetos, Kai Yu, Eng Siong Chng, Xie Chen
        https://openreview.net/forum?id=fgmrBJemlQ
    XIFBench: Evaluating Large Language Models on Multilingual Instruction Following
        Zhenyu Li, Kehai Chen, Yunfei Long, Xuefeng Bai, Yaoyin Zhang, Xuchen Wei, Juntao Li, Min Zhang
        https://openreview.net/forum?id=qkdVjCAPOE


The following paper will be presented at the Creative AI track of NeurIPS 2025:


    The Ghost in the Keys: A Disklavier Demo for Human-AI Musical Co-Creativity
        Louis Bradshaw, Alexander Spangher, Stella Biderman, Simon Colton
        https://openreview.net/forum?id=3yeBer3J5z



See you all at NeurIPS!</description>
            <category>Public news</category>
            <pubDate>Mon, 17 Nov 2025 00:00:00 +0000</pubDate>
            <guid>news5219</guid>
        </item>
        <item>
            <title>CMAI PhD student awarded Google PhD Fellowship</title>
            <link>https://www.seresearch.qmul.ac.uk/cmai/news/5187/cmai-phd-student-awarded-google-phd-fellowship/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/4ac60065bd12d76e2e9830ee1a3f805a.jpg&quot; /&gt;

&lt;br&gt;We are extremely proud to announce that Yinghao Ma, a PhD student in AI and Music at the Centre for Multimodal AI of QMUL supervised by Dr Emmanouil Benetos, has been awarded the 2025 Google PhD Fellowship in Machine Perception.

A Google spokesperson said: &quot;The student nominations we received this year were exemplary in their quality, but Yinghao especially stood out and was endorsed by the research scientists and distinguished engineers within Google who participated in the review. Congratulations to Yinghao on this well-deserved recognition, it's an honor to support such incredibly talented students.&quot;

Yinghao's PhD research focuses on advancing Large Language Models (LLMs) for music understanding and generation. Specifically, he studies how multimodal models can integrate audio, symbolic, and textual information to understand, reason about, and generate music.

Together with colleagues, he developed MERT, a large-scale music audio representation model that has seen more than 10k monthly downloads over the past three years. His recent work includes developing music instruction-following datasets and benchmarks that help evaluate how well AI systems can comprehend and create music.

He said: &quot;It's my great honour to receive the Google PhD Fellowship, which recognises my research and will strongly contribute to my future career. I'm deeply grateful to Google and QMUL for their support, providing good platforms for AI &amp; music research.&quot;

Congratulations Yinghao!</description>
            <category>Public news</category>
            <pubDate>Wed, 22 Oct 2025 23:00:00 +0100</pubDate>
            <guid>news5187</guid>
        </item>
        <item>
            <title>From biodiversity to artificial intelligence</title>
            <link>https://www.seresearch.qmul.ac.uk/news/5156/from-biodiversity-to-artificial-intelligence/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/f9700750a2b9f995526d7e6854e85cb0.jpg&quot; /&gt;

&lt;br&gt;We showcase the research work of Kabiru Abubakari, a PhD student in the Centre for Probability, Statistics and Data Science, and the research work of Dr David Mguni, a lecturer in the Centre for Multimodal AI.

 

Kabiru Abubakari

Kabiru's research focuses on Bayesian spatial modelling for biodiversity.

His PhD project is devoted to developing and applying Bayesian spatial and spatio-temporal modelling techniques to enhance understanding of the association between plant species at risk of extinction and areas in need of protection in the face of climate change, changing land use (especially agriculture), and pollution. Working together with his supervisors — Prof Silvia Liverani (SMS), Prof Andrew Leitch (SBBS), and Dr Ilia Leitch (Royal Botanic Garden, Kew) — Kabiru combines statistical modelling and ecology to develop methods that better capture uncertainty in biodiversity data.

His academic journey began with a degree in Economics at the University for Development Studies (UDS) in Tamale, Ghana, where he graduated in 2020. Since joining Queen Mary, Kabiru has also been very active in supporting students of Black heritage as a tutor in Levelling Up Maths and as a panellist at the Black Heroes of Mathematics Conference.

Read more about Kabiru's research in this poster.

 

David Mguni

David is a Lecturer in Artificial Intelligence. His research spans reinforcement learning, game theory, and optimal control, with a focus on developing self-improving, cooperative learning systems. His work contributes to a broader vision of building AI that can reason, adapt, and learn autonomously in an open-ended world.

Together with his PhD student Yaqi Sun and master's students, David is working towards one of the grand goals of artificial intelligence: creating systems that can not only learn from existing training data but also learn how to learn and invent their own challenges. The group's research on the Recursive Meta-Learning Framework explores how intelligent systems can evolve their own learning rules and generate and solve new problems that push them beyond the limits of human-derived data.

A central focus of the group's work is reinforcement learning — particularly understanding how multiple intelligent systems can cooperate, compete, and coordinate in open, dynamic environments. The group's research seeks to overcome the limitations of traditional reinforcement learning algorithms by enabling AI to learn the rules of learning itself.

This approach has far-reaching implications. By allowing AI systems to invent new challenges, discover hidden structures, and maintain stability as they learn together, the research moves toward the long-term goal of artificial general intelligence: machines capable of generalising knowledge, adapting creatively, and cooperating safely across domains. Possible applications range from AI programs that autonomously generate novel mathematical proofs to agents that continually refine their understanding of molecular structures for drug discovery.

The group's work blends theory with practical experimentation, drawing on dynamical systems, game theory, category theory, stochastic control, and variational optimisation. These mathematical foundations ensure that the learning mechanisms they develop are not only powerful and flexible but also grounded in principles that make them interpretable, stable, and safe.

To learn more about David's research, read also here.</description>
            <category>Public news</category>
            <pubDate>Wed, 15 Oct 2025 23:00:00 +0100</pubDate>
            <guid>news5156</guid>
        </item>
        <item>
            <title>Best student paper and outstanding reviewer awards at ISMIR 2025</title>
            <link>https://www.seresearch.qmul.ac.uk/cmai/news/5098/best-student-paper-and-outstanding-reviewer-awards-at-ismir-2025/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/0d9bd1939a6f78714877cb0f2df8af8a.jpg&quot; /&gt;

&lt;br&gt;We are delighted to share that CMAI PhD student Ben Hayes, along with CMAI academics Charalampos Saitis and George Fazekas, has received the best student paper award at the ISMIR 2025 conference.

The paper &quot;Audio Synthesizer Inversion in Symmetric Parameter Spaces With Approximately Equivariant Flow Matching&quot; proposes using permutation equivariant continuous normalizing flows to handle the ill-posed problem of audio synthesizer inversion, where multiple parameter configurations can produce identical sounds due to intrinsic symmetries in synthesizer design. By explicitly modeling these symmetries, particularly permutation invariance across repeated components like oscillators and filters, the method outperforms both regression-based approaches and symmetry-naive generative models on synthetic tasks and on a real-world synthesizer (Surge XT).
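
For readers unfamiliar with the problem, the toy Python sketch below illustrates the symmetry at the heart of the paper: swapping two identical oscillators changes the parameter vector but not the audio, so a naive regressor averages over equally valid targets. All names here are invented for this illustration; the paper resolves the ambiguity with approximately equivariant flow matching, not the crude canonical sort shown at the end.

    # Toy illustration of parameter-space symmetry in a hypothetical
    # two-oscillator sine synth (not the paper's code).
    import math

    def synth(params):
        '''Render a few samples from two interchangeable sine oscillators.'''
        f1, a1, f2, a2 = params
        return [a1 * math.sin(f1 * t) + a2 * math.sin(f2 * t)
                for t in (0.0, 0.1, 0.2, 0.3)]

    p = (440.0, 0.5, 660.0, 0.3)
    p_swapped = (660.0, 0.3, 440.0, 0.5)  # same oscillators, exchanged
    assert synth(p) == synth(p_swapped)   # identical audio, distinct parameters

    # A regressor trained on both labellings is pulled towards their mean,
    # which synthesises the wrong sound. The simplest mitigation is to pick
    # a canonical ordering of the repeated components:
    def canonicalise(params):
        oscillators = sorted([params[0:2], params[2:4]])  # order by frequency
        return oscillators[0] + oscillators[1]

    assert canonicalise(p) == canonicalise(p_swapped)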

We are also happy to share that two CMAI PhD students, Yannis Vasilakis and Ben Hayes, were recognised as outstanding reviewers.</description>
            <category>Public news</category>
            <pubDate>Sun, 05 Oct 2025 23:00:00 +0100</pubDate>
            <guid>news5098</guid>
        </item>
        <item>
            <title>CMAI at WASPAA 2025</title>
            <link>https://www.seresearch.qmul.ac.uk/cmai/news/5103/cmai-at-waspaa-2025/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/6871445306f9af07141444a39ae524a8.jpg&quot; /&gt;

&lt;br&gt;On 12-15 October, several CMAI researchers will participate in the 2025 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, taking place at the Granlibakken Tahoe Resort near Lake Tahoe, in Tahoe City, CA, USA. WASPAA is a premier event in the field of audio signal processing, organised by the IEEE's Audio and Acoustic Signal Processing (AASP) technical committee, with a strong focus on music signal processing and computational sound scene analysis.

The Centre for Multimodal AI, as in previous years, will have a strong presence at WASPAA 2025.

In the Technical Programme, the following papers are authored by CMAI members:


    Latent Acoustic Mapping for Direction of Arrival Estimation: A Self-Supervised Approach (Adrian S. Roman, Iran R. Roman, Juan Pablo Bello)
    RUMAA: Repeat-Aware Unified Music Audio Analysis for Score-Performance Alignment, Transcription, and Mistake Detection (Sungkyun Chang, Simon Dixon, Emmanouil Benetos)
    Modulation Discovery with Differentiable Digital Signal Processing (Christopher Mitcheltree, Hao Hao Tan, Joshua D. Reiss)
    Beyond Architecture: The Critical Impact of Inference Overlap on Music Source Separation Benchmarks (Harnick Khera, Johan Pauwels, Alan W. Archer-Boyd, Mark B. Sandler)
    Improving Inference-Time Optimisation for Vocal Effects Style Transfer with a Gaussian Prior (Chin-Yun Yu, Marco A. Martínez-Ramírez, Junghyun Koo, Wei-Hsiang Liao, Yuki Mitsufuji, George Fazekas)
    Self-Supervised Representation Learning with a JEPA Framework for Multi-instrument Music Transcription (Mary Pilataki, Matthias Mauch, Simon Dixon)


In the Demo Session, the following demos will be presented by C4DM members:


    Neural Audio Synthesis for Non-Keyboard Instruments (Franco Caspe, Andrew McPherson, Mark Sandler)
    PCA-DiffVox: Augmenting Vocal Effects Tweakability With a Bijective Latent Space (Chin-Yun Yu, Marco A. Martínez-Ramírez, Junghyun Koo, Wei-Hsiang Liao, Yuki Mitsufuji, George Fazekas)



See you at WASPAA!</description>
            <category>Public news</category>
            <pubDate>Sun, 05 Oct 2025 23:00:00 +0100</pubDate>
            <guid>news5103</guid>
        </item>
        <item>
            <title>Game AI group celebrates multiple successes at IEEE conference on games 2025</title>
            <link>https://www.seresearch.qmul.ac.uk/cmai/news/5053/game-ai-group-celebrates-multiple-successes-at-ieee-conference-on-games-2025/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/92cf6fba34d7de7de9a0d4bc8f20ded6.jpg&quot; /&gt;

&lt;br&gt;Queen Mary University of London's Game AI Group had a standout presence at the prestigious IEEE Conference on Games (IEEE CoG) 2025, held 26–29 August in Milan. The conference is one of the world's leading venues for research on video games, board games and game-related technologies.

The group published an impressive five papers (three full and two short), with two full papers nominated for the Best Paper Award. The paper &quot;Bootstrap Your Own Teacher: Online Policy Distillation for Multi-Game Reinforcement Learning&quot; – led by Donal Byrne and co-authored by colleagues including Queen Mary's Marko Tot – went on to win the award, marking a major achievement for the team. Marko, an IGGI PhD student in the Game AI Group, carried out this work while on placement at InstaDeep.

The team also celebrated recognition beyond their own papers. Simon Lucas, Professor of Artificial Intelligence and head of the Game AI Group, was presented with an Outstanding Contribution Award for co-founding the IEEE Conference on Games in 2005 (originally the IEEE Symposium on Computational Intelligence and Games), and for founding the IEEE Transactions on Games journal.

Contributions at IEEE CoG 2025:


    Full Papers
        JSON-Bag: A generic game trajectory representation – Dien Nguyen, Diego Perez Liebana and Simon Lucas
        (Best Paper Nomination) How Task Complexity Moderates the Impact of AI-Generated Images on User Experience in Gamified Text Labelling – Fatima Althani, Chris Madge and Massimo Poesio
        (Best Paper Nomination and Winner) Bootstrap Your Own Teacher: Online Policy Distillation for Multi-Game Reinforcement Learning – Donal Byrne, Marko Tot, Paul Duckworth, Clement Bonnet, Alexandre Laterre and Thomas Barrett
    Short Papers
        Play-Style Identification Using Low-Level Representations of Play Traces in MicroRTS – Ruizhe Yu Xia, Jeremy Gow and Simon Lucas
        Constraint Propagation for Reasoning in Single-player Deduction Games – Fandi Meng, Kaijie Xu and Simon Lucas
    Competitions
        PlanetWars AI Challenge – Simon Lucas
        Tabletop Games Balancing Competition – George Long

These successes showcase the Queen Mary Game AI Group's international reputation at the forefront of artificial intelligence and games research, and their continuing influence in shaping the future of the field. 

And a big thank you to our IGGI CDT for their help and support with this.</description>
            <category>Public news</category>
            <pubDate>Wed, 10 Sep 2025 23:00:00 +0100</pubDate>
            <guid>news5053</guid>
        </item>
        <item>
            <title>CMAI at ISMIR 2025</title>
            <link>https://www.seresearch.qmul.ac.uk/cmai/news/5038/cmai-at-ismir-2025/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/1f63c29bf8c9e3effafc2164312893eb.jpg&quot; /&gt;

&lt;br&gt;On 21-25 September 2025, several CMAI researchers will participate in the 26th International Society for Music Information Retrieval Conference (ISMIR 2025). ISMIR is the leading conference in the field of music informatics, and is currently the top-cited publication venue for Music &amp; Musicology (source: Google Scholar). This year, ISMIR will take place onsite in Daejeon, Korea.

Similar to previous years, the Centre for Multimodal AI will have a strong presence at ISMIR 2025.


In the Scientific Programme, the following papers are authored/co-authored by CMAI members:


    Audio Synthesizer Inversion in Symmetric Parameter Spaces with Approximately Equivariant Flow Matching (Ben Hayes, Charalampos Saitis, György Fazekas)
    SLAP: Siamese Language Audio Pretraining without Negative Samples for Music Understanding (Julien Guinot, Alain Riou, Elio Quinton, György Fazekas)
    GD-Retriever: Controllable Generative Text Music Retrieval with Diffusion Models (Julien Guinot, Elio Quinton, György Fazekas)
    Instruct-MusicGen: Unlocking Text to Music Editing for Music Language Models via Instruction Tuning (Yixiao Zhang, Yukara Ikemiya, Woosung Choi, Naoki Murata, Marco A. Martínez Ramírez, Liwei Lin, Gus Xia, Wei-Hsiang Liao, Yuki Mitsufuji, Simon Dixon)
    Scaling Self Supervised Representation Learning for Symbolic Piano Performance (Louis Bradshaw, Honglu Fan, Alexander Spangher, Stella Biderman, Simon Colton)
    Codicodec: Unifying Continuous and Discrete Compressed Representations of Audio (Marco Pasini, Stefan Lattner, György Fazekas)
    MIDI-VALLE: Improving Expressive Piano Performance Synthesis through Neural Codec Language Modelling (Jingjing Tang, Xin Wang, Zhe Zhang, Junichi Yamagishi, Geraint Wiggins, György Fazekas)
    Universal Music Representations? Evaluating Foundation Models on World Music Corpora (Charilaos Papaioannou, Emmanouil Benetos, Alexandros Potamianos)
    Perceptual Errors in Music Source Separation: Looking Beyond SDR Averages (Saurjya Sarkar, Victoria Moomjian, Basil Woods, Emmanouil Benetos, Mark Sandler)
    GOAT: a Large Dataset of Paired Guitar Audio Recordings and Tablatures (Jackson Loth, Pedro Sarmento, Saurjya Sarkar, Zixun Guo, Mathieu Barthet, Mark Sandler)
    CMI-Bench: a Comprehensive Benchmark for Evaluating Music Instruction Following (Yinghao Ma, Siyou Li, Juntao Yu, Emmanouil Benetos, Akira Maezawa)
    Assessing the Alignment of Audio Representations with Timbre Similarity Ratings (Haokun Tian, Stefan Lattner, Charalampos Saitis)
    Improving Neural Pitch Estimation with SWIPE Kernels (David Marttila, Joshua D. Reiss)
    Refining Music Sample Identification with a Self Supervised Graph Neural Network (Aditya Bhattacharjee, Ivan Meresman Higgs, Mark Sandler, Emmanouil Benetos)



The following Tutorials will be co-presented by CMAI PhD students Rodrigo Diaz and Julien Guinot:


    Differentiable Physical Modeling Sound Synthesis: Theory, Musical Application, and Programming (Jin Woo Lee, Stefan Bilbao, Rodrigo Diaz)
    Self-supervised Learning for Music - An Overview and New Horizons (Julien Guinot, Alain Riou, Yuexuan Kong, Marco Pasini, Gabriel Meseguer-Brocal, Stefan Lattner)



The following journal papers, published in TISMIR and co-authored by CMAI members, will be presented at the conference:


    Predicting Eurovision Song Contest Results: A Hit Song Science Approach (Katarzyna Adamska, Joshua Reiss)
    The GigaMIDI Dataset with Features for Expressive Music Performance Detection (Keon Ju Lee, Jeff Ens, Sara Adkins, Pedro Sarmento, Mathieu Barthet, Philippe Pasquier)



As part of the MIREX public evaluations:


    CMAI PhD student Yinghao Ma is task captain for the Music Reasoning QA, Audio Beat Tracking, and Audio Key Detection tasks
    CMAI PhD student Huan Zhang is task captain for RenCon 2025: Expressive Performance Rendering Competition


Finally, on the organisational side:


    CMAI PhD student Chin-Yun Yu is Virtual Co-Chair for the ISMIR 2025 conference.
    CMAI PhD student Yinghao Ma is co-organising the satellite workshop LLM4MA: Large Language Models for Music &amp; Audio



See you in Daejeon!</description>
            <category>Public news</category>
            <pubDate>Sun, 07 Sep 2025 23:00:00 +0100</pubDate>
            <guid>news5038</guid>
        </item>
        <item>
            <title>CMAI organises AES AIMLA 2025 conference</title>
            <link>https://www.seresearch.qmul.ac.uk/cmai/news/5033/cmai-organises-aes-aimla-2025-conference/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/4c3e3835a3bc6ad9f7d8a77915c8904c.jpg&quot; /&gt;

&lt;br&gt;The AES International Conference on Artificial Intelligence and Machine Learning for Audio (AIMLA 2025) will be hosted by the Centre for Multimodal AI of Queen Mary University of London and will take place on 8-10 September 2025.

Several CMAI members are involved in the organisation of the conference, including but not limited to:


    Josh Reiss (General Chair)
    George Fazekas (Papers Co-chair)
    Soumya Vanka (Special Sessions Co-Chair)
    Franco Caspe (Special Sessions Co-Chair)
    Farida Yusuf (Sponsorship Chair)
    Emmanouil Benetos (Publicity Chair)
    Nelly Garcia (Social Events Coordinator)
    Ilias Ibnyahya (Treasurer)
    Chin-Yun Yu (Late Breaking Papers Chair)
    Marikaiti Primenta (Invited Speakers Chair)


CMAI members will also contribute several papers and presentations at AIMLA. The following peer-reviewed papers will be presented at the conference:


    NablAFx: A Framework for Differentiable Black-box and Gray-box Modeling of Audio Effects, by Marco Comunità, Christian Steinmetz, Joshua Reiss
    Transfer Learning for Neural Modelling of Nonlinear Distortion Effects, by Tara Vanhatalo, Pierrick Legrand, Myriam Desainte-Catherine, Pierre Hanna, Guillaume Pille, Antoine Brusco, Joshua Reiss
    Sound Matching an Analogue Levelling Amplifier Using the Newton-Raphson Method, by Chin-Yun Yu, George Fazekas
    Procedural Music Generation Systems in Games, by Shangxuan Luo, Joshua Reiss
    Neutone SDK: An Open Source Framework for Neural Audio Processing, by Christopher Mitcheltree, Bogdan Teleaga, Andrew Fyfe, Naotake Masuda, Matthias Schäfer, Alfie Bradic, Nao Tokui


The following late-breaking posters from CMAI members will be presented at AIMLA:


    Transformer-Based Sustain Pedal Reconstruction for Expressive Piano Performance MIDI, by Wenhao Liu, George Fazekas, Jingjing Tang
    Decoding Melodic Acoustic Features from Neural Data, by Zorka Bozilovic, Iran Roman
    Towards Intelligent Music Education: Score-Informed Transcription and Performance Assessment, by Jack Loth, Marikaiti Primenta, Jingjing Tang, Xavier Riley, Simon Dixon, Emmanouil Benetos


Last but not least, the following tutorial will be co-presented by CMAI PhD student Franco Caspe:


    Real-Time Neural Audio Inference, by Franco Caspe and Jatin Chowdhury


See you in London!</description>
            <category>Public news</category>
            <pubDate>Tue, 02 Sep 2025 23:00:00 +0100</pubDate>
            <guid>news5033</guid>
        </item>
        <item>
            <title>CMAI student to join the Alan Turing Institute in 2025-2026</title>
            <link>https://www.seresearch.qmul.ac.uk/cmai/news/5015/cmai-student-to-join-the-alan-turing-institute-in-2025-2026/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/84457edde2978607b183e69bea1840e1.jpg&quot; /&gt;

&lt;br&gt;CMAI PhD student Aditya Bhattacharjee has been awarded an enrichment placement by the Alan Turing Institute, the UK's national institute for data science and artificial intelligence, enabling him to join and interact with institute researchers and its community in the 2025/26 academic year. Aditya is supervised by Dr Emmanouil Benetos and will be entering the final year of his PhD. Aditya's placement will be hosted by the Turing's Fundamental Research in Data Science and AI programme.

Congratulations to Aditya!</description>
            <category>Public news</category>
            <pubDate>Tue, 29 Jul 2025 23:00:00 +0100</pubDate>
            <guid>news5015</guid>
        </item>
        <item>
            <title>CMAI at IJCNN 2025 conference</title>
            <link>https://www.seresearch.qmul.ac.uk/cmai/news/4995/cmai-at-ijcnn-2025-conference/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/a58a8504bc44da5ee614b02910398e33.jpg&quot; /&gt;

&lt;br&gt;On 30 June - 5 July 2025, CMAI researchers will participate in the IEEE International Joint Conference on Neural Networks (IJCNN 2025), the flagship conference of the IEEE Computational Intelligence Society and the International Neural Network Society.

The Centre for Multimodal AI will have a strong presence at the conference. The following papers authored/co-authored by CMAI members will be presented at IJCNN 2025:


    VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning, by Alexandros Xenos, Niki Foteinopoulou, Ioanna Ntinou, Ioannis Patras, Georgios Tzimiropoulos
    ImprovNet - Generating Controllable Musical Improvisations with Iterative Corruption Refinement, by Keshav Bhandari, Sungkyun Chang, Tongyu Lu, Fareza Rahman Enus, Louis Bradshaw, Dorien Herremans, Simon Colton
    Mamba-Diffusion Model with Learnable Wavelet for Controllable Symbolic Music Generation, by Jincheng Zhang, George Fazekas, Charalampos Saitis
    Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks, by Christos Plachouras, Julien Guinot, George Fazekas, Elio Quinton, Emmanouil Benetos, Johan Pauwels



The following presentation from CMAI members will also be made at IJCNN 2025:


    Split Fine-Tuning of BERT-based Music Models in the Edge-Cloud Continuum: An Empirical Analysis, by Bradley Aldous, Wai Fong Tam, Ahmed M. A. Sayed


 

See you in Rome!</description>
            <category>Public news</category>
            <pubDate>Fri, 13 Jun 2025 23:00:00 +0100</pubDate>
            <guid>news4995</guid>
        </item>
        <item>
            <title>CMAI best paper award at EvoMUSART 2025</title>
            <link>https://www.seresearch.qmul.ac.uk/cmai/news/4970/cmai-best-paper-award-at-evomusart-2025/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/b77da60f8aff9865d0a75e25f6644e22.jpg&quot; /&gt;

&lt;br&gt;The 14th International Conference on Artificial Intelligence in Music, Sound, Art and Design (EvoMUSART), part of Evostar, took place in Trieste, Italy, between 23 and 25 April 2025.

We are pleased to announce that the following paper, first-authored by CMAI PhD student Keshav Bhandari, received the best paper award!

Yin-Yang: Developing Motifs With Long-Term Structure And Controllability, Keshav Bhandari, Geraint A. Wiggins, Simon Colton

Yin-Yang is a neuro-symbolic framework that combines three transformer models to generate structured melodies with coherent long-term development, while allowing user control over musical themes and variations.</description>
            <category>Public news</category>
            <pubDate>Sun, 11 May 2025 23:00:00 +0100</pubDate>
            <guid>news4970</guid>
        </item>
    </channel>
</rss>