<?xml version="1.0" encoding="ISO-8859-1"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <atom:link href="https://www.seresearch.qmul.ac.uk/cfcs/news/" rel="self" type="application/rss+xml" />
        <title>QMUL Centre for Fundamentals of AI and Computational Theory News</title>
        <description>Here's the latest news from The Centre for Fundamentals of AI and Computational Theory at QMUL</description>
        <link>https://www.seresearch.qmul.ac.uk/cfcs/news/</link>
        <lastBuildDate>Sat, 06 Jun 2026 16:52:51 +0100</lastBuildDate>
        <image>
            <url>https://www.seresearch.qmul.ac.uk/design_local/images/SITE_QMUL_square_logo.png</url>
            <title>QMUL Centre for Fundamentals of AI and Computational Theory News</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/</link>
            <description>News from Centre for Fundamentals of AI and Computational Theory - click to visit</description>
        </image>
        <webMaster>QMUL S&amp;E Research Centres Webmaster (m.m.knight@qmul.ac.uk)</webMaster>
        <item>
            <title>Pasquale Malacaria and Yunxiao Zhang: Strategic Decision-Making in Uncertain Turn-Based ...</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5549/pasquale-malacaria-and-yunxiao-zhang-strategic-decision-making-in-uncertain-turn-based-security-games/</link>
            <description>Pasquale Malacaria and Yunxiao Zhang have a paper published in a top security journal, IEEE Transactions on Information Forensics and Security (TIFS), on Strategic Decision-Making in Uncertain Turn-Based Security Games.

The paper studies the problem of cybersecurity decision making. It extends previous Leader-Follower game-theoretical models where the leader is the defender and the follower is the attacker (these are called Stackelberg games; see point 1 below). There are two types of uncertainty here:

a) the state of the attacker; at each turn of the game the attacker may be in several possible states.

b) the values of the security parameters (see point 2 below).

In this work we model both uncertainties in terms of possible worlds, i.e. each admissible instantiation of the uncertainties generates a game in a possible world; hence the model can be described as a Leader-multiple-Followers game in which the defender has to choose a strategy that is &quot;optimal&quot; in all possible worlds. We introduce a solution concept for this type of game as a minimization of the geometric mean across all possible worlds.
We show fundamental mathematical properties of this game solution:

(a) it is Pareto optimal,

(b) it is equivalent to a standard leader-follower game over the sequence of possible worlds, and

(c) it is robust.

The solution is inspired by information theoretical optimal strategies in financial investment, in particular the Kelly criterion (see point 3 below).

We validate our approach through experiments, which show that our solutions outperform classic robust optimization solutions such as minmax regret.

Point 1) The attacker is the follower because he can observe the strategy of the defender (the leader) before attacking; hence the rational leader has to plan his defence against all possible follow-up strategies of the attacker. It is a min-max problem (bi-level optimization).

Point 2) Dealing with this type of uncertainty is the big problem for these mathematical models, e.g. what is the effectiveness of a firewall? 60%? 80%? We can't run real-world experiments to get reliable statistics, so these numbers are mostly experts' educated guesses.

Point 3) Kelly proved that the optimal betting strategy (over repeated betting where all capital is reinvested each time) is a strategy maximising the geometric mean, not the arithmetic mean. This accounts for the fact that gains and losses compound (or multiply) over time. When money compounds, a big loss hurts you much more than an equally sized gain helps you. For example, losing 50% of your portfolio requires a 100% gain to get back to even. In our case we want to minimise the risk across all possible worlds, and the above intuition applies to our problem setting.

Reference

P. Malacaria and Y. Zhang (2026) &quot;Strategic Decision-Making in Uncertain Turn-Based Security Games,&quot; in IEEE Transactions on Information Forensics and Security, DOI: 10.1109/TIFS.2026.3698586</description>
            <category>Public news</category>
            <pubDate>Sun, 31 May 2026 23:00:00 +0100</pubDate>
            <guid>news5549</guid>
        </item>
        <item>
            <title>Przemek Wałęga paper on graph neural networks awarded &quot;spotlight paper&quot; distinction</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5518/przemek-waga-paper-on-graph-neural-networks-awarded-spotlight-paper-distinction/</link>
            <description>Przemek Wałęga is part of a team that has had a paper accepted at the Forty-Third International Conference on Machine Learning (ICML) in Seoul, South Korea, July 6th - 11th, 2026. ICML is one of the three main machine learning conferences.

Their paper shows that graph neural networks have &quot;winning tickets&quot;: small subnetworks with the same performance as the full network. The paper was awarded the &quot;spotlight paper&quot; distinction, which was given to the top 2.2% of papers.

Reference:
Lorenz Kummer, Samir Moustafa, Anatol Ehrlich, Franka Bause, Marco Nennstiel, Przemysław A Wałęga, Nils Morten Kriege, A Unifying Relational Perspective on Expressive Lottery Tickets, ICML, 2026</description>
            <category>Public news</category>
            <pubDate>Wed, 13 May 2026 23:00:00 +0100</pubDate>
            <guid>news5518</guid>
        </item>
        <item>
            <title>Przemek Wałęga has 2 papers accepted at the top conference on Knowledge Representation</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5519/przemek-waga-has-2-papers-accepted-at-the-top-conference-on-knowledge-representation/</link>
            <description>Przemysław Wałęga has had two full papers (and two further extended abstract papers) accepted for the 23rd International Conference on Principles of Knowledge Representation and Reasoning, July 20-23, 2026 - Lisbon, Portugal. KR is the main conference on knowledge representation and reasoning, and takes place as part of FLoC.

The first paper is about the expressive power of graph neural networks, depending on the form of the aggregation function (used to aggregate information about neighbours in a graph), which may be SUM, MEAN, MAX, or arbitrary. The landscape turns out to be non-trivial: we obtain different results depending on whether the GNN has &quot;global readout&quot; (ACR-GNN) and whether or not it is simple (uses a single perceptron as a combination function).

The second KR paper is the result of a collaboration with Mathijs van Noort, who visited QMUL in 2025. They have shown that instead of introducing new solvers for temporal logic programming, we can easily extend standard (non-temporal) logic programming solvers. Thanks to the availability of mature non-temporal solvers, this approach is competitive even with dedicated temporal reasoners. This is surprising and suggests that the approach is practically useful.

References
S. P. Hauke, P. A. Wałęga, How Aggregation Functions Affect the Uniform Expressiveness of Graph Neural Networks, KR, 2026


M. van Noort, P. A. Wałęga, Efficient Temporal Reasoning with Non-Temporal Engines: Embedding DatalogMTL into Datalog, KR, 2026</description>
            <category>Public news</category>
            <pubDate>Thu, 07 May 2026 23:00:00 +0100</pubDate>
            <guid>news5519</guid>
        </item>
        <item>
            <title>Vasileios Klimis: Turning a Compiler Against Itself</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5475/vasileios-klimis-turning-a-compiler-against-itself/</link>
            <description>Vasileios Klimis will be presenting a paper at the 34th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, to be held 5 - 9 July 2026 in Montreal, Canada.

It introduces a new way to validate compilers and so make all software created using them potentially more reliable.

Abstract:

Modern software, from the network switches that route our internet traffic to the graphics cards that render our games, relies on highly specialised programs translated by even more complex software called compilers. A bug in a compiler is like a flaw in a factory blueprint: every product made from it can be subtly broken in ways that are incredibly difficult to detect. Traditional methods for finding these bugs often require a perfect &quot;master copy&quot; to compare against, which is frequently unavailable.

This research introduces a new validation principle called compilomorphism, a term blending &quot;compiler&quot; and the mathematical concept of &quot;isomorphism&quot;. An isomorphism describes a structure-preserving mapping between two objects -- a way of saying two things are different in representation but identical in structure. We apply this idea to compiler testing: a correct compiler should act as an isomorphism, preserving the essential behaviour of a program even when presented with different but semantically equivalent versions.

This work uses this principle to turn a compiler against itself. This method automatically generates multiple program variants that are structurally different but logically identical. When the compiler processes these variants, the results should be functionally indistinguishable. If they are not, the compiler has failed to act as an isomorphism, revealing a deep semantic bug. This self-consistency check acts as a built-in oracle for correctness. This work provides the formal foundation for this idea and demonstrates its feasibility with a working prototype, opening a new direction for automatically ensuring the reliability of critical software infrastructure.

Reference:

Klimis V. Compilomorphic Fuzzing: Turning a Compiler Against Itself. In Proceedings of the 34th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (FSE Companion '26), Montreal, QC, Canada, 2026.</description>
            <category>Public news</category>
            <pubDate>Tue, 28 Apr 2026 23:00:00 +0100</pubDate>
            <guid>news5475</guid>
        </item>
        <item>
            <title>Vasileios Klimis: Noise Fingerprints and Quantum Simulators</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5471/vasileios-klimis-noise-fingerprints-and-quantum-simulators/</link>
            <description>Vasileios Klimis is part of a team that has had a paper accepted for the 19th IEEE International Conference on Software Testing, Verification and Validation, to be held in Daejeon, Republic of Korea, 18-22 May.

It concerns Noise Fingerprints for Cross-Platform Quantum Simulator Discrepancy Analysis. Quantum algorithms are tested on quantum simulators, but different simulators behave differently on the same algorithm because of how they model noise. This work identifies the noise profiles of simulators, surfacing the otherwise hidden differences.

Abstract

Before running code on a real, multi-million dollar quantum computer, engineers first test their algorithms on software simulators. These 'virtual labs' are essential for getting things right, but they have a hidden problem: different simulators, even when given identical instructions, can behave in subtly different ways due to how they model the &quot;noise&quot; inherent in quantum systems. This is like having two different brands of a scientific calculator giving different answers to the same complex equation, undermining the reliability and reproducibility of research.
Our work introduces a new paradigm to solve this problem: &quot;noise fingerprinting&quot;. Just as a human fingerprint is a unique identifier, our method, SimShadow, generates a unique digital signature that reveals the precise character of a simulator's noise. To achieve this, we adapt a cutting-edge technique from quantum physics called &quot;classical shadow tomography&quot; -- a highly efficient method for taking detailed 'snapshots' of a quantum system's behaviour without needing to measure everything, which would be impossibly slow.
This 'fingerprint' acts as a definitive report card. By comparing the fingerprints of different simulators, we can for the first time see and quantify their hidden differences, turning an opaque problem into a measurable one. SimShadow brings a much-needed layer of scientific rigour and quality control to the world of quantum software, paving the way for more reliable, reproducible, and robust quantum technologies.

Reference

Bensoussan A., Chachkarova E., Even-Mendoza K., Fortz S., Klimis V., Reza Mousavi M. Noise Fingerprints for Cross-Platform Quantum Simulator Discrepancy Analysis. In proceedings of the 19th IEEE International Conference on Software Testing, Verification and Validation 2026 (ICST '26), Daejeon, Republic of Korea.</description>
            <category>Public news</category>
            <pubDate>Mon, 27 Apr 2026 23:00:00 +0100</pubDate>
            <guid>news5471</guid>
        </item>
        <item>
            <title>Flip the Script: supporting better university progression</title>
            <link>https://www.seresearch.qmul.ac.uk/chcc/news/5461/flip-the-script-supporting-better-university-progression/</link>
            <description>Paul Curzon and Edmund Robinson were part of the expert panel at the Flip the Script participatory design workshop at the British Computer Society offices. The aim of the meeting was to co-develop a new practice model that can help Computing departments across the UK to support better progression and completion of their undergraduate students.

The workshop with staff, led by Professor Louise Archer from University College London (UCL), is part of a development process, which also involves workshops with students, that aims to co-develop the model. The aim was to ensure the ultimate model is high quality, accessible, close to practice and grounded in the needs of participating departments and the communities they serve. From this workshop, the intention was to refine the draft model and produce a version that will be ready for application in the Autumn.</description>
            <category>Public news</category>
            <pubDate>Mon, 20 Apr 2026 23:00:00 +0100</pubDate>
            <guid>news5461</guid>
        </item>
        <item>
            <title>Vasileios Klimis presents paper on &quot;Beyond Specification Conformance&quot; in Rio de Janeiro</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5444/vasileios-klimis-presents-paper-on-beyond-specification-conformance-in-rio-de-janeiro/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/f85ff7f668c0a50e3695f06d8ab8fa18.jpg&quot; /&gt;

&lt;br&gt;Vasileios Klimis has presented a paper on &quot;Beyond Specification Conformance&quot; at the 48th IEEE/ACM International Conference on Software Engineering in Rio de Janeiro

It introduces a new way, based on a mathematical logic, that complements specifications to take into account the needs of users to improve the quality of software.

Abstract

We build software to serve people. Yet time and again, systems pass every test, meet every requirement, and still fail the people who use them. Not because of bugs. Not because of careless engineering. But because the software did exactly what it was told -- and what it was told was never the whole story. This work asks a simple but surprisingly hard question: how do we formally measure the gap between what a system does and what people actually expected it to do? I introduce Semantic Expectation Logic (SEL) -- a framework that treats stakeholder expectations as a first-class, formally reasoned artifact. Not buried in a wiki. Not assumed in a style guide. Not whispered in a standup. Formally captured, systematically elicited, and quantifiably tested. SEL does not replace specifications. It addresses what specifications were never designed to hold - the conventions, assumptions, and shared understanding that developers, users, and operators carry in their heads, and that no test suite has ever been written to check. The result is a new way of thinking about software quality. Not just: did we build the system right? But: did we build the right system -- for the people who have to live with it?

Read the Computer Science for Fun (CS4FN) article on the paper.

Read the paper itself.

Reference

Klimis V. Beyond Spec Conformance: A Logic for Validating Stakeholder Expectations. In IEEE/ACM 48th International Conference on Software Engineering (ICSE-NIER), Rio de Janeiro, Brazil, 2026.</description>
            <category>Public news</category>
            <pubDate>Thu, 16 Apr 2026 23:00:00 +0100</pubDate>
            <guid>news5444</guid>
        </item>
        <item>
            <title>1st AI: Brains and Bits Symposium</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5443/1st-ai-brains-and-bits-symposium/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/48cd52401fba506e4657f7960d398199.jpg&quot; /&gt;

&lt;br&gt;The first QMUL AI: Brains and Bits symposium was held on 13 April 2026. It brought together researchers from across Science and Engineering to discuss the Fundamentals of AI, including issues of how AI works and how it should work. It included talks from investigators from Biology, the Blizard Institute, Physics, Maths, Electronic Engineering and Computer Science, as well as whole-group and small-group discussion sessions, and so brought wide-ranging disciplinary viewpoints to bear on the fundamentals of AI.

Topics covered ranged from animal-based neuroscience and the fundamental principles of human learning to collective AI and the social behaviour of machines.

Abhishek Banerjee first spoke about the neuroscience behind active learning in mammals, driven by life challenges, resources and social interaction. Research on rats, for example, allows parts of the brain to be switched off, helping us to understand deeply how mammalian brains work and so how AI could. Computer science models can now help us understand this better, which might in turn lead to a better understanding of future agentic AI. Early models were based on feed-forward mechanisms from sensory input to flexible behaviour. Models are now bidirectional, and this happens at the neuron level as well as at higher levels.

Iran Roman, by contrast, spoke about models of machine learning that learn dynamically. Modern machine learning is based on back propagation but older models were more sophisticated. Rather than having a passive learning phase followed by an action phase in which the learning is used, in dynamic models learning continues to happen, so the model is reactive. Results show this can be as effective as current algorithms and has the potential to be a more flexible approach for the future as it is interpretable, controllable and optimizable.

Andrea Benucci then discussed the way the human brain works from a psychology point of view and how the brain processes visual perception experiences through two separate streams, one focussing on the 'what' and another on the 'why', that follow different pathways in the brain. Our brains also take input about both bodily motion and eye motion (top-down motor signals) to maintain a stable perception of what is being seen (a bottom-up visual signal), rather than just working from a visual signal. This understanding has applications in embodied agent systems such as robotic and self-driving car systems, giving ideas for new architectures and so new ways for AI to work.

Vito Latora finished the first session with a network science take on the fundamentals of AI, talking about how there is limited work on collective intelligence and using the wisdom of the crowds. This linked to later points made by David Berman about aiming for social intelligence rather than always aiming for &quot;bigger brains&quot; in AI. The former is likely to be the long-term way forward, so more research is needed on understanding the fundamentals, such as how behaviour spreads, what collective AI will look like and how human-AI collective intelligence might work.

In the afternoon, Chris White first spoke about the ongoing importance of meta-science and how AI is able to contribute. To understand public attitudes to science, for example, what is needed is to understand people's stances on issues from social media posts and the like. What is generally done instead is sentiment analysis, which is consistently bad at predicting stance.

David Berman outlined his personal journey moving from studying string theory to working in industry applying AI and physics-informed approaches, including for formal mathematics. He raised the issue of understanding the scaling properties of systems: do we scale communication speeds, architecture or the number of people connected? For example, if we double the speed of communicating, do we double the complexity of the emergent society? The rate at which AI is currently scaling is essentially insane: doubling every 7 months. Projects need to take the rapid changes into account.

Boris Khoruzhenko then discussed modelling of complex systems and how interactions can be replaced by random matrices or random functions to investigate them, answering questions such as whether a large system will be stable, e.g. whether small disturbances will only lead to small changes.

Finally, before the participants split into small groups to discuss issues that emerged, Adrian Baule talked about the application of statistical mechanics of non-equilibrium systems and how the behaviour of a system emerges from the behaviour of its components. AI models can potentially help us understand such systems, and the same kind of understanding is needed of the properties that emerge from network-based and social AI architectures.

The separate talks led to a great deal of interdisciplinary discussion about the issues arising, with potential for a range of future collaborations.</description>
            <category>Public news</category>
            <pubDate>Sun, 12 Apr 2026 23:00:00 +0100</pubDate>
            <guid>news5443</guid>
        </item>
        <item>
            <title>Raymond Hu awarded a grant to investigate Distributed Dynamic Software Updates worth £646,000 ...</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5419/raymond-hu-awarded-a-grant-to-investigate-distributed-dynamic-software-updates-worth-646-000-working-with-monzo-bank-and-sap/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/6514ab9fa5f4f257817eb4bc77c47756.jpg&quot; /&gt;

&lt;br&gt;Raymond Hu has been awarded a New Investigator research grant worth £646,000 by EPSRC. The project is called &quot;DymSUM - Distributed Dynamic Software Updates using Multiparty Session Types&quot; and investigates a theory-based approach to how distributed software can evolve while running. The project involves collaboration with industrial partners at Monzo Bank and SAP.

Distributed systems are at the heart of our infrastructures and society, encompassing, for example, the many Internet and mobile applications that we rely on in our daily lives. A crucial characteristic of modern Distributed Systems is dynamic evolution: many important Distributed Systems are designed to evolve - while the system remains running - in both their program source code (e.g., updates to add or improve features and fix bugs) and execution configuration (e.g., migration of communication links and processes).

The project focuses on what are known as Multiparty session types (MSTs). They are a type-systems approach to message-passing programming. This offers a theoretical grounding for formalising communication protocols and verifying that distributed programs are protocol-compliant. This project will investigate the development of MST-based techniques and tools for formal specification and safe implementation of dynamically-evolving Distributed Systems (DDS). The project will support a PDRA and a PhD for three years, starting later this year.</description>
            <category>Public news</category>
            <pubDate>Mon, 30 Mar 2026 23:00:00 +0100</pubDate>
            <guid>news5419</guid>
        </item>
        <item>
            <title>Humans vs Vision-Language Models</title>
            <link>https://www.seresearch.qmul.ac.uk/chcc/news/5462/humans-vs-vision-language-models/</link>
            <description>Shalom Lappin is part of a team centred at the Centre for Linguistic Theory and Studies in Probability (CLASP) at the University of Gothenburg that has published new work on a comparison of Humans vs Vision-Language Models.

It proposes a unified way to measure narrative coherence in writing about sequences of visual scenes. The experimental results suggest that human descriptions have more coherence, across different dimensions, than Large Language Model (LLM) ones, despite the fluency of the latter. Human writing about visual narratives shows significantly more elements of surprise.
 

Abstract

We study narrative coherence in visually grounded stories by comparing human-written narratives with those generated by vision-language models (VLMs) on the Visual Writing Prompts corpus. Using a set of metrics that capture different aspects of narrative coherence, including coreference, discourse relation types, topic continuity, character persistence, and multimodal character grounding, we compute a narrative coherence score. We find that VLMs show broadly similar coherence profiles that differ systematically from those of humans. In addition, differences for individual measures are often subtle, but they become clearer when considered jointly. Overall, our results indicate that, despite human-like surface fluency, model narratives exhibit systematic differences from those of humans in how they organise discourse across a visually grounded story. Our code is available at this https URL.

Reference

Nikolai Ilinykh, Hyewon Jang, Shalom Lappin, Asad Sayeed, Sharid Loáiciga (2026) Humans vs Vision-Language Models: A Unified Measure of Narrative Coherence, arXiv, March. DOI: 10.48550/arXiv.2603.25537</description>
            <category>Public news</category>
            <pubDate>Thu, 26 Mar 2026 00:00:00 +0100</pubDate>
            <guid>news5462</guid>
        </item>
        <item>
            <title>Søren Riis and Marc Roth are joint winners of &quot;Humanity's Last Exam&quot; Competition</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5411/sren-riis-and-marc-roth-are-joint-winners-of-humanity-s-last-exam-competition/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/7783680837fb781610f015036e34da09.jpg&quot; /&gt;

&lt;br&gt;Søren Riis and Marc Roth are joint winners of the SafeAI Benchmark Competition &quot;Humanity's Last Exam&quot;.

The linked paper &quot;A benchmark of expert-level academic questions to assess AI capabilities&quot;, co-authored by FACT researchers Søren Riis and Marc Roth, has been published in Nature. This paper accompanies the benchmark set created in Humanity's Last Exam (HLE), an initiative designed to push artificial intelligence to its limits by challenging it with expert-level questions.

AI systems are typically evaluated based on benchmark questions that assess their intelligence and performance. However, as AI models have rapidly advanced, existing benchmarks have become too easy. The HLE competition aimed to change this by curating a new benchmark set of exceptionally difficult questions.

The competition attracted more than 1,000 researchers and experts, who submitted questions spanning over 100 subjects. The selection process involved three stages:

1. AI Evaluation: five of the best AI models (late 2024) attempted each question. If all failed, the question advanced.

2. Expert Review: experts refined and assessed the questions and answers.

3. Final Selection: a panel of experts and organisers made the final call.

Out of over 70,000 questions submitted to stage 1, only 2,500 made it into the final benchmark, with the top 50 declared as winners, each earning a prize. All contributors were invited to join the paper accompanying the competition as co-authors.

Søren and Marc were the sole participants from QMUL. Both contributed multiple questions, and both are joint winners of the competition. Moreover, one of Marc Roth's questions has further been selected to be featured in the Nature paper.

At the time the first version of the benchmark set was finalised (early 2025), the best performing AIs were OpenAI o1 and DeepSeek R1, which answered, respectively, 8% and 8.5% of the questions correctly. One year later, Gemini 3 Pro achieved a staggering 38.3%. In fact, the true performance might be even better since the benchmark set might still contain a small fraction of ambiguous questions and questions where the given expert answers are partially incomplete or incorrect, mainly in the areas of text-only chemistry and biology questions. The HLE team has therefore transitioned to a dynamic rolling basis for quality control and improvement over the coming years.

Read more about it in our article on the competition for the general public.</description>
            <category>Public news</category>
            <pubDate>Fri, 20 Mar 2026 00:00:00 +0100</pubDate>
            <guid>news5411</guid>
        </item>
        <item>
            <title>The effect of images on the ability of Large Language Models to predict human judgements of how ...</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5408/the-effect-of-images-on-the-ability-of-large-language-models-to-predict-human-judgements-of-how-acceptable-sentences-are/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/6378e282d9613b9582facc9245e495f9.jpg&quot; /&gt;

&lt;br&gt;A new paper, &quot;Predicting Sentence Acceptability Judgments in Multimodal Contexts&quot;, by a team including Shalom Lappin from the Centre for Fundamentals of AI and Computational Theory, explores the effect of images on the ability of Large Language Models (LLMs) to predict the ratings humans give to sentences for how acceptable they are. For example, how acceptable is each of these sentences (given an image to give context)?


    &quot;But the answer seems to be no: Reeves lets it be known she requests no costings on raising the three forbidden taxes.&quot; (original sentence)
    &quot;But the reply does not seem: reeves suggests that it would not cost to raise the three banned taxes.&quot; (modified sentence)


The team found that, unlike when a written context is given, images that provide context had little if any impact on the ratings given by humans of how acceptable the sentences were.

Different kinds of LLMs were able to predict human acceptability judgments very accurately. However, in general, their performance was slightly better when images were removed! Moreover, the distribution of LLM judgments varies among models. The LLM Qwen resembled human patterns, but others diverged from them.

This experimental work suggests that a larger gap exists between the internal representations of LLMs and their generated predictions when images are present to give visual contexts. It suggests interesting points of similarity and of difference between the way humans and LLMs process sentences in multimodal contexts.

Read the paper &quot;Predicting Sentence Acceptability Judgments in Multimodal Contexts&quot; on arXiv</description>
            <category>Public news</category>
            <pubDate>Thu, 19 Mar 2026 00:00:00 +0100</pubDate>
            <guid>news5408</guid>
        </item>
        <item>
            <title>Przemek Wałęga has three papers published at Artificial Intelligence conference, AAAI 2026</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5520/przemek-waga-has-three-papers-published-at-artificial-intelligence-conference-aaai-2026/</link>
            <description>Przemek Wałęga had three papers published in the Proceedings of the AAAI Conference on Artificial Intelligence, 2026, with the papers presented at the conference, held January 20–27, 2026, in Singapore.

Abstract

Graph Neural Networks (GNNs) address two key challenges in applying deep learning to graph-structured data: they handle varying size input graphs and ensure invariance under graph isomorphism. While GNNs have demonstrated broad applicability, understanding their expressive power remains an important question. In this paper, we propose GNN architectures that correspond precisely to prominent fragments of first-order logic (FO), including various modal logics as well as more expressive two-variable fragments. To establish these results, we apply methods from finite model theory of first-order and modal logics to the domain of graph representation learning. Our results provide a unifying framework for understanding the logical expressiveness of GNNs within FO.

Reference

B. Cuenca Grau, E. Feng, P. A. Wałęga, The Correspondence Between Bounded Graph Neural Networks and Fragments of First-Order Logic, AAAI, 2026
https://ojs.aaai.org/index.php/AAAI/article/view/38987

Abstract

In recent years, there has been growing interest in understanding the expressive power of graph neural networks (GNNs) by relating them to logical languages. This research has been initialised by an influential result of Barceló et al. (2020), who showed that the graded modal logic (or a guarded fragment of the logic C2), characterises the logical expressiveness of aggregate-combine GNNs. As a &quot;challenging open problem&quot; they left the question whether C2 characterises the logical expressiveness of aggregate-combine-readout GNNs. This question has remained unresolved despite several attempts. In this paper, we solve the above open problem by proving that aggregate-combine-readout GNNs can express logical classifiers beyond C2. This result holds over both undirected and directed graphs. Beyond its implications for GNNs, our work also leads to purely logical insights on the expressive power of infinitary logics.

Reference

S. P. Hauke, P. A. Wałęga, Aggregate-Combine-Readout GNNs Are More Expressive Than Logic C2, AAAI, 2026
https://ojs.aaai.org/index.php/AAAI/article/view/39308

Abstract

Definite descriptions are expressions of the form &quot;the unique x satisfying property C,&quot; which allow reference to objects through their distinguishing characteristics. They play a crucial role in ontology and query languages, offering an alternative to proper names (IDs), which lack semantic content and serve merely as placeholders. In this paper, we introduce two extensions of the well-known description logic ALC with local and global definite descriptions, denoted ALCiL and ALCiG, respectively. We define appropriate bisimulation notions for these logics, enabling an analysis of their expressiveness. We show that although both logics share the same tight ExpTime complexity bounds for concept and ontology satisfiability, ALCiG is strictly more expressive than ALCiL. Moreover, we present tableau-based decision procedures for satisfiability in both logics, provide their implementation, and report on a series of experiments. The empirical results demonstrate the practical utility of the implementation and reveal interesting correlations between performance and structural properties of the input formulas.

Reference

M. Sochański, P. A. Wałęga, M. Zawidzki, Description Logics with Two Types of Definite Descriptions: Complexity, Expressiveness, and Automated Deduction, AAAI, 2026
https://ojs.aaai.org/index.php/AAAI/article/view/39014</description>
            <category>Public news</category>
            <pubDate>Sat, 14 Mar 2026 00:00:00 +0100</pubDate>
            <guid>news5520</guid>
        </item>
        <item>
            <title>Søren Riis and Bei Zhou show AI's game-playing still has flaws</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5389/sren-riis-and-bei-zhou-show-ai-s-game-playing-still-has-flaws/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/1be1b907a0e35e193bdef4ebb38bb8d5.jpg&quot; /&gt;

&lt;br&gt;New research published in Machine Learning shows pattern learning is not enough to train AI to tackle games – and abstract representations or hybrid approaches may help.

See full story at: https://www.qmul.ac.uk/media/news/2026/science-and-engineering/se/ais-game-playing-still-has-flaws-research-shows.html</description>
            <category>Public news</category>
            <pubDate>Fri, 13 Mar 2026 00:00:00 +0100</pubDate>
            <guid>news5389</guid>
        </item>
        <item>
            <title>Przemek Wałęga: Preservation Theorems for Unravelling-Invariant Classes</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5517/przemek-waga-preservation-theorems-for-unravelling-invariant-classes/</link>
            <description>Przemek Wałęga, working with Bernardo Cuenca Grau at the University of Oxford, has devised a new way to prove both old and new results, and to solve an open problem about graph neural networks.

Paper Title

Preservation Theorems for Unravelling-Invariant Classes: A Uniform Approach for Modal Logics and Graph Neural Networks


Abstract

We have introduced a new approach in finite model theory of modal logics for proving so-called preservation theorems. Preservation theorems form a classical research topic in model theory, which was started by the famous Łoś-Tarski theorem (1954, 1955). What is interesting is that the approach allows us, in a uniform way, to re-prove well-known results (preservation theorems by Rosen and by Abramsky and Reggio), prove new results, and even solve an open problem about the expressive power of Monotonic Graph Neural Networks. The approach is based on a new result about well-quasi-orders on trees, which I find interesting per se.

Reference

Przemysław Andrzej Wałęga and Bernardo Cuenca Grau (2026) Preservation Theorems for Unravelling-Invariant Classes: A Uniform Approach for Modal Logics and Graph Neural Networks. https://arxiv.org/pdf/2602.01856</description>
            <category>Public news</category>
            <pubDate>Mon, 02 Feb 2026 00:00:00 +0100</pubDate>
            <guid>news5517</guid>
        </item>
        <item>
            <title>A new efficient algorithm for counting network motifs</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5417/a-new-efficient-algorithm-for-counting-network-motifs/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/191c28f06c2b6a3fd33100ac9dc91051.jpg&quot; /&gt;

&lt;br&gt;A paper about a new algorithm that counts important patterns in networks, by a team including Marc Roth from the Centre, has been accepted by SODA 2026, the Symposium on Discrete Algorithms (the premier venue for foundations of algorithms research).

&quot;Network Motifs&quot; are patterns in the structure of networks. A simple example of a motif is where two points are linked by a series of single hops, but where there is also a direct shortcut link that takes you there in one hop. If this pattern appears more often than expected in a network then it would be a motif of that network.

Motifs are important in real-life networks such as social networks or genetic networks (that control how molecules interact with each other inside cells). Network motifs appear more often in such real-life networks than in random networks.

The number of times certain motifs appear in a network correlates with a variety of important properties of that network as a whole, such as the ways nodes cluster together in communication networks, or how certain types of cancer are more likely in metabolic networks. Consequently, the computational problem of counting the number of times given motifs appear in large networks has received a lot of attention over the last 20 years.

So far, the state of the art applies almost exclusively to networks in which data can be represented as a graph, that is, via individual datapoints that are organised in binary connections where edges of the graph link only two nodes.

In this new paper, a novel algorithm has been developed that operates on &quot;hypergraphs&quot;, that is, on data organised in a way where one edge can link between more than two nodes, such as in relational databases.

Moreover, the authors have proved that the newly developed algorithm is optimal with respect to its running time under standard assumptions from complexity theory, related to the P vs NP conjecture. If those assumptions hold, there can be no faster algorithm that does the same thing.

The paper &quot;The Parameterised Complexity of Counting Small Sub-Hypergraphs&quot; can be found at: https://doi.org/10.1137/1.9781611978971.72</description>
            <category>Public news</category>
            <pubDate>Sat, 31 Jan 2026 00:00:00 +0100</pubDate>
            <guid>news5417</guid>
        </item>
        <item>
            <title>Raymond Hu's research on the design and implementation of the Go programming language published ...</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5423/raymond-hu-s-research-on-the-design-and-implementation-of-the-go-programming-language-published-in-popl/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/28d0f1a424d86707ef0f8bf50467475d.jpg&quot; /&gt;

&lt;br&gt;Raymond Hu and his coauthors presented their work, which is part of an ongoing collaboration with industry, on &quot;Welterweight Go: Boxing, Structural Subtyping, and Generics&quot; at the 53rd ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2026) in Rennes, France. POPL is a top A* conference on the theory, practice and implementation of programming languages.

This work is part of an ongoing collaboration with Google on the design and implementation of the Go programming language, which is widely used in industry. It builds on their prior work, which shaped the design of generics for Go, first released in Go 1.18.

In this paper, the team first develop WG, a formal model of Go expanded to include recently implemented features such as generic type unions and type sets. They also further extend it to show how popularly requested features such as generic methods can be safely incorporated. They then introduce LWG, a lower level formal model of the Go runtime and its mechanisms for managing coercions between interface types and struct types that are key to Go. Finally, they develop a compilation strategy from WG to LWG that supports the proposed extensions and lifts the expressiveness restrictions imposed by the existing monomorphisation approach.

Reference

Raymond Hu, Julien Lange, Bernardo Toninho, Philip Wadler, Robert Griesemer, Keith Randall (2026) Welterweight Go: Boxing, Structural Subtyping, and Generics. In Proceedings of the ACM on Programming Languages, Volume 10, Issue POPL, Article No.: 79, Pages 2295 - 2322. https://doi.org/10.1145/3776721</description>
            <category>Public news</category>
            <pubDate>Fri, 16 Jan 2026 00:00:00 +0100</pubDate>
            <guid>news5423</guid>
        </item>
        <item>
            <title>PhD student Ben Hayes wins best student paper award</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5405/phd-student-ben-hayes-wins-best-student-paper-award/</link>
            <description>&lt;img src=&quot;https://www.seresearch.qmul.ac.uk/content/news/images/4109a1986e232e3b528c4fbb2e263193.jpg&quot; /&gt;

&lt;br&gt;A paper co-authored by AI and Music (AIM) Centre for Doctoral Training PhD student Ben Hayes with supervisors George Fazekas and Charis Saitis received the best student paper award at the 26th International Society for Music Information Retrieval Conference, ISMIR 2025, in Daejeon, Korea. ISMIR is a flagship conference in the field of Music Informatics. The paper addresses the challenging and ill-posed problem of estimating audio synthesiser parameters given sound examples, where multiple parameter configurations can produce identical sounds due to intrinsic symmetries in synthesiser design. By explicitly modelling these symmetries, particularly permutation invariance across repeated components like oscillators and filters, using permutation equivariant continuous normalising flows, the method outperforms both regression-based approaches and symmetry-naive generative models on both synthetic tasks and a real-world synthesiser.


You can read the paper &quot;Audio Synthesizer Inversion in Symmetric Parameter Spaces With Approximately Equivariant Flow Matching&quot; on the conference website.

The AIM Centre for Doctoral Training in Artificial Intelligence and Music is funded by RCUK. Based at Queen Mary University of London, AIM students undertake a four-year PhD focused on developing cutting-edge research in collaboration with our industry partners.</description>
            <category>Public news</category>
            <pubDate>Sun, 05 Oct 2025 23:00:00 +0100</pubDate>
            <guid>news5405</guid>
        </item>
        <item>
            <title>Przemek Wałęga: The Logical Expressiveness of Temporal Graph Neural Networks</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5521/przemek-waga-the-logical-expressiveness-of-temporal-graph-neural-networks/</link>
            <description>Przemek Wałęga presented a paper at the Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025) in San Diego. NeurIPS is one of the top three conferences on Machine Learning.

Title:

The Logical Expressiveness of Temporal GNNs via Two-Dimensional Product Logics

Abstract
In recent years, the expressive power of various neural architectures---including graph neural networks (GNNs), transformers, and recurrent neural networks---has been characterised using tools from logic and formal language theory. As the capabilities of basic architectures are becoming well understood, increasing attention is turning to models that combine multiple architectural paradigms. Among them particularly important, and challenging to analyse, are temporal extensions of GNNs, which integrate both spatial (graph-structure) and temporal (evolution over time) dimensions. In this paper, we initiate the study of logical characterisation of temporal GNNs by connecting them to two-dimensional product logics. We show that the expressive power of temporal GNNs depends on how graph and temporal components are combined. In particular, temporal GNNs that apply static GNNs recursively over time can capture all properties definable in the product logic of (past) propositional temporal logic PTL and the modal logic K. In contrast, architectures such as graph-and-time TGNNs and global TGNNs can only express restricted fragments of this logic, where the interaction between temporal and spatial operators is syntactically constrained. These provide us with the first results on the logical expressiveness of temporal GNNs.


Reference:
M. Sälzer, P. A. Wałęga, M. Lange, The Logical Expressiveness of Temporal GNNs via Two-Dimensional Product Logics, NeurIPS, 2025
https://openreview.net/forum?id=v13yQBxhut</description>
            <category>Public news</category>
            <pubDate>Wed, 17 Sep 2025 23:00:00 +0100</pubDate>
            <guid>news5521</guid>
        </item>
        <item>
            <title>George Fazekas awarded prestigious Research Fellowship</title>
            <link>https://www.seresearch.qmul.ac.uk/cfcs/news/5409/george-fazekas-awarded-prestigious-research-fellowship/</link>
            <description>George Fazekas has been awarded a prestigious Royal Academy of Engineering (RAEng) / Leverhulme Trust Research Fellowship 2025/26. The Fellowship explores smarter AI for music by combining data-driven learning with musical knowledge, making models faster, more efficient, and better at understanding and analysing music.
More information can be found at the Royal Academy of Engineering website.</description>
            <category>Public news</category>
            <pubDate>Sun, 14 Sep 2025 23:00:00 +0100</pubDate>
            <guid>news5409</guid>
        </item>
    </channel>
</rss>
 