Dr Thomas Roelleke

Senior Lecturer

School of Electronic Engineering and Computer Science
Queen Mary University of London

Research

information retrieval (IR) and probability theory, structured, semantic and knowledge-oriented IR, integration of data management technologies (DB+IR, In-DB IR/ML/AI), generalisations of probabilistic concepts

Interests

My research interest lies in information retrieval (IR). IR is related to data and information management, database (DB) technology, machine learning (ML) and AI. My research expertise and contributions are in the following areas:
1. probabilistic IR models and probability theory
2. structured, semantic and knowledge-oriented retrieval
3. integration of technologies (DB+IR, In-DB IR/ML)
4. modelling of uncertainty in data (probabilistic databases)
5. generalisations of ranking functions and probabilistic reasoning

IR models (ranking functions, e.g. BM25) are rooted in probability and information theory, but apply some "magic" quantifications and logarithmic expressions to achieve good retrieval quality. My research focuses on explaining these models and bringing them up to mathematical standards. Publications include "IR Models: Foundations and Relationships" (Morgan & Claypool book, 2013), "Harmony Assumptions" (Computer Journal, 2015), "TF-IDF Uncovered" (ACM SIGIR 2008), "General Matrix Framework" (IP&M Journal), and "The Probability of Being Informative" (ACM SIGIR 2003). My long-term research aim is to find the undiscovered parts of mathematics that explain the connection between ranking functions and probability theory.

Database-oriented research includes the integration of DB and IR (and ML and AI), which remains an ongoing research challenge. These areas and their methods are closely related, yet surprisingly different and separated. My contributions include probabilistic object-relational, logic-based knowledge representations (Retrieval of Complex Objects, and various publications) that help solve semantic and knowledge-oriented (so-called complex) information management tasks. Under the remit of DB+IR (in recent terminology, In-DB IR/ML), this led to a patented technology: the "Relational Bayes" (VLDB Journal 2008, extended SQL, WHERE ASSUMPTION IS MAX_INFORMATIVE).

Recent publications focus on probabilistic, information-theoretic and structured IR in the context of investigative IR (Journal of Information Systems, 2023), and the Dirichlet-multinomial modelling of recommendation and urgency (Big Data, ML and Intelligent Systems, Frontiers of AI, 2021).

Publications

Publications marked "Relevant Publication" are of specific relevance to the Centre for Multimodal AI.

2024

Relevant Publication: Document structure-driven investigative information retrieval
Ketola T and Roelleke T
Information Systems, Elsevier vol. 121, 102315-102315.  
01-03-2024

2023

Relevant Publication: Automatic and Analytical Field Weighting for Structured Document Retrieval
Ketola T and Roelleke T
In Advances in Information Retrieval, Springer Nature 489-503.  
01-01-2023

2022

Relevant Publication: Formal Constraints for Structured Document Retrieval
Ketola T and Roelleke T
Proceedings of the 2022 ACM SIGIR International Conference on Theory of Information Retrieval
23-08-2022

2021

ADOR: A New Medical Dataset for Sentiment-based IR
Bahrani M and Roelleke T
CIKM’21: Fourth Workshop on Knowledge-driven Analytics and Systems Impacting Human Quality of Life
01-11-2021
Relevant Publication: Opinion-Aware Retrieval Models Based on Sentiment and Intensity of Lexical Features
Bahrani M and Roelleke T
In Modern Management Based On Big Data II and Machine Learning and Intelligent Systems III, IOS Press
29-10-2021

2020

FDCM
Bahrani M and Roelleke T
Proceedings of the 29th ACM International Conference on Information & Knowledge Management
19-10-2020
BM25-FIC: Information content-based field weighting for BM25F
Ketola T and Roelleke T
 
01-01-2020

2018

Relevant Publication: A systematic approach to normalization in probabilistic models
Lipani A, Roelleke T, Lupu M and Hanbury A
Inf Retr Boston, Springer vol. 21 (6), 565-596.  
30-06-2018
P/FDM
Gray PMD
In Encyclopedia of Database Systems, Springer Nature 2643-2644.  
01-01-2018
Relevant Publication: Probabilistic Retrieval Models and Binary Independence Retrieval (BIR) Model
Roelleke T, Wang J and Robertson S
In Encyclopedia of Database Systems, Springer Nature 2839-2845.  
01-01-2018

2016

Scalable DB+IR Technology: Processing Probabilistic Datalog with HySpirit
Frommholz I and Roelleke T
Datenbank-Spektrum, Springer Nature vol. 16 (1), 39-48.  
26-01-2016
Probabilistic Retrieval Models and Binary Independence Retrieval (BIR) Model
Roelleke T, Wang J and Robertson S
In Encyclopedia of Database Systems, Springer Nature 1-7.  
01-01-2016

2015

IR meets NLP
Milajevs D, Sadrzadeh M and Roelleke T
Proceedings of the 2015 International Conference on The Theory of Information Retrieval
27-09-2015
Harmony Assumptions in Information Retrieval and Social Networks
Roelleke T, Kaltenbrunner A and Baeza-Yates R
The Computer Journal, Oxford University Press (OUP) vol. 58 (11), 2982-2999.
14-05-2015

2013

Mathematical Specification and Logic Modelling in the context of IR
Martinez-Alvarez M, Bonzanini M and Roelleke T
Proceedings of the 2013 Conference on the Theory of Information Retrieval
29-09-2013
IR Models
Roelleke T
Proceedings of the 2013 Conference on the Theory of Information Retrieval
29-09-2013
On the modelling of ranking algorithms in probabilistic datalog
Roelleke T, Bonzanini M and Martinez-Alvarez M
Proceedings of the 7th International Workshop on Ranking in Databases
30-08-2013
Extractive summarisation via sentence removal
Bonzanini M, Martinez-Alvarez M and Roelleke T
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
28-07-2013
Information Retrieval Models
Roelleke T
Springer Nature vol. 5 (3), 1-163.
26-07-2013
Document Difficulty Framework for Semi-automatic Text Classification
Martinez-Alvarez M, Bellogin A and Roelleke T
 
01-01-2013
The D2Q2 framework: On the relationship and combination of language modelling and TF-IDF
Roelleke T, Azzam H, Bonzanini M, Martinez-Alvarez M and Lalmas M
 
01-01-2013
Information Retrieval Models, Foundations and Relationships
Roelleke T
 
01-01-2013

2012

Investigating the use of extractive summarisation in sentiment classification
Bonzanini M, Martinez-Alvarez M and Roelleke T
 
01-12-2012
Opinion summarisation through sentence extraction
Bonzanini M, Martinez-Alvarez M and Roelleke T
Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
12-08-2012
IR models
Roelleke T
Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
12-08-2012
A Schema-driven Approach for Knowledge-oriented Retrieval and Query Formulation
Azzam H, Yahyaei S, Roelleke T and Bonzanini M
KEYS 2012, The 3rd International Workshop on Keyword Search and Structured Data, Scottsdale, Arizona, USA, 20 May 2012
01-01-2012
Semi-automatic document classification
Martinez-Alvarez M, Yahyaei S and Roelleke T
 
01-01-2012

2011

Ranking-based processing of SQL queries
Azzam H, Roelleke T and Yahyaei S
Proceedings of the 20th ACM international conference on Information and knowledge management
24-10-2011
Relevant Publication: On the probabilistic logical modelling of quantum and geometrically-inspired IR
Smeraldi F, Martinez-Alvarez M, Frommholz I and Roelleke T
 
01-01-2011
A Generic Data Model for Schema-Driven Design in Information Retrieval Applications
Azzam H and Roelleke T
 
01-01-2011
Cross-Lingual Text Fragment Alignment Using Divergence from Randomness
Yahyaei S, Bonzanini M and Roelleke T
 
01-01-2011
A Descriptive Approach to Classification
Martinez-Alvarez M and Roelleke T
 
01-01-2011
Large-Scale Logical Retrieval: Technology for Semantic Modelling of Patent Search
Azzam H, Klampanos IA and Roelleke T
In Current Challenges in Patent Information Retrieval, Springer Nature 181-195.  
01-01-2011
Teaching IR: Curricular Considerations
Blank D, Fuhr N, Henrich A, Mandl T, Rölleke T, Schütze H and Stein B
In Teaching and Learning in Information Retrieval, Springer Nature 31-46.  
01-01-2011

2010

An attribute-based model for semantic retrieval
Azzam H and Roelleke T
 
01-12-2010
SQR
Azzam H and Roelleke T
Proceedings of the third workshop on Exploiting semantic annotations in information retrieval
30-10-2010
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): Preface
Gurrin C, He Y, Kazai G, Kruschwitz U, Little S, Roelleke T, Rüger S and Van Rijsbergen K
 
20-05-2010
Modelling Probabilistic Inference Networks and Classification in Probabilistic Datalog
Martinez-Alvarez M and Roelleke T
 
01-01-2010
Recent developments in information retrieval
Gurrin C, He Y, Kazai G, Kruschwitz U, Little S, Roelleke T, Rüger S and Van Rijsbergen K
 
01-01-2010
Logic-Based Retrieval: Technology for Content-Oriented and Analytical Querying of Patent Data
Klampanos IA, Wu HZ, Roelleke T and Azzam H
Editors: Cunningham H, Hanbury A and Ruger S.
01-01-2010
Recent Developments in Information Retrieval
Gurrin C, He YL, Kazai G, Kruschwitz U, Little S, Roelleke T, Ruger S and van Rijsbergen K
Editors: Gurrin C, He Y, Kazai G, Kruschwitz U, Little S, Roelleke T, Ruger S and Van Rijsbergen K.
01-01-2010

2009

A case for probabilistic logic for scalable patent retrieval
Klampanos IA, Azzam H and Roelleke T
Proceedings of the 2nd international workshop on Patent information retrieval
06-11-2009
Less Is More: Maximal Marginal Relevance as a Summarisation Feature
Forst JF, Tombros A and Roelleke T
Editors: Azzopardi L, Kazai G, Robertson S, Ruger S, Shokouhi M, Song D and Yilmaz E.
01-01-2009
P/FDM
Gray PMD
In Encyclopedia of Database Systems, Springer Nature 2011-2012.  
01-01-2009
Probabilistic Retrieval Models and Binary Independence Retrieval (BIR) Model
Roelleke T, Wang J and Robertson S
In Encyclopedia of Database Systems, Springer Nature 2156-2160.  
01-01-2009
Semi-subsumed Events: A Probabilistic Semantics of the BM25 Term Frequency Quantification
Wu HZ and Roelleke T
Editors: Azzopardi L, Kazai G, Robertson S, Ruger S, Shokouhi M, Song D and Yilmaz E.
01-01-2009

2008

DB&IR integration
Amer-Yahia S, Hiemstra D, Roelleke T, Srivastava D and Weikum G
ACM SIGIR Forum, Association for Computing Machinery (ACM) vol. 42 (2), 84-89.
30-11-2008
DB&IR integration
Amer-Yahia S, Hiemstra D, Roelleke T, Srivastava D and Weikum G
ACM SIGMOD Record, Association for Computing Machinery (ACM) vol. 37 (3), 46-49.
30-09-2008
DB&IR Integration: Report on the Dagstuhl Seminar Ranked XML Querying
Amer-Yahia S, Hiemstra D, Roelleke T, Srivastava D and Weikum G
SIGMOD Record vol. 37 (3), 46-49.
01-09-2008
TF-IDF Uncovered: A Study of Theories and Probabilities
Roelleke T and Wang J
31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval Singapore
01-01-2008
Modelling retrieval models in a probabilistic relational algebra with a new operator: the relational Bayes
Roelleke T, Wu H, Wang J and Azzam H
VLDB J vol. 17 (1), 5-37.
01-01-2008
DB&IR Integration: Report on the Dagstuhl Seminar Ranked XML Querying
Amer-Yahia S, Hiemstra D, Roelleke T, Srivastava D and Weikum G
 
01-01-2008

2007

Modelling a summarisation logic in probabilistic datalog
Forst JF, Roelleke T and Tombros A
 
01-12-2007
TOIS reviewers January 2006 through May 2007

ACM Transactions on Information Systems, Association for Computing Machinery (ACM) vol. 25 (4), 15-es.
01-10-2007

2006

Solving the enterprise TREC task with probabilistic data models
Forst JF, Tombros A and Rölleke T
 
01-12-2006
A Parallel Derivation of Probabilistic Retrieval Models
Roelleke T and Wang J
29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Seattle, US
27-08-2006
A general matrix framework for modelling Information Retrieval
Rolleke T, Tsikrika T and Kazai G
Information Processing & Management vol. 42 (1), 4-30.  
01-01-2006
Context-specific frequencies and discriminativeness for the retrieval of structured documents
Wang J and Roelleke T
Editors: Lalmas M, MacFarlane A, Ruger S, Tombros A, Tsikrika T and Yavlinsky A.
01-01-2006

2005

Report on the DB/IR panel at SIGMOD 2005
Amer-Yahia S, Case P, Rolleke T, Shanmugasundaram J and Weikum G
SIGMOD Record vol. 34 (4), 71-74.
01-12-2005
The QMUL team with probabilistic SQL at enterprise track
Roelleke T, Ashoori E, Wu H and Cai Z
 
01-12-2005
Relevance Information: A Loss of Entropy but a Gain for IDF?
Roelleke T and de Vries A
28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Salvador, Brazil
17-08-2005
Building and experimenting with a heterogeneous collection
Szlavik Z and Rolleke T
Editors: Fuhr N, Lalmas M, Malik S and Szlavik Z.
01-01-2005

2004

Third edition of the XML and information retrieval workshop first workshop on integration of IR and DB (WIRD) jointly held at SIGIR'2004, Sheffield, UK, July 29th, 2004
Baeza-Yates R, Maarek YS, Roelleke T and de Vries AP
ACM SIGIR Forum, Association for Computing Machinery (ACM) vol. 38 (2), 24-30.
01-12-2004
Modelling vague content and structure querying in XML retrieval with a probabilistic object-relational framework
Lalmas M and Rolleke T
Editors: Christiansen H, Hacid MS, Andreasen T and Larsen HL.
01-01-2004

2003

A Frequency-based and a Poisson-based Definition of the Probability of Being Informative
Roelleke T
26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Toronto, Canada
31-07-2003
Four-valued knowledge augmentation for structured document retrieval
Lalmas M and Rolleke T
 
01-02-2003
Abductive retrieval for multimedia information seeking
Lalmas M, Roelleke T and Ruthven I
10th International Conference on Human - Computer Interaction, HCI International, Crete, Greece, vol. 4
01-01-2003
Intelligent Retrieval of Hypermedia Documents
Lalmas M, Rölleke T and Fuhr N
In Intelligent Exploration of The Web, Springer Nature 324-344.  
01-01-2003
A Frequency-based and a Poisson-based Definition of the Probability of Being Informative
Roelleke T
 
01-01-2003

2002

Using MPEG-7 at the consumer terminal in broadcasting
Pearmain A, Lalmas M, Moutogianni E, Papworth D, Healey P and Rolleke T
EURASIP J Appl Sig P vol. 2002 (4), 354-361.
01-04-2002
Four-valued knowledge augmentation for representing structured documents
Lalmas M and Roelleke T
Editors: Hacid MS, Ras ZW, Zighed DA and Kodratoff Y.
01-01-2002
Using MPEG7 at the Consumer Terminal in Broadcasting
Healey P, Lalmas M, Roelleke T, Papworth D, Moutogianni E and Pearmain A
European Association For Signal, Speech and Image Processing Journal of Applied Signal Processing vol. Issue 4, 354-361.  
01-01-2002
The accessibility dimension for structured document retrieval
Roelleke T, Lalmas M, Kazai G, Ruthven I and Quicker S
Editors: Crestani F, Girolami M and Van Rijsbergen CJ.
01-01-2002
Intelligent Hypermedia Retrieval
Lalmas M, Roelleke T and Fuhr N
In Intelligent Exploration of The Web, Springer-Verlag Group (Physica-Verlag 
01-01-2002
Focussed Structured Document Retrieval
Kazai G, Lalmas M and Roelleke T
In String Processing and Information Retrieval, Springer Nature 241-247.  
01-01-2002

2001

The HySpirit retrieval platform
Rölleke T, Lübeck R and Kazai G
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
01-09-2001
Using MPEG-7 at the consumer terminal in broadcasting
Pearmain A, Lalmas M, Moutogianni E, Papworth D, Healey P and Rolleke T
Editors: Izquierdo E. WIAMIS 2001 Workshop on Image Analysis for Multimedia Services, Tampere, Finland, 16 May 2001 - 17 May 2001
01-01-2001
A model for the representation and focussed retrieval of structured documents based on fuzzy aggregation
Kazai G, Lalmas M and Rolleke T
 
01-01-2001
Concepts for a graphical user interface for hypermedia retrieval
Lalmas M, Rolleke T, Turra F and Fuhr N
Editors: Larsen HL, Kacprzyk J, Zadrozny S, Andreasen T and Christiansen H.
01-01-2001

1998

DOLORES: a system for logic-based retrieval of multimedia objects
Fuhr N, Gövert N and Rölleke T
Association for Computing Machinery (ACM), 257-265.
01-08-1998
DOLORES: A System for Logic-Based Retrieval Objects
Fuhr N, Gövert N and Rölleke T
 
01-08-1998
Querying for facts and content in hypermedia documents
Rölleke T and Fuhr N
 
01-01-1998
HySpirit: A probabilistic inference engine for hypermedia retrieval in large databases
Fuhr N and Rölleke T
 
01-01-1998

1997

A probabilistic relational algebra for the integration of information retrieval and database systems
Fuhr N and Rölleke T
ACM Transactions on Information Systems, Association for Computing Machinery (ACM) vol. 15 (1), 32-66.
01-01-1997

1996

Retrieval of complex objects using a four-valued logic
Roelleke T and Fuhr N
 
01-12-1996