Experimental Evaluation in IR: Decades of Tradition not only to Steer Generative AI but also Quantum Computing
In this talk, I will go through the evolution of experimental evaluation in Information Retrieval (IR), namely the system-oriented approach based on the Cranfield paradigm, especially through the lens of large-scale evaluation initiatives such as CLEF (Conference and Labs of the Evaluation Forum, https://www.clef-initiative.eu/). The aim is to show how evaluation is fundamental to driving innovation and development in the field and to fostering the growth of a vibrant and interdisciplinary community. I will then present some challenges that experimental evaluation faces when it comes to Generative Artificial Intelligence (AI), the hottest topic of discussion today. I will also discuss how experimental evaluation allows us to explore other ground-breaking technologies, such as Quantum Computing (QC), and apply them to IR, by presenting the experience of QuantumCLEF (https://qclef.dei.unipd.it/), as a reminder that our field should pursue as many innovative directions as possible for its evolution.
Nicola Ferro is a Full Professor of Computer Science at the Department of Information Engineering of the University of Padua, Italy. His main research interests are information retrieval, data management and representation, and their evaluation. He chairs the Steering Committee of CLEF, the European evaluation initiative on multimodal and multilingual information access systems, and the Steering Committee of ESSIR, the European Summer School on Information Retrieval. He serves as Senior PC Member for top-tier conferences such as ECIR, ACM SIGIR, ACM CIKM, and WSDM. He is General Co-Chair of SIGIR 2025 and of CIKM 2026. He was General Chair of ECIR 2016 and Associate Editor for ACM TOIS. He was inducted into the SIGIR Academy in 2023 and received the Tony Kent Strix Award in 2024.
Mark Raasveldt, DuckDB Labs, Netherlands
DuckDB Database System
DuckDB is a FOSS in-process analytical database management system, the first of its kind in a new class of systems. DuckDB’s excellent single-node performance and rich SQL interface allow it to be used for a wide variety of real-world data crunching tasks. By being in-process, DuckDB is extremely simple to run and install, and can also be used as a component of larger systems and applications. In this talk, I will present DuckDB, talk about its history, and discuss a wide variety of use cases for the system.
Mark Raasveldt is the co-founder and CTO of DuckDB Labs, a start-up company built around the development of the open-source DuckDB database system. Previously, Mark completed both his PhD and a postdoc in the Database Architectures group at the Centrum Wiskunde & Informatica (CWI). During this time, he worked together with Hannes Mühleisen to create the first version of the DuckDB database system.
Fabio Porto, LNCC, Brazil
Data Management Systems in the Age of Machine Learning
Database management systems have been a remarkable success story. Conceived initially to support relational data and SQL queries, they were soon extended to support generic user-defined functions (UDFs) and different data models. We are now witnessing a new class of applications that integrate data with machine learning (ML) and AI models. However, building these ML applications is not easy. In this talk, I will argue that many of the functionalities needed to support ML applications, such as query optimization, data transparency, data heterogeneity, and views, are at the very heart of databases. As ML applications advance to more complex uses of AI, as is the case with GenAI, more opportunities arise to revisit, explore, and extend database techniques.
Fabio Porto is a Senior Researcher at the National Laboratory of Scientific Computing (LNCC), where he coordinates the Data Extreme Lab (DEXL). He holds a DSc and an MSc in Informatics from PUC-Rio, and an undergraduate degree in Mathematics-Informatics from UERJ. He was a postdoc at the École Polytechnique Fédérale de Lausanne, Switzerland (2004-2008). He currently holds an Inria International Chair (2024-2028) and coordinates the LNCC AI Institute. His main research interests are in ML Systems, Knowledge Graphs, and Big Data flows.