Data-driven Business Process Simulation: From Event Logs to Tools and Techniques

Orlenys López Pintado is a Research Fellow in Information Systems at the University of Tartu, Estonia. His research interests include business process management (BPM), business process simulation, optimization, and blockchain. His current area of research is data-driven discovery, simulation, and optimization of business processes. He received a PhD in Computer Science in 2020 from the University of Tartu, for which he obtained the best dissertation award at the 33rd International Conference on Advanced Information Systems Engineering (CAiSE’21).

David Chapela-Campa obtained his PhD in Computer Science from the University of Santiago de Compostela in 2021. Currently, he is a postdoctoral researcher in the Information Systems group at the University of Tartu, Estonia. His research focuses on process mining (PM) and business process management (BPM), specifically on data-driven techniques for business process simulation and optimization.


Business Process Simulation (BPS) is a common approach to, among other goals, estimating the impact of changes to a business process on its performance measures, for example, by analyzing how the cycle time changes if the arrival rate of new cases doubles. Two main elements are needed to perform this task: i) a process model enhanced with simulation parameters describing the scenario and details of the process (i.e., a BPS model), and ii) a simulation engine that interprets a BPS model to mimic the behavior of the process. In this tutorial, we first introduce the fundamentals of BPS, i.e., what it consists of, its existing types, and its potential use cases and benefits. Then, we delve into one of the main types of BPS, discrete-event simulation, specifically through data-driven simulation techniques, and the existing approaches for modeling the different dimensions of a process (e.g., resource performance). These parts are followed by a hands-on exercise in which participants will explore how to use tools for the automated discovery of a BPS model and its simulation. Finally, we conclude the tutorial by discussing potential improvements and open challenges that draw future directions for the field. After the tutorial, participants will understand the fundamentals of BPS, its potential to analyze and optimize business processes, and how to apply it to their own research or work.
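As a toy illustration of the discrete-event style of simulation described above, the following Python sketch simulates a single-activity process with exponential case arrivals and a fixed processing time, and compares the average cycle time before and after doubling the arrival rate. All names and parameter values here are illustrative assumptions, not part of any particular BPS tool or model.

```python
import heapq
import random

def simulate(arrival_rate, service_time, n_cases, n_resources=1, seed=42):
    """Minimal discrete-event simulation of a single-activity process.

    Cases arrive with exponential inter-arrival times and queue for one
    of `n_resources` identical resources; returns the mean cycle time
    (waiting + processing) over all simulated cases.
    """
    rng = random.Random(seed)
    t = 0.0
    arrivals = []
    for _ in range(n_cases):
        t += rng.expovariate(arrival_rate)
        arrivals.append(t)
    # Min-heap of the times at which each resource becomes free.
    free_at = [0.0] * n_resources
    heapq.heapify(free_at)
    total_cycle = 0.0
    for arr in arrivals:
        start = max(arr, heapq.heappop(free_at))  # wait if all resources busy
        end = start + service_time
        heapq.heappush(free_at, end)
        total_cycle += end - arr
    return total_cycle / n_cases

base = simulate(arrival_rate=1.0, service_time=0.8, n_cases=10_000)
doubled = simulate(arrival_rate=2.0, service_time=0.8, n_cases=10_000)
```

With a single resource, doubling the arrival rate pushes utilization above 100%, so the queue grows and `doubled` comes out far larger than `base`. This is exactly the kind of what-if question a BPS model is built to answer, albeit over far richer process models.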

Designing Virtual Knowledge Graphs

Diego Calvanese (http://www.inf.unibz.it/~calvanese/) is a full professor at the Research Centre for Knowledge and Data (KRDB) of the Faculty of Engineering, Free University of Bozen-Bolzano (Italy), where he leads the Intelligent Integration and Access to Data (In2Data) research group. He is also Wallenberg Guest Professor in Artificial Intelligence for Data Management at Umeå University (Sweden). His research interests concern foundational and applied aspects in Artificial Intelligence and Databases, notably formalisms for knowledge representation and reasoning, Virtual Knowledge Graphs for data management and integration, Description Logics, Semantic Web, and modeling and verification of data-aware processes. He is the author of more than 400 refereed publications, including ones in the most prestigious venues in Artificial Intelligence and Databases, with more than 37000 citations and an h-index of 77, according to Google Scholar. He is a Fellow of the European Association for Artificial Intelligence (EurAI), of the Asia-Pacific Artificial Intelligence Association (AAIA), and of the Association for Computing Machinery (ACM). He is the originator and a co-founder of Ontopic, the first spin-off of the Free University of Bozen-Bolzano, founded in 2019, which develops AI-based solutions and technologies for data management and integration.

Davide Lanti is an Assistant Professor at the Research Centre for Knowledge and Data (KRDB) of the Faculty of Engineering, Free University of Bozen-Bolzano (Italy), where he carries out research on Virtual Knowledge Graphs, Semantic Web, Databases, and Description Logics. He received his MSc degree in Computational Logic jointly from the Technische Universität Dresden (Germany) and the Free University of Bozen-Bolzano (Italy). He received his PhD at the Faculty of Computer Science at the Free University of Bozen-Bolzano, Italy.


Complex data processing tasks, including data analytics and machine/deep learning pipelines, require coherent access to large datasets in order to be effective. Knowledge graphs (KGs) provide a uniform data format that guarantees the required flexibility in processing and, moreover, is able to take domain knowledge into account. However, the actual data is often available only in legacy data sources, and one needs to overcome their inherent heterogeneity. The recently proposed Virtual Knowledge Graph (VKG) approach is well suited for this purpose: the KG is kept virtual, and the relevant content of the data sources is exposed by declaratively mapping it to classes and properties of a domain ontology, which users can in turn query. In this talk we introduce the VKG paradigm for data access, present the challenges encountered when designing complex VKG scenarios, and discuss possible solutions, in particular the use of mapping patterns to deal with the complexity of the mapping layer and its relationship to domain ontologies.
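To make the mapping idea concrete, here is a minimal, stdlib-only Python sketch: rows of a toy relational table are mapped to triples over a small ontology vocabulary, which is then queried by class and property rather than by table and column. The URIs, table, and names are invented for illustration; note also that this toy materializes the triples, whereas a real VKG engine (such as Ontop, mentioned in the bio above) keeps the graph virtual and rewrites queries over the ontology into SQL over the sources.

```python
import sqlite3

# Toy legacy source: a relational table of employees.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE employee (id INTEGER, name TEXT)")
db.executemany("INSERT INTO employee VALUES (?, ?)",
               [(1, "Ada"), (2, "Grace")])

EX = "http://example.org/"

def triples():
    # Declarative mapping: each row yields triples over the ontology
    # vocabulary (ex:Employee, ex:hasName). Materialized here for brevity.
    for emp_id, name in db.execute("SELECT id, name FROM employee"):
        subj = f"{EX}employee/{emp_id}"
        yield (subj, "rdf:type", f"{EX}Employee")
        yield (subj, f"{EX}hasName", name)

# Users query classes and properties, never the table directly:
# "names of all instances of ex:Employee".
graph = list(triples())
employees = {s for s, p, o in graph if p == "rdf:type" and o == f"{EX}Employee"}
names = {o for s, p, o in graph if s in employees and p == f"{EX}hasName"}
```

In a production VKG setting the same mapping would be written declaratively (e.g., in R2RML), and the query would be posed in SPARQL against the ontology, with the engine translating it to SQL on the fly.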

How to Conduct Valid Information Systems Engineering Research?

Henrik Leopold is a Full Professor for Data Science at the Kühne Logistics University (KLU) and a senior lecturer at the Hasso Plattner Institute (HPI) at the Digital Engineering Faculty, University of Potsdam. His research mainly focuses on leveraging technology from the field of artificial intelligence, such as machine learning and natural language processing, to develop techniques for process mining, process analysis, and process automation. He has published more than 100 research papers and articles, among others, in the journals IEEE Transactions on Knowledge and Data Engineering, IEEE Transactions on Software Engineering, Decision Support Systems, and Information Systems.


Algorithms play an important role in information systems engineering research, as they are the foundational building block of many techniques. At the same time, specific methodological guidance on how to design, evaluate, and present algorithm-based research is scarce. This tutorial addresses the needs of doctoral students and early career researchers to understand how they can establish a solid research contribution based on established methodological guidelines. No specific background knowledge is required. The content of the tutorial focuses on general challenges of information systems engineering research. The objective of this tutorial is to provide early career researchers with a profound understanding of basic concepts from the philosophy of science and specific strategies for how algorithms can be scientifically investigated and presented in a systematic manner.

FAIR Data Train: A FAIR-compliant Distributed Data and Services Platform

Luiz Olavo Bonino is Associate Professor in the Semantics, Cybersecurity and Services group at the University of Twente and in the BioSemantics group at the Leiden University Medical Centre. His background is in ontology-driven conceptual modelling, semantic interoperability, service-oriented computing, requirements engineering and context-aware computing. Since 2014, Luiz has focused on research, design and development activities related to supporting the making, publishing, indexing, searching, evaluating and annotating of FAIR (meta)data and services. He leads the national FAIR data team, which has been responsible since 2014 for the design and development of several technological solutions supporting the realisation of the FAIR principles.


The Findable, Accessible, Interoperable, and Reusable (FAIR) principles have become essential in modern data management practices. However, achieving FAIRness remains a challenge, particularly in distributed environments where data and services are scattered across various platforms and organizations. This tutorial provides participants with comprehensive insights into the FAIR Data Train, a FAIR-compliant distributed data and services platform designed to promote FAIR practices in such environments. Participants will learn about the architecture, functionality, and implementation of the FAIR Data Train, as well as strategies for leveraging the platform to enhance data integration, collaboration, and knowledge discovery in distributed environments.

Engineering Information Systems with LLMs and AI-based Techniques

Massimo Mecella, PhD in Engineering in Computer Science, is a full professor at Sapienza, where he conducts research in the fields of information systems engineering, software architectures, distributed middleware and service-oriented computing, mobile and pervasive computing, process management, data and process mining, big data analytics, advanced interfaces and human-computer interaction, focusing on smart applications, environments and communities. He is the author of about 250 papers (h-index 42, cf. https://scholar.google.com/citations?user=x844E6sAAAAJ). He has been/is currently involved in several European and Italian research projects, and has been the technical manager of the projects WORKPAD and SM4All, coordinated by Sapienza.
He has extensive experience in organizing scientific events. He was the General Chair of CAiSE 2019, BPM 2021, and ICSOC 2023 in Rome (just to name the most recent ones). He sits on the Steering Committees of the conference series CAiSE, ICSOC, Intelligent Environments (IE), AVI (Advanced Visual Interfaces), and SummerSOC. Currently, he is the vice-director of the BSc in Engineering in Computer and Control Sciences, the MSc in Engineering in Computer Science, and the MSc in Artificial Intelligence and Robotics offered by Sapienza. He was the director of these degrees for the period 2020–2023.


The current evolution of AI, and of Generative AI in particular, namely Large Language Models (LLMs), makes it possible to adopt these models as supporting tools for the engineering of information systems, in particular for their design, development, and dimensioning. The goal of this tutorial is to instruct attendees about AI, Generative AI, and LLMs from an IS engineering perspective, and then to focus on recent approaches and applications for their usage during the design and development of ISs. At the current stage, no studies or surveys exist that systematically cover how GenAI can be used for designing and developing information systems. Very recent works (from the last 9 months) are emerging on how to use ChatGPT (one of the most widespread applications of LLMs) for evaluating the dimensioning of ISs, conceptual design, software development, and the extraction of business process specifications from documents. However, all of them are quite fragmented, and a unifying framework and pipeline is missing. The aim of this tutorial is to provide such a unifying, principled view. Case studies, based on the presenters’ research activities, will be presented, together with a systematic analysis of the literature and practice.