EXA4MIND relies on a co-design approach, where technology partners from computing centres and universities and application partners from industry, academia and SMEs design an Extreme Data infrastructure in close collaboration.
Authors: Yihong Xu, Victor Letzelter, Mickaël Chen, Éloi Zablocki, Matthieu Cord.
Publication date: 2025.
Published at the IEEE International Conference on Robotics and Automation (ICRA 2025).
In autonomous driving, motion prediction aims at forecasting the future trajectories of nearby agents, helping the ego vehicle to anticipate behaviors and drive safely. A key challenge is generating a diverse set of future predictions, commonly addressed using data-driven models with Multiple Choice Learning (MCL) architectures and Winner-Takes-All (WTA) training objectives. However, these methods face initialization sensitivity and training instabilities. Additionally, to compensate for limited performance, some approaches rely on training with a large set of hypotheses, requiring a post-selection step during inference to significantly reduce the number of predictions. To tackle these issues, we take inspiration from annealed MCL, a recently introduced technique that improves the convergence properties of MCL methods through an annealed Winner-Takes-All loss (aWTA). In this paper, we demonstrate how the aWTA loss can be integrated with state-of-the-art motion forecasting models to enhance their performance using only a minimal set of hypotheses, eliminating the need for the cumbersome post-selection step. Our approach can be easily incorporated into any trajectory prediction model normally trained using WTA and yields significant improvements.
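To illustrate the idea, the sketch below contrasts a plain Winner-Takes-All objective with an annealed variant in which the hard winner selection is replaced by a soft assignment whose temperature is decayed during training. The tensor shapes, error metric and temperature schedule are illustrative assumptions, not the authors' implementation.

```python
import torch

def awta_loss(hyp_trajs, gt_traj, temperature):
    """Annealed Winner-Takes-All loss (illustrative sketch, not the authors' code).

    hyp_trajs:   (B, K, T, 2) -- K predicted trajectory hypotheses per sample
    gt_traj:     (B, T, 2)    -- ground-truth future trajectory
    temperature: scalar, annealed towards 0 over the course of training
    """
    # Per-hypothesis regression error, e.g. mean L2 distance over time steps.
    err = torch.norm(hyp_trajs - gt_traj.unsqueeze(1), dim=-1).mean(dim=-1)  # (B, K)

    # Classic WTA would back-propagate only through the best hypothesis:
    #   loss = err.min(dim=1).values.mean()
    # The annealed variant softly assigns the target to all hypotheses, with weights
    # that concentrate on the winner as the temperature decreases.
    weights = torch.softmax(-err / temperature, dim=1).detach()  # (B, K)
    return (weights * err).sum(dim=1).mean()
```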
Authors: Arif Görkem Özer, Recep Firat Cekinel, Ismail Hakki Toroslu, Pinar Karagoz.
Publication date: 2025.
Published by Cambridge University Press.
Natural language querying allows users to formulate questions in a natural language without requiring specific knowledge of the database query language. Large language models have been very successful in addressing the text-to-SQL problem, which is about translating given questions in textual form into SQL statements. Document-oriented NoSQL databases are gaining popularity in the era of big data due to their ability to handle vast amounts of semi-structured data and provide advanced querying functionalities. However, studies on text-to-NoSQL systems, particularly on systems targeting document databases, are very scarce. In this study, we utilize large language models to create a cross-domain natural language to document database query dataset, DocSpider, leveraging the well-known text-to-SQL challenge dataset Spider. As a document database, we use MongoDB. Furthermore, we conduct experiments to assess the effectiveness of the DocSpider dataset to fine-tune a text-to-NoSQL model against a cross-language transfer learning approach, SQL-to-NoSQL, and zero-shot instruction prompting. The experimental results reveal a significant improvement in the execution accuracy of fine-tuned language models when utilizing the DocSpider dataset.
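As a hypothetical illustration of the text-to-NoSQL task (not an example taken from the DocSpider dataset), the snippet below shows a natural-language question, its Spider-style SQL form, and an equivalent MongoDB query expressed with pymongo; the database and collection names are assumed for the example.

```python
from pymongo import MongoClient

# Question: "How many singers are older than 30?"
# Spider-style SQL:  SELECT COUNT(*) FROM singer WHERE age > 30;
# MongoDB equivalent (database and collection names are assumptions):
client = MongoClient("mongodb://localhost:27017")
db = client["concert_singer"]

count = db.singer.count_documents({"age": {"$gt": 30}})
print(count)
```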
Authors: A. Vobecky, D. Hurych, O. Siméoni, S. Gidaris, A. Bursuc, P. Pérez and J. Sivic.
Publication date: 2025.
Published in the International Journal of Computer Vision.
Semantic image segmentation models typically require extensive pixel-wise annotations, which are costly to obtain and prone to biases. This work investigates learning semantic segmentation in urban scenes without any manual annotation. Researchers propose a novel method for learning pixel-wise semantic segmentation using raw, uncurated data from vehicle-mounted cameras and LiDAR sensors, thus eliminating the need for manual labeling. They show the generalization capabilities of their method by testing on four different datasets (Cityscapes, Dark Zurich, Nighttime Driving, and ACDC) without any fine-tuning. They present an in-depth experimental analysis of the proposed model, including results when using another pre-training dataset, per-class and pixel accuracy, confusion matrices, PCA visualization, k-NN evaluation, ablations of the number of clusters and of LiDAR density, supervised fine-tuning, as well as additional qualitative results and their analysis.
Authors: Vojtěch Mlýnský, Petra Kührová, Martin Pykal, Miroslav Krepl, Petr Stadlbauer, Michal Otyepka, Pavel Banáš and Jiří Šponer.
Publication date: 2025.
Published in the Journal of Chemical Theory and Computation.
In this work, researchers present a comprehensive evaluation of widely used pair-additive and polarizable RNA ffs using the challenging UUCG tetraloop (TL) benchmark system. Extensive standard MD simulations, initiated from the NMR structure of the 14-mer UUCG TL, revealed that most ffs did not maintain the native state, instead favoring alternative loop conformations. Notably, three very recent variants of pair-additive ffs, OL3CP–gHBfix21, DES-Amber, and OL3R2.7, successfully preserved the native structure over a 10 × 20 μs time scale. To further assess these ffs, researchers performed enhanced sampling folding simulations of the shorter 8-mer UUCG TL, starting from the single-stranded conformation. Estimated folding free energies (ΔG°fold) varied significantly among these three ffs, with values of 0.0 ± 0.6, 2.4 ± 0.8, and 7.4 ± 0.2 kcal/mol for OL3CP–gHBfix21, DES-Amber, and OL3R2.7, respectively. The ΔG°fold value predicted by the OL3CP–gHBfix21 ff was closest to experimental estimates, ranging from −1.6 to −0.7 kcal/mol. In contrast, the higher ΔG°fold values obtained using DES-Amber and OL3R2.7 were unexpected, suggesting that key interactions are inaccurately described in the folded, unfolded, or misfolded ensembles. These discrepancies led them to further test DES-Amber and OL3R2.7 ffs on additional RNA and DNA systems, where further performance issues were observed. The results emphasize the complexity of accurately modeling RNA dynamics and suggest that creating an RNA ff capable of reliably performing across a wide range of RNA systems remains extremely challenging. In conclusion, the study provides valuable insights into the capabilities of current RNA ffs and highlights key areas for future ff development.
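For readers less familiar with the quantity, the reported ΔG°fold values relate folded and unfolded populations through a standard two-state relation. The short sketch below is a generic textbook illustration, not the enhanced-sampling analysis used in the paper.

```python
import math

# Two-state relation between folding free energy and populations (illustrative only):
#   dG_fold = -R * T * ln(p_folded / p_unfolded)
R = 0.0019872  # gas constant in kcal/(mol*K)
T = 298.0      # temperature in K

def dg_fold(p_folded):
    p_unfolded = 1.0 - p_folded
    return -R * T * math.log(p_folded / p_unfolded)

# dG_fold = 0.0 kcal/mol corresponds to equal folded and unfolded populations,
# whereas positive values (as obtained with DES-Amber and OL3R2.7) mean the
# folded state is under-populated relative to experiment.
print(dg_fold(0.5))   # ~0.0 kcal/mol
print(dg_fold(0.02))  # ~ +2.3 kcal/mol
```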
Authors: Yihong Xu, Loïck Chambon, Éloi Zablocki, Mickaël Chen, Alexandre Alahi, Matthieu Cord, Patrick Pérez.
Publication date: 2024
Published at the IEEE International Conference on Robotics and Automation (ICRA 2024).
Motion forecasting is crucial in enabling autonomous vehicles to anticipate the future trajectories of surrounding agents. To do so, it requires solving mapping, detection, tracking, and then forecasting problems, in a multi-step pipeline. In this complex system, advances in conventional forecasting methods have been made using curated data, i.e., with the assumption of perfect maps, detection, and tracking. This paradigm, however, ignores any errors from upstream modules. Meanwhile, an emerging end-to-end paradigm, which tightly integrates the perception and forecasting architectures into joint training, promises to solve this issue. However, the evaluation protocols between the two methods were so far incompatible and their comparison was not possible. In fact, conventional forecasting methods are usually not trained nor tested in real-world pipelines (e.g., with upstream detection, tracking, and mapping modules). In this work, we aim to bring forecasting models closer to real-world deployment. First, we propose a unified evaluation pipeline for forecasting methods with real-world perception inputs, allowing us to compare conventional and end-to-end methods for the first time. Second, our in-depth study uncovers a substantial performance gap when transitioning from curated to perception-based data. In particular, we show that this gap (1) stems not only from differences in precision but also from the nature of imperfect inputs provided by perception modules, and that (2) is not trivially reduced by simply finetuning on perception outputs. Based on extensive experiments, we provide recommendations for critical areas that require improvement and guidance towards more robust motion forecasting in the real world.
Authors: Patrik Vacek, David Hurych, Karel Zimmermann, Patrick Perez, Tomas Svoboda.
Publication date: 2024
Published on arXiv, 2024.
Learning without supervision how to predict 3D scene flows from point clouds is essential to many perception systems. We propose a novel learning framework for this task which improves the necessary regularization. Relying on the assumption that scene elements are mostly rigid, current smoothness losses are built on the definition of “rigid clusters” in the input point clouds. The definition of these clusters is challenging and has a significant impact on the quality of predicted flows. We introduce two new consistency losses that enlarge clusters while preventing them from spreading over distinct objects. In particular, we enforce \emph{temporal} consistency with a forward-backward cyclic loss and \emph{spatial} consistency by considering surface orientation similarity in addition to spatial proximity. The proposed losses are model-independent and can thus be used in a plug-and-play fashion to significantly improve the performance of existing models, as demonstrated on the two most widely used architectures. We also showcase the effectiveness and generalization capability of our framework on four standard sensor-unique driving datasets, achieving state-of-the-art performance in 3D scene flow estimation.
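A minimal sketch of the temporal forward-backward consistency idea is given below; the tensor shapes and the model interface are assumptions for illustration, not the authors' code.

```python
import torch

def cyclic_consistency_loss(pc_t, flow_fwd, predict_backward_flow):
    """Forward-backward (cyclic) temporal consistency, illustrative sketch.

    pc_t:                  (N, 3) point cloud at time t
    flow_fwd:              (N, 3) predicted flow from t to t+1
    predict_backward_flow: callable returning flow from the warped cloud back to t
    """
    warped = pc_t + flow_fwd                   # points advected to time t+1
    flow_bwd = predict_backward_flow(warped)   # predicted backward flow
    # Warping forward and then backward should return every point to its origin.
    return torch.norm(warped + flow_bwd - pc_t, dim=-1).mean()
```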
Authors: Patrik Vacek, David Hurych, Tomáš Svoboda, Karel Zimmermann.
Publication date: 2024
Published in IEEE Transactions on Intelligent Vehicles.
We study the problem of self-supervised 3D scene flow estimation from real large-scale raw point cloud sequences, which is crucial to various tasks like trajectory prediction or instance segmentation. In the absence of ground-truth scene flow labels, contemporary approaches concentrate on optimizing flow across sequential pairs of point clouds by incorporating structure-based regularization on flow and object rigidity. The rigid objects are estimated by a variety of 3D spatial clustering methods. While state-of-the-art methods successfully capture overall scene motion using the Neural Prior structure, they encounter challenges in discerning multi-object motions. We identified the structural constraints and the use of large and strict rigid clusters as the main pitfalls of current approaches, and we propose a novel clustering approach that allows for a combination of overlapping soft clusters and non-overlapping rigid clusters. Flow is then jointly estimated with progressively growing non-overlapping rigid clusters together with fixed-size overlapping soft clusters. We evaluate our method on multiple datasets with LiDAR point clouds, demonstrating superior performance over self-supervised baselines and reaching new state-of-the-art results. Our method especially excels in resolving flow in complicated dynamic scenes with multiple independently moving objects close to each other, including pedestrians, cyclists and other vulnerable road users.
Authors: Spyros Gidaris, Andrei Bursuc, Oriane Simeoni, Antonin Vobecky, Nikos Komodakis, Matthieu Cord, Patrick Pérez.
Publication date: 2024
Published in Transactions on Machine Learning Research (TMLR) 2024.
Self-supervised learning can be used for mitigating the greedy needs of Vision Transformer networks for very large fully-annotated datasets. Different classes of self-supervised learning offer representations with either good contextual reasoning properties, e.g., using masked image modeling strategies, or invariance to image perturbations, e.g., with contrastive methods. In this work, we propose a single-stage and standalone method, MOCA, which unifies both desired properties using novel mask-and-predict objectives defined with high-level features (instead of pixel-level details). Moreover, we show how to effectively employ both learning paradigms in a synergistic and computation-efficient way. Doing so, we achieve new state-of-the-art results on low-shot settings and strong experimental results in various evaluation protocols with a training that is at least 3 times faster than prior methods.
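The sketch below illustrates a mask-and-predict objective defined on high-level features rather than pixels; the token shapes, masking scheme and cosine criterion are assumptions for illustration, not MOCA's exact objective.

```python
import torch
import torch.nn.functional as F

def mask_and_predict_loss(student_tokens, teacher_tokens, mask):
    """Mask-and-predict on high-level features (illustrative sketch).

    student_tokens: (B, N, C) student predictions for all patch tokens
    teacher_tokens: (B, N, C) high-level target features (e.g. from a momentum teacher)
    mask:           (B, N) boolean, True where the input patch was masked
    """
    # The student predicts features for masked patches; the loss is computed on
    # those positions in feature space rather than on pixel values.
    pred = F.normalize(student_tokens[mask], dim=-1)
    target = F.normalize(teacher_tokens[mask], dim=-1)
    return (1.0 - (pred * target).sum(dim=-1)).mean()
```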
Authors: Viktoria Pauwa, David Číž, Vojtěch Mlýnský, Pavel Banáš, Michal Otyepka, Stephan Hachinger and Jan Martinovič.
Publication date: 2024
Published in the Proceedings of Science journal.
The on-going work presented in this article explores different technical approaches and systems for management and analysis of data obtained from large physics simulations, optimising the respective data-driven workflows across Cloud-Computing (IaaS) and HPC systems. The work is carried out in the context of the EXA4MIND Horizon Europe project, which produces an Extreme Data processing platform, bringing together specialised data management systems and powerful computing infrastructures. We evaluate two typical use cases with physics simulations carried out on supercomputing systems at LRZ and IT4Innovations. These use cases come from different areas of physics – they focus on the treatment of low energy many-body systems of molecules, and of high-energy (relativistic) elementary particles, respectively.
Authors: R. F. Cekinel, Ç. Çöltekin, P. Karagoz.
Publication date: 2024.
Published at the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation.
The rapid spread of misinformation through social media platforms has raised concerns regarding its impact on public opinion. While misinformation is prevalent in other languages, the majority of research in this field has concentrated on the English language. Hence, there is a scarcity of datasets for other languages, including Turkish. To address this concern, we have introduced the FCTR dataset, consisting of 3238 real-world claims. This dataset spans multiple domains and incorporates evidence collected from three Turkish fact-checking organizations. Additionally, we aim to assess the effectiveness of cross-lingual transfer learning for low-resource languages, with a particular focus on Turkish.
Authors: G. Puy, S. Gidaris, A. Boulch, O. Siméoni, C. Sautier, P. Pérez, A. Bursuc, R. Marlet
Publication date: 2024.
Published at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
Self-supervised image backbones can be used to address complex 2D tasks (e.g., semantic segmentation, object discovery) very efficiently and with little or no downstream supervision. Ideally, 3D backbones for lidar should be able to inherit these properties after distillation of these powerful 2D features. The most recent methods for image-to-lidar distillation on autonomous driving data show promising results, obtained thanks to distillation methods that keep improving. Yet, we still notice a large performance gap when measuring the quality of distilled and fully supervised features by linear probing. In this work, instead of focusing only on the distillation method, we study the effect of three pillars for distillation: the 3D backbone, the pretrained 2D backbones, and the pretraining dataset. In particular, thanks to our scalable distillation method named ScaLR, we show that scaling the 2D and 3D backbones and pretraining on diverse datasets leads to a substantial improvement of the feature quality. This allows us to significantly reduce the gap between the quality of distilled and fully-supervised 3D features, and to improve the robustness of the pretrained backbones to domain gaps and perturbations.
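A minimal sketch of image-to-lidar feature distillation is shown below; the names, tensor shapes and cosine criterion are assumptions for illustration, not the ScaLR implementation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(feats_3d, feats_2d, point_to_pixel):
    """Image-to-lidar feature distillation (illustrative sketch).

    feats_3d:       (N, C) features from the 3D lidar backbone, one per point
    feats_2d:       (H, W, C) features from the frozen, pretrained 2D backbone
    point_to_pixel: (N, 2) integer (row, col) image coordinates of each lidar point
    """
    rows, cols = point_to_pixel[:, 0], point_to_pixel[:, 1]
    targets = feats_2d[rows, cols]  # (N, C) 2D target feature for each projected point
    # Encourage each 3D point feature to match its 2D counterpart (cosine similarity).
    return 1.0 - F.cosine_similarity(feats_3d, targets, dim=-1).mean()
```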
Authors: M. M. Khosravi, P. Karagoz, I. H. Toroslu.
Publication date: 2024.
Published at the IEEE International Conference on Big Data.
In this work, we consider automated index selection for NoSQL databases and investigate the feasibility of supervised learning and reinforcement learning-based solutions. The experiments conducted on the YCSB dataset show that reinforcement learning improves index selection performance, as in relational databases, while supervised learning gives promising results and can be considered applicable given a sufficient amount of training data.
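As a toy illustration of the reinforcement-learning view of index selection, the sketch below runs an epsilon-greedy choice over a few candidate indexes; the candidate names and the latency function are placeholders, not the system evaluated in the paper.

```python
import random

# Toy epsilon-greedy selection over candidate indexes. The latency function is a
# stand-in; a real system would build the index and replay a YCSB workload.
candidate_indexes = ["idx_user_id", "idx_timestamp", "idx_user_id_timestamp"]
simulated_latency_ms = {"idx_user_id": 12.0, "idx_timestamp": 9.0, "idx_user_id_timestamp": 6.0}

def measure_latency(index_name):
    # Placeholder for "create index, replay workload, measure average latency".
    return simulated_latency_ms[index_name] + random.gauss(0.0, 0.5)

q_values = {name: 0.0 for name in candidate_indexes}
counts = {name: 0 for name in candidate_indexes}
epsilon = 0.1

for _ in range(200):
    if random.random() < epsilon:              # explore a random candidate
        choice = random.choice(candidate_indexes)
    else:                                      # exploit the best estimate so far
        choice = max(q_values, key=q_values.get)
    reward = -measure_latency(choice)          # lower latency -> higher reward
    counts[choice] += 1
    q_values[choice] += (reward - q_values[choice]) / counts[choice]  # running average

print(max(q_values, key=q_values.get))         # index with the best estimated reward
```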
Authors: A. Vobecky, O. Siméoni, D. Hurych, S. Gidaris, A. Bursuc, P. Pérez, J. Sivic.
Publication date: 2023.
Published at the Advances in Neural Information Processing Systems (NeurIPS) Conference.
This research describes an approach to predict open-vocabulary 3D semantic voxel occupancy map from input 2D images with the objective of enabling 3D grounding, segmentation and retrieval of free-form language queries. This is a challenging problem because of the 2D-3D ambiguity and the open-vocabulary nature of the target tasks, where obtaining annotated training data in 3D is difficult. The contributions of this work are three-fold: a new model architecture for open-vocabulary 3D semantic occupancy prediction; a tri-modal self-supervised learning algorithm that leverages three modalities: (i) images, (ii) language and (iii) LiDAR point clouds, and enables training the proposed architecture using a strong pre-trained vision-language model without the need for any 3D manual language annotations; and a quantitative demonstration of the strengths of the proposed model on several open-vocabulary tasks.
Authors: V. Mlýnský, P. Kührová, P. Stadlbauer, M. Krepl, M. Otyepka, P. Banás, J. Šponer.
Publication date: 2023.
Published in the Journal of Chemical Theory and Computation.
Molecular dynamics (MD) simulations represent an established tool to study RNA molecules. The outcome of MD studies depends, however, on the quality of the force field (ff). Here researchers suggest a correction for the widely used AMBER OL3 ff by adding a simple adjustment of the nonbonded parameters. The research suggests that the combination of OL3 RNA ff and NBfix0BPh modification is a viable option to improve RNA MD simulations.
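In practice, a correction of this type amounts to overriding the Lennard-Jones parameters for selected atom-type pairs instead of deriving them from the standard combination rules; schematically (an illustrative generic form, not the specific parameters introduced in the paper):

$$ V_{ij}(r) = 4\varepsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r}\right)^{12} - \left(\frac{\sigma_{ij}}{r}\right)^{6}\right], $$

where $\varepsilon_{ij}$ and $\sigma_{ij}$ follow the usual combination rules for most atom-type pairs but are replaced by explicitly fitted values for the targeted pairs.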
Authors: P. Harsh, S. Hachinger, M. Derquennes, A. Edmonds, P. Karagoz, M. Golasowski, M. Hayek and J. Martinovič.
Publication date: 2023.
Published in the Proceedings of Science journal.
In this contribution, researchers sketch an application of Earth System Sciences and Cloud-/Big-Data-based IT, which shall soon leverage European supercomputing facilities: smart viticulture, as put into practice by Terraview. TerraviewOS is a smart vineyard ‘operating system’, allowing wine cultivators to optimise irrigation, harvesting dates and measures against plant diseases. The system relies on satellite and drone imagery as well as in-situ sensors where available. The substantial need for computing power in TerraviewOS, in particular for training AI-based models to generate derived data products, makes the further development of some of its modules a prime application case for the EXA4MIND project.
The Data Management Plan lays out the planning for handling the main aspects of the life cycle of the project data (data organisation and long-term storage, access, preservation, and sharing). This document also includes a preliminary specification of outputs (what data will be generated during the project). It is a living document and will be continuously updated during the project.
The EXA4MIND project connects pre-eminent databases and data management systems to supercomputing systems and European Data Spaces as well as the world of FAIR research data. The core purpose of this endeavour is running next-generation Extreme Data workflows, with emphasis on data analytics, Machine Learning / Artificial Intelligence, or classical simulations. This deliverable reports on the Data and Workflow Management Toolbox provided for this purpose, building upon the successful LEXIS Platform (delivered by the H2020 project, GA 825532). Furthermore, it illustrates the first workflows run by our application cases at supercomputing centres.
Welcome to the seventh newsletter of the EXA4MIND Project – Prague Plenary sets the stage for 2025. In this edition, you will find all about our last Plenary Meeting, the latest Scientific Contributions and new interviews of the ‘Faces of EXA4MIND’ campaign.
Welcome to the sixth newsletter of the EXA4MIND Project – Two years of Extreme Data innovations and progress. In this edition, you will find an editorial by Stephan Hachinger, Science and Co-design Coordinator of EXA4MIND, our participation in international events such as EBDVF together with the DataNexus Cluster, ELIXIR CZ Annual Conference 2024 and SC24, new videos of the ‘Faces of EXA4MIND’ campaign and the latest DataNexus Cluster videos.
Welcome to the fifth newsletter of the EXA4MIND Project – In the middle of the journey. In this edition, you will find an editorial by Jan Martinovič, the project coordinator, who reflects and takes stock of what has been achieved so far in his letter ‘In the middle of the journey’, an article about how EXA4MIND contributes to the development of European Data Spaces, the latest scientific development from our partner Valeo: ‘Valeo4Cast: A Modular Approach to End-to-End Forecasting’, and our latest Synergies and Partnerships: the DataNexus cluster and the collaboration with StandICT.eu.
Welcome to the fourth newsletter of the EXA4MIND Project – time to reach the next level! In this edition, you will find all about our last plenary meeting, a new method to improve the safety and reliability of autonomous driving by our consortium member Antonín Vobecký from Czech Technical University in Prague, EXA4MIND’s participation in international events, new videos of the ‘Faces of EXA4MIND’ campaign featuring our consortium members, and upcoming events.
Welcome to the third newsletter of the EXA4MIND Project – On to the second year of the project! In this edition, you will find the consortium partners’ review of the first year of the project and the objectives for the second year, a presentation of the EXA4MIND External Advisory Board, highlights of international events attended by the project in recent months, and our latest campaign ‘Faces of EXA4MIND’.
Welcome to the second newsletter of the EXA4MIND Project – The journey continues! In this edition, you will find details about the last plenary meeting and co-design meeting with Application Cases partners, highlights from all national and international events attended by the EXA4MIND project in recent months, and a preview of upcoming events.
Welcome to the first newsletter of the EXA4MIND project. We are glad to have you on board! In this edition, you will find a warm welcome from the EXA4MIND project coordinator, information about the organisations driving the project and their expectations of EXA4MIND, the presentation of our application cases, a recap of the events in which EXA4MIND has been actively involved, and interesting news about TerraviewOS, a consortium partner, which has emerged as the winner of the Gravity05 global sustainability challenge.