EXA4MIND relies on a co-design approach, where technology partners from computing centres and universities and application partners from industry, academia and SMEs design an Extreme Data infrastructure in close collaboration.
Authors: Viktoria Pauwa, David Číž, Vojtěch Mlýnský, Pavel Banáš, Michal Otyepka, Stephan Hachinger and Jan Martinovič.
Publication date: 2024
The ongoing work presented in this article explores different technical approaches and systems for managing and analysing data obtained from large physics simulations, optimising the respective data-driven workflows across Cloud-Computing (IaaS) and HPC systems. The work is carried out in the context of the EXA4MIND Horizon Europe project, which produces an Extreme Data processing platform, bringing together specialised data management systems and powerful computing infrastructures. We evaluate two typical use cases with physics simulations carried out on supercomputing systems at LRZ and IT4Innovations. These use cases come from different areas of physics: one focuses on the treatment of low-energy many-body systems of molecules, the other on high-energy (relativistic) elementary particles.
Authors: R. F. Cekinel, Ç. Çöltekin, P. Karagoz.
Publication date: 2024
The rapid spread of misinformation through social media platforms has raised concerns regarding its impact on public opinion. While misinformation is prevalent in other languages, the majority of research in this field has concentrated on the English language. Hence, there is a scarcity of datasets for other languages, including Turkish. To address this concern, we have introduced the FCTR dataset, consisting of 3238 real-world claims. This dataset spans multiple domains and incorporates evidence collected from three Turkish fact-checking organizations. Additionally, we aim to assess the effectiveness of cross-lingual transfer learning for low-resource languages, with a particular focus on Turkish.
Authors: G. Puy, S. Gidaris, A. Boulch, O. Siméoni, C. Sautier, P. Pérez, A. Bursuc, R. Marlet
Publication date: 2024
Self-supervised image backbones can be used to address complex 2D tasks (e.g., semantic segmentation, object discovery) very efficiently and with little or no downstream supervision. Ideally, 3D backbones for lidar should be able to inherit these properties after distillation of these powerful 2D features. The most recent methods for image-to-lidar distillation on autonomous driving data show promising results, obtained thanks to distillation methods that keep improving. Yet, we still notice a large performance gap when measuring the quality of distilled and fully supervised features by linear probing. In this work, instead of focusing only on the distillation method, we study the effect of three pillars for distillation: the 3D backbone, the pretrained 2D backbones, and the pretraining dataset. In particular, thanks to our scalable distillation method named ScaLR, we show that scaling the 2D and 3D backbones and pretraining on diverse datasets leads to a substantial improvement of the feature quality. This allows us to significantly reduce the gap between the quality of distilled and fully-supervised 3D features, and to improve the robustness of the pretrained backbones to domain gaps and perturbations.
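As a rough illustration of the linear-probing evaluation mentioned above: the backbone is kept frozen and only a linear classifier is trained on its output features, so the resulting accuracy reflects the quality of the features themselves. The sketch below is not the ScaLR code. The function name `linear_probe`, the hyperparameters and the synthetic Gaussian "features" are assumptions chosen only to make the example self-contained.

```python
import numpy as np


def linear_probe(train_x, train_y, test_x, test_y, num_classes, lr=0.1, epochs=200):
    """Train a softmax linear classifier on frozen features; return test accuracy.

    Hypothetical helper for illustration only; the features are never updated.
    """
    # Standardise features using training statistics only.
    mean = train_x.mean(axis=0)
    std = train_x.std(axis=0) + 1e-8
    train_x = (train_x - mean) / std
    test_x = (test_x - mean) / std

    n, d = train_x.shape
    w = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[train_y]

    for _ in range(epochs):
        logits = train_x @ w + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n  # gradient of the mean cross-entropy w.r.t. logits
        w -= lr * (train_x.T @ grad)
        b -= lr * grad.sum(axis=0)

    pred = (test_x @ w + b).argmax(axis=1)
    return float((pred == test_y).mean())


# Synthetic stand-in for frozen backbone features (assumption: 3 classes, 16 dims).
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 16)) * 3.0
labels = rng.integers(0, 3, size=600)
feats = centers[labels] + rng.normal(size=(600, 16))

acc = linear_probe(feats[:400], labels[:400], feats[400:], labels[400:], num_classes=3)
print(f"linear-probe accuracy: {acc:.3f}")
```

In practice, `feats` would be replaced by per-point features extracted once from the frozen 3D backbone, and the same probe would be run on both distilled and fully supervised features to compare them.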
Authors: M. M. Khosravi, P. Karagoz, I. H. Toroslu.
Publication date: 2024
In this work, we consider automated index selection for NoSQL databases and investigate the feasibility of solutions based on supervised learning and reinforcement learning. The experiments conducted on the YCSB dataset show that reinforcement learning improves index selection performance, as it does in relational databases, and that supervised learning gives promising results and can be considered applicable given a sufficient amount of training data.
Authors: A. Vobecky, O. Siméoni, D. Hurych, S. Gidaris, A. Bursuc, P. Pérez, J. Sivic.
Publication date: 2023
This research describes an approach to predict an open-vocabulary 3D semantic voxel occupancy map from input 2D images, with the objective of enabling 3D grounding, segmentation and retrieval of free-form language queries. This is a challenging problem because of the 2D-3D ambiguity and the open-vocabulary nature of the target tasks, where obtaining annotated training data in 3D is difficult. The contributions of this work are three-fold: a new model architecture for open-vocabulary 3D semantic occupancy prediction; a tri-modal self-supervised learning algorithm that leverages three modalities, (i) images, (ii) language and (iii) LiDAR point clouds, and enables training the proposed architecture using a strong pre-trained vision-language model without the need for any manual 3D language annotations; and a quantitative demonstration of the strengths of the proposed model on several open-vocabulary tasks.
Authors: V. Mlýnský, P. Kührová, P. Stadlbauer, M. Krepl, M. Otyepka, P. Banáš, J. Šponer.
Publication date: 2023
Molecular dynamics (MD) simulations represent an established tool to study RNA molecules. The outcome of MD studies depends, however, on the quality of the force field (ff). Here researchers suggest a correction for the widely used AMBER OL3 ff by adding a simple adjustment of the nonbonded parameters. The research suggests that the combination of OL3 RNA ff and NBfix0BPh modification is a viable option to improve RNA MD simulations.
Authors: P. Harsh, S. Hachinger, M. Derquennes, A. Edmonds, P. Karagoz, M. Golasowski, M. Hayek and J. Martinovič.
Publication date: 2023
In this contribution, researchers sketch an application of Earth System Sciences and Cloud-/Big-Data-based IT, which shall soon leverage European supercomputing facilities: smart viticulture, as put into practice by Terraview. TerraviewOS is a smart vineyard ‘operating system’, allowing wine cultivators to optimise irrigation, harvesting dates and measures against plant diseases. The system relies on satellite and drone imagery as well as in-situ sensors where available. The substantial need for computing power in TerraviewOS, in particular for training AI-based models to generate derived data products, makes the further development of some of its modules a prime application case for the EXA4MIND project.
The EXA4MIND project connects pre-eminent databases and data management systems to supercomputing systems and European Data Spaces as well as the world of FAIR research data. The core purpose of this endeavour is running next-generation Extreme Data workflows, with emphasis on data analytics, Machine Learning / Artificial Intelligence, or classical simulations. This deliverable reports on the Data and Workflow Management Toolbox provided for this purpose, building upon the successful LEXIS Platform (delivered by the H2020 project, GA 825532). Furthermore, it illustrates the first workflows run by our application cases at supercomputing centres.
Welcome to the fourth newsletter of the EXA4MIND Project – time to reach the next level! In this edition, you will find all about our last plenary meeting, a new method to improve the safety and reliability of autonomous driving by our consortium member Antonín Vobecký from Czech Technical University in Prague, EXA4MIND’s participation in international events, new videos of the ‘Faces of EXA4MIND’ campaign featuring our consortium members, and upcoming events.
Welcome to the third newsletter of the EXA4MIND Project – On to the second year of the project! In this edition, you will find the consortium partners' review of the first year of the project and the objectives for the second year, a presentation of the EXA4MIND External Advisory Board, highlights of international events attended by the project in recent months, and our latest campaign, ‘Faces of EXA4MIND’.
Welcome to the second newsletter of the EXA4MIND Project – The journey continues! In this edition, you will find details about the last plenary meeting and co-design meeting with Application Cases partners, highlights from all national and international events attended by the EXA4MIND project in the last months, and a preview of upcoming events.
Welcome to the first newsletter of the EXA4MIND project. We are glad to have you on board! In this edition, you will find a warm welcome from the EXA4MIND project coordinator, information about the organisations driving the project and their expectations of EXA4MIND, the presentation of our application cases, a recap of the events in which EXA4MIND has been actively involved, and interesting news about Terraview, a consortium partner, which has emerged as the winner of the Gravity05 global sustainability challenge.