High-performance Computing System for Machine Learning Models

Subject Area Computer Science

Term Funded in 2026

Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 577348670

Project Description

Humankind is generating copious amounts of unstructured data (e.g., websites, blogs, books, photos, and videos). Modern machine learning (ML) and natural language processing (NLP) algorithms can consolidate and compile this wealth of information into tractable, complete, and robust knowledge representations. Doing so provides humans with the relevant information to make evidence-based decisions and avoid errors (e.g., in high-stakes contexts such as journalism, clinical medicine, or personal finance). Nevertheless, few if any generative models built on modern decoder architectures (such as GPT) are dedicated to German language resources, particularly in specialist domains. As a result, most cutting-edge research, innovation, and translational implementation in this critical area is dominated by efforts from the Anglo-American world and Asia. We rely almost exclusively on English language models (sometimes with multilingual capacities) and data, which, when applied to the German context, result in a significant loss of fidelity. The purpose of this proposal is to establish high-performance ML capabilities at the University of Tübingen (UT). We will thereby help to close a critical gap in the German language space, provide a unique resource for numerous research groups (including the DFG-funded Cluster of Excellence “Machine Learning: New Perspectives for Science”), strengthen the position of the UT as a university of excellence in the field of AI, and overall reinforce the digital sovereignty of Germany.
The main research objectives for this hardware are (1) to develop transparent and explainable German language models in the life sciences domain; (2) to establish learning optimizers for novel view synthesis in vision models; (3) to increase the robustness of language models to adversarial attacks; and (4) to develop scalable simulation environments and reinforcement learning frameworks for training and evaluating language and vision models in dynamic, interactive scenarios. The current proposal outlines the highly specialized computer systems necessary to attain these goals. Modern ML approaches rely heavily on very large transformer-based neural network models, which require specialized Graphics Processing Unit (GPU) cards with the largest possible local memory (i.e., ≥ 80 GB per card) and the coupling of cards across multiple nodes via seamless inter-node connections (e.g., InfiniBand). Such hardware architectures are at present not available in many existing scientific computing centers in Germany. To address this gap, we propose extending the existing DFG- and EU-funded ML Cloud environment in Tübingen with additional compute nodes and storage units designed explicitly for massive-scale ML and NLP applications. Adding the requested extension to the ML Cloud will broadly advance basic and applied research at the Tübingen site and beyond.

DFG Programme Major Research Instrumentation

Major Instrumentation High-performance computer system for machine learning models

Instrumentation Group 7040 Vector computers

Applicant Institution Eberhard Karls Universität Tübingen

Leader Professor Dr. Carsten Eickhoff
