Ultrafast haplotype and genotype estimation of genome-wide data on an FPGA-GPU hybrid system

Subject area Data management, data-intensive systems, computer science methods in business informatics
Bioinformatics and theoretical biology
Epidemiology and medical biometry/statistics
Computer architecture, embedded and massively parallel systems
Funding Funded from 2017 to 2022
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 351403079
 
Year of creation 2022

Summary of project results

In this project, we developed EagleImp, a software tool that combines haplotype phasing and genotype imputation in a single application. EagleImp is a fast and accurate stand-alone tool that builds on the concepts of two established tools: Eagle2 for phasing and PBWT for imputation. Owing to algorithmic and technical improvements, including changes in the data structure, EagleImp is 2 to 10 times faster than the combination of Eagle2 and PBWT while providing the same or better phasing and imputation quality in all tested scenarios. For common variants investigated in typical GWAS, EagleImp also yielded equal or higher imputation accuracy (r2) than the Sanger Imputation Service (SIS), the Michigan Imputation Server (MIS) and the TOPMed Imputation Server, which use larger reference panels that are not freely available. Thanks to technical optimizations and improved software stability, EagleImp can perform phasing and imputation with upcoming very large reference panels of more than 1 million genomes. EagleImp is freely available on GitHub.
With EagleImp-Web, we further provide the community with a free and easy-to-use web service that runs an FPGA-accelerated version of EagleImp in the background. The FPGA hardware design yields a further speed increase of up to 66% compared to CPU-only processing. The web service offers fast, secure and high-quality genome-wide genotype phasing and imputation, with many security and convenience features that other services lack. For example, users can select algorithmic parameters (such as the K parameter for phasing) and tailor input and output data to their needs by setting the tolerance for ref/alt swaps and strand flips and by choosing the required output information (such as allele dosages, genotype dosages and genotype probabilities).
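The tolerance settings for ref/alt swaps and strand flips concern how the alleles of a target variant line up with those of the reference panel. The following is a minimal sketch of such a check for biallelic SNPs. The function name `classify_alleles` and the category labels are hypothetical and are not taken from EagleImp, whose actual matching logic may differ.

```python
# Illustrative only: not EagleImp's implementation.
# Complement of each nucleotide on the opposite DNA strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def classify_alleles(target, reference):
    """Compare (ref, alt) allele pairs of a biallelic SNP.

    Both arguments are tuples of single-nucleotide strings, e.g. ("A", "G").
    Returns one of the hypothetical labels:
    "match", "ref_alt_swap", "strand_flip", "strand_flip_and_swap", "mismatch".
    """
    t_ref, t_alt = target
    flipped = (COMPLEMENT[t_ref], COMPLEMENT[t_alt])

    if (t_ref, t_alt) == reference:
        return "match"
    # Checks are ordered: a palindromic pair (A/T or C/G) that could be
    # either a swap or a strand flip is reported as a swap here.
    if (t_alt, t_ref) == reference:
        return "ref_alt_swap"
    if flipped == reference:
        return "strand_flip"
    if (flipped[1], flipped[0]) == reference:
        return "strand_flip_and_swap"
    return "mismatch"


if __name__ == "__main__":
    panel = ("A", "G")
    for target in [("A", "G"), ("G", "A"), ("T", "C"), ("C", "T"), ("A", "C")]:
        print(target, "->", classify_alleles(target, panel))
```

Palindromic SNPs (A/T or C/G) cannot be resolved from the alleles alone, because a strand flip and a ref/alt swap produce the same allele pair. A real pipeline would need additional information, such as allele frequencies, or would have to exclude such variants.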
EagleImp-Web also provides transparent monitoring of the user's jobs and makes all result files (including log files) available for download. User accounts protect all files belonging to a user from unauthorized access, and security can optionally be enhanced by two-factor authentication. EagleImp-Web complies with the General Data Protection Regulation (GDPR) of the European Union and is available on our website.
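The output quantities named in the summary (allele dosage, genotype probabilities) and the accuracy measure r2 are related in a simple way. The sketch below assumes that the allele dosage is the expected number of alternative alleles under the genotype probabilities, and that r2 is the squared Pearson correlation between imputed dosages and true genotypes, which is one common definition. The function names and the example data are hypothetical and are not taken from EagleImp.

```python
import math


def allele_dosage(genotype_probs):
    """Expected ALT allele count from (P(0/0), P(0/1), P(1/1))."""
    _p_hom_ref, p_het, p_hom_alt = genotype_probs
    return p_het + 2.0 * p_hom_alt


def squared_pearson_r(xs, ys):
    """Squared Pearson correlation; NaN if either input has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return math.nan
    return (cov * cov) / (var_x * var_y)


if __name__ == "__main__":
    # Made-up genotype probabilities for four samples at one variant.
    imputed_probs = [
        (0.90, 0.09, 0.01),
        (0.10, 0.80, 0.10),
        (0.02, 0.18, 0.80),
        (0.30, 0.60, 0.10),
    ]
    true_genotypes = [0, 1, 2, 1]  # made-up ALT allele counts
    dosages = [allele_dosage(gp) for gp in imputed_probs]
    print("dosages:", dosages)
    print("r2:", squared_pearson_r(dosages, true_genotypes))
```

Using dosages rather than hard genotype calls keeps the imputation uncertainty in the accuracy estimate, since a confident wrong call and an uncertain call contribute differently to the correlation.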
