Project Details

Analysis and prediction of N-terminal protein sorting signals based on proteogenomics data

Subject Area Bioinformatics and Theoretical Biology
Term from 2011 to 2018
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 207965315
 
Final Report Year 2017

Final Report Abstract

The aims of the project were i) to expand our knowledge about cleavable N-terminal signals, including various types of signal peptides, mitochondrial targeting signals and chloroplast transit peptides, ii) to better understand the sequence requirements imposed on signal sequences in different cellular contexts, iii) to gain a more complete knowledge of their structure and function, and iv) to discover novel archetypes of targeting sequences that have been elusive so far. Furthermore, we intended to shed light on the factors determining export pathway specificity, secretion efficiency, and disease propensity by analyzing mutation spectra, the evolutionary conservation of cleavage site positions, and the gain and loss of entire signals along the evolutionary tree. Finally, we planned to develop novel prediction algorithms for cellular sorting signals.

In the first phase of the project we analyzed the variety of signal peptides in bacteria based on proteogenomics data, in close collaboration with colleagues from the Pacific Northwest National Laboratory and University of San Diego. Discoveries include fundamentally distinct signal peptide motifs from Alphaproteobacteria, Spirochaetes, Thermotogae and Euryarchaeota. In these novel motifs, alanine is no longer the dominant residue but has been replaced in a different way in each taxon. Surprisingly, divergent motifs correlate with a proteome-wide reduction in alanine. Computational analyses of ~1,500 genomes reveal numerous major evolutionary clades that have replaced the canonical signal peptide sequence with novel motifs.

In a follow-up study we provided updated estimates of the number of signal peptides in bacteria. A single proteogenomics experiment recovered more than a third of all signal peptides that had been experimentally determined during the past three decades and confirmed at least 31 additional signal peptides, mostly in known exported proteins, which had previously been predicted but not validated. Filtering putative signal peptides by peptide length, the presence of an eight-residue hydrophobic patch, and a typical signal peptidase cleavage site proved sufficient to eliminate false-positive hits. Surprisingly, the results of this proteogenomics study, as well as a re-analysis of the E. coli genome with the latest version of the SignalP program, showed that the fraction of proteins containing signal peptides is only about 10%, which is half of previous estimates.

Next, we attempted to apply contact prediction methods specifically developed for transmembrane proteins in order to better distinguish between hydrophobic transmembrane segments and signal peptides. While we did observe some informative signal in the patterns of residue co-variation, it was not sufficient to compete with the SignalP method.

In the final part of the project, we analyzed signal peptide gain and loss in evolution. The central question of this study was whether orthologous proteins can differ in their ability to be secreted. To answer this question, we investigated the distribution of signal peptides within the orthologous groups of Enterobacterales. Parsimony analysis and sequence comparisons revealed a large number of gain and loss events, in which signal peptides emerge or disappear in the course of evolution. Signal peptide losses prevail over gains, an effect that is especially pronounced in the transition from a free-living or commensal to an endosymbiotic lifestyle. The disproportionate decline in the number of signal peptide-containing proteins in endosymbionts cannot be explained by the overall reduction of their genomes. Signal peptides can be gained and lost either by acquisition or elimination of the corresponding N-terminal regions or by gradual accumulation of mutations. The evolutionary dynamics of signal peptides in bacterial proteins thus represents a powerful mechanism of functional diversification.
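The abstract does not say which parsimony procedure was used, so the following is only an illustrative sketch of how gain and loss events could be counted for one orthologous group. It assumes Fitch parsimony on a rooted binary tree with a presence/absence character (1 = signal peptide present, 0 = absent). The function name, the nested-tuple tree encoding, the leaf names, and the `root_default` tie-breaking parameter are all hypothetical and not taken from the project.

```python
def count_gains_losses(tree, leaf_states, root_default=1):
    """Count signal peptide gains and losses on a rooted binary tree.

    tree         -- nested 2-tuples of leaf names (hypothetical encoding)
    leaf_states  -- dict mapping leaf name to 1 (present) or 0 (absent)
    root_default -- state chosen at the root when both are equally
                    parsimonious (an assumption of this sketch)
    """
    state_sets = {}

    def up(node):
        # Bottom-up Fitch pass: intersect child sets, else take the union.
        if isinstance(node, str):
            options = {leaf_states[node]}
        else:
            left, right = node
            a, b = up(left), up(right)
            options = (a & b) or (a | b)
        state_sets[node] = options
        return options

    gains = losses = 0

    def down(node, parent_state):
        # Top-down pass: keep the parent state whenever it is allowed.
        nonlocal gains, losses
        options = state_sets[node]
        state = parent_state if parent_state in options else next(iter(options))
        if parent_state == 0 and state == 1:
            gains += 1
        elif parent_state == 1 and state == 0:
            losses += 1
        if not isinstance(node, str):
            for child in node:
                down(child, state)

    root_options = up(tree)
    root_state = root_default if root_default in root_options else next(iter(root_options))
    down(tree, root_state)
    return gains, losses


# Made-up example: three free-living/commensal species retain the signal
# peptide, two endosymbionts lack it.
example_tree = ((("free_A", "free_B"), "commensal_C"), ("endo_D", "endo_E"))
example_states = {"free_A": 1, "free_B": 1, "commensal_C": 1, "endo_D": 0, "endo_E": 0}

example_result = count_gains_losses(example_tree, example_states)
print(example_result)  # (0, 1): one loss on the branch leading to the endosymbionts
```

In this toy case the root state is ambiguous, so the choice of `root_default` decides whether the single event is read as a loss in the endosymbiont lineage or a gain in the other clade. A real analysis would need an outgroup or additional evidence to resolve such cases.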
