Project Details

The DNA from a Coding Perspective

Subject Area: Electronic Semiconductors, Components and Circuits, Integrated Systems, Sensor Technology, Theoretical Electrical Engineering
Term: 2017 to 2023
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 328529227
 
Final Report Year: 2022

Final Report Abstract

DNA was investigated from the perspective of information theory and communications engineering to gain deeper insight into the properties of DNA sequences and to link these engineering perspectives to biological aspects. Several new aspects of DNA could be elucidated, and known properties could be understood and formally proven. Treating mutations and their probabilities as a communication channel, and computing mutual information over time by raising the channel matrix to a power, allowed us to prove properties of the mapping between codons and amino acids; in particular, synonymous mappings and the wobble rule became directly visible. Moreover, we found that at some point the information content of a base pair, a quaternary representation, is reduced to a single bit, possibly allowing only a distinction between purines and pyrimidines. We could show that Shannon entropy is a good general indicator of biologically relevant DNA features. With a screening based on Shannon entropy, we identified promoter features that are sensitive to DNA 3D structure and determinative for temporal gene expression. With the same approach, we identified repetitive sequence features able to modulate basal levels of gene expression. With this information about promoter design, custom-made promoters with desired features are within reach for synthetic biology, allowing more predictable engineering approaches in biology and biotechnology. Furthermore, we were able to quantify rearrangements in the order of genes, which form patterns of gene migration during the evolution of bacterial chromosomes. This work revealed a fundamental driving force of bacterial chromosome evolution and paves the way to a more reliable construction of stable synthetic chromosomes.
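The idea of a mutation channel whose matrix is raised to a power can be illustrated with a minimal sketch. It is not the project's actual model: it assumes a Kimura-style two-parameter substitution matrix (transitions more likely than transversions), a uniform distribution over the four bases, and hypothetical function names and parameter values chosen only for illustration.

```python
import numpy as np

# Base order: purines (A, G) first, pyrimidines (C, T) second.
BASES = "AGCT"


def kimura_channel(alpha: float, beta: float) -> np.ndarray:
    """One-step mutation channel (assumed model, not taken from the report).

    alpha: probability of a transition (A<->G or C<->T) per step.
    beta:  probability of each of the two possible transversions per step.
    """
    stay = 1.0 - alpha - 2.0 * beta
    return np.array([
        [stay, alpha, beta, beta],
        [alpha, stay, beta, beta],
        [beta, beta, stay, alpha],
        [beta, beta, alpha, stay],
    ])


def mutual_information(channel: np.ndarray, prior: np.ndarray) -> float:
    """Mutual information I(X;Y) in bits for input distribution `prior`."""
    joint = prior[:, None] * channel              # p(x, y)
    output = joint.sum(axis=0)                    # p(y)
    independent = prior[:, None] * output[None, :]
    mask = joint > 0                              # skip 0 * log(0) terms
    return float(np.sum(joint[mask] * np.log2(joint[mask] / independent[mask])))


def information_over_time(channel, steps, prior=None):
    """I(X; Y_t) for each t in `steps`, with Y_t the output of channel**t."""
    if prior is None:
        size = channel.shape[0]
        prior = np.full(size, 1.0 / size)
    return [
        mutual_information(np.linalg.matrix_power(channel, t), prior)
        for t in steps
    ]


if __name__ == "__main__":
    channel = kimura_channel(alpha=0.05, beta=1e-5)  # illustrative values
    for t, bits in zip([0, 200, 10**6],
                       information_over_time(channel, [0, 200, 10**6])):
        print(f"t = {t:>7}: {bits:.3f} bits")
```

Under these assumptions the mutual information starts at 2 bits for an unmutated base, settles close to 1 bit once transitions have mixed the bases within each class while transversions are still rare, and finally decays towards zero. The intermediate plateau corresponds to a symbol that only tells purines from pyrimidines.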
Based on information-theoretic features such as entropies (Shannon and Gibbs), mutual information, conditional mutual information, Kullback-Leibler divergence, and Markov models, we performed intra-organism and cross-organism prediction of essential genes in bacteria, archaea, and eukaryotes. With this simple sequence-based approach, we obtained AUC performance comparable to that of much more elaborate methods, e.g., CRISPR-based ones. For the Markov modeling, a suitable model order had to be estimated. Our studies on essentiality led to a cooperation with colleagues in Israel and to joint publications, and they therefore became a larger share of the project than originally anticipated. As a consequence, gene regulatory networks, presumably the highest layer of protection in a kind of graph-related error-correction mechanism, were only touched upon, mainly by looking into synthetic lethality networks. Synthetic lethality can be regarded as a repetition code, but it is not limited to gene duplications or simple functional replacements. There are pathways that provide redundancy, and hence a more complicated multi-level “code” graph appears appropriate. Jointly examining co-regulatory networks has so far not led to an understanding of the connection degrees of certain genes, especially of hubs with many connections; more studies are needed on this aspect. The project not only led to international cooperations but also to a central contribution to an NSF workshop to which we were invited. The NSF is trying to launch a cooperative initiative between engineering and the life sciences, similar to the DFG’s earlier framework project. The NSF, too, had realized that such cooperation can significantly enhance the understanding of genetic structure and function, which we have indeed experienced as well.
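Two of the sequence-based features named above, Shannon entropy and a Markov model of estimated order, can be sketched as follows. This is only an illustration under stated assumptions: the report does not say how the Markov order was estimated, so the Bayesian information criterion (BIC) used here is a stand-in, and all function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(seq: str, k: int = 1) -> float:
    """Entropy in bits of the distribution of overlapping k-mers in `seq`."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def markov_log_likelihood(seq: str, order: int, start: int) -> float:
    """Maximised log-likelihood (natural log) of seq[start:] under a Markov
    chain of the given order, with transition probabilities estimated from
    the same positions."""
    context_counts = Counter()
    transition_counts = Counter()
    for i in range(start, len(seq)):
        context = seq[i - order:i]
        context_counts[context] += 1
        transition_counts[(context, seq[i])] += 1
    return sum(
        n * math.log(n / context_counts[context])
        for (context, _), n in transition_counts.items()
    )


def select_markov_order(seq: str, max_order: int, alphabet_size: int = 4) -> int:
    """Return the order in 0..max_order with the lowest BIC.

    BIC is an assumption of this sketch, not the method stated in the report.
    All orders are scored on the same positions so the values are comparable.
    """
    if len(seq) <= max_order:
        raise ValueError("sequence too short for the requested maximum order")
    n = len(seq) - max_order
    best_order, best_bic = 0, math.inf
    for order in range(max_order + 1):
        log_lik = markov_log_likelihood(seq, order, start=max_order)
        free_params = (alphabet_size ** order) * (alphabet_size - 1)
        bic = -2.0 * log_lik + free_params * math.log(n)
        if bic < best_bic:  # strict comparison keeps the lowest order on ties
            best_order, best_bic = order, bic
    return best_order


if __name__ == "__main__":
    toy = "ACGT" * 50  # toy sequence, not real genomic data
    print(shannon_entropy(toy), select_markov_order(toy, max_order=3))
```

In a gene-essentiality setting, values like these would be computed per gene and passed to a classifier together with the other features listed above; that step is outside the scope of this sketch.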
