Genomics III: Methods, Techniques and Applications
iConcept Press

Chapter 9

Identification of Coevolving Amino Acids Using Mutual Information

by Elin Teppa, Diego Javier Zea and Cristina Marino Buslje


Identifying the functionally important residues of a protein has long been a matter of interest, and many experimental and computational approaches have been applied to the problem. Amino acid changes caused by mutations at different positions in a protein are passed on to the next generation. These changes do not accumulate randomly: function and structure impose different constraints on different positions. Multiple sequence alignments (MSAs) of homologous proteins carry that evolutionary information. Important amino acid positions are often conserved; however, mutation studies have shown that many non-conserved positions may also be functionally important. In particular, there may be compensatory mutations, in which a mutation at one position in a protein (one column of the MSA) induces a coordinated mutation at one or more other positions elsewhere in the protein (other columns of the MSA). These coevolving positions are of key interest because they identify residues that interact within the protein and are engaged in a particular function, for example catalysis, structure stabilization, protein-protein and substrate interaction, and allosteric regulation. Mutual information (MI) is often used to predict positional correlations in an MSA, which makes it possible to analyze positions that are structurally or functionally important. Accurate identification of coevolving positions in protein sequences is difficult because of the high background signal imposed by phylogeny and noise. Identification of catalytic residues (CR) is essential for the characterization of enzyme function. Catalytic residues are in general conserved and located in the functional site of a protein. However, many non-catalytic residues are also highly conserved, and not all CR are conserved throughout a given protein family, which makes identification of true catalytic residues a challenging task.
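As an illustration of the positional correlation discussed above, the following is a minimal sketch of raw mutual information between two MSA columns, computed from observed residue frequencies. The function names, the gap handling (sequences with a gap at either position are skipped) and the toy alignment are assumptions made for this example. The sketch applies none of the corrections for phylogenetic background or noise that the text identifies as the main difficulty, so it should not be read as the authors' method.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_a, col_b):
    """Raw MI (in bits) between two MSA columns given as equal-length strings."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must contain the same number of sequences")
    # Assumption: ignore sequences that have a gap at either position.
    pairs = [(a, b) for a, b in zip(col_a, col_b) if a != "-" and b != "-"]
    n = len(pairs)
    if n == 0:
        return 0.0
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    count_ab = Counter(pairs)
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # p(a,b) / (p(a) * p(b)) written with counts to avoid extra divisions.
        mi += p_ab * log2(p_ab * n * n / (count_a[a] * count_b[b]))
    return mi


def msa_columns(sequences):
    """Transpose a list of aligned sequences into a list of column strings."""
    return ["".join(column) for column in zip(*sequences)]


# Toy alignment (hypothetical data): column 0 is conserved, columns 1 and 2
# change together, and column 3 varies independently of column 1.
toy_msa = ["AKLE", "AKLD", "ARIE", "ARID"]
cols = msa_columns(toy_msa)

mi_coupled = column_mutual_information(cols[1], cols[2])
mi_independent = column_mutual_information(cols[1], cols[3])
mi_conserved = column_mutual_information(cols[0], cols[1])
```

In this toy case the coupled pair carries one bit of MI, while the independent pair and the pair involving the fully conserved column carry none. The last case shows why conservation alone says nothing about coevolution: a column that never varies cannot covary with any other.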
We put forward the hypothesis that CR carry a particular signature defined by networks of residues in close proximity with high MI, and that this signature can be used to distinguish functional residues from other, non-functional conserved residues. We demonstrate that networks of residues with high MI provide a distinct signature for CR, and propose that such a signature should be present in other classes of functional residues where the requirement to maintain a particular function limits the diversification of the structural environment over the course of evolution. Furthermore, we analyzed the properties of the networks of coevolved residues in enzymes and describe the relationship between coevolution and the 3D structure of the protein. We introduce a new concept, the analysis of MI3D clusters, which combine evolutionary and three-dimensional information. We observed that networks of coevolving residues tend to be close in space, forming a sector when mapped onto the 3D structure. Moreover, we found that, among the many MI3D clusters usually present in a protein domain, those containing catalytic residues have distinguishable network properties. We also observed that these clusters usually evolve independently, which could be related to a fail-safe mechanism. Finally, we found a significant enrichment of functional residues (e.g. metal-binding residues and residues susceptible to detrimental mutations) in the clusters, which could be the foundation of new prediction tools. In summary, in this work we analyze different MI methods and describe their application to several biological questions. We study the MI networks in the proximity of the catalytic site and develop a method to predict catalytic residues. Finally, we map the MI networks onto the 3D structure, analyze their topological properties, and find that this information might help improve functional residue prediction methods by reducing the search space.
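The idea of a signature built from nearby residues with high MI can be made concrete with a small sketch. The scoring below is a hypothetical illustration, not the scoring defined in the chapter: each residue first receives a cumulative MI (the sum of its pairwise MI values above a threshold), and then a proximity score (the mean cumulative MI of the residues within a distance cutoff). The function names, thresholds, coordinates and MI values are all assumptions chosen for the example.

```python
from math import dist


def cumulative_mi(mi, threshold):
    """Sum, per residue, of the pairwise MI values above the threshold."""
    n = len(mi)
    return [
        sum(mi[i][j] for j in range(n) if j != i and mi[i][j] > threshold)
        for i in range(n)
    ]


def proximity_mi(mi, coords, mi_threshold, distance_cutoff):
    """Mean cumulative MI of the spatial neighbours of each residue."""
    cmi = cumulative_mi(mi, mi_threshold)
    scores = []
    for i, xyz_i in enumerate(coords):
        neighbours = [
            j for j, xyz_j in enumerate(coords)
            if j != i and dist(xyz_i, xyz_j) <= distance_cutoff
        ]
        if neighbours:
            scores.append(sum(cmi[j] for j in neighbours) / len(neighbours))
        else:
            scores.append(0.0)  # isolated residue: no neighbourhood signal
    return scores


# Hypothetical toy data: residues 0, 1 and 2 are packed together and
# residue 3 is far away. Residues 1 and 2 coevolve strongly; residue 0
# has low MI with every other residue, as a conserved position would.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 3.0, 0.0), (30.0, 0.0, 0.0)]
mi = [
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 8.0, 0.0],
    [1.0, 8.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

scores = proximity_mi(mi, coords, mi_threshold=5.0, distance_cutoff=5.0)
```

Here residue 0 gets the highest proximity score even though its own MI is low, because it sits inside a neighbourhood of strongly coevolving residues. That is the kind of pattern the hypothesis attributes to conserved catalytic residues; real use would need a corrected MI measure and thresholds chosen from data.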

Author Details

Elin Teppa
Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, Argentina
Diego Javier Zea
Structural Bioinformatics Unit, Universidad Nacional de Quilmes, Buenos Aires, Argentina
Cristina Marino Buslje
Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, Argentina


Elin Teppa, Diego Javier Zea and Cristina Marino Buslje. Identification of Coevolving Amino Acids Using Mutual Information. In Genomics III: Methods, Techniques and Applications. ISBN:978-1-922227-416. iConcept Press. 0000.