Molecular Evolution of protein COMplexes
- First, for each subunit of a given protein complex, structural data are processed so that each residue is labelled as “Exposed” or “Buried” according to its solvent accessibility. Likewise, each residue is labelled as “Contact” or “NonContact” according to its physical proximity to residues from a different chain. Information about the contacted residue (and its chain) is retrieved and can be used to further characterize the contacting residue. For instance, in the cytochrome c oxidase complex, which is formed by nDNA- and mtDNA-encoded subunits, one can distinguish the residues in contact with mtDNA-encoded amino acids from those in contact with nDNA-encoded residues.
- Second, once each residue has been labelled, the program sorts the codons of a nucleotide multiple sequence alignment provided by the user and returns one multiple sequence alignment file for each subset (for instance, “Exposed NonContact”, “Buried NonContact”, etc.).
- Third, using these alignment subsets, MECOM calls the program ‘yn00’ from the PAML package to calculate, among other statistics, the numbers of synonymous (dS) and nonsynonymous (dN) substitutions per site for all the pairwise comparisons within a subset. The sum of the nonsynonymous divergences over all the pairwise comparisons is denoted ΣdN[i], where i indicates the subset considered.
- Finally, MECOM calculates the so-called interaction ratio, ΣdN[i]/ΣdN[j], and runs a Z-test to assess the significance of any departure from 1.
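The codon sorting and the interaction ratio described in the steps above can be illustrated with a minimal Python sketch. This is not MECOM's own code: the function names `sort_codons` and `interaction_ratio`, the dictionary-based alignment, and the toy sequences are assumptions made for illustration. The sketch also assumes that the alignment is in frame and that there is exactly one label per codon. The pairwise dN values are taken as given (in MECOM they come from ‘yn00’), and the Z-test is left out because its variance estimate is not described here.

```python
from collections import defaultdict


def sort_codons(alignment, labels):
    """Split a codon alignment into one sub-alignment per residue label.

    alignment: dict mapping sequence name -> in-frame nucleotide string
    labels:    one label per codon, e.g. "Exposed NonContact"
    (hypothetical data layout, chosen only for this illustration)
    """
    n_codons = len(labels)
    for name, seq in alignment.items():
        if len(seq) != 3 * n_codons:
            raise ValueError(
                f"{name}: expected {3 * n_codons} nucleotides, got {len(seq)}"
            )
    # Each label gets its own alignment, with every sequence starting empty.
    subsets = defaultdict(lambda: {name: "" for name in alignment})
    for i, label in enumerate(labels):
        for name, seq in alignment.items():
            # Append the i-th codon of this sequence to the label's subset.
            subsets[label][name] += seq[3 * i:3 * i + 3]
    return dict(subsets)


def interaction_ratio(dn_i, dn_j):
    """Return sum(dn_i) / sum(dn_j) for the pairwise dN values of two subsets."""
    total_j = sum(dn_j)
    if total_j == 0:
        raise ZeroDivisionError("subset j has no nonsynonymous divergence")
    return sum(dn_i) / total_j


# Toy example: two sequences, four codons, two subsets.
alignment = {"sp1": "ATGGCTAAATTT", "sp2": "ATGGCCAAGTTC"}
labels = [
    "Buried NonContact",
    "Exposed Contact",
    "Exposed Contact",
    "Buried NonContact",
]
subsets = sort_codons(alignment, labels)
for label, sub_alignment in subsets.items():
    print(label, sub_alignment)
```

A ratio above 1 from `interaction_ratio` means that subset i has accumulated more nonsynonymous divergence than subset j; whether that departure is significant is what the Z-test addresses.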
As open-source software, this program can be improved through collaborative development. Other structural, evolutionary and statistical analyses could easily be implemented to gain deeper insights into the molecular evolution of protein complexes.