MECOM is a command line program. It can be launched just by typing
mecom in the command line. However, several parameters are required in order to carry out a proper analysis.
Once MECOM is correctly installed, this manual can be obtained by typing:
$ man mecom
You can also obtain a summarized help by typing:
$ mecom --help
$ mecom [--pdb <pdbfile> --contactfile <strfile>] --chain <chainid> --alignment <msafile>
[ --proximityth <float> --exposureth <float> --exposuretherror <float> --informat <msaformat>
--oformat <msaformat> --gc <int> --ocontact <filepath> --report <htmpath> --struct]
|Command line option||Description|
|--help||Display a summarized help document|
|--pdb ||A valid PDB file path|
|--contactfile ||A valid *.str file path. See Contact File section for further explanation.|
|--alignment (*)||A valid multiple sequence alignment (MSA) file path|
|--chain (*)||Chain ID as annotated in the PDB file|
|--ocontact||A valid file path where output structural analysis will be written (default 'data.str'). See Contact File section for further explanation.|
|--exposureth||Exposure threshold. The value used to distinguish between exposed and buried residues (default 0.05)|
|--exposuretherror||Exposure threshold margin (default 0)|
|--proximityth||The value in Angstroms for the maximum distance between two atoms to be considered as contact pair (default 4)|
|--informat||File format for multiple alignment provided by the user (default 'fasta'). See MSA Valid format section for a complete list of readable MSA formats|
|--oformat||File format for multiple alignment retrieved by MECOM (default 'clustalw'). See MSA Valid format section for a complete list of readable MSA formats|
|--gc||Genetic code for sequences within the input alignment (default 0). See Genetic codes section for a complete list of genetic codes available|
|--report||File path where the program will write an HTML report with the results (default ./report.html)|
|--struct||Carry out just the structural analysis|
(*) Required arguments
Only one of --pdb and --contactfile is required
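The rules in the table (two required options, plus at least one of --pdb and --contactfile) can be checked before MECOM is launched. The following Python sketch is one way to do that. `build_mecom_command` is a hypothetical helper written for illustration, not part of MECOM, and it only assembles option names that appear in the table above.

```python
def build_mecom_command(chain, alignment, pdb=None, contactfile=None, **options):
    """Return an argument list for mecom after checking the required options.

    Hypothetical helper: it only mirrors the rules stated in the option table.
    """
    if not chain or not alignment:
        raise ValueError("--chain and --alignment are required")
    if pdb is None and contactfile is None:
        raise ValueError("provide --pdb or --contactfile")

    argv = ["mecom"]
    if pdb is not None:
        argv += ["--pdb", pdb]
    if contactfile is not None:
        argv += ["--contactfile", contactfile]
    argv += ["--chain", chain, "--alignment", alignment]

    # Remaining keyword arguments map onto the optional flags of the table,
    # e.g. gc=1 becomes "--gc 1"; a value of True gives a bare flag (--struct).
    for name, value in options.items():
        if value is True:
            argv.append(f"--{name}")
        else:
            argv += [f"--{name}", str(value)]
    return argv


if __name__ == "__main__":
    print(" ".join(build_mecom_command("M", "ChainM_alignment.fas", pdb="2OCC.pdb")))
```

The returned list can be handed to `subprocess.run`, which avoids shell quoting problems when a value contains spaces.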
Before running MECOM, input data (PDB or MSA files) must meet certain criteria:
The so-called "Contact file", which usually uses the extension .str, is a plain text file that contains a table with the results of the structural analysis carried out during the first step of the program. It contains the information regarding which subunits and residues are involved in intermolecular contacts, as well as information about exposure and residue type.
Raw Table for subunit M
|ChainID||ChainID2||Res num.||AA||AA2||Contact (th=4)||Exposition (th=0.05)|
--proximityth) to other subunit, in any other case
--exposureth) and 0 if the residue is buried
This information is used as a conventional database. MECOM extracts fractions from this table in order to build multiple sequence alignments.
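The Contact column depends on the distance cutoff given with --proximityth. As a rough illustration of such a cutoff test (a sketch only, not MECOM's actual implementation), the Python function below marks two residues as being in contact when any pair of their atoms lies within the threshold. The representation of atoms as (x, y, z) tuples and the inclusive comparison at exactly the threshold are assumptions.

```python
import math


def residues_in_contact(atoms_a, atoms_b, proximity_th=4.0):
    """Return True if any atom of one residue is within proximity_th
    Angstroms of any atom of the other residue.

    atoms_a, atoms_b: iterables of (x, y, z) coordinates (assumed layout).
    """
    return any(
        math.dist(a, b) <= proximity_th
        for a in atoms_a
        for b in atoms_b
    )
```

Raising the threshold marks more residue pairs as contacts, and lowering it marks fewer.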
MECOM writes this file to the path given through the argument
--ocontact, or to data.str by default.
For subsequent analyses, the user may wish to use this contact file as an input file instead of the PDB file. In this way, the most computationally expensive step is bypassed.
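A wrapper script can make this reuse automatic. The sketch below, in Python, picks the structural input options depending on whether a contact file already exists. `structure_source_args` is a hypothetical helper name, and the decision to omit --pdb when a contact file is present is an assumption based on the statement that the contact file can replace the PDB file.

```python
import os


def structure_source_args(pdb_path, contact_path="data.str"):
    """Choose the structural input options for a mecom run.

    If the contact file already exists, reuse it; otherwise read the PDB
    file and ask mecom to write the contact file for later runs.
    """
    if os.path.isfile(contact_path):
        return ["--contactfile", contact_path]
    return ["--pdb", pdb_path, "--ocontact", contact_path]
```

On a first run this yields the --pdb and --ocontact options. Once the contact file has been written, later runs yield --contactfile instead.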
There are 11 different genetic codes, corresponding to transl_table of GenBank. As the value of the argument
--gc, the user must provide one of the following integers to specify the genetic code used to translate DNA alignments. The default value is 0 (Standard).
|8||Alternative yeast nuclear|
If the selected genetic code does not correspond to the origin of the user-provided MSA, stop codons may be introduced during translation. If that occurs, the program will not work correctly and will exit with an unexpected error.
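Because a mismatched genetic code shows up as in-frame stop codons, an alignment can be screened before running MECOM. The Python sketch below is not part of MECOM. It covers only codes 0 (Standard) and 1 (Vertebrate mitochondrial), using the well-known stop codons of those two codes, and it assumes that each sequence starts in reading frame.

```python
# Stop codons for the two genetic codes used in the examples of this manual.
STOP_CODONS = {
    0: {"TAA", "TAG", "TGA"},         # Standard
    1: {"TAA", "TAG", "AGA", "AGG"},  # Vertebrate mitochondrial
}


def find_stop_codons(sequence, gc=0):
    """Return the 0-based codon indices that are stop codons under code gc.

    Codons containing gap characters never match a stop codon, so gapped
    alignment rows can be passed in directly.
    """
    seq = sequence.upper().replace("U", "T")
    stops = STOP_CODONS[gc]
    return [
        i // 3
        for i in range(0, len(seq) - 2, 3)
        if seq[i:i + 3] in stops
    ]
```

For example, TGA is a stop codon under code 0 but not under code 1, so a mitochondrial sequence containing TGA is flagged only when it is checked against the wrong code.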
The value given through the argument
--oformat must be a valid MSA format (see above). The valid MSA formats are listed below:
|bl2seq||Bl2seq Blast output|
|clustalw||clustalw (.aln) format|
|emboss||EMBOSS water and needle format|
|maf||Multiple Alignment Format|
|mase||mase (seaview) format|
|msf||msf (GCG) format|
|nexus||Swofford et al NEXUS format|
|pfam||Pfam sequence alignment format|
|phylip||Felsenstein PHYLIP format|
|prodom||prodom (protein domain) format|
|selex||selex (hmmer) format|
Specifically, mase, stockholm and prodom have only been implemented for input.
If no format is specified and a filename is given, then the module will attempt to deduce the format from the filename suffix. If this is unsuccessful, a fasta format is assumed.
The format name is case insensitive; FASTA, Fasta and fasta are all treated equivalently.
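The format rules above (case-insensitive names, suffix-based guessing with a fasta fallback, and input-only formats) can be summarized in a short Python sketch. The suffix table is illustrative only and is an assumption: the real suffixes recognized by MECOM may differ. Both function names are hypothetical.

```python
import os

# Illustrative suffix map only; the table actually used by MECOM may differ.
SUFFIX_TO_FORMAT = {
    ".fasta": "fasta",
    ".fas": "fasta",
    ".fa": "fasta",
    ".aln": "clustalw",
    ".msf": "msf",
    ".nex": "nexus",
    ".phy": "phylip",
}

# Formats stated above as implemented for input only.
INPUT_ONLY = {"mase", "stockholm", "prodom"}


def resolve_format(filename, fmt=None):
    """Return the lower-case format name for an alignment file.

    An explicit format wins; otherwise guess from the suffix and fall
    back to fasta when the suffix is not recognized.
    """
    if fmt:
        return fmt.lower()
    suffix = os.path.splitext(filename)[1].lower()
    return SUFFIX_TO_FORMAT.get(suffix, "fasta")


def check_output_format(fmt):
    """Reject formats that can only be read, not written."""
    name = fmt.lower()
    if name in INPUT_ONLY:
        raise ValueError(f"{name} is available for input only")
    return name
```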
$ mecom --pdb 2OCC.pdb --chain M --alignment ChainM_alignment.fas
In this example, MECOM carries out the analysis for subunit M (COX 8B, also called subunit 8B) of the cytochrome c oxidase complex (also referred to as complex IV).
The option --gc, whose default value is 0 (Standard), must be set to 1 (Vertebrate mitochondrial) in the case of the example alignments. Thus, to analyse a mtDNA-encoded subunit, the command line instruction must be as follows:
$ mecom --pdb 2OCC.pdb --chain A --alignment ChainA_alignment.fas --gc 1
--ocontact (default data.str). Thus, for subsequent analyses, this file can be provided by the user as an input file through the option
--contactfile. In this case, the option
--pdb becomes optional, and the analysis will be faster:
$ mecom --pdb 2OCC.pdb --contactfile data_for_chain_A.str --chain A --alignment ChainA_alignment.fas --gc 1
In this way, the option
--pdb becomes optional.
--contactwith. This option restricts the analysis to interactions with residues belonging to the specified chains. For example, if the user wants to ignore contacts between chain A and the other mtDNA-encoded subunits (B and C), and restrict the analysis to contacts with residues from other chains, then the user can specify the chains to be included in the analysis. For instance, to analyse the contacts of chain A with nDNA-encoded chains, we must type:
$ mecom --pdb 2OCC.pdb --contactfile data_for_chain_A.str --chain A --alignment ChainA_alignment.fas --gc 1 --contactwith "D E F G H I J K L M Q R S T U V W X Y Z"
This ignores chains A, B, C, N, O and P, which are encoded by the mitochondrial genome.
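Typing a long chain list by hand is error-prone, so the value for --contactwith can be derived from the chains to exclude. The Python sketch below assumes that the structure uses chain identifiers A to Z, as the command above implies. `contactwith_value` is a hypothetical helper name.

```python
import string


def contactwith_value(excluded, all_chains=string.ascii_uppercase):
    """Build the space-separated chain list for --contactwith.

    excluded: chain identifiers to leave out (e.g. mtDNA-encoded chains).
    all_chains: every chain identifier in the structure (assumed A to Z).
    """
    skip = set(excluded)
    return " ".join(chain for chain in all_chains if chain not in skip)


if __name__ == "__main__":
    # Leave out the mtDNA-encoded chains of the example.
    print(contactwith_value("ABCNOP"))
```

For the exclusion list A, B, C, N, O and P, this reproduces the chain list used in the command above.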
MECOM uses a BioPerl method to concatenate alignments. Thus, the program can carry out an evolutionary analysis of several subunits simultaneously.
$ mecom --pdb 2OCC.pdb --contactfile data_for_chains_ABC.str --chain "A B C" --alignment "ChainA_alignment.fas ChainB_alignment.fas ChainC_alignment.fas" --gc 1 --contactwith "D E F G H I J K L M Q R S T U V W X Y Z" --report reportABC_nu.html
To avoid problems during execution, it is important to note the following: i) the order of the values given to the options
--alignment, given between quotes, must not be altered; ii) the same number of chain identifiers and MSA files must be provided. In this example, the results will be written to the file given through --report.
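Both conditions can be enforced by building the two quoted values from a single pair of lists. The following Python sketch does this under the assumption that the first chain identifier belongs to the first alignment file, and so on. `multi_chain_args` is a hypothetical helper, not part of MECOM.

```python
def multi_chain_args(chains, alignments):
    """Return the --chain and --alignment options for a multi-subunit run.

    chains and alignments are parallel lists: chains[i] is analysed with
    alignments[i], so their order is preserved and their lengths must match.
    """
    if len(chains) != len(alignments):
        raise ValueError(
            f"{len(chains)} chain identifiers but {len(alignments)} MSA files"
        )
    # Each joined string is a single argument, which is what the quotes
    # achieve on the command line.
    return ["--chain", " ".join(chains), "--alignment", " ".join(alignments)]
```

When the returned list is passed to `subprocess.run` as part of an argument list, no extra quoting is needed, because each space-separated value is already one argument.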
This program provides multiple output files in order to give a detailed view of each step. Four different classes of files are created after a run:
Any feedback from users is very welcome. A contact form is available for this purpose. Every question and suggestion will be taken into account.