martinize

Martinize is a Python script that generates Martini protein topology and structure files from an atomistic structure file. It replaces the old seq2itp, atom2cg and ElNeDyn scripts. The resulting topology and structure files are in a format suitable for Gromacs.

The current version (2.x) has been used extensively, but it may still contain bugs. Any feedback is more than welcome! The script is "concatenated": all the classes, modules and forcefields are in one file. If you want to make changes yourself or add a forcefield, a modular version is available; send us an e-mail if you would like to use it.

You can now also download the latest martinize from GitHub. Major updates are also listed below:

martinize.py and Python 3 version martinize.py (version 2.6, May 12 2016)
- The option for the elastic bond lower cutoff (-el) is now correctly recognized.
- Cys bonds in gro-files and pdb-files without chain identifier are now correctly identified.
- Many, many code clean-ups and restructuring.

martinize.py (version 2.5, August 11 2015)
-Removed warnings about beta status of Martini 2.2.
-Bug fix: Fixed cases where Cys-Cys constraints were not recognized as such.

martinize.py (version 2.4, August 18 2013)
-Inverted "define NO_RUBBER_BANDS" behavior.
-Changed protein backbone constraints to bonds
-Changed HIS BB-SC constraint to bonds
-Bug fix: Cys-bond length and force constant
-Bug fix: Position restraints are correctly written out when multiple chains are merged.

martinize.py (version 2.3, February 13 2013)
-Bug fix: Correctly call dssp.
-Bug fix: Correct error message when atoms are missing.
-Bug fix: Correctly merge topologies of multiple chains in case of Martini 2.2P.

martinize.py (version 2.2, November 27 2012)
-Added charged His to all forcefields and options to choose the His-charge state.
-Bug fix: correctly handle .gro files
-Bug fix: Correctly handle .pdb files containing hydrogens.
-Bug fix: bead types correctly set in helix starting at first residue.
-Fixed small inconsistencies in elnedyn forcefields.
-Cleaned up and added help text and warning messages.

martinize.py (version 2.1, August 23 2012)
-Bug fix: bond length in Martini 2.2 & 2.2p
-Bug fix: assignment of secondary structure

martinize.py (version 2.0, July 25 2012)
-Major clean-up and restructuring of the code
-Changed forcefield selection. Forcefields now available: Martini 2.1, Martini 2.1P, Martini 2.2, Martini 2.2P, Elnedyn, Elnedyn 2.2 and Elnedyn 2.2P.
-Added function to handle new polar and charged residues in Martini 2.2P
-Several small bug fixes.

martinize-1.2.py (version 1.2, May 22nd 2012)
-Fixed bug with counter in multi chain topologies.
-Corrected wrong collagen parameters.
-Fixed bug involving BBBB dihedrals in extended regions.
-Fixed bug when giving secondary structure as string.
-A test set is now available.

martinize-1.1.py (version 1.1)
-Fixed bug in pdb read-in.
-Clean up of code.

martinize-1.0.py (version 1.0)

MARTINIZE.py is a script to create Coarse Grain Martini input files of
proteins, ready for use in the molecular dynamics simulations package
Gromacs. For more information on the Martini forcefield, see:
www.cgmartini.nl
and read our papers:
Monticelli et al., J. Chem. Theory Comput., 2008, 4(5), 819-834
de Jong et al., J. Chem. Theory Comput., 2013, DOI:10.1021/ct300646g

Primary input/output
--------------------
The input file (-f) should be a coordinate file in PDB or GRO (GROMOS87)
format. The format is inferred from the structure of the file. The
input can also be provided through stdin, allowing piping of
structures. The input structure can have multiple frames/models. If an output
structure file (-x) is given, each frame will be coarse grained,
resulting in a multimodel output structure. Having multiple frames may
also affect the topology. If secondary structure is determined
internally, the structure will be averaged over the frames. Likewise,
interatomic distances, as used for backbone bond lengths in Elnedyn
and in elastic networks, are also averaged over the frames available.
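
As an illustration of the averaging described above, the sketch below
computes the mean distance between two atoms over several frames. It is
not taken from martinize: the function name and the data layout (one
list of (x, y, z) tuples in nm per frame) are assumptions made for this
example.

```python
import math

def average_distance(frames, i, j):
    """Mean distance between atoms i and j over all frames.

    frames: list of frames, each a list of (x, y, z) tuples in nm
    (hypothetical layout, chosen for this example).
    """
    if not frames:
        raise ValueError("at least one frame is required")
    # Average the per-frame distances rather than the coordinates.
    return sum(math.dist(frame[i], frame[j]) for frame in frames) / len(frames)

frames = [
    [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.5, 0.0)],
]
print(round(average_distance(frames, 0, 1), 3))  # 0.4
```

Note that averaging distances is not the same as measuring the distance
between averaged coordinates; the text above describes the former.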

If an output file (-o) is indicated for the topology, that file will
be used for the master topology, using #include statements to link the
moleculetype definitions, which are written to separate files. If no
output filename is given, the topology and the moleculetype
definitions are written to stdout.
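
A call combining the three primary options might be assembled as in the
sketch below. The interpreter name, the script path and all file names
are placeholders chosen for this example; only the flags -f, -o and -x
come from this help text.

```python
import shlex

def martinize_command(structure, topology, cg_structure,
                      script="martinize.py", interpreter="python"):
    """Build an argument list from the primary input/output options.

    The interpreter and script path are assumptions; adjust them to
    the local installation.
    """
    return [
        interpreter, script,
        "-f", structure,     # atomistic input structure
        "-o", topology,      # master topology
        "-x", cg_structure,  # coarse-grained output structure
    ]

cmd = martinize_command("protein.pdb", "system.top", "protein_cg.pdb")
print(shlex.join(cmd))
# python martinize.py -f protein.pdb -o system.top -x protein_cg.pdb
```

The list can be handed to subprocess.run(cmd, check=True) as it stands.
Further options from the list at the end of this text can be appended
in the same way.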

Secondary structure
-------------------
The secondary structure plays a central role in the assignment of atom
types and bonded interactions in MARTINI. Martinize allows
specification of the secondary structure as a string (-ss), or as a
file containing a specification in GROMACS' ssdump format
(-ss). Alternatively, DSSP can be used for an on-the-fly assignment of
the secondary structure. For this, the option -dssp has to be used
giving the location of the executable as the argument.
The option -collagen will set the whole structure to collagen. If this
is not what you want (e.g. only part of the structure is collagen), you
can give a secondary structure file/string (-ss) and specify collagen
as "F". Parameters for collagen are taken from: Gautieri et al.,
J. Chem. Theory Comput., 2010, 6, 1210-1218.
With multimodel input files, the secondary structure as determined with
DSSP will be averaged over the frames. In this case, a cutoff
can be specified (-ssc) indicating the fraction of frames to match a
certain secondary structure type for designation.
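
The role of the cutoff can be pictured with the sketch below, which
builds a per-residue consensus from one secondary structure string per
frame. This is one possible reading of the description, not the code
used by martinize: the function name, the use of the most common type
per residue, and the fallback to coil ("C") when no type reaches the
cutoff are all assumptions.

```python
from collections import Counter

def consensus_ss(frames, cutoff=0.5, fallback="C"):
    """Per-residue consensus over equal-length secondary structure strings.

    A type is assigned when the fraction of frames showing it reaches
    the cutoff; otherwise the fallback is used (an assumption).
    """
    if not frames:
        raise ValueError("at least one frame is required")
    if any(len(frame) != len(frames[0]) for frame in frames):
        raise ValueError("all frames must cover the same residues")
    result = []
    for column in zip(*frames):  # one column per residue
        ss_type, count = Counter(column).most_common(1)[0]
        result.append(ss_type if count / len(frames) >= cutoff else fallback)
    return "".join(result)

frames = ["HHHEC", "HHHEE", "HHCEE", "HHHCT"]
print(consensus_ss(frames))               # HHHEE
print(consensus_ss(frames, cutoff=0.75))  # HHHEC
```

In the last residue, "E" occurs in half of the frames: it is accepted
at the default cutoff of 0.5 and rejected at 0.75.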

Topology
--------
Several options are available to tune the resulting topology. By
default, termini are charged, and chain breaks are kept neutral. This
behaviour can be changed using -nt and -cb, respectively.

Disulphide bridges can be specified using -cys. This option can be
given multiple times on the command line. The argument is a pair of
cysteine residues, using the format
chain/resn/resi,chain/resn/resi. For disulphide bridges, the residue
name is not required, and the chain identifier is optional. If no
chain identifier is given, all matching residue pairs will be checked,
and pairs within the cutoff distance (0.22 nm) will be linked. It is
also possible to let martinize detect cysteine pairs based on this
cut-off distance, by giving the keyword 'auto' as argument to -cys.
Alternatively, a different cut-off distance can be specified, which
will also trigger a search of pairs satisfying the distance
criterion.
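
The distance criterion can be sketched as a search over all pairs of
cysteines. The example assumes that the distance is measured between
one sulphur position per residue and that coordinates are given in nm;
the function name and the input layout are hypothetical, and the
selection in martinize itself may differ.

```python
import itertools
import math

def find_cys_pairs(sulphurs, cutoff=0.22):
    """Return pairs of cysteines whose sulphurs lie within the cutoff.

    sulphurs: list of (chain, resi, (x, y, z)) with coordinates in nm
    (hypothetical layout, chosen for this example).
    """
    pairs = []
    for (c1, r1, p1), (c2, r2, p2) in itertools.combinations(sulphurs, 2):
        if math.dist(p1, p2) <= cutoff:
            pairs.append(((c1, r1), (c2, r2)))
    return pairs

sulphurs = [
    ("A", 6, (0.000, 0.0, 0.0)),
    ("A", 127, (0.205, 0.0, 0.0)),
    ("B", 30, (1.000, 1.0, 1.0)),
]
print(find_cys_pairs(sulphurs))  # [(('A', 6), ('A', 127))]
```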

In addition to cystine bridges, links between other atoms can be
specified using -link. This requires specification of the atoms, using
the format
chain/resi/resn/atom,chain/resi/resn/atom,bondlength,forceconstant.
If only two atoms are given, a constraint will be added with length
equal to the (average) distance in the coordinate file. If a bond
length is added, but no force constant, then the bondlength will be
used to set a constraint.
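
The rules above can be summarised in a small parser for the -link
argument. This is a sketch for illustration only: the function name and
the returned dictionary are hypothetical, the bead names in the example
are placeholders, and treating a link with both a length and a force
constant as a bond is an inference from the text rather than a
documented rule.

```python
def parse_link(spec):
    """Split a -link argument into its parts (hypothetical helper)."""
    fields = spec.split(",")
    if not 2 <= len(fields) <= 4:
        raise ValueError("expected 2 to 4 comma-separated fields: %r" % spec)
    atoms = [tuple(field.split("/")) for field in fields[:2]]
    length = float(fields[2]) if len(fields) > 2 else None
    force = float(fields[3]) if len(fields) > 3 else None
    # Without a force constant the link is a constraint; a missing
    # length means the distance is taken from the coordinate file.
    kind = "constraint" if force is None else "bond"
    return {"atoms": atoms, "length": length,
            "force_constant": force, "kind": kind}

print(parse_link("A/12/LYS/SC2,B/40/GLU/SC1")["kind"])           # constraint
print(parse_link("A/12/LYS/SC2,B/40/GLU/SC1,0.5")["length"])     # 0.5
print(parse_link("A/12/LYS/SC2,B/40/GLU/SC1,0.5,1250")["kind"])  # bond
```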

Linking atoms requires that the atoms are part of the same
moleculetype. Therefore any link between chains will cause the chains
to be merged. Merges can also be specified explicitly, using the
option -merge with a comma-separated list of chain identifiers to be
joined into one moleculetype. The option -merge can be used several
times. Note that specifying a chain in several merge groups will cause
all chains involved to be merged into a single moleculetype.
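
The effect of overlapping merge groups can be sketched as follows. The
function is a hypothetical helper, not part of martinize; it only shows
how groups that share a chain collapse into one set.

```python
def merge_groups(groups):
    """Combine overlapping merge groups into disjoint lists of chains."""
    merged = []
    for group in groups:
        current = set(group)
        remaining = []
        for existing in merged:
            if existing & current:
                current |= existing         # shared chain: absorb the group
            else:
                remaining.append(existing)  # unrelated group: keep as is
        remaining.append(current)
        merged = remaining
    return [sorted(group) for group in merged]

# Equivalent to: -merge A,B -merge C,D -merge B,C -merge E,F
print(merge_groups([["A", "B"], ["C", "D"], ["B", "C"], ["E", "F"]]))
# [['A', 'B', 'C', 'D'], ['E', 'F']]
```

Chains A to D end up in a single moleculetype because B and C link the
first two groups.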

The moleculetype definitions are written to topology include (.itp)
files, using a name consisting of the molecule class (e.g. Protein)
and the chain identifier. With -name a name can be specified instead.
By default, martinize only writes a moleculetype for each unique
molecule, inferred from the sequence and the secondary structure
definition. It is possible to force writing a moleculetype definition
for every single molecule, using -sep.

The option -p can be used to write position restraints, using the
force constant specified with -pf, which is set to 1000 kJ/mol/nm^2
by default.

For stability, elastic bonds are used to retain the structure of
extended strands. The option -ed causes dihedrals to be used
instead.

Different forcefields can be specified with -ff. All the parameters and
options belonging to that forcefield will be set (e.g. bonded interactions,
BB-bead positions, elastic network, etc.). By default Martini 2.1 is
used.

Elastic network
---------------
Martinize can write an elastic network for atom pairs within a cutoff
distance. The force constant (-ef) and the upper distance bound (-eu)
can be specified. If a force field with an intrinsic elastic
network is specified (e.g. Elnedyn) with -ff, -elastic is implied and
the default values for the force constant and upper cutoff are used.
However, these can be overridden.
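
How the elastic bond options (-ef, -el, -eu, -ea, -ep, -em) might
combine into a force constant for one pair is sketched below. The
option list only states that F = Fc below the lower cutoff and F = 0
above the upper cutoff; the exponential decay in between,
Fc * exp(-a * (r - lo)^p), is an assumption made for this example, as
are the function name and the numerical defaults.

```python
import math

def elastic_force_constant(r, fc=500.0, lo=0.5, up=0.9,
                           a=0.0, p=1.0, minimum=0.0):
    """Force constant for a pair at distance r (nm); 0.0 means no bond.

    The decay form and the default values are assumptions; they are
    not taken from this help text.
    """
    if r > up:
        return 0.0  # beyond the upper cutoff (-eu)
    if r < lo:
        k = fc      # below the lower cutoff (-el)
    else:
        k = fc * math.exp(-a * (r - lo) ** p)  # assumed decay (-ea, -ep)
    return k if k >= minimum else 0.0          # drop weak bonds (-em)

print(elastic_force_constant(0.4))  # 500.0
print(elastic_force_constant(1.0))  # 0.0
```

With a = 0 the force constant stays at Fc up to the upper cutoff, so
the decay only matters when a decay factor is set.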

Multiscaling
------------
Martinize can process a structure to yield a multiscale system,
consisting of a coordinate file with atomistic parts and
corresponding, overlaid coarsegrained parts. For chains that are
multiscaled, rather than writing a full moleculetype definition,
additional [atoms] and [virtual_sitesn] sections are written, to
be appended to the atomistic moleculetype definitions.
The option -multi can be specified multiple times, and takes a chain
identifier as argument. Alternatively, the keyword 'all' can be given
as argument, causing all chains to be multiscaled.
========================================================================


-f  Input file (PDB|GRO)
-o  Output topology (TOP)
-x  Output coarse grained structure (PDB)
-n  Output index file with CG (and multiscale) beads.
-nmap  Output index file containing per bead mapping.
-v  Verbose. Be loud and noisy.
-h  Display this help.
-ss  Secondary structure (File or string)
-ssc  Cutoff fraction for ss in case of ambiguity (default: 0.5).
-dssp  DSSP executable for determining structure
-collagen  Use collagen parameters
-his  Interactively set the charge of each His-residue.
-nt  Set neutral termini (charged is default)
-cb  Set charges at chain breaks (neutral is default)
-cys  Disulphide bond (+)
-link  Link (+)
-merge  Merge chains: e.g. -merge A,B,C (+)
-name  Moleculetype name
-p  Output position restraints (None/All/Backbone) (default: None)
-pf  Position restraints force constant (default: 1000 kJ/mol/nm^2)
-ed  Use dihedrals for extended regions rather than elastic bonds
-sep  Write separate topologies for identical chains.
-ff  Which forcefield to use: martini21, martini21p, martini22, martini22p, elnedyn, elnedyn22, elnedyn22p
-elastic  Write elastic bonds
-ef  Elastic bond force constant Fc
-el  Elastic bond lower cutoff: F = Fc if rij < lo
-eu  Elastic bond upper cutoff: F = 0  if rij > up
-ea  Elastic bond decay factor a
-ep  Elastic bond decay power p
-em  Remove elastic bonds with force constant lower than this
-eb  Comma separated list of bead names for elastic bonds
-multi  Chain to be set up for multiscaling (+)