Martini 3.0 tutorials: Small molecule binding
In this tutorial, we will set up a coarse-grained (CG) simulation in the new Martini 3.0 force field of a protein – ligand system. The ligand will be initially placed in the solvent phase of the system, and throughout the simulation find the protein binding pocket. As an example, we will simulate the binding of benzene to the L99A mutant of T4 lysozyme. Setting up a CG simulation consists of two main steps, followed by the analysis:
Note1: This tutorial, now based on the Martini 3 open-beta version (v.3.0.b.3.2), will be updated as soon as the final parameters of Martini 3 are available (the paper is under review right now).
Note2: Martinize2 can also be used to martinize your protein. However this new version is still under development. You can take martinize2 from https://github.com/marrink-lab/vermouth-martinize. Any issue should be reported in the github page.
Note3: For protein complexes (e.g., dimers, tetramers, etc.) you should consider using the "–merge" flag in martinize. In this situation, an elastic network can be added between the proteins. These two approaches are important to better mimic hydrogen bonds and to perfect packing, which is essential for a stable protein complex (for the same reasons for which it is needed for a folded protein).
Alternatively, you can visit the PDB website (https://www.rcsb.org/), search for the structure, and click on Download Files.
1B. Generate Martini topology and coordinates
The atomistic structure of T4 lysozyme (PDB: 181L) can be used directly as input for
martinize.py to produce both the coordinates (
CG.pdb) and topology (
CG.top) of the CG model. Have a look at the help function to list the available options by running
martinize -h. Please be carefull and use the version or martinize distributed with the Martini 3 open-beta version. The martini303v.partition file should be in the same folder where you are executing the command.
./martinize -f 181L.pdb -ff martini303v.partition -x CG.pdb -o topol.top -dssp dssp -elastic
Here we are using the version 3.0 of the Martini force field (
-ff). The secondary structure of Martini proteins should be explicitly defined. This can be achieved by adding a set of long harmonic bonds (
-elastic). The magnitude of the elastic force constant can be specified with
-ef. A force constant of 500–750 kJ/(mol nm2) is suitable in most of the cases. An easy way of determining the secondary structure of each residue in the protein is by using the DSSP program. When using the
-dssp option, a path to the binary file must be provided. You can download the binary file from the CMBI website (https://swift.cmbi.umcn.nl/gv/dssp/). Note that the downloaded PDB file contains not only protein residues but also three additional molecules: water (
HOH), benzene (
BNZ), and 2-hydroxyethyl disulfide (
HED). You can delete them prior to generating the CG structure, or leave them in the file.
HETATM atom types are ignored by
1C. Add dihedral corrections to Martini topology
If all went well, the previous step should have generated three files:
chain_A.ssd. Take a look into the
*.ssd file, the secondary structure each residue belongs to is specified under
*.itp file contains almost all bonded parameters required to define the behavior of the protein. We just need to add some dihedral corrections to describe
SC-BB-BB-SC interactions. You should make a backup copy of
./bbsc.sh Protein_A.itp 181L.pdb
This step will have appended additional angle (
SC-BB-BB-SC) and dihedral (
BB-BB-SC) parameters to the
.itp file. Note that
bbsc.sh is only there to call the actual VMD script (
addDihedral.tcl) handing it over the
CG.pdb as arguments.
Since we now have both the coordinates and topology of the Martini 3 protein, we can put the protein in a simulation cell, and add water and ions.
/usr/bin/python2.7 insane.py -f CG.pdb -o CG.gro -pbc cubic -box 10,10,10 -salt 0.15 -charge auto -sol W
You can look for more details about the insane.py script here. The output file (
W residues. Check how many and add specify it in the
[ molecules ] ; name number Protein_A 1 WN 8846 TNA 93 TCL 101
Don’t forget to also include the Martini 3 force field files at the beginning of the
#include "martini_v3.0.3.itp" #include "martini_v3.0_ions.itp" #include "martini_v3.0_solvents.itp”
*.itp file contains information about non-bonded interactions between all types of Martini 3 beads. The second and third files are more specific and define standard Martini 3 ions and solvents.
2B. Add the ligand to the simulation box
The crystal structure downloaded from the PDB website contains a benzene molecule crystalized in the binding site. We want to see if a CG model of benzene is capable of correctly finding the binding site of T4 lysozyme and bind to it in a pose that resembles the crystallographic binding mode. The Martini 3 topology of benzene is already provided in
martini_v3.0_solvents.itp. Martini 3 benzene consists of three equidistant beads type TC4, bonded together by constraints of length 0.291 nm. Make a copy of
CG.gro, rename it
CG_benzene.gro and open it in a text editor. At the end of the file (just before the cell length), add the coordinates of the ligand from the file
benzene.gro. Don’t forget to update the total number of atoms in the first line of the
CG_benzene.gro file. You should try to place the benzene molecule in the solvent part of the simulation box. Don’t worry too much about the exact ligand coordinates; they will be adjusted in the minimization phase. Just try to place the ligand in a reasonable spot. The final
CG.top file should look like this:
#include "martini_v3.0.3.itp" #include "martini_v3.0_ions.itp" #include "martini_v3.0_solvents.itp” [ molecules ] ; name number Protein_A 1 WN 8846 TNA 93 TCL 101 BENZ 1
2C. Start the simulation
gmx grompp -p CG.top -f min.mdp -c CG_benzene.gro -maxwarn 10 gmx mdrun -v -c CG-em.gro gmx grompp -p CG.top -f eq.mdp -c CG-em.gro -maxwarn 5 -n index.ndx
gmx mdrun -v -c CG-eq.gro -x traj_comp-eq.xtc gmx grompp -p CG.top -f md.mdp -c CG-eq.gro -maxwarn 5 -n index.ndx
gmx mdrun -v -c CG-md.gro -x traj_comp.xtc
More information about Martini simulations can be found here.
Once the simulation has finished, a trajectory (
traj_comp.xtc) should have been generated. An RMS alignment of the protein will take care of the rotational and translational motion.
gmx trjconv -pbc mol -center -ur compact -s topol.tpr -f traj_comp.xtc -o temp.xtc gmx trjconv -fit rot+trans -s topol.tpr -f temp.xtc -o trajfitted.xtc -n index.ndx
To do the alignment, an index file containing the list of backbone atoms (
BB) must be provided. It can be generated with:
gmx make_ndx -f CG-eq.gro -n index.ndx << EOF a BB q EOF
You can visualize the trajectory in VMD. More information on visualizing CG Martini simulations is provided here.
vmd CG-eq.gro trajfitted.xtc
Do you see any binding events taking place? Additionally, the occupancy of the benzene molecule across the trajectory can be calculated by issuing the following commands in VMD:
>> set sel0 [atomselect 0 "resname BNZ"] >> volmap occupancy atomselect0 -res 1.0 -allframes -combine avg -o volmap_out.dx
Load the generated file (
volmap_out.dx) in VMD, and select the isosurface representation. The “isovalue” control allows you to adjust the occupancy threshold. What regions of the protein have the highest occupancy? Is the binding site / binding tunnel among them?
For this particular system we obtained around 2-9 binding events per 30 μs of simulation time. If you run the simulation long enough that several binding events are observed, the kon of benzene – T4 lysozyme can be calculated as follows: