10.21 MD Operation

Written in front

Please remember that this is a very rough tutorial (honestly, I’m somewhat ashamed to call it a tutorial because it’s too rough). The only purpose of this "tutorial" is to help you navigate the entire MD procedure. I won’t go into much detail about the principles, and the command details are not shown step by step. My aim is for everyone following this tutorial to gain confidence in their ability to perform an MD simulation and to obtain results easily.
To keep the process from becoming too tiring, I’ll provide pre-processed files for some steps in this tutorial, but I will still explain how they were generated. I hope you enjoy the journey!  :)
The entire tutorial is hosted on server 172.21.85.9.

1. protein preparation

A common way to obtain a protein's 3D structure is by downloading the file directly from the PDB website. However, if you open the downloaded file in a text editor or view the sequence in PyMOL, you’ll often find some missing residues, side chains, or even parts of the main chain.
When performing molecular dynamics (MD) simulations, it’s important to ensure that the protein is complete, meaning every residue must be present, including both side chains and main chains. To address this, we can use software to fill in the missing segments, and one commonly used tool for this is pdb2pqr.
 
The pre-processed file is available here:
  • /home/chpeng/Sambunu/zyzhou-md-operation-example/pro/6lk0fix_H.pdb
When you revise, you can try generating a new one using the methods discussed above. For now, let's move on to the next step.

2. ligand preparation

2.1. Obtain the 3D structure of the ligand

If we can download the 3D structure of the ligand directly from a database like PubChem, we can use that. However, if it's not available, we can easily generate the 3D structure from the 2D structure using Chem3D (just open the file and save it).
The pre-processed file is available here:
  • /home/chpeng/Sambunu/zyzhou-md-operation-example/ligand/F368-0168-chem3d.sdf

2.2. Generate the ligand topology using Antechamber

First, I suggest that you carefully check your 3D structure (including the charged state and bonds) before proceeding to the next steps.
generate gjf file using antechamber
  • cd /home/chpeng/Sambunu/zyzhou-md-operation-example/sys-pre/ligprep
  • mkdir 0168
  • cd 0168
  • antechamber -i ../../../ligand/F368-0168-chem3d.sdf -fi sdf -o 0168.gjf -fo gcrt
  • sed -i '1,3d' 0168.gjf
  • sed -i '1i %mem=4GB\n%nproc=4\n#B3LYP/6-31G* Pop=MK iop(6/33=2) iop(6/42=6) opt' 0168.gjf
you will get sth like this 
Next, we will use Gaussian16 to optimize this structure.
It takes a lot of time, so here are the results you can expect.
  • /home/chpeng/Sambunu/zyzhou-md-operation-example/sys-pre/ligprep/0168-zyzhou/0168.log
Run Antechamber to prepare the molecular structure and calculate charges, then use Parmchk2 to generate the necessary force field parameters.
  • antechamber -i 0168.log -fi gout -o 0168.prep -fo prepi -c resp
  • parmchk2 -i 0168.prep -f prepi -o 0168.frcmod -a y
You will generate many files, but the most important ones for the next steps are as follows:
  • 0168.prep : includes information such as atomic coordinates, bonding information, and partial charges calculated (often using the RESP method).
  • 0168.frcmod : a force field parameter file generated by parmchk2. It includes additional parameters needed for the force field, such as bond lengths, angles, dihedrals, and non-bonded interactions for the small molecule.
  • NEWPDB.PDB : contains the coordinates of the molecule and can be used for visualization, further editing ...

3. system (pro-lig) preparation

For a molecular dynamics (MD) simulation, we need an initial structure (at time 0 ns). For conventional MD simulations, the best choice is the complex crystal structure. However, in many situations (like the one we are currently dealing with), we may lack the crystal complex structure, meaning we don’t know the actual binding site of the ligand. What can we do?
Here comes docking ! 
Actually, docking is straightforward, but guess what: the atom index in NEWPDB.PDB, which you obtain from "2.2. Generate the ligand topology using Antechamber," is always different from the docked results. This discrepancy requires us to check and manually modify these files, which can be tiring!
Maybe one day someone will find an easier way...
For now, we just need to remember one thing: we require both topology information and positional information for our ligand. Thus, we need the atom index in NEWPDB.PDB to ensure that the software can recognize the atoms in the topology files (0168.prep and 0168.frcmod) and the position information (x, y, z coordinates) in the docking results.
Here is the combined file:
  • /home/chpeng/Sambunu/zyzhou-md-operation-example/sys-pre/tleap/0168-tleap/lig-0168.pdb
You need an environment to load the force field ff19sb
  • source ~/zzy/zyzhou-ff19sb.sh
Use tleap (part of the AMBER suite) to combine the protein and ligand, generating their topology and coordinate files for the MD simulation.
One thing to note is that my habit is to use tleap to generate the AMBER topology file (complex.prmtop) and coordinate file (complex.inpcrd), and then use GROMACS to perform the MD simulation. This means I need to convert the AMBER files (complex.prmtop and complex.inpcrd) to GROMACS files (topol.top and gromacs.gro), which is the function of change.py.
After the conversion, we need to add some information to topol.top to perform position restraints during the pre-equilibration phase before the production MD simulation.
I have written a script for the entire process discussed above, and you can easily run it:
  • cd /home/chpeng/Sambunu/zyzhou-md-operation-example/sys-pre/tleap
  • sh tleap.sh
Inside tleap.sh, the command for tleap is 'tleap -f tleap.in'. You can find detailed information about what tleap does in tleap.in. Here are the two scripts:
  • /home/chpeng/Sambunu/zyzhou-md-operation-example/sys-pre/tleap/tleap.sh
  • /home/chpeng/Sambunu/zyzhou-md-operation-example/sys-pre/tleap/tleap.in
 I strongly recommend that you read them and understand their functions after you complete the entire process.
 
 After running  'tleap.sh', we have the initial files for the MD simulation:
  • /home/chpeng/Sambunu/zyzhou-md-operation-example/sys-pre/tleap/0168-tleap-zyzhou/gmx/topol.top
  • /home/chpeng/Sambunu/zyzhou-md-operation-example/sys-pre/tleap/0168-tleap-zyzhou/gmx/gromacs.gro

4. pre-equilibration

The initial file is not yet ready for a long MD simulation (such as 200 ns). We need to perform some simulations with specific parameters to stabilize it and prepare it for a long-term MD simulation.
First, we need to check which GPUs are available.
  • nvidia-smi
You can see that GPU2 is available, so inform the server to use this GPU.
  • export CUDA_VISIBLE_DEVICES=2
  • export OMP_NUM_THREADS=5
now, it's time to run! 
  • cd /home/chpeng/Sambunu/zyzhou-md-operation-example/pre-equilibration/0168
  • source ~/.gmx2022.5-remd.sh

4.1. Energy Minimization

  • mkdir em
  • cd em
  • gmx_mpi grompp -f ../mdp/minim.mdp -c ../../../sys-pre/tleap/0168-tleap-zyzhou/gmx/gromacs.gro -p ../../../sys-pre/tleap/0168-tleap-zyzhou/gmx/topol.top -o em.tpr
  • gmx_mpi mdrun -v -deffnm em
  • cd ../

4.2. NVT Equilibration

It will take a few minutes.
  • mkdir nvt
  • cd nvt
  • gmx_mpi grompp -f ../mdp/nvt.mdp -c ../em/em.gro -r ../em/em.gro -p ../../../sys-pre/tleap/0168-tleap-zyzhou/gmx/topol.top -o nvt.tpr
  • gmx_mpi mdrun -v -deffnm nvt
  • cd ../

4.2. NPT Equilibration

It will take a few minutes.
  • mkdir npt
  • cd npt
  • gmx_mpi grompp -f ../mdp/npt.mdp -c ../nvt/nvt.gro -r ../nvt/nvt.gro -t ../nvt/nvt.cpt -p ../../../sys-pre/tleap/0168-tleap-zyzhou/gmx/topol.top -o npt.tpr
  • gmx_mpi mdrun -v -deffnm npt
  • cd ..

5. Production MD (2ns)

  • cd /home/chpeng/Sambunu/zyzhou-md-operation-example/mdtest/0168                                             
  • gmx_mpi grompp -f 2ns-md.mdp -c /home/chpeng/Sambunu/zyzhou-md-operation-example/pre-equilibration/0168/npt/npt.gro -t /home/chpeng/Sambunu/zyzhou-md-operation-example/pre-equilibration/0168/npt/npt.cpt -p ../../sys-pre/tleap/0168-tleap-zyzhou/gmx/topol.top -n ../../sys-pre/tleap/0168-tleap-zyzhou/gmx/index.ndx -o md.tpr
Background submission task
  • nohup gmx_mpi mdrun -v -deffnm md &
Wait around 1 hour, and you can view the MD results!
  • /home/chpeng/Sambunu/zyzhou-md-operation-example/mdtest/0168/md.xtc
  • /home/chpeng/Sambunu/zyzhou-md-operation-example/mdtest/0168/md.gro

Last words

Congratulations on successfully processing a complete MD simulation yourself !
However, there are also many important knowledge you need to learn. Here are some suggestions:
  • Read more about MD principles. Mature MD software tutorials and manuals, such as those for GROMACS and AMBER, describe these topics in great detail.
  • Examine the contents of tleap.in in this tutorial; it provides specifics on how to prepare an MD system (not only the protein and ligand but also solvents and ions). Why do we need them? Please read more articles on this topic.
  • Understand why pre-equilibration is necessary before production MD. What’s the difference between NVT equilibration, NPT equilibration, and production MD simulation? Please read their mdp files carefully and investigate every parameter included.
  • ... ...
I hope you enjoy the fantastic MD simulation and have a pleasant journey!
Note: The system discussed in this tutorial is connected with an unpublished article, so please keep this information confidential.