Molecular Dynamics Group, University of Queensland
Molecular Dynamics Group, University of Groningen

MD

Tutorial

Step 1: Conversion of the PDB File



To begin we will first convert the pdb file to the internal format used by the Gromacs package, the gromos file type (*.gro) Note, it is possible to use a pdb file as an input file for an MD simulations with Gromacs, but it is better to convert it first. This is because pdb files often contain ambiguous data, such as two alternative coordinates for the same atom. Also using x-ray crystallography the positions of hydrogens cannot be observed directly. The positions of the hydrogens have either been modelled or are simply not included in the structures. In the Gromos96 force field, that we will use, polar and aromatic hydrogens are treated explicitly and must be generated if necessary. In contrast, non-polar hydrogens bound to carbon are treated implicitly and must be removed if present in the pdb file.

The conversion program pdb2gmx checks each residue in the structure file against a database and adds hydrogens as appropriate. The program pdb2gmx also creates a topology file for the molecules which contains a description of the connectivity interactions between the atoms.

pdb2gmx can be used by simply typing it at the prompt. To get a list of available options type:

pdb2gmx -h

This shows a description of the program and the options available. For example, it is possible to change the hydrogens to deuterium (by changing the mass). Most of these options are not important in the current context. For the conversion of the structure file 1AKI.pdb we will use the default settings and only the main options, -f, -p and -o will be used to stipulate the structure file (input), topology file (output) and the structure file (output), respectively:

pdb2gmx -f 1AKI.pdb -p aki.top -o aki.gro

This will first show the description of the program and generate a warning. The program will ask to select a force field:

Select the Force Field:
 0: GROMOS96 43a1 force field
 1: GROMOS96 43b1 vacuum force field
 2: GROMOS96 43a2 force field (improved alkane dihedrals)
 3: GROMOS96 45a3 force field (Schuler JCC 2001 22 1205)
 4: GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)
 5: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)
 6: OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)
 7: [DEPRECATED] Gromacs force field (see manual)
 8: [DEPRECATED] Gromacs force field with hydrogens for NMR
 9: Encad all-atom force field, using scaled-down vacuum charges
10: Encad all-atom force field, using full solvent charges

Choose the GROMOS96 53a6 force field which is the latest version of the official Gromos96 force field and press Enter. The program generates a string of output to the screen. Look at this output. First you will see that that the program has taken a series of default options. For example, the end groups which are not stipulated in the pdb file have been set to NH3+ and COO-.

Question 1: What does the program assume in regard to the pH of the system? A major limitation of classical simulations is that the protonation states of particular residues must be predefined. What would need to be changed to simulate this system at pH 2.0?

When the program has finished, list the files in your directory ('ls'). You should see four files, the original pdb file, and 3 files generate by pdb2gmx. These are the topology (.top) the gromos (.gro) file and a file called posre.itp. This file is created automatically together with the topology and contains information that can be used to restrain the positions of the atoms in the structure if needed. For a more detailed description of these files follow this link.

To examine the .gro file type 'more aki.gro'. You will see that the .gro file is similar to the original pdb file. In particular it contains the same information regarding the positions of the atoms, but the layout is different, hydrogens have been added and units have been converted to nm.

Now examine the topology file ('more aki.top'). This file contains information describing the molecule and its interactions including the names, types, masses and charges of each atom, as well as information on the bonds, angles, dihedrals, etc between atoms.

Step 2: Energy Minimization



We now have converted the structure into an appropriate form, (hydrogens have been added) and generated a topology file that describes the interactions between the atoms. The next step is to minimize the energy of the system. Normally, the system needs to be energy minimised before it is possible to perform a molecular dynamics simulation in order to remove any local strain. Localized strain can be present due to small errors in the original structure such as bad Van der Waals contacts (atomic overlap). Even small differences between the force field used to refine pdb structure and Gromos force field can give rise to unrealistically high forces between atoms. Such strain can be removed by energy minimization of the structure.

The system can be energy minimized using the program 'mdrun'. This is the same program that will be used to perform molecular dynamics simulations. Mdrun uses a single file called a .tpr file as input. The .tpr file is generated by combining the topology (aki.top), structure (aki.gro) and a parameter file (minim.mdp). To generate the .tpr file the program grompp has to be used.

A description of grompp can be obtained by giving the command:

grompp -h

A template minim.mdp parameter file has been created for you to use. This must be copied into your personal subdirectory. This can be performed using the command 'cp' (usage cp [source/]file loc):

cp /home/MDEducation/mdcourse/minim.mdp ./

The file can also be found here for people who do not have access to the network.

You will need to edit the contents of this file (use either nedit minim.mdp or emacs minim.mdp).

Modify the title to include you name and the date. As you can see the file contains a range of parameters. A full list of all the parameters and the options that can be used in an .mdp file can be found by following the links here.

Use grompp to generate the mdrun input file (.tpr) from the three files mentioned:

grompp -f minim.mdp -c aki.gro -p aki.top -o eminput.tpr

Question 2: Grompp also uses a number of hidden files containing critical information not in the other files. What information is missing? (Hint: The program mdrun must get all information regarding nature of the molecule from the .tpr file. Look again at the .top file, is all information required present?)

The file input.tpr has now been created (to verify this, type 'ls'). This is a so called binary file and is not in a human readable form.

To view the contents of this file you can use the command 'gmxdump -s eminput.tpr | more '.

(Hint: Looking at this file will help you answer question 2.)

The .tpr file contains all information that is needed to perform the run. The basic options for the mdrun program can be viewed using:

mdrun -h

The -s option is obligatory and followed by the name of the run input file. Output is generated by specifying the -o, -c and -e options. This will create a trajectory file, a structure file of the minimized structure and an energy file, with all the energies for every step. Perform the run using the following command (the -v option is used to output all messages to the screen and the option -nice sets the priority level):

mdrun -nice 0 -v -s eminput.tpr -o minim_traj.trr -c minimized.gro -e minim_ener.edr

The energy minimization may take some time, depending on the CPU in and the load of the computer. The trajectory file is not very important in energy minimizations, but the generated structure file (minimized.gro) will serve as input for the simulation.

During the minimization the potential energy decreases. A plot from the energy over time can be made from the minim_ener.edr file using g_energy. This program also has several options:

g_energy -h

Simply make a plot from the .edr file by executing:

g_energy -f minim_ener.edr -o minim_ener.xvg

This will display something like the following:

Opened minim_ener.edr as single precision energy file
Select the terms you want from the following list
End your selection with 0
1= G96Bond 2= G96Angle 3= Proper Dih. 4= Improper Dih. 5= LJ-14 6= Coulomb-14 7= LJ (SR) 8= Coulomb (SR) 9= Potential 10= Kinetic En. 11= Total Energy 12= Temperature 13=Pressure (bar) 14= Vir-XX 15= Vir-XY 16= Vir-XZ 17= Vir-YX 18= Vir-YY 19= Vir-YZ 20= Vir-ZX 21= Vir-ZY 22= Vir-ZZ 23= Pres-XX (bar) 24= Pres-XY (bar) 25= Pres-XZ (bar) 26= Pres-YX (bar) 27= Pres-YY (bar) 28= Pres-YZ (bar) 29= Pres-ZX (bar) 30= Pres-ZY (bar) 31= Pres-ZZ (bar) 32= #Surf*SurfTen 33= Mu-X 34= Mu-Y 35= Mu-Z 36=Coul-SR:Protein-Protein 37=LJ:Protein-Protein 38=Coul-14:Protein-Protein 39=LJ-14:Protein-Protein 40=Coul-SR:Protein-rest 41=LJ:Protein-rest 42=Coul-14:Protein-rest 43=LJ-14:Protein-rest 44=Coul-SR:rest-rest 45= LJ:rest-rest 46=Coul-14:rest-rest 47=LJ-14:rest-rest 48= T-rest
49= Lamb-rest

Note, in the latest version of the program the numbers are not shown and you must copy and past the key words. Select 9 (potential), which codes for the total potential energy and then type either a 0 or enter to finish the selection, thus:

It is also possible to select more than one option, e.g.  1(G96Bond), 2(G96Angle), 3 (Proper Dih.), 0 (enter) <enter> will generate a graph of the bond, angle and torsion  energy.

The program g_energy produces a .xvg graph, which can be viewed and edited with xmgrace:

xmgrace -nxy minim_ener.xvg

Help on this program can be found at http://plasma-gate.weizmann.ac.il/Grace/doc/UsersGuide.html. The current graph has an error in it, the label for the horizontal axis should be 'step' instead of 'time (ps)'. Correct this by going to 'plot', select 'axis properties' and in 'label string' delete Time (ps) and write Step. Then click Accept and Close. To exit xmgrace go to File and select Exit.

Question 3: How long did the system take to converge (has it converged)? It this structure likely to be the global energy minima? If you reduce emtol from 10 kJ/mol to 1 kJ/mol would you get a significantly better answer? If not why not?

The system you have just minimized contains the protein plus a small number of water molecules. Effectively you have removed one molecule from a crystal latice and performed the minimization in vacuum. In order to be able to simulate the protein in a more realistic environment (i.e. in aqueous solution) we must first place the system in an appropriately sized periodic box and fill the empty space with water.

The program editconf can be used to create an appropriately sized box :

editconf -h 

Use a standard cubic box and be sure that the protein does not see its images. For this the minimum distance between the wall of the box and any atom of the protein must be greater than half of the cut-off (1.4 nm)

editconf -f minimized.gro -o minimized_box.gro -d 0.75 -bt cubic

To fill the box with solvent the program genbox will be used.

genbox -h 

genbox works by taking a small periodic box of solvent and copying this box many times until the volume of the box is completely filled. Then the program removes any solvent molecules which lie within a certain cutoff distance (the sum of the van der Waals radii) of any atom of the protein. We will use water as the solvent. There are, however, many different models to describe water. The model that we will be using is the three site SPC (Simple Point Charge) water model developed by Herman Berendsen.

genbox needs as input the protein in a box (minimized_box.gro) and a solvent box (spc216.gro) and the topology of the solute (aki.top) and of the solvent (spc.itp).

cp /home/MDEducation/mdcourse/spc216.gro ./
cp /home/MDEducation/mdcourse/spc.itp ./

genbox -cp minimized_box.gro -cs spc216.gro -o aki_water.gro -p aki.top

The topology file (aki.top) is updated automatically by genbox:  the topology file of the water (spc.itp) is included  and the number of water molecules is added. Look breifly at the new version of aki.top (less aki.top).

We now need to mimimize the system again. Follow the same steps as above.

You will need to use grompp to generate the mdrun input file (.tpr) using the new aki.top, the aki_water.gro file you just created and an appropriately modified version of the minim.mdp.

A template minim.mdp parameter file has been created for you to use. This must be copied into your personal subdirectory. This can be done using the command 'cp' (usage cp [source/]file loc):

cp /home/MDEducation/mdcourse/minim_water.mdp ./

The file can also be found here for people who don't have access to the network.

You will need to edit the contents of this file (use either nedit minim_water.mdp or emacs minim_water.mdp).

You should note that the option for periodic boundary conditions (pbc) has now been activated. You will will need to add in appropriate values for the the number of steps of energy minimization to be performed (nsteps) and the frequency with which the energies and coordinates are written (nstenergy and nstxtcout). You should also change the title. As noted above a full list of all the parameters and the options that can be used in an .mdp file can be found by following the links here.

grompp -f minim_water.mdp -c aki_water.gro -p aki.top -o eminput_water.tpr

mdrun -nice 0 -v -s eminput_water.tpr -o minim_traj_water.trr -c minimized_water.gro -e minim_ener_water.edr

Step 3: Molecular Dynamics Simulation



The files needed to performing an MD simulation are very similar to those used to perform the energy minimization although in this case we will use the structure file generated after minimization (minimized.gro). The topology file is the same (aki.top).

We run a short initial simulation of 50 ps.
 
The parameter file (fullmd_sol.mdp) which can be downloaded here, or copied using:

cp /home/MDEducation/mdcourse/fullmd_sol.mdp ./

This file has information on settings to use for the simulations: the duration, writing of output files, treatment of long-range interactions and thermodynamic quantities, initial velocities and constraints on the protein. Take a look at the file (less fullmd_sol.mdp).

You will notice there are a number of additional parameters.
The integrator has been changed to md.
The time step is set to 0.002ps (this is appropriate for water at 300K).
The temperature is set to 300K.
The pressure is set to 1 atmosphere.
We also need to generate velocities. This is done using a random number gnerator for which you will need to give a new initial seed (a 5 figure number). This ensures youe simulation will be different to everyone elses.

Edit the contents of the fullmd_sol.mdp file (use either nedit fullmd_sol.mdp or emacs fullmd_sol.mdp). Change the random number seed and set the value for the number of steps to that needed to perform a 50ps simulation.

Use the preprocessor grompp to combine the input files to one run input file (.tpr):

grompp -f fullmd_sol.mdp -c minimized_water.gro -p aki.top -o fullmd.tpr

After grompp has finished the simulation can begin:

mdrun -nice 0 -v -s fullmd.tpr -o md_traj.trr -x md_traj.xtc -c md_final.gro -e md_ener.edr

Most of the options have also been used for the energy minimization, except for -x. This will generate a trajectory file in the .xtc format, which contains information on the trajectories of the atoms in a compact manner (only containing Cartesian coordinates).

While the simulation is running, use VMD to look the protein solvated in water (minimized_water.gro).
To view the system, select the option 'File' in the main menu. Then select 'New Molecule'. Type minimized_water.gro in filename section. Finally click on 'Load'. Try to visualize the solvent accessible surface.
Notice the box wall dimension and the number of water molecules around the protein.

You can now try to visualize the trajectory of the system. Select the option 'File' in the main menu and go to 'New Molecule'. Set the 'Load files for' to minimized_water.gro. Now use the browse option to find the file 'md_traj.xtc' or type the full path to this file in the 'Filename' section and click on 'Load'.

VMD is now reading your trajectory. Use the arrows at the bottom of the main menu to start, to stop and to change the speed of your animation.

To visualize your trajectory better, to go 'Graphics' select 'Representations'. First type in the 'Selected Atoms': 'protein' and then enter. Only the  protein  should be visible. Now, click on 'Create Rep' and type in the 'Selected Atoms' : 'water and within 3 of protein' and then enter. This command will display only the water molecules around the protein within a distance of 3 Å.  Choice the 'orthographic' option in the 'Display' menu and then zoom using 'Scale mode' in the 'Mouse' menu to see better the details.
You should be able to see the water and the protein and moving.  Try to visualize the hydrogen bonds. This can be done using options in the 'Draw style' menu

Try to explicitly animate the residues involved in the active site (Glu 35, Asp 52, Gln 57 and Trp 108). Also try to show the hydrogen bonds during the animation. VMD has many more features and it is encouraged to explore the program and the structure by trying some. Create a nice movie of your simulation, using Draw style menu  ('Selected atoms: Backbone' and 'Draw style Cartoon'). Remember that you can save your work.

Now the simulation is finished. Take a look what happens to the total energy.

Use g_energy to check the total energy of your systems. 

g_energy -f md_ener.edr -o md_ener_50ps.xvg

Select 12(total_energy) and type 0 or enter to quit. Remember it is also possible to select more than one option. will generate a graph of the potential, kinetic and total energy.

xmgrace -nxy md_ener_50ps.xvg

Question 4: This simulation is being conducted under a specific set of boundary conditions. These include the box itself but also factors such as the temperature and pressure. Try to list other conditions you think would effect the outcome of the simulation (some have been mentioned before). The system has been placed in a cubic box and aligned so that its long axis is aligned with the box. What is a basic problem with this for a non-spherical molecule in a periodic box? Why do we bother using a periodic box. Would it be more realistic to simulate a protein in a drop of water? If not, what artifacts might you expect?

The next section provides more informatio on the analysis of simulations.

Home
Introduction
Lysozyme
Getting Started
Visualization
MD Theory
MD Practice
Analysis
Finally

©2005/2007 T.A.Wassenaar/A.E.Mark