

Internship report submitted to


BY V.RAGHAVENDRA I M.Sc. Photonics and Bio Photonics NCUFP

Worked under the guidance of

Dr. V. SUBRAMANIAN, SCIENTIST EII, Chemical Laboratory, Central Leather Research Institute, (Council of Scientific & Industrial Research) Adyar, Chennai-600 020


I sincerely express my deep sense of gratitude to Dr. V. Subramanian, Scientist, CLRI, for giving me the opportunity to train at the Chemical Laboratory, CLRI. My special thanks go to Mr. Azhagiya Singam for his guidance, valuable advice, patience and encouraging presence throughout my internship.

Molecular dynamics (MD) is a computer simulation technique in which the time evolution of a set of interacting atoms is followed by integrating their equations of motion. It simulates the time-dependent behavior of a molecular system and is based on Newton's second law of motion:

F = ma
where F is the force in newtons (N), m is the mass in kilograms (kg) and a is the acceleration in m/s².

With the initial positions and velocities given, the subsequent time evolution is completely determined. On the computer screen the atoms and molecules behave much as real atoms and molecules would: they collide with each other, vibrate about their mean positions when constrained, wander around when the system under consideration is a fluid, oscillate in waves in concert with their neighbors, or even evaporate from the system if there is a free surface.

After Linus Pauling's highly significant work on quantum chemistry, computational chemistry gradually began to evolve. The books that were influential in its early development include Pauling and E. Bright Wilson's Introduction to Quantum Mechanics with Applications to Chemistry, Eyring, Walter and Kimball's Quantum Chemistry, and Heitler's Elementary Wave Mechanics with Applications to Quantum Chemistry. These books served as primary references for chemists in the years that followed.

The arrival of high-speed computers in the 1950s altered the picture by introducing a new element between experiment and theory: the computational experiment. In the early 1950s the first semi-empirical atomic orbital calculations were carried out, and theoretical chemists became extensive users of the early digital computers. In a computer experiment the model is provided by the theorist, but the calculations are done by the machine following a recipe, that is, an algorithm implemented in a suitable programming language.

Molecular Dynamics and Molecular Mechanics:

In molecular dynamics, atoms interact with each other. These interactions are due to forces which act upon every atom and which originate from all other atoms. Atoms move under the action of these instantaneous forces. As the atoms move, their relative positions change and the forces change as well. Typical MD simulations can be performed on systems containing thousands of atoms, for simulation times ranging from a few picoseconds to hundreds of nanoseconds.

Many of the systems we would like to study with molecular modeling are too large to be treated with quantum mechanics. Quantum mechanics deals with the electrons of the system, and even if some of the electrons are ignored, as in the semi-empirical schemes, a large number of particles must still be considered and the calculations are very time-consuming. Force field methods, also known as molecular mechanics, ignore the electronic motion and calculate the energy of the system as a function of the nuclear positions only. In some cases force fields can provide answers that are as accurate as even the highest-level quantum mechanical calculations, in a fraction of the computer time.

A force field is a collection of functional forms and associated constants. With that collection in hand, the energy of a given molecule (whose atomic connectivity must in general be specified) can be evaluated by computing the energy associated with every defined type of interaction occurring in the molecule. Because there is typically a rather large number of such interactions, the process is facilitated by the use of a digital computer, but the mathematics is extraordinarily simple and straightforward. Force field functions and parameter sets are derived from both experimental work and high-level quantum mechanical calculations.

The functional forms depend on internal coordinates and on atomic properties. The molecular potential energy surface can be constructed as a sum of energies from chemically intuitive functional forms. The individual parameters include force constants, equilibrium coordinate values, phase angles and so on.
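As a rough illustration of a potential energy built as a sum of simple functional forms, the sketch below adds a harmonic bond-stretch term, a harmonic angle-bend term and a Lennard-Jones non-bonded term. The function names and all parameter values are hypothetical and are not taken from any published force field; real force fields contain further terms (torsions, electrostatics).

```python
import math


def bond_energy(r, k_b, r0):
    """Harmonic bond stretch: 0.5 * k_b * (r - r0)^2."""
    return 0.5 * k_b * (r - r0) ** 2


def angle_energy(theta, k_theta, theta0):
    """Harmonic angle bend: 0.5 * k_theta * (theta - theta0)^2, angles in radians."""
    return 0.5 * k_theta * (theta - theta0) ** 2


def lennard_jones(r, epsilon, sigma):
    """Non-bonded 12-6 Lennard-Jones interaction."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def total_energy(bonds, angles, pairs):
    """Sum every defined interaction; each item is (coordinate, *parameters)."""
    energy = sum(bond_energy(*b) for b in bonds)
    energy += sum(angle_energy(*a) for a in angles)
    energy += sum(lennard_jones(*p) for p in pairs)
    return energy


# Hypothetical coordinates and parameters in arbitrary units, for illustration only.
bonds = [(0.11, 3000.0, 0.10), (0.10, 3000.0, 0.10)]
angles = [(math.radians(106.0), 400.0, math.radians(104.5))]
pairs = [(0.35, 0.65, 0.32)]
print(total_energy(bonds, angles, pairs))
```

The force constants and equilibrium values passed to each function play the role of the "individual parameters" mentioned above.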

Some Important Terms:

Periodic Boundary Conditions (PBC): The classical way to minimize edge effects in a finite system is to apply periodic boundary conditions. The atoms of the system to be simulated are put into a space-filling box, which is surrounded by translated copies of itself (Fig. 1.1). Thus there are no boundaries of the system; the artifact caused by unwanted boundaries in an isolated cluster is replaced by the artifact of periodic conditions. If the system is crystalline, such boundary conditions are desired (although motions are naturally restricted to periodic motions with wavelengths fitting into the box). If one wishes to simulate non-periodic systems, such as liquids or solutions, the periodicity by itself causes errors. The errors can be evaluated by comparing various system sizes; they are expected to be less severe than the errors resulting from an unnatural boundary with vacuum. If a particle leaves the box during the simulation, it is replaced by an image particle that enters from the opposite side, so the number of particles in the central box remains constant.
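The two ingredients of periodic boundary conditions described above can be sketched in a few lines: re-inserting a particle that has left the box, and measuring distances to the nearest periodic image. This is a minimal sketch assuming a rectangular box with one corner at the origin; the function names and the box size are hypothetical.

```python
def wrap(position, box):
    """Re-insert a particle that left the box through the opposite face."""
    return [x % length for x, length in zip(position, box)]


def minimum_image(r_i, r_j, box):
    """Displacement from j to i, measured to the nearest periodic image of j."""
    displacement = []
    for a, b, length in zip(r_i, r_j, box):
        delta = a - b
        delta -= length * round(delta / length)  # shift by whole box lengths
        displacement.append(delta)
    return displacement


box = [3.0, 3.0, 3.0]                 # hypothetical cubic box, edge length 3.0
p1 = wrap([3.2, 1.0, -0.1], box)      # this particle drifted out on two sides
p2 = wrap([2.9, 1.0, 0.1], box)       # this one is already inside
d = minimum_image(p1, p2, box)
print(p1, d)
```

After wrapping, the two particles sit near opposite faces of the box, yet the minimum-image displacement is short because each one interacts with the nearest copy of the other.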

Cut-off Radius: The distance beyond which molecular interactions are disregarded is called the cut-off radius. When PBC are used, the cut-off radius should not be so large that a particle sees its own image, or indeed the same molecule twice.
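For a rectangular box, the condition that a particle must not see its own image is commonly expressed by keeping the cut-off no larger than half the shortest box edge. The sketch below assumes that rule and a simply truncated Lennard-Jones interaction; the function names are hypothetical, and production codes often also shift or smooth the potential at the cut-off.

```python
def max_cutoff(box):
    """Largest cut-off consistent with the minimum-image rule in a rectangular box."""
    return 0.5 * min(box)


def lj_truncated(r, epsilon, sigma, r_cut):
    """Lennard-Jones energy, disregarded (set to zero) beyond the cut-off radius."""
    if r >= r_cut:
        return 0.0
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


box = [3.0, 4.0, 5.0]            # hypothetical box edge lengths
r_cut = max_cutoff(box)          # half of the shortest edge
print(r_cut, lj_truncated(1.2, 1.0, 1.0, r_cut), lj_truncated(2.0, 1.0, 1.0, r_cut))
```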

Time Step: The engine of a molecular dynamics program is its time integration algorithm. The time interval at which the integration is carried out is called the time step.

Ensembles: Properties such as heat capacity depend on the positions and momenta of the particles that make up the system. Because the equations of motion are integrated repeatedly over time, the value obtained for a property is a time average. To deal with a collection of molecules in statistical mechanics, one typically requires that certain macroscopic conditions be held constant by external influence; the set of these conditions defines the ensemble. The following are the different types of ensembles.

a) Microcanonical Ensemble (NVE): The number of particles, volume and energy of the system are held constant.

b) Isothermal-Isobaric Ensemble (NPT): The number of particles, pressure and temperature of the system are held constant.

c) Canonical Ensemble (NVT): The number of particles, volume and temperature of the system are held constant.

Classical Mechanics:
The molecular dynamics simulation method is based on Newton's second law of motion, F = ma, where F is the force in newtons, m is the mass in kilograms and a is the acceleration in m/s². Knowing the force on each atom, the acceleration of each atom in the system can be calculated. Integrating the equations of motion yields a trajectory that describes the positions, velocities and accelerations of the particles as functions of time. From this trajectory the average values of properties can be calculated. Once the positions and velocities of the atoms are known at one instant, the state of the system can be calculated at any other time.

According to Newton's second law of motion,

F = ma

where F is the force acting on a particle, m is the mass of the particle and a is the acceleration of the particle.

The force can also be expressed as the negative gradient of the potential energy,

$$F = -\frac{dV}{dr}$$

Combining the above two equations we get

$$-\frac{dV}{dr} = m \frac{d^2 r}{dt^2}$$

where V is the potential energy of the system and r is the position of the particle. Newton's equation of motion thus relates the derivative of the potential energy to the changes in position as a function of time. A simple application of Newton's second law of motion can be written as

$$F = ma = m \frac{dv}{dt} = m \frac{d^2 x}{dt^2}$$

Assuming the acceleration to be constant,

$$a = \frac{dv}{dt}$$

After integration we obtain the value for the velocity as

$$v = at + v_0$$

Since v = dx/dt, after integrating once more we obtain

$$x = x_0 + \int_0^t v(t') \, dt'$$

Combining this equation with the expression for the velocity, we obtain the following relation, which gives the value of x at time t as a function of the acceleration, a, the initial position, x0, and the initial velocity, v0.

$$x = \frac{1}{2} a t^2 + v_0 t + x_0$$

The acceleration is given as the derivative of the potential energy with respect to the position, r,

$$a = -\frac{1}{m} \frac{dV}{dr}$$

Therefore, to calculate a trajectory we need only the initial positions of the atoms, an initial distribution of velocities and the acceleration, which is determined by the gradient of the potential energy function. The initial positions can be obtained from experimental structures, such as the X-ray crystal structure of the protein or the solution structure determined by NMR spectroscopy.

Optimization Algorithm: There are many different algorithms for finding the set of coordinates corresponding to the minimum energy. They are called optimization algorithms because they can be used equally well for finding the minimum or the maximum of a function.

Verlet Algorithm: The Verlet integration algorithm is a numerical method used to calculate the trajectories of the particles in molecular dynamics simulations. The potential energy of a system is a function of the three-dimensional positions of all the atoms in the system, which makes it a complicated function. There is no analytical solution to the equations of motion; they must be solved numerically.

The Verlet algorithm uses positions and accelerations at time t and the positions from time t-dt to calculate new positions at time t+dt. The Verlet algorithm uses no explicit velocities. The advantages of the Verlet algorithm are, i) it is straightforward, and ii) the storage requirements are modest. The disadvantage is that the algorithm is of moderate precision.
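The update described above is usually written as x(t+dt) = 2x(t) - x(t-dt) + a(t)·dt². The following is a minimal sketch of that update for a one-dimensional harmonic oscillator, chosen here only because its exact solution, cos(t), is known; the test system, the function names and the step size are assumptions made for illustration.

```python
import math


def acceleration(x, k=1.0, m=1.0):
    """Test system: harmonic oscillator, a = -(1/m) dV/dx with V = 0.5*k*x^2."""
    return -k * x / m


def verlet(x0, v0, dt, n_steps):
    """Position Verlet: x(t+dt) = 2*x(t) - x(t-dt) + a(t)*dt^2."""
    # The scheme needs x(t-dt) to start; estimate it with a backward Taylor step.
    x_prev = x0 - v0 * dt + 0.5 * acceleration(x0) * dt ** 2
    x = x0
    trajectory = [x0]
    for _ in range(n_steps):
        x_next = 2.0 * x - x_prev + acceleration(x) * dt ** 2
        x_prev, x = x, x_next
        trajectory.append(x)
    return trajectory


traj = verlet(x0=1.0, v0=0.0, dt=0.01, n_steps=628)
print(traj[-1], math.cos(628 * 0.01))  # numerical position vs. exact cos(t)
```

Note that no velocity appears inside the loop, and that only the current and previous positions need to be stored, which is the modest storage requirement mentioned above.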

The Leap-frog algorithm: Leapfrog is another method for integrating the differential equations of motion in dynamical simulations. It calculates positions and velocities at interleaved time points, so that they 'leapfrog' over each other: the positions are known at integer time steps and the velocities at integer-plus-a-half time steps. In this algorithm the velocities are first calculated at time t + dt/2; these are then used to calculate the positions, r, at time t + dt. In this way the velocities leap over the positions, and then the positions leap over the velocities. The advantage of this algorithm is that the velocities are calculated explicitly; the disadvantage is that they are not calculated at the same time as the positions.
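The interleaving of positions and velocities can be sketched as follows, again for a one-dimensional harmonic oscillator whose exact solution is cos(t). The test system and the function names are assumptions made for illustration; the first half-step velocity is obtained here with a simple half kick.

```python
import math


def acceleration(x, k=1.0, m=1.0):
    """Test system: harmonic oscillator, a = -k*x/m."""
    return -k * x / m


def leapfrog(x0, v0, dt, n_steps):
    """Positions live on integer steps, velocities on half-integer steps."""
    x = x0
    v_half = v0 + 0.5 * acceleration(x0) * dt     # v(t + dt/2) to start the scheme
    positions = [x]
    for _ in range(n_steps):
        x = x + v_half * dt                       # x(t + dt) from v(t + dt/2)
        v_half = v_half + acceleration(x) * dt    # v(t + 3dt/2) from a(t + dt)
        positions.append(x)
    return positions


xs = leapfrog(x0=1.0, v0=0.0, dt=0.01, n_steps=628)
print(xs[-1], math.cos(628 * 0.01))  # numerical position vs. exact cos(t)
```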

The Velocity Verlet algorithm: This method of integration is similar to the leapfrog method, except that the velocity and position are calculated at the same value of the time variable. It yields positions, velocities and accelerations at time t, and its accuracy is very reliable.
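A minimal sketch of velocity Verlet is given below, using a one-dimensional harmonic oscillator as an assumed test system. Because position and velocity are available at the same instant, the total energy can be evaluated at every step, which gives a simple check on the stability of the integration. The function name, parameters and step size are hypothetical.

```python
def velocity_verlet(x0, v0, dt, n_steps, k=1.0, m=1.0):
    """Advance x and v together; returns final x, final v and the energy at each step."""
    x, v = x0, v0
    a = -k * x / m
    energies = []
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt ** 2     # new position from current v and a
        a_new = -k * x / m                     # acceleration at the new position
        v = v + 0.5 * (a + a_new) * dt         # velocity from the average acceleration
        a = a_new
        energies.append(0.5 * m * v * v + 0.5 * k * x * x)
    return x, v, energies


x_end, v_end, energies = velocity_verlet(x0=1.0, v0=0.0, dt=0.01, n_steps=1000)
print(x_end, v_end, max(energies) - min(energies))  # the energy spread stays small
```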

Some Available Force Fields:

CFF - Consistent force field
CHEAT - Carbohydrate hydroxyls represented by external atoms
ECEPP - Empirical conformational energy program for peptides
EFF - Empirical force field
GROMOS - Groningen molecular simulation
MMFF - Merck molecular force field
OPLS - Optimized potentials for liquid simulations

GROMACS: GROningen MAchine for Chemical Simulations, abbreviated as GROMACS, is a molecular dynamics simulation package developed at the University of Groningen, Netherlands. It is a high-end, high-performance research tool based on classical molecular dynamics theory for the study of protein dynamics. It is freely available software and runs on Unix, Linux and Windows.

First, the 3D structure of the protein to be simulated is downloaded from the Protein Data Bank (PDB) website. The structure is then viewed with a molecular viewer such as PyMOL to find residues with missing side chains. Opening the structure in the DeepView software rebuilds any missing side chains.

A molecular dynamics simulation basically consists of three stages: first, the input data is prepared; second, the simulation is run; and third, the results are analyzed.

Preparation: After downloading the structure from the PDB, we need to check whether any residue is missing from the structure. Any missing residue needs to be added.

Structure conversion and topology: A molecule is defined by the coordinates of its atoms as well as by a description of the bonded and non-bonded interactions. The structure obtained from the PDB contains only coordinates; we therefore need a topology, which describes the system in terms of atom types, charges, bonds, etc. A topology is always specific to a certain force field. It is important that the topology matches the structure, which means that the structure also needs to be converted to adhere to the force field used. To convert the structure and construct the topology, the program pdb2gmx can be used. This program is designed to build topologies for molecules consisting of distinct building blocks, such as amino acids. It uses a library of building blocks for the conversion and will fail to recognize molecules or residues not present in the library. The following command is used to convert the structure.

< pdb2gmx -f protein.pdb -o protein.gro -p protein.top -ignh >

Here -ignh indicates that the hydrogen atoms in the file are ignored and rebuilt according to the description in the force field. OPLS (Optimized potentials for liquid simulations) is the force field selected from the list of force fields.

Now that we have the desired structure and force field, a few more things need to be taken care of before starting the simulation. First, periodic boundary conditions (PBC) are applied to the system. This enables the simulation to be performed with a small number of particles. The particles near the walls of the box do not experience edge effects, i.e. they do not feel that they are at the wall of the box; rather, they experience forces as any particle in the bulk of the system would. As already mentioned, if a particle leaves the box during the simulation, it is replaced by an image particle that enters from the opposite side, so the number of particles in the central box remains constant. Cubic, octahedron and rhombic dodecahedron are some of the box types available. The following command is used to add the PBC to the system.

< editconf -f protein-EM-vacuum.gro -o protein-PBC.gro -bt cubic -d 1.2 >

Here -bt selects the box type and -d sets the distance between the protein and the box edge, in nm.

Solvent molecules are now added to the system. The solvent in this case is water, and the following command is used.

< genbox -cp protein-PBC.gro -cs spc216.gro -p protein.top -o protein-water.gro >

The next step is the addition of ions. The solvated protein carries a net charge of its own, and to make the system neutral we add counter ions. The following command prepares the run input file that is needed for this step; the ions themselves are then inserted by the genion program.

< grompp -v -f minim.mdp -c protein-water.gro -p protein.top -o protein-water.tpr >

All of the above steps leave some strain in the system, for example from atoms that are very close to each other or overlapping, or from like charges placed too close together. This strain is relieved by an energy minimization step. Energy minimization is technically a simulation step. The structure and the topology are combined into a single description of the system, together with a number of control parameters. This yields a run input file, which can be used as the single input for the simulation program mdrun. Two common minimization methods are steepest descent and conjugate gradient. The former is faster but less accurate; the latter gives a finer structure at a slower rate. The following command prepares the run input file for the energy minimization.

< grompp -v -f minim.mdp -c protein.gro -p protein.top -o protein-EM-vacuum.tpr >
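The idea behind steepest-descent minimization can be illustrated on a toy energy function: the coordinates are moved repeatedly in the direction opposite to the gradient until the gradient becomes negligible. This is only a sketch with a fixed step size and a made-up two-variable energy; it is not how GROMACS implements its minimizer, and all names and values are hypothetical.

```python
def steepest_descent(gradient, x0, step=0.1, tol=1e-6, max_steps=10000):
    """Move downhill along -gradient until the largest gradient component is below tol."""
    x = list(x0)
    for n_done in range(max_steps):
        g = gradient(x)
        if max(abs(gi) for gi in g) < tol:
            return x, n_done
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x, max_steps


def toy_gradient(x):
    """Gradient of the toy energy V = (x - 1)^2 + 2*(y + 0.5)^2."""
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)]


minimum, steps = steepest_descent(toy_gradient, [3.0, 2.0])
print(minimum, steps)  # approaches the minimum at (1.0, -0.5)
```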

The equilibration step follows the energy minimization step. It allows the protein to relax in the solvent environment, so that the molecules move as freely as they would in the real system.

Now that we have the equilibrated system, we can start the simulation. Before starting the simulation it is good to have the following things ready.

a) How long should the simulation run?

b) What is the time step?

c) Is there a need to save velocities?

d) Should all the atoms be saved in the output, or only the protein coordinates?
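The first two questions fix the number of integration steps, and the output interval fixes how many frames are written. The arithmetic can be sketched as below; the function name and the example values (a 1000 ps run, a 2 fs time step and output every 10 ps) are hypothetical and are not taken from the run described in this report.

```python
def plan_run(total_time_ps, time_step_fs, save_every_ps):
    """Return (number of steps, steps between saved frames, number of saved frames)."""
    n_steps = round(total_time_ps * 1000.0 / time_step_fs)
    steps_between_frames = round(save_every_ps * 1000.0 / time_step_fs)
    n_frames = n_steps // steps_between_frames + 1   # +1 for the starting frame
    return n_steps, steps_between_frames, n_frames


print(plan_run(total_time_ps=1000.0, time_step_fs=2.0, save_every_ps=10.0))
```

Saving only the protein coordinates instead of all atoms reduces the size of each frame but not the number of frames.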

The following command prepares the run input file for the production simulation, which is then run with mdrun.

< grompp -v -f md.mdp -c protein-NPT.gro -p protein.top -o topol.tpr >

Once the simulation is finished, changes in bond angles, bond lengths and other conformational details are studied.

Important Applications of Molecular Dynamics:

Discovery and design of new molecules

Computer representations of molecules, chemical databases and 2D substructure searching

Protein structure prediction, sequence analysis and protein folding

Probing the mechanisms of viral assembly

Drug design, where MD has proved to be very useful

Study of the functional properties of biological molecules at the atomic level
