AutoDock: Protein-Ligand Docking Software

AUTODOCK
An Automated Docking Software for Predicting Optimal Protein-Ligand Interaction
By Susan McClatchy, Milind Misra, Chandreyee Mukherjee, Indu Shrivastava
Introduction
Chandreyee Mukherjee
Automated Docking: Importance
Interactions between biomolecules lie at the core of all metabolic processes and life activities
The number of solved protein structures available in the databases is expanding exponentially
To understand their functions, it is essential to elucidate the interaction mechanisms between the different molecules
Primary importance lies in rational drug design
Depending on how well the docked molecules bind, the docking ligand may be redesigned or its structure further refined
Also important in immunology, for studying antigen-antibody interactions
Inhibitor bound to active site of HIVPR
Surface structure of HIVPR with bound inhibitor
What is docking?
Prediction of the optimal physical configuration and energy between two molecules
The docking problem optimizes:
The binding between two molecules, such that their orientation maximizes the interaction
The total energy of interaction, such that the best binding configuration has the minimum binding energy
The resultant structural changes brought about by the interaction
Categories of docking
1. Protein-Protein Docking:
Both molecules are rigid
Interaction produces no change in
conformation
Similar to the lock-and-key model
2. Protein-Ligand Docking:
Ligand is flexible but the receptor protein is
rigid
Interaction produces conformational
changes in ligand
(Figure: 1. Protein-protein docking; 2. Protein-ligand docking, optimized)
Docking uses a search and score method
It involves:
Finding useful ways of representing the molecules and molecular properties
Exploring the configuration spaces available for interaction between ligand and receptor
Evaluating and ranking configurations using a scoring system, in this case the binding energy
However, since the binding energy is difficult to evaluate because the binding sites may not be easily accessible, it is modeled as follows:
$\Delta G_{bind} = \Delta G_{vdw} + \Delta G_{hbond} + \Delta G_{elect} + \Delta G_{conform} + \Delta G_{tor} + \Delta G_{sol}$
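The additive form of the modeled binding energy can be illustrated with a minimal sketch. The class name, field names, and numeric values below are hypothetical placeholders chosen for illustration; they are not AutoDock output or AutoDock code.

```python
from dataclasses import dataclass, fields


@dataclass
class BindingEnergyTerms:
    """Hypothetical container for the six terms of the modeled binding energy."""
    vdw: float      # van der Waals term
    hbond: float    # hydrogen bonding term
    elect: float    # electrostatic term
    conform: float  # conformational term
    tor: float      # torsional term
    sol: float      # solvation term

    def total(self) -> float:
        # The modeled binding energy is the plain sum of all six terms.
        return sum(getattr(self, f.name) for f in fields(self))


# Made-up values, only to show the bookkeeping.
terms = BindingEnergyTerms(vdw=-6.0, hbond=-2.5, elect=-1.0,
                           conform=0.5, tor=1.5, sol=-0.5)
print(terms.total())
```

Keeping the terms separate makes it easy to see which contribution dominates a given docked pose.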
The AutoDock Software

Developed by A. J. Olson's group in 1990.
AutoDock evaluates the free energy of the docking molecules using 3D potential grids.
Uses heuristic search to minimize the energy.

Search Algorithms used:
Simulated Annealing
Genetic Algorithm
Lamarckian GA (GA+LS hybrid)
Algorithms Overview
Simulated Annealing
Based on temperature effects
Start at high temperature: global search
Lower the temperature: local search
Genetic Algorithm
Based on Charles Darwin's theory of evolution
Genotype -> Phenotype
Lamarckian Algorithm (after Jean-Baptiste de Lamarck)
Phenotype -> Genotype
Project Goal
Study the algorithms used to perform the searches and to calculate the minimum energy
Discuss why the GA+LS hybrid is better than SA
Look at an example, i.e., dock a ligand to a protein molecule using the latest AutoDock version
The Algorithms
Sue McClatchy
Simulated Annealing
Algorithm modeled after the cooling of a solution to form glass, though it's better explained by crystal formation
Given a long enough cooling time, molecules will relax into their lowest energy state to form the largest crystals
Quick cooling - highly disordered system
Slow cooling - highly ordered crystal, with each molecule in its lowest energy state
Algorithm simulates either linear or proportional slow cooling
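The two cooling styles can be sketched as temperature schedules. The function names and exact formulas here are assumptions made for illustration (a fixed decrement per cycle for "linear", a constant multiplicative factor for "proportional"); AutoDock's own implementation may differ in detail.

```python
def linear_schedule(t0, cycles):
    """Linear cooling: the temperature drops by the same amount every cycle."""
    step = t0 / cycles
    return [t0 - i * step for i in range(cycles)]


def proportional_schedule(t0, factor, cycles):
    """Proportional cooling: the temperature is multiplied by a factor < 1 every cycle."""
    return [t0 * factor ** i for i in range(cycles)]


# Illustrative values: start at 1,000, use 50 cycles, reduction factor 0.95.
lin = linear_schedule(1000.0, 50)
prop = proportional_schedule(1000.0, 0.95, 50)
print(lin[:3], prop[:3])
```

Proportional cooling spends relatively more cycles at low temperature, which favors local refinement late in the run.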
The SA Algorithm
Uses a neighborhood operator N(s) to generate a set of solutions according to a fixed distribution
The new solution is compared to the preceding solution and is accepted if its energy is lower than that of the previous solution
If the new solution has higher energy, it is accepted probabilistically according to the Boltzmann distribution (see figure above)
At high temperatures, many higher-energy solutions will be accepted; at low temperatures, the majority of probabilistic moves are rejected
Boltzmann acceptance probability: $P = e^{-\Delta E / T}$, where
$\Delta E$ = energy difference between the two solutions (new minus previous),
$T$ = temperature
The Boltzmann distribution gives the probability of finding a system with energy E at temperature T
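A small sketch of the acceptance rule, using the usual sign convention in which an uphill move of size delta E is accepted with probability exp(-delta E / T). The function name is hypothetical and the numbers are purely illustrative.

```python
import math


def acceptance_probability(delta_e, temperature):
    """Probability of accepting a move that changes the energy by delta_e."""
    if delta_e <= 0:
        return 1.0  # downhill (or equal-energy) moves are always accepted
    return math.exp(-delta_e / temperature)


# The same uphill move is likely at high temperature and unlikely at low temperature.
for t in (1000.0, 10.0, 1.0):
    print(t, acceptance_probability(5.0, t))
```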
Pseudocode for SA
Compute a random initial state s
n = 0, x*_n = s                        // initialize best solution to s and state counter to 0
Repeat i = 1, 2, ...                   // number of temperatures to try
  Repeat j = 1, 2, ..., m_i            // no. of steps to perform for each temp. T_i
    Compute a neighbor s' = N(s)       // s' = new solution from N(s)
    if (f(s') <= f(s)) then            // if energy of s' <= energy of s
      s = s'                           // accept new solution s'
      if (f(s) < f(x*_n)) then         // if energy of new solution < energy of best solution of state n
        n = n + 1
        x*_n = s                       // replace best with new
      endif
    else                               // otherwise replace s with s' using the Boltzmann dist.
      s = s' with probability e^((f(s) - f(s'))/T_i)
    endif
  EndRepeat
EndRepeat
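The pseudocode above can be turned into a short runnable sketch. The toy energy function, the neighbor operator, and all parameter values are assumptions made for illustration; this is not AutoDock's implementation.

```python
import math
import random


def simulated_annealing(f, neighbor, s0, temperatures, steps_per_temp, rng):
    """Minimize f starting from s0; returns (best_state, best_energy)."""
    s, fs = s0, f(s0)
    best, f_best = s, fs
    for t in temperatures:                  # outer loop over temperatures T_i
        for _ in range(steps_per_temp):     # m_i steps at each temperature
            s_new = neighbor(s, rng)
            f_new = f(s_new)
            # Downhill moves are always accepted; uphill moves are accepted
            # with Boltzmann probability exp((f(s) - f(s')) / T_i).
            if f_new <= fs or rng.random() < math.exp((fs - f_new) / t):
                s, fs = s_new, f_new
                if fs < f_best:             # keep track of the best state seen
                    best, f_best = s, fs
    return best, f_best


def toy_energy(x):
    """Hypothetical 1-D energy with its minimum at x = 3."""
    return (x - 3.0) ** 2


def toy_neighbor(x, rng):
    """Hypothetical neighborhood operator: a small random step."""
    return x + rng.uniform(-0.5, 0.5)


temps = [10.0 * 0.95 ** i for i in range(50)]   # proportional cooling
best_x, best_e = simulated_annealing(toy_energy, toy_neighbor, 10.0,
                                     temps, 100, random.Random(0))
print(best_x, best_e)
```

Tracking the best state separately from the current state matters because the current state is allowed to move uphill.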
How Genetic Algorithms Work - A Simple Example
Initial population of binary creatures having 6 genes
Each gene has two different alleles, either a 0 or a 1
Three operators: crossover, mutation and selection
Selection is based on a fitness function f(x)
This operator chooses those individuals with the lowest values
Those with higher values are chosen with a very low probability
(Figure: binary individuals with their scores, passed through selection, crossover and mutation)
Lower scoring individuals create more offspring; higher scoring ones create fewer or none at all
Offspring replace the parental generation
An elitism function allows the best individual from the parent generation to persist if it is a better solution than the new individuals created
The cycle of selection, mutation, crossover and replacement is repeated
(Figure: replacement step, showing each individual's score and number of offspring)
Pseudocode for GA
Select an initial population set x_i^0 = {x_1^0, x_2^0, ..., x_M^0}
Determine fitness values f(x_i^0) for each individual
Repeat for g = 1, 2, ..., # of generations
  Perform selection
  Perform crossover with a given probability (the crossover rate)
  Perform mutation with a given probability (the mutation rate)
  Determine fitness f(x_i^g) for new individuals
  x*^g = argmin_{i=1,...,M} f(x_i^g) and y*^g = f(x*^g)
  Perform replacement
Until stopping criterion (# of generations) is reached
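As a rough sketch of the loop above, the following applies it to the 6-gene binary creatures from the simple example. Binary tournament selection, one-point crossover, bit-flip mutation, and the toy score function are assumptions made for illustration; they are not necessarily the operators used in AutoDock.

```python
import random


def tournament(pop, fitness, rng):
    """Pick two individuals at random; the one with the lower score wins."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) <= fitness(b) else b


def run_ga(fitness, n_genes=6, pop_size=20, generations=30,
           p_cross=0.8, p_mut=0.02, seed=0):
    """Minimize fitness over bit lists; returns (best_individual, best_score)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    best = min(pop, key=fitness)
    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            c1 = tournament(pop, fitness, rng)[:]
            c2 = tournament(pop, fitness, rng)[:]
            if rng.random() < p_cross:            # one-point crossover
                cut = rng.randint(1, n_genes - 1)
                c1[cut:], c2[cut:] = c2[cut:], c1[cut:]
            for child in (c1, c2):
                for i in range(n_genes):          # bit-flip mutation
                    if rng.random() < p_mut:
                        child[i] ^= 1
                children.append(child)
        children = children[:pop_size]
        gen_best = min(children, key=fitness)
        if fitness(best) < fitness(gen_best):
            children[0] = best[:]                 # elitism: best parent persists
        else:
            best = gen_best
        pop = children                            # offspring replace the parents
    return best, fitness(best)


def score(bits):
    """Toy fitness: the integer encoded by the bits (lower is better)."""
    return int("".join(map(str, bits)), 2)


best_bits, best_score = run_ga(score)
print(best_bits, best_score)
```

Because of the elitism step, the best score can never get worse from one generation to the next.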
How GA works in AutoDock
A ligand's genes include its x, y and z coordinates (translation)
Its orientation is encoded as a quaternion: a unit vector together with a rotation angle, initially random, between 0° and 360°
Additional genes may represent torsion angles about bonds of the ligand
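One way to picture the genes of a ligand is as a small record holding translation, orientation, and torsions. The class, its field names, and the sampling ranges below are hypothetical choices for illustration, not AutoDock data structures.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class LigandState:
    translation: tuple  # (x, y, z) position of the ligand
    axis: tuple         # unit vector defining the rotation axis
    angle_deg: float    # rotation about the axis, in degrees
    torsions: list      # one torsion angle (degrees) per rotatable bond

    def quaternion(self):
        """Return the orientation as a unit quaternion (w, x, y, z)."""
        half = math.radians(self.angle_deg) / 2.0
        s = math.sin(half)
        return (math.cos(half),) + tuple(c * s for c in self.axis)


def random_state(n_torsions, box_half_width, rng):
    """Draw a random individual, e.g. for an initial population."""
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in v))
    return LigandState(
        translation=tuple(rng.uniform(-box_half_width, box_half_width)
                          for _ in range(3)),
        axis=tuple(c / norm for c in v),          # normalize to a unit vector
        angle_deg=rng.uniform(0.0, 360.0),
        torsions=[rng.uniform(-180.0, 180.0) for _ in range(n_torsions)],
    )


state = random_state(n_torsions=12, box_half_width=10.0, rng=random.Random(1))
q = state.quaternion()
print(state.translation, q, len(state.torsions))
```

A ligand with 12 rotatable bonds therefore carries 3 translation values, an orientation, and 12 torsion values.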
Mapping
In standard GA, the genotype (x, y, z coordinates plus rotation and any torsion angles) is mapped to the fitness function f(x)
The fitness function value corresponds to each individual's phenotype
According to the right-hand side of the figure, genotypes of parents with high f(x) values are mutated to form genotypes of children with lower f(x) values
Selection, Crossover & Mutation
Selection chooses ligands with the lowest fitness (energy) values
Crossover exchanges x, y, z coordinates, or rotations or torsions, between these ligands
Example: two ligands have xyz coordinates Abc and aBc; crossover results in new individuals with coordinates abc and ABc
The mutation operator mutates coordinate or angle values by adding a random real number drawn from a Cauchy distribution, which is similar to a Gaussian but has thicker tails
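The crossover example and the Cauchy mutation can be sketched as follows. The function names are hypothetical, and treating alpha as the location and beta as the scale of the Cauchy distribution is an assumption; the deviate is generated with the standard inverse-CDF formula.

```python
import math
import random


def crossover(parent_a, parent_b, positions):
    """Swap the genes at the given positions between two parents."""
    child_a, child_b = list(parent_a), list(parent_b)
    for i in positions:
        child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b


def cauchy_deviate(alpha, beta, rng):
    """Draw from a Cauchy distribution with location alpha and scale beta."""
    return alpha + beta * math.tan(math.pi * (rng.random() - 0.5))


def mutate(genes, rate, alpha, beta, rng):
    """Add a Cauchy deviate to each gene with probability `rate`."""
    return [g + cauchy_deviate(alpha, beta, rng) if rng.random() < rate else g
            for g in genes]


# The slide's example: swapping the x gene of Abc and aBc gives abc and ABc.
kids = crossover("Abc", "aBc", positions=[0])
print("".join(kids[0]), "".join(kids[1]))

rng = random.Random(0)
print(mutate([1.0, 2.0, 3.0], rate=0.5, alpha=0.0, beta=1.0, rng=rng))
```

The thicker tails of the Cauchy distribution mean that most mutations are small but an occasional one is very large, which helps the search escape local minima.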
Replacement
Individuals with better-than-average fitness receive proportionally more offspring
$n_o = \frac{f_w - f_i}{f_w - \langle f \rangle}, \quad f_w \neq \langle f \rangle$
where
$n_o$ = number of offspring
$f_i$ = fitness of individual (energy of ligand)
$f_w$ = fitness of worst individual in last g generations (typically 10)
$\langle f \rangle$ = mean fitness of population
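A worked example of the offspring formula, number of offspring = (worst fitness minus individual fitness) divided by (worst fitness minus mean fitness), with made-up energies. The function name and the numbers are hypothetical; the worst fitness is passed in directly rather than tracked over a window of generations.

```python
def offspring_counts(fitnesses, f_worst):
    """Offspring allocated to each individual (lower fitness means more offspring)."""
    mean_f = sum(fitnesses) / len(fitnesses)
    if f_worst == mean_f:
        raise ValueError("formula is undefined when worst fitness equals the mean")
    return [(f_worst - f) / (f_worst - mean_f) for f in fitnesses]


# Four ligands with illustrative energies; the worst one has energy +2.
energies = [-10.0, -6.0, -2.0, 2.0]
counts = offspring_counts(energies, f_worst=2.0)
print(counts)
```

With a mean of -4, an individual of exactly average fitness would receive one offspring, the best individual receives two, and the worst receives none. When the worst fitness is taken from the current population, the counts sum to the population size.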
Lamarckian Genetic Algorithm
According to the left-hand side of the figure, LGA finds the lowest fitness function (energy) values first, then maps these values back to their respective genotypes
Genetic algorithm plus Solis and Wets local search
Better performance than either simulated annealing or the genetic algorithm alone
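The Lamarckian idea can be sketched as a local search whose result is written back into the individual's genes. The local search below is a simplified variant in the spirit of Solis and Wets (random step with an adaptive size rho and a bias vector); the bias-update constants, the function names, and the toy energy are assumptions made for illustration, not AutoDock's code.

```python
import random


def solis_wets(f, x, rng, max_iters=300, rho=1.0, rho_min=0.01,
               max_succ=4, max_fail=4):
    """Simplified adaptive random local search; returns (x, f(x))."""
    x, fx = list(x), f(x)
    bias = [0.0] * len(x)
    succ = fail = 0
    for _ in range(max_iters):
        if rho < rho_min:                       # step size too small: stop
            break
        dev = [rng.gauss(b, rho) for b in bias]
        improved = False
        for sign in (1.0, -1.0):                # try the step, then its opposite
            cand = [xi + sign * d for xi, d in zip(x, dev)]
            fc = f(cand)
            if fc < fx:
                x, fx = cand, fc
                bias = [0.2 * b + 0.4 * sign * d for b, d in zip(bias, dev)]
                improved = True
                break
        if improved:
            succ, fail = succ + 1, 0
        else:
            bias = [0.5 * b for b in bias]
            succ, fail = 0, fail + 1
        if succ >= max_succ:                    # repeated successes: expand rho
            rho, succ = rho * 2.0, 0
        elif fail >= max_fail:                  # repeated failures: contract rho
            rho, fail = rho * 0.5, 0
    return x, fx


def lamarckian_step(population, f, rng, ls_freq=0.06):
    """Apply local search to a fraction of individuals and overwrite their genes."""
    for i, genes in enumerate(population):
        if rng.random() < ls_freq:
            improved, _ = solis_wets(f, genes, rng)
            population[i] = improved            # phenotype written back to genotype
    return population


def toy_energy(v):
    """Hypothetical energy: sum of squares, minimum at the origin."""
    return sum(c * c for c in v)


start = [3.0, -2.0]
x_opt, f_opt = solis_wets(toy_energy, start, random.Random(0))
print(x_opt, f_opt)
```

Because only improving steps are accepted, the local search can never return a state worse than the one it started from.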
The Application
Milind Misra
HIV-1 Protease and AHA006
HIV-1 Protease in complex with the cyclic sulfamide inhibitor, AHA006
Source: Protein Data Bank
Authors: K. Backbro, T. Unge
Exp. Method: X-ray Diffraction (2 Å res.)
Primary Citation: Backbro et al., J Med Chem 40, pp. 898 (1997)
Polymer Chains: A, B; Residues: 198; Atoms: 1632
Protein (HIV-1 Protease)
Ligand (AHA006)
(Source: PDB)
HIV-1 Protease dimer (Rasmol)
Initial X-ray crystallographic positions of protein and ligand (SYBYL)
Docking Preparation: Ligand
Assign charges
Define rotatable bonds
Rename aromatic carbons
Merge non-polar hydrogens
Write .pdbq ligand file
Docking Preparation: Protein
Add essential hydrogens
Load charges
Merge lone-pairs
Add solvation parameters
Write .pdbqs protein file
Docking Preparation: Grid
AutoDock uses grid-based docking
Ligand-protein interaction energies are pre-calculated and then used as a look-up table
Grid maps are constructed based on atoms of interest in ligand (here CANOSH)
(AutoDockTools)
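The look-up idea can be sketched with a small grid that is filled once and then queried by interpolation. The toy energy function, grid size, and spacing are hypothetical, and trilinear interpolation is shown as one common way to read values between grid points; in practice there would be one such map per ligand atom type.

```python
import math


def build_grid(energy_fn, origin, spacing, n):
    """Pre-calculate energy_fn on an n x n x n lattice (the look-up table)."""
    ox, oy, oz = origin
    return [[[energy_fn(ox + i * spacing, oy + j * spacing, oz + k * spacing)
              for k in range(n)] for j in range(n)] for i in range(n)]


def lookup(grid, origin, spacing, point):
    """Trilinear interpolation between the 8 lattice points around `point`."""
    n = len(grid)
    idx, frac = [], []
    for c, o in zip(point, origin):
        t = (c - o) / spacing
        i = max(0, min(int(math.floor(t)), n - 2))  # clamp to a valid cell
        idx.append(i)
        frac.append(t - i)
    (i, j, k), (fx, fy, fz) = idx, frac
    total = 0.0
    for di, wx in ((0, 1.0 - fx), (1, fx)):
        for dj, wy in ((0, 1.0 - fy), (1, fy)):
            for dk, wz in ((0, 1.0 - fz), (1, fz)):
                total += wx * wy * wz * grid[i + di][j + dj][k + dk]
    return total


def toy_energy(x, y, z):
    """Hypothetical stand-in for an expensive interaction energy."""
    return x + 2.0 * y + 3.0 * z


origin = (0.0, 0.0, 0.0)
grid = build_grid(toy_energy, origin, spacing=0.5, n=6)
value = lookup(grid, origin, 0.5, (0.3, 1.7, 2.2))
print(value)
```

The expensive energy function is evaluated only while building the grid; every later query costs eight table reads and a few multiplications.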
Docking: Simulated Annealing
Runs = 100
Cycles = 50
Initial Temp (RT) = 1,000
Temp reduction factor = 0.95
Linear temperature reduction
Translation reduction factor = 1
Quaternion reduction factor = 1
Torsional reduction factor = 1
# rotatable bonds = 12
Initial coordinates = Random
Initial quaternion = Random
Initial dihedrals = Random
Translation step = 2.0
Quaternion step = 50 deg
Torsion step = 50 deg
Results:
100 different clusters
Energy range: -0.63 to +64,000
Conformation #81: -0.63
Conformation #67: +20.02
Conformation #68: +10.74
Lowest energy conf: not close to the position of the original ligand, but similar in conformation
Conf #67: closest to position and conformation of original ligand; higher energy
Conf #68: close to position but not conformation of original ligand; not as high energy
Original ligand conformation and SA conformation #67 (SYBYL)
SA conformation #67, close-up of previous (SYBYL)
SA conformation #67 (SYBYL)
100 clustered SA conformations (gOpenMol)
Docking: Genetic Algorithm
Runs = 50
# Evaluations = 250,000
Population size = 50
Elitism count = 1
Mutation rate = 0.02
Crossover rate = 0.8
Window size = 10
Cauchy alpha = 0
Cauchy beta = 1
Results:
Energy range: -18.66 to +86.28
Lowest energy conformation overall closest to original ligand conformation
If only 10 runs had been used instead of 50, then conf #9 would have been the lowest energy conformation.
Docking: Local Search
Runs = 50
Solis-Wets iterations = 300
Consecutive successes = 4
Consecutive failures = 4
Rho = 1
Lower bound on rho = 0.01
LS frequency = 0.06
Results:
Energy range: +35.92 to +215,200
Confs #20, 21, 22, 23: +35.92
Lowest energy conformation was most dissimilar to original ligand conformation
Better results could have been obtained by reducing the step sizes
Docking: Lamarckian GA
Runs = 10
Max # Evaluations = 250,000
Max # Generations = 27,000
Population size = 50
Elitism count = 1
Mutation rate = 0.02
Crossover rate = 0.8
Window size = 10
Cauchy alpha = 0
Cauchy beta = 1
Solis-Wets iterations = 300
Consecutive successes = 4
Consecutive failures = 4
Rho = 1
Lower bound on rho = 0.01
LS frequency = 0.06
* Gray options *
Results:
Energy range: -18.10 to 8.38
Lowest energy conformation fairly similar to original ligand conformation
If the number of runs were restricted to 10 for both GA and LGA, LGA would have generated the best structure

Best GA, LGA, SA and LS conformations (SYBYL)
Best GA, LGA and SA conformations (SYBYL)
References
http://cmgm.stanford.edu/biochem218/Projects%201998/Apaydin.pdf
http://www.biz.uiowa.edu/class/6K299_menczer/PPT/Hart/sld018.html
http://cs.felk.cvut.cz/~xobitko/ga/
http://www.bch.msu.edu/labs/kuhn/web/projects/screening/solvation.html
http://wwwcmc.pharm.uu.nl/gillies/thesis/
http://www.chem.uidaho.edu/~honors/boltz.html
S. Kumar et al. Protein Flexibility and Electrostatic Interactions. IBM Journal of Research and Development, Vol. 45, 2001.
G. Morris et al. Automated Docking Using a Lamarckian Genetic Algorithm and an Empirical Binding Free Energy Function. Journal of Computational Chemistry, Vol. 19, No. 14, 1639-1662 (1998).
C. Rosin et al. A Comparison of Global and Local Search Methods in Drug Docking. UCSD CSE Technical Report #CS97-522 (1997).
C. A. Sotriffer et al. Automated Docking of Ligands to Antibodies: Methods and Applications. Methods 20, 280-291 (2000).
M. Vieth et al. Assessing Search Strategies for Flexible Docking.
Practical Handbook of Genetic Algorithms. Edited by Lance Chambers.
An Introduction to Genetic Algorithms. Melanie Mitchell.
Goodsell and Olson. Prot. Struct. Func. Genet. 8, 195 (1990).
Principles of Biochemistry: Lehninger.
R. Durbin, S. Eddy, A. Krogh, G. Mitchison. Biological Sequence Analysis.
Wm. E. Hart. A Theoretical Comparison of Genetic Algorithms and Simulated Annealing. Sandia National Laboratories, www.cs.sandia.gov/~wehart.
