PAM-1?

Look at 71 groups of protein sequences where the proteins in each group are at least 85% similar (Why these groups?)

Compute the relative mutability of each amino acid: its probability of change

From relative mutability, compute the mutability probability for each amino acid pair X,Y: the probability that X will change to Y over a certain evolutionary time

Normalize the mutability probability for each pair to a value between 0 and 1

the Likelihood that an Amino Acid Will Mutate

For each amino acid

Changes (p) = number of times the amino acid changed into something else

exposure to mutation = (percentage occurrence of the amino acid in the group of sequences being analyzed) * (frequency of all amino acid changes in the group, based on the phylogenetic tree)

relative mutability = (changes / exposure to mutation) / 100

Computing Mutability Probability Between Amino Acid Pairs

r = relative mutability of X

c = num times X becomes Y or vice versa

p = num changes involving X

mutability probability of X to Y = (r * c) / p

changes = # times A changes into something else = 4

% occurrence of A in group = 10 / 63 = 0.159

frequency of all amino acid changes in group = 6 * 2 = 12

(Note: Count changes backwards and forwards.)

exposure to mutation = (% occurrence of A in group) * (frequency of all amino acid changes in group) = 12 * 0.159

relative mutability = (changes / exposure to mutation) / 100

= (4 / (12 * 0.159)) / 100 ≈ 2.09 / 100 = 0.0209

Dividing by 100 scales the result to 1 substitution per 100 residues.

Example from Fundamental Concepts of Bioinformatics by Krane and Raymer.
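The relative mutability calculation above can be sketched in a few lines of Python. This is only an illustration of the slide's formula; the function name `relative_mutability` and its parameter names are assumptions made for this sketch, not part of the source. Note that using the unrounded fraction 10/63 gives about 0.021, slightly different from the 0.0209 obtained with the rounded value 0.159.

```python
def relative_mutability(changes, occurrence_fraction, total_changes):
    """Relative mutability as defined on the slide (hypothetical helper).

    changes             -- times the amino acid changed into something else
    occurrence_fraction -- fraction of residues in the group that are this amino acid
    total_changes       -- all amino acid changes in the group, counted both ways
    """
    exposure = occurrence_fraction * total_changes
    return (changes / exposure) / 100


# Numbers from the worked example for alanine (A).
occurrence_of_a = 10 / 63   # A occurs 10 times among 63 residues
total_changes = 6 * 2       # 6 changes, counted backwards and forwards
m_a = relative_mutability(4, occurrence_of_a, total_changes)
print(f"relative mutability of A = {m_a:.4f}")
```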

A will change to G:

r = relative mutability of A = .0209

c = num times A becomes G or vice versa = 3

p = num changes involving A = 4

mutability probability of A to G = (r * c) / p = (0.0209 * 3) / 4 ≈ 0.0157
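As a quick check of the A to G arithmetic, the pair formula can be written as a small Python function. The name `mutation_probability` is an assumption for this sketch; the inputs are the values given in the example.

```python
def mutation_probability(r, c, p):
    """Mutability probability of X to Y, (r * c) / p (hypothetical helper).

    r -- relative mutability of X
    c -- number of times X becomes Y or vice versa
    p -- number of changes involving X
    """
    return (r * c) / p


# Values from the A to G example.
p_a_to_g = mutation_probability(r=0.0209, c=3, p=4)
print(f"P(A -> G) = {p_a_to_g:.6f}")
```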

Normalizing Mutability Probability, X to Y

For each Y among all amino acids, compute mutability probability of X to Y as described above

Get a total of these 20 probabilities.

Divide them by a normalizing factor such that the probability that X will NOT change is 99% and the sum of probabilities that it will change to any other amino acid is 1%
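One way to read the normalization step is as a rescaling of X's row of raw probabilities. The sketch below assumes that reading; the function `normalize_row` and the raw values in `raw_a` are made up for illustration (only three target amino acids are listed to keep it short), and they are not data from the source.

```python
def normalize_row(raw, x, total_change=0.01):
    """Rescale X's raw change probabilities so they sum to total_change.

    raw -- dict mapping each other amino acid Y to the raw X-to-Y probability
    x   -- the amino acid the row belongs to
    The entry for X itself is set to 1 - total_change (99% by default).
    """
    raw_total = sum(prob for y, prob in raw.items() if y != x)
    scale = total_change / raw_total
    row = {y: prob * scale for y, prob in raw.items() if y != x}
    row[x] = 1.0 - total_change
    return row


# Hypothetical raw probabilities for A (illustrative numbers only).
raw_a = {"G": 0.0157, "S": 0.0052, "T": 0.0026}
row_a = normalize_row(raw_a, "A")
for residue, prob in row_a.items():
    print(f"A -> {residue}: {prob:.5f}")
```

The rescaling keeps the ratios between the raw probabilities, so a change that was three times as likely as another before normalization is still three times as likely afterwards.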

Converting Mutability Probabilities to Log Odds Score for X to Y

Compute the log odds score for X to Y as follows:

Get the X to Y mutability probability

Divide by the % frequency of X in the sequence data

Convert to log base 10, multiply by 10

log10(.098)

To compute log10(.098) solve for x:

10^x = 0.098

x = -1.01
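The conversion can be checked with Python's standard `math.log10`. The helper `log_odds_score` and its example inputs (a mutability probability of 0.0098 and a frequency of 0.1, chosen so that their ratio is 0.098) are assumptions for this sketch, not values from the source.

```python
import math


def log_odds_score(mutation_prob, frequency):
    """10 * log10 of (mutability probability / frequency), per the steps above."""
    return 10 * math.log10(mutation_prob / frequency)


# The logarithm worked out on the slide.
x = math.log10(0.098)
print(f"log10(0.098) = {x:.2f}")

# Hypothetical inputs whose ratio is 0.098.
score = log_odds_score(0.0098, 0.1)
print(f"log odds score = {score:.2f}")
```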

A score of 0 indicates that the change from one amino acid to another is what is expected by chance

A negative score means that the change occurs less often than expected by chance

A positive score means that the change occurs more often than expected by chance

Because the scores are in log form, they can be added (i.e., the chance that X will change to Y and then Y to Z)
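The additivity follows from the identity log(a) + log(b) = log(a * b): adding two log odds scores is the same as multiplying the underlying odds ratios. A minimal Python check, using two made-up ratios (`ratio_xy` and `ratio_yz` are illustrative values, not data from the source):

```python
import math

# Hypothetical odds ratios for X -> Y and Y -> Z.
ratio_xy = 0.098
ratio_yz = 2.5

# Adding the scaled log scores...
score_sum = 10 * math.log10(ratio_xy) + 10 * math.log10(ratio_yz)

# ...matches the scaled log of the product of the ratios.
score_of_product = 10 * math.log10(ratio_xy * ratio_yz)

print(math.isclose(score_sum, score_of_product))
```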

Disadvantages of PAM Matrices

A phylogenetic tree must be constructed first, implying some circularity in the analysis

The original PAM-1 matrix was based on a limited number of families, not necessarily representative of all protein families

The Markov model does not take into account that multi-step mutations should be treated differently from single-step ones

