
SEMINAR PRESENTATION

TOPIC:

UPGMA

Submitted To:
Dr. KOEL MUKHERJEE, Assistant Professor

Submitted By:
INDRANIL SARMAH - (MCA/10011/19)
KUSHAGRA GUPTA - (MCA/10026/19)
SHUBHAM PATIDAR - (MCA/10037/19)
JAYDEEP KUMAR SILAWAT - (MCA/10051/19)
SWAGAT SONEWANE - (MCA/10024/19)

Phylogenetic tree construction
Two classes of methods

• Distance-based methods

Examples: UPGMA, Neighbor joining, Fitch-Margoliash method, minimum evolution

• Character-based methods

Input: Aligned sequences

Output: Phylogenetic tree

Examples: Parsimony, Maximum Likelihood

UPGMA
UPGMA: Unweighted Pair Group Method with Arithmetic Mean
Developed by Sokal and Michener in 1958.
It is a sequential clustering method.
It is a distance-based method for phylogenetic tree construction.
It is the simplest method for constructing trees and uses a very simple algorithm.
It generates rooted trees.
It generates ultrametric trees from a distance matrix.
Input: Distance matrix holding the pairwise distances estimated from the
aligned sequences

Output: Phylogenetic tree
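
One simple way to obtain such an input matrix is the p-distance, the proportion of aligned sites at which two sequences differ. The slides do not say which distance estimate is used, so the sketch below is only an illustration: the function names and the toy alignment are hypothetical, and gaps are treated like any other character.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which the two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def distance_matrix(aligned: dict) -> dict:
    """Pairwise p-distances, stored under both (x, y) and (y, x)."""
    dist = {}
    for x, y in combinations(aligned, 2):
        d = p_distance(aligned[x], aligned[y])
        dist[(x, y)] = dist[(y, x)] = d
    return dist


# Hypothetical toy alignment, not taken from the slides.
toy = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "ACCTTCGA"}
matrix = distance_matrix(toy)
print(matrix[("s1", "s2")], matrix[("s1", "s3")], matrix[("s2", "s3")])
```

In practice a corrected distance (for example one based on a substitution model) is often preferred over the raw p-distance; the clustering procedure itself is unchanged.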


UPGMA Algorithm
• UPGMA starts with a matrix of pairwise distances.

• Each sample starts as its own 'cluster'.

• All clusters are initially joined in a star-like tree.

• The algorithm constructs a rooted tree that reflects the structure present in the
pairwise distance matrix.

• At each step, the two nearest clusters are combined into a higher-level
cluster.

• It assumes an ultrametric tree, in which the distances from the root to
every branch tip are equal.
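
Whether a distance matrix is compatible with this ultrametric assumption can be checked with the three-point condition: in every triple of taxa, the two largest pairwise distances must be equal. The following is a minimal sketch; the function name and both small matrices are hypothetical and chosen only to show one passing and one failing case.

```python
from itertools import combinations


def is_ultrametric(dist, tol=1e-9):
    """Three-point condition: in every triple the two largest distances tie."""
    for x, y, z in combinations(dist, 3):
        _, middle, largest = sorted((dist[x][y], dist[x][z], dist[y][z]))
        if largest - middle > tol:
            return False
    return True


# Hypothetical distances: A and B are equally far from C.
clock_like = {
    "A": {"A": 0, "B": 2, "C": 6},
    "B": {"A": 2, "B": 0, "C": 6},
    "C": {"A": 6, "B": 6, "C": 0},
}
# Hypothetical distances: B is closer to C than A is.
not_clock_like = {
    "A": {"A": 0, "B": 2, "C": 6},
    "B": {"A": 2, "B": 0, "C": 5},
    "C": {"A": 6, "B": 5, "C": 0},
}

print(is_ultrametric(clock_like), is_ultrametric(not_clock_like))
```

When the check fails, UPGMA still returns an ultrametric tree, but the tree distances can no longer match the input distances exactly.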
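Steps
1. Find the groups i and j with the smallest distance Dij.
2. Create a new group (ij) which has n(ij) = ni + nj members.
3. Connect i and j on the tree to a new node (ij).
4. Place the node (ij) at depth Dij/2, so that every tip below it is at the same
distance from it. When i and j are single sequences, both edges have length Dij/2.
5. Compute the distance between the new group and every other group k (except i
and j) using

$$D_{(ij),k} = \frac{D_{ik} + D_{jk}}{2}$$

This plain average is the update used in the worked example. The standard UPGMA
update weights each term by group size,
$D_{(ij),k} = \frac{n_i D_{ik} + n_j D_{jk}}{n_i + n_j}$; the two agree whenever
ni = nj.
6. Delete the columns and rows corresponding to i and j and add one for (ij). If
two or more groups are left, go back to step 1.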
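
The steps above can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: the function name `upgma`, the `size_weighted` flag and the label format are assumptions made here. With `size_weighted=False` the distance update is the plain average (Dik + Djk)/2 shown on the slide; with `size_weighted=True` each term is weighted by group size, which is the usual definition of UPGMA. Ties are broken by taking the first closest pair found.

```python
def upgma(names, matrix, size_weighted=True):
    """Return the merges as (label, height) pairs, where height = Dij / 2."""
    sizes = {name: 1 for name in names}          # group label -> member count
    dist = {
        frozenset((a, b)): float(matrix[i][j])
        for i, a in enumerate(names)
        for j, b in enumerate(names)
        if i < j
    }
    merges = []
    while len(sizes) > 1:
        pair = min(dist, key=dist.get)           # find the closest pair
        i, j = sorted(pair)
        d_ij = dist.pop(pair)
        merged = f"({i},{j})"                    # new group (ij)
        for k in sizes:                          # update distances to group k
            if k == i or k == j:
                continue
            d_ik = dist.pop(frozenset((i, k)))
            d_jk = dist.pop(frozenset((j, k)))
            if size_weighted:
                d = (sizes[i] * d_ik + sizes[j] * d_jk) / (sizes[i] + sizes[j])
            else:
                d = (d_ik + d_jk) / 2
            dist[frozenset((merged, k))] = d
        sizes[merged] = sizes.pop(i) + sizes.pop(j)   # n(ij) = ni + nj
        merges.append((merged, d_ij / 2))        # node depth is Dij / 2
    return merges


# Distance matrix from the Example slide.
names = ["A", "B", "C", "D", "E", "F"]
matrix = [
    [0, 1, 3, 6, 7, 10],
    [1, 0, 3, 6, 7, 10],
    [3, 3, 0, 5, 6, 9],
    [6, 6, 5, 0, 1, 7],
    [7, 7, 6, 1, 0, 8],
    [10, 10, 9, 7, 8, 0],
]
for label, height in upgma(names, matrix, size_weighted=False):
    print(label, height)
```

On this matrix both variants produce the same topology; only the heights of the last two nodes differ.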
Computational tools
• MEGA
• PHYLIP
• MVSP
• MVSP87
• SAS
• SYN-TAX
• NTSYS
• DendroUPGMA
Advantages

• Simple algorithm
• One of the fastest tree-building methods
• Easy to compute by hand or with a variety of software
• Trees reflect phenotypic similarities through phylogenetic distances
• Data can be arranged in random order prior to analysis
• Rooted trees are generated that are easy to analyze
Disadvantages

• It assumes the same evolutionary rate on all lineages (a constant molecular clock)
• It frequently generates wrong tree topologies when rates differ among lineages
• Re-rooting is not allowed
• The algorithm does not aim to reflect evolutionary descent
Applications
• In ecology, it is one of the most popular methods for the classification
of sampling units (such as vegetation plots) on the basis of their
pairwise similarities in relevant descriptor variables (such as species
composition).[3]
• In bioinformatics, UPGMA is used for the creation of phenetic trees
(phenograms). UPGMA was initially designed for use in protein
electrophoresis studies, but is currently most often used to produce
guide trees for more sophisticated algorithms. It is used, for example,
in sequence alignment procedures, where it proposes an order in
which the sequences will be aligned. Indeed, the guide tree aims at grouping
the most similar sequences, regardless of their evolutionary rate or
phylogenetic affinities, and that is exactly the goal of UPGMA.[4]
• In phylogenetics, UPGMA assumes a constant rate of evolution
(molecular clock hypothesis), and is not a well-regarded method for
Example
1. Calculate the pairwise distance matrix

      A    B    C    D    E    F
A     0    1    3    6    7   10
B     1    0    3    6    7   10
C     3    3    0    5    6    9
D     6    6    5    0    1    7
E     7    7    6    1    0    8
F    10   10    9    7    8    0
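2. Group the two most closely related sequences

      A    B    C    D    E    F
A     0    1    3    6    7   10
B     1    0    3    6    7   10
C     3    3    0    5    6    9
D     6    6    5    0    1    7
E     7    7    6    1    0    8
F    10   10    9    7    8    0

Tree so far: A and B (distance 1) are joined at height 0.5, so each gets a
branch of length 0.5. D and E tie at distance 1 and are joined in the next step.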
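
The grouping step can be reproduced with a short search over the off-diagonal entries. This is a sketch: `closest_pairs` is a hypothetical helper, and taking the first tied pair in row order is an assumed tie-break that happens to match the order used on the slides.

```python
from itertools import combinations

names = ["A", "B", "C", "D", "E", "F"]
matrix = [
    [0, 1, 3, 6, 7, 10],
    [1, 0, 3, 6, 7, 10],
    [3, 3, 0, 5, 6, 9],
    [6, 6, 5, 0, 1, 7],
    [7, 7, 6, 1, 0, 8],
    [10, 10, 9, 7, 8, 0],
]


def closest_pairs(names, matrix):
    """Smallest off-diagonal distance and every pair that attains it."""
    pairs = list(combinations(range(len(names)), 2))
    smallest = min(matrix[i][j] for i, j in pairs)
    tied = [(names[i], names[j]) for i, j in pairs if matrix[i][j] == smallest]
    return smallest, tied


smallest, tied = closest_pairs(names, matrix)
first = tied[0]                # assumed tie-break: first pair in row order
branch_length = smallest / 2   # each tip sits Dij / 2 below the new node
print(smallest, tied, first, branch_length)
```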
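3. Recalculate the distance matrix and take the next smallest distance

       A/B    C    D    E    F
A/B      0    3    6    7   10
C        3    0    5    6    9
D        6    5    0    1    7
E        7    6    1    0    8
F       10    9    7    8    0

Tree so far: A and B joined at height 0.5. The next smallest distance is
D to E = 1, so D and E are also joined at height 0.5 (branch length 0.5 each).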
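3. Recalculate the distance matrix and take the next smallest distance

       A/B     C   D/E     F
A/B      0     3   6.5    10
C        3     0   5.5     9
D/E    6.5   5.5     0   7.5
F       10     9   7.5     0

Tree so far: A/B and D/E each at height 0.5. The next smallest distance is
A/B to C = 3, so C joins A/B at height 1.5 (branch to C: 1.5; branch above the
A/B node: 1).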
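3. Recalculate the distance matrix and take the next smallest distance

         A/B/C   D/E     F
A/B/C        0     6   9.5
D/E          6     0   7.5
F          9.5   7.5     0

Tree so far: A/B/C at height 1.5 and D/E at height 0.5. The next smallest
distance is A/B/C to D/E = 6, so the two groups are joined at height 3 (branch
above the A/B/C node: 1.5; branch above the D/E node: 2.5).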
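
The entries 6 and 9.5 in this matrix come from the plain average (Dik + Djk)/2 with i = A/B and j = C. The short comparison below, with hypothetical helper names, shows what the size-weighted form of the update would give instead, since A/B has two members and C has one.

```python
def simple_average(d_ik, d_jk):
    """Plain average of the two old distances."""
    return (d_ik + d_jk) / 2


def size_weighted_average(d_ik, n_i, d_jk, n_j):
    """Average weighted by the number of members in each merged group."""
    return (n_i * d_ik + n_j * d_jk) / (n_i + n_j)


# Distances read from the previous matrix: i = A/B (2 members), j = C (1 member).
plain_to_de = simple_average(6.5, 5.5)                    # distance to D/E
plain_to_f = simple_average(10, 9)                        # distance to F
weighted_to_de = size_weighted_average(6.5, 2, 5.5, 1)    # 18.5 / 3
weighted_to_f = size_weighted_average(10, 2, 9, 1)        # 29 / 3

print(plain_to_de, plain_to_f)
print(round(weighted_to_de, 2), round(weighted_to_f, 2))
```

Either way the distance from A/B/C to D/E stays below the D/E to F distance of 7.5, so the next merge is the same; only the node height changes.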
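3. Recalculate the distance matrix and take the next smallest distance

             A/B/C/D/E     F
A/B/C/D/E            0   8.5
F                  8.5     0

Final tree: F joins A/B/C/D/E at height 4.25 (branch to F: 4.25; branch above
the A/B/C/D/E node: 1.25).
Branch lengths: A 0.5, B 0.5, A/B node 1, C 1.5, A/B/C node 1.5, D 0.5, E 0.5,
D/E node 2.5, A/B/C/D/E node 1.25, F 4.25.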
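
The branch lengths on the finished tree follow from the node heights: each branch is the height of the parent node minus the height of the child. The sketch below takes the heights from the example (merge distance divided by 2), writes the tree in Newick format and checks that every tip is equally far from the root. The node labels such as "AB" and "root" and the helper names are hypothetical.

```python
# Node heights read from the worked example; tips are at height 0.
heights = {"AB": 0.5, "DE": 0.5, "ABC": 1.5, "ABCDE": 3.0, "root": 4.25}
children = {
    "AB": ("A", "B"),
    "DE": ("D", "E"),
    "ABC": ("AB", "C"),
    "ABCDE": ("ABC", "DE"),
    "root": ("ABCDE", "F"),
}


def height(node):
    return heights.get(node, 0.0)


def branch_length(parent, child):
    """Length of the edge from parent down to child."""
    return height(parent) - height(child)


def newick(node, parent=None):
    """Newick string for the subtree at node, with branch lengths."""
    if node in children:
        left, right = children[node]
        label = f"({newick(left, node)},{newick(right, node)})"
    else:
        label = node
    if parent is None:
        return label + ";"
    return f"{label}:{branch_length(parent, node):g}"


def root_to_tip(node, travelled=0.0):
    """Distance from the starting node to every tip below it."""
    if node not in children:
        return {node: travelled}
    out = {}
    for child in children[node]:
        out.update(root_to_tip(child, travelled + branch_length(node, child)))
    return out


tree = newick("root")
depths = root_to_tip("root")
print(tree)
print(depths)
```

All six root-to-tip distances equal the root height, which is the ultrametric property the method assumes.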
