IMAGE: http://www.howe.k12.ok.us/~jimaskew/bclasfy.htm
Phylogenetic Analysis
Relationship within a species (HIV-1 vs HIV-2)
IMAGE: http://www-hto.usc.edu/~cbmp/2002/PhylogeneticAnalysis/Distance%20Methods.html
Phylogenetic Analysis
Overview
C-Terminal Motor Kinesin sequences
http://www.proweb.org/kinesin/BE4_Cterm.html
Phylogenetic Analysis
Overview
Objective: determine branch length and figure
out how the tree should be drawn
Example: influenza
Study rapidly changing genes
Next year’s strain can be predicted
Flu vaccination can be developed
Tree of Life
Phylogenies describe the evolutionary history of species
Image: http://microbialgenome.org/primer/tree.html
Tree of Life
Traditionally, morphological characters (visible features)
were used to classify organisms
Living organisms
Fossil records
http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html
Tree of Life
http://tolweb.org/tree/phylogeny.html
Evolutionary Trees
Two-dimensional graph showing evolutionary
relationships among a set of items
[Figure: two rooted trees relating seahorses, sharks, frogs, owls, crocodiles, armadillos, and bats. One tree is scaled by time (scale bar: 50 million years), the other by amount of change (scale bar: 5% change).]
Tony Weisstein, http://bioquest.org:16080/bedrock/terre_haute_03_04/phylogenetics_1.0.ppt
Ultrametricity: all tips are an equal distance from the root.
Additivity: the distance between any two tips equals the total branch length
between them.
[Figure: two trees, each with a root, tips X and Y, and branches labeled a to e.]
Ultrametric tree: a = b + c + d + e
Additive tree: XY = a + b + c + d + e
In simple scenarios, evolutionary trees are ultrametric and
phylograms are additive.
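Both properties can be tested on a table of pairwise distances. The sketch below is an illustration, not part of the original slides: it uses the standard three-point condition as an equivalent test for ultrametricity and the four-point condition for additivity. The two distance tables and all function names are hypothetical.

```python
from itertools import combinations


def make_table(pairs):
    """Turn {("A", "B"): 3, ...} into a lookup keyed by unordered pairs."""
    return {frozenset(pair): value for pair, value in pairs.items()}


def is_ultrametric(table, taxa, tol=1e-9):
    """Three-point condition: in every triple, the two largest distances tie."""
    for triple in combinations(taxa, 3):
        d = sorted(table[frozenset(p)] for p in combinations(triple, 2))
        if abs(d[2] - d[1]) > tol:
            return False
    return True


def is_additive(table, taxa, tol=1e-9):
    """Four-point condition: of the three ways to pair up four tips,
    the two largest sums of distances tie."""
    for w, x, y, z in combinations(taxa, 4):
        sums = sorted([
            table[frozenset((w, x))] + table[frozenset((y, z))],
            table[frozenset((w, y))] + table[frozenset((x, z))],
            table[frozenset((w, z))] + table[frozenset((x, y))],
        ])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


taxa = ["A", "B", "C", "D"]

# Hypothetical clock-like data: every tip is the same distance from the root.
clock_like = make_table({
    ("A", "B"): 2, ("C", "D"): 2,
    ("A", "C"): 6, ("A", "D"): 6, ("B", "C"): 6, ("B", "D"): 6,
})

# Hypothetical data with unequal rates: distances still add up along a tree.
unequal_rates = make_table({
    ("A", "B"): 3, ("A", "C"): 7, ("A", "D"): 8,
    ("B", "C"): 6, ("B", "D"): 7, ("C", "D"): 3,
})

print(is_ultrametric(clock_like, taxa), is_additive(clock_like, taxa))
print(is_ultrametric(unequal_rates, taxa), is_additive(unequal_rates, taxa))
```

Every ultrametric table is also additive, which is why the clock-like example passes both checks while the unequal-rates example passes only the second.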
Tony Weisstein, http://bioquest.org:16080/bedrock/terre_haute_03_04/phylogenetics_1.0.ppt
Homology: identical character due to shared ancestry
(evolutionary signal)
[Figure: example trees labeled Homology (+hair), Homoplasy by convergence (+flight, marked twice), and Homoplasy by reversal (+legs followed by -legs). Taxa shown include worms, lizards, snakes, birds, rodents, primates, and bats.]
Tony Weisstein, http://bioquest.org:16080/bedrock/terre_haute_03_04/phylogenetics_1.0.ppt
Rooted Trees
One sequence (root) defined to be
common ancestor of all other sequences
Image source:
http://www.ncbi.nlm.nih.gov/About/primer/phylo.html
Unrooted Tree (Star)
Indicates evolutionary relationships without
revealing the location of the oldest ancestor
Image source:
http://www.shef.ac.uk/english/language/quantling/images/quantling1.jp
Image:
http://www.ncbi.nlm.nih.gov/About/primer/phylo.html
Number of Trees
                          Unrooted trees             Rooted trees
# sequences  # pairwise   # trees        # branches  # trees        # branches
             distances                   /tree                      /tree
3            3            1              3           3              4
4            6            3              5           15             6
5            10           15             7           105            8
6            15           105            9           945            10
10           45           2,027,025      17          34,459,425     18
30           435          8.69 × 10^36   57          4.95 × 10^38   58
N            N(N-1)/2     see below      2N-3        see below      2N-2

Number of unrooted trees for N sequences: (2N-5)! / [2^(N-3) (N-3)!]
Number of rooted trees for N sequences:   (2N-3)! / [2^(N-2) (N-2)!]
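The counts in the table can be reproduced with a few lines of code. This is a minimal sketch with hypothetical function names; it relies on the fact that (2N-5)! / [2^(N-3) (N-3)!] equals the product of the odd integers 1 × 3 × ... × (2N-5), and likewise for the rooted case.

```python
def odd_product(k):
    """Product of the odd integers 1 * 3 * ... * k (1 when k <= 1)."""
    result = 1
    for factor in range(3, k + 1, 2):
        result *= factor
    return result


def count_unrooted_trees(n):
    """Number of bifurcating unrooted trees for n >= 3 sequences."""
    return odd_product(2 * n - 5)


def count_rooted_trees(n):
    """Number of bifurcating rooted trees for n >= 2 sequences."""
    return odd_product(2 * n - 3)


def count_pairwise_distances(n):
    """Number of distinct sequence pairs, N(N-1)/2."""
    return n * (n - 1) // 2


for n in (3, 4, 5, 6, 10, 30):
    print(n, count_pairwise_distances(n),
          count_unrooted_trees(n), 2 * n - 3,
          count_rooted_trees(n), 2 * n - 2)
```

Adding a root to an unrooted tree can be done on any of its 2N-3 branches, which is why each rooted count is the unrooted count multiplied by 2N-3.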
[Figure: three unrooted trees, each connecting sequences 1, 2, 3, and 4.]
Maximum Parsimony
Example
Some sites are informative, others are not
[Figure: the three candidate trees (Tree 1, Tree 2, Tree 3) drawn for alignment Column 2 and Column 3; a marked branch is a substitution.]
Maximum Parsimony
Columns representing greater variation
dominate the analysis
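To make the idea of informative sites concrete, here is a small sketch for the four-sequence case. The alignment is hypothetical (it is not the one on the slides), and the function names are illustrative. A column is treated as informative when at least two different bases each occur in at least two sequences; the substitution count for each tree uses Fitch's set-intersection method.

```python
from collections import Counter

# Hypothetical alignment: four sequences (rows), five columns.
ALIGNMENT = {
    "1": "AAGAG",
    "2": "AGCCG",
    "3": "AGATA",
    "4": "AGAGA",
}

# The three possible unrooted trees for four sequences, written as two pairs.
TREES = {
    "Tree 1": (("1", "2"), ("3", "4")),
    "Tree 2": (("1", "3"), ("2", "4")),
    "Tree 3": (("1", "4"), ("2", "3")),
}


def is_informative(column):
    """True when at least two bases each appear in at least two sequences."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def substitutions(tree, bases):
    """Minimum substitutions needed for one column on one four-taxon tree."""
    changes = 0

    def join(left, right):
        # Fitch's rule: keep the shared bases, or pay one change and merge.
        nonlocal changes
        shared = left & right
        if shared:
            return shared
        changes += 1
        return left | right

    (p, q), (r, s) = tree
    join(join({bases[p]}, {bases[q]}), join({bases[r]}, {bases[s]}))
    return changes


names = sorted(ALIGNMENT)
for index in range(len(ALIGNMENT["1"])):
    bases = {name: ALIGNMENT[name][index] for name in names}
    column = "".join(bases[name] for name in names)
    costs = {label: substitutions(tree, bases) for label, tree in TREES.items()}
    print(index + 1, column, is_informative(column), costs)
```

In this example only the last column separates the trees: uninformative columns cost the same number of substitutions on all three, so they cannot favor one tree over another.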
http://evolution.genetics.washington.edu/phylip.html
Distance Programs in Phylip
FITCH: estimates a phylogenetic tree assuming
additivity of branch lengths, using the
Fitch-Margoliash method
A ACGCGTTGGGCGATGGCAAC
B ACGCGTTGGGCGACGGTAAT
C ACGCATTGAATGATGATAAT
D ACACATTGAGTGATAATAAT
Example of Distance
Analysis
Distances can be shown as a table
A ACGCGTTGGGCGATGGCAAC
B ACGCGTTGGGCGACGGTAAT
C ACGCATTGAATGATGATAAT
D ACACATTGAGTGATAATAAT
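The distance table itself did not survive in this copy of the slides. As a sketch, the table can be rebuilt by counting the positions at which each pair of sequences differs. Using the raw difference count as the distance is an assumption here, and the function names are illustrative.

```python
from itertools import combinations

SEQUENCES = {
    "A": "ACGCGTTGGGCGATGGCAAC",
    "B": "ACGCGTTGGGCGACGGTAAT",
    "C": "ACGCATTGAATGATGATAAT",
    "D": "ACACATTGAGTGATAATAAT",
}


def count_differences(first, second):
    """Number of aligned positions at which two sequences differ."""
    if len(first) != len(second):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(first, second) if x != y)


def distance_table(sequences):
    """Pairwise difference counts, keyed by (name, name) in sorted order."""
    return {
        (m, n): count_differences(sequences[m], sequences[n])
        for m, n in combinations(sorted(sequences), 2)
    }


table = distance_table(SEQUENCES)
for (m, n), distance in table.items():
    print(f"{m}-{n}: {distance}")
```

For these four sequences the counts are A-B 3, A-C 7, A-D 8, B-C 6, B-D 7, and C-D 3.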
Example of Distance
Analysis
Using this information, a tree can be drawn:
A ACGCGTTGGGCGATGGCAAC
B ACGCGTTGGGCGACGGTAAT
C ACGCATTGAATGATGATAAT
D ACACATTGAGTGATAATAAT
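[Figure: unrooted tree. A (branch length 2) and B (branch length 1) join at one internal node; C (branch length 1) and D (branch length 2) join at the other; the internal branch between the two nodes has length 4.]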
Fitch and Margoliash
Algorithm (3 sequences)
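Distance table used: A-B = 22, A-C = 39, B-C = 41
[Figure: A, B, and C joined at a single node by branches of length a, b, and c, respectively.]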
Fitch and Margoliash
Algorithm (3 sequences)
1) Calculate lengths of tree branches
algebraically:
distance from A to B = a + b = 22 (1)
distance from A to C = a + c = 39 (2)
distance from B to C = b + c = 41 (3)
subtracting (2) from (3) yields:
   b + c =  41
 - a - c = -39
 __________
   b - a =   2 (4)
adding (1) and (4) yields 2b = 24; b = 12
so a + 12 = 22; a = 10
10 + c = 39; c = 29
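The same algebra can be written in closed form: solving equations (1) to (3) for each branch gives a = (AB + AC - BC) / 2, and similarly for b and c. A minimal sketch, with a hypothetical function name:

```python
def three_branch_lengths(d_ab, d_ac, d_bc):
    """Solve a + b = d_ab, a + c = d_ac, b + c = d_bc for (a, b, c)."""
    a = (d_ab + d_ac - d_bc) / 2
    b = (d_ab + d_bc - d_ac) / 2
    c = (d_ac + d_bc - d_ab) / 2
    return a, b, c


a, b, c = three_branch_lengths(22, 39, 41)
print(a, b, c)  # 10.0 12.0 29.0
```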
Fitch and Margoliash
Algorithm (3 sequences)
3) Resulting tree:
[Figure: A, B, and C joined at a single node by branches of length 10, 12, and 29, respectively.]
Fitch and Margoliash
Algorithm (5 sequences)
Algorithm can be extended to more
sequences. Consider the distances:
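[Figure: tree relating sequences A to E. Branches a, b, c, d, and e lead to tips A, B, C, D, and E; f and g are the internal branches.]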
Fitch and Margoliash
Algorithm (5 sequences)
Locate most closely related sequences
Fitch and Margoliash
Algorithm (5 sequences)
Create a new table by combining the
remaining sequences
Fitch and Margoliash
Algorithm (5 sequences)
Treat DE as a single sequence
By algebra, c = 9; g = 5
Fitch and Margoliash
Algorithm (5 sequences)
Continue process until all lengths are found:
[Figure: the same tree with branch lengths filled in: a = 10, b = 12, c = 9, d = 4, e = 6, f = 20, g = 5.]
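One step of the five-sequence procedure can be sketched as follows. The distance table did not survive in this copy of the slides, so the table below is an assumption: it was chosen to be consistent with branch lengths a = 10, b = 12, c = 9, d = 4, e = 6, f = 20, g = 5. The function name is illustrative.

```python
from itertools import combinations

# Assumed distance table (not taken from the slides; see the note above).
DISTANCES = {
    frozenset(pair): value
    for pair, value in {
        ("A", "B"): 22, ("A", "C"): 39, ("A", "D"): 39, ("A", "E"): 41,
        ("B", "C"): 41, ("B", "D"): 41, ("B", "E"): 43,
        ("C", "D"): 18, ("C", "E"): 20,
        ("D", "E"): 10,
    }.items()
}


def fitch_margoliash_step(distances, taxa):
    """Find the closest pair, lump all other sequences into one composite,
    and solve the resulting three-sequence problem for the pair's branches."""
    x, y = min(combinations(taxa, 2), key=lambda pair: distances[frozenset(pair)])
    rest = [t for t in taxa if t not in (x, y)]
    d_xy = distances[frozenset((x, y))]
    # Average distance from each member of the pair to the composite.
    d_x_rest = sum(distances[frozenset((x, t))] for t in rest) / len(rest)
    d_y_rest = sum(distances[frozenset((y, t))] for t in rest) / len(rest)
    branch_x = (d_xy + d_x_rest - d_y_rest) / 2
    branch_y = (d_xy + d_y_rest - d_x_rest) / 2
    return x, y, branch_x, branch_y


x, y, branch_x, branch_y = fitch_margoliash_step(DISTANCES, ["A", "B", "C", "D", "E"])
print(x, y, round(branch_x, 6), round(branch_y, 6))
```

With this table the closest pair is D and E, and the step gives branch lengths of about 4 and 6 for them. Repeating the step with DE treated as a single sequence gives the remaining branches.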
Summary of Fitch-
Margoliash
1) Find the most closely related pair of
sequences (A, B).
Neighbor Joining
Tree modified by joining pairs of
sequences
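$$S_{mn} = \frac{\sum_{i \neq m,n} \left( d_{im} + d_{in} \right)}{2(N-2)} + \frac{d_{mn}}{2} + \frac{\sum_{i<j;\; i,j \neq m,n} d_{ij}}{N-2}$$
where N is the number of sequences and d_ij is the distance between sequences i and j.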
Neighbor Joining
If A and B are joined:
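[Figure: A and B joined at a new internal node, which is connected to the node holding the remaining sequences.]
S_AB follows from the formula for S_mn with m = A and n = B.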
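The sum of branch lengths S_mn can be computed directly from a distance table. The sketch below assumes the usual reading of the formula (the first sum runs over all sequences other than m and n, the last over all pairs that exclude m and n). The four-sequence table and the function name are hypothetical.

```python
from itertools import combinations

# Hypothetical distance table for four sequences.
DISTANCES = {
    frozenset(pair): value
    for pair, value in {
        ("A", "B"): 3, ("A", "C"): 7, ("A", "D"): 8,
        ("B", "C"): 6, ("B", "D"): 7, ("C", "D"): 3,
    }.items()
}


def sum_of_branch_lengths(distances, taxa, m, n):
    """S_mn: total branch length of the tree in which m and n are joined."""
    count = len(taxa)
    others = [t for t in taxa if t not in (m, n)]
    to_pair = sum(
        distances[frozenset((i, m))] + distances[frozenset((i, n))]
        for i in others
    ) / (2 * (count - 2))
    within_pair = distances[frozenset((m, n))] / 2
    among_others = sum(
        distances[frozenset((i, j))] for i, j in combinations(others, 2)
    ) / (count - 2)
    return to_pair + within_pair + among_others


taxa = ["A", "B", "C", "D"]
scores = {
    (m, n): sum_of_branch_lengths(DISTANCES, taxa, m, n)
    for m, n in combinations(taxa, 2)
}
for pair, score in sorted(scores.items(), key=lambda item: item[1]):
    print(pair, score)
```

For this table, joining A with B (or, equivalently for four sequences, C with D) gives the smallest S, so that pair would be joined first.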
Neighbor Joining
The pair giving the smallest sum of branch
lengths (S) is chosen to be joined
UPGMA
Unweighted Pair Group Method with Arithmetic Mean
Builds a rooted tree from distance measures;
assumes a constant molecular clock
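The slide does not spell out the UPGMA steps, so the following is only a minimal sketch of the standard procedure: repeatedly merge the two closest clusters, place the new node at half the distance between them, and average distances to the other clusters weighted by cluster size. The distance table and function name are hypothetical.

```python
from itertools import combinations


def upgma(distances, taxa):
    """Return (Newick string, root height) for a UPGMA tree."""
    # Each active cluster: name -> (Newick subtree, number of tips, height).
    clusters = {t: (t, 1, 0.0) for t in taxa}
    d = {frozenset(pair): value for pair, value in distances.items()}
    while len(clusters) > 1:
        # Pick the closest pair of active clusters.
        x, y = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((x, y))] / 2
        sub_x, size_x, height_x = clusters.pop(x)
        sub_y, size_y, height_y = clusters.pop(y)
        merged = f"({x},{y})"
        newick = f"({sub_x}:{height - height_x:g},{sub_y}:{height - height_y:g})"
        # Size-weighted average distance from the new cluster to the others.
        for other in clusters:
            d[frozenset((merged, other))] = (
                size_x * d[frozenset((x, other))]
                + size_y * d[frozenset((y, other))]
            ) / (size_x + size_y)
        clusters[merged] = (newick, size_x + size_y, height)
    (newick, _, height), = clusters.values()
    return newick + ";", height


# Hypothetical clock-like distances for four sequences.
example = {
    ("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
    ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10,
}
tree, root_height = upgma(example, ["A", "B", "C", "D"])
print(tree, root_height)
```

Because every tip ends up at the same distance from the root, the result is only trustworthy when the data are close to ultrametric, which is what the constant molecular clock assumption amounts to.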
Good for generating tentative tree, or choosing among multiple trees
Best option when tractable (<30 taxa, homoplasy rare)
Good for very small data sets and for testing trees built using other methods