Maximum Likelihood
Greg Ewing
CIBIV

Outline
1 Introduction
   Markov Process
2 The Likelihood
   The Rate Matrix
   Rates and Probabilities
3 Optimisation
   Local Maxima
4 Bootstrap
   Introduction
   Nonparametric Bootstrap
   Parametric bootstrap
   Consensus and interpretation
5 Hypothesis testing
   LRT
   KH & SH

Stochastic Models

Mathematical Model
A mathematical description of the process of interest, usually describing how things change over time.

Given the current state, a mathematical model lets us predict how the system will behave next. Sometimes we can only predict the probability that something will happen at some time in the future: this is a stochastic model. Stochastic models allow a more rigorous mathematical treatment of the problem of tree reconstruction.
Introduction: ML on Coin Tossing

Given a box with 3 coins of different fairness (probability of heads 1/3, 1/2, 2/3), we take out one coin and toss it 20 times:
H, T, T, H, H, T, T, T, T, H, T, T, H, T, H, T, T, H, T, T

Probability: p(k heads in n tosses | θ)
Likelihood: L(θ | k heads in n tosses) = C(n, k) θ^k (1 - θ)^(n-k)   (the binomial distribution)

Introduction: ML on Coin Tossing (Estimate)

Three coin case: evaluate

L(θ | 7 heads in 20) = C(20, 7) θ^7 (1 - θ)^13

for each coin θ = 1/3, 1/2, 2/3.

For infinitely many coins, θ ranges over the whole interval (0, 1). The ML estimate is then θ = 0.35 (the coin shows heads with probability 0.35), with L = 0.1844.

Coins and Mutations
Consider 4 coins labelled A, G, T, C. At each time step we pick one coin at random and flip it. If the coin comes up heads, we replace it by a random pick from the other coins. Note that the statistics of any column are independent of the other columns.
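The coin-tossing likelihood above can be evaluated directly. A minimal sketch in plain Python; the fine grid over θ stands in for the "infinitely many coins" case:

```python
from math import comb

def coin_likelihood(theta, k=7, n=20):
    """Binomial likelihood L(theta | k heads in n tosses)."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)

# Three-coin case: evaluate the likelihood for each candidate coin.
for theta in (1/3, 1/2, 2/3):
    print(f"theta = {theta:.3f}  L = {coin_likelihood(theta):.4f}")

# "Infinitely many coins": scan a fine grid over (0, 1).
best = max((coin_likelihood(t / 1000), t / 1000) for t in range(1, 1000))
print(f"maximum L = {best[0]:.4f} at theta = {best[1]:.2f}")  # 0.1844 at 0.35
```

The grid maximum reproduces the slide's values: L = 0.1844 at θ = 0.35 = 7/20.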
Coins and Mutations: flip coins

ACACTTTGTGGTGTGGTGGT
ACACATTGTGGTGTGGTGGT
ACACATTGTAGTGTGGTGGT
ACACATTGTAGTTTGGTGGT
ACACATTGTAGTTTGGAGGT

We can extend this to continuous time. Each coin can be biased. Formally, this is a Markov process.
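The flip-coins process above can be simulated directly. A minimal sketch; the head probability of 0.05 and the number of steps are made-up illustration values:

```python
import random

BASES = "ACGT"

def step(seq, p_heads=0.05, rng=random):
    """One time step: pick a position at random, flip a (possibly biased)
    coin, and on heads substitute a randomly chosen *different* base."""
    seq = list(seq)
    i = rng.randrange(len(seq))
    if rng.random() < p_heads:  # heads: the site mutates
        seq[i] = rng.choice([b for b in BASES if b != seq[i]])
    return "".join(seq)

rng = random.Random(42)
s = "ACACTTTGTGGTGTGGTGGT"
for _ in range(100):
    s = step(s, rng=rng)
print(s)  # a mutated descendant of the starting sequence
```

Because each new sequence depends only on the current one, this simulation already has the Markov property.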
The result is that we can calculate the probability of a sequence at some time in the future or past, given the sequence now. For that we need to get mathematical.

Markov Process

Markov Property
The probability distribution of the next state is completely determined by the current state. As maths:

Pr(X_(n+1) = x | X_n = x_n, ..., X_1 = x_1) = Pr(X_(n+1) = x | X_n = x_n)

In the coin example above, the probability of the new sequence is completely determined by the previous state. Consider evolution: the probability of a DNA sequence in the next generation is completely determined by the current generation's DNA sequence.
In other words, the process is memoryless. We can therefore use a Markov process to model evolution.

Markov Process: Assumptions
Ergodic: there is some equilibrium distribution.
Stationary: the base frequencies are at this equilibrium distribution.
Reversible: the model is the same when time is reversed.
Each site in the alignment is independent and identically distributed.

The Rate Matrix: Substitution Models
Evolutionary models are often described using a substitution rate matrix R and character frequencies π. Here, a 4 × 4 matrix for DNA models:

[Figure: chemical structures of the four bases A, C, T, G, with the six exchange rates a-f between them.]

        A  C  G  T
R =  A  -  a  b  c
     C  a  -  d  e
     G  b  d  -  f
     T  c  e  f  -

π = (πA, πC, πG, πT)

The Rate Matrix: Relations between DNA models

[Figure: hierarchy of DNA models. JC69 (1 substitution type, equal base frequencies) extends to K2P (2 subst. types: transitions vs. transversions) and to F81 (different base frequencies); both extend to HKY85, which extends to TN93 (3 subst. types: transversions, 2 transitions) and then to GTR (6 subst. types: 4 transversions, 2 transitions).]

The Rate Matrix: Protein Models
Generally this is the same for protein sequences, but with 20 × 20 matrices. However, unlike DNA models, the matrix itself is never optimised.
Some protein models are: Poisson model ("JC69" for proteins), Dayhoff (Dayhoff et al., 1978), JTT (Jones et al., 1992), mtREV (Adachi & Hasegawa, 1996), cpREV (Adachi et al., 2000), VT (Müller & Vingron, 2000), WAG (Whelan & Goldman, 2000), BLOSUM 62 (Henikoff & Henikoff, 1992).

Rates and Probabilities: From substitution rates to probabilities

R and π are combined into the instantaneous rate matrix Q:

        μA   aπC  bπG  cπT
Q =     aπA  μC   dπG  eπT
        bπA  dπC  μG   fπT
        cπA  eπC  fπG  μT

where the diagonal entries make each row sum to zero:

μA = -(aπC + bπG + cπT)
μC = -(aπA + dπG + eπT)
μG = -(bπA + dπC + fπT)
μT = -(cπA + eπC + fπG)

Given the instantaneous rate matrix Q, we can compute a substitution probability matrix P at time t as

P(t) = e^(Qt)

With this matrix P we can compute the probability Pij(t) of a change i → j over a time t. That is,

Pr(X_t = j | X_0 = i) = Pij(t)
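The step from rates to probabilities can be checked numerically. A minimal sketch for the JC69 special case (all exchangeabilities equal, uniform π), using a plain truncated Taylor series for e^(Qt); real programs use the eigendecomposition route instead:

```python
from math import exp, factorial

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def expm(Q, t, terms=30):
    """P(t) = e^(Qt) via a truncated Taylor series (fine for small Qt)."""
    M = [[q * t for q in row] for row in Q]
    P = [[float(i == j) for j in range(4)] for i in range(4)]  # identity
    term = [row[:] for row in P]
    for k in range(1, terms):
        term = mat_mul(term, M)          # term = (Qt)^k
        for i in range(4):
            for j in range(4):
                P[i][j] += term[i][j] / factorial(k)
    return P

alpha, t = 0.01, 10.0  # JC69: every off-diagonal rate equals alpha
Q = [[-3 * alpha if i == j else alpha for j in range(4)] for i in range(4)]
P = expm(Q, t)

# JC69 has a known closed form to compare against:
p_same = 0.25 + 0.75 * exp(-4 * alpha * t)
p_diff = 0.25 - 0.25 * exp(-4 * alpha * t)
print(P[0][0] - p_same, P[0][1] - p_diff)  # both ~0; rows of P sum to 1
```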
Rates and Probabilities: Probability of the data

Start with a sequence s = {AGGT} at time 0. We can calculate the probability that the sequence has changed to s' = {ACGA} at time t. First we calculate P(t) = e^(Qt), usually using an eigenvalue decomposition of Qt. Let s_i be the character at the i'th position and ℓ the number of characters in s and s'. Pij(t) is the probability that character i changed to character j.
P(s' | s, t) = Π_{i=1..ℓ} P_{s_i s'_i}(t)

Consider finding the value of t where this is maximised.

Rates and Probabilities: Computing ML Distances Using Pij(t)

The likelihood of sequence s evolving to s' in time t:

L(t | s → s') = P(s' | s, t) = Π_{i=1..ℓ} P_{s_i s'_i}(t)

Likelihood surface for two sequences under JC69:
GATCCTGAGAGAAATAAAC
GGTCCTGACAGAAATAAAC

[Figure: the likelihood surface L(t) for these two sequences.]

Note: we do not compute the probability of the distance t but that of the data D = {s, s'}.
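For the two JC69 sequences above, the ML distance can be found by maximising L(t) on a grid and compared against the analytic JC69 distance formula. A sketch, with the branch length b measured in expected substitutions per site:

```python
from math import exp, log

s1 = "GATCCTGAGAGAAATAAAC"
s2 = "GGTCCTGACAGAAATAAAC"

def log_likelihood(b):
    """log L(b | s1 -> s2) under JC69."""
    p_same = 0.25 + 0.75 * exp(-4 * b / 3)
    p_diff = 0.25 - 0.25 * exp(-4 * b / 3)
    return sum(log(p_same if x == y else p_diff) for x, y in zip(s1, s2))

# Grid search over the likelihood surface.
best_b = max((b / 1000 for b in range(1, 2000)), key=log_likelihood)

# Analytic JC69 distance: d = -3/4 ln(1 - 4p/3), p = observed mismatch rate.
p = sum(x != y for x, y in zip(s1, s2)) / len(s1)
d = -0.75 * log(1 - 4 * p / 3)
print(best_b, d)  # both approximately 0.113
```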
Rates and Probabilities: Likelihoods of a Single Column

[Figure: a single alignment column on a small tree, all branch lengths 10. Inner node U joins the tips C and G; node W joins U (t_U = 10) and the remaining subtree V (t_V = 10). Partial likelihood vectors: at U, (0.0009, 0.0273, 0.0273, 0.0009); at W, (0.000075, 0.023402, 0.000075, 0.000771).]

Likelihoods of nucleotides at inner nodes:

L_U(i) = [P_iC(10) · L(C)] · [P_iG(10) · L(G)]
L_W(i) = [Σ_u P_iu(t_U) · L_U(u)] · [Σ_v P_iv(t_V) · L_V(v)]

Site likelihood of an alignment column k:

L(k) = Σ_i π_i · L_W(i) = 0.024323

Rates and Probabilities: Likelihoods of Trees (multiple columns)

[Figure: the same tree evaluated for each of the three columns of a 3 × 3 alignment, giving site likelihoods 0.047554, 0.047554 and 0.024323.]

Considering this tree with n = 3 sequences of length ℓ = 3, the tree likelihood is

L(T) = Π_{k=1..ℓ} L(k) = 0.047554² × 0.024323 = 0.000055
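The inner-node recursion above is Felsenstein's pruning algorithm. A minimal sketch of the L_U step, assuming (as the slide's numbers suggest) P_ii(10) = 0.91 and P_ij(10) = 0.03 for i ≠ j:

```python
# Each node carries a vector L(i) = P(tips below the node | state i at node).
BASES = "ACGT"
P_SAME, P_DIFF = 0.91, 0.03  # assumed values of P_ii(10) and P_ij(10)

def p(i, j):
    return P_SAME if i == j else P_DIFF

def tip_vector(char):
    """A tip is certain: likelihood 1 for the observed base, 0 otherwise."""
    return {i: float(i == char) for i in BASES}

def combine(child_vectors):
    """Partial likelihoods at a node whose children sit on branches of length 10."""
    out = {}
    for i in BASES:
        v = 1.0
        for Lc in child_vectors:
            v *= sum(p(i, j) * Lc[j] for j in BASES)
        out[i] = v
    return out

L_U = combine([tip_vector("C"), tip_vector("G")])
print([round(L_U[i], 4) for i in BASES])  # [0.0009, 0.0273, 0.0273, 0.0009]
```

Applying combine once more at the root and summing against π gives the site likelihood L(k); repeating this per column and multiplying gives L(T).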
Equivalently, the log-likelihood is

ln L(T) = Σ_{k=1..ℓ} ln L(k) = -9.80811

Optimise branch lengths
To compute optimal branch lengths:
Initialise the branch lengths.
Starting with one branch, adjust its length, recalculating the log-likelihood, until a maximum is found.
Do the same for the other branches, and repeat until no further improvement can be made.
Model parameters (the substitution rates and base frequencies π) can also be optimised.
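The per-branch step above is a one-dimensional maximisation of the log-likelihood in that branch's length. A minimal sketch using ternary search under JC69 for a single branch between two sequences; the sequences and search bounds are illustration values:

```python
from math import exp, log

s1 = "AAAAAAAAAAAAAAAAAAAA"
s2 = "CCCCCAAAAAAAAAAAAAAA"  # 5 mismatches out of 20

def log_lik(b):
    p_same = 0.25 + 0.75 * exp(-4 * b / 3)
    p_diff = 0.25 - 0.25 * exp(-4 * b / 3)
    return sum(log(p_same if x == y else p_diff) for x, y in zip(s1, s2))

def optimise_branch(f, lo=1e-6, hi=5.0, iters=200):
    """Ternary search for the maximum of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

b_hat = optimise_branch(log_lik)
print(round(b_hat, 4))  # 0.3041, matching -3/4 ln(1 - 4/3 * 0.25)
```

In a full program this step is applied to each branch in turn, cycling until the tree log-likelihood stops improving.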
Note that traditional multivariate optimisation methods can be applied here. Changing the topology, however, is much harder.

Finding the ML Tree
Exhaustive search: guaranteed to find the optimal tree, because all trees are evaluated, but not feasible for more than 10-12 taxa.
Branch and bound: guaranteed to find the optimal tree without searching certain parts of the tree space; can handle more sequences, but often not current-day datasets.
Heuristics: cannot guarantee to find the optimal tree, but are at least able to analyse large datasets.
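The infeasibility of exhaustive search comes from the size of tree space: the number of distinct unrooted binary topologies on n taxa is (2n - 5)!! = 3 · 5 · ... · (2n - 5), a standard count, which explodes around the 10-12 taxa mentioned above:

```python
def num_unrooted_trees(n):
    """Number of unrooted binary tree topologies on n taxa: (2n - 5)!!"""
    count = 1
    for k in range(3, 2 * n - 4, 2):  # 3, 5, ..., 2n - 5
        count *= k
    return count

for n in (5, 10, 12, 15):
    print(n, num_unrooted_trees(n))
# 10 taxa already give 2,027,025 topologies; 15 taxa give ~7.9e12.
```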
Build up a tree: Stepwise Insertion

[Figure: stepwise insertion. Starting from the three-taxon tree on A, B, C, taxon D is tried on each of the three branches, giving trees with log-likelihoods -3920.21, -3689.22 and -3920.98. The best tree is kept, and taxon E is then tried on each of its five branches (log-likelihoods -4710.37, -4560.70, -4521.39, -4579.17, -4610.40), and so on.]

Local Maxima
What if we have multiple maxima in the likelihood surface? Use tree rearrangements to escape local maxima.
Local Maxima: Tree Rearrangements

[Figure: the three rearrangement moves illustrated on a nine-taxon tree.]

Nearest neighbour interchange (NNI): possible NNI trees = O(n).
Subtree pruning and regrafting (SPR): possible SPR trees = O(n²).
Tree bisection and reconnection (TBR): possible TBR trees = O(n³).

Bootstraps
Usually when we estimate some parameter from data, we also have some measure of its variability, e.g. a mean and standard deviation.
We want to be able to do the same with trees. The bootstrap is a general statistical method that can be used in this case. The nonparametric bootstrap simply re-samples the alignment; the parametric bootstrap uses the model parameters to generate replicate data.
Bayesian methods usually get this for "free", because we already have a large set of trees that represent points in the posterior density.

Pros and Cons
Pros:
Established statistical method.
Simple to implement.
Studies indicate that it is quite conservative.
Cons:
Results have no convenient interpretation, i.e. 50% support does not mean 50% probability.
Some strong assumptions are imposed on the data, i.e. iid sites.
Relies on the data sample we are using being representative of the entire "population" of data.

Bootstrap flow
Estimate an ML tree and the model parameters.
- From the data (or the estimated parameters), generate replicate data sets.
- For each replicate data set, estimate a replicate ML tree.
- Combine the replicate ML trees into some kind of consensus tree.

Nonparametric Bootstrap

The nonparametric bootstrap samples the alignment with replacement. A site, i.e. a column of the alignment, is picked at random, and that column of sequence data is placed into the replicate alignment. Some columns will appear more than once in the replicate alignment.
Other columns will not appear at all. This requires that the data are i.i.d. across sites.

[Figure: an original five-sequence alignment, and a re-sampled alignment being built up one randomly chosen column at a time.]
The jackknife is the same procedure, but samples without replacement.

Parametric Bootstrap

Instead of re-sampling the data, we use the estimated model parameters. Start by estimating a ML tree and the model parameters. Using these estimated parameters and the estimated ML tree, simulate a new replicate data set.
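The nonparametric bootstrap and jackknife column sampling described above can be sketched in a few lines of Python. This is a minimal sketch, not from the lecture itself; the function names are my own, and the example alignment is the five-sequence, ten-column alignment shown in the slides.

```python
# Nonparametric bootstrap of an alignment: sample columns with replacement.
# Rows are taxa, columns are sites; all sequences have equal length.
import random

def bootstrap_alignment(alignment, rng=random):
    """Return a replicate alignment built from randomly chosen columns."""
    n_sites = len(alignment[0])
    # Pick n_sites column indices with replacement: some columns appear
    # more than once, others not at all.
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(row[c] for c in cols) for row in alignment]

def jackknife_alignment(alignment, keep_fraction=0.5, rng=random):
    """Jackknife: sample a subset of columns *without* replacement."""
    n_sites = len(alignment[0])
    cols = rng.sample(range(n_sites), int(n_sites * keep_fraction))
    return ["".join(row[c] for c in cols) for row in alignment]

# The original data from the slides.
alignment = ["ACACGCTTTA",
             "AGATGCTTAA",
             "ACCCC--GTA",
             "ATACCCTTTT",
             "AT--CCTTTA"]
replicate = bootstrap_alignment(alignment)
```

Every column of the replicate is a column of the original alignment, which is why the i.i.d.-across-sites assumption matters: re-sampling columns only makes sense if the columns are exchangeable.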
Then estimate a new ML tree and parameters from each replicate. In some cases the model parameters can be fixed. The parametric bootstrap does not impose any assumptions on the data beyond those of the model itself.

Combining the Trees

The 50% majority rule is conservative, and no two of the retained nodes can conflict. Extended consensus rules can vary slightly in implementation.
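The simulation step of the parametric bootstrap can be illustrated under the Jukes-Cantor model. This is a hedged sketch only: the star tree, the branch lengths, and the function names are illustrative assumptions, not the lecture's example.

```python
# Parametric bootstrap simulation sketch: generate a replicate alignment
# under Jukes-Cantor on a fixed star tree with given branch lengths.
import math
import random

BASES = "ACGT"

def jc_evolve(base, t, rng=random):
    """Evolve one base along a branch of length t (expected substitutions
    per site) under Jukes-Cantor."""
    # JC probability that the base is unchanged after branch length t.
    p_same = 0.25 + 0.75 * math.exp(-4.0 * t / 3.0)
    if rng.random() < p_same:
        return base
    return rng.choice([b for b in BASES if b != base])

def simulate_alignment(branch_lengths, n_sites, rng=random):
    """One replicate: root base drawn uniformly at each site, then each
    tip evolved independently down its branch (star tree)."""
    rows = [[] for _ in branch_lengths]
    for _ in range(n_sites):
        root = rng.choice(BASES)
        for i, t in enumerate(branch_lengths):
            rows[i].append(jc_evolve(root, t, rng))
    return ["".join(r) for r in rows]

replicate = simulate_alignment([0.1, 0.1, 0.3], n_sites=100)
```

A real parametric bootstrap would simulate down the full estimated ML tree under the full estimated model; the star tree here just keeps the sketch short.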
In particular, the extended majority rule (the default in Consensus) can place nodes in the final tree that conflict with more frequent nodes.

Summarising Trees: Consensus Methods

Example with three trees on the taxa A-F:
- Tree A contains the splits AB|CDEF and ABC|DEF.
- Tree B contains AC|BDEF and ABC|DEF.
- Tree C contains AC|BDEF, ABC|DEF and ABCD|EF.

Split frequencies: ABC|DEF 3 (100%), AC|BDEF 2 (66.7%), AB|CDEF 1 (33.3%), ABCD|EF 1 (33.3%).

- The strict consensus keeps only ABC|DEF.
- The semi-strict consensus also keeps the uncontradicted split ABCD|EF.
- The majority-rule consensus keeps ABC|DEF and AC|BDEF.
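The split counting behind these consensus methods can be reproduced directly from the three-tree example above. Representing a tree as a set of splits, and the `split` helper, are my own simplifications for the sketch.

```python
# Count split (bipartition) frequencies across trees, as consensus
# methods do. A split is stored canonically as the side containing 'A'.
from collections import Counter

taxa = frozenset("ABCDEF")

def split(side):
    """Canonical form of a bipartition: the side that contains 'A'."""
    s = frozenset(side)
    return s if "A" in s else taxa - s

# The three trees from the slide example, as sets of internal splits.
trees = [
    {split("AB"), split("ABC")},                  # Tree A
    {split("AC"), split("ABC")},                  # Tree B
    {split("AC"), split("ABC"), split("ABCD")},   # Tree C
]

counts = Counter(s for tree in trees for s in tree)
n = len(trees)

# Majority rule keeps splits found in more than half the trees;
# strict consensus keeps only splits found in every tree.
majority = {s for s, c in counts.items() if c > n / 2}
strict = {s for s, c in counts.items() if c == n}
```

Running this recovers the slide's numbers: ABC|DEF occurs 3 times (kept by both rules), AC|BDEF twice (kept by majority rule only), and AB|CDEF and ABCD|EF once each.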
Interpretation

Unfortunately, in this setting, interpreting bootstrap scores is not straightforward. A bootstrap value is not a probability. Generally the bootstrap appears to be somewhat conservative; on the other hand, it is not uncommon to see high bootstrap support for the wrong tree. One interpretation is that the bootstrap attempts to measure sampling variance (Swofford et al., 1996).

Example: Support of a Known Tree

Hillis et al., 1992: bacteriophage T7 DNA sequences with a known phylogeny.
Hypothesis Testing

What question do I want to answer? For example: should I use the JC model or the GTR model? Or perhaps: is tree A statistically significantly different from tree B? Being able to answer such questions is an advantage of using ML. It is important to note that you should know the null hypothesis or hypotheses before you "collect" the data.

Nested Models

A model is nested in another model if it is a simplification of the more complicated model, e.g. a star topology within a resolved tree, or JC within GTR.
In such a situation we can compare the likelihoods of both models.
- The hypothesis: the more complicated model is better.
- The null hypothesis: both models are equally good.

Note that the more complicated model always has an equal or higher likelihood. We can use a log likelihood ratio test.
LRT

The log likelihood ratio test statistic

    Δ = −2 log(L0 / L1) = 2(log L1 − log L0)

is asymptotically χ²-distributed with the appropriate degrees of freedom. The degrees of freedom are the difference in the number of free parameters between the two models; e.g. for a star tree compared to a given resolved tree, it is the number of internal branches. We calculate Δ and check whether it falls outside our P-value range on the χ² distribution.

Tree Tests

The LRT cannot be used to compare different topologies, since they are not nested. So two tree-test methods have been developed: KH and SH. Note that the first test (KH) is often misapplied. The idea is similar to the LRT, in that there is a statistic that is compared to a distribution.
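The LRT computation can be sketched as follows. The log-likelihood values are invented for illustration; the closed-form χ² tail probability used here is only valid for an even number of degrees of freedom, which covers the JC-vs-GTR comparison (GTR adds 5 free relative rates and 3 free base frequencies, so df = 8).

```python
# Likelihood ratio test: Delta = 2(log L1 - log L0), compared against a
# chi-squared distribution. For even df the chi-squared survival function
# has a simple closed form, which avoids external libraries.
import math

def chi2_sf_even_df(x, df):
    """P(X >= x) for a chi-squared variable with even df:
    exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!"""
    assert df % 2 == 0 and df > 0
    m = df // 2
    term, total = 1.0, 1.0
    for i in range(1, m):
        term *= (x / 2.0) / i
        total += term
    return math.exp(-x / 2.0) * total

def lrt(logL0, logL1, df):
    """Return the LRT statistic and its asymptotic p-value."""
    delta = 2.0 * (logL1 - logL0)
    return delta, chi2_sf_even_df(delta, df)

# Illustrative (made-up) log-likelihoods: L0 under JC, L1 under GTR.
delta, p = lrt(logL0=-2568.3, logL1=-2550.7, df=8)
```

With these made-up values Δ = 35.2, which is far in the tail of a χ² distribution with 8 degrees of freedom, so the simpler model would be rejected.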
Only now we must estimate that distribution.
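One common way to estimate that distribution is a RELL-style bootstrap: re-sample the per-site log-likelihood differences between the two trees, centred so that the null hypothesis (no difference) holds. This is a rough sketch of the idea only; the per-site values below are synthetic, and a full KH test involves further refinements not shown here.

```python
# Sketch of the resampling idea behind the KH test (RELL variant):
# estimate the null distribution of the log-likelihood difference by
# bootstrapping centred per-site differences.
import random

def kh_rell(site_lnl_tree1, site_lnl_tree2, n_reps=1000, rng=random):
    diffs = [a - b for a, b in zip(site_lnl_tree1, site_lnl_tree2)]
    observed = sum(diffs)
    mean = observed / len(diffs)
    centred = [d - mean for d in diffs]  # enforce the null hypothesis
    count = 0
    for _ in range(n_reps):
        # Re-sample sites with replacement and sum the centred differences.
        resample = [rng.choice(centred) for _ in range(len(centred))]
        if abs(sum(resample)) >= abs(observed):
            count += 1
    return observed, count / n_reps  # statistic and two-sided p-value

# Synthetic per-site log-likelihoods for two trees, for illustration only.
rng = random.Random(1)
l1 = [rng.gauss(-3.0, 0.5) for _ in range(200)]
l2 = [v - 0.05 + rng.gauss(0.0, 0.3) for v in l1]
obs, p = kh_rell(l1, l2, rng=rng)
```

Note that centring the differences is exactly where the KH test is often misapplied: it assumes the two trees were chosen a priori, not that one of them is the ML tree selected from the same data.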