Bayes Theorem | Markov Chains | MHMCMC | What is long enough | Phylogenetic Bayesian MCMC | Summary

Bayesian Tree Sampling
Greg Ewing
CIBIV
December 3, 2007

Outline
1 Bayes Theorem
    Bayes Theorem
2 Markov Chains
    Definition
    Properties
3 MHMCMC
    Algorithm
    Examples
4 What is long enough
    It's all about the die
    Hats
5 Phylogenetic Bayesian MCMC
    In practice
    Priors

Bayes Theorem: The difference
The Bayesian approach asks the right question in a hypothesis testing procedure, namely, "What is the probability that this hypothesis is true, given the data?" rather than the classical approach, which asks a question like, "Assuming that this hypothesis is true, what is the probability of the observed data?"
Source: Statistical Methods in Bioinformatics

Bayes Theorem: Derivation
We know that $\Pr(A \cap B) = \Pr(B \mid A)\,\Pr(A)$, from conditional probability.
Also $\Pr(A \cap B) = \Pr(B \cap A) = \Pr(A \mid B)\,\Pr(B)$.
Therefore $\Pr(A \mid B)\,\Pr(B) = \Pr(B \mid A)\,\Pr(A)$, and so
$$\Pr(A \mid B) = \frac{\Pr(B \mid A)\,\Pr(A)}{\Pr(B)}.$$
This is Bayes' formula, or Bayes' theorem.

Bayes Theorem
$$\Pr(A \mid B) = \frac{\Pr(B \mid A)\,\Pr(A)}{\Pr(B)}$$
Posterior density: $\Pr(A \mid B)$
Likelihood: $L(A, B) = \Pr(B \mid A)$
Prior: $\Pr(A)$
The Bayesian approach flips the conditional probability around.
It is easy to include prior information, which is often available.
The Bayesian conditional probability is perhaps more intuitive.
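To make the formula concrete, here is a minimal numeric sketch of Bayes' theorem for a yes/no hypothesis A and an observation B. The function name `posterior` and all the numbers are hypothetical, chosen only for illustration; $\Pr(B)$ is expanded with the law of total probability.

```python
def posterior(prior_a, lik_b_given_a, lik_b_given_not_a):
    """Return Pr(A|B) from Pr(A), Pr(B|A) and Pr(B|not A) via Bayes' theorem."""
    # Pr(B) = Pr(B|A) Pr(A) + Pr(B|not A) Pr(not A)
    evidence = lik_b_given_a * prior_a + lik_b_given_not_a * (1.0 - prior_a)
    return lik_b_given_a * prior_a / evidence


# Hypothetical numbers: a rare hypothesis (prior 1%) and fairly informative data.
p = posterior(prior_a=0.01, lik_b_given_a=0.9, lik_b_given_not_a=0.05)
print(round(p, 4))
```

Even with a likelihood of 0.9, the small prior keeps the posterior well below one half, which is the "flip" the slide refers to.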
Bayes Theorem: Making formulas tangible
$$\Pr(T, M \mid D) \propto \Pr(D \mid T, M)\,\Pr(T, M)$$
The likelihood is $L(T, D, M) = \Pr(D \mid T, M)$.
T is the tree.
D is the sequence data (DNA, protein, etc.).
M is the set of model parameters, such as those of GTR.
In words: the likelihood is the probability of the DNA data given the tree and the model parameters.
The prior is $\Pr(T, M)$ and encodes any information we already know, e.g. the root is not older than 10 million years.
The posterior density is $\Pr(T, M \mid D)$, the probability of the tree and model parameters given the sequence data.

Bayes Theorem: The Bad News
We can't directly solve for the posterior distribution.
Therefore MHMCMC must be used, which means it will take a lot of computer resources.
The "answer" is not a tree, but a distribution of trees/states.
It will always be slower than ML (maximum likelihood).
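One way to see why the posterior cannot be solved for directly is to write Bayes' theorem for trees in full, including the normalising constant. This is a sketch under the assumption that the trees T form a discrete set and the model parameters M are continuous; the slides themselves do not spell this out.

```latex
\Pr(T, M \mid D) = \frac{\Pr(D \mid T, M)\,\Pr(T, M)}{\Pr(D)},
\qquad
\Pr(D) = \sum_{T} \int \Pr(D \mid T, M)\,\Pr(T, M)\,\mathrm{d}M
```

The sum in $\Pr(D)$ runs over every possible tree and the integral over every parameter value, which is generally not tractable. A sampler that only ever compares the posterior density of two states needs just the ratio, in which $\Pr(D)$ cancels.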
Markov Chains
Assume that I have a machine that outputs random numbers, i.e. a chain of numbers.
If I can work out the probability of the next output by looking only at the previous output, the machine is said to have the Markov property.
Example: our machine flips a coin and either adds one to the last output or subtracts one.
Machine output: 1, 2, 1, 0, 1, 0, -1, -2, -3, -2, -3, -4, -3, -2, -1, 0, -1, 0, 1, 2, 1, 2, 3
We don't care about the whole sequence, just the last output, which is 3.
The next item has a 50% chance of being 4 and a 50% chance of being 2.
This is a Markov Chain.

Definition of a Markov Chain
Definition: A Markov Chain is a chain of randomly chosen values where the probability of the next value is entirely determined by the previous value.
Rough math definition: $\Pr(X_n \mid X_{n-1}, X_{n-2}, \ldots) = \Pr(X_n \mid X_{n-1})$
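The coin-flip machine described above can be sketched in a few lines. The function name `random_walk` and the fixed seed are illustrative choices, not part of the original slides; the point is that each new output is computed from the last output alone.

```python
import random


def random_walk(start, steps, rng):
    """Simulate the coin-flip machine: each output is the previous output +1 or -1."""
    chain = [start]
    for _ in range(steps):
        step = 1 if rng.random() < 0.5 else -1  # fair coin flip
        chain.append(chain[-1] + step)  # uses only the last output (Markov property)
    return chain


rng = random.Random(1)  # fixed seed so the run is repeatable
chain = random_walk(start=1, steps=22, rng=rng)
print(chain)
```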
Markov Chain Graph
[Figure: state graph with states A, B and C]
Simple Markov Chains can be represented as a graph.
Nodes or circles represent states (the last output).
Arrows are transitions between states.
Transitions (arrows) usually have probabilities on them. That is the probability that this transition will be followed.
For clarity, when transitions are equiprobable we omit the transition probabilities.
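A state graph like this maps naturally onto a dictionary of outgoing transitions. The sketch below is an assumption-laden illustration: the figure is not reproduced here, so the arrows and probabilities in `GRAPH` are hypothetical, and `walk` is an illustrative helper name.

```python
import random

# Hypothetical state graph: each state maps to {next_state: transition probability}.
GRAPH = {
    "A": {"B": 0.5, "C": 0.5},
    "B": {"A": 1.0},
    "C": {"B": 1.0},
}


def walk(graph, start, steps, rng):
    """Follow the arrows of the graph, choosing each transition by its probability."""
    state = start
    output = [state]
    for _ in range(steps):
        targets = list(graph[state])
        weights = [graph[state][t] for t in targets]
        state = rng.choices(targets, weights=weights)[0]
        output.append(state)
    return output


rng = random.Random(0)
print(" ".join(walk(GRAPH, "A", 11, rng)))
```

Because the next state is drawn using only the current state's row, the output is a Markov chain by construction.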
Markov Chain Graph Example
[Figure: state graph with states A, B and C]
Output: A C B A B A B A C B A C
Note that the states can be anything, e.g. different trees.

Extra Markov Chain Properties: Irreducibility
[Figure: reducible state diagram with states A, B, C, D and E]
Definition: A Markov Chain is irreducible if and only if the chain can eventually get from any possible state to any other possible state.
The above state diagram is NOT irreducible.
Adding a transition from D to C would make it irreducible.
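Irreducibility can be checked mechanically as a reachability question on the state graph. The sketch below uses a hypothetical five-state graph (the arrows of the slide's diagram are not recoverable from the text), built so that, as in the slide, adding a D to C transition repairs it. `reachable` and `is_irreducible` are illustrative names.

```python
def reachable(graph, start):
    """Return the set of states that can eventually be reached from start."""
    seen = {start}
    stack = [start]
    while stack:
        state = stack.pop()
        for nxt in graph[state]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def is_irreducible(graph):
    """True if every state can eventually reach every other state."""
    states = set(graph)
    return all(reachable(graph, s) == states for s in states)


# Hypothetical arrows: once the chain enters D or E it can never leave.
reducible = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": ["E"], "E": ["D"]}
# The same graph with an extra D -> C transition.
repaired = {**reducible, "D": ["E", "C"]}

print(is_irreducible(reducible), is_irreducible(repaired))
```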
Extra Markov Chain Properties: Reversibility
[Figure: state graph with states A, B and C]
Is this output reversed?
C A B C A B A B A B C
Note that there is no B → C transition or C → A transition in the graph, yet both appear in this output.
Therefore we can tell that this output sequence is reversed.

Extra Markov Chain Properties: Reversibility, Tricky Example
[Figure: state graph with states A, B and C; transition probabilities shown: 0.5, 0.5, 0.1, 0.5, 0.9, 0.5]
Is this output reversed?
A B C A B C B C B C B C A B C B C A B A
The transition B → A is much less likely than B → C in the forward direction.
In this example there are 7 B → C transitions and only 1 B → A transition in the forward direction.
Conversely, there are 4 B → C transitions and 4 B → A transitions in the reverse direction.
It seems we can guess that this output is not reversed.
But we stick to simple definitions for this course.

Extra Markov Chain Properties: Reversibility
Definition: A Markov Chain is reversible if we cannot detect whether or not the chain is running in "reverse". That is, the output is statistically identical in both directions.

Extra Markov Chain Properties: Aperiodic
[Figure: two state graphs on states 1, 2 and 3, labelled Periodic-Aperiodic]
Definition: A Markov Chain is periodic if there is some fixed "cycle" of states, and it is aperiodic otherwise.
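The transition counts quoted in the tricky reversibility example can be reproduced by counting adjacent pairs in the output, once read forwards and once read backwards. `transition_counts` is an illustrative helper name; the sequence is the one from the slide.

```python
from collections import Counter


def transition_counts(sequence):
    """Count how often each (from_state, to_state) pair occurs in a sequence."""
    return Counter(zip(sequence, sequence[1:]))


output = "A B C A B C B C B C B C A B C B C A B A".split()
forward = transition_counts(output)
backward = transition_counts(output[::-1])

print(forward[("B", "C")], forward[("B", "A")])    # 7 1
print(backward[("B", "C")], backward[("B", "A")])  # 4 4
```

Only the forward reading matches a chain in which leaving B for C is far more likely than leaving B for A, which is why the output is judged not to be reversed.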
Extra Markov Chain Properties: Why do we care?
If an MCMC chain has these 3 properties (reversible, irreducible and aperiodic), then it is also ergodic.

Extra Markov Chain Properties: Stationary distribution
[Figure: state graph with states 1, 3, 2, 4 and -1]
Output: 1 3 2 4 4 2 1 2 -1 -1 4 2 3 1 3 2 4 4 4 -1 -1 -1 4 2 3 1 2 3 -1
We can calculate statistics on the output, like the mean and standard deviation. We can also plot histograms etc.
Consider the distribution of the output.
What about the start state? That is, if the chain is started in state 1, will the distribution be different from starting in state 2?
Extra Markov Chain Properties: Ergodic
Definition: If we can start from any state, take samples for long enough, and always end up with the same distribution, then that distribution is the stationary distribution of the Markov Chain, and the Markov Chain is said to be ergodic.
Definition: If a Markov Chain is reversible, irreducible and aperiodic then it is also ergodic.
So we can know that a chain will converge to the stationary distribution without testing every state.
Usually the symbol $\pi$ denotes the stationary distribution.
Note that we have not said anything about how many samples we need to get an accurate distribution.
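The claim that the start state does not matter can be illustrated on a small chain. This sketch uses a hypothetical 3-state transition matrix `P` (not one of the graphs from the slides) that is irreducible and aperiodic, and it propagates probabilities exactly rather than drawing samples, so the comparison is free of sampling noise. `step` and `long_run` are illustrative names.

```python
# Hypothetical transition matrix: P[i][j] is the probability of moving from state i to j.
P = [
    [0.5, 0.5, 0.0],
    [0.1, 0.0, 0.9],
    [0.5, 0.5, 0.0],
]


def step(dist, P):
    """Advance a probability distribution over states by one chain step."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def long_run(start_state, P, n_steps=200):
    """Distribution over states after n_steps, starting from a single known state."""
    dist = [0.0] * len(P)
    dist[start_state] = 1.0
    for _ in range(n_steps):
        dist = step(dist, P)
    return dist


for start in range(3):
    print(start, [round(p, 4) for p in long_run(start, P)])
```

All three start states end at the same distribution, and applying one more step leaves it unchanged, which is what "stationary" means.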
Metropolis Hastings MCMC: Algorithm
Start in state $X_n$.
Randomly generate some new state $X'$ from $X_n$.
Calculate the acceptance probability based on the posterior density.
Accept the new state with that probability.
If we accept, then $X_{n+1} = X'$, otherwise $X_{n+1} = X_n$.

If our new state generation step can eventually get to any valid state (with non-zero probability), then the chain is irreducible.
If it's possible to generate $X'$ from $X$ and $X$ from $X'$, then the chain can be reversible.
The acceptance probability is chosen so that the chain will be reversible and aperiodic.
Therefore the chain is ergodic with stationary distribution $\pi$.
The Key Idea
The stationary distribution is the posterior distribution of interest. That is, the MHMCMC chain is sampling the Bayesian posterior distribution.
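The algorithm can be sketched on a toy target. Everything specific here is an assumption for illustration: the five-state target `WEIGHTS` is hypothetical and unnormalised, the proposal moves one step left or right on a ring (so it is symmetric), and with a symmetric proposal the usual Metropolis-Hastings acceptance probability reduces to min(1, target(X') / target(X)). The slides do not give the acceptance formula, so this is the standard choice rather than necessarily the one the author had in mind.

```python
import random
from collections import Counter

# Hypothetical unnormalised target density over states 0..4.
WEIGHTS = [1.0, 2.0, 4.0, 2.0, 1.0]


def propose(state, rng):
    """Symmetric proposal: move one step left or right on a ring of states."""
    return (state + rng.choice([-1, 1])) % len(WEIGHTS)


def metropolis_hastings(n_samples, rng, start=0):
    state = start
    samples = []
    for _ in range(n_samples):
        candidate = propose(state, rng)
        # Only the ratio of target densities is needed, so no normalising constant.
        accept_prob = min(1.0, WEIGHTS[candidate] / WEIGHTS[state])
        if rng.random() < accept_prob:
            state = candidate  # accept: X_{n+1} = X'
        samples.append(state)  # on rejection the old state is output again
    return samples


samples = metropolis_hastings(100_000, random.Random(0))
counts = Counter(samples)
total = sum(WEIGHTS)
for s in range(len(WEIGHTS)):
    print(s, round(counts[s] / len(samples), 3), WEIGHTS[s] / total)
```

The sampled frequencies should come out close to the normalised weights, which is the key idea in miniature: the chain's stationary distribution is the target.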
Examples: Example
Start with tree T = (a, b|c, d).
Generate a new tree T' from T by a branch swap (b ↔ c): T' = (a, c|b, d).
Calculate the acceptance probability and then accept/reject. We reject this time.
The new state is T = (a, b|c, d), which we output.
The next generated state is T' = (a, d|b, c) (b ↔ d), and this time we accept.
The new state is T = (a, d|b, c).
We continue with T' = (a, c|b, d) (c ↔ d), and accept.
Output: (a, b|c, d), (a, b|c, d), (a, d|b, c), (a, c|b, d)

Examples: Die example
Wiki formula:
$$\Pr(k \mid i, s) = \frac{1}{s^i} \sum_{n=0}^{\lfloor (k-i)/s \rfloor} (-1)^n \binom{i}{n} \binom{k - sn - 1}{i - 1}$$
Die MHMCMC
The formula looks too complicated!
Use a simple MHMCMC instead.
Just pick one die at random and re-throw.
Bayes Theorem Markov Chains MHMCMC What is long enough Phylogenetic Bayesian MCMC Summary Examples Die example Wiki Formula Pr(k|i, s) = 1 si k-i s n=0 (-1)n i n k - sn - 1 i - 1 Die MHMCMC Formula looks too complicated! Use a simple MHMCMC instead. Just pick one die at random and re-throw. This is reversible and the acceptance ratio is 1. i.e we always accept. Bayes Theorem Markov Chains MHMCMC What is long enough Phylogenetic Bayesian MCMC Summary Examples Die example 3 die 1 1 1 Output 3 Bayes Theorem Markov Chains MHMCMC What is long enough Phylogenetic Bayesian MCMC Summary Examples Die example 3 die 1 1 1 4 1 1 Output 3 6 Bayes Theorem Markov Chains MHMCMC What is long enough Phylogenetic Bayesian MCMC Summary Examples Die example 3 die 1 1 1 4 1 1 4 1 6 Output 3 6 11 Bayes Theorem Markov Chains MHMCMC What is long enough Phylogenetic Bayesian MCMC Summary Examples Die example 3 die 1 1 1 4 1 1 4 1 6 2 1 6 Output 3 6 11 9 Bayes Theorem Markov Chains MHMCMC What is long enough Phylogenetic Bayesian MCMC Summary Examples Die example 3 die 1 1 1 4 1 1 4 1 6 2 1 6 3 1 6 Output 3 6 11 9 10 Bayes Theorem Markov Chains MHMCMC What is long enough Phylogenetic Bayesian MCMC Summary Examples Die example 3 die 1 1 1 4 1 1 4 1 6 2 1 6 3 1 6 3 1 4 Output 3 6 11 9 10 8 Bayes Theorem Markov Chains MHMCMC What is long enough Phylogenetic Bayesian MCMC Summary Examples Die example 3 die 1 1 1 4 1 1 4 1 6 2 1 6 3 1 6 3 1 4 3 5 4 Output 3 6 11 9 10 8 12 Bayes Theorem Markov Chains MHMCMC What is long enough Phylogenetic Bayesian MCMC Summary Examples Die example 3 die 1 1 1 4 1 1 4 1 6 2 1 6 3 1 6 3 1 4 3 5 4 3 2 4 Output 3 6 11 9 10 8 12 9 Bayes Theorem Markov Chains MHMCMC What is long enough Phylogenetic Bayesian MCMC Summary Outline 1 Bayes Theorem Bayes Theorem 2 Markov Chains definition Properties 3 MHMCMC Algorithm Examples 4 What is long enough Its all about the die Hats 5 Phylogenetic Bayesian MCMC In practice Priors Bayes Theorem Markov Chains MHMCMC What is long 
More Die

By changing just one die at each step, the sum can never change by more than 5 from step to step.
If we have 100 dice and start at all ones, it will take a long time to get to the "equilibrium".
On the other hand, we could roll every die at each step.
In this case we get to equilibrium in just a single step, but must generate 100 random numbers per step.
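The two update schemes just described can be simulated directly. The following is a minimal sketch, assuming fair six-sided dice; the function names and the threshold of 330 (a little below the equilibrium mean of 100 x 3.5 = 350) are choices made here for illustration.

```python
import random


def simulate(n_dice=100, n_steps=1000, reroll_all=False, sides=6, seed=0):
    """Return the dice sum at the start and after each step, starting from all ones."""
    rng = random.Random(seed)
    dice = [1] * n_dice
    sums = [sum(dice)]
    for _ in range(n_steps):
        if reroll_all:
            # Roll every die: n_dice random numbers per step.
            dice = [rng.randint(1, sides) for _ in range(n_dice)]
        else:
            # Re-throw a single randomly chosen die.
            dice[rng.randrange(n_dice)] = rng.randint(1, sides)
        sums.append(sum(dice))
    return sums


def first_step_reaching(sums, level):
    """Index of the first recorded sum at or above level, or None if never reached."""
    for step, value in enumerate(sums):
        if value >= level:
            return step
    return None


if __name__ == "__main__":
    one = simulate(reroll_all=False)
    every = simulate(reroll_all=True)
    print("one die per step, first step with sum >= 330:",
          first_step_reaching(one, 330))
    print("all dice per step, first step with sum >= 330:",
          first_step_reaching(every, 330))
```

The one-die chain needs a few hundred steps to climb from 100 towards 350, while the all-dice chain is there almost immediately, at the price of 100 random numbers per step.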
[Figure: two trace plots of the dice sum (DieSum, axis from 100 to 400) against sample number (Samples, 0 to 1000). First panel: 100 dice, rolling 1 die per step. Second panel: 100 dice, rolling all per step.]

Effective Sample Size

Both chains were 1000 MCMC samples long, but the samples are not independent of each other.
It's clear that the second case gives better results.
Effective sample size is the estimated number of independent samples and is calculated with the integrated autocorrelation time.
(Tracer reports it, for example.)
Due to the correlations between samples, we don't really need every sample from the MCMC chain, and instead only collect every 100th sample or so.
Performance should be measured in the number of effective samples per CPU cycle.

Witch's Hat

Consider all non-tree-like signals. Recombination, horizontal gene transfer and other effects could contribute to a lot of witch's hats.

Key Points for simple analysis

Check the effective sample size.
Choose the correct sample intervals.
Check burn-in. It should be small enough that it does not matter if you include it.
Not all moves are equal. How long depends on many things.
Multiple runs from random starting locations.

Posterior

$$\Pr(T, M \mid D) \propto \Pr(D \mid T, M)\,\Pr(T, M)$$

The likelihood is L(T, D, M) = Pr(D|T, M).
T is the tree.
D is the DNA/protein etc. sequence data.
M is the model parameters, like GTR.

Warning: Trees Make Life Difficult
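As a short worked note on sampling the posterior Pr(T, M|D): written out with its normalising constant, in standard notation that is an assumption here rather than something taken from the slides, Bayes theorem for trees reads

```latex
\Pr(T, M \mid D) = \frac{\Pr(D \mid T, M)\,\Pr(T, M)}{\Pr(D)},
\qquad
\Pr(D) = \sum_{T} \int \Pr(D \mid T, M)\,\Pr(T, M)\,\mathrm{d}M .
```

Here the sum over T stands for a sum over topologies together with an integral over branch lengths. Pr(D) does not depend on T or M, so it cancels in the ratio of the posterior at a proposed state (T', M') to the posterior at the current state (T, M):

```latex
\frac{\Pr(T', M' \mid D)}{\Pr(T, M \mid D)}
  = \frac{\Pr(D \mid T', M')\,\Pr(T', M')}{\Pr(D \mid T, M)\,\Pr(T, M)} .
```

This is why an MCMC sampler only ever needs the likelihood times the prior, and never the usually intractable Pr(D).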
Moves and why you care about irreducibility

Many programs have a huge set of options.
It is often possible to choose a set of moves that is not reversible or not irreducible.
Hence the chain will not properly sample the posterior distribution.
It may not be possible to get to the parts of the state space that are of interest.
The wrong choice of moves could make the chain run very slowly.
Examples of real output.

Aside: Hot and Cold chains

Have more than one chain.
Each extra chain is heated, with only one chain that is not.
We swap states between chains at each step, or as frequently as desired.
Only collect samples from the cold chain, i.e. the only chain with the correct distribution.
The idea is that we won't get stuck.
Generally not as effective as just developing some better moves.
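The hot and cold chains aside can be sketched on a toy problem. The code below is an illustration under assumptions, not how any particular phylogenetics program implements it: the target is an invented one-dimensional bimodal density, heating is done by raising the target to a power beta < 1, and the names log_target, mc3 and betas are hypothetical.

```python
import math
import random


def log_target(x):
    """Hypothetical bimodal target: equal mixture of unit normals at -4 and +4 (unnormalised)."""
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)
    # log-sum-exp keeps the computation stable far from both modes.
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def accept(log_ratio, rng):
    """Accept with probability min(1, exp(log_ratio))."""
    return rng.random() < math.exp(min(0.0, log_ratio))


def mc3(n_steps, betas=(1.0, 0.5, 0.2), step=1.0, seed=0):
    """Chain i targets target(x) ** betas[i]; betas[0] = 1 is the cold chain."""
    rng = random.Random(seed)
    states = [-4.0] * len(betas)
    cold_samples = []
    for _ in range(n_steps):
        # Within-chain Metropolis update with a symmetric random-walk proposal.
        for i, beta in enumerate(betas):
            candidate = states[i] + rng.gauss(0.0, step)
            if accept(beta * (log_target(candidate) - log_target(states[i])), rng):
                states[i] = candidate
        # Propose swapping the states of a random pair of neighbouring chains.
        i = rng.randrange(len(betas) - 1)
        j = i + 1
        log_swap = (betas[i] - betas[j]) * (log_target(states[j]) - log_target(states[i]))
        if accept(log_swap, rng):
            states[i], states[j] = states[j], states[i]
        # Only the cold chain has the correct distribution, so only it is recorded.
        cold_samples.append(states[0])
    return cold_samples


if __name__ == "__main__":
    samples = mc3(20000)
    print("fraction of cold samples in the right-hand mode:",
          sum(x > 0 for x in samples) / len(samples))
```

The heated chains see a flattened landscape and cross between the two modes more easily; swaps then carry those crossings down to the cold chain. Whether this beats a better-designed move on the cold chain alone is exactly the caveat raised above.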
Priors

Huge topic!
Without proper priors, the posterior density may not even exist!
Priors do not need to be highly informed to be effective, e.g. root height.
Informative priors can make analysis possible by restricting the state space.
Priors should be considered with respect to the hypothesis that will be tested.

Priors-Rules

Trees must have a prior.
Even if all the branch lengths in a topology are infinitely long, the likelihood is still finite.
Infinitely long branches do not make sense.
Yule priors, coalescent priors and exponential priors are common.
For rooted topologies, a simple bounded uniform prior is sufficient.
Even if the maximum root height is 100 expected substitutions per site, the posterior can now be normalized.

Rules of thumb

Do more than one run. I recommend about 10 or so if possible.
Each run should always start from a random starting point. Never use an NJ tree or any other "good" starting point.
Burn-in should be less than a tenth of the full run.
In general, if the statistics are affected by the amount of burn-in, the chain wasn't run long enough.
More parameters will always take longer. Don't use more parameters than are needed.
Check your priors!

Summary

Bayesian inference is not maximum likelihood.
It is not a black box. Care must be taken to get the chain set up correctly, and when interpreting the results.
Run the chain long enough! This is the most common mistake.

Other points to consider:
Generally slower than ML
(bootstrapped).
Support values are easier to interpret.
Can incorporate prior information easily.
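Two of the checks recommended in these slides, the effective sample size and the sensitivity of results to burn-in, can be sketched as follows. This is a hedged illustration only: the ESS estimator uses a simple rule (sum the autocorrelations until the first non-positive one) that is an assumption made here, and Tracer or other tools may use a different estimator. The test chain is the 100 dice, one die per step example; all function names are hypothetical.

```python
import random


def one_die_chain(n_dice=100, n_steps=5000, sides=6, seed=0):
    """Dice sums from re-throwing one random die per step, starting at all ones."""
    rng = random.Random(seed)
    dice = [1] * n_dice
    sums = []
    for _ in range(n_steps):
        dice[rng.randrange(n_dice)] = rng.randint(1, sides)
        sums.append(sum(dice))
    return sums


def mean(xs):
    return sum(xs) / len(xs)


def effective_sample_size(xs, max_lag=500):
    """ESS = N / tau, where tau = 1 + 2 * (sum of autocorrelations).

    The sum is truncated at the first non-positive autocorrelation or at
    max_lag, whichever comes first (a simplifying assumption).
    """
    n = len(xs)
    mu = mean(xs)
    centred = [x - mu for x in xs]
    var = sum(c * c for c in centred) / n
    if var == 0.0:
        return float("nan")          # undefined for a constant chain
    tau = 1.0
    for lag in range(1, min(max_lag, n)):
        acf = sum(centred[t] * centred[t + lag] for t in range(n - lag)) / (n * var)
        if acf <= 0.0:
            break
        tau += 2.0 * acf
    return n / tau


def burn_in_sensitivity(xs, fractions=(0.0, 0.1, 0.25)):
    """Mean of the chain after discarding each burn-in fraction."""
    return {f: mean(xs[int(f * len(xs)):]) for f in fractions}


if __name__ == "__main__":
    chain = one_die_chain()
    print("samples:", len(chain), "ESS:", round(effective_sample_size(chain), 1))
    for fraction, value in burn_in_sensitivity(chain).items():
        print("burn-in", fraction, "mean", round(value, 2))
```

If the means differ noticeably between burn-in fractions, or the ESS is a small fraction of the chain length, that is a sign the chain was not run long enough.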