BCB 503: RevBayes Intro

Fourth session: Diversification

Orlando Schwery, 14. September 2021

This is based on the following tutorials:


Additional tutorials related to the topic that we won’t cover today:


Standard BiSSE

This time, we’re going to write our code into files that we call from RevBayes later, instead of interactively. Thus apart from our usual data and output folders, we shall also have a folder that we may call src (for source, but call it whatever you please, e.g. code, as long as you adjust the paths accordingly), and where we’ll put all our code files.

For now, we shall have a runfile.Rev where we have our general settings, and a bisse.Rev file with the specific code for our BiSSE model, which we will then call from within the runfile. If we play our cards right, we can then run a different model with the same runfile by just changing which model file we call.


Data

We’re using a dataset for primates and their activity period (nocturnal vs. diurnal). Since data isn’t specific to the model, we add this to the runfile. For practical reasons that will become clear later, we can set variables for our data based on the file names:

group = "primates"
dataset = "activity_period"

We then load the tree and the data:

tree <- readTrees("data/" + group + "_tree.nex")[1]
data <- readCharacterData("data/" + group + "_" + dataset + ".nex")

We save the number of taxa and the number of states to variables for later:

num_taxa <- tree.ntips()
NUM_STATES = 2

Just the same, we can also add the name of our model, which practically would be the name of our model file without the extension ( i.e. our model file here would be called bisse.Rev):

model = "bisse"

We also initialise our moves and monitors for the MCMC:

moves    = VectorMoves()
monitors = VectorMonitors()


BiSSE Model

Diversification Rates

Now we can specify the BiSSE model, which we will of course do in our bisse file. We start with our two diversification rate parameters, speciation (lambda) and extinction (mu). Since we don’t (or shouldn’t?) have any prior expectation that the rates between the two states are different, we can assign them the same priors, thus we can use a for loop to assign them (and the respective moves).

for (i in 1:NUM_STATES) {

    speciation[i] ~ dnExponential(1)
    moves.append( mvScale(speciation[i],lambda=0.20,tune=true,weight=3.0) )

    extinction[i] ~ dnExponential(1)
    moves.append( mvScale(extinction[i],lambda=0.20,tune=true,weight=3.0) )
    
}

We might also be interested in the net-diversification, and instead of calculating it after the fact, we can specify it now as a deterministic node (in the same loop):

diversification[i] := speciation[i] - extinction[i]

Note that our net diversification rate is part of the model now, but since it doesn’t have any moves assigned to it, it won’t be optimised during the MCMC, it’s merely a number that’s being calculated each time and being kept track of.

Sometimes an MCMC can struggle to find a good starting point, in which case we can set a starting value on the prior that we think should be in a reasonable range. A common way of doing it is to set that of lambda to an expected number of speciation events given the number of taxa and the age of the tree (we’re using the total number of primates here) [though some people use tree length instead of age (and without ln)], and that of mu to an order of magnitude lower than lambda:

speciation[i].setValue( ln(extant_taxa/2.0) / tree.rootAge() )
extinction[i].setValue( speciation[i]/10.0 )

If we want, we can look at our parameters now (if we plop that loop and settings into the console first, that is) and see that each of those variables is now a vector of two values:

speciation
extinction
diversification  # for completeness

If we use binary states now, Rev will assume speciation[1] to be the rate for trait state 0, and speciation[2] for trait state 1 respectively (and the same for all the others too).


Transition Rates

Now we also need transition_rates at which the trait states evolve from one to the other and back. A good way to think about a prior for this is to think how many changes we might expect across the whole tree (.treeLength calculates the sum of all branch lengths). Let’s say 10, so we can specify an exponential prior that’s got the corresponding rate that leads to an expected mean of 10 transitions:

rate_pr := tree.treeLength() / 10

We can just add them to the same loop above, after all, there’s as many as there are states, one from 0 to 1 and one from 1 to 0:

transition_rates[i] ~ dnExp(rate_pr)
moves.append( mvScale(transition_rates[i], lambda=0.20, tune=true, weight=3.0) )

Finally, similarly to before in the Biogeography case, we need to turn these rates into a transition matrix:

rate_matrix := fnFreeK( transition_rates, rescaled=false)

Notably, we put that bit outside of that loop. We set rescaled = false, because we don’t want an average rate of 1 as for molecular evolution, but we want the rates to be in the same time-scale as the diversification rates.


Root State

As with any trait evolving on a tree, everything changes perspective depending on which trait state we’ve started off with. Thus we have the possibility to put a prior on the state frequency at the root as well. By default we can let BiSSE estimate those too, and give it a prior that makes both equally frequent (as we’ll do here). But we could also set the prior to different frequencies if we have a sense that’d be appropriate.

root_state_freq ~ dnDirichlet( rep(1, NUM_STATES) )
moves.append( mvDirichletSimplex(root_state_freq, tune=true, weight=2) )

The Dirichlet distribution is a good one to use for this (as it is a continuous multivariate probability distribution), and note that we’re setting a 1 for each trait state in our analysis. It’s also possible to set this prior based on empirical frequencies, or to force them to be equal.


Sampling Fraction

By default, BiSSE and any other diversification model will assume that our phylogeny is completely sampled ( i.e. all species that aren’t in the tree are extinct). But of course we know that’s usually not the case. As mentioned above, we know that we’ve got 367 extant taxa, but only 233 of them are in our tree. So we can easily set the sampling fraction based on those values. Since the number of extant_taxa is a data-specific value, let’s put that into the runfile (to have all settings together):

extant_taxa <- 367

Now we can just do the calculation in our bisse script, so we don’t have to go in there to make changes just because we want to use it on a different data set:

sampling <- num_taxa / extant_taxa


Root Age

Finally, our model depends on the time since the MRCA, so we can just provide the root age of our tree:

root_age <- tree.rootAge()

As before, this is a fixed node because we’re assuming our tree (and thus it’s age) to be known, but obviously, in a different case, we could put a prior on that too and have it estimated jointly.


The Time Tree Distribution

Now we’ve arrived at the beloved state of our analysis where we find the core function to this analysis type, where we enter all our variables and then clamp our data to it. In this case, we can draw from a distribution of trees with traits and trait-dependent diversification rates from a function called dnCDBDP (which stands for distribution of Clade Dependent Birth Death Process). Neatly, this will be the same one for most of our SSE models.

sse ~ dnCDBDP( rootAge           = root_age,
               lambda            = speciation,
               mu                = extinction, 
               Q                 = rate_matrix,
               pi                = root_state_freq,
               rho               = sampling )

Now when we want to clamp our data to it, we gotta think a moment: what is our data here? In fact, unlike for the other models, the data isn’t only the tip states, but also the tree. Thus:

sse.clamp( tree )
sse.clampCharData( data )

At last, we wrap this all up in a model object, and we’re done in our bisse.Rev file:

mymodel = model(sse)


Setting Up the MCMC

Back in our runfile, we can now complete the rest, which means appending our usual monitors for Model and Screen. When specifying the filename for the mnModel output, we can add the name of our current model, group, and dataset in, so if we want to use the script to run a different model with the same data (or vice versa), we won’t overwrite our previous results:

monitors.append( mnModel(filename="output/" + group + "_" + dataset + "_" + model + "_mcmc.log", printgen=1) )
monitors.append( mnScreen(printgen=10, speciation, extinction) )

If you were here last week, you might remember that there are many more monitors that can e.g. also save node states etc., and that’s also true here. We’re not going to use it now, but the online tutorial includes instructions for it. The monitor would be mnJointConditionalAncestralState, and the resulting trace of node states would need to be summarised again after running before it can be visualised on the tree.

Now we can let it initialise and run our MCMC:

mymcmc = mcmc(mymodel, monitors, moves, nruns=2, combine="mixed")
mymcmc.run(generations=5000, tuningInterval=200)

Of course, one thing is still missing here, and that is the point where we let the runfile load our bisse model file. Where should that go?

Definitely after the point where we initialise the moves, but surely before we use the model object in the mcmc. In fact, since we specify the two rates speciation and extinction in the Screen monitor, it should probably also be put before that, so let’s put it between the initialising of the moves and the appending of monitors (as that also is in line with the order we’ve written past analyses). Also here, we can use the object with the saved model name, so if we want to change the model we only need to change that above:

source("src/" + model + ".Rev")

We could add a q() at the end if we wanted Rev to quit after it’s done running, but let’s not do that for now. But let’s add a little message for ourselves that tells us we’ve successfully reached the end:

print("Analysis of " + dataset + " in " + group + " under " + model + " is done!") 

Running the Analysis

Now we can execute the files we’ve written. We open RevBayes and set our working directory as we’re used to now:

rb                                         # or maybe ./rb depending on your system
getwd()
setwd("/mnt/c/Users/oschwery/Documents/Teaching/RevBayes_Workshop_UI2021/4-Diversification/")

Note that our wd has to be on the folder level above data, src, and output, so the way we’ve specified the paths in the files will work out!

Now we source the runfile just as we did the bisse file above:

source("src/runfile.Rev")

Now we should see it go!

Once it is done, we can inspect the log file in Tracer.


MuSSE

Now now we have different data too, which is on primate mating systems. The states are coded as 0 being monogamy, 1 being polygyny, 2 being polygynandry, and 3 being polyadnry.

Now what do we need to change in our files (or what files should we add) to minimise the amount of new code we should write?

In the runfile, we obviously need to make sure the correct data file is loaded, so we change the file-ending appropriately:

dataset = "mating_system"

Also, since we got two more states now, we have to change that as well:

NUM_STATES = 4

And finally, we change the modelname to the new model we’re trying to run:

model = "musse"

But now we need to make sure such a musse.Rev model file actually exists. Let’s just copy the content of bisse.Rev and modify it accordingly.

When we look at the first bit of our code, we see that we don’t really need to change anything, thanks to the fact that we wrote it as a loop, the iterations of which are determined by the number of states (NUM_STATES). So when for BiSSE, this loop cycled twice, and gave us two speciation rates (and extinction, etc.), it will now automatically go round four times and give us priors for four of each rate, as needed.

However, one bit doesn’t scale that way, and that is the transition_rates. For each trait state, we can transition to each other trait state ( e.g. 0 can go 0 -> 1, 0 -> 2, and 0 -> 3, and accordingly for all the others).

Thus we wouldn’t need 4 transition rates, as the initial loop would give us, but 3 for each of the four states, i.e. 4 * 3 = 12. So we can take the code for the transition_rate out of that loop and make a new loop for it according to that logic:

rate_pr := tree.treeLength() / 10

for (i in 1:(NUM_STATES *(NUM_STATES-1))) {
    transition_rates[i] ~ dnExponential(rate_pr)
    moves.append( mvScale(transition_rates[i], lambda=0.20, tune=true, weight=3.0) )
}

# turn this into a rate matrix again
rate_matrix := fnFreeK( transition_rates, rescaled=false)

The rest of the code can stay the same as for BiSSE, since the only thing that changes is the number of possible states for the root_state_freq, but the fact that we also coded that using NUM_STATES takes care of that.

If we take a step back, we realise that the code we wrote can now work for a MuSSE run with any number of trait states, since the loops using NUM_STATES are now always going to yield the right number of priors, moves, etc. That is however, unless we would want our different speciation and/or extinction rates to have different priors (or starting values), instead of the same, although that’s

In fact, if NUM_STATES = 2 as in BiSSE, the loop above gives 2 * (2-1) = 2 transition rates, so if we ran a MuSSE on data with only two trait states, this would just yield BiSSE again (as it should)!


***Update*** MuSSE with Two Binary States

Instead of having a multi-state trait as above, MuSSE can be understood as being used on multiple binary traits. These can most easily be coded by giving each combination of states of the two traits a separate number, thus the states now represent:

# 0 = 00, 1 = 01, 2 = 10, 3 = 11

A main difference is here, that one may want to disallow both traits from changing state at the same time, i.e. 00 has to go through 01 or 10 to get to 11. So we e.g. can’t have transitions 0 -> 3, 1 -> 2, and vice-versa . Thus we can’t just expand the rates to all combinations of trait states as above. Instead, we make a matrix of 0s and then fill in each rate for transitions that are allowed separately:

# we will fill the non-zero elements below
for (i in 1:NUM_STATES) {
  for (j in 1:NUM_STATES) {
    transition_rates[i][j] <- 0.0
  }
}

The result is a 4x4 matrix transition_rates of rates being 0.0. Now we overwrite specific entries with the exponential rate prior. The notation with two squared brackets allows us to acces each entry separately, e.g. transition_rates[x][y] would refer to the entry in row x and column y.

We skip the diagonal that represents transitions of a state to itself ( i.e. [1][1]), as well as those two steps away, which for our coding defined above means we skip [1][4] and [4][1] as well as [2][3] and [3][2]. Btw, these just happen to be both diagonals.

transition_rates[1][2] ~ dnExponential(rate_pr)  # 00 to 01
transition_rates[1][3] ~ dnExponential(rate_pr)  # 00 to 10
transition_rates[2][1] ~ dnExponential(rate_pr)  # 01 to 00
transition_rates[2][4] ~ dnExponential(rate_pr)  # 01 to 11
transition_rates[3][1] ~ dnExponential(rate_pr)  # 10 to 00
transition_rates[3][4] ~ dnExponential(rate_pr)  # 10 to 11
transition_rates[4][2] ~ dnExponential(rate_pr)  # 11 to 01
transition_rates[4][3] ~ dnExponential(rate_pr)  # 11 to 10

Now of course, we also need moves on those, so they will be optimised by the MCMC:

moves.append(mvScale(transition_rates[1][2], lambda=0.20, tune=TRUE, weight=3.0))
moves.append(mvScale(transition_rates[1][3], lambda=0.20, tune=TRUE, weight=3.0))
moves.append(mvScale(transition_rates[2][1], lambda=0.20, tune=TRUE, weight=3.0))
moves.append(mvScale(transition_rates[2][4], lambda=0.20, tune=TRUE, weight=3.0))
moves.append(mvScale(transition_rates[3][1], lambda=0.20, tune=TRUE, weight=3.0))
moves.append(mvScale(transition_rates[3][4], lambda=0.20, tune=TRUE, weight=3.0))
moves.append(mvScale(transition_rates[4][2], lambda=0.20, tune=TRUE, weight=3.0))
moves.append(mvScale(transition_rates[4][3], lambda=0.20, tune=TRUE, weight=3.0))

Notably, we don’t put moves on the ones we set to 0.0, because we want them to stay 0.0 (though they probably wouldn’t move without a prior to pick values from anyway).

Now we turn this into a rate matrix again:

rate_matrix := fnFreeK( transition_rates, rescaled=FALSE)

Note that we can use the same way of setting up the file if we e.g. want to make a multi-state model where not all transitions are allowed, e.g. a stepping-stone model or an ordered character, where state 0 has to transition through 1 and 2 to get to 3, or where some transitions are irreversible, etc.

Also, we would want to use the same tactic if we wanted to not just have 4 independent diversification rates for each state. For example, right now any combination of the two traits could have a higher or lower diversification than the others, but depending on our hypotheses, we could enforce that the effects of the presence of each trait would be additive by setting:

speciation[1]  ~ dnExponential(1)
speciation[2]  ~ dnExponential(1)
speciation[3]  ~ dnExponential(1)
speciation[4] := speciation[2] + speciation[3]  # effect of trait states 11 is 01 + 10

In this case, we would obviously not want to set a move on speciation[4], as this rate is not estimated anymore, but just calculated from the others.


HiSSE (or CID2) [this is for another day, but feel free to think about it]

Now we might feel suspicious about our BiSSE-result from the activity_period data, but still feel there might be something to it, so maybe activity period does have an influence, but it’s not the only one?

So we’re using HiSSE on the same data, but with 4 hidden states, to see if we find something enlightening.

Once again, we make new files for the model and modify our runfile. What do we do?

[Note that I’ve vastly simplified the way the hidden states are set up here as opposed to the online tutorial…]