Fit predefined multivariate Ornstein-Uhlenbeck models to multivariate evolutionary sequence (time-series) data.

Function to find maximum likelihood solutions to a large suite of predefined multivariate Ornstein-Uhlenbeck model fitted to multivariate evolutionary sequence (time-series) data.

Usage

fit.multivariate.OU(
  yy,
  A.matrix = "diag",
  R.matrix = "symmetric",
  method = "Nelder-Mead",
  hess = FALSE,
  pool = TRUE,
  trace = FALSE,
  iterations = NULL,
  iter.sd = NULL,
  user.init.diag.A = NULL,
  user.init.diag.R = NULL,
  user.init.off.diag.A = NULL,
  user.init.off.diag.R = NULL,
  user.init.theta = NULL,
  user.init.anc = NULL
)

Arguments

yy: a multivariate evoTS object.
A.matrix: the pull matrix. The options are "diag", "upper.tri", "lower.tri", and "full". Default is "diag". See details (or vignette) for more info on what the different options mean.
R.matrix: the drift matrix. The options are "diag" and "symmetric".
method: optimization method, passed to function optim. Default is "Nelder-Mead".
hess: logical, indicating whether to calculate standard errors from the Hessian matrix.
pool: indicating whether to pool variances across samples
trace: logical, indicating whether information on the progress of the optimization is printed.
iterations: the number of times the optimization method is run from different starting points. Default is NULL, meaning the optimization is run once.
iter.sd: defines the standard deviation of the Gaussian distribution from which starting values for the optimization routine is run. Default is 1.
user.init.diag.A: starting values for the optimization routine of the diagonal elements of the A matrix. Default is NULL.
user.init.diag.R: starting values for the optimization routine of the diagonal elements of the R matrix. Default is NULL.
user.init.off.diag.A: starting values for the optimization routine of the off-diagonal elements of the A matrix. Default is NULL.
user.init.off.diag.R: starting values for the optimization routine of the off-diagonal elements of the R matrix. Default is NULL.
user.init.theta: starting values for the optimization routine of the optima. Default is NULL.
user.init.anc: starting values for the optimization routine of the ancestral values. Default is NULL.

Value

First part of the output reports the log-likelihood of the model and its AICc score. The second part of the output is the maximum log-likelihood model parameters (ancestral.values, optima, A, and R). The half-life is also provided, which is the The last part of the output gives information about the number of parameters in the model (K), number of samples in the data (n) and number of times the optimization routine was run (iter).

Details

A detailed explanation of the predefined models that can be fitted using the function is given in the online vignette (https://klvoje.github.io/evoTS/index.html), but a short summary is provided here. Note that this function provides the user with fixed options for how to parameterize the A and R matrices. For full flexibility, the user is allowed to customize the parameterization of the A and R matrix in the 'fit.multivariate.OU.user.defined' function. The type of trait dynamics is defined based on how the pull matrix (A) and drift matrix (R) are defined. The function allows testing four broad categories of models: 1 Independent evolution (A.matrix ="diag", R.matrix = "diag"); 2 Independent adaptation (A.matrix ="diag", R.matrix = "symmetric"); 3 Non-independent adaptation (A.matrix = "upper.tri"/"lower.tri"/full", R.matrix = "diagonal"); 4 Non-independent evolution (A.matrix = "upper.tri"/"lower.tri"/"full", R.matrix = "symmetric"). Setting the A.matrix to "diagonal" means the traits do not affect each others optimum (A matrix). A "diagonal" R matrix means the stochastic changes in the traits are assumed to be uncorrelated. A "symmetric" R matrix means the stochastic changes in the traits are assumed to be correlated, i.e. that they are non-independent. A "full" parameterization of A estimates the effect of each trait on the optima on the other traits. The "upper.tri" option parameterize the model in such a way that the first layer (first trait in the data set) adapts non-independently because its optimum is affected by all other traits included in the data set, while the bottom layer (the last trait in the data set) adapts independently (as an Ornstein Uhlenbeck process). Layers in between the upper- and lower layer (not the first or last trait in the data set (if there are more than two traits in the data set)) evolve non-independently as their optimum is affected by all layers/traits below themselves. The option "lower.tri" defines the causality the opposite way compared to "upper.tri". It is also possible to implement a model where the bottom layer (last trait in the data set) evolve as an Unbiased random walk (akin to a Brownian motion) which affects the optima for all other traits in the data set (i.e. all layers except the bottom layer). This model can be fitted by defining A.matrix ="OUBM", which will override how the R matrix is defined.

The function searches - using an optimization routine - for the maximum-likelihood solution for the chosen multivariate Ornstein-Uhlenbeck model. The argument 'method' is passed to the 'optim' function and is included for the convenience of users to better control the optimization routine. Note that the the default method (Nelder-Mead) seems to work for most evolutionary sequences. The method L-BFGS-B allows box-constraints on some parameters (e.g. non-negative variance parameters) and is faster than Nelder-Mead, but is less stable than the default method (Nelder-Mead).

Initial estimates to start the optimization come from maximum-likelihood estimates of the univariate Ornstein-Uhlenbeck model (from the paleoTS package) fitted to each time-series separately.

It is good practice to repeat any numerical optimization procedure from different starting points. This is especially important for complex models as the log-likelihood surface might contain more than one peak. The number of iterations is controlled by the argument 'iterations'. The function will report the model parameters from the iteration with the highest log-likelihood.

There is no guarantee that the likelihood can be computed with the initial parameters provided by the function. The starting values for fitting the multivariate OU model are based on maximum likelihood parameter estimates for the univariate OU model fitted to each trait separately, which seems to provide sensible (and working) initial parameter estimates for almost all tested data sets. However, the provided initial parameters may fail depending on the nature of the data. If an error message is returned saying "function cannot be evaluated at initial parameters", the user can try to start the optimization procedure from other initial parameter values using "user.init.diag.A", "user.init.diag.R", "user.init.off.diag.A", "user.init.off.diag.R", "user.init.theta", and "user.init.anc." It is usually the initial guess of the off-diagonal elements of the A and R matrices that prevents the optimization routine to work. It is therefore recommended to only try to change these initial values before experimenting with different starting values for the diagonal of the A and R matrices.

Note

The models have been implemented to be compatible with the joint parameterization routine in the package paleoTS. The optimization is therefore fit using the actual sample values, with the autocorrelation among samples accounted for in the log-likelihood function. The joint distribution of sample means is multivariate normal, with means and variance-covariances determined by evolutionary parameters and sampling errors.

References

Reitan, T., Schweder, T. & Henderiks, J. Phenotypic evolution studied by layered stochastic differential equations. Ann Appl Statistics 6, 1531–1551 (2012).

Bartoszek, K., Pienaar, J., Mostad, P., Andersson, S. & Hansen, T. F. A phylogenetic comparative method for studying multivariate adaptation. J Theor Biol 314, 204–215 (2012).

Clavel, J., Escarguel, G. & Merceron, G. mvmorph: an r package for fitting multivariate evolutionary models to morphometric data. Methods Ecol Evol 6, 1311–1319 (2015).

Author

Kjetil Lysne Voje

Examples


## Generate a evoTS object by simulating a multivariate dataset
x <- sim.multi.OU(15)

# \donttest{
##Fit a multivariate Ornstein-Uhlenbeck model to the data. This example will run for a long time.
fit.multivariate.OU(x, A.matrix="diag", R.matrix="symmetric")
#> $converge
#> [1] "Model converged successfully"
#> 
#> $modelName
#> [1] "Multivariate model: OU; diagonal A matrix, symmetric R matrix"
#> 
#> $logL
#> [1] 25.30772
#> 
#> $AICc
#> [1] 3.384557
#> 
#> $ancestral.values
#> [1]  0.00289230 -0.01889143
#> 
#> $SE.anc
#> [1] NA
#> 
#> $optima
#> [1] 3.036543 1.948427
#> 
#> $SE.optima
#> [1] NA
#> 
#> $A
#>          [,1]     [,2]
#> [1,] 7.341044 0.000000
#> [2,] 0.000000 7.036698
#> 
#> $SE.A
#> [1] NA
#> 
#> $half.life
#> [1] 0.09442079 0.09850461
#> 
#> $R
#>              [,1]         [,2]
#> [1,] 0.0001036816 0.0003064294
#> [2,] 0.0003064294 0.0101301644
#> 
#> $SE.R
#> [1] NA
#> 
#> $method
#> [1] "Joint"
#> 
#> $K
#> [1] 9
#> 
#> $n
#> [1] 15
#> 
#> $iter
#> [1] NA
#> 
#> attr(,"class")
#> [1] "evoTSmvFit"
# }