Package rSymPy
offers a nice possibility to do symbolic computation in R. In this post I want to suggest an alternative to it, which helped me out to derive some complex expressions of conditional moments for multivariate normal distribution.
rSymPy
was simply too slow for my tasks, so I remembered computer algebra system - Maxima.
What is also nice, that this option also produces Tex
output.
In this post I will provide some instructions for Windows users. The first step is to download and install wxMaxima.
I wrote a function, which feeds commands into Maxima, executes them, returns strings, that can be evaluated in R, and allows the user to get Tex
expressions.
Function
require(magrittr)
#' Execute lines in Maxima
#' @param obj - intermediate Maxima calculations
#' @param out - output from Maxima
#' @param pathm - path to maxima.bat
#' @param tex - include tex expressions
#' @return returns list, first element is a list of strings, that can be evaluated in R.
#' second element - list of tex expression
runMaxima <- function(obj, out, pathm = "C:\\maxima-5.38.1\\bin\\maxima", tex=TRUE){
com <- c(paste0(obj, " ; "), paste0(out, " ; "))%>%paste0( collapse = "")
# execute lines in maxima
a <- shell(paste0('echo ', com , ' | "', pathm, '"'), intern=TRUE)
## form output blocks, they are separated by (%o
# start of block
start <- c(1, grep("\\(%o",a) )%>%unique # output in Maxima begins with lines "(%o\\d)"
# end of block
end <- start-1
end <- c(end[-1],length(a))
# form output blocks
aa <- lapply(1:length(start), function(i) a[start[i]:end[i]])
## search for tex output
# in Maxima tex output begins with $$ and ends with $$
aa2 <- aa[sapply(aa, function(x) x%>%grepl("^\\$\\$|\\$\\$$", .)%>%any)]
aa2 <- lapply(aa2, function(x)x%>%grep("false|done|new", ., value=TRUE, invert=TRUE)%>%paste0(., collapse = ""))
#search the blocks that belong to output
aa <- aa[sapply(aa, function(x) x%>%grepl("\\$$",.)%>%any)] #in Maxima the last output of line is given by $
aa <- aa[sapply(aa, function(x)!(x%>%grepl("^\\$\\$|\\$\\$$", .)%>%any))] #remove tex blocks, they begin with $, remove $ and spaces
aa <- lapply(aa, function(x)x%>%grep("done|new|false", ., invert=TRUE, value=TRUE)%>%paste0(., collapse = "")%>%gsub("( )|\\$$","",.)) #remove some unnec. output
names(aa) <- 1:length(aa)
if (length(aa2)>0) names(aa2) <- 1:length(aa2)
list(expr=aa, tex=aa2)
}
Example
Suppose for some reason, one needs to get expressions of trivariate normal distribution. First lets form variance-covariance matrix for usage in Maxima. R output:
library(magrittr)
Define number of random variables:
neq <- 3
Maxima commands
Form symbolic variance-covariance matrix:
t1 <- neq%>%"^"(2)%>%rep("sigma", .)
t2 <- neq%>%rep(1:., .)
t3 <- t2%>%sort
t4 <- neq%>%"^"(2)%>%rep("rho[", .)
sigma <- paste0(t1, "[", t3 , "]*", t1,"[",t2 ,"]*", t4, t3,",", t2, "]" )
sigma <- matrix(sigma, neq, neq, byrow = TRUE)
diag(sigma) %<>% gsub("\\*sigma\\[\\d{1,2}\\]\\*rho\\[\\d{1,2},\\d{1,2}\\]","**2", .)
sigma[lower.tri(sigma)] <- sort(sigma[upper.tri(sigma)])
X1 | X2 | X3 |
---|---|---|
sigma[1]**2 | sigma[1]sigma[2]rho[1,2] | sigma[1]sigma[3]rho[1,3] |
sigma[1]sigma[2]rho[1,2] | sigma[2]**2 | sigma[2]sigma[3]rho[2,3] |
sigma[1]sigma[3]rho[1,3] | sigma[2]sigma[3]rho[2,3] | sigma[3]**2 |
Maxima command for variance-covariance matrix:
spy <- paste0(sapply(1:neq, function(i) paste0("[", paste0("",sigma[i,],"", collapse = ", "), "]")), collapse = ",")
spy <- paste0("Sigma : matrix(", spy, ")")
Random variable vector:
ym <- 1:neq%>%paste0("[y[", .,"]]")%>%paste0(., collapse=", ")%>%paste0("y: matrix(", ., ")")
Vector of means:
mum <- 1:neq%>%paste0("[mu[", .,"]]")%>%paste0(., collapse=", ")%>%paste0("mu: matrix(", ., ")")
Determinant of variance-covariance matrix, function ratsimp
simplifies an expression:
detm <- "dets : ratsimp(determinant(Sigma))"
Inverse of variance-covariance matrix:
invm <- "invs : ratsimp(invert(Sigma))"
Exponent in multivariate normal distribution:
expinside <- "inside: ratsimp(transpose(y-mu).invert(Sigma).(y-mu))"
Function for multivariate normal distribution:
MNf <- paste0("phi(x, mu, var):= 1/(sqrt((2*%pi)**",neq,"*ratsimp(determinant(var))))*%e**(-1/2*ratsimp(transpose(x-mu).invert(var).(x-mu)))")
Get symbolic expression of MN with vector y, mu and matrix Sigma:
expr <- "expt : phi(y, mu, Sigma)"
Vector with of strings that will be evaluated in Maxima:
obj <- c(ym, mum, spy, detm, invm, expinside, MNf, expr)
Additional commands for Maxima to get evaluable expressions (grind function produces output which can be evaluated), print(new) is simply a place holder, :
out <- c('print(new)', 'grind(dets)', 'print(new)', 'grind(invs)', 'print(new)', 'grind(expt)', 'print(new)', 'grind(inside)')
“’’i(\d)” takes object from the input supplied to Maxima and tex
function produces Tex
output:
ind <- c(4, 5, 6, 8)
out[ind]
## [1] "grind(invs)" "print(new)" "grind(expt)" "grind(inside)"
out <- c(out, paste0("tex (''%i", ind,")"))
Run Maxima:
res <- runMaxima(obj, out)
Some Tex output.
Determinant of \(\Sigma\), `r res$tex[[1]]`
:
\[\left(-\sigma_{1}^2\,\sigma_{2}^2\,\rho_{2,3}^2+2\,\sigma_{1}^2\, \rho_{1,2}\,\rho_{1,3}\,\sigma_{2}^2\,\rho_{2,3}+\left(-\sigma_{1}^2 \,\rho_{1,3}^2-\sigma_{1}^2\,\rho_{1,2}^2+\sigma_{1}^2\right)\, \sigma_{2}^2\right)\,\sigma_{3}^2\]
Inverse of \(\Sigma\), `r res$tex[[2]]`
:
\[\begin{bmatrix}{\frac{\rho_{2,3}^2-1}{\sigma_{1}^2\,\rho_{2,3}^2-2\, \sigma_{1}^2\,\rho_{1,2}\,\rho_{1,3}\,\rho_{2,3}+\sigma_{1}^2\,\rho _{1,3}^2+\sigma_{1}^2\,\rho_{1,2}^2-\sigma_{1}^2}}&-{\frac{\rho_{1,3}\, \rho_{2,3}-\rho_{1,2}}{\sigma_{1}\,\sigma_{2}\,\rho_{2,3}^2-2\, \sigma_{1}\,\rho_{1,2}\,\rho_{1,3}\,\sigma_{2}\,\rho_{2,3}+\left( \sigma_{1}\,\rho_{1,3}^2+\sigma_{1}\,\rho_{1,2}^2-\sigma_{1}\right) \,\sigma_{2}}}&-{\frac{\rho_{1,2}\,\rho_{2,3}-\rho_{1,3}}{\left( \sigma_{1}\,\rho_{2,3}^2-2\,\sigma_{1}\,\rho_{1,2}\,\rho_{1,3}\,\rho _{2,3}+\sigma_{1}\,\rho_{1,3}^2+\sigma_{1}\,\rho_{1,2}^2-\sigma_{1} \right)\,\sigma_{3}}}\cr -{\frac{\rho_{1,3}\,\rho_{2,3}-\rho_{1,2}}{ \sigma_{1}\,\sigma_{2}\,\rho_{2,3}^2-2\,\sigma_{1}\,\rho_{1,2}\,\rho _{1,3}\,\sigma_{2}\,\rho_{2,3}+\left(\sigma_{1}\,\rho_{1,3}^2+\sigma _{1}\,\rho_{1,2}^2-\sigma_{1}\right)\,\sigma_{2}}}&{\frac{\rho_{1,3}^2-1 }{\sigma_{2}^2\,\rho_{2,3}^2-2\,\rho_{1,2}\,\rho_{1,3}\,\sigma _{2}^2\,\rho_{2,3}+\left(\rho_{1,3}^2+\rho_{1,2}^2-1\right)\,\sigma _{2}^2}}&{\frac{\rho_{2,3}-\rho_{1,2}\,\rho_{1,3}}{\left(\sigma_{2} \,\rho_{2,3}^2-2\,\rho_{1,2}\,\rho_{1,3}\,\sigma_{2}\,\rho_{2,3}+ \left(\rho_{1,3}^2+\rho_{1,2}^2-1\right)\,\sigma_{2}\right)\,\sigma _{3}}}\cr -{\frac{\rho_{1,2}\,\rho_{2,3}-\rho_{1,3}}{\left(\sigma_{1 }\,\rho_{2,3}^2-2\,\sigma_{1}\,\rho_{1,2}\,\rho_{1,3}\,\rho_{2,3}+ \sigma_{1}\,\rho_{1,3}^2+\sigma_{1}\,\rho_{1,2}^2-\sigma_{1}\right) \,\sigma_{3}}}&{\frac{\rho_{2,3}-\rho_{1,2}\,\rho_{1,3}}{\left( \sigma_{2}\,\rho_{2,3}^2-2\,\rho_{1,2}\,\rho_{1,3}\,\sigma_{2}\,\rho _{2,3}+\left(\rho_{1,3}^2+\rho_{1,2}^2-1\right)\,\sigma_{2}\right)\, \sigma_{3}}}&{\frac{\rho_{1,2}^2-1}{\left(\rho_{2,3}^2-2\,\rho_{1,2} \,\rho_{1,3}\,\rho_{2,3}+\rho_{1,3}^2+\rho_{1,2}^2-1\right)\,\sigma _{3}^2}}\cr \end{bmatrix}\]
Exponent, `r res$tex[[3]]`
:
\[\frac{\left(\sigma_{1}^2\,\rho_{1,2}^2-\sigma_{1}^2\right)\,\sigma_{2}^ 2\,y_{3}^2+\left(\left(\left(2\,\sigma_{1}^2\,\sigma_{2}\,y_{2}+ \left(2\,\mu_{1}\,\sigma_{1}-2\,\sigma_{1}\,y_{1}\right)\,\rho_{1,2} \,\sigma_{2}^2-2\,\sigma_{1}^2\,\mu_{2}\,\sigma_{2}\right)\,\rho_{2, 3}-2\,\sigma_{1}^2\,\rho_{1,2}\,\rho_{1,3}\,\sigma_{2}\,y_{2}+\left( 2\,\sigma_{1}\,y_{1}-2\,\mu_{1}\,\sigma_{1}\right)\,\rho_{1,3}\, \sigma_{2}^2+2\,\sigma_{1}^2\,\rho_{1,2}\,\rho_{1,3}\,\mu_{2}\, \sigma_{2}\right)\,\sigma_{3}+\left(2\,\sigma_{1}^2-2\,\sigma_{1}^2 \,\rho_{1,2}^2\right)\,\sigma_{2}^2\,\mu_{3}\right)\,y_{3}+\left( \left(y_{1}^2-2\,\mu_{1}\,y_{1}+\mu_{1}^2\right)\,\sigma_{2}^2\,\rho _{2,3}^2+\left(\left(2\,\mu_{1}\,\sigma_{1}-2\,\sigma_{1}\,y_{1} \right)\,\rho_{1,3}\,\sigma_{2}\,y_{2}+\left(2\,\sigma_{1}\,y_{1}-2 \,\mu_{1}\,\sigma_{1}\right)\,\rho_{1,3}\,\mu_{2}\,\sigma_{2}\right) \,\rho_{2,3}+\left(\sigma_{1}^2\,\rho_{1,3}^2-\sigma_{1}^2\right)\,y _{2}^2+\left(\left(2\,\sigma_{1}\,y_{1}-2\,\mu_{1}\,\sigma_{1} \right)\,\rho_{1,2}\,\sigma_{2}+\left(2\,\sigma_{1}^2-2\,\sigma_{1}^ 2\,\rho_{1,3}^2\right)\,\mu_{2}\right)\,y_{2}+\left(-y_{1}^2+2\,\mu _{1}\,y_{1}-\mu_{1}^2\right)\,\sigma_{2}^2+\left(2\,\mu_{1}\,\sigma _{1}-2\,\sigma_{1}\,y_{1}\right)\,\rho_{1,2}\,\mu_{2}\,\sigma_{2}+ \left(\sigma_{1}^2\,\rho_{1,3}^2-\sigma_{1}^2\right)\,\mu_{2}^2 \right)\,\sigma_{3}^2+\left(\left(-2\,\sigma_{1}^2\,\sigma_{2}\,y_{2 }+\left(2\,\sigma_{1}\,y_{1}-2\,\mu_{1}\,\sigma_{1}\right)\,\rho_{1, 2}\,\sigma_{2}^2+2\,\sigma_{1}^2\,\mu_{2}\,\sigma_{2}\right)\,\rho_{ 2,3}+2\,\sigma_{1}^2\,\rho_{1,2}\,\rho_{1,3}\,\sigma_{2}\,y_{2}+ \left(2\,\mu_{1}\,\sigma_{1}-2\,\sigma_{1}\,y_{1}\right)\,\rho_{1,3} \,\sigma_{2}^2-2\,\sigma_{1}^2\,\rho_{1,2}\,\rho_{1,3}\,\mu_{2}\, \sigma_{2}\right)\,\mu_{3}\,\sigma_{3}+\left(\sigma_{1}^2\,\rho_{1,2 }^2-\sigma_{1}^2\right)\,\sigma_{2}^2\,\mu_{3}^2}{\left(\sigma _{1}^2\,\sigma_{2}^2\,\rho_{2,3}^2-2\,\sigma_{1}^2\,\rho_{1,2}\,\rho _{1,3}\,\sigma_{2}^2\,\rho_{2,3}+\left(\sigma_{1}^2\,\rho_{1,3}^2+ \sigma_{1}^2\,\rho_{1,2}^2-\sigma_{1}^2\right)\,\sigma_{2}^2\right) \,\sigma_{3}^2}\]
Density, `r res$tex[[4]]`
:
\[\frac{e^ {- {\frac{\left(\sigma_{1}^2\,\rho_{1,2}^2-\sigma_{1}^2\right)\, \sigma_{2}^2\,y_{3}^2+\left(\left(\left(2\,\sigma_{1}^2\,\sigma_{2} \,y_{2}+\left(2\,\mu_{1}\,\sigma_{1}-2\,\sigma_{1}\,y_{1}\right)\, \rho_{1,2}\,\sigma_{2}^2-2\,\sigma_{1}^2\,\mu_{2}\,\sigma_{2}\right) \,\rho_{2,3}-2\,\sigma_{1}^2\,\rho_{1,2}\,\rho_{1,3}\,\sigma_{2}\,y _{2}+\left(2\,\sigma_{1}\,y_{1}-2\,\mu_{1}\,\sigma_{1}\right)\,\rho _{1,3}\,\sigma_{2}^2+2\,\sigma_{1}^2\,\rho_{1,2}\,\rho_{1,3}\,\mu_{2 }\,\sigma_{2}\right)\,\sigma_{3}+\left(2\,\sigma_{1}^2-2\,\sigma_{1} ^2\,\rho_{1,2}^2\right)\,\sigma_{2}^2\,\mu_{3}\right)\,y_{3}+\left( \left(y_{1}^2-2\,\mu_{1}\,y_{1}+\mu_{1}^2\right)\,\sigma_{2}^2\,\rho _{2,3}^2+\left(\left(2\,\mu_{1}\,\sigma_{1}-2\,\sigma_{1}\,y_{1} \right)\,\rho_{1,3}\,\sigma_{2}\,y_{2}+\left(2\,\sigma_{1}\,y_{1}-2 \,\mu_{1}\,\sigma_{1}\right)\,\rho_{1,3}\,\mu_{2}\,\sigma_{2}\right) \,\rho_{2,3}+\left(\sigma_{1}^2\,\rho_{1,3}^2-\sigma_{1}^2\right)\,y _{2}^2+\left(\left(2\,\sigma_{1}\,y_{1}-2\,\mu_{1}\,\sigma_{1} \right)\,\rho_{1,2}\,\sigma_{2}+\left(2\,\sigma_{1}^2-2\,\sigma_{1}^ 2\,\rho_{1,3}^2\right)\,\mu_{2}\right)\,y_{2}+\left(-y_{1}^2+2\,\mu _{1}\,y_{1}-\mu_{1}^2\right)\,\sigma_{2}^2+\left(2\,\mu_{1}\,\sigma _{1}-2\,\sigma_{1}\,y_{1}\right)\,\rho_{1,2}\,\mu_{2}\,\sigma_{2}+ \left(\sigma_{1}^2\,\rho_{1,3}^2-\sigma_{1}^2\right)\,\mu_{2}^2 \right)\,\sigma_{3}^2+\left(\left(-2\,\sigma_{1}^2\,\sigma_{2}\,y_{2 }+\left(2\,\sigma_{1}\,y_{1}-2\,\mu_{1}\,\sigma_{1}\right)\,\rho_{1, 2}\,\sigma_{2}^2+2\,\sigma_{1}^2\,\mu_{2}\,\sigma_{2}\right)\,\rho_{ 2,3}+2\,\sigma_{1}^2\,\rho_{1,2}\,\rho_{1,3}\,\sigma_{2}\,y_{2}+ \left(2\,\mu_{1}\,\sigma_{1}-2\,\sigma_{1}\,y_{1}\right)\,\rho_{1,3} \,\sigma_{2}^2-2\,\sigma_{1}^2\,\rho_{1,2}\,\rho_{1,3}\,\mu_{2}\, \sigma_{2}\right)\,\mu_{3}\,\sigma_{3}+\left(\sigma_{1}^2\,\rho_{1,2 }^2-\sigma_{1}^2\right)\,\sigma_{2}^2\,\mu_{3}^2}{2\,\left( \sigma_{1}^2\,\sigma_{2}^2\,\rho_{2,3}^2-2\,\sigma_{1}^2\,\rho_{1,2} \,\rho_{1,3}\,\sigma_{2}^2\,\rho_{2,3}+\left(\sigma_{1}^2\,\rho_{1,3 }^2+\sigma_{1}^2\,\rho_{1,2}^2-\sigma_{1}^2\right)\,\sigma_{2}^2 \right)\,\sigma_{3}^2}} }}{2^{\frac{3}{2}}\,\pi^{\frac{3}{2 }}\,\sqrt{-\sigma_{1}^2\,\sigma_{2}^2\,\rho_{2,3}^2+2\,\sigma_{1}^2 \,\rho_{1,2}\,\rho_{1,3}\,\sigma_{2}^2\,\rho_{2,3}+\left(-\sigma_{1} ^2\,\rho_{1,3}^2-\sigma_{1}^2\,\rho_{1,2}^2+\sigma_{1}^2\right)\, \sigma_{2}^2}\,\left| \sigma_{3}\right| }\]
Evaluate expression from Maxima in R:
expr <- res$expr[4]%>%paste0("1/(sqrt((2*pi)**",neq,"*",res$expr[1],"))*exp(-1/2*", ., ")")%>%gsub("\\[(\\d)\\]", "[[\\1]]",.)
yy <- replicate(neq, {rnorm(100, mean=4, sd=2)})
sigma <- apply(yy, 2, sd)
rho <- cor(yy)
y <- lapply(1:neq, function(x)yy[,x])
mu <- apply(yy, 2, mean)
r1 <- expr%>%parse(text=.)%>%eval
Comparison with results from other functions and packages:
require(emdbook)
yy%>%dmvnorm(.,mu=mu,Sigma=cov(.))%>%"-"(r1)%>%abs%>%sum
## [1] 1.803223e-16
require(mnormt)
yy%>%dmnorm(.,mean=mu,cov(.))%>%"-"(r1)%>%abs%>%sum
## [1] 1.411284e-16
Solution is not perfect, but it made my life so much easier.