This post is a short R demonstration of how to simulate data from three statistical models. The models differ in that each is usually taught or encountered in a different context, but they share a common core: each involves generating a “latent” normal random variable.
I am struck by the ubiquity of the latent normal random variable. It is useful for:
- Time series data (Model 1)
- Distance-correlated data (Model 2)
- Clustered data (Model 3)
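One way to see the latent normal variable at work is to build a correlated draw by hand from independent standard normals. The sketch below is illustrative only: it uses a Cholesky factor, whereas `mvrnorm()` relies on an eigendecomposition internally, and the small `n` and the seed are arbitrary choices.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only

n <- 5                               # small size chosen for illustration
x <- seq(0, 1, length.out = n)
Sigma <- 0.7 ^ as.matrix(dist(x))    # any valid covariance matrix would do

L <- t(chol(Sigma))                  # lower-triangular factor: Sigma = L %*% t(L)
z <- rnorm(n)                        # latent independent standard normals
y <- drop(L %*% z)                   # correlated draw with covariance Sigma

# Sanity check: the factor reproduces Sigma up to rounding error
max(abs(L %*% t(L) - Sigma))
```

All three models in this post follow this pattern; only the structure of `Sigma` changes.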
library(MASS)
library(ggplot2)
theme_set(theme_bw(base_size = 16))
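Model 1: a random draw from a multivariate normal with a time-series-like correlation matrix, in which the correlation between two points is 0.7 raised to the power of the distance between them: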
n <- 100
x <- seq(0, 1, length.out = n)
Sigma <- 0.7 ^ as.matrix(dist(x))
y <- mvrnorm(1, mu = rep(0, n), Sigma = Sigma)

dat1 <- data.frame(x, y)
ggplot(dat1, aes(x, y)) +
  geom_line() +
  ggtitle("Example Time Series")
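On an evenly spaced grid, the Model 1 correlation `0.7 ^ as.matrix(dist(x))` is the correlation of a stationary AR(1) process, so the same kind of series can also be generated step by step. The following is a minimal sketch under that assumption; `phi` and `Sigma_ar1` are names introduced here for illustration, and the seed is arbitrary.

```r
set.seed(2)                          # arbitrary seed

n <- 100
x <- seq(0, 1, length.out = n)
phi <- 0.7 ^ (x[2] - x[1])           # lag-one correlation implied by the grid spacing

y <- numeric(n)
y[1] <- rnorm(1)                     # stationary start with variance 1
for (t in 2:n) {
  # innovation variance 1 - phi^2 keeps the marginal variance at 1
  y[t] <- phi * y[t - 1] + sqrt(1 - phi^2) * rnorm(1)
}

# The AR(1) correlation phi^|i - j| agrees with the Model 1 matrix
Sigma <- 0.7 ^ as.matrix(dist(x))
Sigma_ar1 <- phi ^ abs(outer(1:n, 1:n, "-"))
max(abs(Sigma - Sigma_ar1))
```

The recursion avoids forming the full `n` by `n` matrix, which matters for long series.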
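Model 2: a random draw from a multivariate normal with the ‘usual’ Gaussian-process correlation matrix, the squared-exponential kernel exp(-d^2), where d is the distance between two points: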
x <- seq(0, 10, length.out = n)
d <- as.matrix(dist(x)) ^ 2
Sigma <- exp(-d)
y <- mvrnorm(1, mu = rep(0, n), Sigma = Sigma)

dat2 <- data.frame(x = x, y = y)
ggplot(dat2, aes(x, y)) +
  geom_line() +
  ggtitle("Example Gaussian Process")
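The kernel in Model 2 has an implicit length-scale. As a rough sketch, the hypothetical helper `sq_exp_cov()` below (not part of MASS or any package) writes the kernel in the common form exp(-d^2 / (2 * ell^2)); under that parameterization, `ell = 1 / sqrt(2)` reproduces the matrix used in the post. The other length-scale values and the seed are arbitrary.

```r
library(MASS)
set.seed(3)                          # arbitrary seed

# Hypothetical helper: squared-exponential covariance with length-scale `ell`
sq_exp_cov <- function(x, ell = 1 / sqrt(2)) {
  d2 <- as.matrix(dist(x)) ^ 2
  exp(-d2 / (2 * ell^2))
}

n <- 100
x <- seq(0, 10, length.out = n)

# The default length-scale matches exp(-d^2) from Model 2
all.equal(sq_exp_cov(x), exp(-as.matrix(dist(x)) ^ 2))

# A shorter length-scale gives a wigglier curve, a longer one a smoother curve
y_short <- mvrnorm(1, mu = rep(0, n), Sigma = sq_exp_cov(x, ell = 0.3))
y_long  <- mvrnorm(1, mu = rep(0, n), Sigma = sq_exp_cov(x, ell = 2))
```

Plotting `y_short` and `y_long` against `x` with `geom_line()` shows the difference in smoothness.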
Model 3: a random draw from a multivariate normal with a diagonal covariance matrix (no correlations), used as one random intercept per group:
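b0 <- 1      # overall intercept
var_b <- 1   # variance of the random intercepts
n_re <- 10   # number of groups
# indicator (design) matrix mapping each observation to its group
D_re <- model.matrix(~ gl(n_re, n / n_re) - 1)
# one independent normal intercept per group
b <- mvrnorm(1, rep(0, n_re), Sigma = var_b * diag(n_re))
# drop() turns the n x 1 matrix into a plain vector
y <- drop(b0 + D_re %*% b + rnorm(n))

dat3 <- data.frame(Factor = gl(n_re, n / n_re), y = y)
ggplot(dat3, aes(x = Factor, y)) +
  geom_point(position = position_jitter(width = 0.1)) +
  ggtitle("Example Random Effect Data")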
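Although the latent intercepts in Model 3 are uncorrelated, the observations are not: integrating out `b` leaves a block-structured covariance, so Model 3 can also be written as a single multivariate normal draw, just like Models 1 and 2. The sketch below assumes the same settings as above (unit residual variance, ten groups of ten); the seed is arbitrary.

```r
library(MASS)
set.seed(4)                          # arbitrary seed

b0 <- 1
var_b <- 1
n <- 100
n_re <- 10
g <- gl(n_re, n / n_re)
D_re <- model.matrix(~ g - 1)

# Marginal covariance: var_b * D D' plus unit residual variance on the diagonal
Sigma <- var_b * D_re %*% t(D_re) + diag(n)

# One draw from the marginal model; same distribution as the two-step simulation
y <- mvrnorm(1, mu = rep(b0, n), Sigma = Sigma)

Sigma[1, 1]    # variance of one observation: var_b + 1
Sigma[1, 2]    # same group: covariance var_b
Sigma[1, 11]   # different groups: covariance 0
```

With these values the within-group correlation is var_b / (var_b + 1) = 0.5.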
In a sense, I might just call all of these “random effects” models, because the glmmTMB package can estimate the parameters in all three situations.
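As a rough illustration of the estimation side, the sketch below simulates clustered data as in Model 3 and fits a random-intercept model with `glmmTMB()`. It assumes the glmmTMB package is installed and skips the fit otherwise; the data frame name `dat` and the seed are choices made here. glmmTMB also documents covariance structures such as `ar1()` and `gau()` that should correspond to Models 1 and 2, but those are not shown.

```r
set.seed(5)                          # arbitrary seed

n <- 100
n_re <- 10
dat <- data.frame(Factor = gl(n_re, n / n_re))
b <- rnorm(n_re, sd = 1)             # latent group intercepts, variance 1
dat$y <- 1 + b[as.integer(dat$Factor)] + rnorm(n)

# Fit only if glmmTMB is available (an assumption about the local setup)
if (requireNamespace("glmmTMB", quietly = TRUE)) {
  fit <- glmmTMB::glmmTMB(y ~ 1 + (1 | Factor), data = dat)
  print(summary(fit))
}
```

The estimated intercept and the random-intercept variance in the summary can be compared with the simulated values of 1 and 1; with only ten groups, the variance estimate can be quite noisy.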