Many packages provide tools for downstream processing (e.g. diagnostic
plots, model comparisons, visualization, and regression tables)
that are compatible with merMod objects. This vignette
gives instructions for some recommended packages, with examples. Rather
than attempting an exhaustive survey, we aimed to include the more
popular packages.
While plot(merMod_object) provides some diagnostic plots
(see ?lme4::plot.merMod), other packages provide more
complete functionality.
This section emphasizes the performance and
DHARMa packages.
library(lme4)
# Example of a linear mixed-effects model (LMM)
mod_ss <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
# Assume a non-mixed model for comparison
mod_ss2 <- lm(Reaction ~ Days, sleepstudy)
# Another (LMM) example, and others for comparison
mod_cw <- lmer(weight ~ Time + Diet + (1 | Chick), data = ChickWeight, REML = FALSE)
mod_cw2 <- lm(weight ~ Time + Diet, data = ChickWeight)
mod_cw3 <- lmer(weight ~ Time + (1 | Chick), data = ChickWeight, REML = FALSE)
# Example of a Poisson generalized linear mixed-effects model (GLMM)
form <- TICKS ~ YEAR + HEIGHT + (1 | BROOD) + (1 | INDEX) + (1 | LOCATION)
mod_gt <- glmer(form, family = "poisson", data = grouseticks)
The performance package
is excellent for evaluating the quality of model fit and works well with
merMod objects (and many others). The first example we’ll
show is performance::check_model(), which checks the
classical assumptions used for most linear models: normality of
residuals, linearity, homogeneity of variance, and the absence of influential outliers.
For mixed-effects models, it also checks for normality of the random effects. The less commonly used posterior predictive check is meant to show whether the chosen distributional family fits the data well.
For those working with mixed models, it may be important to check whether some dimensions of the variance-covariance matrix are estimated to be zero.
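The two checks described above can be sketched as follows. This is a minimal example rather than the exact code behind this vignette's output: it assumes that the logical value printed next comes from performance::check_singularity(), and it guards the plotting call because check_model() relies on the see package for its plots.

```r
library(lme4)

# Same model as mod_ss above
mod_ss <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

# Panel of diagnostic plots (only drawn if the 'see' package is installed)
if (requireNamespace("see", quietly = TRUE)) {
  print(performance::check_model(mod_ss))
}

# TRUE if some variance-covariance dimension is estimated as (near) zero
is_singular <- performance::check_singularity(mod_ss)
is_singular
```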
## [1] FALSE
The DHARMa package uses a simulation-based method for residual checks of generalized linear (mixed) models.
The left plot reports three tests: the Kolmogorov–Smirnov test,
a simulation-based dispersion test (see
?DHARMa::testDispersion), and a simulation-based outlier
test (see ?DHARMa::testOutliers).
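A minimal sketch of how such plots can be produced for the Poisson GLMM defined earlier is given below. The seed and the object name sim_res are our own choices for illustration; the simulated residuals, and therefore the test results, vary from run to run.

```r
library(lme4)

# Same Poisson GLMM as mod_gt above
form <- TICKS ~ YEAR + HEIGHT + (1 | BROOD) + (1 | INDEX) + (1 | LOCATION)
mod_gt <- glmer(form, family = "poisson", data = grouseticks)

set.seed(101)  # arbitrary seed, only to make the simulations reproducible
sim_res <- DHARMa::simulateResiduals(mod_gt)

# Left panel: QQ plot with the three tests; right panel: residuals vs. predicted
plot(sim_res)
```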
The car package
was designed to accompany “An R Companion to Applied Regression” and provides
functions that are applied to a fitted regression model.
car::influencePlot() creates a diagnostic bubble plot that
visualizes influential observations in a regression by plotting
Studentized residuals against leverage (hat values), with bubble size
representing Cook’s distance.
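As a hedged sketch, and assuming that the table printed next was obtained from the sleepstudy model mod_ss, the call could look like this. Computing the influence measures refits the model repeatedly, so it can take a little while.

```r
library(lme4)

# Same model as mod_ss above
mod_ss <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

# Draws the bubble plot and returns the most noteworthy observations
infl_ss <- car::influencePlot(mod_ss)
infl_ss
```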
## StudRes Hat CookD
## 10 1.6589487 0.2931772 0.5707626
## 20 0.4498749 0.2931772 0.0419733
## 57 5.5270053 0.1218763 2.1198891
## 60 -4.5704359 0.2931772 4.3321637
Most model comparisons for merMod objects will focus on
the fixed effects, and the ones shown will be just for fixed effects
unless otherwise stated.
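Before comparing models, it can help to summarize the fit indices of a single model. The sketch below assumes that the single-model table shown next comes from performance::model_performance(); the object name perf_ss is ours.

```r
library(lme4)

# Same model as mod_ss above
mod_ss <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

# Fit indices (AIC, BIC, R2, ICC, RMSE, sigma) for one model
perf_ss <- performance::model_performance(mod_ss)
perf_ss
```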
## # Indices of model performance
##
## AIC | AICc | BIC | R2 (cond.) | R2 (marg.) | ICC | RMSE | Sigma
## ----------------------------------------------------------------------------------
## 1755.628 | 1756.114 | 1774.786 | 0.799 | 0.279 | 0.722 | 23.438 | 25.592
We can also use performance::compare_performance().
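A minimal sketch of such a comparison, using the mixed model and the plain linear model defined earlier, might look as follows. Because the two models belong to different classes, performance may print a note about comparability; the object name cmp_ss is ours.

```r
library(lme4)

# Same models as mod_ss and mod_ss2 above
mod_ss  <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
mod_ss2 <- lm(Reaction ~ Days, sleepstudy)

# One row of fit indices per model
cmp_ss <- performance::compare_performance(mod_ss, mod_ss2)
cmp_ss
```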
## # Comparison of Model Performance Indices
##
## Name | Model | AIC (weights) | AICc (weights) | BIC (weights) | RMSE | Sigma | R2 (cond.)
## ---------------------------------------------------------------------------------------------------
## mod_ss | lmerMod | 1764.0 (>.999) | 1764.5 (>.999) | 1783.1 (>.999) | 23.438 | 25.592 | 0.799
## mod_ss2 | lm | 1906.3 (<.001) | 1906.4 (<.001) | 1915.9 (<.001) | 47.449 | 47.715 |
##
## Name | R2 (marg.) | ICC | R2 | R2 (adj.)
## ------------------------------------------------
## mod_ss | 0.279 | 0.722 | |
## mod_ss2 | | | 0.286 | 0.282
The pbkrtest
package implements three tests to examine fixed effects from linear
mixed-effects models (LMM):
- Parametric bootstrap test: pbkrtest::PBmodcomp()
- Kenward-Roger-type F-test: pbkrtest::KRmodcomp()
- Satterthwaite-type F-test: pbkrtest::SATmodcomp()
fm0 <- lmer(weight ~ Time + (1|Chick), data = ChickWeight)
fm1 <- update(fm0, .~.-Time)
pbkrtest::PBmodcomp(fm0, fm1)
## Bootstrap test; time: 21.32 sec; samples: 1000; extremes: 0;
## large : weight ~ Time + (1 | Chick)
## stat df p.value
## LRT 926.47 1 < 2.2e-16 ***
## PBtest 926.47 0.000999 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## large : weight ~ Time + (1 | Chick)
## small : weight ~ (1 | Chick)
## stat ndf ddf F.scaling p.value
## Ftest 2471.07 1.00 530.53 1 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## large : weight ~ Time + (1 | Chick)
## small : weight ~ (1 | Chick)
## statistic ndf ddf p.value
## [1,] 2471.7 1.0 530.93 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
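Only the parametric bootstrap call is shown in the code above. The second and third outputs presumably come from the two F-tests, which can be sketched with the same pair of nested models as follows (the object names kr_test and sat_test are ours):

```r
library(lme4)

# Same nested models as fm0 (large) and fm1 (small) above
fm0 <- lmer(weight ~ Time + (1 | Chick), data = ChickWeight)
fm1 <- update(fm0, . ~ . - Time)

# Kenward-Roger-type F-test
kr_test <- pbkrtest::KRmodcomp(fm0, fm1)
kr_test

# Satterthwaite-type F-test
sat_test <- pbkrtest::SATmodcomp(fm0, fm1)
sat_test
```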
Note: the pbkrtest::KRmodcomp() function is
intended for lmerMod objects only, as the
Kenward-Roger-type F-test was designed for fitted linear mixed
models.
The MuMIn package provides tools for model selection, automating the process by fitting and ranking subsets of the maximal (global) model.
## Global model call: lmer(formula = weight ~ Time + Diet + (1 | Chick), data = ChickWeight,
## REML = FALSE)
## ---
## Model selection table
##   (Intrc) Diet  Time df    logLik   AICc  delta weight
## 4   11.23    + 8.718  7 -2802.600 5619.4   0.00  0.996
## 3   27.84      8.726  4 -2811.172 5630.4  11.02  0.004
## 2  101.80    +        6 -3265.144 6542.4 923.04  0.000
## 1  121.00             3 -3274.405 6554.9 935.46  0.000
## Models ranked by AICc(x)
## Random terms (all models):
## 1 | Chick
MuMIn::dredge() doesn’t print the fitted model objects,
so to find which models the labels 1, 2, etc. correspond to, use
get.models(mumin_compare, subset = TRUE). The output tends
to be long and is omitted for space reasons.
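A sketch of how the selection table and the object mumin_compare could be created is given below. It assumes the global model is mod_cw, as the call printed in the table suggests. Note that dredge() refuses to run unless missing values are set to fail, which is why the na.action option is changed first.

```r
library(lme4)

# dredge() requires that the global model was fitted with na.action = na.fail
old_opts <- options(na.action = "na.fail")

# Same global model as mod_cw above
mod_cw <- lmer(weight ~ Time + Diet + (1 | Chick), data = ChickWeight, REML = FALSE)

# Fit and rank all subsets of the fixed effects (ranked by AICc by default)
mumin_compare <- MuMIn::dredge(mod_cw)
mumin_compare

# Recover the fitted models behind the row labels
mumin_models <- MuMIn::get.models(mumin_compare, subset = TRUE)

options(old_opts)  # restore the previous na.action setting
```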
A common way to test the significance of random effects is through a
likelihood ratio test. The RLRsim package includes
fast simulation-based exact tests that are compatible with
lmerMod objects.
# The first model (m) includes the random effect;
# the second (m0) must be an lm object.
RLRsim::exactLRT(m = mod_cw, m0 = mod_cw2)
## No restrictions on fixed effects. REML-based inference preferable.
##
## simulated finite sample distribution of LRT. (p-value based on 10000 simulated values)
##
## data:
## LRT = 172.41, p-value < 2.2e-16
The multcomp package was designed for testing general linear hypotheses in parametric models.
##
## Simultaneous Tests for General Linear Hypotheses
##
## Fit: lmer(formula = weight ~ Time + Diet + (1 | Chick), data = ChickWeight,
## REML = FALSE)
##
## Linear Hypotheses:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) == 0 11.2311 5.5780 2.013 0.17828
## Time == 0 8.7175 0.1753 49.742 < 0.001 ***
## Diet2 == 0 16.2193 9.0788 1.787 0.28082
## Diet3 == 0 36.5527 9.0788 4.026 < 0.001 ***
## Diet4 == 0 30.0255 9.0855 3.305 0.00447 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)
We can also look at confidence intervals for the pairwise differences between groups.
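Both steps can be sketched as follows. The first call assumes that the table above was produced by calling glht() without a linfct argument, which tests each coefficient against zero; the second uses Tukey contrasts on Diet for the pairwise differences. The object names glht_cw, glht_diet and ci_diet are ours.

```r
library(lme4)

# Same model as mod_cw above
mod_cw <- lmer(weight ~ Time + Diet + (1 | Chick), data = ChickWeight, REML = FALSE)

# Simultaneous tests that each fixed-effect coefficient equals zero
glht_cw <- multcomp::glht(mod_cw)
summary(glht_cw)

# All pairwise differences between diets, with simultaneous confidence intervals
glht_diet <- multcomp::glht(mod_cw, linfct = multcomp::mcp(Diet = "Tukey"))
ci_diet <- confint(glht_diet)
ci_diet
```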
car::Anova() can be used to produce traditional ANOVA-style tables
using Wald \(\chi^{2}\) statistics on
the fixed effects only.
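A minimal sketch, assuming the table shown next was computed from mod_cw with the default settings (the object name anova_cw is ours):

```r
library(lme4)

# Same model as mod_cw above
mod_cw <- lmer(weight ~ Time + Diet + (1 | Chick), data = ChickWeight, REML = FALSE)

# Wald chi-square tests for the fixed effects (Type II by default)
anova_cw <- car::Anova(mod_cw)
anova_cw
```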
## Analysis of Deviance Table (Type II Wald chisquare tests)
##
## Response: weight
## Chisq Df Pr(>Chisq)
## Time 2474.247 1 < 2.2e-16 ***
## Diet 20.466 3 0.0001359 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
When interpreting results, we may be interested in seeing
coefficient plots or effect plots. Usually, this requires first
computing predictions from the fitted model and then plotting them (using
plot() or ggplot()).
The dotwhisker package was designed to easily make dot-and-whisker plots from regression results. We scale by two standard deviations, as recommended by Gelman (2008).
# Fixed effects of mod_cw, with confidence intervals
effects_cw <- broom.mixed::tidy(mod_cw, effects = "fixed", conf.int = TRUE)
dotwhisker::dwplot(effects_cw, by_2sd = TRUE)
Notice that we used the broom.mixed package to extract the fixed effects for the dot-and-whisker plot. Details can be found in the broom.mixed documentation or in the broom.mixed section of this vignette.
As its name implies, the emmeans package computes estimated marginal means.
More details and examples for lmerMod objects can be
found in the emmeans documentation.
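As a small illustration, which we add as a sketch rather than take from the emmeans documentation, the estimated marginal means of weight for each diet in mod_cw, and their pairwise contrasts, can be obtained like this:

```r
library(lme4)

# Same model as mod_cw above
mod_cw <- lmer(weight ~ Time + Diet + (1 | Chick), data = ChickWeight, REML = FALSE)

# Estimated marginal mean weight per diet (Time held at its mean)
emm_diet <- emmeans::emmeans(mod_cw, ~ Diet)
emm_diet

# Pairwise contrasts between diets
pairs(emm_diet)
```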
The ggeffects package also computes marginal effects, returning ready-to-graph results for the ggplot2 package.
The ggeffects
documentation already includes more details and examples for
lmerMod objects.
library(ggplot2)
predict_ss <- ggeffects::predict_response(mod_ss, "Days [0:9]")
ggplot(predict_ss, aes(x, predicted)) +
  geom_line() + labs(x = "Days", y = "Predicted Reaction") +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.1)
The ggeffects package is supposed to be superseded by modelbased; however,
the latter may raise errors when used with lmerMod objects.
As its name implies, the goal of marginaleffects is to compute
marginal effects and means, as well as various other comparisons, for many
different model classes, merMod objects being one of them.
More details and examples for mixed models can be found in the marginaleffects documentation.
library(marginaleffects)
library(ggplot2)
# Note: datagrid() is provided by marginaleffects
pred_ss <- predictions(mod_ss, newdata = datagrid(Days = 0:9))
# Plot using ggplot2
ggplot(pred_ss, aes(x = Days, y = estimate)) +
  geom_line() +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.1) +
  labs(x = "Days", y = "Predicted Reaction")
Although the output can easily be seen by applying
summary() to the merMod object, it can be
useful to use packages that automatically format the results nicely into
\(\LaTeX\), HTML, and text output.
There are many such functions that appear to be compatible with
merMod objects.
In this section, we display the code (though it is not executed) along with the resulting HTML output. The corresponding versions for \(\LaTeX\) and plain text are equally straightforward.
The sjPlot package can be used for table outputs in HTML. Its documentation includes a fantastic example.
The parameters package, as its name suggests, extracts parameters from a fitted model. Its documentation also includes examples of mixed models from lme4, as well as details on exact table formatting.
The huxtable
package can be used to create \(\LaTeX\) (see
huxtable::print_latex()) and HTML (see
huxtable::print_html()) output.
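A sketch of code that could produce a table like the one below is given here. We assume huxtable::huxreg() with broom.mixed loaded (so that merMod objects can be tidied); the choice of summary statistics is ours and simply mirrors the rows of the table.

```r
library(lme4)
library(broom.mixed)  # supplies tidy() and glance() methods for merMod objects

# Same model as mod_ss above
mod_ss <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

# Regression table as a huxtable object
hux_ss <- huxtable::huxreg(mod_ss, statistics = c(N = "nobs", "logLik", "AIC"))

# Emit the table as HTML; huxtable::print_latex(hux_ss) gives LaTeX instead
huxtable::print_html(hux_ss)
```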
| | (1) |
|---|---|
| (Intercept) | 251.405 |
| | (6.825) |
| Days | 10.467 |
| | (1.546) |
| sd__(Intercept) | 24.741 |
| | (NA) |
| cor__(Intercept).Days | 0.066 |
| | (NA) |
| sd__Days | 5.922 |
| | (NA) |
| sd__Observation | 25.592 |
| | (NA) |
| N | 180 |
| logLik | -871.814 |
| AIC | 1755.628 |
| *** p < 0.001; ** p < 0.01; * p < 0.05. | |
The stargazer package was intended to create well-formatted regression tables in \(\LaTeX\), HTML/CSS, and plain text.
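A minimal sketch of the corresponding call, assuming the table below was generated from mod_ss with the default settings (the object name star_ss is ours):

```r
library(lme4)

# Same model as mod_ss above
mod_ss <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

# Prints the HTML table and returns its lines invisibly;
# type = "latex" or type = "text" gives the other formats
star_ss <- stargazer::stargazer(mod_ss, type = "html")
```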
| | Dependent variable: |
|---|---|
| | Reaction |
| Days | 10.467*** |
| | (1.546) |
| Constant | 251.405*** |
| | (6.825) |
| Observations | 180 |
| Log Likelihood | -871.814 |
| Akaike Inf. Crit. | 1,755.628 |
| Bayesian Inf. Crit. | 1,774.786 |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
We could also use the texreg package to
create \(\LaTeX\) tables with
texreg::texreg() and HTML tables with
texreg::htmlreg().
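As a sketch, assuming the table below was generated from mod_ss with the default settings (the object name html_ss is ours):

```r
library(lme4)

# Same model as mod_ss above
mod_ss <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

# HTML version of the table; texreg::texreg(mod_ss) gives the LaTeX version
html_ss <- texreg::htmlreg(mod_ss)
html_ss
```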
| | Model 1 |
|---|---|
| (Intercept) | 251.41*** |
| | (6.82) |
| Days | 10.47*** |
| | (1.55) |
| AIC | 1755.63 |
| BIC | 1774.79 |
| Log Likelihood | -871.81 |
| Num. obs. | 180 |
| Num. groups: Subject | 18 |
| Var: Subject (Intercept) | 612.10 |
| Var: Subject Days | 35.07 |
| Cov: Subject (Intercept) Days | 9.60 |
| Var: Residual | 654.94 |
| ***p < 0.001; **p < 0.01; *p < 0.05 | |
The goal of the memisc
package is to make it easier to deal with survey data sets; it also offers
table formatting in \(\LaTeX\) (using
memisc::mtable_format_latex()) and HTML (using
memisc::mtable_format_html()).
mem_tab <- memisc::mtable("Model 1" = mod_cw, "Model 2" = mod_cw2,
  "Model 3" = mod_cw3, summary.stats = c("sigma", "R-squared", "F", "p", "N"))
memisc::mtable_format_html(mem_tab)
| | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| (Intercept) | 11.244 | 10.924** | 27.845*** |
| | (5.789) | (3.361) | (4.388) |
| Time | 8.717*** | 8.750*** | 8.726*** |
| | (0.175) | (0.222) | (0.176) |
| Diet: 2/1 | 16.210 | 16.166*** | |
| | (9.464) | (4.086) | |
| Diet: 3/1 | 36.543*** | 36.499*** | |
| | (9.464) | (4.086) | |
| Diet: 4/1 | 30.013** | 30.233*** | |
| | (9.471) | (4.107) | |
| N | 578 | 578 | 578 |
| R-squared | | 0.745 | |
| sigma | | 35.993 | |
| F | | 419.177 | |
| p | | 0.000 | |

Significance: *** = p < 0.001; ** = p < 0.01; * = p < 0.05
This section covers other external packages that may be useful but do not fall under the categories of the previous sections.
The broom package takes
output from built-in functions in R and converts it into a tibble, which is an alternative
to R’s built-in data.frame.
broom.mixed
is a spin-off of broom that takes output
specifically from mixed models fitted by various popular mixed-model packages, and
of course lme4 is one of them.
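A minimal sketch of the call, assuming the tibble shown next was obtained from mod_ss with the default arguments (the object name tidy_ss is ours):

```r
library(lme4)

# Same model as mod_ss above
mod_ss <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

# One row per fixed effect and per random-effect parameter
tidy_ss <- broom.mixed::tidy(mod_ss)
tidy_ss
```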
## # A tibble: 6 × 6
## effect group term estimate std.error statistic
## <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 fixed <NA> (Intercept) 251. 6.82 36.8
## 2 fixed <NA> Days 10.5 1.55 6.77
## 3 ran_pars Subject sd__(Intercept) 24.7 NA NA
## 4 ran_pars Subject cor__(Intercept).Days 0.0656 NA NA
## 5 ran_pars Subject sd__Days 5.92 NA NA
## 6 ran_pars Residual sd__Observation 25.6 NA NA
The equatiomatic package takes a fitted model and writes the equation for you, formatted in \(\LaTeX\).
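As a sketch, assuming the equation in this section was generated from mod_ss with the default settings (the object name eq_ss is ours):

```r
library(lme4)

# Same model as mod_ss above
mod_ss <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

# LaTeX code for the model equation
eq_ss <- equatiomatic::extract_eq(mod_ss)
eq_ss
```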
The output provides \(\LaTeX\) code that outputs the following equation: \[ \begin{aligned} \operatorname{Reaction}_{i} &\sim N \left(\alpha_{j[i]} + \beta_{1j[i]}(\operatorname{Days}), \sigma^2 \right) \\ \left( \begin{array}{c} \begin{aligned} &\alpha_{j} \\ &\beta_{1j} \end{aligned} \end{array} \right) &\sim N \left( \left( \begin{array}{c} \begin{aligned} &\mu_{\alpha_{j}} \\ &\mu_{\beta_{1j}} \end{aligned} \end{array} \right) , \left( \begin{array}{cc} \sigma^2_{\alpha_{j}} & \rho_{\alpha_{j}\beta_{1j}} \\ \rho_{\beta_{1j}\alpha_{j}} & \sigma^2_{\beta_{1j}} \end{array} \right) \right) \text{, for Subject j = 1,} \dots \text{,J} \end{aligned} \]