This short entry summarizes what I found (and learned) about using smooth terms (splines) within brms mixed-models fitted with the brms R-package and a zero-one-inflated beta family (zoib) and, about the importance of modelling sigma in these cases.
The main reference to this entry was found in Estimating Distributional Models with brms (Paul Bürkner, 2023) and specifically in the final section about Additive distributional models.
In my particular case, I’m trying to fit a mixed-model to predict the percentage (ratio) of online homework coverage (outcome) ranging from 0 to 1 where there are some zeroes and ones and it did not follow a normal distribution (this is why I’m using a zoib family).
To predict the response variable (homework coverage) I have to deal with smooth terms because one of my predictors (Attempts by item) didn’t show a linear relationship with the outcome. Furthermore, my model includes two varying intercepts: (1|Topic) and (1|IDnumber) because observations are nested by topics and pupils.
Smooth term predictors can be added into a brms formula as s(smooth_term_predictor) as indicated here and in this another useful example. However, as I want to go further and investigate how the smooth term «interact» with another linear predictor to explain the outcome, I have two choices depending on whether the linear predictor is continuous (continuous_predictor) or categorical (categorical_predictor). This is:
If it is continuous we can use:
and, if it is categorical we can use:
s(smooth_term_predictor by= categorical_predictor)
The interpretation of outcomes from a zoib mixed-models for these «particular» explanatory variables is not straightforward. Firstly, because coefficients are given in logit scale which need to be transformed to properly understand their effects, and secondly, because the effects of smooth terms are different depending on which zone of the spline you are looking at. Therefore, it is recommended to look at their plots (e.g., with brms::conditional_effects() or marginaleffects::plot_predictions()) to get a first intuition about their relationships (effects) with the outcome.
Interestingly, I also learned how important is «not to assume that the residual standard deviation sigma was the same» . On the contrary, and very often, one should expect it to vary by the smoothing terms and the varying intercepts as explained here:
[…] In the above example, we only considered the mean of the response distribution to vary by predictor1 and predictor2, but this my not necessarily reasonable assumption, as the variation of the response might vary with these variables as well. Accordingly, we fit splines and effects of district for both the location and the scale parameter, which is called sigma in Gaussian models.
Therefore, as soon as I began to model sigma, I noticed a big improvement in my models fits:
sigma ~ t2(smooth_term, continuous_predictor)+ (1|ID1|TOPIC) +(1|ID2|IDnumber)
Notice as well how, instead of adding the random intercepts as (1|TOPIC) +(1 |IDnumber), I now use (1|ID1|TOPIC) +(1 | ID2|IDnumber) in both formulas (the random effect part and the sigma part) to model the varying intercepts of both model parts as correlated as it is also explained in page 5 here.
Outcome ~ covariate + t2(smooth_term,continuous_predictor) + (1|ID1|TOPIC) + (1|ID2|IDnumber), sigma ~ covariate + t2(smooth_term,continuous_predictor) + (1|ID1|TOPIC) + (1|ID2|IDnumber)
For example, in one of my comparisons through loo_compare() I got the next difference for model1 (without modelling sigma) and model 2 (after modelling sigma):
check_loo_results <- loo_compare(model1, model2) compare(check_loo_results) elpd_diff se_diff model1: 0.0 0.0 model2: -16.5 7.5
This is what I think I know at the moment of writing this post about smooth terms and about modelling residual variance. Readers (if any) must be cautious about my explanations because I’m still a novice in most of these things.