Possible bug in brms/emmeans integration #1654
Comments
Thank you for reporting this issue. I am no emmeans expert, so it's hard for me to tell what is going on. @rvlenth, do you happen to have an idea?
I have no clue. I am bothered by the fact that there are two (very) different objects named As for the "custom" code, I disagree that it is what My suggestion for finding out more is to try this, using the second version of
So far, we are now seeing directly what If you still see the serious discrepancies, do this:
This gives you the grid of all fixed-effects factors, which is the basis for all
Edited #1654 (comment) to use
Edited #1654 (comment) to clarify that comment.
Yes:
emm_itself <- emmeans(
  object = model_brms,
  specs = ~ARMCD:AVISIT,
  wt.nuis = "proportional",
  nuisance = c("USUBJID", "RACE", "SEX")
)
summary(emm_itself)
#> ARMCD AVISIT emmean lower.HPD upper.HPD
#> PBO VIS1 -18.08 -22.29 -13.617
#> TRT VIS1 -14.81 -18.28 -11.236
#> PBO VIS2 -16.08 -19.48 -12.372
#> TRT VIS2 -12.68 -16.50 -9.329
#> PBO VIS3 -12.53 -15.84 -8.909
#> TRT VIS3 -9.71 -13.31 -6.333
#> PBO VIS4 -7.93 -11.54 -4.363
#> TRT VIS4 -3.46 -7.11 0.102
#>
#> Results are averaged over the levels of: 2 nuisance factors
#> Point estimate displayed: median
#> HPD interval probability: 0.95
as.data.frame(summary_brms_emmeans)
#> ARMCD AVISIT mean lower upper source
#> 1 PBO VIS1 -18.083219 -22.287808 -13.6172246 4_brms_emmeans
#> 2 TRT VIS1 -14.812490 -18.276953 -11.2362895 4_brms_emmeans
#> 3 PBO VIS2 -16.079485 -19.477840 -12.3717137 4_brms_emmeans
#> 4 TRT VIS2 -12.679113 -16.503203 -9.3292318 4_brms_emmeans
#> 5 PBO VIS3 -12.527884 -15.841525 -8.9088424 4_brms_emmeans
#> 6 TRT VIS3 -9.709981 -13.307893 -6.3334955 4_brms_emmeans
#> 7 PBO VIS4 -7.928348 -11.537075 -4.3630501 4_brms_emmeans
#> 8 TRT VIS4 -3.462008 -7.109919 0.1019503 4_brms_emmeans
summary(emm_itself)$emmean - summary_brms_emmeans$mean
#> [1] 0 0 0 0 0 0 0 0
summary(emm_itself)$lower.HPD - summary_brms_emmeans$lower
#> [1] 0 0 0 0 0 0 0 0
summary(emm_itself)$upper.HPD - summary_brms_emmeans$upper
#> [1] 0 0 0 0 0 0 0 0
The summary says the results are averaged over two nuisance variables, whereas the code supplies three.
Only slight differences:
summary_emmeans <- summary(emm_itself, point.est = mean)
max(abs(summary_emmeans$emmean - summary_brms_emmeans$mean))
#> [1] 0.0202332
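A small gap like this is what one would expect when a posterior median is compared with a posterior mean. The sketch below illustrates the effect on simulated, right-skewed draws; the `draws` vector is a hypothetical stand-in and is not taken from `model_brms`.

```r
# Simulated, right-skewed "posterior draws" (illustrative assumption only).
set.seed(1)
draws <- rgamma(4000, shape = 2, rate = 1)

# Analogous to the default point estimate reported by the summary (median).
point_median <- median(draws)

# Analogous to requesting point.est = mean.
point_mean <- mean(draws)

# For a right-skewed distribution, the mean exceeds the median.
point_mean - point_median
```

The more symmetric the posterior, the smaller this difference becomes.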
When I do that, I see close enough agreement with the native
# Predictions
new_data <- emmeans::ref_grid(model_brms)@grid
predictions <- predict(model_brms, newdata = new_data)
grid <- mutate(new_data, estimate = predictions[, "Estimate"])
# Proportional weights
weighted_grid <- grid %>%
  left_join(y = count(data, RACE, SEX), by = c("RACE", "SEX")) %>%
  rename(.wgt. = n)
# Marginal means
custom <- weighted_grid %>%
  group_by(ARMCD, AVISIT) %>%
  summarize(mean = sum(estimate * .wgt.) / sum(.wgt.)) %>%
  arrange(AVISIT, ARMCD)
custom
#> # A tibble: 8 × 3
#> # Groups: ARMCD [2]
#> ARMCD AVISIT mean
#> <fct> <fct> <dbl>
#> 1 PBO VIS1 -4.67
#> 2 TRT VIS1 -1.24
#> 3 PBO VIS2 -2.47
#> 4 TRT VIS2 0.957
#> 5 PBO VIS3 1.00
#> 6 TRT VIS3 3.78
#> 7 PBO VIS4 5.57
#> 8 TRT VIS4 10.1
# Good enough agreement with lm marginal means
summary_lm_emmeans
#> # A tibble: 8 × 6
#> ARMCD AVISIT mean lower upper source
#> <fct> <fct> <dbl> <dbl> <dbl> <chr>
#> 1 PBO VIS1 -4.60 -5.98 -3.22 2_lm_emmeans
#> 2 TRT VIS1 -1.29 -2.76 0.185 2_lm_emmeans
#> 3 PBO VIS2 -2.54 -3.92 -1.17 2_lm_emmeans
#> 4 TRT VIS2 0.847 -0.625 2.32 2_lm_emmeans
#> 5 PBO VIS3 0.984 -0.393 2.36 2_lm_emmeans
#> 6 TRT VIS3 3.80 2.33 5.27 2_lm_emmeans
#> 7 PBO VIS4 5.60 4.22 6.98 2_lm_emmeans
#> 8 TRT VIS4 10.1 8.58 11.5 2_lm_emmeans
max(abs(custom$mean - summary_lm_emmeans$mean))
#> [1] 0.1104108
# Disagreement with the native emmeans/brms integration
max(abs(custom$mean - summary_brms_emmeans$mean))
#> [1] 13.63619
Also, thanks for explaining the role of But whether we take the
# Create the reference grid.
new_data <- emmeans::ref_grid(model_lm)@grid
grid <- mutate(new_data, estimate = predict(model_lm, newdata = new_data))
# Apply proportional weights.
weighted_grid <- grid %>%
  left_join(y = count(data, RACE, SEX), by = c("RACE", "SEX")) %>%
  mutate(.wgt. = n)
# Compute marginal means using the weighted grid.
summary_lm_emmeans_using_grid <- weighted_grid %>%
  group_by(ARMCD, AVISIT) %>%
  summarize(mean = sum(estimate * .wgt.) / sum(.wgt.)) %>%
  arrange(AVISIT, ARMCD)
# Both approaches agree:
max(abs(summary_lm_emmeans_using_grid$mean - summary_lm_emmeans$mean))
#> [1] 5.329071e-15
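To make the weighting step concrete, here is a minimal base-R sketch of proportional weights on a toy grid. All values are hypothetical: a single nuisance factor `SEX` and made-up estimates, not the FEV1 data or either fitted model.

```r
# Toy data: observed levels of the nuisance factor SEX (hypothetical values).
data <- data.frame(SEX = c("F", "F", "F", "M"))

# Toy reference grid: one predicted value per ARMCD x SEX cell (hypothetical).
# expand.grid() varies ARMCD fastest: (PBO, F), (TRT, F), (PBO, M), (TRT, M).
grid <- expand.grid(
  ARMCD = c("PBO", "TRT"),
  SEX = c("F", "M"),
  stringsAsFactors = FALSE
)
grid$estimate <- c(1, 3, 5, 7)

# Proportional weights: how often each SEX level occurs in the data.
counts <- as.data.frame(
  table(SEX = data$SEX),
  responseName = "n",
  stringsAsFactors = FALSE
)
weighted_grid <- merge(grid, counts, by = "SEX")

# Weighted average of the estimates within each ARMCD level.
marginal <- sapply(
  split(weighted_grid, weighted_grid$ARMCD),
  function(cell) sum(cell$estimate * cell$n) / sum(cell$n)
)
marginal
```

With three `F` rows and one `M` row, the weighted means are 2 for `PBO` and 4 for `TRT`; equal weights would instead give 3 and 5.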
We can go all over the place looking at examples and trying to guess what is done, but it shouldn't be too difficult to tell by looking at the code. The emmeans package provides the infrastructure, but what it does to actually estimate things depends on the
In the arguments, This is not a very complex function (seems simpler than a lot of the code in this issue), and I suggest trying to understand what it does. For example, maybe what you need to do is add the argument
@wlandau PS -- of course, you should also look at
Related: #1630, https://discourse.mc-stan.org/t/trouble-with-brms-emmeans-integration/34664. I am posting here because I think the issue might be a bug in `brms`, and the comment section in my Stan Discourse post has not been active.

`brms` integrates with `emmeans` for marginal mean calculations, but the results seem off. The reprex below uses the `mmrm` package's FEV1 dataset, a simulation of a clinical trial with treatment groups in `ARMCD` and discrete time points for repeated measures in `AVISIT`. The example compares 4 different methods of estimating marginal means for each combination of `ARMCD` and `AVISIT`:

- `lm()` + `emmeans`: fit a model with `lm()` and get marginal means with `emmeans`.
- `brms` + custom: fit a model with `brms` and use a custom linear transformation to map model parameters to marginal means.
- `brms` + `emmeans`: use the native `brms`/`emmeans` integration to estimate marginal means from the fitted `brms` model.

There is reasonable agreement among approaches (1), (2), and (3), and approach (4) gives very different results from all the others. I ran the following on the current development version of `brms` in the `master` branch (298b947).