add loo_epred #1643

Merged
merged 4 commits into from
May 27, 2024

Conversation

avehtari
Contributor

@avehtari avehtari commented Apr 8, 2024

Fixes #1641, but requires a couple of decisions before it is merged.

@paul-buerkner
Owner

Can you remind me which decisions need to be made?

@avehtari
Contributor Author

What should the order of dimensions be for a multioutput target? See how I have coded and documented it now; there is also an alternative, and I was not sure which you would prefer. I no longer remember why I wrote "couple".

@paul-buerkner
Owner

Thanks! I would prefer observations to be rows and response variables to be columns. I understand that you have currently implemented it the other way around?

@avehtari
Contributor Author

The current doc says for loo_predict() and loo_linpred()

#' @return \code{loo_predict} and \code{loo_linpred} return a vector with one
#' element per observation. The only exception is if \code{type = "quantile"}
#' and \code{length(probs) >= 2}, in which case a separate vector for each
#' element of \code{probs} is computed and they are returned in a matrix with
#' \code{length(probs)} rows and one column per observation.

So here one column per observation.

For loo_predictive_interval()

#' \code{loo_predictive_interval} returns a matrix with one row per
#' observation and two columns.

So here one row per observation

For posterior_epred() the doc says

#' @return An \code{array} of draws. For
#' categorical and ordinal models, the output is an S x N x C array.
#' Otherwise, the output is an S x N matrix, where S is the number of
#' posterior draws, N is the number of observations, and C is the number of
#' categories. In multivariate models, an additional dimension is added to the
#' output which indexes along the different response variables.

So here one column per observation, and a possible third dimension for multioutput. So it seems column vs. row is 2-1.

But loo_epred() differs from posterior_epred(): like the other LOO predictive functions, it uses E_loo(), and the S draws are used to form a weighted mean. I find it logical that, following posterior_epred(), the dimensions would be 1 x N x C, but you can choose which order you want.
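
To make the dimension question concrete, here is a minimal base R sketch of collapsing S draws into a weighted mean per observation and response variable. The arrays are random stand-ins, not brms output: `epred` only mimics the S x N x R shape described for posterior_epred() in a multivariate model, and `w` is a hypothetical matrix of normalized importance weights (in practice these would come from Pareto smoothed importance sampling, which is not reproduced here).

```r
# Toy dimensions (all hypothetical): S draws, N observations, R responses.
set.seed(1)
S <- 200
N <- 5
R <- 2

# Stand-in for draws with shape S x N x R.
epred <- array(rnorm(S * N * R), dim = c(S, N, R))

# Stand-in for importance weights with shape S x N,
# normalized so that each column (observation) sums to 1.
w <- matrix(rexp(S * N), nrow = S, ncol = N)
w <- sweep(w, 2, colSums(w), "/")

# Weighted mean over the S draws, computed separately for each
# observation and each response variable. apply() over the third
# dimension returns one column per response, so the result is N x R.
loo_mean <- apply(epred, 3, function(x) colSums(x * w))

dim(loo_mean)
```

The S dimension disappears after averaging, so the remaining choice is only whether observations index rows (N x R, as here) or columns (R x N, obtained with `t(loo_mean)`).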

loo_epred() also has the argument type, which can be "quantile" with more than one value in probs, in which case one dimension is used for the quantiles.

Not for this PR, but I mention it in case it affects the logic: it might also be useful to have LOO predictive functions that return S importance-resampled draws instead of using E_loo().

@paul-buerkner paul-buerkner added this to the brms 2.22.0 milestone May 27, 2024
@paul-buerkner
Owner

Thanks! I will think about it today and ideally even edit and merge this PR today. :-)

@paul-buerkner
Owner

I have now made a couple of edits to make the output format of all these loo_* functions consistent with our post-processing functions that also return per-observation information. That is, observations are now always represented as rows.
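
As an illustration of the two layouts discussed in this thread, the following base R sketch computes several quantiles per observation and shows both orientations. It uses plain unweighted quantiles on random toy draws purely to show the shapes; it is not the brms implementation, and the object names are hypothetical.

```r
# Toy draws (hypothetical): S draws for each of N observations.
set.seed(2)
S <- 100
N <- 4
probs <- c(0.05, 0.5, 0.95)
draws <- matrix(rnorm(S * N), nrow = S, ncol = N)

# apply() over columns yields length(probs) x N:
# one row per quantile, one column per observation.
q_cols <- apply(draws, 2, quantile, probs = probs)

# Transposing yields N x length(probs):
# one row per observation, one column per quantile.
q_rows <- t(q_cols)

dim(q_cols)
dim(q_rows)
```

With observations as rows, the result can be bound column-wise to the data it describes, for example with `cbind()`, without further reshaping.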

@paul-buerkner paul-buerkner merged commit 652a7c0 into paul-buerkner:master May 27, 2024
5 checks passed
@paul-buerkner
Owner

Thank you for this PR!

@avehtari
Contributor Author

I kept thinking that there was another decision to make, and now I remember it! Note that the following line is commented out

# #' @importFrom rstantools loo_epred

as the corresponding rstantools PR stan-dev/rstantools#122 is not yet on CRAN

@paul-buerkner
Owner

I know. I have temporarily added the loo_epred generic in brms. Once the new rstantools is on CRAN, I will remove it from brms.
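
For readers unfamiliar with the pattern, a locally defined S3 generic can be sketched as follows. This is only an assumed illustration of the general mechanism, not the code added to brms: the class `toy_fit` and its method are hypothetical.

```r
# A generic defined locally until an upstream package exports one.
loo_epred <- function(object, ...) {
  UseMethod("loo_epred")
}

# Hypothetical method for a hypothetical class, for illustration only:
# it returns the per-observation mean of the stored draws.
loo_epred.toy_fit <- function(object, ...) {
  colMeans(object$draws)
}

# Two draws (rows) for three observations (columns).
fit <- structure(list(draws = matrix(1:6, nrow = 2)), class = "toy_fit")
result <- loo_epred(fit)
result
```

Once the upstream generic is available, the local definition can be dropped and replaced by an import, while the methods stay unchanged.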
