Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prototype for set_inits() #1646

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Conversation

venpopov
Copy link
Contributor

@venpopov venpopov commented Apr 11, 2024

As discussed in #1645, here is a prototype for a set_inits() function modeled after set_prior()

Here's a functioning example, setting inits on Intercept and b_sigma:

library(brms)

formula <- bf(count ~ Trt * Age,
              sigma ~ Trt * Age)

inits <- set_inits('normal(0, 1)', class = "Intercept", dpar = "mu") +
  set_inits('uniform(-0.1, 0.1)', class = "b", dpar = "sigma")

inits
#>         distribution     class coef group  dpar nlpar
#> 1       normal(0, 1) Intercept                       
#> 2 uniform(-0.1, 0.1)         b            sigma

fit <- brm(formula, epilepsy, init = inits, cores = 2, chains = 2, refresh = 0, backend = 'cmdstanr')
#> Start sampling
#> Init values were only set for a subset of parameters. 
#> Missing init values for the following parameters:
#>  - chain 1: b, Intercept_sigma
#>  - chain 2: b, Intercept_sigma
#> 
#> To disable this message use options(cmdstanr_warn_inits = FALSE).
#> Running MCMC with 2 parallel chains...
#> 
#> Chain 2 finished in 1.5 seconds.
#> Chain 1 finished in 1.5 seconds.
#> 
#> Both chains finished successfully.
#> Mean chain execution time: 1.5 seconds.
#> Total execution time: 1.6 seconds.

str(fit$stan_args$init)
#> List of 2
#>  $ :List of 2
#>   ..$ Intercept: num 0.717
#>   ..$ b_sigma  : num [1:3(1d)] -0.0496 0.0666 -0.0425
#>  $ :List of 2
#>   ..$ Intercept: num -0.784
#>   ..$ b_sigma  : num [1:3(1d)] -0.029 0.0712 0.077

Created on 2024-04-11 with reprex v2.1.0

Implementation

  • set_inits() takes arguments following the set_prior() syntax and creates a data.frame of class brmsinits
  • outputs of set_inits() can be added together just like for set_prior()
  • in brm() there is now the following piece of new code:
    # generate inits
    if (is.brmsinits(init)) {
      init <- replicate(
        chains,
        .inits_fun(init, bterms = bterms, data = data, sdata = sdata),
        simplify = FALSE
      )
    }

which calls .inits_fun() internal function to generate a list of inits from the specification of the brmsinits object

  • .inits_fun() takes as argument the brmsinits object, the bterms object, the data and the standata
  • it calls an internal function par_info() which generates lists of type:
list(b_type = "vector", b_dim_name = "Kc_sigma", b_par = "b_sigma")

by using some of the code from stan_predictor

  • it then parses the distributions specified in brmsinits object, and generates random values with the appropriate dimensions and returns a named list that can then be passed to Stan

Limitations and generalization

This currently only works for the Intercept and b type parameters for all population parameters and families. It does not work for random effects sd and for the z vector. This is where I would appreciate your help to generalize this (the current code uses a par_info_fe() function to generate the information for the fixed effects. Similar functions would be needed for the random effects and other special parameters

Tests

I added tests for all the internal helper functions, and also a test for brm with set_inits() in the tests/local folder

Additional features

Would be nice to have

  • an exported function staninits, which takes the formula, data and set_inits and generates the inits list
  • fix some inits to a constant (perhaps with the same constant(x) syntax as for the prior
  • an exported function default_inits, similar to default_prior, which shows the structure of how to specify inits for the valid parameters. the default values themselves are nto very informative - uniform(-2,2) on the unconstained scale, but the point is to see which parameters you can specify inits for

- works with Intercept and b for all distributional parameters
- does not work with random effects and special parameters
- tests for helper functions and local test for brm with set_inits
@GidonFrischkorn
Copy link

This is a great addition to brms! I wanted to add that some users might want to set lb and ub similar to the priors. So, adding this as an option and also making the set_inits consistent with set_prior would be great.

But, I also saw that @venpopov included a uniform distribution as an init distribution. So this might already be one option to include lower and upper bounds. So maybe, this could be added in a vignette to tell users how to set bounds using the set_inits function.

@paul-buerkner paul-buerkner added this to the brms 2.22.0 milestone Apr 12, 2024
@paul-buerkner
Copy link
Owner

Thank you a lot @venpopov! Currently, I am a bit overwhelmed with things so it may take a while until I get back to this issue. But I highly appreciate your work and I am confident it will provide a good basis for making this feature work generally for any kinds of brms models.

@venpopov
Copy link
Contributor Author

I understand. Should I keep working on this if I want to add things, or would you prefer me to leave it as is until you get a chance to look over it?

@paul-buerkner
Copy link
Owner

paul-buerkner commented Apr 13, 2024 via email

@paul-buerkner
Copy link
Owner

I like the general outline of the functionality. Especially the user interface is very pretty I think.

As for the par_info. Perhaps we can abstract the functionality of mapping to the Stan parameters to be reusable for both priors (stan code generation) and setting inits. If we just copy the relevant code, I worry this will become hard to maintain with new terms being potentially added in the future. I will think about it a bit more.

@venpopov
Copy link
Contributor Author

Glad you like the general design! I totally agree we should abstract/refactor common functionality - to make it easier to maintain, less error prone, allow for easier future extension. I didn't do it in the prototype because I wanted to make something simple that works without messing around and potentially breaking the existing code. But we are on the same page about the need for abstraction. I just saw your email, will respond there in a bit

@venpopov
Copy link
Contributor Author

A brief summary of what we discussed.

par_info:

  • par_info should return an object that we can pass to the functions that generate the parameters block of the stan code
  • for now impelement it for fixed and random effects and can temporarily break the code for other parameter types to see if it works to pass the object and use the info from it to reduce code duplication
  • return a data.frame with a column that specifies the stan block information should go in. For now just for parameters, but eventually for all variables defined in stan (transformed parameters, generated quantiles, etc)
  • eventually this will be stored in a brmsdata object

quality control:

  • need to add a check whether the provided set_inits information matches parameters, give an error otherwise
  • need to double check whether rstan and cmdstanr use the inits on the contrained space

UI:

  • set_inits should accept different types of first argument: characters for common distributions (like now), distributional objects, functions that take a single argument of N and output a vector of N values, constants or numeric vectors or list of vectors
  • an exported function like default_inits or inits_summary or inits_info that gives information about the structure just like default_prior

@venpopov
Copy link
Contributor Author

Just a quick update from my side - unfortunately I didn't find the time to work on this and I'll be on vacation for 2 weeks. I'll get back to it when I come back

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants