Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hessian NaN with ordered_logistic_lpmf #3016

Open
avehtari opened this issue Jan 27, 2024 · 5 comments
Open

Hessian NaN with ordered_logistic_lpmf #3016

avehtari opened this issue Jan 27, 2024 · 5 comments

Comments

@avehtari
Copy link

I'm testing this via CmdStanR $hessian() method, but I guess the problem is in math.

$hessian() method works for the following Stan model if data y=2 but gives NaN if y=3. $laplace() method is able to compute finite Hessian (finite difference?), so the reason is not that Hessian would not exist (gradient is also close to 0, so the Hessian is computed (near) the mode).

data {
  int Y;
}
parameters {
  real mu;
}
model {
  target += ordered_logistic_lpmf(Y | mu, [-1, 1]');
  target += std_normal_lpdf(mu);
}

R code to reproduce

library(cmdstanr)
code_ordered <-
"data {
  int Y;
}
parameters {
  real mu;
}
model {
  target += ordered_logistic_lpmf(Y | mu, [-1, 1]');
  target += std_normal_lpdf(mu);
}
"
file_ordered <- write_stan_file(code_ordered)
model_ordered <- cmdstan_model(file_ordered,
                               compile_model_methods=TRUE,
                               compile_hessian_method=TRUE,
                               force_recompile=TRUE)
# This works
data_ordered <- list(Y = 2)
opt_ordered <- model_ordered$optimize(data = data_ordered, refresh=0)
opt_ordered$hessian(as.numeric(opt_ordered$unconstrain_draws(draws=opt_ordered$draws())))

# This works
lap_ordered <- model_ordered$laplace(data = data_ordered, refresh=0)
-1/var(lap_ordered$draws(variables="mu")) # Hessian from the draws

# This gives NaN
data_ordered <- list(Y = 3)
opt_ordered <- model_ordered$optimize(data = data_ordered, refresh=0);
opt_ordered$hessian(as.numeric(opt_ordered$unconstrain_draws(draws=opt_ordered$draws())))

# This works
lap_ordered <- model_ordered$laplace(data = data_ordered, refresh=0)
-1/var(lap_ordered$draws(variables="mu")) # Hessian from the draws
@WardBrian
Copy link
Member

I've traced this down to the reverse specialization of exp: it gets passed a var whos' value is inf and adjoint is 0, and 0 * inf is NAN

These appear to be created on line 144 of ordered_logistic_lpmf.hpp

Here is a test which could be added to show this:

TEST(mathMixScalFun, ordered_logistic_lpmf) {
  auto f = [](const auto y) {
    return [=](const auto& lambda, const auto& cutpoints) {
      return stan::math::ordered_logistic_lpmf(y, lambda, cutpoints);
    };
  };

  int y = 3;
  double mu;
  Eigen::Matrix<double, -1, 1> c(2, 1);
  c << -1, 1;

  stan::test::expect_ad(f(y), mu, c);
}

@avehtari
Copy link
Author

For the same case, Hessian works in BridgeStan. What is the difference to CmdStanR?

@WardBrian
Copy link
Member

BridgeStan defaults to using finite differences since that would be available for all models. If you opt into AD Hessians at compile time you should see the same NaNs

@avehtari
Copy link
Author

Thanks. I missed BRIDGESTAN_AD_HESSIAN=true and the whole page with "Optional: Customizing BridgeStan" as I installed BridgeStan using the R package.

@WardBrian
Copy link
Member

It seems like the changes which cause this were made in #2087. Not sure if @t4c1 could weigh in on alternatives

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants