New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor plot_ecdf arguments #2316
base: main
Are you sure you want to change the base?
Conversation
arviz/plots/ecdfplot.py
Outdated
band. | ||
For simultaneous confidence bands to be correctly calibrated, provide `eval_points` that | ||
are not dependent on the `values`. | ||
band_prob : float, default 0.95 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue that this should default to 0.94 for consistency with hdi_prob
, but that would be breaking (since fpr
was 0.05).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will ultimately depend on the hdi_prob, band_prob, ci_prob discussion, but in my opinion changing the value of hdi_prob (or any other rcParam defined probability) is not a breaking change. It is documented in multiple places that these are completely arbitrary values and that might also change.
The only guarantee should be that is someone was using fpr=0.05 it still works for a while but changing the probability of the band they get when not providing fpr is ok.
arviz/rcparams.py
Outdated
@@ -262,6 +262,7 @@ def validate_iterable(value): | |||
"mean", | |||
_make_validate_choice({"mean", "median", "mode"}, allow_none=True), | |||
), | |||
"plot.band_prob": (0.95, _validate_probability), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially this should be stats.band_prob
, especially if we add the ECDF confidence band computation functions to the API.
@@ -90,7 +90,7 @@ def ecdf_confidence_band( | |||
A function that takes an integer `ndraws` and optionally the object passed to | |||
`random_state` and returns an array of `ndraws` samples from the same distribution | |||
as the original dataset. Required if `method` is "simulated" and variable is discrete. | |||
num_trials : int, default 1000 | |||
num_trials : int, default 500 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just changed for consistency with original behavior.
arviz/plots/ecdfplot.py
Outdated
@@ -46,26 +52,41 @@ def plot_ecdf( | |||
values : array-like | |||
Values to plot from an unknown continuous or discrete distribution. | |||
values2 : array-like, optional | |||
Values to compare to the original sample. | |||
deprecated: values to compare to the original sample. Instead use | |||
`cdf=scipy.stats.ecdf(values2).cdf.evaluate`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope in a future PR to add an ECDF-(difference)plot option to plot_ranks
and then recommend that here.
arviz/plots/ecdfplot.py
Outdated
confidence_bands : str or bool, optional | ||
- False: No confidence bands are plotted. | ||
- "pointwise": Compute the pointwise (i.e. marginal) confidence band. | ||
- True or "simulated": Use Monte Carlo simulation to estimate a simultaneous confidence |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the next PR I will add deterministic bands, which will become the default here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would then put "True" and "simulated" on different lines. True saying bands are computed with the default algorithm (subject to change), and simulated can keep the current description.
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
@OriolAbril I've implemented all suggestions and updated the above description and changelog. This should be ready for final review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed some DeprecationWarnings that are user-facing. I'll batch commit the suggestions and merge, all the changes are extremely minor
changing user-facing DeprecationWarnings to FutureWarnings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After seeing the warnings and trying it out I am a bit on the fence on the behaviour of eval_points
. It is basically a required argument right now, otherwise you get a FutureWarning.
It would be nice to continue allowing plot_ecdf(samples) to work without warnings.
The reason it raises a |
Maybe we could create a specific warning class for this? Something like We could also silence it in the examples of the docstring. Now all examples use eval_points, but to illustrate how to generate confidence bands or how to make it a difference plot it doesn't matter which is the default behaviour of eval_points (and tests don't use it). So we could use the first example to describe the behaviour change, show how to maintain old behaviour and then silence the warning so following examples focus on what they want to illustrate without worrying about the warning. What do you think? |
Description
This PR introduces new keyword arguments to
plot_ecdf
and deprecates a few existing ones, following suggestions in #2309.New keywords and features:
confidence_bands
now may take string arguments as well as booleanci_prob
specifies band probability.stats.ci_prob
is a new rcParam.eval_points
allows the user to specify the evaluation pointsrvs
,random_state
can be provided for simulation confidence bandsDeprecated arguments:
values2
is now deprecated, in favor of users passing an empirical CDFDeprecated keywords:
pointwise
is nowconfidence_bands="pointwise"
fpr
is replaced with1-ci_prob
for consistency with other plotting functions.pit
is deprecated. We only need this for LOO-PIT, users who need it for something else will probably know how to make the plot, and if we really want to include it, it should be its own plotting function. There's now a documented example of how to plot PIT.npoints
Deprecated rcparams:
stats.hdi_prob
has been deprecated and replaced withstats.ci_prob
Additional changes:
eval_points
not provided, there's a warning that in the futureeval_points
will be the unique values of the sample. This would be breaking and is saved for a future release.None of the changes are breaking.
Checklist
📚 Documentation preview 📚: https://arviz--2316.org.readthedocs.build/en/2316/