Releases: mjskay/ggdist

ggdist 3.3.2

05 Mar 16:14

Major changes:

  • The geom_slabinterval() and geom_dotsinterval() families gain "sub-guides",
    which can be passed to the subguide parameter to create axis annotations for
    the thickness aesthetic (for slabs) and the dot count (for dots) (#183).
  • The weight aesthetic is now supported in stat_slabinterval(), including
    weighted calculations for densities, CDFs, all interval types (quantile
    intervals, highest density intervals, and highest density continuous intervals),
    and all point summaries (mean, median, and mode) (#41). This includes support
    for the upcoming weighted random variable type in the posterior package.
  • Blurry dotplots are now supported using geom_blur_dots(), which accepts an
    sd aesthetic to set the standard deviation of the blur on each dot.
    Intervals can also be used in place of blur by passing blur = "interval".
    This geom is used by the new stat_mcse_dots() to show quantiles along with
    their error using blur (#63).
  • The new breaks_quantiles() histogram breaks function allows the construction
    of quantile histograms with density_histogram(), stat_histinterval(), etc.
  • The color ramp scales (e.g. scale_colour_ramp_continuous(), ...) now use
    an explicit data type, partial_colour_ramp(), to encode color ramps and
    their origin colors, and provide the ramp_colours() function for applying
    colour ramps. This should make it easier to pass explicit color ramps
    without using scale functions, and for packages building on {ggdist} to
    use the colour ramp scales (#209).
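
As a minimal sketch of the weight aesthetic described above (the data frame and its column names are hypothetical, made up for illustration), weights can be mapped like any other aesthetic on a shortcut stat such as stat_halfeye():

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical data: 500 draws, each with an arbitrary positive weight
df <- data.frame(
  x = rnorm(500),
  w = runif(500)
)

# Map the weights onto the weight aesthetic; per the note above, the
# density, the intervals, and the point summary all take them into account
p <- ggplot(df, aes(x = x, weight = w)) +
  stat_halfeye()
p
```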

Minor changes:

  • The default histogram bin selection algorithm is now "Scott" instead of
    "Sturges", as "Sturges" tends to be too conservative (#214).
  • The at parameter to stat_spike() (or its names) now determines the values of
    an at computed variable, which can be mapped onto aesthetics via after_stat()
    to more easily label spikes (#203; thanks to @mattansb for the suggestion).
  • The arrow parameter is now supported for intervals in geom_slabinterval()
    (#206; thanks to @ASKurz for the suggestion).
  • The default value of overflow in geom_dotsinterval() is now the new
    "warn" mode, which works the same as "keep" except that it warns users
    if the dots will overflow the geometry bounds and suggests solutions (#213).
  • Optional arguments to automatically partially-applied functions can now be
    passed a waiver() to use their default value (see auto_partial()).
  • Several dependency reductions: removed {cowplot}, {purrr}, {forcats},
    {palmerpenguins}, and {modelr} from Suggests; moved {tidyselect} and {dplyr}
    from Imports to Suggests. The latter two are only strictly necessary for
    curve_interval() due to its use of grouped data frames and tidy selection to
    specify which columns are conditional and which are joint (the use of grouped
    data frames with point_interval() is less strictly necessary, and not used
    by stats, so is easier to avoid as an absolute dependency).
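
To get a feel for the change in the default bin selection rule, the sketch below compares the two rules using base R's nclass.Sturges() and nclass.scott() from grDevices. These are stand-ins for illustration, not ggdist's own (weighted) implementations, and the sample is hypothetical:

```r
set.seed(1234)
x <- rnorm(10000)  # hypothetical large sample

# Sturges' rule depends only on the sample size: ceiling(log2(n) + 1)
n_sturges <- nclass.Sturges(x)

# Scott's rule also uses the sample standard deviation and the data range,
# so the number of bins grows faster as n increases
n_scott <- nclass.scott(x)

c(sturges = n_sturges, scott = n_scott)
```

On large samples like this one, Sturges' rule yields noticeably fewer bins than Scott's rule, which is the "too conservative" behavior mentioned above.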

Documentation:

  • The pkgdown documentation now includes an online article on the thickness
    aesthetic with comprehensive examples of how slab scaling works (#205).

Bug fixes:

  • Ensure Mode() works on analytical constant distributions.
  • Various fixes to ensure compatibility with {ggplot2} 3.5.0.

ggdist 3.3.1

30 Nov 23:22

New features and enhancements:

  • Use derivatives supplied by transformations in scales >= 1.2.2 to make
    transformations of densities more reliable (r-lib/scales#341).
  • New layout = "bar" for geom_dotsinterval() that provides better bar
    dotplots (with thanks to @Sharoz for feedback; #190).
  • Bandwidth estimators (including the default, bandwidth_dpi()) now fall back
    to bandwidth_nrd0() when they fail, with a warning that suggests trying
    a dotplot or histogram (as these failures tend to happen on data that is not
    a good candidate for a density plot in the first place) (#196).
  • Much faster (C++) implementation of Wilkinson dotplot binning, especially
    for large dotplots.

Bug fixes:

  • Ensure scale_side_mirrored() supports start = "left" and start = "right".
  • Ensure geom_spike() draws the point on the correct end of the line depending on side.
  • Future-proof guide_rampbar() for ggplot2 > 3.4.2 (#186). Thanks to @teunbrand.
  • Future-proof some minor tests for ggplot2 > 3.4.2 (#187).
  • Allow the size aesthetic to be overridden for the geom_dots() legend.
  • Ensure hdi() supports constants. (#194)

ggdist 3.3.0

13 May 22:11

Breaking changes: The following changes, mostly due to new default density
estimators, may cause some plots on sample data to change. Changes should usually
be small, and generally should result in more accurate density estimation. Revert
to the old behavior by setting density = density_unbounded(bandwidth = "nrd0").

  • stat_slabinterval() now uses density_bounded() as its default density
    estimator, which uses a bounded density estimator that also estimates the
    bounds of the data. The default bandwidth estimator is also now bandwidth_dpi(),
    which is the Sheather-Jones direct plug-in estimator (the same as
    stats::bw.SJ(..., method = "dpi")). These changes may alter existing charts
    that use densities, usually only slightly. The trade-off should be worthwhile:
    they should drastically improve the accuracy of density estimates, especially
    on bounded data, and should have little noticeable impact on densities on
    unbounded data.
  • density_bounded() now estimates bounds from the data when not provided
    (i.e. when one of bounds is NA). See the bounder_ functions (e.g.
    bounder_cdf(), bounder_cooke()) for more on bounds estimation.
  • Improved Mode() and hdi() estimators based on bounded density estimator.
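
The following sketch contrasts the new default with the pre-3.3.0 behavior, using the density setting quoted in the note above. The sample data are hypothetical:

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(x = rexp(200))  # hypothetical sample bounded below at 0

# Default from 3.3.0 onward: bounded density estimator with estimated bounds
p_new <- ggplot(df, aes(x = x)) +
  stat_slab()

# Old behavior: unbounded estimator with the "nrd0" bandwidth
p_old <- ggplot(df, aes(x = x)) +
  stat_slab(density = density_unbounded(bandwidth = "nrd0"))
```

Plotting both on data with a hard lower bound, as here, is a quick way to see how much an existing chart is affected.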

New features and enhancements:

  • Improved hdci() estimator using quantile estimation.
  • Histograms are now implemented using density_histogram(), a histogram
    density estimator. Finer-grained control of bin positions is now possible
    using the breaks argument (including the new breaks_fixed() for manually-specified
    bin widths) and the align argument (including the new align_boundary() and
    align_center() for choosing how to align bin positions to reference points). (#118)
  • New geom_spike() and stat_spike() for adding spike annotations to slabs
    created with geom_slabinterval() or stat_slabinterval(). See example
    in vignette("slabinterval"). (#58, #124)
  • parse_dist() now outputs distributional objects in a .dist_obj column in
    addition to the name-plus-arguments (.dist+.args) format, and these objects respect truncation
    parameters from prior specifications. This makes it easier to visualize standard
    deviation priors, for example, giving a better solution to #20.
  • marginalize_lkjcorr() adjusts the .dist_obj column output by parse_dist()
    in addition to the .dist and .args columns.
  • geom_lineribbon() now obeys the order aesthetic, allowing you to arbitrarily
    set the draw order of ribbons (#171). Enabled by this change, stat_lineribbon()
    now sets order = after_stat(level) by default, making its draw order more correct
    by ensuring all ribbons of the same level are drawn together.
  • Some improved error messages using cli.
  • Very experimental adaptive KDE is available through the adapt parameter;
    note that it is unsupported and both the implementation and interface are
    highly likely to change.

Deprecations:

  • The slab_type parameter for stat_slabinterval() is now deprecated in favor
    of mapping the corresponding computed variable (pdf or cdf) onto the desired
    aesthetic. For slab_type = "histogram", use the pdf computed variable
    combined with the new density_histogram() density estimator (e.g. set
    density = "histogram"). (#165)
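
A minimal sketch of the replacements for slab_type, assuming a hypothetical sample in a column named x:

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(x = rnorm(500))  # hypothetical sample

# Instead of slab_type = "cdf": map the cdf computed variable onto thickness
p_cdf <- ggplot(df, aes(x = x)) +
  stat_slabinterval(aes(thickness = after_stat(cdf)))

# Instead of slab_type = "histogram": keep the pdf computed variable and
# switch the density estimator to the histogram estimator
p_hist <- ggplot(df, aes(x = x)) +
  stat_slabinterval(density = "histogram")
```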

Bug fixes:

  • Ensure scale transformations work even when no slab is present; e.g. in
    stat_interval(). (#168)
  • Ensure curve_interval() works with posterior::rvars. (#158)
  • geom_lineribbon() draw order is now correct even when some portions of a
    ribbon have NA widths. (#171)
  • Improve the appearance of logical fill conditions at bin edges on histograms. (#175)

ggdist 3.2.1

18 Jan 17:24

New features and enhancements:

  • Support for non-numeric distributions in stat_slabinterval() and
    stat_dotsinterval(), including dist_categorical(), dist_bernoulli(),
    and the upcoming posterior::rvar_factor() type. (#108)
  • Various improvements to dotplot layout in geom_dotsinterval():
    • new layout = "hex" allows a hexagonal circle-packing style layout (#161).
    • new mechanism for smoothing dotplots using the smooth parameter, including
      smooth = "bounded" / smooth = "unbounded" (for "density dotplots") and
      smooth = "discrete" / smooth = "bar" (for improved layout of large-n
      discrete distributions). (#161)
    • a better bin/dot-nudging algorithm using constrained optimization (#163)
    • new overlaps = "keep" option disables bin/dot nudging in "bin", "hex",
      and "weave" layouts. This means layout = "weave" with overlaps = "keep"
      will yield exact dot positions. (#161)
    • The "weave" layout now works properly with side = "both"
    • fixed binning artifacts when there is high density on the edges, particularly
      right edges (#144)
    • use a max binwidth of 1 for discrete distributions (#159)
    • new overflow = "compress" allows layouts to be compressed to fit into the
      geom bounds if a user-specified binwidth would otherwise cause the dots
      to exceed the geom bounds. (#162)
  • Two new shortcut geoms for geom_dotsinterval(): geom_swarm() and geom_weave().
    Both can be used to quickly create "beeswarm"-like plots.
  • A new "mirrored" scale for the side aesthetic, scale_side_mirrored(), makes it
    easier to create mirrored slabs and dotplots. (#142)
  • Custom density estimators can now be used with stat_slabinterval() via the
    density argument, including a new bounded density estimator (density_bounded()).
    (#113)
  • Following the split between size and linewidth aesthetics in ggplot2 3.4,
    the following aesthetics have been updated (#138):
    • interval_size is now linewidth
    • slab_size is now slab_linewidth
    • in geom_slab(), geom_dots(), and geom_lineribbon(), size is now linewidth
  • A new experimental mini domain-specific language for probability expressions
    in ggdist stats: the Pr_() and p_() functions can be used to generate
    after_stat() expressions in terms of ggdist computed variables; e.g.
    aes(thickness = !!Pr_(X <= x)) maps the CDF of the distribution onto the
    thickness aesthetic; aes(thickness = !!p_(x)) maps the PDF onto the
    thickness aesthetic. (#160)
  • Several function families in ggdist now use "currying" (automatic partial
    function application). These function families partially apply themselves until all
    non-optional arguments have been supplied: point_interval(), smooth_...,
    and density_.... See help("automatic-partial-functions").
  • Performance improvements for point_interval() on grouped data frames. (#154)
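
As an illustration of automatic partial application combined with the density argument, the sketch below supplies only an optional argument to density_bounded(). The name dens_01 and the data are hypothetical:

```r
library(ggplot2)
library(ggdist)

# Only the optional bounds argument is supplied, so instead of estimating a
# density this returns a function still waiting for its required argument
dens_01 <- density_bounded(bounds = c(0, 1))
is.function(dens_01)

set.seed(1234)
df <- data.frame(x = rbeta(500, 2, 5))  # hypothetical data on [0, 1]

# The partially applied estimator can be passed directly as the density
p <- ggplot(df, aes(x = x)) +
  stat_halfeye(density = dens_01)
```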

Documentation:

  • Uses of stat() have been replaced with after_stat() to be consistent with
    the deprecation of stat() in ggplot2 3.4.

ggdist 3.2.0

19 Jul 17:11

New features and enhancements:

  • Several computed variables in stat_slabinterval() can now be shared across
    sub-geometries:
    • The .width and level computed variables can now be used in slab / dots
      sub-geometries. These values correspond to the smallest interval computed
      in the interval sub-geometry containing that portion of the slab. This
      gives a more flexible alternative to using cut_cdf_qi() to create densities
      filled according to a set of intervals (this approach also works on
      highest-density intervals, which cut_cdf_qi() does not). Examples in
      vignette("slabinterval") have been updated to use the new approach, and
      an example has been added to vignette("dotsinterval") showing how to
      color dots by intervals.
    • As an experimental feature (currently a bit fragile) enabled via
      options(ggdist.experimental.slab_data_in_intervals = TRUE),
      the pdf and cdf computed variables can now be used in interval
      sub-geometries to get the PDF and CDF at the point summary. pdf_min,
      pdf_max, cdf_min, and cdf_max also give the PDF and CDF at the lower
      and upper ends of the interval. An example in vignette("lineribbon")
      shows how to use this to make lineribbon gradients whose color approximates
      density (as opposed to the classic gradient fan chart examples already
      in that vignette, where color approximates the CDF).
  • scale_thickness_shared() is now provided to allow the thickness scale to be
    shared across geometries, making certain plot types easier to create
    (e.g. plots of prior and posterior densities together). See
    vignette("slabinterval") for an example.
  • If thickness is less than 0 it is normalized to have a minimum of zero when
    normalization is turned on; this makes it easier to use slab functions that
    go below zero. A new example in vignette("slabinterval") shows how to use
    this to create raindrop plots.
  • The stacking order of dots within bins for geom_dotsinterval(layout = "bin")
    can now be set using the order aesthetic. This makes it possible to create
    "stacked" dotplots by mapping a discrete variable onto the order aesthetic
    (#132). As part of this change, bin_dots() now maintains the original data
    order within bins when layout = "bin". See an example in
    vignette("dotsinterval").
  • A new verbose = TRUE flag in geom_dotsinterval() outputs the selected
    binwidth in both data units and normalized parent coordinates. This may be
    useful if you want to start with an automatically-selected bin width and then
    adjust it manually. Though note: if you just want to scale the selected
    bin width to fit within a desired area, it is probably better to use scale,
    and if you want to provide constraints on the bin width, you can pass a
    2-vector to binwidth.
  • The expand argument in stat_slabinterval() can now take a length-two logical
    vector to control expansion to the lower and upper limits respectively (#129).
    Thanks to @teunbrand.
  • geom_dotsinterval() now supports the family aesthetic for setting the font
    used to display its dots (based on a conversation with @gdbassett).
  • Experimental guide_rampbar() for creating gradient-like legends for
    continuous color/fill ramp scales, based on ggplot2::guide_colorbar().
    See an example in vignette("lineribbon").
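
A short sketch of using the shared level computed variable in the slab, in the spirit of the vignette examples mentioned above. The data and the choice of a Brewer fill scale are assumptions for illustration:

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical samples for two groups
df <- data.frame(
  group = rep(c("a", "b"), each = 500),
  value = c(rnorm(500, 0, 1), rnorm(500, 2, 0.5))
)

# Fill each portion of the slab according to the smallest interval
# (of those listed in .width) that contains it
p <- ggplot(df, aes(x = value, y = group)) +
  stat_halfeye(aes(fill = after_stat(level)), .width = c(0.66, 0.95, 1)) +
  scale_fill_brewer(na.translate = FALSE)
```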

Bug fixes:

  • If there are NAs in the thickness aesthetic of a slab, these are now
    rendered as gaps in the slab (#129).
  • Fixed the check for empty x/y scales to avoid extending the scale to cover 0/1
    when plotting distributional objects whose bulk lies outside that region
    (when there is nothing else on the plot).

ggdist 3.1.1

27 Feb 06:13

Bug fixes:

  • If a string is supplied to the point_interval argument of stat_slabinterval(),
    a function with that name will be searched for in the calling environment and
    the ggdist package environment. The latter ensures that stats work when
    ggdist is loaded but not attached to the search path (#128).

ggdist 3.1.0

13 Feb 04:01

New features and enhancements:

  • The stat_sample_... and stat_dist_... families of stats have been merged (#83).
    • All stat_dist_... stats are deprecated in favor of their stat_... counterparts,
      which now understand the dist, args, and arg1...arg9 aesthetics.
    • xdist and ydist can now be used in place of the dist aesthetic to
      specify the axis one is mapping a distribution onto (dist may be
      deprecated in the future).
    • Passing dist-like objects to the x or y aesthetics now raises a helpful
      error message suggesting you probably want to use xdist or ydist.
    • Restructured internals for stats and geoms make it much easier to maintain
      shortcut geoms and stats, eliminating a large amount of code duplication (#106).
    • New expand parameter to stat_slabinterval() allows explicitly setting
      whether or not the slab is expanded to the limits of the scale (rather than
      implicitly setting this based on slab_type).
  • The point_interval() family of functions can now be passed distributional
    and posterior::rvar() objects, meaning that means and modes (in addition
    to medians) and highest-density intervals (in addition to quantile intervals)
    can now be visualized for analytical distributions.
    • As part of this, multivariate distribution objects and rvars will generate
      a .index column when passed to point_interval() functions (#111).
      Based on a suggestion from @mitchelloharawild.
  • New stat_ribbon() provided as a shortcut stat for stat_lineribbon() with
    no line (#48). Also, if you supply only an x or y aesthetic to
    geom_lineribbon(), you will get ribbons without a line (#127).
  • One-sided intervals (i.e. quantiles) can now be calculated using ul() (upper
    limit) or ll() (lower limit), e.g. with point_interval() explicitly or
    via mean_ll(), median_ll(), mode_ll(), mean_ul(), median_ul(),
    or mode_ul() (#49).
  • Constant distributions are now reliably detected in a variety of situations
    and rendered as point masses in both density plots and histograms (#103, #32).
  • Minor improvements and changes to dotplot layouts that may result in minor
    changes to the appearance of existing dotplots:
    • Minor improvements to automatic bin width selection; the maximum
      dot stack height should be closer to or equal to scale more often.
    • A formerly-internal fudge factor of 1.07 for dot sizes is now exposed as
      the default value of the dotsize parameter instead of being applied
      internally. This fudge factor tends (in my opinion) to make dotplots look a
      bit better due to the visual distance between circles, but is (I think)
      better used as an explicit value than an implicit one, hence the change.
      This may create subtle changes to plots that use the dotsize or stackratio
      parameters, but allows those parameters to have a more precise
      geometric interpretation.
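
The sketch below shows the merged stat family and a one-sided interval. It assumes the {distributional} package is installed; the data frame, its column names, and the sample are hypothetical:

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Hypothetical parameters for two analytical distributions
df <- data.frame(
  name = c("a", "b"),
  mu = c(0, 2),
  sigma = c(1, 0.5)
)

# The plain stat_... version now handles distributions directly:
# map a distributional object onto xdist to lay it out along the x axis
p <- ggplot(df, aes(y = name, xdist = dist_normal(mu, sigma))) +
  stat_halfeye()

# One-sided interval: median plus a lower limit on a sample
set.seed(1234)
lower <- median_ll(rnorm(1000), .width = 0.95)
lower
```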

Documentation:

  • New vignette for the stat_dotsinterval() sub-family: vignette("dotsinterval") (#120).
  • Vastly improved and expanded documentation for the stat_slabinterval() and
    geom_slabinterval() family: each shortcut stat/geom now has its own documentation
    page that comprehensively lists all parameters, aesthetics, and computed variables,
    including those pulled in via ... from typically-paired geoms. These docs are
    auto-generated and should be easy to maintain going forward. (#36)
  • The stat_lineribbon() and geom_lineribbon() family now also has separate
    documentation pages with a comprehensive listing of aesthetics and parameters (#107).
  • Ridge plot-like example in vignette("slabinterval") using the new expand
    parameter (#115).

Deprecations and removals:

  • The .prob argument, a long-deprecated alias for .width, was removed.
  • The limits_function, limits_args, slab_function, slab_args, interval_function,
    and interval_args arguments to stat_slabinterval() were removed: these were
    largely internal-use parameters only needed by subclasses of the base class for
    creating shortcut stats, yet added a lot of noise to the documentation, so these
    were replaced with the $compute_limits(), $compute_slabs(), and
    $compute_intervals() methods on the new AbstractStatSlabinterval
    internal base class.

Bug fixes:

  • Improved handling of NAs for analytical distributions.
  • Fixed bug where within-bin order of dots in dotplots for "bin" and "weave"
    layouts could be incorrect with aesthetics mapped at a sub-bin level.
  • stackratios that are not equal to 1 are now accounted for in
    find_dotplot_binwidth() (i.e. automatic dotplot bin width selection).
  • Ensure distinct fill colors in lineribbons are still treated as distinct for
    grouping even if the fill_ramp aesthetic ramps them to the same color.

ggdist 3.0.1

30 Nov 21:29

Bug fixes:

  • Forward-compatibility fixes for distributional >= 0.2.2.9000 (#91).
  • Allow densities for samples of size 1 in stat_sample_slabinterval() (#98).
  • Avoid NOTE about missing linearGradient() function on R < 4.1.
  • Do not draw legend components for inactive sub-geoms in geom_slabinterval().

ggdist 3.0.0

19 Jul 21:31

Breaking changes:

  • The positioning of geom_slabinterval() family geoms when using position_dodge()
    is now slightly different in order to match up with how other geoms are positioned (#85).
    This may slightly change existing charts that use position = "dodge", and in
    some cases may cause slabs to be drawn slightly outside plot boundaries, but makes
    it much easier to combine geom_slabinterval() with other geoms in the expected
    way. If dodging more similar to the old approach is needed, use the new
    "justification-preserving dodge", position_dodgejust(), in place of position_dodge().
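
A minimal sketch of the two dodging options side by side, using hypothetical grouped data:

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical data: two groups, each split into two conditions
df <- data.frame(
  group = rep(c("a", "b"), each = 200),
  condition = rep(c("control", "treatment"), times = 200),
  value = rnorm(400)
)

base <- ggplot(df, aes(x = group, y = value, fill = condition))

# New positioning, consistent with how other geoms are dodged
p_dodge <- base + stat_halfeye(position = "dodge")

# Closer to the pre-3.0.0 layout: the justification-preserving dodge
p_dodgejust <- base + stat_halfeye(position = position_dodgejust())
```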

New features:

  • For geom_slabinterval(), side, justification, and scale can now be
    used as aesthetics instead of parameters, allowing them to vary across slabs
    within the same geom.
  • Varying fills within a slab in geom_slabinterval() can now be drawn as
    true gradients rather than segmented polygons in R >= 4.1 by setting
    fill_type = "gradient". This substantially improves the appearance of
    gradient fills in graphics engines that support it (#44).
  • Improved support for discrete distributions:
    • stat_dist_slabinterval() and company now detect discrete distributions and
      display them as histograms (#19).
    • geom_dotsinterval() now adjusts bin widths on discrete distributions when
      they would result in bins that are taller than the allocated space to ensure
      that they fit within the required space (#42).
  • Allow user-specified lower and/or upper bounds on dynamic geom_dotsinterval()
    bin width by passing a vector of two values to the binwidth parameter.
  • The automatic bin selection algorithm used by geom_dotsinterval() has been
    factored out and exported as find_dotplot_binwidth() and bin_dots() for
    others to use (#77).
  • Previously, curve_interval() used a common (but naive) approach to finding
    a cutoff on data depth to identify the X% "deepest" curves, simply taking the
    envelope around the X% quantile of curves ranked by depth. This is quite
    conservative and tends to create intervals that are too wide; curve_interval()
    now searches for a cutoff in data depth such that X% of curves are contained
    within its envelope (#67).
  • point_interval() and company now accept distributional objects and
    posterior::rvar()s (full support for distributional objects requires
    distributional > 0.2.2).
  • Reduce dependencies substantially, making the geoms more suitable for use by
    other packages (thanks to Brenton Wiernik for the help).

New documentation:

  • Substantial improvements to the documentation of aesthetics and computed
    variables in geom_slabinterval(), stat_slabinterval(), and company, listing
    all custom aesthetics, computed variables, and their usage.

  • Several new examples in vignette("slabinterval"), including "rain cloud"
    plots and an example of histograms for discrete analytical distributions.

Bug fixes:

  • Ensure stat_dist_slabinterval() preserves group order (#88).
  • Improve test coverage up to ~96%.
  • Restore computed variable n for stat_sample_slabinterval().
  • Various improvements in correct NA handling across the geoms (#74, #51).

ggdist 2.4.1

10 Jun 14:54

New features:

  • Added "weave" and "swarm" layouts for dots geoms (#64). These provide
    alternative layouts that keep datapoints in their actual positions on the
    data axis. The "weave" layout maintains rows but not columns and works well
    for quantile dotplots; the "swarm" layout uses the "compactswarm" method from
    beeswarm::beeswarm() (courtesy James Trimble) and works well on sample data.
    See the dotplot section of vignette("slabinterval") for comparisons.
  • Allow the use of unit() to specify bin widths manually for dots geoms and stats,
    which can be helpful when you need dotplots across facets to have the same bin width
    (#53).
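
The sketch below combines the two features above: a "weave" layout with a fixed bin width given via unit(), so that dots are the same size in every facet. The data and facet labels are hypothetical:

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical data: two panels with very different sample sizes
df <- data.frame(
  panel = rep(c("small n", "large n"), times = c(30, 300)),
  x = rnorm(330)
)

# layout = "weave" keeps each dot at its actual position on the data axis;
# a fixed 2 mm bin width keeps dot sizes identical across facets
p <- ggplot(df, aes(x = x)) +
  geom_dots(layout = "weave", binwidth = unit(2, "mm")) +
  facet_wrap(~ panel)
```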

New documentation:

  • Add example of lineribbon gradients using fill_ramp in vignette("lineribbon").
  • Add example of Tukey-like pencils in vignette("slabinterval").
  • Add example of two slabs used together (densities and dotplots to make "rain clouds")
    in vignette("slabinterval").

Bug fixes:

  • Fix issues with ggplot2 3.3.4 (#72) and vdiffr 1.0.
  • Handle interactions between alpha and fill/color properly when not set by user (#62).
  • Use step function for all ECDFs, which should also fix constant CDFs (#55).
  • Move fda to Suggests as it brings in a large number of dependencies and is rarely used.
  • Use trimmed density for mode estimation (#57).