Stats

Summary statistics

PosteriorStats.SummaryStats (Type)
struct SummaryStats

A container for a column table of values computed by summarize.

This object implements the Tables and TableTraits column table interfaces. It has a custom show method.

Note

SummaryStats behaves like an OrderedDict of columns, where the columns can be accessed using either Symbols or a 1-based integer index. However, this interface is not part of the public API and may change in the future. We recommend using it only interactively.

Constructor

SummaryStats(data; name="SummaryStats"[, labels])

Construct a SummaryStats from tabular data.

data must implement the Tables interface. If it contains a column named label, that column is used for the row labels; if the labels keyword is provided, it takes precedence.

Keywords

  • name::AbstractString: The name of the collection of summary statistics, used as the table title in display.
  • labels::AbstractVector: The names of the parameters in data, used as row labels in display. If not provided, then the column label in data will be used if it exists. Otherwise, the parameter names will be numeric indices.
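
For illustration, a minimal construction sketch, assuming a NamedTuple of column vectors (a Tables-compatible column table); the column names and labels here are hypothetical:

using PosteriorStats
data = (mean=[0.1, 10.2], std=[0.99, 0.98])
stats = SummaryStats(data; labels=[:a, :b], name="MyStats")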
source
PosteriorStats.default_summary_stats (Function)
default_summary_stats(kind::Symbol=:all; kwargs...)

Return a collection of stats functions based on the named preset kind.

These functions are then passed to summarize.

Arguments

  • kind::Symbol: The named collection of summary statistics to be computed:
    • :all: Everything in :stats and :diagnostics
    • :stats: mean, std, <ci>
    • :diagnostics: ess_tail, ess_bulk, rhat, mcse_mean, mcse_std
    • :all_median: Everything in :stats_median and :diagnostics_median
    • :stats_median: median, mad, <ci>
    • :diagnostics_median: ess_median, ess_tail, rhat, mcse_median

Keywords

  • ci_fun=eti: The function to compute the credible interval <ci>, if any. Supported options are eti and hdi. CI column name is <ci_fun><100*ci_prob>.
  • ci_prob=0.94: The probability mass to be contained in the credible interval <ci>.
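
As a usage sketch, assuming the returned collection is splatted into summarize as its stats_funs:

using PosteriorStats
x = randn(1_000, 4, 3)                                   # (draws, chains, params)
summarize(x, default_summary_stats(:stats_median; ci_prob=0.89)...)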
source
PosteriorStats.summarize (Function)
summarize(data; kind=:all, kwargs...) -> SummaryStats
summarize(data, stats_funs...; kwargs...) -> SummaryStats

Compute summary statistics on each parameter in data.

Arguments

  • data: a 3D array of real samples with shape (draws, chains, params) or another object for which a summarize method is defined.
  • stats_funs: a collection of functions, each of which reduces a matrix with shape (draws, chains) to a scalar or a collection of scalars. Alternatively, an item in stats_funs may be a Pair of the form name => fun, specifying the name to be used for the statistic, or of the form (name1, ...) => fun when the function returns a collection. When the function returns a collection, the names in this latter format must be provided.

Keywords

  • var_names: a collection specifying the names of the parameters in data. If not provided, the names are the indices of the parameter dimension in data.

  • name::String: the name of the summary statistics, used as the table title in display.

  • kind::Symbol: The named collection of summary statistics to be computed:

    • :all: Everything in :stats and :diagnostics
    • :stats: mean, std, <ci>
    • :diagnostics: ess_tail, ess_bulk, rhat, mcse_mean, mcse_std
    • :all_median: Everything in :stats_median and :diagnostics_median
    • :stats_median: median, mad, <ci>
    • :diagnostics_median: ess_median, ess_tail, rhat, mcse_median
  • kwargs: additional keyword arguments passed to default_summary_stats, including:

    • ci_fun=eti: The function to compute the credible interval <ci>, if any. Supported options are eti and hdi. CI column name is <ci_fun><100*ci_prob>.
    • ci_prob=0.94: The probability mass to be contained in the credible interval <ci>.

See also SummaryStats, default_summary_stats

Extended Help

Examples

Compute all summary statistics (the default):

Display precision

When an estimator and its MCSE are both computed, the MCSE is used to determine the number of significant digits that will be displayed.

julia> using Statistics, StatsBase

julia> x = randn(1000, 4, 3) .+ reshape(0:10:20, 1, 1, :);

julia> summarize(x)
SummaryStats
       mean   std  eti94          ess_tail  ess_bulk  rhat  mcse_mean  mcse_std
 1   0.0003  0.99  -1.83 .. 1.89      3567      3663  1.00      0.016     0.012
 2  10.02    0.99   8.17 .. 11.9      3841      3906  1.00      0.016     0.011
 3  19.98    0.99   18.1 .. 21.9      3892      3749  1.00      0.016     0.012

Compute just the default statistics with an 89% HDI, and provide the parameter names:

julia> var_names=[:x, :y, :z];

julia> summarize(x; var_names, kind=:stats, ci_fun=hdi, ci_prob=0.89)
SummaryStats
         mean    std  hdi89
 x   0.000275  0.989  -1.63 .. 1.52
 y  10.0       0.988   8.53 .. 11.6
 z  20.0       0.988   18.5 .. 21.6

Compute Statistics.mean, Statistics.std and the Monte Carlo standard error (MCSE) of the mean estimate:

julia> summarize(x, mean, std, :mcse_mean => sem; name="Mean/Std")
Mean/Std
       mean    std  mcse_mean
 1   0.0003  0.989      0.016
 2  10.02    0.988      0.016
 3  19.98    0.988      0.016

Compute multiple quantiles simultaneously:

julia> percs = (5, 25, 50, 75, 95);

julia> summarize(x, Symbol.(:q, percs) => Base.Fix2(quantile, percs ./ 100))
SummaryStats
       q5     q25       q50     q75    q95
 1  -1.61  -0.668   0.00447   0.653   1.64
 2   8.41   9.34   10.0      10.7    11.6
 3  18.4   19.3    20.0      20.6    21.6

Extending summarize to custom types

To support computing summary statistics from a custom object MyType, overload the method summarize(::MyType, stats_funs...; kwargs...), which should ultimately call summarize(::AbstractArray{<:Union{Real,Missing},3}, stats_funs...; other_kwargs...), where other_kwargs are the keyword arguments passed to summarize.
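
A minimal sketch of such an extension, using a hypothetical MyPosterior wrapper type:

using PosteriorStats

struct MyPosterior
    draws::Array{Float64,3}   # (draws, chains, params)
    names::Vector{Symbol}
end

function PosteriorStats.summarize(post::MyPosterior, stats_funs...; kwargs...)
    # Forward to the 3D-array method, supplying the stored parameter names.
    return summarize(post.draws, stats_funs...; var_names=post.names, kwargs...)
end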

source

Credible intervals

PosteriorStats.eti (Function)
eti(samples::AbstractVecOrMat{<:Real}; [prob, kwargs...]) -> IntervalSets.ClosedInterval
eti(samples::AbstractArray{<:Real}; [prob, kwargs...]) -> Array{<:IntervalSets.ClosedInterval}

Estimate the equal-tailed interval (ETI) of samples for the probability prob.

The ETI of a given probability is the credible interval with the property that the probability of being below the interval is equal to the probability of being above it. That is, it is defined by the (1-prob)/2 and 1 - (1-prob)/2 quantiles of the samples.
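
For intuition, a minimal sketch of that quantile definition (eti itself may use a different quantile estimator and handles multiple parameters):

using Statistics
x = randn(2_000)
prob = 0.83
lower, upper = quantile(x, [(1 - prob) / 2, 1 - (1 - prob) / 2])
# eti(x; prob) returns an interval with approximately these endpoints.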

See also: eti!, hdi, hdi!.

Arguments

  • samples: an array of shape (draws[, chains[, params...]]). If multiple parameters are present, a marginal ETI is computed for each.

Keywords

  • prob: the probability mass to be contained in the ETI. Default is 0.94.
  • kwargs: remaining keywords are passed to Statistics.quantile.

Returns

  • intervals: If samples is a vector or matrix, then a single IntervalSets.ClosedInterval is returned. Otherwise, an array with shape (params...,) is returned, containing a marginal ETI for each parameter.
Note

Any default value of prob is arbitrary. The default value of prob=0.94 instead of a more common default like prob=0.95 is chosen to remind the user of this arbitrariness.

Examples

Here we calculate the 83% ETI for a normal random variable:

julia> x = randn(2_000);

julia> eti(x; prob=0.83)
-1.3740585250299766 .. 1.2860771129421198

We can also calculate the ETI for a 3-dimensional array of samples:

julia> x = randn(1_000, 1, 1) .+ reshape(0:5:10, 1, 1, :);

julia> eti(x)
3-element Vector{IntervalSets.ClosedInterval{Float64}}:
 -1.951006825019686 .. 1.9011666217153793
 3.048993174980314 .. 6.9011666217153795
 8.048993174980314 .. 11.90116662171538
source
eti(data::InferenceData; kwargs...) -> Dataset
eti(data::Dataset; kwargs...) -> Dataset

Calculate the equal-tailed interval (ETI) for each parameter in the data.

For more details and a description of the kwargs, see PosteriorStats.eti.

source
PosteriorStats.eti! (Function)
eti!(samples::AbstractArray{<:Real}; [prob, kwargs...])

A version of eti that partially sorts samples in-place while computing the ETI.

See also: eti, hdi, hdi!.

source
PosteriorStats.hdi (Function)
hdi(samples::AbstractVecOrMat{<:Real}; [prob, sorted, method]) -> IntervalSets.ClosedInterval
hdi(samples::AbstractArray{<:Real}; [prob, sorted, method]) -> Array{<:IntervalSets.ClosedInterval}

Estimate the highest density interval (HDI) of samples for the probability prob.

The HDI is the minimum width Bayesian credible interval (BCI). That is, it is the smallest possible interval containing (100*prob)% of the probability mass Hyndman1996. This implementation uses the algorithm of ChenShao1999.

See also: hdi!, eti, eti!.

Arguments

  • samples: an array of shape (draws[, chains[, params...]]). If multiple parameters are present, a marginal HDI is computed for each.

Keywords

  • prob: the probability mass to be contained in the HDI. Default is 0.94.
  • sorted=false: if true, the input samples are assumed to be sorted.
  • method::Symbol: the method used to estimate the HDI. Available options are:
    • :unimodal: Assumes a unimodal distribution (default). Bounds are entries in samples.
    • :multimodal: Fits a density estimator to samples and finds the HDI of the estimated density. For continuous data, the density estimator is a kernel density estimate (KDE) computed using kde_reflected. For discrete data, a histogram with bin width 1 is used.
    • :multimodal_sample: Like :multimodal, but uses the density estimator to find the entries in samples with the highest density and computes the HDI from those points. This can correct for inaccuracies in the density estimator.
  • is_discrete::Union{Bool,Nothing}=nothing: Specify if the data is discrete (integer-valued). If nothing, it's automatically determined.
  • kwargs: For continuous data and multimodal methods, remaining keywords are forwarded to kde_reflected.

Returns

  • intervals: If samples is a vector or matrix, then a single IntervalSets.ClosedInterval is returned for :unimodal method, or a vector of ClosedInterval for multimodal methods. For higher dimensional inputs, an array with the shape (params...,) is returned, containing marginal HDIs for each parameter.
Note

Any default value of prob is arbitrary. The default value of prob=0.94 instead of a more common default like prob=0.95 is chosen to remind the user of this arbitrariness.

Examples

Here we calculate the 83% HDI for a normal random variable:

julia> x = randn(2_000);

julia> hdi(x; prob=0.83)
-1.3826605224220527 .. 1.259817149822839

We can also calculate the HDI for a 3-dimensional array of samples:

julia> x = randn(1_000, 1, 1) .+ reshape(0:5:10, 1, 1, :);

julia> hdi(x)
3-element Vector{IntervalSets.ClosedInterval{Float64}}:
 -1.6402043796029502 .. 2.041852066407182
 3.35979562039705 .. 7.041852066407182
 8.35979562039705 .. 12.041852066407182

For multimodal distributions, you can use the :multimodal method:

julia> x = vcat(randn(1000), randn(1000) .+ 5);

julia> hdi(x; method=:multimodal)
2-element Vector{IntervalSets.ClosedInterval{Float64}}:
 -1.8882401079608677 .. 2.0017686164555037
 2.9839268475847436 .. 6.9235952578447275

References

  • Hyndman1996 Hyndman, Am. Stat., 50:2 (1996)
  • ChenShao1999 Chen & Shao, J. Comput. Graph. Stat., 8:1 (1999)
source
hdi(data::InferenceData; kwargs...) -> Dataset
hdi(data::Dataset; kwargs...) -> Dataset

Calculate the highest density interval (HDI) for each parameter in the data.

For more details and a description of the kwargs, see PosteriorStats.hdi.

source
PosteriorStats.hdi! (Function)
hdi!(samples::AbstractArray{<:Real}; [prob, sorted])

A version of hdi that partially sorts samples in-place while computing the HDI.

See also: hdi, eti, eti!.

source

Pareto-smoothed importance sampling

PSIS.PSISResult (Type)
PSISResult

Result of Pareto-smoothed importance sampling (PSIS) using psis.

Properties

  • log_weights: un-normalized Pareto-smoothed log weights
  • weights: normalized Pareto-smoothed weights (allocates a copy)
  • pareto_shape: Pareto $k=ξ$ shape parameter
  • nparams: number of parameters in log_weights
  • ndraws: number of draws in log_weights
  • nchains: number of chains in log_weights
  • reff: the ratio of the effective sample size of the unsmoothed importance ratios and the actual sample size.
  • ess: estimated effective sample size of estimate of mean using smoothed importance samples (see ess_is)
  • tail_length: length of the upper tail of log_weights that was smoothed
  • tail_dist: the generalized Pareto distribution that was fit to the tail of log_weights. Note that the tail weights are scaled to have a maximum of 1, so tail_dist * exp(maximum(log_ratios)) is the corresponding fit directly to the tail of log_ratios.
  • normalized::Bool: indicates whether log_weights are log-normalized along the sample dimensions.

Diagnostic

The pareto_shape parameter $k=ξ$ of the generalized Pareto distribution tail_dist can be used to diagnose reliability and convergence of estimates using the importance weights VehtariSimpson2021.

  • if $k < \frac{1}{3}$, importance sampling is stable, and importance sampling (IS) and PSIS both are reliable.
  • if $k ≤ \frac{1}{2}$, then the importance ratio distribution has finite variance, and the central limit theorem holds. As $k$ approaches the upper bound, IS becomes less reliable, while PSIS still works well but with a higher RMSE.
  • if $\frac{1}{2} < k ≤ 0.7$, then the variance is infinite, and IS can behave quite poorly. However, PSIS works well in this regime.
  • if $0.7 < k ≤ 1$, then it quickly becomes impractical to collect enough importance weights to reliably compute estimates, and importance sampling is not recommended.
  • if $k > 1$, then neither the variance nor the mean of the raw importance ratios exists. The convergence rate is close to zero, and bias can be large with practical sample sizes.

See PSISPlots.paretoshapeplot for a diagnostic plot.

References

  • VehtariSimpson2021 Vehtari et al., Pareto smoothed importance sampling, arXiv:1507.02646 (2021)
source
PSIS.ess_is (Function)
ess_is(weights; reff=1)

Estimate effective sample size (ESS) for importance sampling over the sample dimensions.

Given normalized weights $w_{1:n}$, the ESS is estimated using the L2-norm of the weights:

\[\mathrm{ESS}(w_{1:n}) = \frac{r_{\mathrm{eff}}}{\sum_{i=1}^n w_i^2}\]

where $r_{\mathrm{eff}}$ is the relative efficiency of the log_weights.
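
A minimal sketch of this estimate for a single vector of normalized weights, assuming reff = 1:

w = rand(1_000)
w ./= sum(w)                  # normalized importance weights
ess_hat = 1 / sum(abs2, w)    # the formula above with r_eff = 1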

ess_is(result::PSISResult; bad_shape_nan=true)

Estimate ESS for Pareto-smoothed importance sampling.

Note

ESS estimates for Pareto shape values $k > 0.7$, which are unreliable and misleadingly high, are set to NaN. To avoid this, set bad_shape_nan=false.

source
PSIS.PSISPlots.paretoshapeplot (Function)
paretoshapeplot(values; showlines=false, ...)
paretoshapeplot!(values; showlines=false, kwargs...)

Plot shape parameters of fitted Pareto tail distributions for diagnosing convergence.

values may be either a vector of Pareto shape parameters or a PSIS.PSISResult.

If showlines==true, horizontal lines indicating relevant Pareto shape thresholds are drawn. See PSIS.PSISResult for an explanation of the thresholds.

All remaining kwargs are forwarded to the plotting function.

See psis, PSISResult.

Examples

using PSIS, Distributions, Plots
proposal = Normal()
target = TDist(7)
x = rand(proposal, 1_000, 100)
log_ratios = logpdf.(target, x) .- logpdf.(proposal, x)
result = psis(log_ratios)
paretoshapeplot(result)

We can also plot the Pareto shape parameters directly:

paretoshapeplot(result.pareto_shape)

We can also use plot directly:

plot(result.pareto_shape; showlines=true)
source
PSIS.psis (Function)
psis(log_ratios, reff = 1.0; kwargs...) -> PSISResult
psis!(log_ratios, reff = 1.0; kwargs...) -> PSISResult

Compute Pareto smoothed importance sampling (PSIS) log weights VehtariSimpson2021.

While psis computes smoothed log weights out-of-place, psis! smooths them in-place.

Arguments

  • log_ratios: an array of logarithms of importance ratios, with size (draws, [chains, [parameters...]]), where chains>1 would be used when chains are generated using Markov chain Monte Carlo.
  • reff::Union{Real,AbstractArray}: the ratio(s) of effective sample size of log_ratios and the actual sample size reff = ess/(draws * chains), used to account for autocorrelation, e.g. due to Markov chain Monte Carlo. If an array, it must have the size (parameters...,) to match log_ratios.

Keywords

  • warn=true: If true, warning messages are delivered
  • normalize=true: If true, the log-weights will be log-normalized so that exp.(log_weights) sums to 1 along the sample dimensions.

Returns

  • result: a PSISResult object containing the results of the Pareto-smoothing.

A warning is raised if the Pareto shape parameter $k ≥ 0.7$. See PSISResult for details and PSISPlots.paretoshapeplot for a diagnostic plot.

Examples

Here we smooth log importance ratios for importance sampling 30 isotropic Student $t$-distributed parameters using standard normal distributions as proposals.

julia> using Distributions

julia> proposal, target = Normal(), TDist(7);

julia> x = rand(proposal, 1_000, 1, 30);  # (ndraws, nchains, nparams)

julia> log_ratios = @. logpdf(target, x) - logpdf(proposal, x);

julia> result = psis(log_ratios)
┌ Warning: 9 parameters had Pareto shape values 0.7 < k ≤ 1. Resulting importance sampling estimates are likely to be unstable.
└ @ PSIS ~/.julia/packages/PSIS/...
┌ Warning: 1 parameters had Pareto shape values k > 1. Corresponding importance sampling estimates are likely to be unstable and are unlikely to converge with additional samples.
└ @ PSIS ~/.julia/packages/PSIS/...
PSISResult with 1000 draws, 1 chains, and 30 parameters
Pareto shape (k) diagnostic values:
                        Count       Min. ESS
 (-Inf, 0.5]  good       7 (23.3%)  959
  (0.5, 0.7]  okay      13 (43.3%)  938
    (0.7, 1]  bad        9 (30.0%)  ——
    (1, Inf)  very bad   1 (3.3%)   ——

If the draws were generated using MCMC, we can compute the relative efficiency using MCMCDiagnosticTools.ess.

julia> using MCMCDiagnosticTools

julia> reff = ess(log_ratios; kind=:basic, split_chains=1, relative=true);

julia> result = psis(log_ratios, reff)
┌ Warning: 9 parameters had Pareto shape values 0.7 < k ≤ 1. Resulting importance sampling estimates are likely to be unstable.
└ @ PSIS ~/.julia/packages/PSIS/...
┌ Warning: 1 parameters had Pareto shape values k > 1. Corresponding importance sampling estimates are likely to be unstable and are unlikely to converge with additional samples.
└ @ PSIS ~/.julia/packages/PSIS/...
PSISResult with 1000 draws, 1 chains, and 30 parameters
Pareto shape (k) diagnostic values:
                        Count       Min. ESS
 (-Inf, 0.5]  good       9 (30.0%)  806
  (0.5, 0.7]  okay      11 (36.7%)  842
    (0.7, 1]  bad        9 (30.0%)  ——
    (1, Inf)  very bad   1 (3.3%)   ——

References

  • VehtariSimpson2021 Vehtari et al., Pareto smoothed importance sampling, arXiv:1507.02646 (2021)
source
PSIS.psis! (Function)

psis!(log_ratios, reff = 1.0; kwargs...) -> PSISResult

Smooth the log importance ratios in-place and return a PSISResult. psis! shares its docstring with psis; see PSIS.psis above for the full description of the arguments, keywords, diagnostics, and examples.

source

LOO and WAIC

PosteriorStats.AbstractELPDResult (Type)
abstract type AbstractELPDResult

An abstract type representing the result of an ELPD computation.

Every subtype stores estimates of both the expected log predictive density (elpd) and the effective number of parameters p, as well as standard errors and pointwise estimates of each, from which other relevant estimates can be computed.

Subtypes implement the following functions:

  • elpd_estimates

source
PosteriorStats.PSISLOOResult (Type)

Results of Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO).

See also: loo, AbstractELPDResult

  • estimates: Estimates of the expected log pointwise predictive density (ELPD) and effective number of parameters (p)

  • pointwise: Pointwise estimates

  • psis_result: A PSIS.PSISResult with Pareto-smoothed importance sampling (PSIS) results

source
PosteriorStats.WAICResult (Type)

Results of computing the widely applicable information criterion (WAIC).

See also: waic, AbstractELPDResult

  • estimates: Estimates of the expected log pointwise predictive density (ELPD) and effective number of parameters (p)

  • pointwise: Pointwise estimates

source
PosteriorStats.elpd_estimates (Function)
elpd_estimates(result::AbstractELPDResult; pointwise=false) -> (; elpd, se_elpd, lpd)

Return the (E)LPD estimates from the result.

source
PosteriorStats.information_criterion (Function)
information_criterion(elpd, scale::Symbol)

Compute the information criterion for the given scale from the elpd estimate.

scale must be one of (:deviance, :log, :negative_log).

See also: loo, waic

source
information_criterion(result::AbstractELPDResult, scale::Symbol; pointwise=false)

Compute information criterion for the given scale from the existing ELPD result.

scale must be one of (:deviance, :log, :negative_log).

If pointwise=true, then pointwise estimates are returned.
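
A usage sketch, assuming an existing PSIS-LOO result:

using ArviZExampleData, PosteriorStats
idata = load_example_data("centered_eight")
result = loo(idata)
information_criterion(result, :deviance)               # deviance scale, i.e. -2 * elpd
information_criterion(result, :log; pointwise=true)    # pointwise ELPD values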

source
PosteriorStats.loo (Function)
loo(log_likelihood; reff=nothing, kwargs...) -> PSISLOOResult{<:NamedTuple,<:NamedTuple}

Compute the Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO). Vehtari2017, LOOFAQ

log_likelihood must be an array of log-likelihood values with shape (draws, chains[, params...]).

Keywords

  • reff::Union{Real,AbstractArray{<:Real}}: The relative effective sample size(s) of the likelihood values. If an array, it must have the same data dimensions as the corresponding log-likelihood variable. If not provided, then this is estimated using MCMCDiagnosticTools.ess.
  • kwargs: Remaining keywords are forwarded to PSIS.psis.

See also: PSISLOOResult, waic

Examples

Manually compute $R_\mathrm{eff}$ and calculate PSIS-LOO of a model:

julia> using ArviZExampleData, LogExpFunctions, MCMCDiagnosticTools

julia> idata = load_example_data("centered_eight");

julia> log_like = PermutedDimsArray(idata.log_likelihood.obs, (:draw, :chain, :school));

julia> reff = ess(softmax(log_like; dims=(1, 2)); kind=:basic, split_chains=1, relative=true);

julia> loo(log_like; reff)
PSISLOOResult with estimates
 elpd  se_elpd    p  se_p
  -31      1.4  0.9  0.33

and PSISResult with 500 draws, 4 chains, and 8 parameters
Pareto shape (k) diagnostic values:
                    Count      Min. ESS
 (-Inf, 0.5]  good  4 (50.0%)  270
  (0.5, 0.7]  okay  4 (50.0%)  307

References

  • Vehtari2017 Vehtari et al. Stat. Comput. 27 (2017).
  • LOOFAQ Vehtari. Cross-validation FAQ.
source
loo(data::Dataset; [var_name::Symbol,] kwargs...) -> PSISLOOResult{<:NamedTuple,<:Dataset}
loo(data::InferenceData; [var_name::Symbol,] kwargs...) -> PSISLOOResult{<:NamedTuple,<:Dataset}

Compute PSIS-LOO from log-likelihood values in data.

If more than one log-likelihood variable is present, then var_name must be provided.

For more details and a description of the kwargs, see PosteriorStats.loo.

Examples

Calculate PSIS-LOO of a model:

julia> using ArviZExampleData, PosteriorStats

julia> idata = load_example_data("centered_eight");

julia> loo(idata)
PSISLOOResult with estimates
 elpd  se_elpd    p  se_p
  -31      1.4  0.9  0.33

and PSISResult with 500 draws, 4 chains, and 8 parameters
Pareto shape (k) diagnostic values:
                    Count      Min. ESS
 (-Inf, 0.5]  good  4 (50.0%)  270
  (0.5, 0.7]  okay  4 (50.0%)  307
source
PosteriorStats.waic (Function)
waic(log_likelihood::AbstractArray) -> WAICResult{<:NamedTuple,<:NamedTuple}

Compute the widely applicable information criterion (WAIC). Watanabe2010

log_likelihood must be an array of log-likelihood values with shape (draws, chains[, params...]).

See also: WAICResult, loo

Examples

Calculate WAIC of a model:

julia> using ArviZExampleData

julia> idata = load_example_data("centered_eight");

julia> log_like = PermutedDimsArray(idata.log_likelihood.obs, (:draw, :chain, :school));

julia> waic(log_like)
WAICResult with estimates
 elpd  se_elpd    p  se_p
  -31      1.4  0.9  0.32

References

  • Watanabe2010 Watanabe, J. Mach. Learn. Res. 11 (2010)
source
waic(data::Dataset; [var_name::Symbol]) -> WAICResult{<:NamedTuple,<:Dataset}
waic(data::InferenceData; [var_name::Symbol]) -> WAICResult{<:NamedTuple,<:Dataset}

Compute WAIC from log-likelihood values in data.

If more than one log-likelihood variable is present, then var_name must be provided.

See PosteriorStats.waic for more details.

Examples

Calculate WAIC of a model:

julia> using ArviZExampleData, PosteriorStats

julia> idata = load_example_data("centered_eight");

julia> waic(idata)
WAICResult with estimates
 elpd  se_elpd    p  se_p
  -31      1.4  0.9  0.32
source

Model comparison

PosteriorStats.ModelComparisonResult (Type)
ModelComparisonResult

Result of model comparison using ELPD.

This struct implements the Tables and TableTraits interfaces.

Each field returns a collection of the corresponding entry for each model:

  • name: Names of the models, if provided.

  • rank: Ranks of the models (ordered by decreasing ELPD)

  • elpd_diff: ELPD of a model subtracted from the largest ELPD of any model

  • se_elpd_diff: Standard error of the ELPD difference

  • weight: Model weights computed with weights_method

  • elpd_result: AbstractELPDResults for each model, which can be used to access useful stats like ELPD estimates, pointwise estimates, and Pareto shape values for PSIS-LOO

  • weights_method: Method used to compute model weights with model_weights

source
PosteriorStats.compare (Function)
compare(models; kwargs...) -> ModelComparisonResult

Compare models based on their expected log pointwise predictive density (ELPD).

The ELPD is estimated either by Pareto smoothed importance sampling leave-one-out cross-validation (LOO) or using the widely applicable information criterion (WAIC). loo is the default and recommended method for computing the ELPD. For more theory, see Spiegelhalter2002.

Arguments

  • models: a Tuple, NamedTuple, or AbstractVector whose values are either AbstractELPDResult entries or any argument to elpd_method.

Keywords

  • weights_method::AbstractModelWeightsMethod=Stacking(): the method to be used to weight the models. See model_weights for details
  • elpd_method=loo: a method that computes an AbstractELPDResult from an argument in models.
  • sort::Bool=true: Whether to sort models by decreasing ELPD.

Returns

  • ModelComparisonResult: A container for the model comparison results. The fields contain a similar collection to models.

Examples

Compare the centered and non-centered models of the eight schools problem using the defaults: loo and Stacking weights. A custom myloo method formats the inputs as expected by loo.

julia> using ArviZExampleData

julia> models = (
           centered=load_example_data("centered_eight"),
           non_centered=load_example_data("non_centered_eight"),
       );

julia> function myloo(idata)
           log_like = PermutedDimsArray(idata.log_likelihood.obs, (2, 3, 1))
           return loo(log_like)
       end;

julia> mc = compare(models; elpd_method=myloo)
┌ Warning: 1 parameters had Pareto shape values 0.7 < k ≤ 1. Resulting importance sampling estimates are likely to be unstable.
└ @ PSIS ~/.julia/packages/PSIS/...
ModelComparisonResult with Stacking weights
               rank  elpd  se_elpd  elpd_diff  se_elpd_diff  weight    p  se_p
 non_centered     1   -31      1.5       0            0.0       1.0  0.9  0.32
 centered         2   -31      1.4       0.03         0.061     0.0  0.9  0.33
julia> mc.weight |> pairs
pairs(::NamedTuple) with 2 entries:
  :non_centered => 1.0
  :centered     => 3.50546e-31

Compare the same models from pre-computed PSIS-LOO results and computing BootstrappedPseudoBMA weights:

julia> elpd_results = mc.elpd_result;

julia> compare(elpd_results; weights_method=BootstrappedPseudoBMA())
ModelComparisonResult with BootstrappedPseudoBMA weights
               rank  elpd  se_elpd  elpd_diff  se_elpd_diff  weight    p  se_p
 non_centered     1   -31      1.5       0            0.0      0.51  0.9  0.32
 centered         2   -31      1.4       0.03         0.061    0.49  0.9  0.33

References

  • Spiegelhalter2002 Spiegelhalter et al., J. R. Stat. Soc. Ser. B 64 (2002)
source
PosteriorStats.model_weights (Function)
model_weights(elpd_results; method=Stacking())
model_weights(method::AbstractModelWeightsMethod, elpd_results)

Compute weights for each model in elpd_results using method.

elpd_results is a Tuple, NamedTuple, or AbstractVector with AbstractELPDResult entries. The weights are returned in the same type of collection.

Stacking is the recommended approach, as it performs well even when the true data generating process is not included among the candidate models. See Yao2018 for details.

See also: AbstractModelWeightsMethod, compare

Examples

Compute Stacking weights for two models:

julia> using ArviZExampleData

julia> models = (
           centered=load_example_data("centered_eight"),
           non_centered=load_example_data("non_centered_eight"),
       );

julia> elpd_results = map(models) do idata
           log_like = PermutedDimsArray(idata.log_likelihood.obs, (2, 3, 1))
           return loo(log_like)
       end;
┌ Warning: 1 parameters had Pareto shape values 0.7 < k ≤ 1. Resulting importance sampling estimates are likely to be unstable.
└ @ PSIS ~/.julia/packages/PSIS/...

julia> model_weights(elpd_results; method=Stacking()) |> pairs
pairs(::NamedTuple) with 2 entries:
  :centered     => 3.50546e-31
  :non_centered => 1.0

Now we compute BootstrappedPseudoBMA weights for the same models:

julia> model_weights(elpd_results; method=BootstrappedPseudoBMA()) |> pairs
pairs(::NamedTuple) with 2 entries:
  :centered     => 0.492513
  :non_centered => 0.507487

References

  • Yao2018 Yao et al. Bayesian Analysis 13, 3 (2018)
source

The following model weighting methods are available:

PosteriorStats.BootstrappedPseudoBMA (Type)
struct BootstrappedPseudoBMA{R<:Random.AbstractRNG, T<:Real} <: AbstractModelWeightsMethod

Model weighting method using pseudo Bayesian model averaging with Akaike-type weighting and the Bayesian bootstrap (pseudo-BMA+) Yao2018.

The Bayesian bootstrap stabilizes the model weights.

BootstrappedPseudoBMA(; rng=Random.default_rng(), samples=1_000, alpha=1)
BootstrappedPseudoBMA(rng, samples, alpha)

Construct the method.

  • rng::Random.AbstractRNG: The random number generator to use for the Bayesian bootstrap

  • samples::Int64: The number of samples to draw for bootstrapping

  • alpha::Real: The shape parameter in the Dirichlet distribution used for the Bayesian bootstrap. The default (1) corresponds to a uniform distribution on the simplex.

See also: Stacking

References

  • Yao2018 Yao et al. Bayesian Analysis 13, 3 (2018)
source
PosteriorStats.PseudoBMA (Type)
struct PseudoBMA <: AbstractModelWeightsMethod

Model weighting method using pseudo Bayesian Model Averaging (pseudo-BMA) and Akaike-type weighting.

PseudoBMA(; regularize=false)
PseudoBMA(regularize)

Construct the method with optional regularization of the weights using the standard error of the ELPD estimate.

Note

This approach is not recommended, as it produces unstable weight estimates. It is recommended to instead use BootstrappedPseudoBMA to stabilize the weights or Stacking. For details, see Yao2018.

See also: Stacking

References

  • Yao2018 Yao et al. Bayesian Analysis 13, 3 (2018)
source
PosteriorStats.Stacking (Type)
struct Stacking{O<:Optim.AbstractOptimizer} <: AbstractModelWeightsMethod

Model weighting using stacking of predictive distributions Yao2018.

Stacking(; optimizer=Optim.LBFGS(), options=Optim.Options())
Stacking(optimizer[, options])

Construct the method, optionally customizing the optimization.

  • optimizer::Optim.AbstractOptimizer: The optimizer to use for the optimization of the weights. The optimizer must support projected gradient optimization via a manifold field.

  • options::Optim.Options: The Optim options to use for the optimization of the weights.

See also: BootstrappedPseudoBMA

References

  • Yao2018 Yao et al. Bayesian Analysis 13, 3 (2018)
source

Predictive checks

PosteriorStats.loo_pit (Function)
loo_pit(y, y_pred, log_weights) -> Union{Real,AbstractArray}

Compute leave-one-out probability integral transform (LOO-PIT) checks.

Arguments

  • y: array of observations with shape (params...,)
  • y_pred: array of posterior predictive samples with shape (draws, chains, params...).
  • log_weights: array of normalized log LOO importance weights with shape (draws, chains, params...).

Returns

  • pitvals: LOO-PIT values with same size as y. If y is a scalar, then pitvals is a scalar.

LOO-PIT is a marginal posterior predictive check. If $y_{-i}$ is the array $y$ of observations with the $i$th observation left out, and $y_i^*$ is a posterior prediction of the $i$th observation, then the LOO-PIT value for the $i$th observation is defined as

\[P(y_i^* \le y_i \mid y_{-i}) = \int_{-\infty}^{y_i} p(y_i^* \mid y_{-i}) \mathrm{d} y_i^*\]

The LOO posterior predictions and the corresponding observations should have similar distributions, so if conditional predictive distributions are well-calibrated, then for continuous data, all LOO-PIT values should be approximately uniformly distributed on $[0, 1]$. Gabry2019

Warning

For discrete data, the LOO-PIT values will typically not be uniformly distributed on $[0, 1]$, and this function is not recommended.
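
For intuition, a sketch of the underlying estimator for a single continuous observation: the LOO-PIT value is the importance-weighted proportion of predictive draws that do not exceed the observation. The inputs here are hypothetical, and the function applies this computation across all parameters at once:

y_i = 0.3                                          # one observation
y_pred_i = randn(1_000, 4)                         # predictive draws, (draws, chains)
log_w_i = fill(-log(length(y_pred_i)), 1_000, 4)   # uniform normalized log weights
pit_i = sum(exp.(log_w_i) .* (y_pred_i .<= y_i))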

Examples

Calculate LOO-PIT values using as test quantity the observed values themselves.

julia> using ArviZExampleData

julia> idata = load_example_data("centered_eight");

julia> y = idata.observed_data.obs;

julia> y_pred = PermutedDimsArray(idata.posterior_predictive.obs, (:draw, :chain, :school));

julia> log_like = PermutedDimsArray(idata.log_likelihood.obs, (:draw, :chain, :school));

julia> log_weights = loo(log_like).psis_result.log_weights;

julia> loo_pit(y, y_pred, log_weights)
┌ 8-element DimArray{Float64, 1} ┐
├────────────────────────────────┴─────────────────────────────── dims ┐
  ↓ school Categorical{String} ["Choate", …, "Mt. Hermon"] Unordered
└──────────────────────────────────────────────────────────────────────┘
 "Choate"            0.942759
 "Deerfield"         0.641057
 "Phillips Andover"  0.32729
 "Phillips Exeter"   0.581451
 "Hotchkiss"         0.288523
 "Lawrenceville"     0.393741
 "St. Paul's"        0.886175
 "Mt. Hermon"        0.638821

Calculate LOO-PIT values using as test quantity the square of the difference between each observation and mu.

julia> using Statistics

julia> mu = idata.posterior.mu;

julia> T = y .- median(mu);

julia> T_pred = y_pred .- mu;

julia> loo_pit(T .^ 2, T_pred .^ 2, log_weights)
┌ 8-element DimArray{Float64, 1} ┐
├────────────────────────────────┴─────────────────────────────── dims ┐
  ↓ school Categorical{String} ["Choate", …, "Mt. Hermon"] Unordered
└──────────────────────────────────────────────────────────────────────┘
 "Choate"            0.868148
 "Deerfield"         0.27421
 "Phillips Andover"  0.321719
 "Phillips Exeter"   0.193169
 "Hotchkiss"         0.370422
 "Lawrenceville"     0.195601
 "St. Paul's"        0.817408
 "Mt. Hermon"        0.326795

References

  • Gabry2019 Gabry et al. J. R. Stat. Soc. Ser. A Stat. Soc. 182 (2019).
source
loo_pit(idata::InferenceData, log_weights; kwargs...) -> DimArray

Compute LOO-PIT values using existing normalized log LOO importance weights.

Keywords

  • y_name: Name of observed data variable in idata.observed_data. If not provided, then the only observed data variable is used.
  • y_pred_name: Name of posterior predictive variable in idata.posterior_predictive. If not provided, then y_name is used.
  • kwargs: Remaining keywords are forwarded to the base method PosteriorStats.loo_pit.

See PosteriorStats.loo_pit for more details.

Examples

Calculate LOO-PIT values using already computed log weights.

julia> using ArviZExampleData, PosteriorStats

julia> idata = load_example_data("centered_eight");

julia> loo_result = loo(idata; var_name=:obs);

julia> loo_pit(idata, loo_result.psis_result.log_weights; y_name=:obs)
┌ 8-element DimArray{Float64, 1} loo_pit_obs ┐
├────────────────────────────────────────────┴─────────────────── dims ┐
  ↓ school Categorical{String} ["Choate", …, "Mt. Hermon"] Unordered
└──────────────────────────────────────────────────────────────────────┘
 "Choate"            0.942759
 "Deerfield"         0.641057
 "Phillips Andover"  0.32729
 "Phillips Exeter"   0.581451
 "Hotchkiss"         0.288523
 "Lawrenceville"     0.393741
 "St. Paul's"        0.886175
 "Mt. Hermon"        0.638821
source
loo_pit(idata::InferenceData; kwargs...) -> DimArray

Compute LOO-PIT from groups in idata using PSIS-LOO.

Keywords

  • y_name: Name of observed data variable in idata.observed_data. If not provided, then the only observed data variable is used.
  • y_pred_name: Name of posterior predictive variable in idata.posterior_predictive. If not provided, then y_name is used.
  • log_likelihood_name: Name of log-likelihood variable in idata.log_likelihood. If not provided, then y_name is used if idata has a log_likelihood group, otherwise the only variable is used.
  • reff::Union{Real,AbstractArray{<:Real}}: The relative effective sample size(s) of the likelihood values. If an array, it must have the same data dimensions as the corresponding log-likelihood variable. If not provided, then this is estimated using ess.
  • kwargs: Remaining keywords are forwarded to PosteriorStats.loo_pit.

See PosteriorStats.loo_pit for more details.

Examples

Calculate LOO-PIT values using as test quantity the observed values themselves.

julia> using ArviZExampleData, PosteriorStats

julia> idata = load_example_data("centered_eight");

julia> loo_pit(idata; y_name=:obs)
┌ 8-element DimArray{Float64, 1} loo_pit_obs ┐
├────────────────────────────────────────────┴─────────────────── dims ┐
  ↓ school Categorical{String} ["Choate", …, "Mt. Hermon"] Unordered
└──────────────────────────────────────────────────────────────────────┘
 "Choate"            0.942759
 "Deerfield"         0.641057
 "Phillips Andover"  0.32729
 "Phillips Exeter"   0.581451
 "Hotchkiss"         0.288523
 "Lawrenceville"     0.393741
 "St. Paul's"        0.886175
 "Mt. Hermon"        0.638821
source
PosteriorStats.r2_score (Function)
r2_score(y_true::AbstractVector, y_pred::AbstractArray) -> (; r2, r2_std)

$R²$ for linear Bayesian regression models Gelman2019.

Arguments

  • y_true: Observed data of length noutputs
  • y_pred: Predicted data with size (ndraws[, nchains], noutputs)
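
For intuition, a sketch of the Bayesian $R²$ of Gelman2019 for a single posterior predictive draw; the data here are hypothetical, and r2_score aggregates such per-draw values into the reported mean (r2) and standard deviation (r2_std):

using Statistics
y_true = randn(100)
y_pred_draw = y_true .+ 0.5 .* randn(100)   # one draw of predictions
r2_draw = var(y_pred_draw) / (var(y_pred_draw) + var(y_true .- y_pred_draw))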

Examples

julia> using ArviZExampleData

julia> idata = load_example_data("regression1d");

julia> y_true = idata.observed_data.y;

julia> y_pred = PermutedDimsArray(idata.posterior_predictive.y, (:draw, :chain, :y_dim_0));

julia> r2_score(y_true, y_pred) |> pairs
pairs(::NamedTuple) with 2 entries:
  :r2     => 0.683197
  :r2_std => 0.0368838

References

  • Gelman2019 Gelman et al, The Am. Stat., 73(3) (2019)
source
r2_score(idata::InferenceData; y_name, y_pred_name) -> (; r2, r2_std)

Compute $R²$ from idata, automatically formatting the predictions to the correct shape.

Keywords

  • y_name: Name of observed data variable in idata.observed_data. If not provided, then the only observed data variable is used.
  • y_pred_name: Name of posterior predictive variable in idata.posterior_predictive. If not provided, then y_name is used.

See PosteriorStats.r2_score for more details.

Examples

julia> using ArviZExampleData, PosteriorStats

julia> idata = load_example_data("regression10d");

julia> r2_score(idata) |> pairs
pairs(::NamedTuple) with 2 entries:
  :r2     => 0.998385
  :r2_std => 0.000100621
source

Utilities

PosteriorStats.kde_reflected (Function)
kde_reflected(data::AbstractVector{<:Real}; bounds=extrema(data), kwargs...)

Compute the boundary-corrected kernel density estimate (KDE) of data using reflection.

For $x \in (l, u)$, the reflected KDE has the density

\[\hat{f}_R(x) = \hat{f}(x) + \hat{f}(2l - x) + \hat{f}(2u - x),\]

where $\hat{f}$ is the usual KDE of data. This is equivalent to augmenting the original data with 2 additional copies of the data reflected around each bound, computing the usual KDE, trimming the KDE to the bounds, and renormalizing.
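
A minimal sketch of that reflection construction, assuming KernelDensity.jl's kde and data bounded by its own extrema (kde_reflected itself also handles bandwidth selection and exact renormalization):

using KernelDensity
data = abs.(randn(1_000))                      # nonnegative data
l, u = extrema(data)
k = kde(vcat(data, 2l .- data, 2u .- data))    # KDE of the augmented data
keep = l .<= k.x .<= u
x_kept, density_kept = k.x[keep], 3 .* k.density[keep]   # trim to the bounds and renormalize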

Any non-finite bounds are ignored. Remaining kwargs are passed to KernelDensity.kde. The default bandwidth is estimated using the Improved Sheather-Jones (ISJ) method Botev2010.

References

  • Botev2010 Botev et al. Ann. Stat., 38: 5 (2010)
source
PosteriorStats.pointwise_loglikelihoods (Function)
pointwise_loglikelihoods

Compute pointwise conditional log-likelihoods for ELPD-based model comparison/validation.

Given model parameters $\theta$ and observations $y$, the pointwise conditional log-likelihood of $y_i$ given $y_{-i}$ (the elements of $y$ excluding $y_i$) and $\theta$ is defined as

\[\log p(y_i \mid y_{-i}, \theta)\]

This method is a utility function that dependent packages can override to provide pointwise conditional log-likelihoods for their own models/distributions.
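
For conditionally independent observations this reduces to the ordinary per-observation log density. A hypothetical sketch for a simple normal model, building the (draws, chains, observations) log-likelihood array that loo and waic consume:

using Distributions
y = randn(8)                       # observations
mu = randn(1_000, 4)               # posterior draws of the mean, (draws, chains)
log_like = [logpdf(Normal(mu[d, c], 1.0), y[i])
            for d in axes(mu, 1), c in axes(mu, 2), i in eachindex(y)]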

See also: loo

source