Stats

General statistics

ArviZ.ArviZStats.hdi — Function
Note

This function is forwarded to Python's arviz.hdi. The docstring of that function is included below.

    Calculate highest density interval (HDI) of array for given probability.

    The HDI is the minimum width Bayesian credible interval (BCI).

    Parameters
    ----------
    ary: obj
        object containing posterior samples.
        Any object that can be converted to an :class:`arviz.InferenceData` object.
        Refer to documentation of :func:`arviz.convert_to_dataset` for details.
    hdi_prob: float, optional
        Prob for which the highest density interval will be computed. Defaults to
        ``stats.hdi_prob`` rcParam.
    circular: bool, optional
        Whether to compute the hdi taking into account `x` is a circular variable
(in the range [-np.pi, np.pi]) or not. Defaults to False (i.e. non-circular variables).
        Only works if multimodal is False.
    multimodal: bool, optional
        If true it may compute more than one hdi if the distribution is multimodal and the
        modes are well separated.
    skipna: bool, optional
        If true ignores nan values when computing the hdi. Defaults to false.
    group: str, optional
        Specifies which InferenceData group should be used to calculate hdi.
        Defaults to 'posterior'
    var_names: list, optional
        Names of variables to include in the hdi report. Prefix the variables by ``~``
        when you want to exclude them from the report: `["~beta"]` instead of `["beta"]`
        (see :func:`arviz.summary` for more details).
    filter_vars: {None, "like", "regex"}, optional, default=None
        If `None` (default), interpret var_names as the real variables names. If "like",
        interpret var_names as substrings of the real variables names. If "regex",
        interpret var_names as regular expressions on the real variables names. A la
        ``pandas.filter``.
    coords: mapping, optional
        Specifies the subset over which to calculate hdi.
    max_modes: int, optional
        Specifies the maximum number of modes for multimodal case.
    dask_kwargs : dict, optional
        Dask related kwargs passed to :func:`~arviz.wrap_xarray_ufunc`.
    kwargs: dict, optional
        Additional keywords passed to :func:`~arviz.wrap_xarray_ufunc`.

    Returns
    -------
    np.ndarray or xarray.Dataset, depending upon input
        lower(s) and upper(s) values of the interval(s).

    See Also
    --------
    plot_hdi : Plot highest density intervals for regression data.
    xarray.Dataset.quantile : Calculate quantiles of array for given probabilities.

    Examples
    --------
    Calculate the HDI of a Normal random variable:

    .. ipython::

        In [1]: import arviz as az
           ...: import numpy as np
           ...: data = np.random.normal(size=2000)
           ...: az.hdi(data, hdi_prob=.68)

    Calculate the HDI of a dataset:

    .. ipython::

        In [1]: import arviz as az
           ...: data = az.load_arviz_data('centered_eight')
           ...: az.hdi(data)

    We can also calculate the HDI of some of the variables of dataset:

    .. ipython::

        In [1]: az.hdi(data, var_names=["mu", "theta"])

    By default, ``hdi`` is calculated over the ``chain`` and ``draw`` dimensions. We can use the
    ``input_core_dims`` argument of :func:`~arviz.wrap_xarray_ufunc` to change this. In this example
    we calculate the HDI also over the ``school`` dimension:

    .. ipython::

        In [1]: az.hdi(data, var_names="theta", input_core_dims = [["chain","draw", "school"]])

    We can also calculate the hdi over a particular selection:

    .. ipython::

        In [1]: az.hdi(data, coords={"chain":[0, 1, 3]}, input_core_dims = [["draw"]])

    
source
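
For reference, a minimal Julia sketch of the same call through this wrapper (arguments and the hdi_prob keyword are forwarded to Python as above; randn supplies toy draws):

using ArviZ

data = randn(2000)          # draws from a standard normal
hdi(data; hdi_prob = 0.68)  # 68% HDI, expected close to (-1, 1)
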
ArviZ.ArviZStats.summary — Function
summary(
    data; group = :posterior, coords, dims, kwargs...,
) -> Union{Dataset,DataFrames.DataFrame}

Compute summary statistics on any object that can be passed to convert_to_dataset.

Keywords

  • coords: Map from named dimension to named indices.
  • dims: Map from variable name to names of its dimensions.
  • kwargs: Keyword arguments passed to summarystats.
source
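
A minimal sketch of this function on a NamedTuple of posterior draws; the qualified ArviZ.summary call, the (chains, draws, ...) shape convention, and the dims format shown are assumptions about what convert_to_dataset accepts:

using ArviZ

# toy posterior: mu has shape (chains, draws); theta has an extra school dimension
posterior = (mu = randn(4, 500), theta = randn(4, 500, 8))
ArviZ.summary(posterior; dims = Dict(:theta => [:school]))  # assumed dims format: variable => dimension names
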
StatsBase.summarystats — Function
summarystats(
    data::InferenceData;
    group = :posterior,
    kwargs...,
) -> Union{Dataset,DataFrames.DataFrame}
summarystats(data::Dataset; kwargs...) -> Union{Dataset,DataFrames.DataFrame}

Compute summary statistics on data.

Arguments

  • data::Union{Dataset,InferenceData}: The data on which to compute summary statistics. If data is an InferenceData, only the dataset corresponding to group is used.

Keywords

  • var_names: Collection of names of variables as Symbols to include in the summary.
  • include_circ::Bool=false: Whether to include circular statistics
  • digits::Int: Number of decimals used to round results. If not provided, numbers are not rounded.
  • stat_funcs::Union{Dict{String,Function},Vector{Function}}=nothing: A vector of functions or a dict of functions with function names as keys used to calculate statistics. By default, the mean, standard deviation, simulation standard error, and highest posterior density intervals are included. Each function is given one argument, the samples for a variable as an array, and should return a single number. For example, Statistics.mean or Statistics.var would both work.
  • extend::Bool=true: If true, use the statistics returned by stat_funcs in addition to, rather than in place of, the default statistics. This is only meaningful when stat_funcs is not nothing.
  • hdi_prob::Real=0.94: Probability for which the highest density interval is computed. This is only meaningful when stat_funcs is nothing.
  • skipna::Bool=false: If true, ignores NaN values when computing the summary statistics. It does not affect the behaviour of the functions passed to stat_funcs.

Returns

  • DataFrames.DataFrame: Summary statistics for each variable. Default statistics are:
    • mean
    • sd
    • hdi_3%
    • hdi_97%
    • mcse_mean
    • mcse_sd
    • ess_bulk
    • ess_tail
    • r_hat (only computed for traces with 2 or more chains)

Examples

using ArviZ
idata = load_example_data("centered_eight")
summarystats(idata; var_names=(:mu, :tau))

Other statistics can be calculated by passing a list of functions or a dictionary with key, function pairs:

using Statistics
function median_sd(x)
    med = median(x)
    sd = sqrt(mean((x .- med).^2))
    return sd
end

func_dict = Dict(
    "std" => x -> std(x; corrected = false),
    "median_std" => median_sd,
    "5%" => x -> quantile(x, 0.05),
    "median" => median,
    "95%" => x -> quantile(x, 0.95),
)

summarystats(idata; var_names = (:mu, :tau), stat_funcs = func_dict, extend = false)
source
ArviZ.ArviZStats.r2_score — Function

R² for Bayesian regression models. Only valid for linear models.

Note

This function is forwarded to Python's arviz.r2_score. The docstring of that function is included below.


    Parameters
    ----------
    y_true: array-like of shape = (n_outputs,)
        Ground truth (correct) target values.
    y_pred: array-like of shape = (n_posterior_samples, n_outputs)
        Estimated target values.

    Returns
    -------
    Pandas Series with the following indices:
    r2: Bayesian R²
    r2_std: standard deviation of the Bayesian R².

    See Also
    --------
    plot_lm : Posterior predictive and mean plots for regression-like data.

    Examples
    --------
    Calculate R² for Bayesian regression models :

    .. ipython::

        In [1]: import arviz as az
           ...: data = az.load_arviz_data('regression1d')
           ...: y_true = data.observed_data["y"].values
           ...: y_pred = data.posterior_predictive.stack(sample=("chain", "draw"))["y"].values.T
           ...: az.r2_score(y_true, y_pred)

    
source
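
A minimal Julia sketch with synthetic data, using the same (y_true, y_pred) shapes as the Python signature above:

using ArviZ

y_true = randn(100)                        # ground truth, shape (n_outputs,)
y_pred = y_true' .+ 0.1 .* randn(500, 100) # shape (n_posterior_samples, n_outputs)
r2_score(y_true, y_pred)                   # returns r2 and r2_std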

Pareto-smoothed importance sampling

PSIS.PSISResult — Type
PSISResult

Result of Pareto-smoothed importance sampling (PSIS) using psis.

Properties

  • log_weights: un-normalized Pareto-smoothed log weights
  • weights: normalized Pareto-smoothed weights (allocates a copy)
  • pareto_shape: Pareto $k=ξ$ shape parameter
  • nparams: number of parameters in log_weights
  • ndraws: number of draws in log_weights
  • nchains: number of chains in log_weights
  • reff: the ratio of the effective sample size of the unsmoothed importance ratios and the actual sample size.
  • ess: estimated effective sample size of estimate of mean using smoothed importance samples (see ess_is)
  • tail_length: length of the upper tail of log_weights that was smoothed
  • tail_dist: the generalized Pareto distribution that was fit to the tail of log_weights. Note that the tail weights are scaled to have a maximum of 1, so tail_dist * exp(maximum(log_ratios)) is the corresponding fit directly to the tail of log_ratios.
  • normalized::Bool: indicates whether log_weights are log-normalized along the sample dimensions.

Diagnostic

The pareto_shape parameter $k=ξ$ of the generalized Pareto distribution tail_dist can be used to diagnose reliability and convergence of estimates using the importance weights [VehtariSimpson2021].

  • if $k < \frac{1}{3}$, importance sampling is stable, and importance sampling (IS) and PSIS both are reliable.
  • if $k ≤ \frac{1}{2}$, then the importance ratio distribution has finite variance, and the central limit theorem holds. As $k$ approaches the upper bound, IS becomes less reliable, while PSIS still works well but with a higher RMSE.
  • if $\frac{1}{2} < k ≤ 0.7$, then the variance is infinite, and IS can behave quite poorly. However, PSIS works well in this regime.
  • if $0.7 < k ≤ 1$, then it quickly becomes impractical to collect enough importance weights to reliably compute estimates, and importance sampling is not recommended.
  • if $k > 1$, then neither the variance nor the mean of the raw importance ratios exists. The convergence rate is close to zero, and bias can be large with practical sample sizes.

See PSISPlots.paretoshapeplot for a diagnostic plot.

PSIS.psis — Function
psis(log_ratios, reff = 1.0; kwargs...) -> PSISResult
psis!(log_ratios, reff = 1.0; kwargs...) -> PSISResult

Compute Pareto smoothed importance sampling (PSIS) log weights [VehtariSimpson2021].

While psis computes smoothed log weights out-of-place, psis! smooths them in-place.

Arguments

  • log_ratios: an array of logarithms of importance ratios, with size (draws, [chains, [parameters...]]), where chains>1 would be used when chains are generated using Markov chain Monte Carlo.
  • reff::Union{Real,AbstractArray}: the ratio(s) of effective sample size of log_ratios and the actual sample size reff = ess/(draws * chains), used to account for autocorrelation, e.g. due to Markov chain Monte Carlo. If an array, it must have the size (parameters...,) to match log_ratios.

Keywords

  • warn=true: If true, warning messages are delivered
  • normalize=true: If true, the log-weights will be log-normalized so that exp.(log_weights) sums to 1 along the sample dimensions.

Returns

  • result: a PSISResult object containing the results of the Pareto-smoothing.

A warning is raised if the Pareto shape parameter $k ≥ 0.7$. See PSISResult for details and PSISPlots.paretoshapeplot for a diagnostic plot.
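
A minimal sketch on toy log-ratios, reading off the diagnostics described under PSISResult (standard-normal log-ratios are well behaved, so a small pareto_shape is expected):

using PSIS

log_ratios = randn(1000, 4)  # toy importance log-ratios with size (draws, chains)
result = psis(log_ratios)

# apply the k < 0.7 rule of thumb from the diagnostic table above
result.pareto_shape < 0.7 || @warn "PSIS estimates may be unreliable"
result.ess  # estimated effective sample size of the smoothed weights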

PSIS.psis! — Function
psis!(log_ratios, reff = 1.0; kwargs...) -> PSISResult

In-place version of psis: the log weights in log_ratios are smoothed in-place, and a PSISResult is returned. Arguments, keywords, the returned value, and the $k ≥ 0.7$ warning are identical to psis above.

Model assessment and selection

LOO and WAIC

ArviZ.ArviZStats.AbstractELPDResult — Type
abstract type AbstractELPDResult

An abstract type representing the result of an ELPD computation.

Every subtype stores estimates of both the expected log predictive density (elpd) and the effective number of parameters p, as well as standard errors and pointwise estimates of each, from which other relevant estimates can be computed.

Subtypes implement the following functions:

  • elpd_estimates
  • information_criterion

source
ArviZ.ArviZStats.PSISLOOResult — Type

Results of Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO).

See also: loo, AbstractELPDResult

  • estimates: Estimates of the expected log pointwise predictive density (ELPD) and effective number of parameters (p)

  • pointwise: Pointwise estimates

  • psis_result: Pareto-smoothed importance sampling (PSIS) results

source
ArviZ.ArviZStats.WAICResult — Type

Results of computing the widely applicable information criterion (WAIC).

See also: waic, AbstractELPDResult

  • estimates: Estimates of the expected log pointwise predictive density (ELPD) and effective number of parameters (p)

  • pointwise: Pointwise estimates

source
ArviZ.ArviZStats.information_criterion — Function
information_criterion(elpd, scale::Symbol)

Compute the information criterion for the given scale from the elpd estimate.

scale must be one of (:deviance, :log, :negative_log).

See also: effective_number_of_parameters, loo, waic

source
information_criterion(result::AbstractELPDResult, scale::Symbol; pointwise=false)

Compute information criterion for the given scale from the existing ELPD result.

scale must be one of (:deviance, :log, :negative_log).

If pointwise=true, then pointwise estimates are returned.

source
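
A minimal sketch converting an existing ELPD result between scales (:log is the elpd itself, :negative_log is -1 * elpd, and :deviance is -2 * elpd):

using ArviZ

idata = load_example_data("centered_eight")
result = loo(idata)                                        # PSIS-LOO ELPD result
information_criterion(result, :deviance)                   # -2 * the elpd estimate
information_criterion(result, :deviance; pointwise = true) # per-observation values
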
ArviZ.ArviZStats.loo — Function
loo(log_likelihood; reff=nothing, kwargs...) -> PSISLOOResult{<:NamedTuple,<:NamedTuple}

Compute the Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO). [Vehtari2017][LOOFAQ]

log_likelihood must be an array of log-likelihood values with shape (chains, draws[, params...]).

Keywords

  • reff::Union{Real,AbstractArray{<:Real}}: The relative effective sample size(s) of the likelihood values. If an array, it must have the same data dimensions as the corresponding log-likelihood variable. If not provided, then this is estimated using ess.
  • kwargs: Remaining keywords are forwarded to psis.

See also: PSISLOOResult, waic

source
loo(data::Dataset; [var_name::Symbol,] kwargs...) -> PSISLOOResult{<:NamedTuple,<:Dataset}
loo(data::InferenceData; [var_name::Symbol,] kwargs...) -> PSISLOOResult{<:NamedTuple,<:Dataset}

Compute PSIS-LOO from log-likelihood values in data.

If more than one log-likelihood variable is present, then var_name must be provided.

source
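
A minimal sketch on a toy log-likelihood array with the (chains, draws[, params...]) shape described above:

using ArviZ

log_likelihood = randn(4, 1000, 8)  # toy log-likelihoods: 4 chains, 1000 draws, 8 observations
loo(log_likelihood)                 # reff is estimated via ess when not provided
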
ArviZ.ArviZStats.waic — Function
waic(log_likelihood::AbstractArray) -> WAICResult{<:NamedTuple,<:NamedTuple}

Compute the widely applicable information criterion (WAIC). [Watanabe2010][Vehtari2017][LOOFAQ]

log_likelihood must be an array of log-likelihood values with shape (chains, draws[, params...]).

See also: WAICResult, loo

source
waic(data::Dataset; [var_name::Symbol]) -> WAICResult{<:NamedTuple,<:Dataset}
waic(data::InferenceData; [var_name::Symbol]) -> WAICResult{<:NamedTuple,<:Dataset}

Compute WAIC from log-likelihood values in data.

If more than one log-likelihood variable is present, then var_name must be provided.

source
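
A minimal sketch of the InferenceData method; centered_eight contains a single log-likelihood variable, so var_name can be omitted:

using ArviZ

idata = load_example_data("centered_eight")
waic(idata)  # WAIC estimates of the ELPD and effective number of parameters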

Others

ArviZ.ArviZStats.compare — Function

Compare models based on their expected log pointwise predictive density (ELPD).

Note

This function is forwarded to Python's arviz.compare. The docstring of that function is included below.


    The ELPD is estimated either by Pareto smoothed importance sampling leave-one-out
    cross-validation (LOO) or using the widely applicable information criterion (WAIC).
    We recommend loo. Read more theory in this paper by some of the leading
    authorities on model comparison: https://dx.doi.org/10.1111/1467-9868.00353

    Parameters
    ----------
    compare_dict: dict of {str: InferenceData or ELPDData}
        A dictionary of model names and :class:`arviz.InferenceData` or ``ELPDData``.
    ic: str, optional
        Method to estimate the ELPD, available options are "loo" or "waic". Defaults to
        ``rcParams["stats.information_criterion"]``.
    method: str, optional
        Method used to estimate the weights for each model. Available options are:

        - 'stacking' : stacking of predictive distributions.
        - 'BB-pseudo-BMA' : pseudo-Bayesian Model averaging using Akaike-type
          weighting. The weights are stabilized using the Bayesian bootstrap.
        - 'pseudo-BMA': pseudo-Bayesian Model averaging using Akaike-type
          weighting, without Bootstrap stabilization (not recommended).

        For more information read https://arxiv.org/abs/1704.02030
    b_samples: int, optional, default = 1000
        Number of samples taken by the Bayesian bootstrap estimation.
        Only useful when method = 'BB-pseudo-BMA'.
    alpha: float, optional
        The shape parameter in the Dirichlet distribution used for the Bayesian bootstrap. Only
        useful when method = 'BB-pseudo-BMA'. When alpha=1 (default), the distribution is uniform
        on the simplex. A smaller alpha keeps the final weights further from 0 and 1.
    seed: int or np.random.RandomState instance, optional
        If int or RandomState, use it to seed the Bayesian bootstrap. Only
        useful when method = 'BB-pseudo-BMA'. If None (default), the global
        :mod:`numpy.random` state is used.
    scale: str, optional
        Output scale for IC. Available options are:

        - `log` : (default) log-score (after Vehtari et al. (2017))
        - `negative_log` : -1 * (log-score)
        - `deviance` : -2 * (log-score)

        A higher log-score (or a lower deviance) indicates a model with better predictive
        accuracy.
    var_name: str, optional
        If there is more than one observed variable in the ``InferenceData``, the name
        of the variable that should be used as the basis for comparison.

    Returns
    -------
    A DataFrame, ordered from best to worst model (measured by the ELPD).
    The index reflects the key with which the models are passed to this function. The columns are:
    rank: The rank-order of the models. 0 is the best.
    elpd: ELPD estimated using either PSIS-LOO-CV (`elpd_loo`) or WAIC (`elpd_waic`).
        Higher ELPD indicates higher out-of-sample predictive fit ("better" model).
        If `scale` is `deviance` or `negative_log`, smaller values indicate
        higher out-of-sample predictive fit ("better" model).
    pIC: Estimated effective number of parameters.
    elpd_diff: The difference in ELPD between two models.
        If more than two models are compared, the difference is computed relative to the
        top-ranked model, which always has an elpd_diff of 0.
    weight: Relative weight for each model.
        This can be loosely interpreted as the probability of each model (among the compared models)
        given the data. By default the uncertainty in the weights estimation is considered using
        Bayesian bootstrap.
    SE: Standard error of the ELPD estimate.
        If method = BB-pseudo-BMA these values are estimated using Bayesian bootstrap.
    dSE: Standard error of the difference in ELPD between each model and the top-ranked model.
        It's always 0 for the top-ranked model.
    warning: A value of 1 indicates that the computation of the ELPD may not be reliable.
        This could be an indication of WAIC/LOO starting to fail; see
        https://arxiv.org/abs/1507.04544 for details.
    scale: Scale used for the ELPD.

    Examples
    --------
    Compare the centered and non-centered models of the eight schools problem:

    .. ipython::

        In [1]: import arviz as az
           ...: data1 = az.load_arviz_data("non_centered_eight")
           ...: data2 = az.load_arviz_data("centered_eight")
           ...: compare_dict = {"non centered": data1, "centered": data2}
           ...: az.compare(compare_dict)

    Compare the models using PSIS-LOO-CV, returning the ELPD in log scale and calculating the
    weights using the stacking method.

    .. ipython::

        In [1]: az.compare(compare_dict, ic="loo", method="stacking", scale="log")

    See Also
    --------
    loo :
        Compute the ELPD using the Pareto smoothed importance sampling Leave-one-out
        cross-validation method.
    waic : Compute the ELPD using the widely applicable information criterion.
    plot_compare : Summary plot for model comparison.

    References
    ----------
    .. [1] Vehtari, A., Gelman, A. & Gabry, J. Practical Bayesian model evaluation using
        leave-one-out cross-validation and WAIC. Stat Comput 27, 1413–1432 (2017)
        see https://doi.org/10.1007/s11222-016-9696-4

    
source
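
The same comparison as the first Python example above, sketched through the Julia wrapper (the Dict of models and any keywords are forwarded to arviz.compare):

using ArviZ

data1 = load_example_data("non_centered_eight")
data2 = load_example_data("centered_eight")
compare(Dict("non centered" => data1, "centered" => data2))
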
ArviZ.ArviZStats.loo_pit — Function

Compute leave-one-out (PSIS-LOO) probability integral transform (PIT) values.

Note

This function is forwarded to Python's arviz.loo_pit. The docstring of that function is included below.


    Parameters
    ----------
    idata: InferenceData
        :class:`arviz.InferenceData` object.
    y: array, DataArray or str
        Observed data. If str, ``idata`` must be present and contain the observed data group
    y_hat: array, DataArray or str
        Posterior predictive samples for ``y``. It must have the same shape as y plus an
        extra dimension at the end of size n_samples (chains and draws stacked). If str or
        None, ``idata`` must contain the posterior predictive group. If None, y_hat is taken
        to be equal to y, in which case y must be str too.
    log_weights: array or DataArray
        Smoothed log_weights. It must have the same shape as ``y_hat``
    dask_kwargs : dict, optional
        Dask related kwargs passed to :func:`~arviz.wrap_xarray_ufunc`.

    Returns
    -------
    loo_pit: array or DataArray
        Value of the LOO-PIT at each observed data point.

    See Also
    --------
    plot_loo_pit : Plot Leave-One-Out probability integral transformation (PIT) predictive checks.
    loo : Compute Pareto-smoothed importance sampling leave-one-out
          cross-validation (PSIS-LOO-CV).
    plot_elpd : Plot pointwise elpd differences between two or more models.
    plot_khat : Plot Pareto tail indices for diagnosing convergence.

    Examples
    --------
    Calculate LOO-PIT values using the observed values themselves as the test quantity.

    .. ipython::

        In [1]: import arviz as az
           ...: data = az.load_arviz_data("centered_eight")
           ...: az.loo_pit(idata=data, y="obs")

    Calculate LOO-PIT values using the square of the difference between
    each observation and `mu` as the test quantity. Both ``y`` and ``y_hat`` inputs
    will be array-like, but ``idata`` is still passed in order to calculate the
    ``log_weights`` from it.

    .. ipython::

        In [1]: T = data.observed_data.obs - data.posterior.mu.median(dim=("chain", "draw"))
           ...: T_hat = data.posterior_predictive.obs - data.posterior.mu
           ...: T_hat = T_hat.stack(__sample__=("chain", "draw"))
           ...: az.loo_pit(idata=data, y=T**2, y_hat=T_hat**2)

    
source
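
A Julia sketch mirroring the first example above; passing idata positionally with y as a keyword is an assumption based on the forwarding note:

using ArviZ

idata = load_example_data("centered_eight")
loo_pit(idata; y = "obs")  # assumed calling convention for the forwarded function
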
  • VehtariSimpson2021: Vehtari A, Simpson D, Gelman A, Yao Y, Gabry J. (2021). Pareto smoothed importance sampling. arXiv:1507.02646v7 [stat.CO]
  • Vehtari2017: Vehtari A, Gelman A, Gabry J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Stat Comput 27, 1413–1432. doi: 10.1007/s11222-016-9696-4. arXiv: 1507.04544
  • LOOFAQ: Aki Vehtari. Cross-validation FAQ. https://mc-stan.org/loo/articles/online-only/faq.html
  • Watanabe2010: Watanabe S. (2010). Asymptotic Equivalence of Bayes Cross Validation and Widely Applicable Information Criterion in Singular Learning Theory. Journal of Machine Learning Research 11(116):3571−3594. https://jmlr.csail.mit.edu/papers/v11/watanabe10a.html