Stats
PSIS.PSISResult
PosteriorStats.AbstractELPDResult
PosteriorStats.AbstractModelWeightsMethod
PosteriorStats.BootstrappedPseudoBMA
PosteriorStats.ModelComparisonResult
PosteriorStats.PSISLOOResult
PosteriorStats.PseudoBMA
PosteriorStats.Stacking
PosteriorStats.SummaryStats
PosteriorStats.WAICResult
PSIS.PSISPlots.paretoshapeplot
PSIS.ess_is
PSIS.psis
PSIS.psis!
PosteriorStats.compare
PosteriorStats.default_summary_stats
PosteriorStats.elpd_estimates
PosteriorStats.eti
PosteriorStats.eti!
PosteriorStats.hdi
PosteriorStats.hdi!
PosteriorStats.information_criterion
PosteriorStats.kde_reflected
PosteriorStats.loo
PosteriorStats.loo_pit
PosteriorStats.model_weights
PosteriorStats.pointwise_loglikelihoods
PosteriorStats.r2_score
PosteriorStats.summarize
PosteriorStats.waic
StatsBase.summarystats
Summary statistics
PosteriorStats.SummaryStats
— Typestruct SummaryStats
A container for a column table of values computed by summarize
.
This object implements the Tables and TableTraits column table interfaces. It has a custom show
method.
SummaryStats
behaves like an OrderedDict
of columns, where the columns can be accessed using either Symbol
s or a 1-based integer index. However, this interface is not part of the public API and may change in the future. We recommend using it only interactively.
Constructor
SummaryStats(data; name="SummaryStats"[, labels])
Construct a SummaryStats
from tabular data
.
data
must implement the Tables interface. If it contains a column label
, this will be used for the row labels or will be replaced with the labels
if provided.
Keywords
name::AbstractString
: The name of the collection of summary statistics, used as the table title in display.labels::AbstractVector
: The names of the parameters indata
, used as row labels in display. If not provided, then the columnlabel
indata
will be used if it exists. Otherwise, the parameter names will be numeric indices.
PosteriorStats.default_summary_stats
— Functiondefault_summary_stats(kind::Symbol=:all; kwargs...)
Return a collection of stats functions based on the named preset kind
.
These functions are then passed to summarize
.
Arguments
kind::Symbol
: The named collection of summary statistics to be computed::all
: Everything in:stats
and:diagnostics
:stats
:mean
,std
,<ci>
:diagnostics
:ess_tail
,ess_bulk
,rhat
,mcse_mean
,mcse_std
:all_median
: Everything in:stats_median
and:diagnostics_median
:stats_median
:median
,mad
,<ci>
:diagnostics_median
:ess_median
,ess_tail
,rhat
,mcse_median
Keywords
PosteriorStats.summarize
— Functionsummarize(data; kind=:all,kwargs...) -> SummaryStats
summarize(data, stats_funs...; kwargs...) -> SummaryStats
Compute summary statistics on each param in data
.
Arguments
data
: a 3D array of real samples with shape(draws, chains, params)
or another object for which asummarize
method is defined.stats_funs
: a collection of functions that reduces a matrix with shape(draws, chains)
to a scalar or a collection of scalars. Alternatively, an item instats_funs
may be aPair
of the formname => fun
specifying the name to be used for the statistic or of the form(name1, ...) => fun
when the function returns a collection. When the function returns a collection, the names in this latter format must be provided.
Keywords
var_names
: a collection specifying the names of the parameters indata
. If not provided, the names the indices of the parameter dimension indata
.name::String
: the name of the summary statistics, used as the table title in display.kind::Symbol
: The named collection of summary statistics to be computed::all
: Everything in:stats
and:diagnostics
:stats
:mean
,std
,<ci>
:diagnostics
:ess_tail
,ess_bulk
,rhat
,mcse_mean
,mcse_std
:all_median
: Everything in:stats_median
and:diagnostics_median
:stats_median
:median
,mad
,<ci>
:diagnostics_median
:ess_median
,ess_tail
,rhat
,mcse_median
kwargs
: additional keyword arguments passed todefault_summary_stats
, including:
See also SummaryStats
, default_summary_stats
Extended Help
Examples
Compute all summary statistics (the default):
Display precision
When an estimator and its MCSE are both computed, the MCSE is used to determine the number of significant digits that will be displayed.
julia> using Statistics, StatsBase
julia> x = randn(1000, 4, 3) .+ reshape(0:10:20, 1, 1, :);
julia> summarize(x)
SummaryStats
mean std eti94 ess_tail ess_bulk rhat mcse_mean mcse_std
1 0.0003 0.99 -1.83 .. 1.89 3567 3663 1.00 0.016 0.012
2 10.02 0.99 8.17 .. 11.9 3841 3906 1.00 0.016 0.011
3 19.98 0.99 18.1 .. 21.9 3892 3749 1.00 0.016 0.012
Compute just the default statistics with an 89% HDI, and provide the parameter names:
julia> var_names=[:x, :y, :z];
julia> summarize(x; var_names, kind=:stats, ci_fun=hdi, ci_prob=0.89)
SummaryStats
mean std hdi89
x 0.000275 0.989 -1.63 .. 1.52
y 10.0 0.988 8.53 .. 11.6
z 20.0 0.988 18.5 .. 21.6
Compute Statistics.mean
, Statistics.std
and the Monte Carlo standard error (MCSE) of the mean estimate:
julia> summarize(x, mean, std, :mcse_mean => sem; name="Mean/Std")
Mean/Std
mean std mcse_mean
1 0.0003 0.989 0.016
2 10.02 0.988 0.016
3 19.98 0.988 0.016
Compute multiple quantiles simultaneously:
julia> percs = (5, 25, 50, 75, 95);
julia> summarize(x, Symbol.(:q, percs) => Base.Fix2(quantile, percs ./ 100))
SummaryStats
q5 q25 q50 q75 q95
1 -1.61 -0.668 0.00447 0.653 1.64
2 8.41 9.34 10.0 10.7 11.6
3 18.4 19.3 20.0 20.6 21.6
Extending summarize
to custom types
To support computing summary statistics from a custom object MyType
, overload the method summarize(::MyType, stats_funs...; kwargs...)
, which should ultimately call summarize(::AbstractArray{<:Union{Real,Missing},3}, stats_funs...; other_kwargs...)
, where other_kwargs
are the keyword arguments passed to summarize
.
StatsBase.summarystats
— Functionsummarystats(data::InferenceData; group=:posterior, kwargs...) -> SummaryStats
summarystats(data::Dataset; kwargs...) -> SummaryStats
Compute default summary statistics for the data using PosteriorStats.summarize
.
Credible intervals
PosteriorStats.eti
— Functioneti(samples::AbstractVecOrMat{<:Real}; [prob, kwargs...]) -> IntervalSets.ClosedInterval
eti(samples::AbstractArray{<:Real}; [prob, kwargs...]) -> Array{<:IntervalSets.ClosedInterval}
Estimate the equal-tailed interval (ETI) of samples
for the probability prob
.
The ETI of a given probability is the credible interval wih the property that the probability of being below the interval is equal to the probability of being above it. That is, it is defined by the (1-prob)/2
and 1 - (1-prob)/2
quantiles of the samples.
Arguments
samples
: an array of shape(draws[, chains[, params...]])
. If multiple parameters are present
Keywords
prob
: the probability mass to be contained in the ETI. Default is0.94
.kwargs
: remaining keywords are passed toStatistics.quantile
.
Returns
intervals
: Ifsamples
is a vector or matrix, then a singleIntervalSets.ClosedInterval
is returned. Otherwise, an array with the shape(params...,)
, is returned, containing a marginal ETI for each parameter.
Any default value of prob
is arbitrary. The default value of prob=0.94
instead of a more common default like prob=0.95
is chosen to reminder the user of this arbitrariness.
Examples
Here we calculate the 83% ETI for a normal random variable:
julia> x = randn(2_000);
julia> eti(x; prob=0.83)
-1.3740585250299766 .. 1.2860771129421198
We can also calculate the ETI for a 3-dimensional array of samples:
julia> x = randn(1_000, 1, 1) .+ reshape(0:5:10, 1, 1, :);
julia> eti(x)
3-element Vector{IntervalSets.ClosedInterval{Float64}}:
-1.951006825019686 .. 1.9011666217153793
3.048993174980314 .. 6.9011666217153795
8.048993174980314 .. 11.90116662171538
eti(data::InferenceData; kwargs...) -> Dataset
eti(data::Dataset; kwargs...) -> Dataset
Calculate the equal-tailed interval (ETI) for each parameter in the data.
For more details and a description of the kwargs
, see PosteriorStats.eti
.
PosteriorStats.eti!
— Functioneti!(samples::AbstractArray{<:Real}; [prob, kwargs...])
A version of eti
that partially sorts samples
in-place while computing the ETI.
PosteriorStats.hdi
— Functionhdi(samples::AbstractVecOrMat{<:Real}; [prob, sorted, method]) -> IntervalSets.ClosedInterval
hdi(samples::AbstractArray{<:Real}; [prob, sorted, method]) -> Array{<:IntervalSets.ClosedInterval}
Estimate the highest density interval (HDI) of samples
for the probability prob
.
The HDI is the minimum width Bayesian credible interval (BCI). That is, it is the smallest possible interval containing (100*prob)
% of the probability mass.Hyndman1996 This implementation uses the algorithm of ChenShao1999.
Arguments
samples
: an array of shape(draws[, chains[, params...]])
. If multiple parameters are present, a marginal HDI is computed for each.
Keywords
prob
: the probability mass to be contained in the HDI. Default is0.94
.sorted=false
: iftrue
, the input samples are assumed to be sorted.method::Symbol
: the method used to estimate the HDI. Available options are::unimodal
: Assumes a unimodal distribution (default). Bounds are entries insamples
.:multimodal
: Fits a density estimator tosamples
and finds the HDI of the estimated density. For continuous data, the density estimator is a kernel density estimate (KDE) computed usingkde_reflected
. For discrete data, a histogram with bin width 1 is used.:multimodal_sample
: Like:multimodal
, but uses the density estimator to find the entries insamples
with the highest density and computes the HDI from those points. This can correct for inaccuracies in the density estimator.
is_discrete::Union{Bool,Nothing}=nothing
: Specify if the data is discrete (integer-valued). Ifnothing
, it's automatically determined.kwargs
: For continuous data and multimodalmethod
s, remaining keywords are forwarded tokde_reflected
.
Returns
intervals
: Ifsamples
is a vector or matrix, then a singleIntervalSets.ClosedInterval
is returned for:unimodal
method, or a vector ofClosedInterval
for multimodal methods. For higher dimensional inputs, an array with the shape(params...,)
is returned, containing marginal HDIs for each parameter.
Any default value of prob
is arbitrary. The default value of prob=0.94
instead of a more common default like prob=0.95
is chosen to remind the user of this arbitrariness.
Examples
Here we calculate the 83% HDI for a normal random variable:
julia> x = randn(2_000);
julia> hdi(x; prob=0.83)
-1.3826605224220527 .. 1.259817149822839
We can also calculate the HDI for a 3-dimensional array of samples:
julia> x = randn(1_000, 1, 1) .+ reshape(0:5:10, 1, 1, :);
julia> hdi(x)
3-element Vector{IntervalSets.ClosedInterval{Float64}}:
-1.6402043796029502 .. 2.041852066407182
3.35979562039705 .. 7.041852066407182
8.35979562039705 .. 12.041852066407182
For multimodal distributions, you can use the :multimodal
method:
julia> x = vcat(randn(1000), randn(1000) .+ 5);
julia> hdi(x; method=:multimodal)
2-element Vector{IntervalSets.ClosedInterval{Float64}}:
-1.8882401079608677 .. 2.0017686164555037
2.9839268475847436 .. 6.9235952578447275
References
- Hyndman1996 Hyndman, J. Comput. Graph. Stat., 50:2 (1996)
- ChenShao1999 Chen & Shao, J Comput. Graph. Stat., 8:1 (1999)
hdi(data::InferenceData; kwargs...) -> Dataset
hdi(data::Dataset; kwargs...) -> Dataset
Calculate the highest density interval (HDI) for each parameter in the data.
For more details and a description of the kwargs
, see PosteriorStats.hdi
.
PosteriorStats.hdi!
— Functionhdi!(samples::AbstractArray{<:Real}; [prob, sorted])
A version of hdi
that partially sorts samples
in-place while computing the HDI.
Pareto-smoothed importance sampling
PSIS.PSISResult
— TypePSISResult
Result of Pareto-smoothed importance sampling (PSIS) using psis
.
Properties
log_weights
: un-normalized Pareto-smoothed log weightsweights
: normalized Pareto-smoothed weights (allocates a copy)pareto_shape
: Pareto $k=ξ$ shape parameternparams
: number of parameters inlog_weights
ndraws
: number of draws inlog_weights
nchains
: number of chains inlog_weights
reff
: the ratio of the effective sample size of the unsmoothed importance ratios and the actual sample size.ess
: estimated effective sample size of estimate of mean using smoothed importance samples (seeess_is
)tail_length
: length of the upper tail oflog_weights
that was smoothedtail_dist
: the generalized Pareto distribution that was fit to the tail oflog_weights
. Note that the tail weights are scaled to have a maximum of 1, sotail_dist * exp(maximum(log_ratios))
is the corresponding fit directly to the tail oflog_ratios
.normalized::Bool
:indicates whetherlog_weights
are log-normalized along the sample dimensions.
Diagnostic
The pareto_shape
parameter $k=ξ$ of the generalized Pareto distribution tail_dist
can be used to diagnose reliability and convergence of estimates using the importance weights VehtariSimpson2021.
- if $k < \frac{1}{3}$, importance sampling is stable, and importance sampling (IS) and PSIS both are reliable.
- if $k ≤ \frac{1}{2}$, then the importance ratio distributon has finite variance, and the central limit theorem holds. As $k$ approaches the upper bound, IS becomes less reliable, while PSIS still works well but with a higher RMSE.
- if $\frac{1}{2} < k ≤ 0.7$, then the variance is infinite, and IS can behave quite poorly. However, PSIS works well in this regime.
- if $0.7 < k ≤ 1$, then it quickly becomes impractical to collect enough importance weights to reliably compute estimates, and importance sampling is not recommended.
- if $k > 1$, then neither the variance nor the mean of the raw importance ratios exists. The convergence rate is close to zero, and bias can be large with practical sample sizes.
See PSISPlots.paretoshapeplot
for a diagnostic plot.
References
- VehtariSimpson2021 Vehtari et al. JMLR 25:72 (2021).
PSIS.ess_is
— Functioness_is(weights; reff=1)
Estimate effective sample size (ESS) for importance sampling over the sample dimensions.
Given normalized weights $w_{1:n}$, the ESS is estimated using the L2-norm of the weights:
\[\mathrm{ESS}(w_{1:n}) = \frac{r_{\mathrm{eff}}}{\sum_{i=1}^n w_i^2}\]
where $r_{\mathrm{eff}}$ is the relative efficiency of the log_weights
.
ess_is(result::PSISResult; bad_shape_nan=true)
Estimate ESS for Pareto-smoothed importance sampling.
PSIS.PSISPlots.paretoshapeplot
— Functionparetoshapeplot(values; showlines=false, ...)
paretoshapeplot!(values; showlines=false, kwargs...)
Plot shape parameters of fitted Pareto tail distributions for diagnosing convergence.
values
may be either a vector of Pareto shape parameters or a PSIS.PSISResult
.
If showlines==true
, horizontal lines indicating relevant Pareto shape thresholds are drawn. See PSIS.PSISResult
for an explanation of the thresholds.
All remaining kwargs
are forwarded to the plotting function.
See psis
, PSISResult
.
Examples
using PSIS, Distributions, Plots
proposal = Normal()
target = TDist(7)
x = rand(proposal, 1_000, 100)
log_ratios = logpdf.(target, x) .- logpdf.(proposal, x)
result = psis(log_ratios)
paretoshapeplot(result)
We can also plot the Pareto shape parameters directly:
paretoshapeplot(result.pareto_shape)
We can also use plot
directly:
plot(result.pareto_shape; showlines=true)
PSIS.psis
— Functionpsis(log_ratios, reff = 1.0; kwargs...) -> PSISResult
psis!(log_ratios, reff = 1.0; kwargs...) -> PSISResult
Compute Pareto smoothed importance sampling (PSIS) log weights VehtariSimpson2021.
While psis
computes smoothed log weights out-of-place, psis!
smooths them in-place.
Arguments
log_ratios
: an array of logarithms of importance ratios, with size(draws, [chains, [parameters...]])
, wherechains>1
would be used when chains are generated using Markov chain Monte Carlo.reff::Union{Real,AbstractArray}
: the ratio(s) of effective sample size oflog_ratios
and the actual sample sizereff = ess/(draws * chains)
, used to account for autocorrelation, e.g. due to Markov chain Monte Carlo. If an array, it must have the size(parameters...,)
to matchlog_ratios
.
Keywords
warn=true
: Iftrue
, warning messages are deliverednormalize=true
: Iftrue
, the log-weights will be log-normalized so thatexp.(log_weights)
sums to 1 along the sample dimensions.
Returns
result
: aPSISResult
object containing the results of the Pareto-smoothing.
A warning is raised if the Pareto shape parameter $k ≥ 0.7$. See PSISResult
for details and PSISPlots.paretoshapeplot
for a diagnostic plot.
Examples
Here we smooth log importance ratios for importance sampling 30 isotropic Student $t$-distributed parameters using standard normal distributions as proposals.
julia> using Distributions
julia> proposal, target = Normal(), TDist(7);
julia> x = rand(proposal, 1_000, 1, 30); # (ndraws, nchains, nparams)
julia> log_ratios = @. logpdf(target, x) - logpdf(proposal, x);
julia> result = psis(log_ratios)
┌ Warning: 9 parameters had Pareto shape values 0.7 < k ≤ 1. Resulting importance sampling estimates are likely to be unstable.
└ @ PSIS ~/.julia/packages/PSIS/...
┌ Warning: 1 parameters had Pareto shape values k > 1. Corresponding importance sampling estimates are likely to be unstable and are unlikely to converge with additional samples.
└ @ PSIS ~/.julia/packages/PSIS/...
PSISResult with 1000 draws, 1 chains, and 30 parameters
Pareto shape (k) diagnostic values:
Count Min. ESS
(-Inf, 0.5] good 7 (23.3%) 959
(0.5, 0.7] okay 13 (43.3%) 938
(0.7, 1] bad 9 (30.0%) ——
(1, Inf) very bad 1 (3.3%) ——
If the draws were generated using MCMC, we can compute the relative efficiency using MCMCDiagnosticTools.ess
.
julia> using MCMCDiagnosticTools
julia> reff = ess(log_ratios; kind=:basic, split_chains=1, relative=true);
julia> result = psis(log_ratios, reff)
┌ Warning: 9 parameters had Pareto shape values 0.7 < k ≤ 1. Resulting importance sampling estimates are likely to be unstable.
└ @ PSIS ~/.julia/packages/PSIS/...
┌ Warning: 1 parameters had Pareto shape values k > 1. Corresponding importance sampling estimates are likely to be unstable and are unlikely to converge with additional samples.
└ @ PSIS ~/.julia/packages/PSIS/...
PSISResult with 1000 draws, 1 chains, and 30 parameters
Pareto shape (k) diagnostic values:
Count Min. ESS
(-Inf, 0.5] good 9 (30.0%) 806
(0.5, 0.7] okay 11 (36.7%) 842
(0.7, 1] bad 9 (30.0%) ——
(1, Inf) very bad 1 (3.3%) ——
References
- VehtariSimpson2021 Vehtari et al. JMLR 25:72 (2021).
PSIS.psis!
— Functionpsis(log_ratios, reff = 1.0; kwargs...) -> PSISResult
psis!(log_ratios, reff = 1.0; kwargs...) -> PSISResult
Compute Pareto smoothed importance sampling (PSIS) log weights VehtariSimpson2021.
While psis
computes smoothed log weights out-of-place, psis!
smooths them in-place.
Arguments
log_ratios
: an array of logarithms of importance ratios, with size(draws, [chains, [parameters...]])
, wherechains>1
would be used when chains are generated using Markov chain Monte Carlo.reff::Union{Real,AbstractArray}
: the ratio(s) of effective sample size oflog_ratios
and the actual sample sizereff = ess/(draws * chains)
, used to account for autocorrelation, e.g. due to Markov chain Monte Carlo. If an array, it must have the size(parameters...,)
to matchlog_ratios
.
Keywords
warn=true
: Iftrue
, warning messages are deliverednormalize=true
: Iftrue
, the log-weights will be log-normalized so thatexp.(log_weights)
sums to 1 along the sample dimensions.
Returns
result
: aPSISResult
object containing the results of the Pareto-smoothing.
A warning is raised if the Pareto shape parameter $k ≥ 0.7$. See PSISResult
for details and PSISPlots.paretoshapeplot
for a diagnostic plot.
Examples
Here we smooth log importance ratios for importance sampling 30 isotropic Student $t$-distributed parameters using standard normal distributions as proposals.
julia> using Distributions
julia> proposal, target = Normal(), TDist(7);
julia> x = rand(proposal, 1_000, 1, 30); # (ndraws, nchains, nparams)
julia> log_ratios = @. logpdf(target, x) - logpdf(proposal, x);
julia> result = psis(log_ratios)
┌ Warning: 9 parameters had Pareto shape values 0.7 < k ≤ 1. Resulting importance sampling estimates are likely to be unstable.
└ @ PSIS ~/.julia/packages/PSIS/...
┌ Warning: 1 parameters had Pareto shape values k > 1. Corresponding importance sampling estimates are likely to be unstable and are unlikely to converge with additional samples.
└ @ PSIS ~/.julia/packages/PSIS/...
PSISResult with 1000 draws, 1 chains, and 30 parameters
Pareto shape (k) diagnostic values:
Count Min. ESS
(-Inf, 0.5] good 7 (23.3%) 959
(0.5, 0.7] okay 13 (43.3%) 938
(0.7, 1] bad 9 (30.0%) ——
(1, Inf) very bad 1 (3.3%) ——
If the draws were generated using MCMC, we can compute the relative efficiency using MCMCDiagnosticTools.ess
.
julia> using MCMCDiagnosticTools
julia> reff = ess(log_ratios; kind=:basic, split_chains=1, relative=true);
julia> result = psis(log_ratios, reff)
┌ Warning: 9 parameters had Pareto shape values 0.7 < k ≤ 1. Resulting importance sampling estimates are likely to be unstable.
└ @ PSIS ~/.julia/packages/PSIS/...
┌ Warning: 1 parameters had Pareto shape values k > 1. Corresponding importance sampling estimates are likely to be unstable and are unlikely to converge with additional samples.
└ @ PSIS ~/.julia/packages/PSIS/...
PSISResult with 1000 draws, 1 chains, and 30 parameters
Pareto shape (k) diagnostic values:
Count Min. ESS
(-Inf, 0.5] good 9 (30.0%) 806
(0.5, 0.7] okay 11 (36.7%) 842
(0.7, 1] bad 9 (30.0%) ——
(1, Inf) very bad 1 (3.3%) ——
References
- VehtariSimpson2021 Vehtari et al. JMLR 25:72 (2021).
LOO and WAIC
PosteriorStats.AbstractELPDResult
— Typeabstract type AbstractELPDResult
An abstract type representing the result of an ELPD computation.
Every subtype stores estimates of both the expected log predictive density (elpd
) and the effective number of parameters p
, as well as standard errors and pointwise estimates of each, from which other relevant estimates can be computed.
Subtypes implement the following functions:
PosteriorStats.PSISLOOResult
— TypeResults of Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO).
See also: loo
, AbstractELPDResult
estimates
: Estimates of the expected log pointwise predictive density (ELPD) and effective number of parameters (p)pointwise
: Pointwise estimatespsis_result
: APSIS.PSISResult
with Pareto-smoothed importance sampling (PSIS) results
PosteriorStats.WAICResult
— TypeResults of computing the widely applicable information criterion (WAIC).
See also: waic
, AbstractELPDResult
estimates
: Estimates of the expected log pointwise predictive density (ELPD) and effective number of parameters (p)pointwise
: Pointwise estimates
PosteriorStats.elpd_estimates
— Functionelpd_estimates(result::AbstractELPDResult; pointwise=false) -> (; elpd, se_elpd, lpd)
Return the (E)LPD estimates from the result
.
PosteriorStats.information_criterion
— Functioninformation_criterion(elpd, scale::Symbol)
Compute the information criterion for the given scale
from the elpd
estimate.
scale
must be one of (:deviance, :log, :negative_log)
.
information_criterion(result::AbstractELPDResult, scale::Symbol; pointwise=false)
Compute information criterion for the given scale
from the existing ELPD result
.
scale
must be one of (:deviance, :log, :negative_log)
.
If pointwise=true
, then pointwise estimates are returned.
PosteriorStats.loo
— Functionloo(log_likelihood; reff=nothing, kwargs...) -> PSISLOOResult{<:NamedTuple,<:NamedTuple}
Compute the Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO). Vehtari2017, LOOFAQ
log_likelihood
must be an array of log-likelihood values with shape (chains, draws[, params...])
.
Keywords
reff::Union{Real,AbstractArray{<:Real}}
: The relative effective sample size(s) of the likelihood values. If an array, it must have the same data dimensions as the corresponding log-likelihood variable. If not provided, then this is estimated usingMCMCDiagnosticTools.ess
.kwargs
: Remaining keywords are forwarded toPSIS.psis
.
See also: PSISLOOResult
, waic
Examples
Manually compute $R_\mathrm{eff}$ and calculate PSIS-LOO of a model:
julia> using ArviZExampleData, LogExpFunctions, MCMCDiagnosticTools
julia> idata = load_example_data("centered_eight");
julia> log_like = PermutedDimsArray(idata.log_likelihood.obs, (:draw, :chain, :school));
julia> reff = ess(softmax(log_like; dims=(1, 2)); kind=:basic, split_chains=1, relative=true);
julia> loo(log_like; reff)
PSISLOOResult with estimates
elpd se_elpd p se_p
-31 1.4 0.9 0.33
and PSISResult with 500 draws, 4 chains, and 8 parameters
Pareto shape (k) diagnostic values:
Count Min. ESS
(-Inf, 0.5] good 4 (50.0%) 270
(0.5, 0.7] okay 4 (50.0%) 307
References
- Vehtari2017 Vehtari et al. Stat. Comput. 27 (2017).
- LOOFAQ Vehtari. Cross-validation FAQ.
loo(data::Dataset; [var_name::Symbol,] kwargs...) -> PSISLOOResult{<:NamedTuple,<:Dataset}
loo(data::InferenceData; [var_name::Symbol,] kwargs...) -> PSISLOOResult{<:NamedTuple,<:Dataset}
Compute PSIS-LOO from log-likelihood values in data
.
If more than one log-likelihood variable is present, then var_name
must be provided.
For more details and a description of the kwargs
, see PosteriorStats.loo
.
Examples
Calculate PSIS-LOO of a model:
julia> using ArviZExampleData, PosteriorStats
julia> idata = load_example_data("centered_eight");
julia> loo(idata)
PSISLOOResult with estimates
elpd se_elpd p se_p
-31 1.4 0.9 0.33
and PSISResult with 500 draws, 4 chains, and 8 parameters
Pareto shape (k) diagnostic values:
Count Min. ESS
(-Inf, 0.5] good 4 (50.0%) 270
(0.5, 0.7] okay 4 (50.0%) 307
PosteriorStats.waic
— Functionwaic(log_likelihood::AbstractArray) -> WAICResult{<:NamedTuple,<:NamedTuple}
Compute the widely applicable information criterion (WAIC). Watanabe2010
log_likelihood
must be an array of log-likelihood values with shape (chains, draws[, params...])
.
See also: WAICResult
, loo
Examples
Calculate WAIC of a model:
julia> using ArviZExampleData
julia> idata = load_example_data("centered_eight");
julia> log_like = PermutedDimsArray(idata.log_likelihood.obs, (:draw, :chain, :school));
julia> waic(log_like)
WAICResult with estimates
elpd se_elpd p se_p
-31 1.4 0.9 0.32
References
- Watanabe2010 Watanabe, JMLR 11(116) (2010)
waic(data::Dataset; [var_name::Symbol]) -> WAICResult{<:NamedTuple,<:Dataset}
waic(data::InferenceData; [var_name::Symbol]) -> WAICResult{<:NamedTuple,<:Dataset}
Compute WAIC from log-likelihood values in data
.
If more than one log-likelihood variable is present, then var_name
must be provided.
See PosteriorStats.waic
for more details.
Examples
Calculate WAIC of a model:
julia> using ArviZExampleData, PosteriorStats
julia> idata = load_example_data("centered_eight");
julia> waic(idata)
WAICResult with estimates
elpd se_elpd p se_p
-31 1.4 0.9 0.32
Model comparison
PosteriorStats.ModelComparisonResult
— TypeModelComparisonResult
Result of model comparison using ELPD.
This struct implements the Tables and TableTraits interfaces.
Each field returns a collection of the corresponding entry for each model:
name
: Names of the models, if provided.rank
: Ranks of the models (ordered by decreasing ELPD)elpd_diff
: ELPD of a model subtracted from the largest ELPD of any modelse_elpd_diff
: Standard error of the ELPD differenceweight
: Model weights computed withweights_method
elpd_result
:AbstactELPDResult
s for each model, which can be used to access useful stats like ELPD estimates, pointwise estimates, and Pareto shape values for PSIS-LOOweights_method
: Method used to compute model weights withmodel_weights
PosteriorStats.compare
— Functioncompare(models; kwargs...) -> ModelComparisonResult
Compare models based on their expected log pointwise predictive density (ELPD).
The ELPD is estimated either by Pareto smoothed importance sampling leave-one-out cross-validation (LOO) or using the widely applicable information criterion (WAIC). loo
is the default and recommended method for computing the ELPD. For more theory, see Spiegelhalter2002.
Arguments
models
: aTuple
,NamedTuple
, orAbstractVector
whose values are eitherAbstractELPDResult
entries or any argument toelpd_method
.
Keywords
weights_method::AbstractModelWeightsMethod=Stacking()
: the method to be used to weight the models. Seemodel_weights
for detailselpd_method=loo
: a method that computes anAbstractELPDResult
from an argument inmodels
.sort::Bool=true
: Whether to sort models by decreasing ELPD.
Returns
ModelComparisonResult
: A container for the model comparison results. The fields contain a similar collection tomodels
.
Examples
Compare the centered and non centered models of the eight school problem using the defaults: loo
and Stacking
weights. A custom myloo
method formates the inputs as expected by loo
.
julia> using ArviZExampleData
julia> models = (
centered=load_example_data("centered_eight"),
non_centered=load_example_data("non_centered_eight"),
);
julia> function myloo(idata)
log_like = PermutedDimsArray(idata.log_likelihood.obs, (2, 3, 1))
return loo(log_like)
end;
julia> mc = compare(models; elpd_method=myloo)
┌ Warning: 1 parameters had Pareto shape values 0.7 < k ≤ 1. Resulting importance sampling estimates are likely to be unstable.
└ @ PSIS ~/.julia/packages/PSIS/...
ModelComparisonResult with Stacking weights
rank elpd se_elpd elpd_diff se_elpd_diff weight p se_p
non_centered 1 -31 1.5 0 0.0 1.0 0.9 0.32
centered 2 -31 1.4 0.03 0.061 0.0 0.9 0.33
julia> mc.weight |> pairs
pairs(::NamedTuple) with 2 entries:
:non_centered => 1.0
:centered => 3.50546e-31
Compare the same models from pre-computed PSIS-LOO results and computing BootstrappedPseudoBMA
weights:
julia> elpd_results = mc.elpd_result;
julia> compare(elpd_results; weights_method=BootstrappedPseudoBMA())
ModelComparisonResult with BootstrappedPseudoBMA weights
rank elpd se_elpd elpd_diff se_elpd_diff weight p se_p
non_centered 1 -31 1.5 0 0.0 0.51 0.9 0.32
centered 2 -31 1.4 0.03 0.061 0.49 0.9 0.33
References
- Spiegelhalter2002 Spiegelhalter et al. J. R. Stat. Soc. B 64 (2002)
PosteriorStats.model_weights
— Functionmodel_weights(elpd_results; method=Stacking())
model_weights(method::AbstractModelWeightsMethod, elpd_results)
Compute weights for each model in elpd_results
using method
.
elpd_results
is a Tuple
, NamedTuple
, or AbstractVector
with AbstractELPDResult
entries. The weights are returned in the same type of collection.
Stacking
is the recommended approach, as it performs well even when the true data generating process is not included among the candidate models. See Yao2018 for details.
See also: AbstractModelWeightsMethod
, compare
Examples
Compute Stacking
weights for two models:
julia> using ArviZExampleData
julia> models = (
centered=load_example_data("centered_eight"),
non_centered=load_example_data("non_centered_eight"),
);
julia> elpd_results = map(models) do idata
log_like = PermutedDimsArray(idata.log_likelihood.obs, (2, 3, 1))
return loo(log_like)
end;
┌ Warning: 1 parameters had Pareto shape values 0.7 < k ≤ 1. Resulting importance sampling estimates are likely to be unstable.
└ @ PSIS ~/.julia/packages/PSIS/...
julia> model_weights(elpd_results; method=Stacking()) |> pairs
pairs(::NamedTuple) with 2 entries:
:centered => 3.50546e-31
:non_centered => 1.0
Now we compute BootstrappedPseudoBMA
weights for the same models:
julia> model_weights(elpd_results; method=BootstrappedPseudoBMA()) |> pairs
pairs(::NamedTuple) with 2 entries:
:centered => 0.492513
:non_centered => 0.507487
References
- Yao2018 Yao et al. Bayesian Analysis 13, 3 (2018)
The following model weighting methods are available
PosteriorStats.AbstractModelWeightsMethod
— Typeabstract type AbstractModelWeightsMethod
An abstract type representing methods for computing model weights.
Subtypes implement model_weights
(method, elpd_results)
.
PosteriorStats.BootstrappedPseudoBMA
— Typestruct BootstrappedPseudoBMA{R<:Random.AbstractRNG, T<:Real} <: AbstractModelWeightsMethod
Model weighting method using pseudo Bayesian Model Averaging using Akaike-type weighting with the Bayesian bootstrap (pseudo-BMA+)Yao2018.
The Bayesian bootstrap stabilizes the model weights.
BootstrappedPseudoBMA(; rng=Random.default_rng(), samples=1_000, alpha=1)
BootstrappedPseudoBMA(rng, samples, alpha)
Construct the method.
rng::Random.AbstractRNG
: The random number generator to use for the Bayesian bootstrapsamples::Int64
: The number of samples to draw for bootstrappingalpha::Real
: The shape parameter in the Dirichlet distribution used for the Bayesian bootstrap. The default (1) corresponds to a uniform distribution on the simplex.
See also: Stacking
References
- Yao2018 Yao et al. Bayesian Analysis 13, 3 (2018)
PosteriorStats.PseudoBMA
— Typestruct PseudoBMA <: AbstractModelWeightsMethod
Model weighting method using pseudo Bayesian Model Averaging (pseudo-BMA) and Akaike-type weighting.
PseudoBMA(; regularize=false)
PseudoBMA(regularize)
Construct the method with optional regularization of the weights using the standard error of the ELPD estimate.
This approach is not recommended, as it produces unstable weight estimates. It is recommended to instead use BootstrappedPseudoBMA
to stabilize the weights or Stacking
. For details, see Yao2018.
See also: Stacking
References
- Yao2018 Yao et al. Bayesian Analysis 13, 3 (2018)
PosteriorStats.Stacking
— Typestruct Stacking{O<:Optim.AbstractOptimizer} <: AbstractModelWeightsMethod
Model weighting using stacking of predictive distributionsYao2018.
Stacking(; optimizer=Optim.LBFGS(), options=Optim.Options()
Stacking(optimizer[, options])
Construct the method, optionally customizing the optimization.
optimizer::Optim.AbstractOptimizer
: The optimizer to use for the optimization of the weights. The optimizer must support projected gradient optimization via amanifold
field.options::Optim.Options
: The Optim options to use for the optimization of the weights.
See also: BootstrappedPseudoBMA
References
- Yao2018 Yao et al. Bayesian Analysis 13, 3 (2018)
Predictive checks
PosteriorStats.loo_pit
— Functionloo_pit(y, y_pred, log_weights) -> Union{Real,AbstractArray}
Compute leave-one-out probability integral transform (LOO-PIT) checks.
Arguments
y
: array of observations with shape(params...,)
y_pred
: array of posterior predictive samples with shape(draws, chains, params...)
.log_weights
: array of normalized log LOO importance weights with shape(draws, chains, params...)
.
Returns
pitvals
: LOO-PIT values with same size asy
. Ify
is a scalar, thenpitvals
is a scalar.
LOO-PIT is a marginal posterior predictive check. If $y_{-i}$ is the array $y$ of observations with the $i$th observation left out, and $y_i^*$ is a posterior prediction of the $i$th observation, then the LOO-PIT value for the $i$th observation is defined as
\[P(y_i^* \le y_i \mid y_{-i}) = \int_{-\infty}^{y_i} p(y_i^* \mid y_{-i}) \mathrm{d} y_i^*\]
The LOO posterior predictions and the corresponding observations should have similar distributions, so if conditional predictive distributions are well-calibrated, then for continuous data, all LOO-PIT values should be approximately uniformly distributed on $[0, 1]$. Gabry2019
For discrete data, the LOO-PIT values will typically not be uniformly distributed on $[0, 1]$, and this function is not recommended.
Examples
Calculate LOO-PIT values using as test quantity the observed values themselves.
julia> using ArviZExampleData
julia> idata = load_example_data("centered_eight");
julia> y = idata.observed_data.obs;
julia> y_pred = PermutedDimsArray(idata.posterior_predictive.obs, (:draw, :chain, :school));
julia> log_like = PermutedDimsArray(idata.log_likelihood.obs, (:draw, :chain, :school));
julia> log_weights = loo(log_like).psis_result.log_weights;
julia> loo_pit(y, y_pred, log_weights)
┌ 8-element DimArray{Float64, 1} ┐
├────────────────────────────────┴─────────────────────────────── dims ┐
↓ school Categorical{String} ["Choate", …, "Mt. Hermon"] Unordered
└──────────────────────────────────────────────────────────────────────┘
"Choate" 0.942759
"Deerfield" 0.641057
"Phillips Andover" 0.32729
"Phillips Exeter" 0.581451
"Hotchkiss" 0.288523
"Lawrenceville" 0.393741
"St. Paul's" 0.886175
"Mt. Hermon" 0.638821
Calculate LOO-PIT values using as test quantity the square of the difference between each observation and mu
.
julia> using Statistics
julia> mu = idata.posterior.mu;
julia> T = y .- median(mu);
julia> T_pred = y_pred .- mu;
julia> loo_pit(T .^ 2, T_pred .^ 2, log_weights)
┌ 8-element DimArray{Float64, 1} ┐
├────────────────────────────────┴─────────────────────────────── dims ┐
↓ school Categorical{String} ["Choate", …, "Mt. Hermon"] Unordered
└──────────────────────────────────────────────────────────────────────┘
"Choate" 0.868148
"Deerfield" 0.27421
"Phillips Andover" 0.321719
"Phillips Exeter" 0.193169
"Hotchkiss" 0.370422
"Lawrenceville" 0.195601
"St. Paul's" 0.817408
"Mt. Hermon" 0.326795
References
- Gabry2019 Gabry et al. J. R. Stat. Soc. Ser. A Stat. Soc. 182 (2019).
loo_pit(idata::InferenceData, log_weights; kwargs...) -> DimArray
Compute LOO-PIT values using existing normalized log LOO importance weights.
Keywords
y_name
: Name of observed data variable inidata.observed_data
. If not provided, then the only observed data variable is used.y_pred_name
: Name of posterior predictive variable inidata.posterior_predictive
. If not provided, theny_name
is used.kwargs
: Remaining keywords are forwarded to the base methodPosteriorStats.loo_pit
.
See PosteriorStats.loo_pit
for more details.
Examples
Calculate LOO-PIT values using already computed log weights.
julia> using ArviZExampleData, PosteriorStats
julia> idata = load_example_data("centered_eight");
julia> loo_result = loo(idata; var_name=:obs);
julia> loo_pit(idata, loo_result.psis_result.log_weights; y_name=:obs)
┌ 8-element DimArray{Float64, 1} loo_pit_obs ┐
├────────────────────────────────────────────┴─────────────────── dims ┐
↓ school Categorical{String} ["Choate", …, "Mt. Hermon"] Unordered
└──────────────────────────────────────────────────────────────────────┘
"Choate" 0.942759
"Deerfield" 0.641057
"Phillips Andover" 0.32729
"Phillips Exeter" 0.581451
"Hotchkiss" 0.288523
"Lawrenceville" 0.393741
"St. Paul's" 0.886175
"Mt. Hermon" 0.638821
loo_pit(idata::InferenceData; kwargs...) -> DimArray
Compute LOO-PIT from groups in idata
using PSIS-LOO.
Keywords
y_name
: Name of observed data variable inidata.observed_data
. If not provided, then the only observed data variable is used.y_pred_name
: Name of posterior predictive variable inidata.posterior_predictive
. If not provided, theny_name
is used.log_likelihood_name
: Name of log-likelihood variable inidata.log_likelihood
. If not provided, theny_name
is used ifidata
has alog_likelihood
group, otherwise the only variable is used.reff::Union{Real,AbstractArray{<:Real}}
: The relative effective sample size(s) of the likelihood values. If an array, it must have the same data dimensions as the corresponding log-likelihood variable. If not provided, then this is estimated usingess
.kwargs
: Remaining keywords are forwarded toPosteriorStats.loo_pit
.
See PosteriorStats.loo_pit
for more details.
Examples
Calculate LOO-PIT values using as test quantity the observed values themselves.
julia> using ArviZExampleData, PosteriorStats
julia> idata = load_example_data("centered_eight");
julia> loo_pit(idata; y_name=:obs)
┌ 8-element DimArray{Float64, 1} loo_pit_obs ┐
├────────────────────────────────────────────┴─────────────────── dims ┐
↓ school Categorical{String} ["Choate", …, "Mt. Hermon"] Unordered
└──────────────────────────────────────────────────────────────────────┘
"Choate" 0.942759
"Deerfield" 0.641057
"Phillips Andover" 0.32729
"Phillips Exeter" 0.581451
"Hotchkiss" 0.288523
"Lawrenceville" 0.393741
"St. Paul's" 0.886175
"Mt. Hermon" 0.638821
PosteriorStats.r2_score
— Functionr2_score(y_true::AbstractVector, y_pred::AbstractArray) -> (; r2, r2_std)
$R²$ for linear Bayesian regression models.Gelman2019
Arguments
y_true
: Observed data of lengthnoutputs
y_pred
: Predicted data with size(ndraws[, nchains], noutputs)
Examples
julia> using ArviZExampleData
julia> idata = load_example_data("regression1d");
julia> y_true = idata.observed_data.y;
julia> y_pred = PermutedDimsArray(idata.posterior_predictive.y, (:draw, :chain, :y_dim_0));
julia> r2_score(y_true, y_pred) |> pairs
pairs(::NamedTuple) with 2 entries:
:r2 => 0.683197
:r2_std => 0.0368838
References
- Gelman2019 Gelman et al, The Am. Stat., 73(3) (2019)
r2_score(idata::InferenceData; y_name, y_pred_name) -> (; r2, r2_std)
Compute $R²$ from idata
, automatically formatting the predictions to the correct shape.
Keywords
y_name
: Name of observed data variable inidata.observed_data
. If not provided, then the only observed data variable is used.y_pred_name
: Name of posterior predictive variable inidata.posterior_predictive
. If not provided, theny_name
is used.
See PosteriorStats.r2_score
for more details.
Examples
julia> using ArviZExampleData, PosteriorStats
julia> idata = load_example_data("regression10d");
julia> r2_score(idata) |> pairs
pairs(::NamedTuple) with 2 entries:
:r2 => 0.998385
:r2_std => 0.000100621
Utilities
PosteriorStats.kde_reflected
— Functionkde_reflected(data::AbstractVector{<:Real}; bounds=extrema(data), kwargs...)
Compute the boundary-corrected kernel density estimate (KDE) of data
using reflection.
For $x \in (l, u)$, the reflected KDE has the density
\[\hat{f}_R(x) = \hat{f}(x) + \hat{f}(2l - x) + \hat{f}(2u - x),\]
where $\hat{f}$ is the usual KDE of data
. This is equivalent to augmenting the original data with 2 additional copies of the data reflected around each bound, computing the usual KDE, trimming the KDE to the bounds, and renormalizing.
Any non-finite bounds
are ignored. Remaining kwargs
are passed to KernelDensity.kde
. The default bandwidth is estimated using the Improved Sheather-Jones (ISJ) method Botev2010.
References
- Botev2010 Botev et al. Ann. Stat., 38: 5 (2010)
PosteriorStats.pointwise_loglikelihoods
— Functionpointwise_loglikelihoods
Compute pointwise conditional log-likelihoods for ELPD-based model comparison/validation.
Given model parameters $\theta$ and observations $y$, the pointwise conditional log-likelihood of $y_i$ given $y_{-i}$ (the elements of $y$ excluding $y_i$) and $\theta$ is defined as
\[\log p(y_i \mid y_{-i}, \theta)\]
This method is a utility function that dependant packages can override to provide pointwise conditional log-likelihoods for their own models/distributions.
See also: loo