arviz.hdi

arviz.hdi(ary, hdi_prob=None, circular=False, multimodal=False, skipna=False, group='posterior', var_names=None, filter_vars=None, coords=None, max_modes=10, dask_kwargs=None, **kwargs)

Calculate highest density interval (HDI) of array for given probability.

The HDI is the minimum-width Bayesian credible interval (BCI).

Parameters
ary: obj

Object containing posterior samples. Can be any object that can be converted to an arviz.InferenceData object. Refer to the documentation of arviz.convert_to_dataset() for details.

hdi_prob: float, optional

Probability for which the highest density interval will be computed. Defaults to the stats.hdi_prob rcParam.

circular: bool, optional

Whether to compute the HDI treating ary as a circular variable (in the range [-np.pi, np.pi]). Defaults to False (i.e., non-circular variables). Only works if multimodal is False.

multimodal: bool, optional

If True, more than one HDI may be computed when the distribution is multimodal and the modes are well separated.

skipna: bool, optional

If True, NaN values are ignored when computing the HDI. Defaults to False.

group: str, optional

Specifies which InferenceData group should be used to calculate the HDI. Defaults to ‘posterior’.

var_names: list, optional

Names of variables to include in the hdi report. Prefix the variables by ~ when you want to exclude them from the report: [“~beta”] instead of [“beta”] (see arviz.summary() for more details).

filter_vars: {None, “like”, “regex”}, optional, default=None

If None (default), interpret var_names as the real variable names. If “like”, interpret var_names as substrings of the real variable names. If “regex”, interpret var_names as regular expressions on the real variable names. This mirrors the like and regex options of pandas.DataFrame.filter.

coords: mapping, optional

Specifies the subset over which to calculate the HDI.

max_modes: int, optional

Specifies the maximum number of modes for the multimodal case.

dask_kwargs: dict, optional

Dask related kwargs passed to wrap_xarray_ufunc().

kwargs: dict, optional

Additional keywords passed to wrap_xarray_ufunc().

Returns
np.ndarray or xarray.Dataset, depending upon input

Lower and upper value(s) of the interval(s).

See also

plot_hdi

Plot highest density intervals for regression data.

xarray.Dataset.quantile

Calculate quantiles of array for given probabilities.

Examples

Calculate the HDI of a Normal random variable:

In [1]: import arviz as az
   ...: import numpy as np
   ...: data = np.random.normal(size=2000)
   ...: az.hdi(data, hdi_prob=.68)
   ...: 
Out[1]: array([-0.97520152,  1.00587755])
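
For intuition about what "minimum width" means, the following is a minimal NumPy sketch of the idea for a one-dimensional, unimodal sample. The helper name narrowest_interval is hypothetical and is not part of the ArviZ API; the sketch ignores the circular, multimodal and skipna options, and is only meant to illustrate the definition:

```python
import numpy as np


def narrowest_interval(samples, prob=0.94):
    """Return the narrowest interval holding roughly `prob` of the samples.

    Hypothetical helper for illustration only; not part of ArviZ.
    """
    if not 0 < prob < 1:
        raise ValueError("prob must be strictly between 0 and 1")
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    # Index distance between the two end points of every candidate interval.
    k = int(np.floor(prob * n))
    # Width of each candidate interval [x[i], x[i + k]].
    widths = x[k:] - x[: n - k]
    # The HDI is the candidate with the smallest width.
    i = int(np.argmin(widths))
    return x[i], x[i + k]


rng = np.random.default_rng(0)
skewed = rng.exponential(size=2000)

hdi_low, hdi_high = narrowest_interval(skewed, prob=0.94)
# Equal-tailed interval with the same nominal coverage, for comparison.
eti_low, eti_high = np.percentile(skewed, [3, 97])
print("narrowest:   ", hdi_low, hdi_high)
print("equal-tailed:", eti_low, eti_high)
```

For a skewed sample such as this one, the narrowest interval sits closer to the mode and is shorter than the equal-tailed interval with the same nominal coverage.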

Calculate the HDI of a dataset:

In [2]: import arviz as az
   ...: data = az.load_arviz_data('centered_eight')
   ...: az.hdi(data)
   ...: 
Out[2]: 
<xarray.Dataset>
Dimensions:  (hdi: 2, school: 8)
Coordinates:
  * school   (school) object 'Choate' 'Deerfield' ... "St. Paul's" 'Mt. Hermon'
  * hdi      (hdi) <U6 'lower' 'higher'
Data variables:
    mu       (hdi) float64 -2.118 10.4
    theta    (school, hdi) float64 -3.707 17.34 -4.039 ... 16.92 -5.665 15.27
    tau      (hdi) float64 0.5692 9.386

We can also calculate the HDI for a subset of the variables in the dataset:

In [3]: az.hdi(data, var_names=["mu", "theta"])
Out[3]: 
<xarray.Dataset>
Dimensions:  (hdi: 2, school: 8)
Coordinates:
  * school   (school) object 'Choate' 'Deerfield' ... "St. Paul's" 'Mt. Hermon'
  * hdi      (hdi) <U6 'lower' 'higher'
Data variables:
    mu       (hdi) float64 -2.118 10.4
    theta    (school, hdi) float64 -3.707 17.34 -4.039 ... 16.92 -5.665 15.27
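
The three filter_vars modes described under Parameters can be illustrated with a small standalone sketch. The function select_names below is hypothetical and is not how ArviZ is implemented; it only mimics the documented matching rules (exact names, substrings, regular expressions) and deliberately leaves out the ~ exclusion prefix:

```python
import re


def select_names(all_names, patterns, filter_vars=None):
    """Mimic the documented var_names matching rules (illustration only)."""
    if filter_vars is None:
        # Exact match against the real variable names.
        return [name for name in all_names if name in patterns]
    if filter_vars == "like":
        # Each pattern is treated as a substring.
        return [name for name in all_names if any(p in name for p in patterns)]
    if filter_vars == "regex":
        # Each pattern is treated as a regular expression.
        return [
            name for name in all_names if any(re.search(p, name) for p in patterns)
        ]
    raise ValueError("filter_vars must be None, 'like' or 'regex'")


names = ["mu", "theta", "tau", "theta_tilde"]  # made-up variable names

exact = select_names(names, ["theta"])
like = select_names(names, ["ta"], filter_vars="like")
regex = select_names(names, ["u$"], filter_vars="regex")
print(exact)  # ['theta']
print(like)   # ['theta', 'tau', 'theta_tilde']
print(regex)  # ['mu', 'tau']
```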

By default, the HDI is calculated over the chain and draw dimensions. We can use the input_core_dims argument of wrap_xarray_ufunc() to change this. In this example we also calculate the HDI over the school dimension:

In [4]: az.hdi(data, var_names="theta", input_core_dims = [["chain","draw", "school"]])
Out[4]: 
<xarray.Dataset>
Dimensions:  (hdi: 2)
Coordinates:
  * hdi      (hdi) <U6 'lower' 'higher'
Data variables:
    theta    (hdi) float64 -5.667 14.69

We can also calculate the HDI over a particular selection:

In [5]: az.hdi(data, coords={"chain":[0, 1, 3]}, input_core_dims = [["draw"]])
Out[5]: 
<xarray.Dataset>
Dimensions:  (chain: 3, hdi: 2, school: 8)
Coordinates:
  * chain    (chain) int64 0 1 3
  * school   (school) object 'Choate' 'Deerfield' ... "St. Paul's" 'Mt. Hermon'
  * hdi      (hdi) <U6 'lower' 'higher'
Data variables:
    mu       (chain, hdi) float64 -1.996 9.312 -2.358 10.91 -0.7842 9.985
    theta    (chain, school, hdi) float64 -4.077 17.44 -3.104 ... -3.708 14.21
    tau      (chain, hdi) float64 0.6768 8.881 1.013 9.1 0.5001 8.994
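
To see why the circular option matters for angular data, here is a rough NumPy sketch, under the assumption that a circular interval can be obtained by centring the angles on their circular mean, finding the narrowest interval, and rotating back. The helpers narrowest_interval, wrap_angle and circular_interval are hypothetical names, not ArviZ functions, and ArviZ's own implementation may differ in detail:

```python
import numpy as np


def narrowest_interval(samples, prob=0.94):
    """Narrowest interval holding roughly `prob` of the samples (sketch)."""
    x = np.sort(np.asarray(samples, dtype=float))
    k = int(np.floor(prob * x.size))
    widths = x[k:] - x[: x.size - k]
    i = int(np.argmin(widths))
    return x[i], x[i + k]


def wrap_angle(theta):
    """Map angles (in radians) into the range [-pi, pi)."""
    return (theta + np.pi) % (2 * np.pi) - np.pi


def circular_interval(angles, prob=0.94):
    """Narrowest interval for angular data (sketch)."""
    angles = np.asarray(angles, dtype=float)
    # Circular mean: direction of the average unit vector.
    mean = np.angle(np.mean(np.exp(1j * angles)))
    # Rotate so the data are centred on zero, away from the +/-pi seam.
    centered = wrap_angle(angles - mean)
    low, high = narrowest_interval(centered, prob)
    # Rotate the end points back to the original orientation.
    return wrap_angle(low + mean), wrap_angle(high + mean)


rng = np.random.default_rng(0)
# Angles concentrated around pi, so they straddle the +/-pi seam.
angles = wrap_angle(rng.normal(loc=np.pi, scale=0.2, size=2000))

naive_low, naive_high = narrowest_interval(angles, prob=0.94)
circ_low, circ_high = circular_interval(angles, prob=0.94)

naive_width = naive_high - naive_low
# Arc length travelled counter-clockwise from circ_low to circ_high.
circ_width = (circ_high - circ_low) % (2 * np.pi)
print("naive width:   ", naive_width)
print("circular width:", circ_width)
```

In this sketch the lower end point of the circular interval can be numerically larger than the upper one, which indicates that the interval passes through the ±pi boundary. The naive interval, by contrast, covers almost the whole circle.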