arviz.summary

arviz.summary(data, var_names=None, fmt='wide', round_to=None, include_circ=None, stat_funcs=None, extend=True, credible_interval=0.94, order='C', index_origin=0)

Create a data frame with summary statistics.

Parameters
data : obj

Any object that can be converted to an az.InferenceData object. Refer to the documentation of az.convert_to_dataset for details.

var_names : list

Names of variables to include in summary

include_circ : bool

Whether to include circular statistics

fmt : {‘wide’, ‘long’, ‘xarray’}

Return format: a pandas.DataFrame for ‘wide’ or ‘long’, or an xarray.Dataset for ‘xarray’.

round_to : int

Number of decimals used to round results. Defaults to 2. Use “none” to return raw numbers.

stat_funcs : dict or list

A list of functions or a dict of functions with function names as keys used to calculate statistics. By default, the mean, standard deviation, simulation standard error, and highest posterior density intervals are included.

Each function is given one argument: the samples for a variable as an nD array. The functions should be in the style of a ufunc and return a single number. For example, np.mean or np.var would both work.

extend : boolean

If True, use the statistics returned by stat_funcs in addition to, rather than in place of, the default statistics. This is only meaningful when stat_funcs is not None.

credible_interval : float, optional

Credible interval to compute. Defaults to 0.94. This is only meaningful when stat_funcs is None.

order : {“C”, “F”}

If fmt is “wide”, use either C or F unpacking order. Defaults to C.

index_origin : int

If fmt is “wide”, select n-based indexing for multivariate parameters. Defaults to 0.
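How order and index_origin interact for a multivariate parameter can be illustrated with a small sketch. The helper flat_labels below is hypothetical and not part of ArviZ, and the exact label format (for example theta[0,1]) is an assumption; only the enumeration order and the index offset are the point.

```python
import itertools


def flat_labels(name, shape, order="C", index_origin=0):
    """Hypothetical helper: row labels for a parameter with the given shape."""
    # Shift every axis so that indexing starts at index_origin.
    ranges = [range(index_origin, index_origin + n) for n in shape]
    if order == "C":
        # C order: the last index varies fastest.
        combos = itertools.product(*ranges)
    else:
        # F order: the first index varies fastest. Iterate over the
        # reversed axes, then flip each index tuple back.
        combos = (idx[::-1] for idx in itertools.product(*ranges[::-1]))
    return [f"{name}[{','.join(map(str, idx))}]" for idx in combos]


print(flat_labels("theta", (2, 2)))
# ['theta[0,0]', 'theta[0,1]', 'theta[1,0]', 'theta[1,1]']
print(flat_labels("theta", (2, 2), order="F", index_origin=1))
# ['theta[1,1]', 'theta[2,1]', 'theta[1,2]', 'theta[2,2]']
```

With index_origin=1 the labels line up with 1-based conventions such as those used in Stan or R output, which can make side-by-side comparison easier.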

Returns
pandas.DataFrame

With summary statistics for each variable. Default statistics are: mean, sd, hpd_3%, hpd_97%, mcse_mean, mcse_sd, ess_bulk, ess_tail and r_hat. mcse_mean, mcse_sd, ess_bulk, ess_tail and r_hat are only computed for traces with 2 or more chains.
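The hpd_3% and hpd_97% column names follow from the default credible_interval of 0.94: the remaining 6% of probability mass is split into two nominal 3% tails. The sketch below shows that arithmetic together with a simple narrowest-window HPD estimate for unimodal samples. Both hpd_labels and hpd_interval are illustrative names, not ArviZ functions, and ArviZ's own computation may differ.

```python
import numpy as np


def hpd_labels(credible_interval=0.94):
    """Illustrative column names for the lower and upper HPD bounds."""
    lower = 100 * (1 - credible_interval) / 2
    upper = 100 - lower
    return f"hpd_{lower:g}%", f"hpd_{upper:g}%"


def hpd_interval(samples, credible_interval=0.94):
    """Narrowest interval that contains the requested fraction of samples."""
    x = np.sort(np.asarray(samples, dtype=float).ravel())
    n = x.size
    # Number of steps between the two endpoints of a candidate window.
    k = int(np.floor(credible_interval * n))
    # Width of every window that spans k steps in the sorted samples.
    widths = x[k:] - x[: n - k]
    start = int(np.argmin(widths))
    return x[start], x[start + k]


print(hpd_labels())
# ('hpd_3%', 'hpd_97%')

rng = np.random.default_rng(0)
draws = rng.normal(loc=0.1, scale=0.06, size=(4, 500))  # assumed (chain, draw)
low, high = hpd_interval(draws)
print(low, high)
```

Unlike an equal-tailed interval, the HPD bounds need not sit exactly at the 3rd and 97th percentiles; the column names only record the nominal tail mass.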

Examples

>>> az.summary(trace, ['mu'])
       mean    sd  hpd_3  hpd_97  ess_bulk  ess_tail   r_hat
mu[0]  0.10  0.06  -0.02    0.23     487.0              1.00
mu[1] -0.04  0.06   0.00   -0.17     379.0              1.00

Other statistics can be calculated by passing a list of functions or a dictionary with key, function pairs.

>>> import numpy as np
>>> # Each function reduces the samples of one variable to a single number.
>>> func_dict = {
...     "sd": np.std,
...     "5": lambda x: np.percentile(x, 5),
...     "50": lambda x: np.percentile(x, 50),
...     "95": lambda x: np.percentile(x, 95),
... }
>>> az.summary(trace, ['mu'], stat_funcs=func_dict, extend=False)
         sd     5    50    95
mu[0]  0.06  0.00  0.10  0.21
mu[1]  0.07 -0.16 -0.04  0.06
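A custom statistic only has to reduce an array of samples to one number. The following sketch checks that contract on synthetic data before anything is handed to az.summary. The functions mad and prob_positive and the dict keys are hypothetical names chosen for illustration, and the (chain, draw) sample layout is an assumption.

```python
import numpy as np


def mad(x):
    """Median absolute deviation of all samples, as a single float."""
    x = np.asarray(x, dtype=float)
    return float(np.median(np.abs(x - np.median(x))))


def prob_positive(x):
    """Fraction of samples that are greater than zero."""
    return float(np.mean(np.asarray(x) > 0))


# Keys become the statistic names; values are the reducing functions.
stat_funcs = {"mad": mad, "p_gt_0": prob_positive}

# Synthetic samples standing in for one variable of a trace.
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.1, scale=0.06, size=(4, 500))  # assumed (chain, draw)

for name, func in stat_funcs.items():
    value = func(samples)
    # Every statistic must come back as one plain number.
    assert np.isscalar(value)
    print(name, round(value, 3))
```

A dict like this could then be passed as az.summary(trace, ['mu'], stat_funcs=stat_funcs, extend=True) to report the extra statistics alongside the defaults, or with extend=False to report them alone.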