Working with InferenceData

using ArviZ, ArviZExampleData, DimensionalData, Statistics

Here we present a collection of common manipulations you can use while working with InferenceData.

Let's load one of ArviZ's example datasets. posterior, posterior_predictive, etc. are the groups stored in idata, and each group is stored as a Dataset. In this HTML view, you can click a group name to expand a summary of the group.

idata = load_example_data("centered_eight")
InferenceData
posterior
╭─────────────────╮
│ 500×4×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :mu    eltype: Float64 dims: draw, chain size: 500×4
  :theta eltype: Float64 dims: school, draw, chain size: 8×500×4
  :tau   eltype: Float64 dims: draw, chain size: 500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at" => "2022-10-13T14:37:37.315398"
  "inference_library_version" => "4.2.2"
  "sampling_time" => 7.48011
  "tuning_steps" => 1000
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
posterior_predictive
╭─────────────────╮
│ 8×500×4 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:41.460544"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
log_likelihood
╭─────────────────╮
│ 8×500×4 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:37.487399"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
sample_stats
╭───────────────╮
│ 500×4 Dataset │
├───────────────┴─────────────────────────────────────────────────────── dims ┐
  ↓ draw  Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
├─────────────────────────────────────────────────────────────────────────────┴ layers ┐
  :max_energy_error    eltype: Float64 dims: draw, chain size: 500×4
  :energy_error        eltype: Float64 dims: draw, chain size: 500×4
  :lp                  eltype: Float64 dims: draw, chain size: 500×4
  :index_in_trajectory eltype: Int64 dims: draw, chain size: 500×4
  :acceptance_rate     eltype: Float64 dims: draw, chain size: 500×4
  :diverging           eltype: Bool dims: draw, chain size: 500×4
  :process_time_diff   eltype: Float64 dims: draw, chain size: 500×4
  :n_steps             eltype: Float64 dims: draw, chain size: 500×4
  :perf_counter_start  eltype: Float64 dims: draw, chain size: 500×4
  :largest_eigval      eltype: Union{Missing, Float64} dims: draw, chain size: 500×4
  :smallest_eigval     eltype: Union{Missing, Float64} dims: draw, chain size: 500×4
  :step_size_bar       eltype: Float64 dims: draw, chain size: 500×4
  :step_size           eltype: Float64 dims: draw, chain size: 500×4
  :energy              eltype: Float64 dims: draw, chain size: 500×4
  :tree_depth          eltype: Int64 dims: draw, chain size: 500×4
  :perf_counter_diff   eltype: Float64 dims: draw, chain size: 500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at" => "2022-10-13T14:37:37.324929"
  "inference_library_version" => "4.2.2"
  "sampling_time" => 7.48011
  "tuning_steps" => 1000
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
prior
╭─────────────────╮
│ 500×1×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :tau   eltype: Float64 dims: draw, chain size: 500×1
  :theta eltype: Float64 dims: school, draw, chain size: 8×500×1
  :mu    eltype: Float64 dims: draw, chain size: 500×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.602116"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
prior_predictive
╭─────────────────╮
│ 8×500×1 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×500×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.604969"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
observed_data
╭───────────────────╮
│ 8-element Dataset │
├───────────────────┴──────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school size: 8
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.606375"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
constant_data
╭───────────────────╮
│ 8-element Dataset │
├───────────────────┴──────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :scores eltype: Float64 dims: school size: 8
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.607471"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
Info

Datasets are DimensionalData.AbstractDimStacks and can be used identically. The variables a Dataset contains are called "layers", and dimensions of the same name that appear in more than one layer within a Dataset must have the same indices.

InferenceData behaves like a NamedTuple and can be used similarly. Note that unlike a NamedTuple, the groups always appear in a specific order.

length(idata) # number of groups
8
keys(idata) # group names
(:posterior, :posterior_predictive, :log_likelihood, :sample_stats, :prior, :prior_predictive, :observed_data, :constant_data)
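
Since InferenceData is described as behaving like a NamedTuple, it can help to see the same operations on a plain NamedTuple. The sketch below uses only Base Julia. The name groups and its placeholder arrays are hypothetical stand-ins, and whether every call shown here (for example haskey and pairs) carries over to InferenceData unchanged is an assumption worth checking against the InferenceData API.

```julia
# Stand-in for InferenceData: a NamedTuple whose values are placeholder arrays
# (hypothetical data, not the real groups).
groups = (posterior=[1.0, 2.0], prior=[0.5], observed_data=[28.0, 8.0])

n_groups = length(groups)             # number of entries
group_names = keys(groups)            # tuple of Symbols
has_prior = haskey(groups, :prior)    # membership test by name

# Property access and indexing return the same object, with no copy.
same_object = groups.posterior === groups[:posterior]

# Iterate over names and values together.
for (name, value) in pairs(groups)
    println(name, " has ", length(value), " element(s)")
end
```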

Get the dataset corresponding to a single group

Group datasets can be accessed either as properties or as indexed items.

post = idata.posterior
╭─────────────────╮
│ 500×4×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :mu    eltype: Float64 dims: draw, chain size: 500×4
  :theta eltype: Float64 dims: school, draw, chain size: 8×500×4
  :tau   eltype: Float64 dims: draw, chain size: 500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at"                => "2022-10-13T14:37:37.315398"
  "inference_library_version" => "4.2.2"
  "sampling_time"             => 7.48011
  "tuning_steps"              => 1000
  "arviz_version"             => "0.13.0.dev0"
  "inference_library"         => "pymc"

post is the dataset itself, so this is a non-allocating operation.

idata[:posterior] === post
true

InferenceData supports a more advanced indexing syntax, which we'll see later.

Getting a new InferenceData with a subset of groups

We can index by a collection of group names to get a new InferenceData with just those groups. This is also non-allocating.

idata_sub = idata[(:posterior, :posterior_predictive)]
InferenceData
posterior
╭─────────────────╮
│ 500×4×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :mu    eltype: Float64 dims: draw, chain size: 500×4
  :theta eltype: Float64 dims: school, draw, chain size: 8×500×4
  :tau   eltype: Float64 dims: draw, chain size: 500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at" => "2022-10-13T14:37:37.315398"
  "inference_library_version" => "4.2.2"
  "sampling_time" => 7.48011
  "tuning_steps" => 1000
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
posterior_predictive
╭─────────────────╮
│ 8×500×4 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:41.460544"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"

Adding groups to an InferenceData

InferenceData is immutable, so to add or replace groups we use merge to create a new object.

merge(idata_sub, idata[(:observed_data, :prior)])
InferenceData
posterior
╭─────────────────╮
│ 500×4×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :mu    eltype: Float64 dims: draw, chain size: 500×4
  :theta eltype: Float64 dims: school, draw, chain size: 8×500×4
  :tau   eltype: Float64 dims: draw, chain size: 500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at" => "2022-10-13T14:37:37.315398"
  "inference_library_version" => "4.2.2"
  "sampling_time" => 7.48011
  "tuning_steps" => 1000
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
posterior_predictive
╭─────────────────╮
│ 8×500×4 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:41.460544"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
prior
╭─────────────────╮
│ 500×1×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :tau   eltype: Float64 dims: draw, chain size: 500×1
  :theta eltype: Float64 dims: school, draw, chain size: 8×500×1
  :mu    eltype: Float64 dims: draw, chain size: 500×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.602116"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
observed_data
╭───────────────────╮
│ 8-element Dataset │
├───────────────────┴──────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school size: 8
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.606375"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"

We can also use Base.setindex to add or replace a single group out-of-place.

Base.setindex(idata_sub, idata.prior, :prior)
InferenceData
posterior
╭─────────────────╮
│ 500×4×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :mu    eltype: Float64 dims: draw, chain size: 500×4
  :theta eltype: Float64 dims: school, draw, chain size: 8×500×4
  :tau   eltype: Float64 dims: draw, chain size: 500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at" => "2022-10-13T14:37:37.315398"
  "inference_library_version" => "4.2.2"
  "sampling_time" => 7.48011
  "tuning_steps" => 1000
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
posterior_predictive
╭─────────────────╮
│ 8×500×4 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:41.460544"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
prior
╭─────────────────╮
│ 500×1×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :tau   eltype: Float64 dims: draw, chain size: 500×1
  :theta eltype: Float64 dims: school, draw, chain size: 8×500×1
  :mu    eltype: Float64 dims: draw, chain size: 500×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.602116"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
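
The out-of-place semantics of merge and Base.setindex can be seen in isolation on a plain NamedTuple. This is only a sketch of the Base Julia behavior that the calls above are modeled on. The names a and b and their integer values are placeholders.

```julia
a = (posterior=1, posterior_predictive=2)   # placeholder values
b = (observed_data=3, prior=4)

added = merge(a, b)                   # adds the entries of b after those of a
replaced = merge(a, (posterior=10,))  # an existing name keeps its position, new value wins
single = Base.setindex(a, 4, :prior)  # add (or replace) one entry

# None of these calls modified `a`.
unchanged = a == (posterior=1, posterior_predictive=2)
```

One difference is visible in the merge output above: a NamedTuple keeps the order in which entries were merged, whereas the merged InferenceData lists prior before observed_data even though observed_data was named first.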

Add a new variable

Dataset is also immutable. So while the values within the underlying data arrays can be mutated, layers cannot be added or removed from Datasets, and groups cannot be added/removed from InferenceData.

Instead, we do this out-of-place, again using merge.

merge(post, (log_tau=log.(post[:tau]),))
╭─────────────────╮
│ 500×4×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :mu      eltype: Float64 dims: draw, chain size: 500×4
  :theta   eltype: Float64 dims: school, draw, chain size: 8×500×4
  :tau     eltype: Float64 dims: draw, chain size: 500×4
  :log_tau eltype: Float64 dims: draw, chain size: 500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at"                => "2022-10-13T14:37:37.315398"
  "inference_library_version" => "4.2.2"
  "sampling_time"             => 7.48011
  "tuning_steps"              => 1000
  "arviz_version"             => "0.13.0.dev0"
  "inference_library"         => "pymc"
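
The merge call above returns a new Dataset and leaves idata untouched. One possible way to get an InferenceData that carries the new variable is to combine it with the Base.setindex pattern shown earlier. This is a sketch that uses only calls appearing on this page. The names post_new and idata_new are introduced here for illustration.

```julia
using ArviZ, ArviZExampleData

idata = load_example_data("centered_eight")
post = idata.posterior

# New Dataset with an extra layer; `post` itself is not modified.
post_new = merge(post, (log_tau=log.(post[:tau]),))

# New InferenceData whose posterior group is the extended Dataset.
idata_new = Base.setindex(idata, post_new, :posterior)

# The new layer is present in idata_new but not in the original idata.
:log_tau in keys(idata_new.posterior)
:log_tau in keys(idata.posterior)
```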

Obtain an array for a given parameter

Let’s say we want to get the values for tau as an array. Parameters can be accessed with either property or index syntax.

post.tau
╭───────────────────────────────╮
│ 500×4 DimArray{Float64,2} tau │
├───────────────────────────────┴─────────────────────────────────────── dims ┐
  ↓ draw  Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
└─────────────────────────────────────────────────────────────────────────────┘
      0        1         2        3
   0    4.72574  1.97083   3.50128  6.07326
   1    3.90899  2.04903   2.89324  3.77187
   2    4.84403  2.12376   4.27329  3.17054
   3    1.8567   3.39183  11.8965   6.00193
   ⋮                                
 495    7.56498  1.61268   3.56495  2.78607
 496    2.24702  1.84816   2.55959  4.28196
 497    1.89384  2.17459   4.08978  2.74061
 498    5.92006  1.32755   2.72017  2.93238
 499    4.3259   1.21199   1.91701  4.46125
post[:tau] === post.tau
true

To remove the dimensions, just use parent to retrieve the underlying array.

parent(post.tau)
500×4 Matrix{Float64}:
 4.72574   1.97083   3.50128  6.07326
 3.90899   2.04903   2.89324  3.77187
 4.84403   2.12376   4.27329  3.17054
 1.8567    3.39183  11.8965   6.00193
 4.74841   4.84368   7.11325  3.28632
 3.51387  10.8872    7.18892  2.16314
 4.20898   4.01889   9.0977   7.68505
 2.6834    4.28584   7.84286  4.08612
 1.16889   3.70403  17.1548   5.1157
 1.21052   3.15829  16.7573   4.86939
 ⋮                            
 2.05742   1.09087  10.8168   5.08507
 2.72536   1.09087   2.16788  6.1552
 5.97049   1.67101   5.19169  8.23756
 8.15827   1.61268   4.96249  3.13966
 7.56498   1.61268   3.56495  2.78607
 2.24702   1.84816   2.55959  4.28196
 1.89384   2.17459   4.08978  2.74061
 5.92006   1.32755   2.72017  2.93238
 4.3259    1.21199   1.91701  4.46125
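
Once the values are available as an array, ordinary array functions apply. As a small sketch using the Statistics standard library loaded at the top of this page, the following computes the overall mean of tau and the per-chain means. The variable names are illustrative. The comment about parent returning the wrapped array rather than a copy reflects how DimensionalData is understood to work and should be treated as an assumption.

```julia
using ArviZ, ArviZExampleData, Statistics

idata = load_example_data("centered_eight")
tau = idata.posterior.tau

# A DimArray is an AbstractArray, so reductions work on it directly.
overall_mean = mean(tau)

# Plain Matrix with draws along dimension 1 and chains along dimension 2.
# Assumption: this is the wrapped array, not a copy, so mutating it
# would also change the values seen through the Dataset.
tau_raw = parent(tau)

# Average over draws (dimension 1) to get one value per chain.
chain_means = vec(mean(tau_raw; dims=1))
```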

Get the dimension lengths

Let’s check how many groups (here, schools) are in our hierarchical model.

size(idata.observed_data, :school)
8

Get coordinate/index values

What are the names of the groups in our hierarchical model? In this case, they are the index values of the school coordinate.

DimensionalData.index(idata.observed_data, :school)
8-element Vector{String}:
 "Choate"
 "Deerfield"
 "Phillips Andover"
 "Phillips Exeter"
 "Hotchkiss"
 "Lawrenceville"
 "St. Paul's"
 "Mt. Hermon"

Get a subset of chains

Let’s keep only chain 0 here. For the subset to take effect on all relevant InferenceData groups – posterior, sample_stats, log_likelihood, and posterior_predictive – we will index InferenceData instead of Dataset.

Here we use DimensionalData's At selector. Its other selectors are also supported.

idata[chain=At(0)]
InferenceData
posterior
╭─────────────────╮
│ 500×1×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :mu    eltype: Float64 dims: draw, chain size: 500×1
  :theta eltype: Float64 dims: school, draw, chain size: 8×500×1
  :tau   eltype: Float64 dims: draw, chain size: 500×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at" => "2022-10-13T14:37:37.315398"
  "inference_library_version" => "4.2.2"
  "sampling_time" => 7.48011
  "tuning_steps" => 1000
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
posterior_predictive
╭─────────────────╮
│ 8×500×1 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×500×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:41.460544"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
log_likelihood
╭─────────────────╮
│ 8×500×1 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×500×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:37.487399"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
sample_stats
╭───────────────╮
│ 500×1 Dataset │
├───────────────┴─────────────────────────────────────────────────────── dims ┐
  ↓ draw  Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain Sampled{Int64} [0] ForwardOrdered Irregular Points
├─────────────────────────────────────────────────────────────────────────────┴ layers ┐
  :max_energy_error    eltype: Float64 dims: draw, chain size: 500×1
  :energy_error        eltype: Float64 dims: draw, chain size: 500×1
  :lp                  eltype: Float64 dims: draw, chain size: 500×1
  :index_in_trajectory eltype: Int64 dims: draw, chain size: 500×1
  :acceptance_rate     eltype: Float64 dims: draw, chain size: 500×1
  :diverging           eltype: Bool dims: draw, chain size: 500×1
  :process_time_diff   eltype: Float64 dims: draw, chain size: 500×1
  :n_steps             eltype: Float64 dims: draw, chain size: 500×1
  :perf_counter_start  eltype: Float64 dims: draw, chain size: 500×1
  :largest_eigval      eltype: Union{Missing, Float64} dims: draw, chain size: 500×1
  :smallest_eigval     eltype: Union{Missing, Float64} dims: draw, chain size: 500×1
  :step_size_bar       eltype: Float64 dims: draw, chain size: 500×1
  :step_size           eltype: Float64 dims: draw, chain size: 500×1
  :energy              eltype: Float64 dims: draw, chain size: 500×1
  :tree_depth          eltype: Int64 dims: draw, chain size: 500×1
  :perf_counter_diff   eltype: Float64 dims: draw, chain size: 500×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at" => "2022-10-13T14:37:37.324929"
  "inference_library_version" => "4.2.2"
  "sampling_time" => 7.48011
  "tuning_steps" => 1000
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
prior
╭─────────────────╮
│ 500×1×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :tau   eltype: Float64 dims: draw, chain size: 500×1
  :theta eltype: Float64 dims: school, draw, chain size: 8×500×1
  :mu    eltype: Float64 dims: draw, chain size: 500×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.602116"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
prior_predictive
╭─────────────────╮
│ 8×500×1 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×500×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.604969"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
observed_data
╭───────────────────╮
│ 8-element Dataset │
├───────────────────┴──────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school size: 8
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.606375"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
constant_data
╭───────────────────╮
│ 8-element Dataset │
├───────────────────┴──────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :scores eltype: Float64 dims: school size: 8
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.607471"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"

Note that in this case, prior only has a chain of 0. If it also had the other chains, we could have passed chain=At([0, 2]) to subset by chains 0 and 2.

Warning

If we used idata[chain=[0, 2]] without the At selector, [0, 2] would be treated as positions into the array of dimension indices, roughly DimensionalData.index(idata.posterior, :chain)[[0, 2]]. Because Julia arrays are 1-based, position 0 is out of bounds and this would error. If we had instead requested idata[chain=[1, 2]], we would not hit an error, but we would index the wrong chains: positions 1 and 2 of the posterior's chain index hold the labels 0 and 1. So it's important to always use a selector to index by values of dimension indices.
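
The difference between indexing by label and by position is easy to reproduce on a small DimArray. The sketch below builds a hypothetical one-dimensional array A whose chain dimension is labeled 0 to 3, mirroring the chain index of the example data. The values 10.0 to 13.0 are placeholders chosen so that each value reveals which chain it came from.

```julia
using DimensionalData

# Hypothetical array: the value 10.0 + k belongs to the chain labeled k.
A = DimArray([10.0, 11.0, 12.0, 13.0], (Dim{:chain}(0:3),))

# With At, [1, 2] is matched against the labels: chains 1 and 2.
by_label = A[chain=At([1, 2])]

# Without a selector, [1, 2] is a pair of positions: chains labeled 0 and 1.
by_position = A[chain=[1, 2]]

parent(by_label)     # values 11.0 and 12.0
parent(by_position)  # values 10.0 and 11.0
```

Requesting A[chain=[0, 2]] in this sketch would throw a BoundsError, since there is no position 0.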

Remove the first $n$ draws (burn-in)

Let’s say we want to remove the first 100 draws from all the chains and all InferenceData groups with draws. To do this we use the .. syntax from IntervalSets.jl, which is exported by DimensionalData.

idata[draw=100 .. Inf]
InferenceData
posterior
╭─────────────────╮
│ 400×4×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [100, 101, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :mu    eltype: Float64 dims: draw, chain size: 400×4
  :theta eltype: Float64 dims: school, draw, chain size: 8×400×4
  :tau   eltype: Float64 dims: draw, chain size: 400×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at" => "2022-10-13T14:37:37.315398"
  "inference_library_version" => "4.2.2"
  "sampling_time" => 7.48011
  "tuning_steps" => 1000
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
posterior_predictive
╭─────────────────╮
│ 8×400×4 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [100, 101, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×400×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:41.460544"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
log_likelihood
╭─────────────────╮
│ 8×400×4 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [100, 101, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×400×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:37.487399"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
sample_stats
╭───────────────╮
│ 400×4 Dataset │
├───────────────┴──────────────────────────────────────────────────────── dims ┐
  ↓ draw  Sampled{Int64} [100, 101, …, 498, 499] ForwardOrdered Irregular Points,
  → chain Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :max_energy_error    eltype: Float64 dims: draw, chain size: 400×4
  :energy_error        eltype: Float64 dims: draw, chain size: 400×4
  :lp                  eltype: Float64 dims: draw, chain size: 400×4
  :index_in_trajectory eltype: Int64 dims: draw, chain size: 400×4
  :acceptance_rate     eltype: Float64 dims: draw, chain size: 400×4
  :diverging           eltype: Bool dims: draw, chain size: 400×4
  :process_time_diff   eltype: Float64 dims: draw, chain size: 400×4
  :n_steps             eltype: Float64 dims: draw, chain size: 400×4
  :perf_counter_start  eltype: Float64 dims: draw, chain size: 400×4
  :largest_eigval      eltype: Union{Missing, Float64} dims: draw, chain size: 400×4
  :smallest_eigval     eltype: Union{Missing, Float64} dims: draw, chain size: 400×4
  :step_size_bar       eltype: Float64 dims: draw, chain size: 400×4
  :step_size           eltype: Float64 dims: draw, chain size: 400×4
  :energy              eltype: Float64 dims: draw, chain size: 400×4
  :tree_depth          eltype: Int64 dims: draw, chain size: 400×4
  :perf_counter_diff   eltype: Float64 dims: draw, chain size: 400×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at" => "2022-10-13T14:37:37.324929"
  "inference_library_version" => "4.2.2"
  "sampling_time" => 7.48011
  "tuning_steps" => 1000
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
prior
╭─────────────────╮
│ 400×1×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [100, 101, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :tau   eltype: Float64 dims: draw, chain size: 400×1
  :theta eltype: Float64 dims: school, draw, chain size: 8×400×1
  :mu    eltype: Float64 dims: draw, chain size: 400×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.602116"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
prior_predictive
╭─────────────────╮
│ 8×400×1 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [100, 101, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×400×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.604969"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
observed_data
╭───────────────────╮
│ 8-element Dataset │
├───────────────────┴──────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school size: 8
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.606375"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
constant_data
╭───────────────────╮
│ 8-element Dataset │
├───────────────────┴──────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :scores eltype: Float64 dims: school size: 8
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.607471"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"

If you check the object you will see that every group with a draw dimension (posterior, posterior_predictive, log_likelihood, sample_stats, prior, and prior_predictive) now has 400 draws, compared to 500 in idata. The groups observed_data and constant_data have not been affected because they do not have the draw dimension.
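
The same interval selection can be tried on a small array without loading the example data. The following is a minimal sketch, assuming only that DimensionalData is installed; the array x, its sizes, and its values are made up for illustration.

```julia
using DimensionalData

# Toy array standing in for one variable: 10 draws (labelled 0 to 9) by 2 chains.
x = DimArray(
    reshape(collect(1.0:20.0), 10, 2),
    (Dim{:draw}(collect(0:9)), Dim{:chain}(collect(0:1))),
)

# Keep the draws whose label lies in the closed interval [3, Inf),
# which discards the first three draws.
kept = x[draw=3 .. Inf]

size(kept)                    # (7, 2)
collect(lookup(kept, :draw))  # labels 3 through 9
```

Note that the interval selects by draw label, not by position, so the result depends on how the draws are numbered.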

Alternatively, you can change a subset of groups by combining indexing styles with merge. Here we use this to build a new InferenceData where we have discarded the first 100 draws only from posterior.

merge(idata, idata[(:posterior,), draw=100 .. Inf])
InferenceData
posterior
╭─────────────────╮
│ 400×4×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [100, 101, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :mu    eltype: Float64 dims: draw, chain size: 400×4
  :theta eltype: Float64 dims: school, draw, chain size: 8×400×4
  :tau   eltype: Float64 dims: draw, chain size: 400×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at" => "2022-10-13T14:37:37.315398"
  "inference_library_version" => "4.2.2"
  "sampling_time" => 7.48011
  "tuning_steps" => 1000
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
posterior_predictive
╭─────────────────╮
│ 8×500×4 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:41.460544"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
log_likelihood
╭─────────────────╮
│ 8×500×4 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:37.487399"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
sample_stats
╭───────────────╮
│ 500×4 Dataset │
├───────────────┴─────────────────────────────────────────────────────── dims ┐
  ↓ draw  Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
├─────────────────────────────────────────────────────────────────────────────┴ layers ┐
  :max_energy_error    eltype: Float64 dims: draw, chain size: 500×4
  :energy_error        eltype: Float64 dims: draw, chain size: 500×4
  :lp                  eltype: Float64 dims: draw, chain size: 500×4
  :index_in_trajectory eltype: Int64 dims: draw, chain size: 500×4
  :acceptance_rate     eltype: Float64 dims: draw, chain size: 500×4
  :diverging           eltype: Bool dims: draw, chain size: 500×4
  :process_time_diff   eltype: Float64 dims: draw, chain size: 500×4
  :n_steps             eltype: Float64 dims: draw, chain size: 500×4
  :perf_counter_start  eltype: Float64 dims: draw, chain size: 500×4
  :largest_eigval      eltype: Union{Missing, Float64} dims: draw, chain size: 500×4
  :smallest_eigval     eltype: Union{Missing, Float64} dims: draw, chain size: 500×4
  :step_size_bar       eltype: Float64 dims: draw, chain size: 500×4
  :step_size           eltype: Float64 dims: draw, chain size: 500×4
  :energy              eltype: Float64 dims: draw, chain size: 500×4
  :tree_depth          eltype: Int64 dims: draw, chain size: 500×4
  :perf_counter_diff   eltype: Float64 dims: draw, chain size: 500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at" => "2022-10-13T14:37:37.324929"
  "inference_library_version" => "4.2.2"
  "sampling_time" => 7.48011
  "tuning_steps" => 1000
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
prior
╭─────────────────╮
│ 500×1×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :tau   eltype: Float64 dims: draw, chain size: 500×1
  :theta eltype: Float64 dims: school, draw, chain size: 8×500×1
  :mu    eltype: Float64 dims: draw, chain size: 500×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.602116"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
prior_predictive
╭─────────────────╮
│ 8×500×1 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×500×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.604969"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
observed_data
╭───────────────────╮
│ 8-element Dataset │
├───────────────────┴──────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school size: 8
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.606375"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
constant_data
╭───────────────────╮
│ 8-element Dataset │
├───────────────────┴──────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :scores eltype: Float64 dims: school size: 8
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.607471"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"

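The subset-then-merge pattern used above can be sketched with a plain NamedTuple acting as a hypothetical stand-in for the InferenceData. The names groups, make_group, trimmed, and merged are illustrative only, and indexing a NamedTuple with a tuple of symbols assumes Julia 1.7 or later.

```julia
using DimensionalData

# Hypothetical stand-ins for two groups, each with 10 draws labelled 0 to 9.
make_group() = DimArray(collect(1.0:10.0), (Dim{:draw}(collect(0:9)),))
groups = (posterior=make_group(), sample_stats=make_group())

# Trim only the selected group, then merge it back over the originals.
trimmed = map(g -> g[draw=5 .. Inf], groups[(:posterior,)])
merged = merge(groups, trimmed)

size(merged.posterior)     # (5,)
size(merged.sample_stats)  # (10,)
```

Entries on the right-hand side of merge replace entries with the same name on the left, so only the selected group changes.
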
Compute posterior mean values along draw and chain dimensions

To compute the mean value of the posterior samples, do the following:

mean(post)
(mu = 4.485933103402338,
 theta = 4.911515591394205,
 tau = 4.124222787491913,)

This computes the mean over all dimensions, discarding them and returning the result as a NamedTuple. This may be what you want for mu and tau, which have only two dimensions (chain and draw), but perhaps not for theta, which has an additional school dimension.

You can specify the dimensions along which to compute the mean (or another reduction); this instead returns a Dataset.

mean(post; dims=(:chain, :draw))
╭───────────────╮
│ 1×1×8 Dataset │
├───────────────┴──────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Float64} [249.5] ForwardOrdered Irregular Points,
  → chain  Sampled{Float64} [1.5] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :mu    eltype: Float64 dims: draw, chain size: 1×1
  :theta eltype: Float64 dims: school, draw, chain size: 8×1×1
  :tau   eltype: Float64 dims: draw, chain size: 1×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at"                => "2022-10-13T14:37:37.315398"
  "inference_library_version" => "4.2.2"
  "sampling_time"             => 7.48011
  "tuning_steps"              => 1000
  "arviz_version"             => "0.13.0.dev0"
  "inference_library"         => "pymc"

The singleton dimensions of chain and draw now contain meaningless indices, so you may want to discard them, which you can do with dropdims.

dropdims(mean(post; dims=(:chain, :draw)); dims=(:chain, :draw))
╭───────────────────╮
│ 8-element Dataset │
├───────────────────┴──────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :mu    eltype: Float64 dims: 
  :theta eltype: Float64 dims: school size: 8
  :tau   eltype: Float64 dims: 
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at"                => "2022-10-13T14:37:37.315398"
  "inference_library_version" => "4.2.2"
  "sampling_time"             => 7.48011
  "tuning_steps"              => 1000
  "arviz_version"             => "0.13.0.dev0"
  "inference_library"         => "pymc"

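The reduce-then-drop pattern can be checked by hand on a small array. This is a sketch on a toy DimArray rather than on the example posterior; the array theta and its values are made up so that the per-school means are easy to verify.

```julia
using DimensionalData, Statistics

# Toy variable with 2 schools, 4 draws, and 3 chains, filled with 1.0 to 24.0.
theta = DimArray(
    reshape(collect(1.0:24.0), 2, 4, 3),
    (Dim{:school}(["A", "B"]), Dim{:draw}(collect(0:3)), Dim{:chain}(collect(0:2))),
)

# Reducing over draw and chain keeps them as singleton dimensions.
m = mean(theta; dims=(:draw, :chain))
size(m)  # (2, 1, 1)

# dropdims removes the singleton dimensions, leaving one mean per school.
school_means = dropdims(m; dims=(:draw, :chain))
size(school_means)               # (2,)
school_means[school=At("A")]     # 12.0
```
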
Renaming a dimension

We can rename a dimension using DimensionalData's set method, shown here on the theta variable of the posterior:

theta_bis = set(post.theta; school=:school_bis)
╭───────────────────────────────────╮
│ 8×500×4 DimArray{Float64,3} theta │
├───────────────────────────────────┴──────────────────────────────────── dims ┐
  ↓ school_bis Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw       Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain      Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
└──────────────────────────────────────────────────────────────────────────────┘
[:, :, 1]
                      0         …   497         498         499
  "Choate"            12.3207       -0.213828   10.4025     6.66131
  "Deerfield"          9.90537       1.35515     6.90741    7.41377
  "Phillips Andover"  14.9516        6.98269    -4.96414   -9.3226
  "Phillips Exeter"   11.0115        3.71681     3.13584    2.69192
  "Hotchkiss"          5.5796   …    5.32446    -2.2243    -0.502331
  "Lawrenceville"     16.9018        6.96589    -2.83504   -4.25487
  "St. Paul's"        13.1981        4.9302      5.39106    7.56657
  "Mt. Hermon"        15.0614        3.0586      6.38124    9.98762

We can use this, for example, to broadcast functions across multiple arrays, automatically matching up shared dimensions, using DimensionalData.broadcast_dims.

theta_school_diff = broadcast_dims(-, post.theta, theta_bis)
╭─────────────────────────────────────╮
│ 8×500×4×8 DimArray{Float64,4} theta │
├─────────────────────────────────────┴────────────────────────────────── dims ┐
  ↓ school     Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw       Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain      Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points,
  ⬔ school_bis Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
└──────────────────────────────────────────────────────────────────────────────┘
[:, :, 1, 1]
                       0         …    497        498         499
  "Choate"             0.0            0.0        0.0        0.0
  "Deerfield"         -2.41532        1.56898   -3.49509    0.752459
  "Phillips Andover"   2.63093        7.19652  -15.3666   -15.9839
  "Phillips Exeter"   -1.3092         3.93064   -7.26666   -3.96939
  "Hotchkiss"         -6.74108   …    5.53829  -12.6268    -7.16364
  "Lawrenceville"      4.58111        7.17972  -13.2375   -10.9162
  "St. Paul's"         0.877374       5.14403   -5.01144    0.905263
  "Mt. Hermon"         2.74068        3.27243   -4.02126    3.32631

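To see why the rename matters, here is a small sketch of the same rename-then-broadcast pattern on a toy vector. The array theta and its values are hypothetical; set and broadcast_dims are the DimensionalData functions used above.

```julia
using DimensionalData

# Toy variable with one value per school.
theta = DimArray([1.0, 4.0, 9.0], (Dim{:school}(["A", "B", "C"]),))

# Rename the dimension so the two arrays no longer share it.
theta_bis = set(theta; school=:school_bis)

# With distinct dimension names, broadcasting produces every pairwise difference:
# entry (school, school_bis) holds theta[school] - theta[school_bis].
theta_diff = broadcast_dims(-, theta, theta_bis)

size(theta_diff)                                  # (3, 3)
theta_diff[school=At("B"), school_bis=At("A")]    # 3.0
```

Without the rename, both arguments would share the school dimension and the subtraction would be elementwise, giving zeros.
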
Compute and store posterior pushforward quantities

We use “posterior pushforward quantities” to refer to quantities that are not variables in the posterior but deterministic computations using posterior variables.

You can compute these pushforward operations and store them as a new variable in a copy of the posterior group.

Here we'll create a new InferenceData with theta_school_diff in the posterior:

idata_new = Base.setindex(idata, merge(post, (; theta_school_diff)), :posterior)
InferenceData
posterior
╭───────────────────╮
│ 500×4×8×8 Dataset │
├───────────────────┴──────────────────────────────────────────────────── dims ┐
  ↓ draw       Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain      Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points,
  ↗ school     Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  ⬔ school_bis Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :mu                eltype: Float64 dims: draw, chain size: 500×4
  :theta             eltype: Float64 dims: school, draw, chain size: 8×500×4
  :tau               eltype: Float64 dims: draw, chain size: 500×4
  :theta_school_diff eltype: Float64 dims: school, draw, chain, school_bis size: 8×500×4×8
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at" => "2022-10-13T14:37:37.315398"
  "inference_library_version" => "4.2.2"
  "sampling_time" => 7.48011
  "tuning_steps" => 1000
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
posterior_predictive
╭─────────────────╮
│ 8×500×4 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:41.460544"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
log_likelihood
╭─────────────────╮
│ 8×500×4 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:37.487399"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
sample_stats
╭───────────────╮
│ 500×4 Dataset │
├───────────────┴─────────────────────────────────────────────────────── dims ┐
  ↓ draw  Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points
├─────────────────────────────────────────────────────────────────────────────┴ layers ┐
  :max_energy_error    eltype: Float64 dims: draw, chain size: 500×4
  :energy_error        eltype: Float64 dims: draw, chain size: 500×4
  :lp                  eltype: Float64 dims: draw, chain size: 500×4
  :index_in_trajectory eltype: Int64 dims: draw, chain size: 500×4
  :acceptance_rate     eltype: Float64 dims: draw, chain size: 500×4
  :diverging           eltype: Bool dims: draw, chain size: 500×4
  :process_time_diff   eltype: Float64 dims: draw, chain size: 500×4
  :n_steps             eltype: Float64 dims: draw, chain size: 500×4
  :perf_counter_start  eltype: Float64 dims: draw, chain size: 500×4
  :largest_eigval      eltype: Union{Missing, Float64} dims: draw, chain size: 500×4
  :smallest_eigval     eltype: Union{Missing, Float64} dims: draw, chain size: 500×4
  :step_size_bar       eltype: Float64 dims: draw, chain size: 500×4
  :step_size           eltype: Float64 dims: draw, chain size: 500×4
  :energy              eltype: Float64 dims: draw, chain size: 500×4
  :tree_depth          eltype: Int64 dims: draw, chain size: 500×4
  :perf_counter_diff   eltype: Float64 dims: draw, chain size: 500×4
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at" => "2022-10-13T14:37:37.324929"
  "inference_library_version" => "4.2.2"
  "sampling_time" => 7.48011
  "tuning_steps" => 1000
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
prior
╭─────────────────╮
│ 500×1×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :tau   eltype: Float64 dims: draw, chain size: 500×1
  :theta eltype: Float64 dims: school, draw, chain size: 8×500×1
  :mu    eltype: Float64 dims: draw, chain size: 500×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.602116"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
prior_predictive
╭─────────────────╮
│ 8×500×1 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain  Sampled{Int64} [0] ForwardOrdered Irregular Points
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school, draw, chain size: 8×500×1
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.604969"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
observed_data
╭───────────────────╮
│ 8-element Dataset │
├───────────────────┴──────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :obs eltype: Float64 dims: school size: 8
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.606375"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"
constant_data
╭───────────────────╮
│ 8-element Dataset │
├───────────────────┴──────────────────────────────────────────────────── dims ┐
  ↓ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :scores eltype: Float64 dims: school size: 8
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 4 entries:
  "created_at" => "2022-10-13T14:37:26.607471"
  "inference_library_version" => "4.2.2"
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"

Once you have these pushforward quantities in an InferenceData, you’ll then be able to plot them with ArviZ functions, calculate stats and diagnostics on them, or save and share the InferenceData object with the pushforward quantities included.
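
Both merge and Base.setindex return new objects rather than modifying their arguments. A minimal sketch of that behavior, using nested NamedTuples as hypothetical stand-ins for the InferenceData and its posterior group (the names groups, posterior_new, and mu_plus_tau are illustrative only):

```julia
# Hypothetical stand-ins: nested NamedTuples instead of InferenceData and Dataset.
groups = (posterior=(mu=1.0, tau=2.0), prior=(mu=0.0,))

# Add a derived quantity to a copy of the posterior group.
derived = groups.posterior.mu + groups.posterior.tau
posterior_new = merge(groups.posterior, (; mu_plus_tau=derived))

# Build a new container with that group replaced.
groups_new = Base.setindex(groups, posterior_new, :posterior)

keys(groups.posterior)      # (:mu, :tau), the original is untouched
keys(groups_new.posterior)  # (:mu, :tau, :mu_plus_tau)
```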

Here we compute the mcse of theta_school_diff:

mcse(idata_new.posterior).theta_school_diff
╭───────────────────────────────────────────╮
│ 8×8 DimArray{Float64,2} theta_school_diff │
├───────────────────────────────────────────┴──────────────────────────── dims ┐
  ↓ school     Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered,
  → school_bis Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
└──────────────────────────────────────────────────────────────────────────────┘
                       "Choate"    …  "St. Paul's"     "Mt. Hermon"
  "Choate"            NaN               0.117476         0.219695
  "Deerfield"           0.191463        0.16484          0.189386
  "Phillips Andover"    0.255636        0.258001         0.160477
  "Phillips Exeter"     0.162782        0.156724         0.144923
  "Hotchkiss"           0.282881   …    0.283969         0.189015
  "Lawrenceville"       0.259065        0.251988         0.178094
  "St. Paul's"          0.117476      NaN                0.222054
  "Mt. Hermon"          0.219695        0.222054       NaN

Advanced subsetting

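To select differences for a subset of schools, index both the school and school_bis dimensions with At. Passing vectors returns every combination of the selected schools (here a 3×3 block), which includes the difference between the Choate and Deerfield schools: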

school_idx = ["Choate", "Hotchkiss", "Mt. Hermon"]
school_bis_idx = ["Deerfield", "Choate", "Lawrenceville"]
theta_school_diff[school=At(school_idx), school_bis=At(school_bis_idx)]
╭─────────────────────────────────────╮
│ 3×500×4×3 DimArray{Float64,4} theta │
├─────────────────────────────────────┴────────────────────────────────── dims ┐
  ↓ school     Categorical{String} ["Choate", "Hotchkiss", "Mt. Hermon"] Unordered,
  → draw       Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  ↗ chain      Sampled{Int64} [0, 1, 2, 3] ForwardOrdered Irregular Points,
  ⬔ school_bis Categorical{String} ["Deerfield", "Choate", "Lawrenceville"] Unordered
└──────────────────────────────────────────────────────────────────────────────┘
[:, :, 1, 1]
               0         1      …     497        498         499
  "Choate"       2.41532   2.1563       -1.56898    3.49509    -0.752459
  "Hotchkiss"   -4.32577  -1.31781       3.96931   -9.13171    -7.9161
  "Mt. Hermon"   5.156    -2.9526        1.70345   -0.526168    2.57385

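When At receives vectors on more than one dimension, the result is the full block of combinations rather than a list of matched pairs. A toy sketch, with a made-up 3×3 array d whose entries are easy to check:

```julia
using DimensionalData

schools = ["A", "B", "C"]
# Entry (i, j) of the 3×3 matrix is i + 3 * (j - 1).
d = DimArray(
    collect(reshape(1:9, 3, 3)),
    (Dim{:school}(schools), Dim{:school_bis}(schools)),
)

# Two labels on school and one on school_bis give a 2×1 block.
sub = d[school=At(["A", "C"]), school_bis=At(["B"])]

size(sub)                                    # (2, 1)
sub[school=At("C"), school_bis=At("B")]      # 6
```

To extract a single value instead, pass one label per dimension, as in the last line.
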
Add new chains using cat

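Suppose that, after checking the mcse, you realize you need more samples, so you rerun the model with two chains and obtain an idata_rerun object. Here we mock the rerun by taking chains 0 and 1 of the existing posterior and relabeling them as chains 4 and 5.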

idata_rerun = InferenceData(; posterior=set(post[chain=At([0, 1])]; chain=[4, 5]))
InferenceData
posterior
╭─────────────────╮
│ 500×2×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [4, 5] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :mu    eltype: Float64 dims: draw, chain size: 500×2
  :theta eltype: Float64 dims: school, draw, chain size: 8×500×2
  :tau   eltype: Float64 dims: draw, chain size: 500×2
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at" => "2022-10-13T14:37:37.315398"
  "inference_library_version" => "4.2.2"
  "sampling_time" => 7.48011
  "tuning_steps" => 1000
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"

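The mock rerun above relabels the chains so that their labels stay unique once they are joined to the original ones. A toy sketch of that relabel-then-concatenate pattern on plain DimArrays follows; the arrays a and b are made up, and the sketch assumes that set and cat accept the same keyword and symbol forms for a DimArray as they do for the groups in this document.

```julia
using DimensionalData

# Toy variable with 3 draws and 2 chains labelled 0 and 1.
a = DimArray(ones(3, 2), (Dim{:draw}(collect(0:2)), Dim{:chain}([0, 1])))

# Mock "new" chains: reuse the same values but relabel the chains as 2 and 3.
b = set(a; chain=[2, 3])

# Concatenate along the chain dimension.
combined = cat(a, b; dims=:chain)

size(combined)                      # (3, 4)
collect(lookup(combined, :chain))   # [0, 1, 2, 3]
```
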
You can combine the two using cat.

cat(idata[[:posterior]], idata_rerun; dims=:chain)
InferenceData
posterior
╭─────────────────╮
│ 500×6×8 Dataset │
├─────────────────┴────────────────────────────────────────────────────── dims ┐
  ↓ draw   Sampled{Int64} [0, 1, …, 498, 499] ForwardOrdered Irregular Points,
  → chain  Sampled{Int64} [0, 1, …, 4, 5] ForwardOrdered Irregular Points,
  ↗ school Categorical{String} [Choate, Deerfield, …, St. Paul's, Mt. Hermon] Unordered
├────────────────────────────────────────────────────────────────────── layers ┤
  :mu    eltype: Float64 dims: draw, chain size: 500×6
  :theta eltype: Float64 dims: school, draw, chain size: 8×500×6
  :tau   eltype: Float64 dims: draw, chain size: 500×6
├──────────────────────────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 6 entries:
  "created_at" => "2022-10-13T14:37:37.315398"
  "inference_library_version" => "4.2.2"
  "sampling_time" => 7.48011
  "tuning_steps" => 1000
  "arviz_version" => "0.13.0.dev0"
  "inference_library" => "pymc"