Data

ArviZ.from_mcmcchains
ArviZ.from_samplechains
ArviZExampleData.describe_example_data
ArviZExampleData.load_example_data
InferenceObjectsNetCDF.from_netcdf
InferenceObjectsNetCDF.to_netcdf

Inference library converters

ArviZ.from_mcmcchains — Function

from_mcmcchains(posterior::MCMCChains.Chains; kwargs...) -> InferenceData
from_mcmcchains(; kwargs...) -> InferenceData
from_mcmcchains(
    posterior::MCMCChains.Chains,
    posterior_predictive,
    predictions,
    log_likelihood;
    kwargs...
) -> InferenceData

Convert data in an MCMCChains.Chains format into an InferenceData.

Any keyword argument below without an an explicitly annotated type above is allowed, so long as it can be passed to convert_to_inference_data.

Arguments

posterior::MCMCChains.Chains: Draws from the posterior

Keywords

posterior_predictive::Any=nothing: Draws from the posterior predictive distribution or name(s) of predictive variables in posterior
predictions: Out-of-sample predictions for the posterior.
prior: Draws from the prior
prior_predictive: Draws from the prior predictive distribution or name(s) of predictive variables in prior
observed_data: Observed data on which the posterior is conditional. It should only contain data which is modeled as a random variable. Keys are parameter names and values.
constant_data: Model constants, data included in the model that are not modeled as random variables. Keys are parameter names.
predictions_constant_data: Constants relevant to the model predictions (i.e. new x values in a linear regression).
log_likelihood: Pointwise log-likelihood for the data. It is recommended to use this argument as a named tuple whose keys are observed variable names and whose values are log likelihood arrays. Alternatively, provide the name of variable in posterior containing log likelihoods.
library=MCMCChains: Name of library that generated the chains
coords: Map from named dimension to named indices
dims: Map from variable name to names of its dimensions
eltypes: Map from variable names to eltypes. This is primarily used to assign discrete eltypes to discrete variables that were stored in Chains as floats.

Returns

InferenceData: The data with groups corresponding to the provided data

source

ArviZ.from_samplechains — Function

from_samplechains(
    posterior=nothing;
    prior=nothing,
    library=SampleChains,
    kwargs...,
) -> InferenceData

Convert SampleChains samples to an InferenceData.

Either posterior or prior may be a SampleChains.AbstractChain or SampleChains.MultiChain object.

For descriptions of remaining kwargs, see from_namedtuple.

source

IO / Conversion

InferenceObjectsNetCDF.from_netcdf — Function

from_netcdf(path::AbstractString; kwargs...) -> InferenceData

Load an InferenceData from an unopened NetCDF file.

Remaining kwargs are passed to NCDatasets.NCDataset. This method loads data eagerly. To instead load data lazily, pass an opened NCDataset to from_netcdf.

Examples

julia> idata = from_netcdf("centered_eight.nc")
InferenceData with groups:
  > posterior
  > posterior_predictive
  > sample_stats
  > prior
  > observed_data

from_netcdf(ds::NCDatasets.NCDataset; load_mode) -> InferenceData

Load an InferenceData from an opened NetCDF file.

load_mode defaults to :lazy, which avoids reading variables into memory. Operations on these arrays will be slow. load_mode can also be :eager, which copies all variables into memory. It is then safe to close ds. If load_mode is :lazy and ds is closed after constructing InferenceData, using the variable arrays will have undefined behavior.

Examples

Here is how we might load an InferenceData from an InferenceData lazily from a web-hosted NetCDF file.

julia> using HTTP, NCDatasets

julia> resp = HTTP.get("https://github.com/arviz-devs/arviz_example_data/blob/main/data/centered_eight.nc?raw=true");

julia> ds = NCDataset("centered_eight", "r"; memory = resp.body);

julia> idata = from_netcdf(ds)
InferenceData with groups:
  > posterior
  > posterior_predictive
  > sample_stats
  > prior
  > observed_data

julia> idata_copy = copy(idata); # disconnect from the loaded dataset

julia> close(ds);

InferenceObjectsNetCDF.to_netcdf — Function

to_netcdf(data, dest::AbstractString; group::Symbol=:posterior, kwargs...)
to_netcdf(data, dest::NCDatasets.NCDataset; group::Symbol=:posterior)

Write data to a NetCDF file.

data is any type that can be converted to an InferenceData using convert_to_inference_data. If not an InferenceData, then group specifies which group the data represents.

dest specifies either the path to the NetCDF file or an opened NetCDF file. If dest is a path, remaining kwargs are passed to NCDatasets.NCDataset.

Examples

julia> using NCDatasets

julia> idata = from_namedtuple((; x = randn(4, 100, 3), z = randn(4, 100)))
InferenceData with groups:
  > posterior

julia> to_netcdf(idata, "data.nc")
"data.nc"

Example data

ArviZExampleData.describe_example_data — Function

describe_example_data() -> String

Return a string containing descriptions of all available datasets.

Examples

julia> describe_example_data("radon") |> println
radon
=====

Radon is a radioactive gas that enters homes through contact points with the ground. It is a carcinogen that is the primary cause of lung cancer in non-smokers. Radon levels vary greatly from household to household.

This example uses an EPA study of radon levels in houses in Minnesota to construct a model with a hierarchy over households within a county. The model includes estimates (gamma) for contextual effects of the uranium per household.

See Gelman and Hill (2006) for details on the example, or https://docs.pymc.io/notebooks/multilevel_modeling.html by Chris Fonnesbeck for details on this implementation.

remote: http://ndownloader.figshare.com/files/24067472

ArviZExampleData.load_example_data — Function

load_example_data(name; kwargs...) -> InferenceObjects.InferenceData
load_example_data() -> Dict{String,AbstractFileMetadata}

Load a local or remote pre-made dataset.

kwargs are forwarded to InferenceObjects.from_netcdf.

Pass no parameters to get a Dict listing all available datasets.

Data files are handled by DataDeps.jl. A file is downloaded only when it is requested and then cached for future use.

Examples

julia> keys(load_example_data())
KeySet for a OrderedCollections.OrderedDict{String, ArviZExampleData.AbstractFileMetadata} with 9 entries. Keys:
  "centered_eight"
  "non_centered_eight"
  "radon"
  "rugby"
  "regression1d"
  "regression10d"
  "classification1d"
  "classification10d"
  "glycan_torsion_angles"

julia> load_example_data("centered_eight")
InferenceData with groups:
  > posterior
  > posterior_predictive
  > log_likelihood
  > sample_stats
  > prior
  > prior_predictive
  > observed_data
  > constant_data