Dataset

Type definition

InferenceObjects.DatasetType
Dataset{L} <: DimensionalData.AbstractDimStack{L}

Container of dimensional arrays sharing some dimensions.

This type is an DimensionalData.AbstractDimStack that implements the same interface as DimensionalData.DimStack and has identical usage.

When a Dataset is passed to Python, it is converted to an xarray.Dataset without copying the data. That is, the Python object shares the same memory as the Julia object. However, if an xarray.Dataset is passed to Julia, its data must be copied.

Constructors

Dataset(data::DimensionalData.AbstractDimArray...)
Dataset(data::Tuple{Vararg{<:DimensionalData.AbstractDimArray}})
Dataset(data::NamedTuple{Keys,Vararg{<:DimensionalData.AbstractDimArray}})
Dataset(
    data::NamedTuple,
    dims::Tuple{Vararg{DimensionalData.Dimension}};
    metadata=DimensionalData.NoMetadata(),
)

In most cases, use convert_to_dataset to create a Dataset instead of directly using a constructor.

source

General conversion

InferenceObjects.namedtuple_to_datasetFunction
namedtuple_to_dataset(data; kwargs...) -> Dataset

Convert NamedTuple mapping variable names to arrays to a Dataset.

Any non-array values will be converted to a 0-dimensional array.

Keywords

  • attrs::AbstractDict{<:AbstractString}: a collection of metadata to attach to the dataset, in addition to defaults. Values should be JSON serializable.
  • library::Union{String,Module}: library used for performing inference. Will be attached to the attrs metadata.
  • dims: a collection mapping variable names to collections of objects containing dimension names. Acceptable such objects are:
    • Symbol: dimension name
    • Type{<:DimensionsionalData.Dimension}: dimension type
    • DimensionsionalData.Dimension: dimension, potentially with indices
    • Nothing: no dimension name provided, dimension name is automatically generated
  • coords: a collection indexable by dimension name specifying the indices of the given dimension. If indices for a dimension in dims are provided, they are used even if the dimension contains its own indices. If a dimension is missing, its indices are automatically generated.
source

DimensionalData

As a DimensionalData.AbstractDimStack, Dataset also implements the AbstractDimStack API and can be used like a DimStack. See DimensionalData's documentation for example usage.

Tables inteface

Dataset implements the Tables interface. This allows Datasets to be used as sources for any function that can accept a table. For example, it's straightforward to:

  • write to CSV with CSV.jl
  • flatten to a DataFrame with DataFrames.jl
  • plot with StatsPlots.jl
  • plot with AlgebraOfGraphics.jl