Exploratory Analysis of Bayesian Models

Published

December 22, 2024

While conceptually simple, Bayesian methods can be mathematically and numerically challenging. Probabilistic programming languages (PPLs) provide functions for building Bayesian models easily, together with efficient automatic inference methods. This separates model building from inference, allowing practitioners to focus on their specific problems while the PPL handles the computational details (Bessiere et al. 2013; Daniel Roy 2015; Ghahramani 2015). The inference process generates a posterior distribution, which has a central role in Bayesian statistics, together with other distributions such as the posterior predictive distribution and the prior predictive distribution. Correctly visualizing, analyzing, and interpreting these distributions is key to properly answering the questions that motivated the inference process.
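To make the three distributions mentioned above concrete, here is a minimal sketch in Python using only NumPy. It is not a PPL and does not use automatic inference: it assumes a Beta-Binomial model, chosen because its posterior is available in closed form, and the data (7 successes in 20 trials) and the flat Beta(1, 1) prior are hypothetical values picked purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data and prior (assumptions for illustration only)
n_trials, n_success = 20, 7
a_prior, b_prior = 1.0, 1.0  # flat Beta(1, 1) prior on the success probability
n_draws = 4000

# Prior predictive: draw the parameter from the prior, then simulate data
theta_prior = rng.beta(a_prior, b_prior, size=n_draws)
prior_pred = rng.binomial(n_trials, theta_prior)

# Posterior: for this conjugate model the update has a closed form,
# Beta(a + successes, b + failures); a PPL would approximate it numerically
a_post = a_prior + n_success
b_post = b_prior + (n_trials - n_success)
theta_post = rng.beta(a_post, b_post, size=n_draws)

# Posterior predictive: draw the parameter from the posterior, then simulate data
post_pred = rng.binomial(n_trials, theta_post)

print("posterior mean of theta:", theta_post.mean())
print("prior predictive mean count:", prior_pred.mean())
print("posterior predictive mean count:", post_pred.mean())
```

In a real analysis the posterior draws would typically come from a PPL's sampler rather than from a closed-form update, but the resulting arrays of draws play the same role: they are the objects that get summarized, visualized, and checked.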

When working with Bayesian models, a series of related tasks needs to be addressed besides inference itself:

  • Diagnosis of the quality of the inference (which is generally performed using numerical approximation methods)
  • Model criticism, including evaluations of both model assumptions and model predictions
  • Comparison of models, including model selection or model averaging
  • Preparation of the results for a particular audience

We collectively call all these tasks Exploratory Analysis of Bayesian Models, building on concepts from Exploratory Data Analysis to examine and gain deeper insights into Bayesian models.

In the words of Persi Diaconis (Diaconis 2011):

“Exploratory data analysis seeks to reveal structure, or simple descriptions in data. We look at numbers or graphs and try to find patterns. We pursue leads suggested by background information, imagination, patterns perceived, and experience with other data analyses”.

In this book we discuss how to use both numerical and visual summaries to successfully perform the many tasks that are central to the iterative and interactive modeling process. To do so, we first discuss some general principles of data visualization and uncertainty representation that are not exclusive to Bayesian statistics.