This GitHub repository contains the implementation of Coordinate Ascent Variational Inference (CAVI) for the Gaussian estimation problem described in [1] and [2], which is briefly summarized below. The implementation was used to generate the results presented in [1] and is based on [3].
- Coordinate Ascent Variational Inference Algorithm: The implementation focuses on the CAVI algorithm, a variational inference method known for its efficiency in approximating posterior distributions.
- Dirichlet Process Mixtures of Gaussians: The code supports the modeling of complex data structures through the use of Dirichlet Process Mixtures, allowing for automatic determination of the number of clusters in the observations. The mixture distribution is assumed to be Gaussian.
- Scalable and Extendable: The code is designed to handle large datasets efficiently. Customization and experimentation with different priors, likelihoods, and hyperparameters are possible through modification of the corresponding equations in the `vi` module.
We consider a Gaussian model for objects that are indexed by $n = 1, \ldots, N$. Each object has a feature vector $\theta_n$ and is observed through the noisy measurement $x_n = \theta_n + u_n$, where the noise vectors $u_n \sim \mathcal{N}(0, \Sigma_u)$ are i.i.d. Using the assumption that $u_n$ is independent of $\theta_n$, the measurement model becomes

$$f(x_n \mid \theta_n) = \mathcal{N}(x_n; \theta_n, \Sigma_u),$$

which is the conditional pdf of $x_n$ given $\theta_n$. The local parameters $\theta_n$ are drawn i.i.d. from a random distribution $G$, which is itself distributed according to a Dirichlet process (DP) with base distribution $G_0 = \mathcal{N}(\mu_{\theta^*}, \Sigma_{\theta^*})$ and concentration parameter $\alpha$. The so-called global parameters $\theta_t^*$ are the atoms of $G$; they are drawn i.i.d. from $G_0$. Moreover, $G$ is discrete with probability one, so several objects can share the same feature value, which induces a clustering of the objects.

The above model assumptions yield a Dirichlet process mixture (DPM) distribution for the object features $\theta_n$ and, in turn, for the measurements $x_n$. The mixture weights $\pi_t$ are given by the stick-breaking representation of the DP, i.e., $\pi_t = v_t \prod_{j<t} (1 - v_j)$ with $v_t \sim \mathrm{Beta}(1, \alpha)$. The goal is to estimate the cluster assignments $z_n$ and the object features $\theta_n$ from the measurements $x_1, \ldots, x_N$.
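For intuition, here is a minimal sketch of how synthetic data could be drawn from this model via the (truncated) stick-breaking representation; all names and values are illustrative and not part of the repo's API:

```python
import numpy as np

rng = np.random.default_rng(0)

N, T, D = 200, 20, 2            # objects, truncation level, feature dimension
alpha = 1.0                     # DP concentration parameter
mu_theta = np.zeros(D)          # mean of the base distribution G_0
Sigma_theta = 25.0 * np.eye(D)  # variance of the base distribution G_0
Sigma_u = 0.5 * np.eye(D)       # measurement noise variance

# Stick-breaking weights: pi_t = v_t * prod_{j<t} (1 - v_j), stick closed at T
v = rng.beta(1.0, alpha, size=T)
v[-1] = 1.0
pi = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))

theta_star = rng.multivariate_normal(mu_theta, Sigma_theta, size=T)  # atoms of G
z = rng.choice(T, size=N, p=pi)                                      # assignments
theta = theta_star[z]                                                # object features
x = theta + rng.multivariate_normal(np.zeros(D), Sigma_u, size=N)    # measurements
```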
The implemented CAVI algorithm uses a mean field variational family to approximate the true posterior of the DPM.
This approximate posterior involves a truncated stick-breaking representation of the above model (truncated at level $T$):

$$q(v, \theta^*, z) = \prod_{t=1}^{T-1} q_{\gamma_t}(v_t) \, \prod_{t=1}^{T} q_{\tau_t}(\theta_t^*) \, \prod_{n=1}^{N} q_{\phi_n}(z_n),$$

where $q_{\gamma_t}(v_t)$ is a Beta distribution, $q_{\tau_t}(\theta_t^*)$ is a Gaussian distribution in exponential family form, and $q_{\phi_n}(z_n)$ is a Categorical distribution. Here, $v = (v_1, \ldots, v_{T-1})$ are the stick-breaking proportions, $\theta^* = (\theta_1^*, \ldots, \theta_T^*)$ are the cluster means, and $z = (z_1, \ldots, z_N)$ are the cluster assignments. The variational parameters are given by

$$\nu = (\gamma_1, \ldots, \gamma_{T-1}, \tau_1, \ldots, \tau_T, \phi_1, \ldots, \phi_N),$$

with $\phi_n = (\phi_{n1}, \ldots, \phi_{nT})$ denoting the assignment probabilities of object $n$.
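For orientation, the model-independent part of the coordinate updates derived in [3] reads as follows in the notation above (the updates for the Gaussian parameters $\tau_t$ depend on the chosen likelihood and are the ones to modify in the `vi` module):

$$\gamma_{t,1} = 1 + \sum_{n=1}^{N} \phi_{nt}, \qquad \gamma_{t,2} = \alpha + \sum_{n=1}^{N} \sum_{j=t+1}^{T} \phi_{nj},$$

$$\phi_{nt} \propto \exp\!\Big(\mathbb{E}_q[\log V_t] + \sum_{j<t} \mathbb{E}_q[\log(1-V_j)] + \mathbb{E}_q[\log f(x_n \mid \theta_t^*)]\Big).$$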
The CAVI algorithm approximates the true posterior pdf by updating the variational parameters of the variational pdf in an iterative manner. Convergence is declared when the relative change of the evidence lower bound (ELBO) falls below a predefined threshold.
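As a concrete illustration of this iteration, below is a self-contained CAVI sketch for the special case of scalar measurements with known noise variance. It follows the updates of [3] but is not the repo's implementation (see the `vi` module for that); the function name is hypothetical:

```python
import numpy as np
from scipy.special import betaln, digamma, logsumexp

def cavi_dpm_1d(x, alpha, mu0, var0, var_u, T, n_iter=500, tol=1e-8, seed=0):
    """Toy CAVI for a 1-D DPM of Gaussians with known noise variance var_u."""
    N = x.size
    phi = np.random.default_rng(seed).dirichlet(np.ones(T), size=N)
    elbo_old = None
    for _ in range(n_iter):
        Nt = phi.sum(axis=0)                          # expected cluster sizes
        g1 = 1.0 + Nt[:-1]                            # Beta params of q(v_t)
        g2 = alpha + np.cumsum(Nt[::-1])[::-1][1:]    # alpha + sum_{j>t} N_j
        s2 = 1.0 / (1.0 / var0 + Nt / var_u)          # variances of q(theta_t*)
        m = s2 * (mu0 / var0 + phi.T @ x / var_u)     # means of q(theta_t*)
        # Expected log stick-breaking weights (v_T is fixed to 1)
        Elogv, Elog1mv = np.zeros(T), np.zeros(T)
        Elogv[:-1] = digamma(g1) - digamma(g1 + g2)
        Elog1mv[:-1] = digamma(g2) - digamma(g1 + g2)
        Elogpi = Elogv + np.concatenate(([0.0], np.cumsum(Elog1mv[:-1])))
        # Expected Gaussian log-likelihoods and new assignment probabilities
        Ell = -0.5 * np.log(2 * np.pi * var_u) \
              - ((x[:, None] - m) ** 2 + s2) / (2 * var_u)
        logp = Elogpi + Ell
        phi = np.exp(logp - logsumexp(logp, axis=1, keepdims=True))
        # Evidence lower bound
        elbo = (np.sum(np.log(alpha) + (alpha - 1.0) * Elog1mv[:-1])  # E[log p(v)]
                + np.sum(-0.5 * np.log(2 * np.pi * var0)
                         - ((m - mu0) ** 2 + s2) / (2 * var0))        # E[log p(theta*)]
                + np.sum(phi * (Elogpi + Ell))                        # E[log p(z, x | .)]
                + np.sum(betaln(g1, g2) - (g1 - 1) * digamma(g1)
                         - (g2 - 1) * digamma(g2)
                         + (g1 + g2 - 2) * digamma(g1 + g2))          # H[q(v)]
                + np.sum(0.5 * np.log(2 * np.pi * np.e * s2))         # H[q(theta*)]
                - np.sum(phi * np.log(phi + 1e-300)))                 # H[q(z)]
        if elbo_old is not None and abs((elbo - elbo_old) / elbo_old) < tol:
            break
        elbo_old = elbo
    return g1, g2, m, s2, phi
```

For example, `g1, g2, m, s2, phi = cavi_dpm_1d(x, alpha=1.0, mu0=0.0, var0=25.0, var_u=0.5, T=20)`.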
The following parameters have to be chosen for initialization (an illustrative example is given after the list):
- Concentration parameter $\alpha$ of the DP
- Mean $\mu_{\theta^*}$ and variance $\Sigma_{\theta^*}$ of the base distribution $G_0$ of the DP
- Variance $\Sigma_u$ of the measurement noise
- Truncation parameter $T$
- Assignment probabilities $\phi_{nt}$
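A hypothetical initialization for the 2-D case could look as follows; the key names are purely illustrative and do not reflect the repo's actual config format:

```python
import numpy as np

N, T, D = 200, 20, 2  # objects, truncation level, feature dimension
init = {
    "alpha": 1.0,                     # DP concentration parameter
    "mu_theta": np.zeros(D),          # mean of base distribution G_0
    "Sigma_theta": 25.0 * np.eye(D),  # variance of base distribution G_0
    "Sigma_u": 0.5 * np.eye(D),       # measurement noise variance
    "T": T,                           # truncation parameter
    "phi": np.full((N, T), 1.0 / T),  # uniform assignment probabilities
}
```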
The measurements $x_1, \ldots, x_N$ serve as the input data of the algorithm. Given the approximate posterior, the means of the variational Gaussian pdfs $q_{\tau_t}(\theta_t^*)$ serve as estimates of the cluster means, and the assignment probabilities $\phi_{nt}$ yield estimates of the cluster assignments $z_n$.
For details see Chapter 4 and Chapter 5 of [1].
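Continuing the hypothetical 1-D sketch above, point estimates could be read off from the converged variational parameters like this:

```python
# Point estimates from the converged variational parameters (illustrative)
z_hat = np.argmax(phi, axis=1)      # MAP cluster assignments from q(z_n)
theta_hat = m[z_hat]                # estimated object features
n_clusters = np.unique(z_hat).size  # effective number of clusters
```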
To run the code, follow these steps:
- Clone the repository:
  ```
  git clone https://github.com/lipovec-t/vi-gaussian-dpm.git
  ```
- Install dependencies:
  ```
  pip install -r requirements.txt
  ```
- Run one of the simulation scripts `s*_simulate.py` in the IDE of your choice.
Customize the simulation scripts to your specific use case and adapt the config files as needed.
Feel free to use, modify, and extend this implementation for your research or applications. If you encounter any issues or have suggestions, please let us know through the GitHub issues page.
This project is licensed under the MIT License - see the LICENSE file for details.
[1] T. Lipovec, “Variational Inference for Dirichlet Process Mixtures and Application to Gaussian Estimation,” Master’s thesis, TU Wien, 2023.
[2] E. Šauša, “Advanced Bayesian Estimation in Hierarchical Gaussian Models: Dirichlet Process Mixtures and Clustering Gain,” Master’s thesis, TU Wien, 2024.
[3] D. M. Blei and M. I. Jordan, “Variational Inference for Dirichlet Process Mixtures,” Bayesian Analysis, vol. 1, no. 1, pp. 121-143, 2006.
[4] D. M. Blei, A. Kucukelbir, and J. D. McAuliffe, “Variational Inference: A Review for Statisticians,” Journal of the American Statistical Association, vol. 112, no. 518, pp. 859-877, 2017.