
# Gaussian Mixture Models

Gaussian Mixture Models (GMMs) are probabilistic models that assume all data points are generated from a mixture of ``K`` Gaussian distributions with unknown parameters. A GMM is characterized by its probability density function (PDF), which is expressed as a weighted sum of ``K`` component Gaussian densities:

```math
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
```

where:

- ``x \in \mathbb{R}^d`` is a ``d``-dimensional observation vector

- ``\pi_k`` are the mixing coefficients (weights), satisfying ``\pi_k \geq 0`` and ``\sum_{k=1}^{K} \pi_k = 1``

- ``\mathcal{N}(x \mid \mu_k, \Sigma_k)`` is the PDF of the ``k``-th multivariate Gaussian component with mean vector ``\mu_k \in \mathbb{R}^d`` and covariance matrix ``\Sigma_k \in \mathbb{R}^{d \times d}``

In practice, GMMs are widely applied for multivariate density estimation, clustering, and dimensionality reduction [11]. For univariate density estimation, we refer to Kernel Density Estimation.
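
To make the density above concrete, here is a minimal sketch (not part of the package API) that builds a two-component bivariate mixture directly with Distributions.jl and evaluates its PDF; the component parameters are arbitrary illustrative values:

```julia
using Distributions

# Two bivariate Gaussian components N(x | μ_k, Σ_k)
components = [
    MvNormal([0.0, 0.0], [1.0 0.0; 0.0 1.0]),
    MvNormal([3.0, 3.0], [2.0 0.5; 0.5 1.0]),
]
# Mixing coefficients π_k (non-negative, summing to one)
weights = [0.3, 0.7]

mix = MixtureModel(components, weights)

# p(x) = Σ_k π_k N(x | μ_k, Σ_k), evaluated at a single point
pdf(mix, [1.0, 1.0])
```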

## Expectation-Maximization Algorithm for GMMs

One way to find the parameters of a GMM from a set of samples is to use the Expectation-Maximization (EM) algorithm. Here, we show the basic steps to fit a GMM to data using the EM algorithm, based on Ref. [11]. The EM algorithm iteratively refines the parameters of the GMM by alternating between two steps:

  1. Expectation Step (E-step): Calculate the expected value of the latent variables given the current parameters.

  2. Maximization Step (M-step): Update the parameters to maximize the expected log-likelihood found in the E-step.

Given a dataset ``\mathcal{D} = \{x_1, x_2, \ldots, x_N\}`` of ``N`` independent observations, the goal is to estimate the parameters ``\theta = \{\pi_1, \ldots, \pi_K, \mu_1, \ldots, \mu_K, \Sigma_1, \ldots, \Sigma_K\}`` that maximize the log-likelihood:

```math
\ell(\theta) = \sum_{n=1}^{N} \log p(x_n \mid \theta) = \sum_{n=1}^{N} \log \left( \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right).
```
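
For illustration only (this is not the package's internal implementation), the log-likelihood can be evaluated directly from this formula. Here `X` is assumed to be an ``N \times d`` matrix of observations, and `π`, `μ`, `Σ` are vectors of the mixing coefficients, component means, and covariance matrices:

```julia
using Distributions

# ℓ(θ) = Σ_n log( Σ_k π_k N(x_n | μ_k, Σ_k) )
function log_likelihood(X, π, μ, Σ)
    K = length(π)
    # per-observation mixture density p(x_n); pdf accepts a d×N matrix of points
    p = sum(π[k] .* pdf(MvNormal(μ[k], Σ[k]), X') for k in 1:K)
    return sum(log.(p))
end
```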

The EM algorithm introduces latent variables ``z_{nk} \in \{0, 1\}`` indicating whether observation ``n`` belongs to component ``k``, where ``\sum_{k=1}^{K} z_{nk} = 1`` for each ``n``. The algorithm iteratively maximizes the expected log-likelihood by alternating between two steps:

### Expectation Step

Compute the posterior probabilities (responsibilities) ``\gamma_{nk}`` for each observation-component pair:

```math
\gamma_{nk}^{(t+1)} = \frac{\pi_k^{(t)} \, \mathcal{N}(x_n \mid \mu_k^{(t)}, \Sigma_k^{(t)})}{\sum_{j=1}^{K} \pi_j^{(t)} \, \mathcal{N}(x_n \mid \mu_j^{(t)}, \Sigma_j^{(t)})}.
```
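
A plain-Julia sketch of this step, using the same assumed `X`, `π`, `μ`, `Σ` layout as above (again only for illustration), could look like:

```julia
using Distributions

# E-step: γ[n, k] = π_k N(x_n | μ_k, Σ_k) / Σ_j π_j N(x_n | μ_j, Σ_j)
function e_step(X, π, μ, Σ)
    N, K = size(X, 1), length(π)
    γ = zeros(N, K)
    for k in 1:K
        γ[:, k] = π[k] .* pdf(MvNormal(μ[k], Σ[k]), X')
    end
    return γ ./ sum(γ; dims=2)   # normalize over components for each observation
end
```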

### Maximization Step

Update the parameters using the computed responsibilities, where ``N_k = \sum_{n=1}^{N} \gamma_{nk}^{(t+1)}``:

```math
\pi_k^{(t+1)} = \frac{N_k}{N}, \qquad
\mu_k^{(t+1)} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_{nk}^{(t+1)} x_n, \qquad
\Sigma_k^{(t+1)} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_{nk}^{(t+1)} \left( x_n - \mu_k^{(t+1)} \right) \left( x_n - \mu_k^{(t+1)} \right)^\mathsf{T}.
```
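
The corresponding parameter updates translate into the following sketch (again illustrative only; `GaussianMixtureModel` performs these updates internally):

```julia
# M-step: update weights, means and covariances from the responsibilities γ (N×K)
function m_step(X, γ)
    N, K = size(γ)
    Nk = vec(sum(γ; dims=1))              # effective number of points per component
    π = Nk ./ N                           # mixing coefficients
    μ = [vec(sum(γ[:, k] .* X; dims=1)) ./ Nk[k] for k in 1:K]
    Σ = [begin
             Xc = X .- μ[k]'              # observations centered at the new mean
             (Xc' * (γ[:, k] .* Xc)) ./ Nk[k]
         end for k in 1:K]
    return π, μ, Σ
end
```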

### Algorithm Convergence

The algorithm terminates when the change in log-likelihood between iterations falls below a predefined threshold ``\epsilon``:

```math
\left| \ell(\theta^{(t+1)}) - \ell(\theta^{(t)}) \right| < \epsilon.
```
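
Combining the sketches above, a bare-bones EM loop with this stopping criterion might look as follows, using the illustrative `log_likelihood`, `e_step`, and `m_step` helpers defined earlier; in practice, `GaussianMixtureModel` handles all of this:

```julia
function fit_gmm_em(X, π, μ, Σ; maxiter=100, tol=1e-4)
    ll_old = log_likelihood(X, π, μ, Σ)
    for _ in 1:maxiter
        γ = e_step(X, π, μ, Σ)              # E-step: responsibilities
        π, μ, Σ = m_step(X, γ)              # M-step: parameter updates
        ll_new = log_likelihood(X, π, μ, Σ)
        abs(ll_new - ll_old) < tol && break # |ℓ(θ⁽ᵗ⁺¹⁾) − ℓ(θ⁽ᵗ⁾)| < ϵ
        ll_old = ll_new
    end
    return π, μ, Σ
end
```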

## Implementation

In UncertaintyQuantification.jl, a GMM can be fitted to data using the `GaussianMixtureModel` function, which implements the EM algorithm described above. The function takes a `DataFrame` containing the samples and the number of components `k` as input. Optionally, one can set the maximum number of iterations and the tolerance. The GMM is constructed as:

```julia
using UncertaintyQuantification, DataFrames

# Generate sample data with two clusters
df = DataFrame(x1=randn(100), x2=2*randn(100))
k = 2
gmm = GaussianMixtureModel(df, k) # maximum_iterations = 100, tolerance=1e-4
```

This returns a `MultivariateDistribution` object. The fitted mixture model, constructed using the EM algorithm, is stored as a `Distributions.MixtureModel` from Distributions.jl in the field `gmm.d`:

```julia
gmm.d
MixtureModel{FullNormal}(K = 2)
components[1] (prior = 0.2410): FullNormal(
dim: 2
μ: [-0.33478082938636566, 1.8957939141962368]
Σ: [0.8357434100638619 0.04498446782817538; 0.04498446782817538 4.092943397667619]
)

components[2] (prior = 0.7590): FullNormal(
dim: 2
μ: [0.2521899250337857, -0.4880932059530853]
Σ: [1.2425497523016813 0.7430781222970831; 0.7430781222970831 3.34328079023627]
)
```

Since the GMM is returned as a `MultivariateDistribution`, we can perform sampling and evaluation of the PDF in the same way as for other (multivariate) random variables. For a more detailed explanation, we refer to the Gaussian Mixture Model Example.
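
For example, since `gmm.d` is an ordinary `Distributions.MixtureModel`, it can also be sampled and evaluated directly through Distributions.jl:

```julia
using Distributions

# Draw 5 samples from the fitted mixture (one column per sample)
rand(gmm.d, 5)

# Evaluate the mixture PDF at a single point
pdf(gmm.d, [0.0, 0.0])
```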

### Alternative mixture model construction

(Gaussian) Mixture models constructed with other packages can also be used to construct a `MultivariateDistribution`, as long as they return a `Distributions.MixtureModel`.