
# Gaussian Mixture Models

Gaussian Mixture Models (GMMs) are probabilistic models that assume all data points are generated from a mixture of ``K`` Gaussian distributions with unknown parameters. A GMM is characterized by its probability density function (PDF), which is expressed as a weighted sum of ``K`` component Gaussian densities:

```math
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
```

where:

- ``x \in \mathbb{R}^d`` is a ``d``-dimensional observation vector

- ``\pi_k`` are the mixing coefficients (weights), satisfying ``\pi_k \geq 0`` and ``\sum_{k=1}^{K} \pi_k = 1``

- ``\mathcal{N}(x \mid \mu_k, \Sigma_k)`` is the PDF of the ``k``-th multivariate Gaussian component with mean vector ``\mu_k \in \mathbb{R}^d`` and covariance matrix ``\Sigma_k \in \mathbb{R}^{d \times d}``

In practice, GMMs are widely applied for multivariate density estimation, clustering, and dimensionality reduction [11]. For univariate density estimation, we refer to Kernel Density Estimation.
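
To make the density above concrete, here is a minimal sketch (not part of the package API) that builds a two-component bivariate mixture directly with Distributions.jl and evaluates its PDF; the component parameters are arbitrary illustrative values:

```julia
using Distributions

# Two bivariate Gaussian components N(x | μ_k, Σ_k)
components = [
    MvNormal([0.0, 0.0], [1.0 0.0; 0.0 1.0]),
    MvNormal([3.0, 3.0], [2.0 0.5; 0.5 1.0]),
]
# Mixing coefficients π_k (non-negative, summing to one)
weights = [0.3, 0.7]

mix = MixtureModel(components, weights)

# p(x) = Σ_k π_k N(x | μ_k, Σ_k), evaluated at a single point
pdf(mix, [1.0, 1.0])
```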

## Expectation-Maximization Algorithm for GMMs

One way to find the parameters of a GMM from a set of samples is to use the Expectation-Maximization (EM) algorithm. Here, we show the basic steps to fit a GMM to data using the EM algorithm, based on Ref. [11]. The EM algorithm iteratively refines the parameters of the GMM by alternating between two steps:

  1. Expectation Step (E-step): Calculate the expected value of the latent variables given the current parameters.

  2. Maximization Step (M-step): Update the parameters to maximize the expected log-likelihood found in the E-step.

Given a dataset ``\mathcal{D} = \{x_1, x_2, \ldots, x_N\}`` of ``N`` independent observations, the goal is to estimate the parameters ``\theta = \{\pi_1, \ldots, \pi_K, \mu_1, \ldots, \mu_K, \Sigma_1, \ldots, \Sigma_K\}`` that maximize the log-likelihood:

```math
\ell(\theta) = \sum_{n=1}^{N} \log p(x_n \mid \theta) = \sum_{n=1}^{N} \log \left( \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right).
```
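
For illustration only (this is not the package's internal implementation), the log-likelihood can be evaluated directly from this formula. Here `X` is assumed to be an ``N \times d`` matrix of observations, and `π`, `μ`, `Σ` are vectors of the mixing coefficients, component means, and covariance matrices:

```julia
using Distributions

# ℓ(θ) = Σ_n log( Σ_k π_k N(x_n | μ_k, Σ_k) )
function log_likelihood(X, π, μ, Σ)
    K = length(π)
    # per-observation mixture density p(x_n); pdf accepts a d×N matrix of points
    p = sum(π[k] .* pdf(MvNormal(μ[k], Σ[k]), X') for k in 1:K)
    return sum(log.(p))
end
```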

The EM algorithm introduces latent variables ``z_{nk} \in \{0, 1\}`` indicating whether observation ``n`` belongs to component ``k``, where ``\sum_{k=1}^{K} z_{nk} = 1`` for each ``n``. The algorithm iteratively maximizes the expected log-likelihood by alternating between two steps:

### Expectation Step

Compute the posterior probabilities (responsibilities) ``\gamma_{nk}`` for each observation-component pair:

```math
\gamma_{nk}^{(t+1)} = \frac{\pi_k^{(t)} \, \mathcal{N}(x_n \mid \mu_k^{(t)}, \Sigma_k^{(t)})}{\sum_{j=1}^{K} \pi_j^{(t)} \, \mathcal{N}(x_n \mid \mu_j^{(t)}, \Sigma_j^{(t)})}.
```
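
A plain-Julia sketch of this step, using the same assumed `X`, `π`, `μ`, `Σ` layout as above (again only for illustration), could look like:

```julia
using Distributions

# E-step: γ[n, k] = π_k N(x_n | μ_k, Σ_k) / Σ_j π_j N(x_n | μ_j, Σ_j)
function e_step(X, π, μ, Σ)
    N, K = size(X, 1), length(π)
    γ = zeros(N, K)
    for k in 1:K
        γ[:, k] = π[k] .* pdf(MvNormal(μ[k], Σ[k]), X')
    end
    return γ ./ sum(γ; dims=2)   # normalize over components for each observation
end
```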

### Maximization Step

Update the parameters using the computed responsibilities, where ``N_k = \sum_{n=1}^{N} \gamma_{nk}^{(t+1)}``:

```math
\pi_k^{(t+1)} = \frac{N_k}{N}, \qquad
\mu_k^{(t+1)} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_{nk}^{(t+1)} x_n, \qquad
\Sigma_k^{(t+1)} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_{nk}^{(t+1)} \left( x_n - \mu_k^{(t+1)} \right) \left( x_n - \mu_k^{(t+1)} \right)^\mathsf{T}.
```
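
The corresponding parameter updates translate into the following sketch (again illustrative only; `GaussianMixtureModel` performs these updates internally):

```julia
# M-step: update weights, means and covariances from the responsibilities γ (N×K)
function m_step(X, γ)
    N, K = size(γ)
    Nk = vec(sum(γ; dims=1))              # effective number of points per component
    π = Nk ./ N                           # mixing coefficients
    μ = [vec(sum(γ[:, k] .* X; dims=1)) ./ Nk[k] for k in 1:K]
    Σ = [begin
             Xc = X .- μ[k]'              # observations centered at the new mean
             (Xc' * (γ[:, k] .* Xc)) ./ Nk[k]
         end for k in 1:K]
    return π, μ, Σ
end
```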

### Algorithm Convergence

The algorithm terminates when the change in log-likelihood between iterations falls below a predefined threshold ``\epsilon``:

```math
\left| \ell(\theta^{(t+1)}) - \ell(\theta^{(t)}) \right| < \epsilon.
```
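
Combining the sketches above, a bare-bones EM loop with this stopping criterion might look as follows, using the illustrative `log_likelihood`, `e_step`, and `m_step` helpers defined earlier; in practice, `GaussianMixtureModel` handles all of this:

```julia
function fit_gmm_em(X, π, μ, Σ; maxiter=100, tol=1e-4)
    ll_old = log_likelihood(X, π, μ, Σ)
    for _ in 1:maxiter
        γ = e_step(X, π, μ, Σ)              # E-step: responsibilities
        π, μ, Σ = m_step(X, γ)              # M-step: parameter updates
        ll_new = log_likelihood(X, π, μ, Σ)
        abs(ll_new - ll_old) < tol && break # |ℓ(θ⁽ᵗ⁺¹⁾) − ℓ(θ⁽ᵗ⁾)| < ϵ
        ll_old = ll_new
    end
    return π, μ, Σ
end
```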

## Implementation

In UncertaintyQuantification.jl, a GMM can be fitted to data using the `GaussianMixtureModel` function, which implements the EM algorithm described above. The function takes a `DataFrame` containing the samples and the number of components `k` as input. Optionally, one can set the maximum number of iterations and the tolerance. The GMM is constructed as:

```julia
using UncertaintyQuantification, DataFrames

# Generate sample data with two clusters
df = DataFrame(x1=randn(100), x2=2*randn(100))
k = 2
gmm = GaussianMixtureModel(df, k) # maximum_iterations = 100, tolerance=1e-4
```

This returns a `MultivariateDistribution` object. The fitted mixture model, constructed using the EM algorithm, is stored as a `Distributions.MixtureModel` from Distributions.jl in the field `gmm.d`:

```julia
gmm.d
MixtureModel{FullNormal}(K = 2)
components[1] (prior = 0.2410): FullNormal(
dim: 2
μ: [-0.33478082938636566, 1.8957939141962368]
Σ: [0.8357434100638619 0.04498446782817538; 0.04498446782817538 4.092943397667619]
)

components[2] (prior = 0.7590): FullNormal(
dim: 2
μ: [0.2521899250337857, -0.4880932059530853]
Σ: [1.2425497523016813 0.7430781222970831; 0.7430781222970831 3.34328079023627]
)
```

Since the GMM is returned as a `MultivariateDistribution`, we can perform sampling and evaluation of the PDF in the same way as for other (multivariate) random variables. For a more detailed explanation, we refer to the Gaussian Mixture Model Example.
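
For example, since `gmm.d` is an ordinary `Distributions.MixtureModel`, it can also be sampled and evaluated directly through Distributions.jl:

```julia
using Distributions

# Draw 5 samples from the fitted mixture (one column per sample)
rand(gmm.d, 5)

# Evaluate the mixture PDF at a single point
pdf(gmm.d, [0.0, 0.0])
```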

### Alternative mixture model construction

(Gaussian) Mixture models constructed with other packages can also be used to construct a `MultivariateDistribution`, as long as they return a `Distributions.MixtureModel`.