This visualization was created to analyze and explore Gaussian Mixture Models and their decision boundaries. For questions, further information, or to report errors or typos, please contact me at riccardo_colletti@berkeley.edu.

Interactive Visualization — CS289A: Introduction to Machine Learning (Academic Year 2025/2026, UC Berkeley). Developed as part of the CS289A course taught by Prof. Narges Norouzi (norouzi@berkeley.edu) and Prof. Joseph Gonzalez (jegonzal@berkeley.edu), whose lectures deeply inspired this work.

Gaussian Mixture Models

Visualizing Covariance and Decision Boundaries

Visualizing Gaussian Mixtures

Gaussian Mixture Models (GMMs) form the foundation of many probabilistic classifiers. They assume that each class can be described by a Gaussian distribution with mean $\mu_k$ and covariance $\Sigma_k$. Depending on how these covariances are constrained, we obtain very different decision rules, from simple linear boundaries to fully quadratic ones.

This visualization was designed to make those transitions intuitive. By interacting with the covariance matrices, you can see how each classical model emerges as a special case of the Gaussian framework:

[Interactive legend: covariance controls $\Sigma_1$ (Blue) and $\Sigma_2$ (Gold); overlays for Cluster 1 (Blue), Cluster 2 (Gold), their regions, the Mahalanobis boundary, the Euclidean bisector, and the principal axes.]

The Geometry of Gaussian Mixtures

The beauty of Gaussian mixture models lies in how geometry and probability intertwine. Each class $k \in \{1,2\}$ is characterized by a mean vector $\mu_k$ — the cluster's center of mass — and a covariance matrix $\Sigma_k$ that sculpts its shape. Together, these parameters determine not only how data clusters in space, but also the decision boundary: the elegant curve where both classes become equally likely.

This boundary emerges naturally from the interplay of Mahalanobis distances. Beginning with the multivariate Gaussian density in $d$ dimensions,

$$p(\mathbf{x} \mid \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^d\,|\Sigma|}}\, \exp\!\left[-\frac{1}{2}(\mathbf{x}-\mu)^\top \Sigma^{-1}(\mathbf{x}-\mu)\right],$$

the decision boundary (assuming equal class priors) is obtained by equating the two class densities:

$$p(\mathbf{x} \mid \mu_1, \Sigma_1) = p(\mathbf{x} \mid \mu_2, \Sigma_2).$$

Taking logs, multiplying by $-2$, and cancelling the common $(2\pi)^d$ term yields:

$$(\mathbf{x}-\mu_1)^\top \Sigma_1^{-1}(\mathbf{x}-\mu_1) + \ln|\Sigma_1| = (\mathbf{x}-\mu_2)^\top \Sigma_2^{-1}(\mathbf{x}-\mu_2) + \ln|\Sigma_2|.$$
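To make the rule concrete, here is a minimal numpy sketch of the two sides of this equation; the means and covariances below are illustrative placeholders, not values taken from the visualization.

```python
import numpy as np

def boundary_term(x, mu, Sigma):
    """Mahalanobis term plus log-determinant: one side of the boundary equation.

    The class whose term is smaller has the larger density at x, so the
    boundary is exactly where the two terms are equal.
    """
    diff = x - mu
    return diff @ np.linalg.solve(Sigma, diff) + np.log(np.linalg.det(Sigma))

# Illustrative (hypothetical) parameters.
mu1, Sigma1 = np.array([0.0, 0.0]), np.array([[1.0, 0.3], [0.3, 1.0]])
mu2, Sigma2 = np.array([3.0, 1.0]), np.array([[2.0, -0.4], [-0.4, 0.5]])

x = np.array([1.5, 0.5])
score = boundary_term(x, mu1, Sigma1) - boundary_term(x, mu2, Sigma2)
print("assigned class:", 1 if score < 0 else 2)   # negative score -> class 1 is more likely
```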

Each covariance assumption produces a distinct type of boundary — from flat bisectors to curved conics.

1. Isotropic GMM — spherical and equal

Assume equal spherical covariances: $\Sigma_1 = \Sigma_2 = \sigma^2 I$.

Then $\Sigma_k^{-1} = \frac{1}{\sigma^2}I$ and the determinants cancel, leaving:

$$\|\mathbf{x}-\mu_1\|^2 = \|\mathbf{x}-\mu_2\|^2.$$

Expanding gives:

$$(\mu_1-\mu_2)^\top \mathbf{x} = \frac{1}{2}(\|\mu_1\|^2 - \|\mu_2\|^2),$$

which defines the Euclidean bisector of the segment joining the two centers. The boundary is linear, orthogonal to $(\mu_1-\mu_2)$, and invariant to the value of $\sigma^2$.
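As a quick sanity check, a short numpy sketch of this bisector (with hypothetical means) confirms that the midpoint between the centers lies exactly on the boundary:

```python
import numpy as np

# Hypothetical means; with Sigma_1 = Sigma_2 = sigma^2 I the boundary is the Euclidean bisector.
mu1, mu2 = np.array([0.0, 0.0]), np.array([4.0, 2.0])

w = mu1 - mu2                          # boundary normal
b = 0.5 * (mu1 @ mu1 - mu2 @ mu2)      # right-hand side of the bisector equation

def classify_isotropic(x):
    """Class 1 if x lies on class 1's side of the bisector w^T x = b."""
    return 1 if w @ x > b else 2

midpoint = 0.5 * (mu1 + mu2)
print(np.isclose(w @ midpoint, b))                # True: the midpoint sits exactly on the boundary
print(classify_isotropic(np.array([0.5, 0.0])))   # 1: this point is closer to mu1
```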

2. LDA (Equal $\Sigma$) — same shape, rotated ellipses

Let the clusters share a common anisotropic covariance: $\Sigma_1 = \Sigma_2 = \Sigma$.

Substituting into the equation above gives:

$$(\mathbf{x}-\mu_1)^\top \Sigma^{-1}(\mathbf{x}-\mu_1) = (\mathbf{x}-\mu_2)^\top \Sigma^{-1}(\mathbf{x}-\mu_2).$$

Expanding and simplifying:

$$(\mu_1-\mu_2)^\top \Sigma^{-1}\mathbf{x} = \frac{1}{2}(\mu_1+\mu_2)^\top \Sigma^{-1}(\mu_1-\mu_2).$$

The boundary remains linear, but its orientation is now orthogonal to $\Sigma^{-1}(\mu_1-\mu_2)$, not to $(\mu_1-\mu_2)$. This "twist" reflects how $\Sigma$ stretches space along its principal axes — a rotated Mahalanobis geometry.
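The twist is easy to verify numerically. In the sketch below (with a hypothetical shared $\Sigma$ and means), the boundary still passes through the midpoint of the means, but its normal $\Sigma^{-1}(\mu_1-\mu_2)$ is no longer parallel to $(\mu_1-\mu_2)$:

```python
import numpy as np

# Hypothetical shared anisotropic covariance and means.
Sigma = np.array([[2.0, 1.2], [1.2, 1.0]])
mu1, mu2 = np.array([0.0, 0.0]), np.array([3.0, 1.0])

w = np.linalg.solve(Sigma, mu1 - mu2)      # LDA boundary normal: Sigma^{-1} (mu1 - mu2)
b = 0.5 * (mu1 + mu2) @ w                  # right-hand side of the LDA boundary equation

midpoint = 0.5 * (mu1 + mu2)
print(np.isclose(w @ midpoint, b))         # True: the boundary still bisects the segment of means

# The normal is tilted away from (mu1 - mu2): that is the "twist".
cosine = w @ (mu1 - mu2) / (np.linalg.norm(w) * np.linalg.norm(mu1 - mu2))
print(cosine)                              # strictly less than 1 unless the directions coincide
```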

3. Eigenvector Case — alignment between covariance and separation

In this configuration, the direction connecting the two means, $(\mu_1-\mu_2)$, is chosen as one of the eigenvectors of the covariance matrix:

$$\Sigma(\mu_1-\mu_2) = \lambda(\mu_1-\mu_2).$$

Geometrically, this means the ellipses' principal axis is exactly aligned with the line joining the cluster centers. In this aligned case, the Mahalanobis metric rescales space along the same direction as the Euclidean separation — so the boundary remains linear and orthogonal to the line between the means.

In essence, anisotropy and orientation "agree": the deformation introduced by $\Sigma$ does not tilt the boundary, but only stretches distances along or across it.
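A minimal sketch under this assumption (the direction, eigenvalues, and means below are made up for illustration) shows that the LDA normal reduces to a pure rescaling of $(\mu_1-\mu_2)$:

```python
import numpy as np

# Hypothetical setup in which the mean difference is an eigenvector of the shared covariance.
d = np.array([1.0, 1.0]) / np.sqrt(2.0)    # unit direction joining the two means
# Covariance with eigenvector d (eigenvalue 3.0) and its perpendicular (eigenvalue 0.5).
Sigma = 3.0 * np.outer(d, d) + 0.5 * (np.eye(2) - np.outer(d, d))

mu1 = np.array([0.0, 0.0])
mu2 = mu1 + 2.0 * d                        # (mu1 - mu2) is parallel to d by construction

w = np.linalg.solve(Sigma, mu1 - mu2)      # LDA boundary normal
# Since Sigma (mu1 - mu2) = lambda (mu1 - mu2) with lambda = 3, inverting only rescales:
print(np.allclose(w, (mu1 - mu2) / 3.0))   # True: the boundary stays orthogonal to the mean line
```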

4. QDA (Different $\Sigma$) — complementary covariances

When the clusters have distinct covariances, $\Sigma_1 \neq \Sigma_2$, the boundary equation becomes:

$$\mathbf{x}^\top(\Sigma_2^{-1}-\Sigma_1^{-1})\mathbf{x} + 2(\Sigma_1^{-1}\mu_1-\Sigma_2^{-1}\mu_2)^\top\mathbf{x} + (\mu_2^\top\Sigma_2^{-1}\mu_2-\mu_1^\top\Sigma_1^{-1}\mu_1) + \ln\frac{|\Sigma_2|}{|\Sigma_1|} = 0.$$

The presence of the quadratic term $\mathbf{x}^\top(\cdot)\mathbf{x}$ makes the boundary a conic section, typically an ellipse or hyperbola. Because Mahalanobis distance grows more slowly for the class with greater variance, that class claims the region far from both means, so the separating curve bends around the tighter cluster and toward the more spread-out one.
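The conic is easiest to handle by collecting its coefficients. Below is a short numpy sketch (with made-up parameters) that assembles the quadratic, linear, and constant terms of this equation and checks which side of the boundary a point falls on:

```python
import numpy as np

# Hypothetical distinct covariances and means.
mu1, Sigma1 = np.array([0.0, 0.0]), np.array([[1.0, 0.0], [0.0, 1.0]])
mu2, Sigma2 = np.array([3.0, 0.0]), np.array([[4.0, 0.0], [0.0, 2.0]])

S1_inv, S2_inv = np.linalg.inv(Sigma1), np.linalg.inv(Sigma2)

A = S2_inv - S1_inv                                 # quadratic term  x^T A x
b = 2.0 * (S1_inv @ mu1 - S2_inv @ mu2)             # linear term     b^T x
c = (mu2 @ S2_inv @ mu2 - mu1 @ S1_inv @ mu1
     + np.log(np.linalg.det(Sigma2) / np.linalg.det(Sigma1)))   # constant term

def boundary_value(x):
    """Zero on the QDA boundary; positive where class 1 is more likely, negative for class 2."""
    return x @ A @ x + b @ x + c

print(boundary_value(np.array([1.0, 0.0])))   # positive here: this point is claimed by class 1
```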

✨ Interpretation

Covariance defines the "geometry" of the model — how distances and directions are perceived:

| Case | Boundary Type | Key Condition | Geometry |
| --- | --- | --- | --- |
| Isotropic | Linear | $\Sigma = \sigma^2 I$ | Perfect Euclidean symmetry |
| LDA | Linear | Shared anisotropic $\Sigma$ | Rotated Mahalanobis bisector |
| Eigenvector Case | Linear | $(\mu_1-\mu_2)$ eigenvector of $\Sigma$ | Aligned axes, pure scaling |
| QDA | Quadratic | $\Sigma_1 \neq \Sigma_2$ | Curved conic boundary |