DEC (Deep Embedded Clustering)

June 2, 2023
Created by
Neo Yin
Done ✨
Reading Notes
The goal of this reading note is to jot down my high-level understanding of how DEC works and what it is trying to accomplish.
What is the main goal of the DEC model?

The DEC model looks for two things at the same time: an embedding function and a set of cluster centers. For a fixed number of clusters $k$, DEC looks for a dimensionality-reducing embedding function $f$, parameterized as a DNN (deep neural network), and a set of $k$ cluster centers in the lower-dimensional latent space.
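To make the two learnable objects concrete, here is a minimal NumPy sketch. Everything in it is an assumption for illustration: the layer sizes, the random initialization, and the tiny two-layer ReLU network standing in for the DNN are hypothetical choices, not the architecture or initialization used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_hidden, d_latent, k = 5, 10, 8, 2, 3  # illustrative sizes only

# Learnable object 1: the parameters of the embedding function f
# (a hypothetical two-layer MLP standing in for the DNN).
W1 = rng.normal(size=(d_in, d_hidden))
b1 = np.zeros(d_hidden)
W2 = rng.normal(size=(d_hidden, d_latent))
b2 = np.zeros(d_latent)

def f(x):
    """Map inputs of shape (n, d_in) to latent points of shape (n, d_latent)."""
    h = np.maximum(x @ W1 + b1, 0.0)  # ReLU hidden layer
    return h @ W2 + b2

# Learnable object 2: k cluster centers that live in the latent space.
mu = rng.normal(size=(k, d_latent))

x = rng.normal(size=(n, d_in))  # toy data
z = f(x)                        # embedded points, same space as mu
```

The point of the sketch is only that `z` and `mu` share the same low-dimensional space, so distances between embedded points and centers are well defined.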

What is the loss objective of the DEC model?

DEC is very much inspired by


First, for each learnable cluster center $\mu_j$ we can define the t-distributed latent cluster assignment probability (soft assignment) $q_{ij}$ of data point $x_i$ to cluster $j$ in a

-like fashion.

$$q_{ij} = \frac{\left(1+\|f(x_i) - \mu_j\|_2^2/\alpha\right)^{-\frac{\alpha + 1}{2}}}{\sum_{j'}\left(1+\|f(x_i)-\mu_{j'}\|_2^2/\alpha\right)^{-\frac{\alpha+1}{2}}},$$

where for the experiments the authors chose the degrees of freedom $\alpha = 1$.
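The soft assignment formula can be sketched in a few lines of NumPy. This is a sketch under stated assumptions: the function name `soft_assign`, the toy latent points `z` (standing in for $f(x_i)$), and the toy centers `mu` are all made up for illustration.

```python
import numpy as np

def soft_assign(z, mu, alpha=1.0):
    """Student's t soft assignments q[i, j] of latent points z to centers mu.

    z:  array of shape (n, d), the embedded points f(x_i)
    mu: array of shape (k, d), the cluster centers
    """
    # Squared Euclidean distance between every point and every center: (n, k)
    sq_dist = ((z[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)
    # Unnormalized Student's t kernel with alpha degrees of freedom
    kernel = (1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0)
    # Normalize over clusters so that each row sums to 1
    return kernel / kernel.sum(axis=1, keepdims=True)

# Toy latent points and centers (hypothetical values)
z = np.array([[0.0, 0.0], [0.1, 0.0], [3.0, 3.0]])
mu = np.array([[0.0, 0.0], [3.0, 3.0]])
q = soft_assign(z, mu)
```

Each row of `q` is a probability distribution over the clusters, and a point that sits exactly on a center gets most, but not all, of its mass on that center because the t kernel has heavy tails.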

The soft assignment probabilities $q_{ij}$ are compared to a target distribution $p_{ij}$ using the KL divergence as the loss function

$$L = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}}.$$
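As a sanity check on the formula, the loss can be evaluated directly on small arrays. The sketch below assumes both distributions are given as row-stochastic NumPy arrays; the name `kl_loss`, the small `eps` added for numerical safety, and the toy values of `p` and `q` are assumptions for illustration.

```python
import numpy as np

def kl_loss(p, q, eps=1e-12):
    """Sum over i and j of p_ij * log(p_ij / q_ij).

    eps guards against log(0) and division by zero; it is an
    implementation convenience, not part of the formula.
    """
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Toy target and soft assignment distributions (hypothetical values)
p = np.array([[0.9, 0.1], [0.2, 0.8]])
q = np.array([[0.6, 0.4], [0.3, 0.7]])

loss = kl_loss(p, q)       # strictly positive because p differs from q
loss_same = kl_loss(q, q)  # zero when the two distributions coincide
```

The loss is zero only when the soft assignments already match the target, which is why the choice of target distribution determines what training pushes the embedding toward.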

The target distribution is computed from $q_{ij}$ as follows:

$$p_{ij} = \frac{q_{ij}^2/f_j}{\sum_{j'} q_{ij'}^2/f_{j'}},$$

where $f_j = \sum_i q_{ij}$ are the soft cluster frequencies.
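The target distribution is cheap to compute from the soft assignments. A minimal sketch, assuming `q` is an array of shape (n, k) whose rows sum to 1; the function name `target_distribution` and the toy values are made up for illustration.

```python
import numpy as np

def target_distribution(q):
    """Compute p[i, j] from soft assignments q[i, j]."""
    f = q.sum(axis=0)   # soft cluster frequencies f_j, shape (k,)
    w = q ** 2 / f      # square each assignment, then divide by its cluster frequency
    return w / w.sum(axis=1, keepdims=True)  # renormalize over clusters j

# Toy soft assignments (hypothetical values)
q = np.array([[0.6, 0.4],
              [0.9, 0.1],
              [0.2, 0.8]])
p = target_distribution(q)
```

In this toy example every row of `p` is more peaked than the corresponding row of `q`: squaring boosts the larger entries, while dividing by `f` discounts clusters that already hold a lot of soft mass.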

What is the motivation/intuition behind the target distribution $p_{ij}$?

The authors want the target distribution to:

  1. Strengthen predictions
  2. Put more emphasis on data points assigned with high confidence
  3. Normalize the loss contribution of each centroid to prevent large clusters from distorting the hidden feature space.