
# DEC (Deep Embedded Clustering)

Created: June 2, 2023
Created by: Neo Yin
Status: Done ✨
Tags: 🎯
The goal of this reading note is to jot down my high-level understanding of how DEC works and what it is trying to accomplish.
What is the main goal of the DEC model?

The DEC model learns two things simultaneously: an embedding function and cluster centers. For a fixed number of clusters $k$, DEC learns a dimensionality-reducing embedding function $f$ parameterized as a DNN (deep neural network), together with a set of $k$ cluster centers in the lower-dimensional latent space.
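As a minimal sketch of the two learnable components, consider the following (the one-layer linear map is only a stand-in for the DNN encoder, and all sizes are hypothetical, e.g. MNIST-like inputs):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_latent, k = 784, 10, 5  # hypothetical input dim, latent dim, cluster count

# learnable component 1: the encoder f; a one-layer linear map stands in
# for the deep network (DEC initializes its DNN from a stacked autoencoder)
W = rng.normal(scale=0.01, size=(d_in, d_latent))
def f(x):
    return x @ W

# learnable component 2: k cluster centers living in the latent space
mu = rng.normal(size=(k, d_latent))
```

Both $W$ (standing in for the DNN weights) and $\mu$ are optimized jointly during training.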

What is the loss objective of the DEC model?

DEC is very much inspired by t-SNE.

First, for each learnable cluster center $\mu_j$ we can define the t-distributed latent cluster assignment probability (soft assignment) in a t-SNE-like fashion:

$q_{ij} = \frac{(1+||f(x_i) - \mu_j||^2_2/\alpha)^{-\frac{\alpha + 1}{2}}}{\sum_{j'}(1+||f(x_i)-\mu_{j'}||_2^2/\alpha)^{-\frac{\alpha+1}{2}}},$

where for the experiments the authors chose the degree of freedom $\alpha=1$.
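A vectorized NumPy sketch of this soft assignment (function name and toy sizes are my own, not from the paper):

```python
import numpy as np

def soft_assignment(z, mu, alpha=1.0):
    """Student's-t soft assignment q_ij between embedded points z (n, d)
    and cluster centers mu (k, d), with alpha degrees of freedom."""
    # pairwise squared Euclidean distances ||z_i - mu_j||^2, shape (n, k)
    d2 = ((z[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)
    num = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    # normalize over clusters so each row sums to one
    return num / num.sum(axis=1, keepdims=True)

# toy example: 5 embedded points, 2 centers, 3-dim latent space
rng = np.random.default_rng(0)
z = rng.normal(size=(5, 3))
mu = rng.normal(size=(2, 3))
q = soft_assignment(z, mu)
```

Each row of `q` is a probability distribution over the $k$ clusters for one data point.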

The soft assignment probabilities are compared to a target distribution $p_{ij}$ using the KL divergence

$L = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}}.$

The target distribution is computed from $q_{ij}$ as follows:

$p_{ij} = \frac{q_{ij}^2/f_j}{\sum_{j'} q^2_{ij'}/f_{j'}},$

where $f_j = \sum_i q_{ij}$ are the soft cluster frequencies.
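The target distribution and the loss can be sketched directly from these formulas (function names and the small numerical epsilon are my own additions):

```python
import numpy as np

def target_distribution(q):
    """Target p_ij: square q_ij, divide by the soft cluster frequency
    f_j = sum_i q_ij, then renormalize each row to sum to one."""
    f = q.sum(axis=0)
    w = q ** 2 / f
    return w / w.sum(axis=1, keepdims=True)

def kl_loss(p, q, eps=1e-12):
    """L = sum_ij p_ij * log(p_ij / q_ij); eps guards against log(0)."""
    return float((p * np.log((p + eps) / (q + eps))).sum())

# toy soft assignments for 2 points over 2 clusters
q = np.array([[0.7, 0.3],
              [0.2, 0.8]])
p = target_distribution(q)
loss = kl_loss(p, q)
```

Note that in DEC, `p` is treated as a fixed target (no gradient flows through it), and only `q` depends on the learnable encoder and centers.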

What is the motivation/intuition behind the target distribution $p_{ij}$?

The authors want the target distribution to:

1. Strengthen predictions (i.e., improve cluster purity)
2. Put more emphasis on data points assigned with high confidence
3. Normalize the loss contribution of each centroid to prevent large clusters from distorting the hidden feature space.
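A toy numerical check (my own, not from the paper) of the first two points: with balanced clusters, squaring and renormalizing visibly sharpens each row of $q$ toward its dominant cluster.

```python
import numpy as np

# two points, two equally sized (balanced) soft clusters
q = np.array([[0.6, 0.4],
              [0.4, 0.6]])
f = q.sum(axis=0)                      # soft frequencies [1.0, 1.0]
w = q ** 2 / f
p = w / w.sum(axis=1, keepdims=True)
# each row of p is sharper than the corresponding row of q,
# e.g. p[0, 0] = 0.36 / 0.52 = 9/13 ≈ 0.692 > 0.6
```

The frequency term $f_j$ handles the third point: dividing by $f_j$ discounts large clusters so they cannot dominate the target.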