The DEC model learns two things simultaneously: an embedding function and a set of cluster centers. For a fixed number of clusters, DEC looks for a dimensionality-reducing embedding function, implemented as a deep neural network (DNN), together with a set of cluster centers in the lower-dimensional latent space.
DEC is very much inspired by
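First, for each learnable cluster center $\mu_j$ and each embedded data point $z_i$, we can define the t-distributed latent cluster assignment probability (soft assignment)

$$
q_{ij} = \frac{\left(1 + \lVert z_i - \mu_j \rVert^2 / \alpha\right)^{-\frac{\alpha + 1}{2}}}{\sum_{j'} \left(1 + \lVert z_i - \mu_{j'} \rVert^2 / \alpha\right)^{-\frac{\alpha + 1}{2}}},
$$

where for the experiments the authors chose the degrees of freedom $\alpha = 1$.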
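As a rough illustration, the soft assignment can be sketched in plain Python: a Student's t kernel is evaluated between every embedded point and every cluster center, then normalized over the centers. The function name `soft_assign` and the toy `points` and `centers` are hypothetical, chosen only for this example; in DEC the points would be encoder outputs and the centers would be learned.

```python
def soft_assign(points, centers, alpha=1.0):
    """Return q[i][j], the soft assignment of embedded point i to center j."""
    q = []
    for z in points:
        kernel = []
        for mu in centers:
            # Squared Euclidean distance between the point and the center.
            sq_dist = sum((a - b) ** 2 for a, b in zip(z, mu))
            # Student's t kernel with alpha degrees of freedom.
            kernel.append((1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0))
        # Normalize over centers so each row is a probability distribution.
        total = sum(kernel)
        q.append([k / total for k in kernel])
    return q


# Hypothetical 2-D embeddings and two cluster centers (illustration only).
points = [[0.0, 0.0], [1.0, 0.0], [4.0, 0.0]]
centers = [[0.0, 0.0], [4.0, 0.0]]
q = soft_assign(points, centers)

for row in q:
    print([round(v, 4) for v in row])
```

Each row sums to one, and a point sitting exactly on a center still gives some probability to the other center because of the heavy tails of the t-distribution.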
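The soft assignment probabilities $q_{ij}$ are compared to a target distribution $p_{ij}$ using the Kullback-Leibler (KL) divergence as the loss function, which for a fixed target differs from the cross entropy only by a constant:

$$
L = \mathrm{KL}(P \,\|\, Q) = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}}.
$$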
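The target distribution is computed from the soft assignments $q_{ij}$ as follows:

$$
p_{ij} = \frac{q_{ij}^2 / f_j}{\sum_{j'} q_{ij'}^2 / f_{j'}},
$$

where $f_j = \sum_i q_{ij}$ are the soft cluster frequencies.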
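The following minimal sketch shows one way to compute the target distribution and the resulting loss from a matrix of soft assignments. It assumes the loss is written as KL(P || Q), which for a fixed target differs from the cross entropy only by a term that does not depend on Q. The names `target_distribution` and `kl_divergence` and the toy matrix `q` are hypothetical and serve only as an illustration.

```python
import math


def target_distribution(q):
    """Square each soft assignment, divide by the soft cluster frequency, renormalize per row."""
    n_clusters = len(q[0])
    # Soft cluster frequencies: f[j] is the sum of q[i][j] over all points i.
    f = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / f[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_divergence(p, q):
    """KL(P || Q) summed over all points and clusters."""
    loss = 0.0
    for p_row, q_row in zip(p, q):
        for p_ij, q_ij in zip(p_row, q_row):
            if p_ij > 0.0:  # terms with p_ij == 0 contribute nothing
                loss += p_ij * math.log(p_ij / q_ij)
    return loss


# Hypothetical soft assignments for three points and two clusters.
q = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
p = target_distribution(q)
loss = kl_divergence(p, q)

for row in p:
    print([round(v, 4) for v in row])
print("loss:", loss)
```

In this toy example every row of `p` is more peaked than the corresponding row of `q`, so already confident assignments are pushed further toward their cluster. In training, `p` would typically be held fixed while the loss is minimized with respect to the network parameters and the cluster centers.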
The authors want the target distribution to:
- Strengthen predictions
- Put more emphasis on data points assigned with high confidence
- Normalize the loss contribution of each centroid to prevent large clusters from distorting the hidden feature space.