- t-SNE has the advantage over other dimensionality reduction methods that it tries to preserve local structure when mapping a high-dimensional space into 2D or 3D.
- In the high-dimensional (hD) space, t-SNE uses Gaussian (normal) distributions to convert pairwise distances between data points into probabilities (similarities).
- In the low-dimensional (lD) space, t-SNE converts pairwise Euclidean distances between the embedded points into probabilities using a Student's t-distribution with one degree of freedom.
- In order to make the points in lD reflect their relationships in hD, the Kullback–Leibler divergence between the two distributions is minimised by gradient descent.
- The reason to use Student's t statistic is its heavy tails: moderately dissimilar points can be placed far apart in lD without incurring a large penalty, which counteracts the "crowding problem" that arises when a Gaussian is used in low dimensions.
- In real-life applications, the Barnes–Hut approximation (implemented with quadtrees) makes the algorithm efficient enough to handle reasonably large data sets with thousands of samples.
- Relation to physics: the t-SNE gradient can be viewed as a simulation of the N-body problem, where each pair of points is connected by a spring exerting an attractive or repulsive force proportional to the mismatch between hD and lD similarities, and the update of each point is the sum of the forces acting on it. (TODO: The N-body problem is new to me.)
- As an extension, multiple maps t-SNE allows the same object to be represented in several maps, capturing features that are correlated in different contexts.
- t-SNE can be used to cluster samples and to assist feature selection.
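The points above can be sketched in code. Below is a minimal, illustrative NumPy version of one t-SNE optimisation loop: Gaussian affinities in hD, Student-t (one degree of freedom) affinities in lD, and gradient descent on the KL divergence. It is a simplification under stated assumptions (a fixed kernel width instead of the per-point perplexity search, plain gradient descent without momentum or early exaggeration), not the full algorithm.

```python
# Minimal t-SNE sketch in NumPy (toy data, fixed sigma = 1, no perplexity search).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 5))          # 6 points in 5-D (toy data)
Y = rng.normal(size=(6, 2)) * 1e-2   # small random initial 2-D embedding

def squared_dists(A):
    """Pairwise squared Euclidean distances."""
    s = (A * A).sum(axis=1)
    return s[:, None] + s[None, :] - 2.0 * A @ A.T

# High-dimensional affinities: Gaussian kernel, symmetrised and normalised.
# A real implementation tunes a per-point sigma to match a target perplexity.
D = squared_dists(X)
P = np.exp(-D)
np.fill_diagonal(P, 0.0)
P = P / P.sum(axis=1, keepdims=True)  # conditional p_{j|i}
P = (P + P.T) / (2 * len(X))          # symmetrised joint p_ij

def q_and_num(Y):
    """Low-dimensional affinities: Student t with 1 d.o.f. (Cauchy kernel)."""
    num = 1.0 / (1.0 + squared_dists(Y))
    np.fill_diagonal(num, 0.0)
    return num / num.sum(), num

def kl(P, Q):
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / Q[mask])))

def gradient(P, Q, num, Y):
    # dC/dy_i = 4 * sum_j (p_ij - q_ij) * num_ij * (y_i - y_j):
    # spring-like forces, attractive where p_ij > q_ij, repulsive otherwise.
    W = (P - Q) * num
    return 4.0 * ((np.diag(W.sum(axis=1)) - W) @ Y)

Q, num = q_and_num(Y)
kl_before = kl(P, Q)
for _ in range(200):                  # plain gradient descent, lr = 1.0
    Y -= 1.0 * gradient(P, Q, num, Y)
    Q, num = q_and_num(Y)
kl_after = kl(P, Q)
```

In practice one would not hand-roll this; off-the-shelf implementations such as scikit-learn's `TSNE` use the Barnes–Hut variant mentioned above and handle perplexity calibration, momentum, and early exaggeration.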
Monday, 8 May 2017
t-distributed Stochastic Neighbor Embedding (tSNE)
Notes from the video lecture on t-SNE given by Laurens van der Maaten at a Google Tech Talk, available on YouTube.