Project Details

Hyperbolic Stochastic Neighbor Embeddings

Subject Area Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing; Mathematics
Term from 2021 to 2023
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 455095046
 
Final Report Year 2023

Final Report Abstract

Data analysis is of major importance for a wide range of industries and for research in a variety of fields. High-dimensional data sets are a particularly demanding type of data, and with increasing computing power and advancing measurement technologies they are becoming increasingly common. A widely used practice for making high-dimensional data points accessible to visual analysis by humans is to embed them into a low-dimensional space. Our focus was on the popular Stochastic Neighbor Embedding (SNE) algorithm, which computes such embeddings, and on its variant, t-Distributed Stochastic Neighbor Embedding (t-SNE). Our research project aimed to develop the t-SNE approach further in two key aspects: scalability and generalization to hyperbolic spaces.

The goal of our first work package was to develop a hierarchical solver approach for the multi-scale computation of t-SNE embeddings. A similar approach, outlining a two-level sampling technique, was recently described. Our experiments did not show any additional benefit in computation time or embedding quality when iterating the sampling, i.e., when going to a three- or higher-scale approach. However, they revealed a relation between the perplexity, one of the most important hyper-parameters of t-SNE, and the number of data points to be embedded. Based on this relation, we were able to extend the simple two-level scheme to an acceleration scheme with additional benefits in terms of computation time and embedding quality.

Regarding the second aspect, t-SNE methods mostly use Euclidean spaces as embedding spaces. In work packages two and three, we aimed at embedding data points into hyperbolic spaces, which provide a promising alternative due to their different metrics. Several previous works have tackled this hyperbolic embedding problem. However, all of these still follow the original t-SNE methodology [HR03] without utilizing acceleration structures, such as a tree approximation. In the project, we developed such an acceleration structure for computing embeddings into hyperbolic space.

In the third work package, we intended to investigate the interaction possibilities that hyperbolic t-SNE offers, such as changing the curvature of the embedding space and using hyperbolic Focus+Context techniques to explore the data. Due to time limitations, the elements of this work package were not tackled within the project and are left as future work.
