I work on visualization algorithms. It is a great topic to work on, as it gives a very concrete picture of otherwise opaque data.

One downside is that it can be hard to work in only two (or three) dimensions instead of a latent space with hundreds of dimensions. This usually changes the problem in unexpected ways, leading to interesting questions about the model in general.

If you are a student at the University of Tübingen and want to work in this domain, please reach out. We have opportunities for lab rotations and Master's thesis supervision.


Figure: Annotated plot of CIFAR-10 showing the subcluster structure (Böhm et al., 2022).

Currently I work on visualization for image data. It neatly ties together nonlinear dimensionality reduction, à la t-SNE, with contrastive learning. The former shines in visualization by optimizing over, and extracting information from, a kNN graph (with a lot of hand-waving), while the latter learns a good representation by implicitly defining a graph through transformations of the original data. Combining the two approaches yields a new optimization objective for image data that produces sensible visualizations of natural images.

MSc thesis

My MSc thesis was mostly concerned with a comprehensive comparison between t-SNE, UMAP, and other methods. We noticed that these algorithms can be generalized as visualizations that balance attraction and repulsion, which revealed that UMAP relates to another parameter configuration of t-SNE. This means that, despite their purported dissimilarity, the algorithms are optimizing the same objective, which lies on a spectrum.
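
To illustrate what balancing attraction and repulsion can look like, here is a minimal sketch. It assumes the balance is expressed as a single factor that scales the attractive forces in the t-SNE gradient. The function name and the parameter `rho` are illustrative and not taken from the thesis code.

```python
import numpy as np


def attraction_repulsion_gradient(Y, P, rho=1.0):
    """Gradient of a t-SNE-style loss with the attraction scaled by rho.

    Y   : (n, d) array of embedding coordinates
    P   : (n, n) symmetric affinities with zero diagonal, summing to 1
    rho : illustrative name for the attraction weight (an assumption);
          rho = 1 gives the standard t-SNE gradient
    """
    diff = Y[:, None, :] - Y[None, :, :]      # (n, n, d) pairwise differences
    sq_dist = np.sum(diff ** 2, axis=-1)      # squared embedding distances
    W = 1.0 / (1.0 + sq_dist)                 # Cauchy (Student-t) kernel
    np.fill_diagonal(W, 0.0)                  # no self-interaction
    Q = W / W.sum()                           # normalized embedding similarities

    # Attractive forces pull neighbours in P together,
    # repulsive forces push all pairs apart.
    attraction = np.einsum("ij,ijk->ik", P * W, diff)
    repulsion = np.einsum("ij,ijk->ik", Q * W, diff)
    return 4.0 * (rho * attraction - repulsion)
```

With `rho = 1` this is the standard t-SNE gradient. Larger values strengthen attraction relative to repulsion, which is one way to move along such a spectrum. A gradient descent step would then be `Y -= learning_rate * attraction_repulsion_gradient(Y, P, rho)`.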

This work has been published in JMLR (Böhm et al., 2022) and I presented a poster at NeurIPS 2022 in New Orleans. The entire thesis is also available.