Machine Learning for Single-Cell Biology

In recent years, machine learning has started to help us understand the molecular biology of single cells. In this context, we develop methods that target specific biological questions and hence originate from many different areas of machine learning: topological data analysis [P25, see below], manifold learning [P19], causal inference [T20], and deep learning [P20, see below]. To make these methods available for practical use, we develop high-performance software [P23, see below]. With F. J. Theis.

Scanpy [P23, code] is a scalable toolkit for analyzing single-cell gene expression data. It includes preprocessing, visualization, clustering, pseudotime and trajectory inference, and differential expression testing. It is currently (December 2017) the only package that can handle the rapidly growing dataset sizes without subsampling, scaling to more than one million cells.

Partition-based graph abstraction (PAGA) reconciles clustering with manifold learning by explaining variation among observations using both discrete and continuous latent variables [P26, code]. PAGA generates coarse-grained maps of manifolds with complex topologies, and it does so efficiently and robustly across different datasets.
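The idea of coarse-graining can be illustrated with a small, self-contained sketch. It is not the statistic used in [P26]. It only assumes that we are given a neighborhood graph of cells and a partition of the cells into groups, and it scores each pair of groups by the ratio of observed inter-group edges to the number expected if edges were wired at random with the same node degrees. The function name `coarse_grain` and the toy graph are hypothetical.

```python
from collections import Counter
from itertools import combinations


def coarse_grain(edges, labels):
    """Return {(group_a, group_b): observed / expected inter-group edges}.

    edges  -- list of undirected edges (i, j) between cells
    labels -- dict mapping each cell to its group
    """
    # Sum the node degrees within each group.
    group_degree = Counter()
    for i, j in edges:
        group_degree[labels[i]] += 1
        group_degree[labels[j]] += 1

    # Count the edges that connect two different groups.
    observed = Counter()
    for i, j in edges:
        a, b = labels[i], labels[j]
        if a != b:
            observed[tuple(sorted((a, b)))] += 1

    # Under random wiring with fixed degrees, the expected number of
    # edges between groups a and b is d_a * d_b / (2 * n_edges).
    n_edges = len(edges)
    connectivity = {}
    for a, b in combinations(sorted(group_degree), 2):
        expected = group_degree[a] * group_degree[b] / (2 * n_edges)
        connectivity[(a, b)] = observed[(a, b)] / expected if expected else 0.0
    return connectivity


# Toy graph: three triangles (groups A, B, C) joined in a chain A-B-C.
labels = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B", 6: "C", 7: "C", 8: "C"}
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5),
         (6, 7), (7, 8), (6, 8), (2, 3), (5, 6)]
conn = coarse_grain(edges, labels)
print(conn)
```

In this toy example the groups A and C share no edges, so their connectivity is zero, and the coarse-grained map is the chain A-B-C.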

We showed how to reconstruct continuous biological processes with deep learning, using the examples of cell cycle and disease progression in diabetic retinopathy [P20].

The goal of the Data Science Bowl 2017 was to predict lung cancer from computed tomography scans. We won the 7th prize among almost 2000 teams, the best result among all German teams [code].

Earlier Work

Tensor Trains (MPS, DMRG) rank, alongside quantum Monte Carlo and the Numerical Renormalization Group, among the most popular numerical approaches for tackling the exponential computational complexity of models of strongly correlated materials. Having been a topic in applied mathematics for several years, they have recently appeared in machine learning. I developed a way to use Tensor Trains within Dynamical Mean-Field Theory to improve our ability to simulate strongly correlated materials [O6,P12-P18]. With U. Schollwöck.

Before that, I modeled diffusion-reaction processes to enhance the material properties of solar cells [O5,P8-P11]. With P. Pichler. I also investigated the quantum Rabi model, which is important, for example, for understanding the technical foundations of quantum computing [P6,P7]. With D. Braak.

During my studies, I focused on emergent properties of quantum many-body systems and their applications, for example, in showing how grain boundaries limit high-temperature superconductivity [P5]. With T. Kopp. I also did research on the non-equilibrium behavior of these systems [P1-P4], in particular on the fundamental problem of how such systems transition from an excited state to equilibrium. This happens through chaotic dynamics in the classical case, but is an active area of research in the quantum case. We showed that the transition proceeds through an intermediate, prethermalized plateau to which a statistical theory applies. M. Kollar posed this as a problem for a summer project, during which I contributed the central analytical calculation [T1] to the highly cited paper [P3]. With M. Rigol, I investigated collapse and revival oscillations and coherent expansions, as suggested for realizing matter-wave lasers [P2,P4].

During high school, I tried to gain a better understanding of how philosophical and political ideas stimulate change in society and culture. My thesis investigated why J.-P. Sartre publicly supported the German terrorist group RAF during his visit to Stammheim in 1974 [O1].