Special Seminar in Computing and Mathematical Sciences
The spatial organization of the genome is an important regulator of gene expression, and alterations of this organization are associated with various diseases. A recent breakthrough in genomics makes it possible to perform perturbation experiments at a very large scale. This motivates the development of a causal inference framework based on both observational and interventional data. We characterize the causal relationships that are identifiable and present the first provably consistent algorithm for learning a causal network from such data. We will then link gene expression with the 3D organization of the genome. In particular, we will discuss approaches for integrating different data modalities and analyze alterations in the spatial organization of the genome via autoencoders and optimal transport. We end with a theoretical analysis of autoencoders that links overparameterization to memorization. In particular, we will show that overparameterized single-layer fully connected autoencoders, as well as deep convolutional autoencoders, memorize images, i.e., they produce outputs in the span of the training images. Collectively, this talk will highlight the symbiosis between genomics and AI and show how biology can lead to new theorems, which in turn can guide biological experiments.
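
The memorization claim can be illustrated in the simplest setting with a small numerical sketch. This is not the speaker's method or proof. It assumes a single linear layer (no nonlinearity), synthetic Gaussian "images", zero initialization, and plain gradient descent on the squared reconstruction error, all of which are choices made here for illustration. Under these assumptions, the trained map sends any new input into the span of the training examples.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 50, 5                      # input dimension d far exceeds the number of training points n
X = rng.standard_normal((d, n))   # columns are synthetic training "images" (assumption)

# Single linear layer f(x) = W x, trained by gradient descent on
# 0.5 * ||W X - X||_F^2, starting from W = 0 (assumed initialization).
W = np.zeros((d, d))
lr = 1.0 / np.linalg.norm(X, 2) ** 2   # step size based on the largest singular value of X
for _ in range(2000):
    W -= lr * (W @ X - X) @ X.T

# Orthogonal projector onto the span of the training columns.
P = X @ np.linalg.pinv(X)

# Feed an unseen input through the trained layer and measure how far
# the output lies from the span of the training examples.
x_new = rng.standard_normal(d)
out = W @ x_new
residual = np.linalg.norm(out - P @ out)
print("distance of output from training span:", residual)
```

In this linear setting every gradient step adds a matrix whose columns lie in the span of the training data, so the output for any input stays in that span. The nonlinear and convolutional cases mentioned in the abstract are not covered by this sketch.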