Project Details
On the Convergence of Variational Deep Learning to Sums of Entropies
Applicants
Professor Dr. Asja Fischer; Professor Dr. Jörg Lücke
Subject Area
Mathematics
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term
since 2021
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 464104047
Deep probabilistic data models represent a central theoretical foundation for deep unsupervised learning. In the form of, e.g., sigmoid belief networks (SBNs) or deep belief networks, they have played a key role in establishing Deep Learning as a research field. Furthermore, they are driving current theoretical and practical developments, e.g., in the form of Variational Autoencoders (VAEs), Generative Adversarial Nets, or Deep Restricted Boltzmann Machines (RBMs).

For all deep probabilistic models, learning algorithms change the model parameters until learning has converged, i.e., until the parameters cease to change significantly. The convergence points of deep learning can consequently be regarded as stationary points of a learning dynamics. The concrete rules by which the parameters change are derived from the respective objective function that a deep learning algorithm aims to optimize. Most deep probabilistic models (including SBNs, RBMs, as well as standard and recent VAEs) are based on a theoretically well-founded learning objective: the variational lower bound (a.k.a. ELBO). Here we aim at investigating a theoretical structure that seems common to all models based on a variational learning objective: the convergence of the variational lower bound to sums of entropies.

Each deep probabilistic model is defined by its constituting distributions, e.g., latent and observable distributions for SBNs and VAEs, or specific Boltzmann distributions for RBMs. Our hypothesis is that during learning, the parameters of the above-mentioned models change such that the variational lower bound becomes equal to a sum of entropies at convergence. These entropies are defined by exactly those distributions that define a given deep generative model. We aim at investigating for which class of deep models such a behavior can be proven under realistically encountered conditions. Furthermore, we aim at exploiting convergence to entropies to improve deep learning by (A) providing theoretically grounded strategies to avoid shallow local optima, mode collapse, and overfitting; and (B) providing partial analytical solutions for deep variational optimization. Our main mathematical tools will be the theory of exponential family distributions and the theory of variational deep learning.

We believe that the implications of the theoretical structure we aim at investigating can very significantly change and deepen our understanding of deep unsupervised learning; and we believe that much improved novel approaches for unsupervised deep learning will follow from the theoretical insights we will gain.
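As an illustrative sketch of the hypothesized structure (assuming a standard VAE with encoder q_\Phi(z|x), prior p_\Theta(z) and decoder p_\Theta(x|z); the notation is ours and not taken from the project text), the variational lower bound for N data points x^{(1)},...,x^{(N)} reads

\mathcal{L}(\Phi,\Theta) = \frac{1}{N}\sum_{n=1}^{N} \Big( \mathbb{E}_{q_\Phi(z\mid x^{(n)})}\big[\log p_\Theta(x^{(n)}\mid z)\big] - D_{\mathrm{KL}}\big(q_\Phi(z\mid x^{(n)})\,\big\|\,p_\Theta(z)\big) \Big),

and the hypothesis investigated here is that at stationary points of learning this bound takes the form of a sum of entropies of the model's constituting distributions, schematically

\mathcal{L}(\Phi,\Theta) = \frac{1}{N}\sum_{n=1}^{N} H\big[q_\Phi(z\mid x^{(n)})\big] \;-\; H\big[p_\Theta(z)\big] \;-\; \frac{1}{N}\sum_{n=1}^{N} \mathbb{E}_{q_\Phi(z\mid x^{(n)})}\Big[H\big[p_\Theta(x\mid z)\big]\Big],

where H[\cdot] denotes (differential) entropy. This sketch only illustrates the VAE case under standard exponential-family assumptions; which classes of deep models admit such a convergence result is precisely the question the project addresses.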
DFG Programme
Priority Programmes
Subproject of
SPP 2298:
Theoretical Foundations of Deep Learning
International Connection
Canada, France, United Kingdom
Cooperation Partners
Dr. Jörg Bornschein; Dr. Zhenwen Dai; Georgios Exarchakis, Ph.D.; James Lucas