From atoms to emergent mechanisms with information bottleneck and diffusion probabilistic models

P. Tiwary
University of Maryland,
United States

Keywords: AI, high-dimensional data


The ability to rapidly learn from high-dimensional data to make reliable predictions about the future is crucial in many contexts. This could be a fly avoiding predators, or the retina processing terabytes of data to guide complex human actions. Modern-day artificial intelligence (AI) aims to mimic this fidelity and has been successful in many domains of life. It is tempting to ask whether AI could also be used to understand and predict the emergent mechanisms across timescales for complex molecules with millions of atoms. In this talk I will show that certain flavors of AI can indeed help us understand generic molecular structure and dynamics, and also predict them even in situations with arbitrarily long memories. However, this requires close integration of AI with old and new ideas in statistical mechanics. I will talk about such methods developed by my group (1-3) using the information bottleneck, denoising diffusion probabilistic models and long short-term memory networks, focusing on the first one or two frameworks in the interest of time. I will demonstrate the methods on different problems, where we predict mechanisms at timescales much longer than milliseconds while keeping all-atom/femtosecond resolution. These include ligand dissociation from flexible protein/RNA and crystal nucleation with competing polymorphs.

References:
1. Wang, Y., Ribeiro, J.M.L. & Tiwary, P. Past–future information bottleneck for sampling molecular reaction coordinate simultaneously with thermodynamics and kinetics. Nat. Commun. 10, 3573 (2019).
2. Wang, Y. & Tiwary, P. Denoising diffusion probabilistic models for replica exchange. arXiv preprint arXiv:2107.07369 (2021).
3. Tsai, S.T., Kuo, E.J. & Tiwary, P. Learning molecular dynamics with simple language model built upon long short-term memory neural network. Nat. Commun. 11, 5115 (2020).