The Bayes-Duality Project

Toward AI that learns adaptively, robustly, and continuously, like humans


About the project

Goal: To develop a new learning paradigm for Artificial Intelligence (AI) that learns like humans in an adaptive, robust, and continuous fashion.

Summary: The new learning paradigm will be based on a new principle of machine learning, which we call the Bayes-Duality principle and will develop during this project. Conceptually, the principle hinges on the fundamental idea that an AI should be capable of efficiently preserving and acquiring relevant past knowledge, for quick adaptation in the future. We will apply the principle to the representation of past knowledge, its faithful transfer to new situations, and the collection of new knowledge whenever necessary. Current deep-learning methods lack these mechanisms and instead focus on brute-force data collection and training. Bayes-Duality aims to fix these deficiencies.

Funding and Duration:



Team PIs

Approx-Bayes team
Emtiyaz Khan

Research director (Japan side)

Approx-Bayes team at RIKEN-AIP and OIST

Stat-Theory team
Julyan Arbel
Math-Science team
Kenichi Bannai
HPC team
Rio Yokota

Core members

Pierre Alquier

Core Member

Research Scientist, Approx-Bayes team at RIKEN-AIP


Name | Affiliation | Position | Team in project
Paul Chang | Aalto University, Department of Computer Science | Ph.D. Student | Approx-Bayes team
Florence Forbes | Inria Grenoble Rhône-Alpes | Principal Investigator | Stat-Theory team
Kei Hagihara | RIKEN AIP | Postdoctoral Researcher | Math-Science team
Samuel Kaski | Aalto University, Department of Computer Science | Professor | Approx-Bayes team
Takahiro Katagiri | Nagoya University, Information Technology Center | Professor | HPC team
Akihiro Ida | The University of Tokyo, Information Technology Center | Project Associate Professor | HPC team
Takeshi Iwashita | Hokkaido University, Information Initiative Center | Professor | HPC team
Julien Mairal | Inria Grenoble Rhône-Alpes | Research Scientist | Stat-Theory team
Eren Mehmet Kiral | RIKEN AIP | Special Postdoctoral Researcher | Math-Science team
Kengo Nakajima | The University of Tokyo, Information Technology Center | Professor | HPC team
Takeshi Ogita | Tokyo Woman’s Christian University, School of Arts and Sciences | Professor | HPC team
Jan Peters | TU Darmstadt | Professor | Approx-Bayes team
Judith Rousseau | Université Paris-Dauphine & University of Oxford | Professor | Stat-Theory team
Haavard Rue | King Abdullah University of Science and Technology, CEMSE Division | Professor | Approx-Bayes team
Akiyoshi Sannai | RIKEN AIP | Research Scientist | Math-Science team
Mark Schmidt | University of British Columbia, Department of Computer Science | Associate Professor | Approx-Bayes team
Arno Solin | Aalto University, Department of Computer Science | Assistant Professor | Approx-Bayes team
Siddharth Swaroop | University of Cambridge, Department of Engineering | Ph.D. Student | Approx-Bayes team
Yuuki Takai | RIKEN AIP | Postdoctoral Researcher | Math-Science team
Asuka Takatsu | Tokyo Metropolitan University | Associate Professor | Math-Science team
Richard Turner | University of Cambridge, Department of Engineering | Associate Professor | Approx-Bayes team
Mariia Vladimirova | Inria Grenoble Rhône-Alpes | Ph.D. Student | Stat-Theory team
Pierre Wolinski | Inria Grenoble Rhône-Alpes & University of Oxford | Postdoctoral Fellow | Stat-Theory team
Shuji Yamamoto | Keio University / RIKEN AIP | Associate Professor | Math-Science team


Open Positions

The formal call for applications will be out later in the year. For now, we list a rough number of positions below. If interested, please send all inquiries to jobs-bayes-duality (at) googlegroups (dot) com, indicating the location(s) and group(s) you are interested in. Joint positions (between countries/institutes) are also possible, but we are still working out the details.



Bayes duality illustration

Our goal is to develop a new learning paradigm that enables adaptive, robust, and lifelong learning in AI systems. Deep-learning methods are not sufficiently adaptive or robust: new knowledge cannot easily be added to a trained model and, when this is forced, old knowledge is quickly forgotten. Given a new dataset, the whole model must be retrained from scratch on both the old and the new data, while training on the new dataset alone leads to catastrophic forgetting of the past. All of the data must be available at the same time, creating the dependency on large datasets and models that plagues almost all deep-learning systems. Our main goal is to fix this by developing a new learning paradigm that supports adaptive and robust systems which learn throughout their lives.

We introduce a new learning principle for machine learning, which we call the Bayes-Duality principle, or simply “Bayes-Duality”. The principle exploits the “dual perspectives” of approximate (Bayesian) posteriors to extend concepts of duality (similar to convex duality) to nonconvex problems. It is based on a new discovery that the natural gradients used in approximate Bayesian methods automatically give rise to such dual representations. In the past, we have shown that natural gradients for Bayesian problems yield a majority of machine-learning algorithms as special cases (see our paper on the Bayesian Learning Rule). Our goal now is to show that the same approach can apply ideas of duality to nonconvex problems, such as those that arise in deep learning.
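As a rough illustration of how natural-gradient updates recover classical algorithms, here is a minimal sketch of the Bayesian Learning Rule for a one-dimensional Gaussian candidate distribution. The function name `blr_gaussian`, the delta approximation of the expectations (evaluating the loss derivatives at the mean), and the quadratic test loss are our own illustrative choices, not project code.

```python
def blr_gaussian(grad, hess, m=0.0, prec=1.0, rho=0.5, steps=100):
    """Bayesian Learning Rule sketch for a 1-D Gaussian candidate N(m, 1/prec).

    The rule performs natural-gradient descent on the Gaussian's natural
    parameters; here the required expectations of the loss derivatives are
    approximated at the mean (a delta approximation).
    """
    for _ in range(steps):
        g = grad(m)                          # approx. E_q[loss'(theta)]
        h = hess(m)                          # approx. E_q[loss''(theta)]
        prec = (1 - rho) * prec + rho * h    # precision (natural-parameter) update
        m = m - rho * g / prec               # mean update: a damped Newton-like step
    return m, prec

# Quadratic loss 0.5*a*(theta - b)**2: the fixed point is m = b, prec = a,
# i.e. the mean lands on the minimizer and the precision matches the curvature.
a, b = 4.0, 2.0
m, prec = blr_gaussian(lambda t: a * (t - b), lambda t: a)
```

With `rho = 1` the two updates collapse into a single exact Newton step; smaller `rho` gives the damped behaviour, with the Gaussian precision tracking curvature as an uncertainty estimate.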

Our main goals in the future include the following:

  • A theory of Bayes-duality and connections to other dualities
  • Theoretical guarantees for adaptive systems based on Bayes-duality
  • Practical methods for knowledge transfer and collection in deep learning

We already have a few promising results in these directions:



  • The Bayesian Learning Rule,
    (Preprint) M.E. Khan, H. Rue [ arXiv ] [ Tweet ]
  • Knowledge-Adaptation Priors,
    (NeurIPS 2021) M.E. Khan*, S. Swaroop* [ arXiv ] [ Slides ] [ Tweet ] [ SlidesLive Video ]
  • Dual Parameterization of Sparse Variational Gaussian Processes,
    (NeurIPS 2021) P. Chang, V. Adam, M.E. Khan, A. Solin
  • Continual Deep Learning by Functional Regularisation of Memorable Past,
    (NeurIPS 2020, Oral) P. Pan*, S. Swaroop*, A. Immer, R. Eschenhagen, R. E. Turner, M.E. Khan [ arXiv ] [ Code ] [ Poster ]
  • Approximate Inference Turns Deep Networks into Gaussian Processes,
    (NeurIPS 2019) M.E. Khan, A. Immer, E. Abedi, M. Korzepa [ arXiv ] [ Code ]
  • Decoupled Variational Gaussian Inference,
    (NIPS 2014) M.E. Khan [ Paper and appendix ]
  • Fast Dual Variational Inference for Non-Conjugate Latent Gaussian Models,
    (ICML 2013) M.E. Khan, A. Aravkin, M. Friedlander, M. Seeger [ Paper ]