Theory and Application of Energy-Based Generative Models

ICCV 2021 Tutorial

October 16th, 2021
The tutorial has been recorded and is available with the slides below.


In recent years, there has been growing interest in ConvNet-parametrized energy-based generative models (EBMs). This framework addresses the concurrent needs for representation, generation, efficiency and scalability in generative modeling. Specifically, unlike popular generative models such as Generative Adversarial Nets (GANs) and Variational Auto-encoders (VAEs), the energy-based generative model unifies bottom-up representation and top-down generation in a single framework, and can be trained by "analysis by synthesis" without recruiting an auxiliary model. Both the model parameter update and data synthesis can be computed efficiently by back-propagation, and the model is easy to design and scale up. The expressive power and advantages of this framework have launched a series of research works leading to significant theoretical and algorithmic maturity. Owing to these advantages over conventional models, energy-based generative models are now used in many computer vision tasks. The tutorial will provide a comprehensive introduction to energy-based generative modeling and learning in computer vision. It will develop an intuitive and systematic understanding of the underlying learning objective and sampling strategy, and present the different types of computer vision tasks that energy-based generative frameworks have successfully solved. Besides introducing the energy-based framework and its state-of-the-art applications, the tutorial aims to enable researchers to apply energy-based learning principles in other contexts of computer vision.
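To make the "analysis by synthesis" idea concrete, here is a minimal sketch on a toy problem. It is not taken from the tutorial materials: the one-dimensional energy E(x) = 0.5 * (x - theta)^2, the function names, and all hyperparameters are assumptions chosen for illustration, and the gradients are written out by hand where a ConvNet-parametrized EBM would obtain them by back-propagation. The synthesis step draws samples from the current model with Langevin dynamics, and the analysis step updates theta from the difference between observed and synthesized statistics.

```python
import random


def energy_grad_x(x, theta):
    """dE/dx for the toy energy E(x) = 0.5 * (x - theta)^2 (an assumed example)."""
    return x - theta


def langevin_sample(theta, n_samples, n_steps, step_size, rng):
    """Synthesis: run Langevin dynamics, starting each chain from Gaussian noise."""
    samples = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for _ in range(n_steps):
        # x <- x - (s^2 / 2) * dE/dx + s * noise
        samples = [
            x - 0.5 * step_size ** 2 * energy_grad_x(x, theta)
            + step_size * rng.gauss(0.0, 1.0)
            for x in samples
        ]
    return samples


def train(data, n_iters=150, lr=0.1, n_steps=60, step_size=0.5, seed=0):
    """Analysis: maximum likelihood update of theta using synthesized samples."""
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(n_iters):
        synth = langevin_sample(theta, len(data), n_steps, step_size, rng)
        # Log-likelihood gradient = E_model[dE/dtheta] - E_data[dE/dtheta].
        # Here dE/dtheta = -(x - theta), so it reduces to mean(data) - mean(synth).
        grad = sum(data) / len(data) - sum(synth) / len(synth)
        theta += lr * grad  # gradient ascent on the log-likelihood
    return theta


if __name__ == "__main__":
    rng = random.Random(1)
    observed = [rng.gauss(2.0, 1.0) for _ in range(100)]
    print("learned theta:", train(observed))
    print("data mean:    ", sum(observed) / len(observed))
```

In this toy setting the model is a unit-variance Gaussian centered at theta, so the learned theta should settle near the mean of the observed data. The chains here are run long enough to approximately mix; with far fewer Langevin steps the synthesized samples would be biased toward their noise initialization, which is the regime the outline refers to as short-run MCMC.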


Jianwen Xie

Baidu Research, USA

Ying Nian Wu

University of California, Los Angeles


Click here to download

Tutorial Video


Part I : Fundamentals
1. Background
  • Probabilistic models of images
  • Gibbs distribution in statistical physics
  • Filters, Random Fields and Maximum Entropy (FRAME) models
  • Generative ConvNet: EBM parameterized by modern neural network
2. Elements of Energy-Based Generative Learning
  • Understanding Kullback-Leibler divergences
  • Maximum likelihood learning, analysis by synthesis
  • Gradient-based MCMC and Langevin sampling
  • Adversarial self-critic interpretations
  • Short-run MCMC for synthesis in EBMs
  • Equivalence between EBMs and discriminative models
Part II : Advanced
1. Strategy for Efficient Learning and Sampling
  • Multi-stage expanding and sampling for EBMs
  • Multi-grid learning and sampling for EBMs
  • Learning EBM by recovery likelihood
2. Energy-Based Generative Frameworks
  • Generative cooperative network
  • Divergence triangle
  • Latent Space Energy-Based Prior Model
  • Flow contrastive estimation of energy-based model
Part III : Applications
1. Energy-Based Generative Neural Networks
  • Generative ConvNet: EBMs for images
  • Spatial-Temporal Generative ConvNet: EBMs for videos
  • Generative VoxelNet: EBMs for 3D volumetric shapes
  • Generative PointNet: EBMs for unordered point clouds
  • EBMs for inverse optimal control and trajectory prediction
  • Patchwise Generative ConvNet: EBMs for internal learning
2. Energy-Based Generative Cooperative Networks
  • Unconditional image, video, and 3D shape synthesis
  • Supervised conditional learning
  • Unsupervised image-to-image translation
  • Unsupervised sequence-to-sequence translation
  • Generative saliency prediction
3. Latent Space Energy-Based Models
  • Text generation
  • Molecule generation
  • Anomaly detection
  • Saliency prediction using transformer with energy-based prior
  • Trajectory prediction
  • Semi-supervised learning
  • Controlled text generation