Deep Energy-Based Learning in Computer Vision

ECCV 2022 Tutorial

October 24th, 2022


Interest in deep energy-based learning has grown steadily, and the methods have advanced with it. A deep energy-based model specifies an explicit probability density, up to a normalizing constant, by using a modern bottom-up neural network to parameterize the energy function. The model can be trained by maximum likelihood estimation, with Langevin dynamics used to draw samples from the model. It unifies bottom-up representation and top-down generation in a single framework, which distinguishes it from other generative models such as the generative adversarial network (GAN) and the variational auto-encoder (VAE). This tutorial provides a brief introduction to current deep energy-based modeling and learning methodologies. It starts with the background of energy-based models from the perspective of computer vision, and then presents three categories of deep energy-based frameworks: deep energy-based models in data space, energy-based cooperative learning frameworks, and energy-based models in latent space. The tutorial aims to help researchers learn about current advances in deep energy-based learning and apply that knowledge to other domains.
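
As a rough illustration of Langevin dynamics-based maximum likelihood estimation, the sketch below trains a toy one-dimensional energy-based model. It is not taken from the tutorial materials. In place of a bottom-up neural network it assumes a quadratic energy E(x) = 0.5 * (x - theta)^2 with a single parameter theta, and the function names (`langevin_sample`, `train`) and all step sizes and iteration counts are hypothetical choices made for this example. The learning step uses the standard identity that the log-likelihood gradient equals the model expectation of dE/dtheta minus the data expectation of dE/dtheta, with the model expectation approximated by Langevin samples.

```python
import random


def energy_grad_x(x, theta):
    # Assumed toy energy: E(x) = 0.5 * (x - theta)^2, so dE/dx = x - theta.
    return x - theta


def energy_grad_theta(x, theta):
    # For the same toy energy, dE/dtheta = theta - x.
    return theta - x


def langevin_sample(theta, x0, rng, n_steps=30, step=1.0):
    # Langevin update: x <- x - (step^2 / 2) * dE/dx + step * Gaussian noise.
    x = x0
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x - 0.5 * step ** 2 * energy_grad_x(x, theta) + step * noise
    return x


def train(data, n_iters=200, lr=0.1, n_chains=64, seed=0):
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(n_iters):
        # Synthesize examples from the current model, starting from noise.
        synth = [
            langevin_sample(theta, rng.gauss(0.0, 1.0), rng)
            for _ in range(n_chains)
        ]
        # Monte Carlo estimate of the log-likelihood gradient:
        # E_model[dE/dtheta] - E_data[dE/dtheta].
        g_model = sum(energy_grad_theta(x, theta) for x in synth) / len(synth)
        g_data = sum(energy_grad_theta(x, theta) for x in data) / len(data)
        theta += lr * (g_model - g_data)  # gradient ascent on log-likelihood
    return theta


if __name__ == "__main__":
    data_rng = random.Random(1)
    data = [data_rng.gauss(2.0, 1.0) for _ in range(200)]
    print("data mean:", sum(data) / len(data))
    print("learned theta:", train(data))
```

For this toy energy the learned theta should settle near the mean of the data, since the update stops moving once the synthesized examples match the observed examples on average. With a neural-network energy the same loop applies, with the two gradient functions replaced by automatic differentiation.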


Jianwen Xie

Baidu Research, USA


Tutorial Video


Part I : Background
  • Knowledge Representation: Sets, Concepts and Models
  • Pattern Theory
  • Texture Modeling
  • Clique-Based Markov Random Field
  • FRAME (Filters, Random field, And Maximum Entropy)
  • Inhomogeneous FRAME Model
  • Sparse FRAME Model
  • Hierarchical Sparse FRAME Model
  • Deep FRAME Model
  • Deep Energy-Based Models – Generative ConvNet
  • Three Research Directions of Deep Energy-Based Learning
Part II : Deep Energy-Based Models in Data Space
  • Maximum Likelihood Estimation of Generative ConvNet
  • Mode Seeking and Mode Shifting
  • Adversarial Interpretations
  • Short-run MCMC for EBM
  • Multi-Grid Modeling and Sampling
  • Multi-Stage Coarse-to-Fine Expanding and Sampling
  • Energy-Based Image Inpainting
  • One-Sided Energy-Based Image-To-Image Translation
  • Patchwise Generative ConvNet for Internal Learning
  • Spatial-Temporal Generative ConvNet: EBMs for Videos
  • Generative VoxelNet: EBMs for 3D Voxels
  • Generative PointNet: EBMs for Unordered Point Clouds
  • Energy-Based Continuous Inverse Optimal Control
Part III : Deep Energy-Based Cooperative Learning
  • Generator Model as a Deep Latent Variable Model
  • Maximum Likelihood Learning of Generator Model
  • Two Generative Models: EBM vs. LVM
  • Cooperative Learning via MCMC Teaching
  • Cooperative Conditional Learning
  • Cycle-Consistent Cooperative Network
  • Generative Cooperative Saliency Prediction
  • Cooperative Learning via Variational MCMC Teaching
  • Cooperative Learning of EBM and Normalizing Flow
Part IV : Deep Energy-Based Models in Latent Space
  • Latent Space Energy-Based Prior Model
  • Learning by Maximum Likelihood
  • Prior and Posterior Sampling
  • Learning and Sampling Algorithm of Latent Space EBM
  • Conditional Latent Space EBM for Saliency Prediction