AI5: Generative AI: Deep Generative Models

About this course

Study the models that generate images, text, and audio, from diffusion models to GANs and autoregressive generators.

Course format. Thirteen weeks, four contact hours each: a two-hour lecture (concepts and theory) and a two-hour practice session. The course is project-based; teams carry one running project end to end and present it three times, in weeks 5, 8, and 13.

What you will build

Build a conditional generative pipeline in Python with PyTorch and Hugging Face Diffusers: implement and benchmark VAE, GAN, autoregressive, and diffusion generators on a shared data modality, add classifier-free guidance, and evaluate the results with FID.

Expected outcomes

Formalize generative modeling as learning a data distribution
Derive the variational lower bound and the VAE objective
Explain the adversarial minimax game and GAN training dynamics
Derive the diffusion forward and reverse processes and the denoising objective
Connect diffusion to score matching and stochastic differential equations
Build autoregressive generators and analyze their likelihood factorization
Implement conditional and guided generation including classifier-free guidance
Evaluate generative models with FID, likelihood, and sample-quality metrics
Analyze the likelihood, sampling, and mode-coverage trade-offs across model families
Deploy a conditional generative pipeline end to end

Key topics

Diffusion models
GANs & VAEs
Autoregressive generation
Evaluating generative output

Theoretical foundations

The concepts and results this course rests on.

maximum-likelihood estimation of a data distribution
the evidence lower bound and variational inference
the reparameterization trick and the variational autoencoder
the adversarial minimax game and the optimal discriminator
the autoregressive likelihood factorization and causal masking
the diffusion forward and reverse denoising processes
score matching and stochastic differential equations

Prerequisites

This is a Year-3 course. It assumes the mandatory CS core: data structures and algorithms, operating systems, computer networks, databases, software engineering, and the core mathematics (linear algebra, probability and statistics, calculus, discrete mathematics). It additionally requires the specific prior courses listed below.

Course-specific prerequisites:

Deep Learning
Probability and linear algebra

Weekly schedule

13 weeks · lecture + practice

Generative foundations

Wk 1

What is a generative model

Lecture: We define generative modeling, maximum likelihood, latent variables, and the taxonomy of model families.

Practice: Train a simple density estimator and sample from it.

Project: Choose the data modality and generative task for the running project.

Watch: Stanford CS236 Lecture 1: Introduction · Stanford CS236 Lecture 4: Maximum Likelihood Learning · Berkeley CS294-158 Lecture 1: Introduction
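As a rough illustration of the week's practice task, the sketch below fits a density estimator by maximum likelihood and samples from it. It assumes the simplest possible model, a one-dimensional Gaussian, and the helper names (`fit_gaussian`, `avg_log_likelihood`, `sample`) are hypothetical, not course starter code.

```python
import math
import random


def fit_gaussian(data):
    """Maximum-likelihood estimates (mean, std) for a 1-D Gaussian."""
    n = len(data)
    mean = sum(data) / n
    # The maximum-likelihood variance divides by n, not n - 1.
    var = sum((x - mean) ** 2 for x in data) / n
    return mean, math.sqrt(var)


def avg_log_likelihood(data, mean, std):
    """Average log-density of the data under N(mean, std^2)."""
    const = -0.5 * math.log(2.0 * math.pi * std ** 2)
    return sum(const - (x - mean) ** 2 / (2.0 * std ** 2) for x in data) / len(data)


def sample(mean, std, n, seed=0):
    """Draw n samples from the fitted model."""
    rng = random.Random(seed)
    return [rng.gauss(mean, std) for _ in range(n)]


data = [1.0, 2.0, 3.0]  # toy data, for illustration only
mean, std = fit_gaussian(data)
new_points = sample(mean, std, 5)
```

The fitted parameters maximize `avg_log_likelihood` over all Gaussians, which is the sense in which maximum likelihood "learns the data distribution" within a chosen model family.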

Wk 2

Latent variables and the ELBO

Lecture: We derive latent-variable models, the evidence lower bound, and variational inference.

Practice: Implement variational inference on a toy latent-variable model.

Project: Establish a probabilistic baseline generator for the project data.

Watch: Stanford CS236 Lecture 5: VAEs · Berkeley CS294-158 Lecture 4: Latent Variable Models and VAEs
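For reference, the bound derived in this week's lecture can be written compactly as follows. The notation is an assumption of this sketch rather than the course's own: $p_\theta$ is the generative model, $q_\phi(z \mid x)$ the variational posterior, and $p(z)$ the prior.

```latex
\begin{aligned}
\log p_\theta(x)
  &= \log \int p_\theta(x, z)\, dz \\
  &= \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log \frac{p_\theta(x, z)}{q_\phi(z \mid x)}\right]
     + \mathrm{KL}\!\left(q_\phi(z \mid x)\,\|\,p_\theta(z \mid x)\right) \\
  &\ge \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
     - \mathrm{KL}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right)
  \;=\; \mathrm{ELBO}(\theta, \phi; x)
\end{aligned}
```

The inequality drops the non-negative KL term to the true posterior, so the gap between the log-likelihood and the ELBO is exactly how far $q_\phi$ is from $p_\theta(z \mid x)$.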

Variational models

Wk 3

Variational autoencoders

Lecture: We derive the VAE, the reparameterization trick, and the reconstruction-versus-KL trade-off.

Practice: Train a VAE and explore its latent space by interpolation.

Project: Build a VAE generator for the project modality.

Watch: Stanford CS236 Lecture 6: VAEs
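The two ingredients named in the lecture can be sketched without a deep-learning framework. The example below assumes a diagonal-Gaussian posterior parameterized by `mu` and `log_var` and a standard-normal prior; the function names are hypothetical. In a real VAE these would be tensor operations so that gradients flow through `mu` and `log_var`.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I), elementwise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


rng = random.Random(0)
mu, log_var = [0.0, 1.0], [0.0, -2.0]  # hypothetical encoder outputs
z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
```

Moving the randomness into `eps` is what makes the sample a deterministic, differentiable function of the encoder outputs; the KL term is the regularizer that trades off against reconstruction quality.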

Wk 4

Expressive and discrete latents

Lecture: We cover hierarchical VAEs, vector-quantized latents, and posterior collapse.

Practice: Train a VQ-VAE and inspect the learned discrete codebook.

Project: Upgrade the generator with a discrete or hierarchical latent space.

Watch: Stanford CS236 Lecture 17: Discrete Latent Variable Models
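The core lookup behind vector-quantized latents is a nearest-neighbor search in a codebook. The sketch below shows only that step, with a hypothetical three-entry codebook; training a VQ-VAE additionally needs a straight-through gradient estimator and codebook and commitment losses, which are not shown here.

```python
def quantize(vector, codebook):
    """Return (index, code) of the codebook entry nearest to `vector`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(range(len(codebook)), key=lambda k: sq_dist(vector, codebook[k]))
    return index, codebook[index]


codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]  # hypothetical learned codes
index, code = quantize([0.9, 1.2], codebook)      # encoder output snaps to a code
```

The integer `index` is the discrete latent; inspecting how often each index is used is one simple way to examine a learned codebook.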

Adversarial models

Wk 5

Generative adversarial networks · Presentation

Lecture: We derive the GAN minimax objective, the optimal discriminator, and the Jensen-Shannon connection.

Practice: Team presentation: each team defends its generative specification and metrics.

Project: Lock the specification and prototype a GAN generator.

Watch: Stanford CS236 Lecture 9: GANs · Berkeley CS294-158 Lecture 5: GANs
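The lecture's central result can be checked numerically on a toy problem. The sketch below assumes discrete distributions over three outcomes with strictly positive probabilities (hypothetical values), computes the optimal discriminator D*(x) = p_data(x) / (p_data(x) + p_gen(x)), and evaluates the value function there. At D*, the value equals 2 · JSD(p_data, p_gen) − log 4.

```python
import math


def optimal_discriminator(p_data, p_gen):
    """D*(x) = p_data(x) / (p_data(x) + p_gen(x)) for each outcome x."""
    return [p / (p + q) for p, q in zip(p_data, p_gen)]


def gan_value(p_data, p_gen, d):
    """V(D, G) = E_data[log D(x)] + E_gen[log(1 - D(x))]."""
    return sum(p * math.log(dx) + q * math.log(1.0 - dx)
               for p, q, dx in zip(p_data, p_gen, d))


def js_divergence(p, q):
    """Jensen-Shannon divergence in nats (assumes all probabilities > 0)."""
    def kl(u, v):
        return sum(a * math.log(a / b) for a, b in zip(u, v))

    m = [(a + b) / 2.0 for a, b in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical toy distributions over three outcomes.
p_data = [0.5, 0.3, 0.2]
p_gen = [0.2, 0.3, 0.5]
d_star = optimal_discriminator(p_data, p_gen)
value = gan_value(p_data, p_gen, d_star)
gap = value - (2.0 * js_divergence(p_data, p_gen) - math.log(4.0))  # ~0 up to rounding
```

When the generator matches the data exactly, D* is 0.5 everywhere and the value reaches its minimum over generators, −log 4.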

Wk 6

Stabilizing GAN training

Lecture: We cover mode collapse, Wasserstein GANs, gradient penalties, and training stability.

Practice: Train a Wasserstein GAN and compare stability against the vanilla GAN.

Project: Improve the GAN with stabilized training and conditioning.

Watch: Berkeley CS294-158 Lecture 5: GANs

Autoregressive models

Wk 7

Autoregressive generation

Lecture: We derive the autoregressive likelihood factorization and causal masking for sequences and images.

Practice: Train an autoregressive model and sample token by token.

Project: Add an autoregressive generator and compare with the latent models.

Watch: Stanford CS236 Lecture 3: Autoregressive Models · Berkeley CS294-158 Lecture 2: Autoregressive Models
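Both ideas from the lecture fit in a few lines. The sketch below builds a causal mask and evaluates the chain-rule log-likelihood log p(x) = Σ_t log p(x_t | x_<t). The conditional model `uniform_model` is a hypothetical stand-in for a trained network and simply ignores its prefix.

```python
import math


def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def sequence_log_prob(tokens, cond_prob):
    """Chain rule: log p(x) = sum over t of log p(x_t | x_<t)."""
    return sum(math.log(cond_prob(tokens[t], tokens[:t]))
               for t in range(len(tokens)))


def uniform_model(token, prefix, vocab_size=4):
    """Hypothetical stand-in for a trained network: ignores the prefix."""
    return 1.0 / vocab_size


mask = causal_mask(3)
log_p = sequence_log_prob([0, 3, 1], uniform_model)  # equals 3 * log(1/4)
```

The lower-triangular mask is what lets a single forward pass compute every conditional in the factorization without any position seeing its own future.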

Diffusion models

Wk 8

Denoising diffusion · Presentation

Lecture: We derive the forward noising process, the reverse denoising process, and the simplified DDPM training objective.

Practice: Team presentation: interim demo of generated samples across model families.

Project: Prototype a diffusion generator for the project data.

Watch: Berkeley CS294-158 Lecture 6: Diffusion Models
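The forward noising process has a closed form that makes training cheap: x_t = sqrt(ᾱ_t) · x_0 + sqrt(1 − ᾱ_t) · ε, with ᾱ_t the running product of (1 − β_s). The sketch below assumes a linear β schedule with the defaults reported in the DDPM paper (1e-4 to 0.02 over 1000 steps) and a zero-based step index; the function names are hypothetical.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + step * t for t in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, noise):
    """x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    a = alpha_bars[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, noise)]


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
x0 = [1.0, -1.0]                                # toy "data point"
noise = [rng.gauss(0.0, 1.0) for _ in x0]
x_t = q_sample(x0, 999, alpha_bars, noise)      # almost pure noise at the last step
```

The simplified training objective then asks a network to predict `noise` from `x_t` and `t` under a squared-error loss; the network and the reverse sampler are omitted here.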

Wk 9

Score matching and SDEs

Lecture: We connect diffusion to score matching and stochastic differential equations and derive the probability-flow ODE.

Practice: Implement a score-based sampler and compare sampling schedules.

Project: Refine the diffusion model with score-based sampling.

Watch: Stanford CS236 Lecture 13: Score-Based Models · Stanford CS236 Lecture 16: Score-Based Diffusion Models
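A score-based sampler needs only the gradient of the log-density, not the density itself. The sketch below runs unadjusted Langevin dynamics against a known score, that of a 1-D Gaussian with hypothetical parameters, as a stand-in for a learned score network. The step size and step count are illustrative choices, not tuned values.

```python
import math
import random


def gaussian_score(x, mean, std):
    """Score of N(mean, std^2): d/dx log p(x) = -(x - mean) / std^2."""
    return -(x - mean) / std ** 2


def langevin_sample(score, x_init, step_size, num_steps, rng):
    """Unadjusted Langevin dynamics: x <- x + step * score(x) + sqrt(2 * step) * z."""
    x = x_init
    for _ in range(num_steps):
        x = x + step_size * score(x) + math.sqrt(2.0 * step_size) * rng.gauss(0.0, 1.0)
    return x


rng = random.Random(0)
samples = [langevin_sample(lambda x: gaussian_score(x, 3.0, 1.0), 0.0, 0.05, 200, rng)
           for _ in range(500)]
sample_mean = sum(samples) / len(samples)  # should land close to the target mean 3.0
```

Because the discretization is not corrected, the chain's stationary variance is slightly larger than the target's; shrinking `step_size` reduces this bias at the cost of more steps, which is one of the schedule trade-offs the practice session compares.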

Conditional generation

Wk 10

Guidance and conditioning

Lecture: We cover conditional diffusion, classifier and classifier-free guidance, and latent diffusion.

Practice: Add conditioning and classifier-free guidance to control generation.

Project: Make the generator controllable and conditional.

Watch: Berkeley CS294-158 Lecture 6: Diffusion Models
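Classifier-free guidance combines two noise predictions from the same network, one with the condition and one without. The sketch below assumes the parameterization in which a scale of 1 recovers the purely conditional prediction and 0 the unconditional one; the original paper's weight w corresponds to scale = 1 + w. The input values are hypothetical model outputs.

```python
def classifier_free_guidance(eps_uncond, eps_cond, guidance_scale):
    """Blend unconditional and conditional noise predictions, elementwise."""
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


eps_uncond = [0.0, 1.0]   # hypothetical prediction without the condition
eps_cond = [1.0, 3.0]     # hypothetical prediction with the condition
guided = classifier_free_guidance(eps_uncond, eps_cond, guidance_scale=2.0)
```

Scales above 1 extrapolate past the conditional prediction, which typically sharpens adherence to the condition at the expense of sample diversity.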

Evaluation

Wk 11

Evaluating generative models

Lecture: We cover likelihood, FID, inception score, precision-recall, and the difficulty of evaluating generation.

Practice: Compute FID and precision-recall across the team's models.

Project: Benchmark all project generators with shared metrics.
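FID is the Fréchet distance between two Gaussians fitted to feature statistics of real and generated samples: ‖μ_a − μ_b‖² + Tr(Σ_a + Σ_b − 2(Σ_a Σ_b)^{1/2}). The sketch below is a deliberately simplified version that assumes diagonal covariances, where the trace term reduces to a sum of squared differences of standard deviations. Real FID uses Inception features and full covariance matrices, so use a standard implementation for reported numbers.

```python
import math


def frechet_distance_diag(mean_a, var_a, mean_b, var_b):
    """Squared Frechet distance between Gaussians with diagonal covariances."""
    mean_term = sum((a - b) ** 2 for a, b in zip(mean_a, mean_b))
    # With diagonal covariances, Tr(S_a + S_b - 2 (S_a S_b)^(1/2)) reduces to
    # a sum of (sigma_a - sigma_b)^2 over dimensions.
    cov_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var_a, var_b))
    return mean_term + cov_term


# Hypothetical feature statistics for "real" and "generated" samples.
same = frechet_distance_diag([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0])
shifted = frechet_distance_diag([0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0])
```

The distance is zero only when both the means and the variances agree, which is why FID penalizes both low-quality samples and missing modes, though it cannot tell the two apart.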

Deployment

Wk 12

Efficient generation and serving

Lecture: We cover sampling acceleration, distillation, and the quality-versus-speed trade-off at inference.

Practice: Accelerate sampling with distillation or fewer steps and serve the model.

Project: Make the conditional generator fast and deployable.
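The "fewer steps" option usually starts with choosing a subsequence of the training timesteps. The sketch below shows one simple choice, evenly strided indices from noisiest to cleanest; the function name is hypothetical. It only selects the steps: the sampler update that makes skipping valid (for example a DDIM-style update) is not shown.

```python
def strided_timesteps(num_train_steps, num_sample_steps):
    """Evenly spaced timestep indices, from noisiest to cleanest."""
    if not 1 <= num_sample_steps <= num_train_steps:
        raise ValueError("num_sample_steps must be in [1, num_train_steps]")
    stride = num_train_steps // num_sample_steps
    return [t * stride for t in range(num_sample_steps)][::-1]


timesteps = strided_timesteps(1000, 50)  # 50 network evaluations instead of 1000
```

Sampling cost scales with the number of network evaluations, so the length of this list is the main knob in the quality-versus-speed trade-off.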

Capstone

Wk 13

Final defense · Presentation

Lecture: We synthesize VAEs, GANs, autoregressive, and diffusion models and survey open research directions.

Practice: Team presentation: final demo with samples, metrics, and an oral defense of design choices.

Project: Deliver the complete conditional generative pipeline with evaluation results.

Watch: Stanford CS236 Lecture 1: Introduction

AI tools in this course

Students use AI assistants to generate and refactor PyTorch VAE, GAN, autoregressive, and Diffusers pipeline code, vibe-coding the DDPM noise schedule and classifier-free guidance. They prompt AI to synthesize toy datasets, write reparameterization and sampling routines, and generate tests for FID and precision-recall scoring. AI also helps read sample grids, latent interpolations, and FID curves, diagnosing mode collapse or posterior collapse from the evidence.

Student project

Teams build one conditional generative system on a chosen data modality, implementing and comparing VAE, GAN, autoregressive, and diffusion approaches against shared metrics. The project culminates in a controllable, conditional generator backed by the probabilistic theory taught each week.

Requirements

Build a working system, not a set of disconnected exercises.
Be original: a new system that solves a real problem, not a re-implementation of a tutorial or course demo.
Show real depth: real data, real users or realistic load, and engineering trade-offs that are measured rather than assumed.
Carry one running project from specification to a deployed, defensible result across the whole term.
Work in a team of three or four and defend the design at each of the three presentations (weeks 5, 8, and 13).

Example projects

Conditional image synthesis
Text-to-image generation
Molecular structure generation
Audio and music generation
Anomaly detection via generative density
Data augmentation generator
Style transfer and image editing
Tabular synthetic-data generation

Assessment & grading

Grading is project-based, with no written exam. Teams of three or four present one running project three times.

Component	What it covers	Weight
Project · Specification	Presentation 1 (week 5): problem, objectives, and architecture	20%
Project · Interim	Presentation 2 (week 8): the working system demonstrated live	30%
Project · Final	Presentation 3 (week 13): end-to-end demo with oral defense	50%

Tools & platforms

PyTorch: model implementation and training
Hugging Face Diffusers: diffusion model pipelines
Hugging Face Transformers: autoregressive backbones
torchvision: image datasets and transforms
clean-fid: standardized FID evaluation
Weights and Biases: experiment tracking and sample logging
Accelerate: multi-device training
einops: tensor reshaping for generative models
NumPy: numerical computation
Matplotlib: sample and latent-space visualization
Gradio: interactive generation demos
ONNX Runtime: optimized inference

Free online courses

Existing free, video-based courses this course can build on, for self-study or as a teaching basis.

YouTube · Stanford CS236: Deep Generative Models (2023)
Full course: VAEs, GANs, flows, diffusion
YouTube · Berkeley CS294-158: Deep Unsupervised Learning (Spring 2024)
Abbeel: autoregressive, VAEs, GANs, diffusion

In Hebrew

Google Cloud (Coursera) · Introduction to Generative AI (in Hebrew)
Hebrew-narrated introduction to generative AI and how it differs from traditional machine learning; free to audit.
Dr. Amos Azaria, Ariel University (YouTube) · Deep Learning and NLP (course on deep learning and natural language processing)
Hebrew-language deep learning course providing the generative-model foundations (GANs, autoencoders, sequence generation).

Primary literature

Seminal works to read for graduate-level depth.

Paper · Denoising Diffusion Probabilistic Models
Ho, Jain, Abbeel, 2020
Paper · Generative Adversarial Networks
Goodfellow, Pouget-Abadie, Mirza, Xu, Warde-Farley, Ozair, Courville, Bengio, 2014
Paper · Auto-Encoding Variational Bayes
Kingma, Welling, 2013
Paper · Score-Based Generative Modeling through Stochastic Differential Equations
Song, Sohl-Dickstein, Kingma, Kumar, Ermon, Poole, 2021
Paper · High-Resolution Image Synthesis with Latent Diffusion Models
Rombach, Blattmann, Lorenz, Esser, Ommer, 2022

References

Books and resources link to an online or publisher page.

Textbook · Probabilistic Machine Learning: An Introduction
Murphy, 2022
Textbook · Deep Learning
Goodfellow, Bengio, Courville, 2016
Documentation · Hugging Face Diffusers Documentation
Hugging Face, 2026

Role in each concentration

Concentration	Role
Intelligent Software Systems	Elective
Networking & Cyber Security	Elective
AI & Robotics	Core · Semester 2
AI and Quantum Computing for Finance	Elective
Immersive Systems & Game Development	Core · Semester 2
Defense Technologies & Autonomous Systems	Elective
