(SP25) CS 498 MLSys - Machine Learning System

Schedule (Tentative)

Date Lecturer Topics Slides
Jan 21 Minjia Zhang Course Introduction and Logistics pdf
Jan 23 Minjia Zhang Deep Learning Basics pdf
Jan 28 Minjia Zhang Transformers Deep Dive and Arithmetic Intensity pdf
Jan 30 Minjia Zhang Distributed Training Overview, Parameter Server, Asynchronous Training pdf
Feb 4 Minjia Zhang Data Parallelism, Communication Terminologies pdf
Feb 6 Minjia Zhang Tensor Slicing Model Parallelism pdf
Feb 11 Minjia Zhang Pipeline Parallelism pdf
Feb 13 Minjia Zhang Multi-Dimensional Parallelism pdf
Feb 18 Minjia Zhang Mixed Precision Training pdf
Feb 20 Minjia Zhang Memory Optimization, Rematerialization pdf
Feb 25 Minjia Zhang ZeRO-style Data Parallelism pdf
Feb 27 Minjia Zhang Training with Heterogeneous Memory pdf
Mar 4 Minjia Zhang Course Project Proposal Feedback
Mar 6 Minjia Zhang Course Project Proposal Feedback
Mar 11 Masahiro Tanaka (Guest Lecture) Advancing Large-scale and Efficient Deep Learning with DeepSpeed
Mar 13 Yanqi Zhou (Guest Lecture) Scaling LLMs: Modularity, Distribution, and Efficient Inference
Spring Break (March 17-21)
Mar 25 Minjia Zhang Inference Overview pdf
Mar 27 Minjia Zhang GPU Memory Hierarchy, FlashAttention Part 1 pdf
Apr 1 Minjia Zhang FlashAttention Part 2, LLM Inference pdf
Apr 3 Minjia Zhang Continuous Batching pdf