The course assignments consist of
(i) attendance and class participation,
(ii) lab assignments,
(iii) reading summaries,
(iv) a final project presentation,
and (v) an open-ended research project. The breakdown is as follows.
The instructor will select 10 highly relevant papers in MLSys (mostly under 12 pages). You will read one paper per week (starting from Jan 27) and submit your reading summary by midnight on Friday of each week.
The reading summary should be done independently and include the following contents:
- The problem the paper is trying to tackle.
- What's the impact of the work, e.g., why is it an important problem to solve?
- The main proposed idea(s).
- A summary of your understanding of different components of the proposed technique, e.g., the purpose of critical design choices.
- Your perceived strengths and weaknesses of the work, e.g., novelty, significance of improvements, quality of the evaluation, ease of use.
- Is there room for improvement? If so, what ideas do you have for improving the techniques?
The reading summary should be around 4-5 paragraphs. The paragraphs do not need to be long, as long as each one lists the key points. You can discuss the paper with other students, but all of your written work must be your own.
In terms of grading criteria, each summary is worth 12 points in total. For each of the six items above, you receive:
- 2: The summary item demonstrates a clear understanding of the paper.
- 1: The summary item misses the point of the paper.
- 0: The summary item is missing.
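As a worked illustration of the arithmetic above, the rubric can be sketched in a few lines of Python. The item names and the `summary_score` helper are hypothetical labels chosen for this example, not part of any official grading tool; the sketch only assumes six review items, each scored 0, 1, or 2.

```python
# Hypothetical sketch of the rubric arithmetic; not an official grading tool.
# The item names below are illustrative labels for the six review items.
ITEMS = (
    "problem",
    "impact",
    "main_ideas",
    "component_understanding",
    "strengths_weaknesses",
    "room_for_improvement",
)
MAX_PER_ITEM = 2
MAX_TOTAL = MAX_PER_ITEM * len(ITEMS)  # 6 items x 2 points = 12


def summary_score(item_scores):
    """Sum the per-item scores (each 0, 1, or 2) for one reading summary."""
    total = 0
    for item in ITEMS:
        score = item_scores.get(item, 0)  # an item that is missing scores 0
        if score not in (0, 1, 2):
            raise ValueError(f"score for {item!r} must be 0, 1, or 2")
        total += score
    return total
```

For example, a summary that addresses every item clearly except one that misses the point would score 5 x 2 + 1 = 11 out of 12.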
Course Project
The course also includes proposing and completing a course project.
The project can involve, but is not limited to, any of the following tasks:
- Benchmark and analyze important DL workloads to understand their performance gaps and identify promising angles for optimizing their performance.
- Apply and evaluate how existing solutions work in the context of emerging AI/DL workloads.
- Design and implement new algorithms that are both theoretically and practically efficient.
- Design and implement system optimizations, e.g., parallelism, cache-locality, IO-efficiency, to improve the compute/memory/communication efficiency of AI/DL workloads.
- Offer customized optimization for critical DL workloads where latency is extremely tight.
- Build a library, tool, or framework to improve the efficiency of a class of problems.
- Integrate important optimizations into existing frameworks (e.g., DeepSpeed) to provide fast and agile inference.
- Combine system optimization with modeling optimizations.
- Combine and leverage hardware resources (e.g., GPU/CPU, on-device memory/DRAM/NVMe/SSD) in a principled way.
- ...
The project will be done in groups of 2-3 people and consists of a proposal, a mid-term report, a final presentation, and a final report. The tentative timeline for the project is as follows.
Late Submissions
All assignments are due on their respective due dates. Only on-time assignments will be accepted.
Computing Resources
We will be using computing resources at the National Center for Supercomputing Applications (NCSA).