CME 323: Distributed Algorithms and Optimization
Spring 2022, Stanford University
03/28/2022 - 06/10/2022
Lectures will be posted online (two per week)
Instructor: Reza Zadeh
Computer Science is evolving to utilize new hardware such as GPUs, TPUs, CPUs, and large commodity clusters thereof.
Many subfields such as Machine Learning and Optimization have adapted their algorithms to handle such clusters.
Topics include distributed and parallel algorithms for: Optimization, Numerical Linear Algebra, Machine Learning, Graph analysis, Streaming algorithms, and other problems that are challenging to scale on a commodity cluster.
The class will focus on analyzing programs, with some implementation using Apache Spark and TensorFlow.
The course will be split into two parts: first, an introduction
to fundamentals of parallel algorithms and runtime analysis on a single multicore machine.
Second, we will cover distributed algorithms running on a cluster of machines.
We will focus on the analysis of parallelism and distribution costs of algorithms.
Sometimes, topics will be illustrated with exercises
using Apache Spark and TensorFlow.
Pre-requisites: Targeting graduate students having
taken Algorithms at the level of CME 305 or CS 161.
Being able to competently program in any main-stream high level language.
There will be homeworks, a midterm, and a final exam.
by Guy E. Blelloch and Bruce M. Maggs [BB]
Introduction to Algorithms by Cormen, Leiserson, Rivest, Stein [CLRS]
by Holden Karau, Andy Konwinski, Patrick Wendell, Matei Zaharia [KKWZ]
TensorFlow for Deep Learning
by Bharath Ramsundar and Reza Zadeh [RZ]
by Kleinberg and Tardos [KT]
Homeworks will be assigned via Ed and due on Gradescope.
Lecture videos will be available on resources tab in Ed.
We will be hosting office hours via Zoom, however, we encourage students to post questions publicly on Ed.
The midterm and the final exam will be take-home and can be given anytime within a 24 hour window.
Homework 1 [pdf] Due April 14th | [soln]
Homework 2 [pdf] Due April 28th | [soln]
Homework 3 [pdf] Due May 19th | [soln]
Homework 4 [pdf] Due June 1st | [soln]
Lectures and References
- Lecture 1 (3/29): Introduction to Parallel Algorithms (PRAM Model, Work + Depth, Computation DAGs, Brent's Theorem, Parallel Summation)
- Lecture 2 (3/31): Scalability, Scheduling, All Prefix Sum Reading: BB 5.
Handbook of Scheduling,
Better Bounds for Online Scheduling
- Lecture 3 (4/5): All Prefix Sum, Mergesort. Reading: KT 5, BB 8.
Thinking in Parallel: Some Basic Data-Parallel Algorithms and Techniques,
Cole's parallel merge sort (1988)
- Lecture 4 (4/7): Divide and Conquer Algorithms, Master Theorem, Quick Selection, Quick Sort.
Linear time bounds for median select,
Prefix scan qsort
- Lecture 5 (4/12): Matrix Multiplication (Strassen's Algorithm).
- Lecture 6 (4/14): Graph Algorithms, Star Contraction, Minimum Spanning Tree (Boruvka's Algorithm. Reading: KT 3, 4.5, 4.6, CLRS 12, 13.
- Lecture 7 (4/19): Solving Linear Systems
- Lecture 8 (4/21): Optimization for Machine Learning, HOGWILD!
- Lecture 9 (4/26): Skipped for Midterm Review
- Lecture 10 (5/2): Midterm Exam
- Lecture 11 (5/3): Introduction to Distributed Algorithms and Spark
Intro to Spark,
Spark Cheat Sheet,
- Lecture 12 (5/5): Communication Networks, Cluster Computing, Broadcast Networks, and Communication Patterns
- Lecture 13 (5/10): Distributed Summation, Simple Random Sampling, Distributed Sort, Introduction to MapReduce
- Lecture 14 (5/12): Converting SQL to MapReduce, Matrix representations on a cluster, Matrix Computations in SQL and Spark
Matrix Computations and Optimization in Apache Spark,
Sparse matrix multiplication using SQL,
Sparse matrix multiplication in MapReduce.
- Lecture 15 (5/17): Partitioning for PageRank
Partitioning for Pagerank.
- Lecture 16 (5/19): Complexity Measures for MapReduce, Triangle Counting in a Graph
Counting Triangles and the Curse of the Last Reducer
- Lecture 17 (5/24): Singular Value Decomposition
Singular Value Decomposition.
- Lecture 18 (5/26): Covariance Matrices and All-pairs similarity
Lecture 18 ,
Covariance Matrices and All-pairs similarity,
Spring 2015: [class webpage]
Spring 2016: [class webpage]
Spring 2017: [class webpage]
Spring 2018: [class webpage]
Spring 2020: [class webpage]
Reza: rezab at stanford
Office hours: by appointment
Anuj Nagpal: anujnag at stanford
Office hours: Tue-Wed 4-6 PM
Zoom link: Here