CME 323: Distributed Algorithms and Optimization
Spring 2016, Stanford University
Mon, Wed 1:30 PM - 2:50 PM at Braun Lecture Hall, Mudd Chemistry Building
Instructor: Reza Zadeh
The emergence of large distributed clusters of commodity machines
has brought with it a slew of new algorithms and tools.
Many fields such as Machine Learning and Optimization
have adapted their algorithms to handle such clusters.
The class will cover widely used
distributed algorithms in academia and industry.
The course will begin with an introduction
to fundamentals of parallel and distributed runtime analysis. Afterwards,
we will cover distributed algorithms for:
- Convex Optimization
- Matrix Factorization
- Machine Learning
- Neural Networks
- Numerical Linear Algebra
- Large Graph analysis
- Streaming algorithms
We will focus on the analysis of parallelism and distribution costs of algorithms.
Sometimes, topics will be illustrated with hands-on exercises
using Apache Spark.
Pre-requisites: Targeting graduate students having
taken Algorithms at the level of CME 305 or CS 261.
Being able to competently program in any main-stream high level language.
There will be homeworks, a midterm, one scribed lecture, and a project.
Homeworks and scribing: 40%
The midterm will be Monday May 2nd in class
by Guy E. Blelloch and Bruce M. Maggs [BB]
Models of Computation
by John E. Savage [S]
Introduction to Algorithms by Cormen, Leiserson, Rivest, Stein [CLRS]
by Holden Karau, Andy Konwinski, Patrick Wendell, Matei Zaharia [KKWZ]
by Boyd and Vandenberghe [BV]
Algorithm Design, by Kleinberg and Tardos [KT]
Homework 1 [pdf] [tex] Due April 11th. [Solution]
Homework 2 [pdf] [tex] Due April 27th. [Solution]
Homework 3 [pdf] [tex] Due May 18th. [Solution]
Homework 4 [pdf] [tex] Due May 25th.
Lectures and References
- Lecture 1: Fundamentals of Distributed and Parallel algorithm analysis, Reading: BB Chapter 1, Notes
- Lecture 2: Caveats of Parallel Algorithms, the Master Theorem, and parallel matrix multiplication, Reading: KT Chapter 5, Notes, Strassen 1969
- Lecture 3: Parallel Strassen's Algorithm and Parallel Mergesort, Cole 1988, Notes
- Lecture 4: Parallel Mergesort, General Divide and Conquer, Parallel Selection, Parallel Quicksort, Blum et. al. 1971, Prefix-scan QSort, Reading: BB Chapter 2.1, CLRS 27.3. Notes
- Lecture 5: Minimum Spanning Trees, Boruvka's Algorithm, Boruvka (1926), Reading: KT Chapters 3, 4.5, 4.6, Notes
- Lecture 6: Review of Master Theorem, Closest Pair Problem, Master Theorem Examples, KT 5.4, Notes
- Lecture 7: Set Representation, Graph Contractions, Connectivity AVL Trees , Red Black Trees, Reading: Primer on BST's (Sections 6, 7), CLRS Ch. 12, 13 Notes
- Lecture 8: Multicore Optimization (Seperable Objective Functions), Job Scheduling. Hogwild! Notes
- Lecture 9: Job Scheduling, Intro to Distributed Computing, Intro to Streaming Algorithms. Notes
- Lecture 10: Intro to Distributed Computing, Communication Protocols, Midterm Review. Notes
- Lecture 11: Midterm, Midterm Solutions
- Lecture 12: Intro to MapReduce, Spark and other Distributed Computing Tools Intro to Spark, Intro to Distributed Optimization, Notes.
- Lecture 13: Gradient Descent in Spark, Communication Patterns. Intro to Distributed Optimization, Communication Patterns, Notes.
- Lecture 14: Measures of Complexity, Triangle Counting. Notes.
- Lecture 15: Wrap-Up Node Iterator (Triangle Counting), Combiners, Broadcasting, Spark in other Programming Languages. Curse of the Last Reducer , Broadcasting in Spark, Notes.
- Lecture 16: Sorting, Partitioning for Page Rank, Distributed Matrix Computations. Pregel Slides, Page Rank Slides, Pregel: A System for Large Scale Graph Processing, Scaling! But at what COST?. Notes.
- Lecture 17: Covariance Matrices and All-Pairs Similarity, Matrix Computations, General Convex Optimization. ADMM, Dimension Independent Similarity Computation. Notes.
Spring 2015: [class webpage]
Advanced Data Science on Spark: [slides]
Spark Intro Tutorial: [slides] [code and data - 1 GB]
Spark Devops Slides: Spark Summit slides
Reza: rezab at stanford
Office hours: by appointment
Andreas Santucci: santucci at stanford
Office hours: Tuesdays 3:15 pm - 5:15 pm
Nolan Skochdopole: naskoch at stanford
TA office hours will be held in the Huang Engineering Center basement
(in front of the ICME office)
Office hours: Mondays, 3:15 pm - 5:15 pm