CME 323: Distributed Algorithms and Optimization
Spring 2017, Stanford University
Mon, Wed 10:30 AM - 11:50 AM at 200-205
Instructor: Reza Zadeh
The emergence of large distributed clusters of commodity machines
has brought with it a slew of new algorithms and tools.
Many fields such as Machine Learning and Optimization
have adapted their algorithms to handle such clusters.
The class will cover distributed algorithms
widely used in academia and industry.
The course will begin with an introduction
to the fundamentals of parallel and distributed runtime analysis. Afterwards,
we will cover parallel and distributed algorithms for:
- Convex Optimization
- Matrix Factorization
- Machine Learning
- Neural Networks
- Numerical Linear Algebra
- Large Graph Analysis
- Streaming Algorithms
Class Format
We will focus on the analysis of parallelism and distribution costs of algorithms.
Some topics will be illustrated with hands-on exercises
using Apache Spark.
Prerequisites: The course targets graduate students who have
taken Algorithms at the level of CME 305 or CS 261
and who can program competently in any mainstream high-level language.
There will be homeworks, a midterm, and a final.
Grade Breakdown:
Homeworks: 40%
Midterm: 30%
Final: 30%
The midterm will be held in class on Monday, May 8th.
Required textbook:
Parallel Algorithms
by Guy E. Blelloch and Bruce M. Maggs [BB]
Optional textbooks:
Models of Computation
by John E. Savage [S]
Introduction to Algorithms by Cormen, Leiserson, Rivest, Stein [CLRS]
Learning Spark
by Holden Karau, Andy Konwinski, Patrick Wendell, Matei Zaharia [KKWZ]
Convex Optimization
by Boyd and Vandenberghe [BV]
Algorithm Design, by Kleinberg and Tardos [KT]
Homework
Homework 1 [pdf] [tex] Due April 19th. [soln]
Homework 2 [pdf] [tex] Due May 1st. [soln]
Homework 3 [pdf] [tex] Due May 22nd. [soln]
Homework 4 [pdf] [tex] Due June 7th.
Lectures and References
- Lecture 1: Fundamentals of Distributed and Parallel algorithm analysis. Reading: BB Chapter 1.
Lecture Notes
- Lecture 2: Scalable algorithms, Scheduling. Reading: BB 5.
Lecture Notes,
Handbook of Scheduling
- Lecture 3: Prefix Sum, Mergesort. Reading: KT 5, BB 8.
Lecture Notes,
Cole's parallel merge sort (1988)
- Lecture 4: Parallel quick-select, quicksort.
Lecture Notes,
Linear time bounds for median select,
Prefix scan qsort.
- Lecture 5: Quicksort, Strassen's Algorithm, Minimum Spanning Trees. Reading: KT 3, 4.5, 4.6.
Lecture Notes,
Boruvka (1926).
- Lecture 6: Graph contraction, star contraction, MST algorithms. Reading: CLRS 12, 13.
Lecture Notes.
- Lecture 7: (Stochastic) Gradient Descent, Parallel SGD (HOGWILD!). HOGWILD!, Omnivore.
Lecture Notes.
- Lecture 8: Intro to distributed computing, sampling, communication patterns.
Lecture Notes.
- Lecture 9: Network Topology and communication patterns. Distributed summation, and remarks on sorting.
Lecture Notes (draft).
- Lecture 10: Distributed sort, intro to map reduce, applications of map reduce.
Lecture Notes.
- Lecture 11: Midterm, Solution.
- Lecture 12: Map Reduce (indexing), Sparse Matrix Multiplies using SQL, Joins using Map Reduce.
Lecture Notes.
- Lecture 13: Joins using map reduce, measures of complexity, triangle counting. Curse of the Last Reducer.
Lecture Notes (a), Lecture Notes (b).
- Lecture 14: Triangle Counting in Map Reduce, matrix multiplies with a small matrix, optimization and gradient descent.
Lecture Notes (node iterator via map reduce), Lecture Notes (analysis of node iterator), Lecture Notes (matrix multiplies).
- Lecture 15: Data Flow Systems: Spark, MapReduce shortcomings.
Lecture Notes. Slides: Intro to DAO. Slides: Distributed Computations with MapReduce.
- Lecture 16: Optimization in Spark, Broadcasting, SGD on parameter servers.
Lecture Notes.
- Lecture 17: Distributed singular value decomposition, covariance matrix computation.
Lecture Notes.
- Lecture 18: Covariance matrix computation, optimization, review. DIMSUM.
Lecture Notes (Old).
- Lecture 19: Review.
Final Review.
Midterm Practice Problems. [pdf] [sol hints]
Previous Years
Spring 2015: [class webpage]
Spring 2016: [class webpage]
Contact
Reza: rezab at stanford
Office hours: by appointment
TA
Andreas Santucci: santucci at stanford
Office hours: Mondays 12-2, Wednesdays 12-1.
Wissam Baalbaki: baalbaki at stanford
Office hours: Tuesdays 3:30-5:30, Thursdays 3:30-4:30.
TA office hours will be held in the Huang Engineering Center basement
(in front of the ICME office).