MS&E334: The Structure of Social Data (Fall 2016)

Johan Ugander, Assistant Professor, MS&E
Email: jugander [at] stanford
Office location: Huang 357
Office Hours: TBD

Lecture hours: Tu/Th, 3:00pm-4:20pm
Lecture room: 160-317


Course Description

Social networks have a rich history of study across many disciplines. Recent opportunities afforded by large-sale online instrumentation and experimentation have begun to provide a rich view of their structure and role in diverse social and economic domains. This course provides a survey of recent research in the study of social networks and large-scale social and behavioral data. Topics will include network models based on random graphs and their properties; centrality and ranking on graphs; ranking from comparisons; social influence and homophily; experimentation and causal inference on networks; heavy-tailed statistical distributions.

Most important links:

Lecture material

The literature below lays the foundation for the lecture material, though not all papers will be discussed in depth. If you have a focused interests in specific papers, feel free to come discuss them with me during office hours. The reference list will almost certainly be expanded in response to class discussions as the course progresses.

Week 1

Lecture 1: Course overview (9/27)

An introduction to the course and high-level tour of content and goals.

Lecture 2: Graphs and graph properties (9/29)

A review of graph definitions and properties. Graphical degree sequences.

Literature:

Week 2

Lecture 3: Configuration Models (10/4)

Configuration models are uniform distributions over specific spaces of graphs. We discuss the Simple Configuration Model and non-simple Configuration Models, and how to uniformly sample from different graph spaces.

Lecture 4: Preferential Attachment, Growth, and Power Laws (10/6)
Power Law literature: Growth models: Not discussed: Benford's Law, another "law" that in fact has a rigorous (if and only if) characterization that is directly related to scale-free distributions.

Week 3

Lecture 5: More graph models (10/11)

Finishing preferential attachment; Stochastic Block Models; ERGMs. Other models.

SBMs: Planted partition model: ERGMs: Other models:
"Lecture 6": No class, Johan @ MIT (10/13)


Week 4

Lecture 7 & 8 : Graph centrality and ranking (10/18, 10/20)

Katz, Bonacich, Eigenvector,PageRank, Betweenness, Harmonic centrality. Personalized variations.

Foundational papers: More recent perspectives: Centrality, personalized:

Week 5

Lecture 9: Comparisons and ranking from comparisons (10/25)

Thurstone and Bradley-Terry models; Elo ratings.

Example applications: Other methods that seek status embeddings:
Lecture 10: The friendship paradox (10/27)
Friendship paradox literature: Applications of the friendship paradox:

Week 6

Lecture 11: Models of social processes: influence and contagion (11/1)
Lecture 12: Influence maximization; complex contagion; Homophily and Influence (11/3)


Week 7

Lecture 13: Causal Inference (11/8)

Lecture 14: Causal Inference under Interference (11/10)


Weeks 8

Lecture 15: Graph clustering and community detection

Lecture 16: The small-world phenomena (smallness and navigability)
Distance distributions:

Break - Week of Thanksgiving



Week 9: Dissecting Papers (11/29, 12/1)

During Week 9 the course will take on an active discursive style, aiming to synthesize what we've discussed as we dissect the methodologies of recent, complex applied papers. We will take a survey during Week 8 to determine the papers we want to discuss. In 2015 the following papers were discussed, though we will only do two papers this year:



Week 10

In-class presentations of student projects.



Tools and Data

Here are some libraries that might be useful for the problem sets and projects:

Some data sources: