MS&E334: The Structure of Social Data (Fall 2015)

Johan Ugander, Assistant Professor, MS&E
Email: jugander [at] stanford
Office location: Huang 357
Office Hours: Tu/Th, 4:30pm-5:30pm

Lecture hours: Tu/Th, 3:00pm-4:20pm
Lecture room: Thornton Center (Terman Annex), Rm 110

Course Description

Social networks have a rich history of study across many disciplines. Recent opportunities afforded by large-sale online instrumentation and experimentation have begun to provide a rich view of their structure and role in diverse social and economic domains. This course provides a survey of recent research in the study of social networks and large-scale social and behavioral data. Topics will include network models based on random graphs and their properties; centrality and ranking on graphs; ranking from comparisons; social influence and homophily; experimentation and causal inference on networks; heavy-tailed statistical distributions.

Most important links:

Lecture material

The literature below lays the foundation for the lecture material, though not all papers will be discussed in depth. If you have a focused interests in specific papers, feel free to come discuss them with me during office hours. The reference list will almost certainly be expanded in response to class discussions as the course progresses.

Week 1

Lecture 1: Course overview

An introduction to the course and high-level tour of content and goals.

Lecture 2: Graphs and graph properties

A review of graph definitions and properties. Graphical degree sequences.


Week 2

Lecture 3: Configuration Models

Configuration models are uniform distributions over specific spaces of graphs. We discussed the Simple Configuration Model and the Non-Simple Configuration Model, and how to uniformly sample from these two models.

Lecture 4: Erdos-Renyi, Preferential Attachment, Degree Assortativity
Power Law literature: Growth models: Not discussed: Benford’s Law, another "law" that in fact has a rigorous (if and only if) characterization that is directly related to scale-free distributions.

Week 3

Lecture 5: More graph models

Finishing preferential attachment; Stochastic Block Models; ERGMs. Other models.

SBMs: Planted partition model: ERGMs: Other models:
Lecture 6: The small-world phenomena (smallness and navigability)
Distance distributions:

Week 4

Lecture 7 & 8 : Graph centrality and ranking

Katz, Bonacich, Eigenvector,PageRank, Betweenness, Harmonic centrality. Personalized variations.

Foundational papers: More recent perspectives: Centrality, personalized:

Week 5

Lecture 9: Comparisons and ranking from comparisons

Thurstone and Bradley-Terry models; Elo ratings.

Example applications: Other methods that seek status embeddings:
Lecture 10: The friendship paradox
Friendship paradox literature: Applications of the friendship paradox:

Week 6

Lecture 11: Models of social processes: influence and contagion
Lecture 12: Influence maximization; complex contagion; Homophily and Influence

Week 7

Lecture 13: Causal Inference

Lecture 14: Causal Inference under Interference

Weeks 8

Lecture 15: More on interference; Clustering

Dissecting Papers

During weeks 8 and 9 the course will take on an active discursive style, aiming to synthesize what we've discussed as we dissect the methodologies of recent, complex applied papers.

Lecture 16: Get Out the Vote Experiment
Lecture 17: Complex contagion in technology adoption
Lecture 18: A friend-of-random field experiment

Break - Week of Thanksgiving

Week 10

In-class presentations of student projects.

Tools and Data

Here are some libraries that might be useful for the problem sets and projects:

Some data sources: