\documentclass[11pt]{beamer}
\beamertemplatenavigationsymbolsempty
\usetheme{Madrid}
\usepackage[utf8]{inputenc}
\usepackage{amsmath,mathtools}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{xcolor}
\usepackage{soul}
\usepackage{graphicx}
\setbeamertemplate{itemize items}[default]
\setbeamertemplate{enumerate items}[default]
\title{450A Political Methodology I \\ TA Section 3}
\author{Vincent Bauer and William Marble}
\date{October 13, 2016}
%\title{}
%\setbeamercovered{transparent}
%\setbeamertemplate{navigation symbols}{}
%\logo{}
%\institute{}
%\date{}
%\subject{}
\begin{document}
\begin{frame}
\titlepage
\end{frame}
%\begin{frame}
%\tableofcontents
%\end{frame}
\begin{frame}{Road Map}
\begin{enumerate}
\item Multinomial Distribution
\item Multinomial Goodness of Fit
\item Paired t-test, blocking, research design (most of the slides)
\vspace{3mm}
\item R code
\begin{enumerate}
\item ggplot
\item reshape
\end{enumerate}
\end{enumerate}
\end{frame}
\begin{frame}{Multinomial Distribution}
Most of our hypothesis tests have compared proportions from a binary outcome variable, e.g.\ support for Hillary Clinton. But the homework introduced a more complicated scenario: a categorical outcome variable, e.g.\ preferences among four trade deals.
\vspace{3mm}
\pause
With a binary outcome we have implicitly been assuming a binomial distribution, which is a special case of the multinomial distribution with $k = 2$ outcomes.
\vspace{3mm}
\pause
A multinomial trials process is a sequence of independent, identically distributed random variables, each taking one of $k$ possible values.
\end{frame}
\begin{frame}{Multinomial Distribution}
The expected number of times outcome $i$ is observed over $n$ trials is
$$E[X_{i}] = n \cdot p_{i}$$
\pause
Similarly, the variances of these counts sit on the diagonal of the covariance matrix.
$$Var[X_{i}] = n \cdot p_{i}(1-p_{i})$$
\pause
The off-diagonal entries are the covariances between outcomes, and they are negative because increasing the probability of one outcome decreases the probability of the others.
$$Cov[X_{i}, X_{j}] = -n \cdot p_{i} \cdot p_{j}$$
\pause
If we care about proportions rather than counts, we divide by $n$: $E[X_{i}/n] = p_{i}$ and $Var[X_{i}/n] = p_{i}(1-p_{i})/n$.
\end{frame}
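These moment formulas are easy to check by simulation. A minimal sketch (in Python/numpy for illustration; the slides' own code is in R, and the values of \texttt{n} and \texttt{p} below are made up):

```python
import numpy as np

# Simulate many multinomial experiments and compare the empirical
# moments to the formulas on the slide. n and p are illustrative values.
rng = np.random.default_rng(0)
n = 100
p = np.array([0.2, 0.3, 0.5])
draws = rng.multinomial(n, p, size=200_000)  # one row per experiment

print(draws.mean(axis=0))   # close to n*p = [20, 30, 50]
print(draws.var(axis=0))    # close to n*p*(1-p) = [16, 21, 25]
print(np.cov(draws[:, 0], draws[:, 1])[0, 1])  # close to -n*p_1*p_2 = -6
```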
\begin{frame}{Multinomial Goodness of Fit}
What if we wanted to check whether two sets of statistics came from the same population?
\vspace{3mm}
\pause
Say the university released the following statistics about smoking rates on campus, but we are skeptical, so we decide to run our own survey.
\pause
\begin{table}[]
\centering
\caption{University Statistics}
\begin{tabular}{c|c|c|c}
Heavy & Regular & Occasional & Never \\ \hline
\\[-1em]
4.5\% & 7.5\% & 8.5\% & 79.5\%
\end{tabular}
\end{table}
The university's numbers are our expected outcomes.
\end{frame}
\begin{frame}[fragile]{Multinomial Goodness of Fit}
\footnotesize %change the font size. You can \scriptsize to get a smaller font.
<<>>=
library(MASS) #the data lives here
library(dplyr) #pipe operator
library(xtable) #make a table
#school statistics, ordered alphabetically to match
#the factor levels in the survey data: Heavy, Never, Occas, Regul
smoke.prob = c(.045, .795, .085, .075)
#survey results
smoke.freq = table(survey$Smoke) %>% prop.table() %>% round(3)*100
xtable(smoke.freq)
@
\end{frame}
\begin{frame}{Multinomial Goodness of Fit}
We want to test whether the results of our sample and the statistics presented by the university could have come from the same population.
\vspace{3mm}
\pause
We use Pearson's Chi-squared goodness-of-fit test
$$X^{2} = \sum\limits_{k} \dfrac{(f_{k} - e_{k})^{2}}{e_{k}}$$
\pause
\begin{table}[]
\centering
\begin{tabular}{ccccc}
& Heavy & Regular & Occasional & Never \\
Observed & 4.7 & 7.2 & 8.1 & 80.1 \\
Expected & 4.5 & 7.5 & 8.5 & 79.5 \\
& = & = & = & = \\
0.044 $\gets$ & 0.009 + & 0.012 + & 0.019 + & 0.005
\end{tabular}
\end{table}
\end{frame}
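The per-cell contributions in the table can be reproduced by hand from the formula. A minimal sketch (Python for illustration, using the observed and expected percentages from the slide):

```python
# Pearson's goodness-of-fit statistic, term by term.
observed = [4.7, 7.2, 8.1, 80.1]   # our survey: Heavy, Regular, Occasional, Never
expected = [4.5, 7.5, 8.5, 79.5]   # university figures
terms = [(f - e) ** 2 / e for f, e in zip(observed, expected)]
print([round(t, 3) for t in terms])  # [0.009, 0.012, 0.019, 0.005]
print(round(sum(terms), 3))          # 0.044, matching the table
```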
\begin{frame}[fragile]{Multinomial Goodness of Fit}
\footnotesize
<<>>=
suppressWarnings(chisq.test(smoke.freq, p=smoke.prob))
@
\normalsize
The p-value is greater than the .05 significance level, so we fail to reject the null hypothesis that the survey results and the campus-wide smoking statistics came from the same population.
\end{frame}
\begin{frame}
{\huge Paired t-test, blocking, research design}
\end{frame}
\begin{frame}{Example: Boys' Shoes}
Say you wanted to test the effectiveness of two materials for making shoes. Your outcome variable is the amount of wear on shoes made of material A and material B. \pause We therefore have two population parameters of interest.
$$\mu_{A} = E[Y_{A}]$$
$$\mu_{B} = E[Y_{B}]$$
\pause
Our null hypothesis is that the wear on the shoes is the same, i.e. $H_{0}: \mu_{A} = \mu_{B}$. \pause We have no prior beliefs about the relative strength of these materials, so we test the two-sided alternative hypothesis that the wear is not the same, i.e. $H_{A}: \mu_{A} \neq \mu_{B}$.
\end{frame}
\begin{frame}{Complete Random Assignment}
We only have 10 shoes of each type, and we are concerned that if the difference in quality between the shoes is not very large, our comparison will be polluted by noise from other differences and we will fail to find a difference where one actually exists (a Type II error).
\pause
\begin{itemize}
\item We think that boys and girls may have different wear patterns, so we control for gender and only allow boys into our study.
\item But we're still worried about variation within boys, so we decide to try out this random assignment thing that seems to be all the rage.
\item We call this experiment the ``Complete Random Assignment'' design.
\end{itemize}
\end{frame}
\begin{frame}[fragile]
\footnotesize
<<>>=
library(frt)
data(shoes)
set.seed(123)
#set up data for CRD
#the data is set up for the block design
shoeA <- sample(1:10, 5, replace=FALSE) #randomly pick 5 of the 10 boys to receive shoe A
shoes$CRA.wear <- shoes$CRA.shoe <- NA
shoes$CRA.shoe[shoeA] <- "A"
shoes$CRA.shoe[-shoeA] <- "B"
shoes$CRA.wear <- ifelse(shoes$CRA.shoe=="A", shoes$matA, shoes$matB) #get corresponding wear
shoes[1:5, c("boy", "CRA.shoe", "CRA.wear")]
@
\end{frame}
\begin{frame}[fragile]
We run our experiment and obtain the following amount of wear per boy.
<<>>=
library(ggplot2)
ggplot(shoes, aes(x=boy, y=CRA.wear, colour=CRA.shoe)) + geom_point() + scale_x_continuous(breaks = 1:10) + theme(panel.grid.minor = element_blank())
@
\end{frame}
\begin{frame}[fragile]
We run our two-sample t-test and the result is disappointing, so we sit at our desk dejectedly. Notice that shoe B has higher wear on average, but we cannot be sure that this is not due to chance.
\footnotesize %change the font size. You can \scriptsize to get a smaller font.
<<>>=
t.test(shoes$CRA.wear[shoes$CRA.shoe=="A"],
shoes$CRA.wear[shoes$CRA.shoe=="B"])
@
\end{frame}
\begin{frame}{Random Block design}
Staring at the floor, we realize that people have two feet and that we can give each boy one of each kind of shoe.
\pause
\begin{itemize}
\item We decide to give this randomization thing another try and randomize which shoe goes on which foot.
\item We promise our boss that we'll find a difference this time and they make twenty more shoes for us and we give them to the same boys.
\item We call this the ``Random Block Design''.
\end{itemize}
\end{frame}
\begin{frame}[fragile]
\footnotesize
<<>>=
library(reshape)
shoes2 <- melt(shoes, id.vars=c("boy"),
measure.vars=c("matA", "matB"))
shoes2[c(1:3, 17:20), ] #a few rows from each material
@
\end{frame}
\begin{frame}[fragile]
This time a different pattern emerges. Notice that the differences between the boys are much greater than the differences between the shoes.
<<>>=
library(reshape)
shoes2 <- melt(shoes, id.vars=c("boy"), measure.vars=c("matA", "matB"))
ggplot(shoes2, aes(x=boy, y=value, colour=variable)) + geom_point()+ scale_x_continuous(breaks = 1:10) + theme(panel.grid.minor = element_blank())
@
\end{frame}
\begin{frame}[fragile]
And we plot the differences themselves. Most boys show more wear on shoe B.
<<>>=
shoes$diff <- shoes$matB - shoes$matA #wear difference (B minus A) for each boy
ggplot(shoes, aes(x=boy, y=diff)) + geom_point() + geom_hline(yintercept = 0) + scale_x_continuous(breaks = 1:10) + theme(panel.grid.minor = element_blank())
@
\end{frame}
\begin{frame}[fragile]
Then we run a two-sample t-test on these newly randomized shoes, and are disappointed again.
\footnotesize %change the font size. You can \scriptsize to get a smaller font.
<<>>=
t.test(shoes$matB,shoes$matA)
@
\end{frame}
\begin{frame}[fragile]
Finally, we decide to pair these observations together.
\footnotesize %change the font size. You can \scriptsize to get a smaller font.
<<>>=
t.test(shoes$matB,shoes$matA, paired=TRUE)
@
\end{frame}
\begin{frame}[fragile]
We're curious what this test is doing, so we run a one-sample t-test on just the differences.
\footnotesize %change the font size. You can \scriptsize to get a smaller font.
<<>>=
t.test(shoes$diff)
@
\end{frame}
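The equivalence is easy to verify by hand: the paired t statistic is just the mean within-pair difference divided by its standard error. A sketch (Python for illustration) using the wear measurements from the classic boys' shoes example of Box, Hunter \& Hunter, which the \texttt{frt} data reproduces:

```python
from statistics import mean, stdev

# Wear for materials A and B on the same ten boys (classic example values).
matA = [13.2, 8.2, 10.9, 14.3, 10.7, 6.6, 9.5, 10.8, 8.8, 13.3]
matB = [14.0, 8.8, 11.2, 14.2, 11.8, 6.4, 9.8, 11.3, 9.3, 13.6]

# Paired t-test = one-sample t-test on the within-boy differences.
d = [b - a for a, b in zip(matA, matB)]
t = mean(d) / (stdev(d) / len(d) ** 0.5)
print(round(t, 2))  # about 3.35 -- significant despite only ten boys
```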
\begin{frame}{Analysis}
Q: Why did we observe a statistically significant difference in the second test but not the first? \\
\vspace{3mm}
\pause
Intuitive answer:
\begin{itemize}
\item The children in our study varied a lot in how they used their shoes, while the difference between the materials was relatively small.
\item Our first test did pick up higher wear on shoe B, but there was too much variance. If we had had many more children, we could have drawn a conclusion about this difference.
\end{itemize}
\vspace{3mm}
\pause
The math to figure out how many children we would need to detect a change of this magnitude is called a ``power calculation''.
\end{frame}
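A back-of-the-envelope version of such a power calculation, using a normal approximation (in R one would typically use \texttt{power.t.test}; this Python sketch and its \texttt{delta}/\texttt{sigma} values are illustrative, not taken from the data):

```python
from statistics import NormalDist

def approx_power(delta, sigma, n, alpha=0.05):
    """Approximate power of a two-sample test: probability of detecting a
    true mean difference delta with noise sigma and n subjects per group."""
    z = NormalDist().inv_cdf(1 - alpha / 2)      # critical value
    se = sigma * (2 / n) ** 0.5                  # SE of the difference in means
    return 1 - NormalDist(mu=delta / se).cdf(z)  # one-tail approximation

# A small effect relative to large between-subject noise: low power with
# few children, but power grows as we add more.
for n in (10, 100, 1000):
    print(n, round(approx_power(delta=0.4, sigma=2.5, n=n), 2))
```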
\begin{frame}
Let's think about this intuition more formally.
\vspace{3mm}
\pause
Wear ($Y_{A}$) is determined by (1) shoe material, (2) shoe usage, and (3) other random factors
\begin{align*}
Y_{A} & = \mu_{A} + \beta_{i} + \epsilon_{i, A}\\
Y_{B} & = \mu_{B} + \beta_{i} + \epsilon_{i, B}
\end{align*}
\pause
Once we put both shoes on the same boys, $\beta_{i}$ is the same in both equations.
\end{frame}
\begin{frame}
Graphically,
\begin{figure}
\centering
\includegraphics{graph.png}
\end{figure}
\end{frame}
\begin{frame}
Let's show formally why the Complete Random Assignment design fails to detect statistically significant results. There is no \ul{bias} in this design; it is just inefficient for our small sample size.
\vspace{3mm}
\pause
For simplicity we assume that the material is exactly the same in each shoe, i.e. $Var[\mu] \approx 0$; we make this assumption in most regressions.
\vspace{3mm}
But there is still variation between boys and other random variation.
$$\beta_{i} \sim N(0, \sigma_{\beta}^{2})$$
$$\epsilon_{i, A/B} \sim N(0, \sigma_{\epsilon}^{2}/2)$$
\pause
The $\frac{1}{2}$ is for convenience later and makes no difference to the proof.
\end{frame}
\begin{frame}
What is the variance of our estimate $Y_{A}$?
\pause
\begin{align*}
\onslide<2->{Y_{A} & = \mu_{A} + \beta_{i} + \epsilon_{i, A}\\}
\onslide<3->{Var[Y_{A}] & = Var[\mu_{A} + \beta_{i} + \epsilon_{i, A}]}
\onslide<4->{\intertext{By assumption, $\beta \perp \epsilon$, and by design both are $\perp \mu$}}
\onslide<5->{Var[Y_{A}] & = 0 + Var[\beta_{i}] + Var[\epsilon_{i, A}] \\}
\onslide<6->{Var[Y_{A}] & = \sigma_{\beta}^{2} + \sigma_{\epsilon}^{2}/2}
\end{align*}
\onslide<7->{
So the SE in our two-sample t-test is actually
$$\sqrt{\dfrac{\sigma_{\beta}^{2} + \sigma_{\epsilon}^{2}/2}{N_{A}} + \dfrac{\sigma_{\beta}^{2} + \sigma_{\epsilon}^{2}/2}{N_{B}}}$$
}
\onslide<8->{
We can tell graphically that $\sigma_{\beta}^{2} \gg \sigma_{\epsilon}^{2}/2$}
\end{frame}
\begin{frame}
There is nothing we can do about $\sigma_{\epsilon}^{2}/2$, but ideally we would like to minimize $\sigma_{\beta}^{2}$ because it is unrelated to the shoes. \\
\vspace{3mm}
Comparing variation within boys controls for $\sigma_{\beta}^{2}$.
\begin{align*}
\onslide<1->{D_{i} & = Y_{A} - Y_{B} \\}
\onslide<2->{& = (\mu_{A} + \beta_{i} + \epsilon_{i, A}) - (\mu_{B} + \beta_{i} + \epsilon_{i, B}) \\}
\onslide<3->{& = (\mu_{A} - \mu_{B}) + (\beta_{i} - \beta_{i}) + (\epsilon_{i, A} - \epsilon_{i, B}) \\}
\onslide<4->{Var[D_{i}]& = 0 + 0 + Var[\epsilon_{i, A} - \epsilon_{i, B}] \\}
\onslide<5->{& = \sigma_{\epsilon}^{2}/2 + \sigma_{\epsilon}^{2}/2 = \sigma_{\epsilon}^{2}}
\end{align*}
\end{frame}
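The derivation can be confirmed by simulation: pairing cancels the boy effect $\beta_{i}$ and leaves only the noise variance. A sketch (Python for illustration, with made-up sigma values):

```python
import numpy as np

# Generate outcomes following the model on the slide:
# Y = mu + beta_i + eps, with eps ~ N(0, sigma_e^2 / 2).
rng = np.random.default_rng(1)
n, sigma_b, sigma_e = 200_000, 2.0, 0.5
beta = rng.normal(0, sigma_b, n)                       # boy effect, shared
yA = 10.0 + beta + rng.normal(0, sigma_e / 2**0.5, n)
yB = 10.5 + beta + rng.normal(0, sigma_e / 2**0.5, n)

print(yA.var())         # close to sigma_b^2 + sigma_e^2/2 = 4.125
print((yA - yB).var())  # close to sigma_e^2 = 0.25 -- beta has cancelled
```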
\begin{frame}
Q: What does this toy example tell us about performing research?
\begin{itemize}
\item We will learn how to control for the effects of confounding variables in multivariate regression.
\item This is analogous to fixed effects or first differences: both cancel out the effects of shared confounders.
\item But we would not be able to estimate the boy-specific effects had we not given them both kinds of shoes.
\item Thinking carefully about design can make a huge difference.
\item Paired comparison design is a form of randomized block design.
\begin{enumerate}
\item Create blocks of relatively homogeneous subunits
\item Randomize treatment within blocks
\item Analyze appropriately
\end{enumerate}
\end{itemize}
\end{frame}
\begin{frame}
George Box: ``Block what you can; randomize what you cannot!''
\vspace{3mm}
\begin{enumerate}
\item Block to control for large, known, sources of variation
\item Randomize to eliminate bias from unknown sources of variation
\end{enumerate}
\vspace{3mm}
For example, in comparative politics:
\begin{itemize}
\item You could randomize treatment assignment between villages
\item But probably better to \ul{match} villages on observable characteristics, creating blocks or pairs, and then randomize within.
\end{itemize}
\end{frame}
\end{document}