2024-04-01
ANOVA models: regression models with only categorical explanatory variables
Case studies:
A. Seaweed regeneration time based on grazers
B. Pygmalion effect
Variables in the seaweed study:
Treat
: which animals were allowed in to graze
Block
: which of the 8 blocks was measured
Cover
: percentage of area covered by seaweed at the time of measurement
Variables in the Pygmalion study:
Company
: ID of company
Treat
: whether Pygmalion or Control
Score
: outcome of interest
Generalization of \(t\)-test: more than one grouping variable.
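Two-way ANOVA model:
\(r\) groups in the first factor (`Block`)
\(m\) groups in the second factor (`Treat`)
Model: \[Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha \beta)_{ij} + \varepsilon_{ijk} , \qquad \varepsilon_{ijk} \sim N(0, \sigma^2),\] for \(1 \leq i \leq r\), \(1 \leq j \leq m\), \(1 \leq k \leq n_{ij}\).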
In `seaweed.df`, \(r=8\), \(m=6\), \(n_{ij}=2\) for all \((i,j)\).
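To make the layout concrete, here is a minimal sketch on simulated data with the same dimensions (\(r=8\), \(m=6\), \(n_{ij}=2\)). The names `sim.df`, `sim_sat.lm`, `A`, `B` and the pure-noise response are assumptions for illustration only; they are not the seaweed data.

```r
# Hypothetical balanced two-way layout with the same shape as the seaweed design:
# r = 8 levels of A, m = 6 levels of B, n = 2 replicates per cell.
set.seed(1)
r <- 8; m <- 6; n <- 2
sim.df <- expand.grid(rep = 1:n, A = factor(1:r), B = factor(1:m))
sim.df$Y <- rnorm(nrow(sim.df))   # pure noise: all effects are zero

# Saturated two-way model: main effects plus interaction
sim_sat.lm <- lm(Y ~ A * B, data = sim.df)

table(sim.df$A, sim.df$B)     # every cell count is n = 2
length(coef(sim_sat.lm))      # r * m = 48 free mean parameters
df.residual(sim_sat.lm)       # (n - 1) * m * r = 48
```

The saturated model spends one parameter per cell, so the residual degrees of freedom come entirely from the replicates within cells.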
Some constraints are needed, again for identifiability. Let’s not worry too much about the details…
Constraints:
\(\sum_{i=1}^r \alpha_i = 0\)
\(\sum_{j=1}^m \beta_j = 0\)
\(\sum_{j=1}^m (\alpha\beta)_{ij} = 0, 1 \leq i \leq r\)
\(\sum_{i=1}^r (\alpha\beta)_{ij} = 0, 1 \leq j \leq m.\)
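One way to see what the constraints buy us: given any table of cell means \(\mu_{ij}\), there is exactly one decomposition into \(\mu\), \(\alpha_i\), \(\beta_j\), \((\alpha\beta)_{ij}\) satisfying them. The sketch below uses a made-up \(3 \times 2\) table (the numbers are hypothetical) and checks the four constraints numerically.

```r
# Hypothetical 3 x 2 table of cell means mu_ij (made-up numbers);
# rows are levels of the first factor, columns levels of the second.
mu <- matrix(c(10, 12, 17,
               14, 13, 24), nrow = 3)

grand <- mean(mu)                               # mu
alpha <- rowMeans(mu) - grand                   # alpha_i
beta  <- colMeans(mu) - grand                   # beta_j
ab    <- mu - grand - outer(alpha, beta, "+")   # (alpha beta)_ij

sum(alpha); sum(beta)        # both zero (up to rounding)
rowSums(ab); colSums(ab)     # all zero (up to rounding)
```

Adding the pieces back together, `grand + outer(alpha, beta, "+") + ab`, recovers `mu` exactly, so nothing is lost by imposing the constraints.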
Source | df | E(MS) |
---|---|---|
A | r-1 | \(\sigma^2 + nm\frac{\sum_{i=1}^r \alpha_i^2}{r-1}\) |
B | m-1 | \(\sigma^2 + nr\frac{\sum_{j=1}^m \beta_j^2}{m-1}\) |
A:B | (m-1)(r-1) | \(\sigma^2 + n\frac{\sum_{i=1}^r\sum_{j=1}^m (\alpha\beta)_{ij}^2}{(r-1)(m-1)}\) |
Error | (n-1)mr | \(\sigma^2\) |
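As a quick check of the df column, take the seaweed design stated earlier (\(r=8\), \(m=6\), and \(n=2\), where \(n\) denotes the common cell size \(n_{ij}\)):

\[(r-1) + (m-1) + (m-1)(r-1) + (n-1)mr = 7 + 5 + 35 + 48 = 95 = nmr - 1.\]

The four rows account for all \(96 - 1\) degrees of freedom, and the interaction \(F\)-test for this design has \((35, 48)\) degrees of freedom.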
Rows of the ANOVA table can be used to test several of the hypotheses we started out with.
Under \(H_0: (\alpha\beta)_{ij} = 0\) for all \((i,j)\) (no interactions), with \(n_{ij} = n\) in every cell:
\[F = \frac{MSAB}{MSE} = \frac{\frac{SSAB}{(m-1)(r-1)}}{\frac{SSE}{(n-1)mr}} \sim F_{(m-1)(r-1), (n-1)mr}\]
Null hypothesis: `pygm_add.lm` is the true model.
\(F\)-statistic \(\approx 0.67\): very little evidence against \(H_0\).
Note: this \(F\)-statistic also appears in `anova(pygm_sat.lm)`.
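The agreement between the two routes can be checked on simulated data. This is a sketch only: `sim.df`, `sim_add.lm`, `sim_sat.lm`, the factor `A` and the simulated scores are hypothetical stand-ins, not the Pygmalion data, so the \(F\) value will not be 0.67.

```r
# Hypothetical data: 5 levels of A, 2 levels of B, 2 replicates per cell.
set.seed(2)
sim.df <- expand.grid(rep = 1:2, A = factor(1:5),
                      B = factor(c("Control", "Pygmalion")))
sim.df$Y <- 70 + 5 * (sim.df$B == "Pygmalion") + rnorm(nrow(sim.df), sd = 6)

sim_add.lm <- lm(Y ~ A + B, data = sim.df)   # additive model (H_0)
sim_sat.lm <- lm(Y ~ A * B, data = sim.df)   # saturated model

# Route 1: compare the nested models directly
F_nested <- anova(sim_add.lm, sim_sat.lm)$F[2]

# Route 2: read the A:B row of the sequential ANOVA table
tab <- anova(sim_sat.lm)
F_row <- tab["A:B", "F value"]

# Route 3: by hand, MSAB / MSE
F_hand <- (tab["A:B", "Sum Sq"] / tab["A:B", "Df"]) /
          (tab["Residuals", "Sum Sq"] / tab["Residuals", "Df"])

c(F_nested, F_row, F_hand)   # three matching values
```

The match holds because the interaction is the last term added in the sequential table, so its sum of squares is exactly the drop in residual sum of squares from the additive to the saturated model.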
`interaction.plot`: we'll order `Block` as the book does for Fig 13.9, by total response.
[Figures: `seaweed` interaction plots, `logit` transformed data, `Treat`]
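A minimal sketch of how such a plot could be produced, on simulated data. Everything here is an assumption for illustration: the data frame `sim.df`, the labels `B1`..`B8` and `T1`..`T6`, the simulated `Cover` values, and the choice to order blocks by the total of the logit-transformed response (the notes do not say whether the raw or the transformed total is used).

```r
# Hypothetical stand-in for the seaweed data: 8 blocks, 6 treatments,
# 2 plots per cell, with made-up percent cover values.
set.seed(3)
sim.df <- expand.grid(rep = 1:2, Block = factor(paste0("B", 1:8)),
                      Treat = factor(paste0("T", 1:6)))
sim.df$Cover <- round(runif(nrow(sim.df), min = 1, max = 99))   # percent

# Logit of the proportion covered
p <- sim.df$Cover / 100
sim.df$logitCover <- log(p / (1 - p))

# Reorder the Block levels by total response within each block
sim.df$Block <- reorder(sim.df$Block, sim.df$logitCover, FUN = sum)

# Blocks along the x-axis, one trace per treatment
with(sim.df, interaction.plot(Block, Treat, logitCover))
```

Roughly parallel traces would suggest little interaction; with the blocks sorted by total response, departures from parallelism are easier to spot.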
`R` formulae:
While it is straightforward to form the interaction test using our usual `anova` approach, we generally cannot test for main effects this way.
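One way to see the difficulty (offered as an illustration, not necessarily the argument these notes go on to make): if we try to drop a main effect while keeping the interaction, `R` recodes the interaction columns so that the fitted model is unchanged. The sketch uses simulated data; `sim.df`, `sim_sat.lm` and `sim_noA.lm` are hypothetical names.

```r
# Hypothetical balanced data: 4 levels of A, 3 levels of B, 2 replicates per cell.
set.seed(4)
sim.df <- expand.grid(rep = 1:2, A = factor(1:4), B = factor(1:3))
sim.df$Y <- rnorm(nrow(sim.df))

sim_sat.lm <- lm(Y ~ A * B, data = sim.df)      # saturated model
sim_noA.lm <- lm(Y ~ B + A:B, data = sim.df)    # attempt to drop only A

# Both fits span the same space of cell means:
c(df.residual(sim_noA.lm), df.residual(sim_sat.lm))   # same residual df
c(deviance(sim_noA.lm), deviance(sim_sat.lm))         # same residual sum of squares
```

Since the two fits are identical, `anova(sim_noA.lm, sim_sat.lm)` has zero degrees of freedom for the comparison and yields no test of the main effect of `A`.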