Two-way ANOVA

STATS 191

2024-04-01

Outline

  • ANOVA models: regression models with only categorical predictors

  • Case studies:

    A. Seaweed regeneration time based on grazers

    B. Pygmalion effect

Case study A: Seaweed grazers

  • Treat: which animals were allowed in to graze

  • Block: which of the 8 blocks was measured

  • Cover: percentage of seaweed cover at the time of measurement

Case study B: Pygmalion effect

  • Company: ID of company

  • Treat: whether Pygmalion or Control

  • Score: outcome of interest

Additive model

Saturated model

Graphical check of additivity

Two-way ANOVA model

  • Generalization of \(t\)-test: more than one grouping variable.

  • Two-way ANOVA model:

    • \(r=8\) groups in first factor (Block)
    • \(m=6\) groups in second factor (Treat)
    • \(n_{ij}\) observations in each combination of factor levels.
  • Model: \[Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha \beta)_{ij} + \varepsilon_{ijk} , \qquad \varepsilon_{ijk} \sim N(0, \sigma^2).\]

  • In seaweed.df, \(r=8\), \(m=6\), \(n_{ij}=2\) for all \((i,j)\).
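
As a rough sketch of this layout, the code below simulates a balanced data set with the same dimensions as seaweed.df (\(r=8\), \(m=6\), \(n_{ij}=2\)) and fits the saturated model. The data frame `sim.df`, the effect sizes and the response values are all hypothetical; only the design matches the slides.

```r
# Synthetic stand-in for seaweed.df: same design (8 blocks x 6 treatments x
# 2 replicates), but the response values are made up for illustration.
set.seed(1)
r <- 8; m <- 6; n <- 2
sim.df <- expand.grid(rep   = 1:n,
                      Treat = factor(paste0("T", 1:m)),
                      Block = factor(paste0("B", 1:r)))

alpha <- rnorm(r)   # hypothetical block effects
beta  <- rnorm(m)   # hypothetical treatment effects
sim.df$Y <- 10 + alpha[as.integer(sim.df$Block)] +
  beta[as.integer(sim.df$Treat)] + rnorm(nrow(sim.df), sd = 0.5)

# n_ij observations in every (Block, Treat) cell
cell.counts <- table(sim.df$Block, sim.df$Treat)

# Saturated model: mu + alpha_i + beta_j + (alpha beta)_ij
sat.lm    <- lm(Y ~ Block * Treat, data = sim.df)
sat.anova <- anova(sat.lm)
print(sat.anova)
```

The `Df` column of the printed table should reproduce \(r-1\), \(m-1\), \((r-1)(m-1)\) and \((n-1)mr\) for this design.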

Parameterization

  • Some constraints are needed, again for identifiability. Let’s not worry too much about the details…

  • Constraints:

    • \(\sum_{i=1}^r \alpha_i = 0\)

    • \(\sum_{j=1}^m \beta_j = 0\)

    • \(\sum_{j=1}^m (\alpha\beta)_{ij} = 0, 1 \leq i \leq r\)

    • \(\sum_{i=1}^r (\alpha\beta)_{ij} = 0, 1 \leq j \leq m.\)
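
A quick parameter count (not on the original slide, but it follows directly from the constraints) shows why these constraints are the right number: they leave exactly one free parameter per cell mean.

\[\underbrace{1}_{\mu} + \underbrace{(r-1)}_{\alpha_i} + \underbrace{(m-1)}_{\beta_j} + \underbrace{(r-1)(m-1)}_{(\alpha\beta)_{ij}} = rm\]

With \(r=8\) and \(m=6\) this gives \(1 + 7 + 5 + 35 = 48 = 8 \cdot 6\), the number of (Block, Treat) cells.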

ANOVA table

| Source | df | E(MS) |
|--------|----|-------|
| A | \(r-1\) | \(\sigma^2 + nm\frac{\sum_{i=1}^r \alpha_i^2}{r-1}\) |
| B | \(m-1\) | \(\sigma^2 + nr\frac{\sum_{j=1}^m \beta_j^2}{m-1}\) |
| A:B | \((m-1)(r-1)\) | \(\sigma^2 + n\frac{\sum_{i=1}^r\sum_{j=1}^m (\alpha\beta)_{ij}^2}{(r-1)(m-1)}\) |
| Error | \((n-1)mr\) | \(\sigma^2\) |

Tests using the ANOVA table

  • Rows of the ANOVA table can be used to test several of the hypotheses we started out with.

  • Under \(H_0\): no interactions, i.e. \((\alpha\beta)_{ij} = 0\) for all \((i,j)\), in a balanced design with \(n_{ij} = n\):

\[F = \frac{MSAB}{MSE} = \frac{\frac{SSAB}{(m-1)(r-1)}}{\frac{SSE}{(n-1)mr}} \sim F_{(m-1)(r-1), (n-1)mr}\]
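
The formula above can be checked by hand against the output of anova. The sketch below uses a synthetic balanced data set (hypothetical factors `A` and `B`, pure-noise response) and rebuilds the interaction \(F\)-statistic from the sums of squares.

```r
# Synthetic balanced design; the response is pure noise, so H_0 holds.
set.seed(2)
r <- 8; m <- 6; n <- 2
sim.df <- expand.grid(rep = 1:n, B = factor(1:m), A = factor(1:r))
sim.df$Y <- rnorm(nrow(sim.df))

sat.lm <- lm(Y ~ A * B, data = sim.df)
tab    <- anova(sat.lm)

# Mean squares built directly from the table's sums of squares
MSAB <- tab["A:B", "Sum Sq"] / ((m - 1) * (r - 1))
MSE  <- tab["Residuals", "Sum Sq"] / ((n - 1) * m * r)

F.hand <- MSAB / MSE
p.hand <- pf(F.hand, (m - 1) * (r - 1), (n - 1) * m * r, lower.tail = FALSE)

# Should agree with the A:B row of the table
print(c(F.hand = F.hand, F.table = tab["A:B", "F value"]))
```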

Test of additivity

  • \(H_0\): pygm_add.lm is the true model
  • \(F\)-statistic \(\approx 0.67\) – very little evidence against \(H_0\).

  • Note: this \(F\)-statistic also appears in the interaction row of anova(pygm_sat.lm).
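
To illustrate why the two outputs agree, here is a minimal sketch on synthetic data. The data frame `sim.df`, its group sizes and the effect sizes are hypothetical stand-ins, not the Pygmalion data; `add.lm` and `sat.lm` play the roles of pygm_add.lm and pygm_sat.lm.

```r
# Synthetic stand-in: 10 companies, 3 scores each (sizes are an assumption).
set.seed(3)
sim.df <- data.frame(
  Company = factor(rep(paste0("C", 1:10), each = 3)),
  Treat   = factor(rep(c("Pygmalion", "Control", "Control"), times = 10))
)
sim.df$Score <- 70 + 5 * (sim.df$Treat == "Pygmalion") +
  rnorm(nrow(sim.df), sd = 4)

add.lm <- lm(Score ~ Company + Treat, data = sim.df)  # additive model
sat.lm <- lm(Score ~ Company * Treat, data = sim.df)  # saturated model

# Test of additivity: compare the two nested models
cmp <- anova(add.lm, sat.lm)
F.compare <- cmp$F[2]

# The same statistic is the interaction row of the saturated model's table
tab <- anova(sat.lm)
F.table <- tab["Company:Treat", "F value"]

print(c(F.compare = F.compare, F.table = F.table))
```

The agreement holds because the interaction is the last term entered, so its sequential sum of squares is exactly the drop in residual sum of squares from the additive to the saturated model.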

Case study A: seaweed regeneration

  • Before calling interaction.plot we’ll reorder the levels of Block as the book does for Fig. 13.9: by total response.
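
One way to do this reordering is with reorder, sketched below on synthetic data. The data frame `sim.df` and its values are hypothetical, and summing Cover within each block is an assumed reading of "total response".

```r
# Synthetic stand-in with the same design as the seaweed data.
set.seed(4)
sim.df <- expand.grid(rep   = 1:2,
                      Treat = factor(paste0("T", 1:6)),
                      Block = factor(paste0("B", 1:8)))
sim.df$Cover <- runif(nrow(sim.df), min = 5, max = 95)  # made-up percentages

# Reorder the levels of Block by the total response within each block
sim.df$Block <- reorder(sim.df$Block, sim.df$Cover, FUN = sum)
block.totals <- tapply(sim.df$Cover, sim.df$Block, sum)

# Roughly parallel traces would suggest additivity
with(sim.df, interaction.plot(Block, Treat, Cover))
```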

Interaction plot for seaweed

No transformation

Logit transformed
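
For a percentage such as Cover, the logit is the log-odds of the corresponding proportion, \(\log\left(p/(1-p)\right)\) with \(p = \text{Cover}/100\). A minimal sketch, using made-up percentages rather than the seaweed data:

```r
cover <- c(14, 23, 50, 77, 95)     # hypothetical percentages
p <- cover / 100                   # convert to proportions in (0, 1)

logit.cover <- log(p / (1 - p))    # logit, written out explicitly
back <- 100 * plogis(logit.cover)  # inverse transform recovers the percentages

print(round(logit.cover, 3))
```

Base R's qlogis(p) computes the same quantity. The transform is undefined at exactly 0% or 100%.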

Interaction plot for logit transformed data

ANOVA table

Another test for Treat

  • The two \(F\)-statistics differ in the estimate of \(\sigma^2\) used in the denominator.
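
A sketch of one plausible reading of this comparison, on synthetic data: the Treat row of the saturated model's table estimates \(\sigma^2\) from the saturated fit, while comparing a Block-only model to the additive model estimates it from the additive fit. The slides may pair the tests differently, and all names and values below are hypothetical.

```r
set.seed(5)
r <- 8; m <- 6; n <- 2
sim.df <- expand.grid(rep   = 1:n,
                      Treat = factor(paste0("T", 1:m)),
                      Block = factor(paste0("B", 1:r)))
sim.df$Y <- as.integer(sim.df$Treat) + rnorm(nrow(sim.df))  # made-up response

block.lm <- lm(Y ~ Block, data = sim.df)
add.lm   <- lm(Y ~ Block + Treat, data = sim.df)
sat.lm   <- lm(Y ~ Block * Treat, data = sim.df)

tab.sat <- anova(sat.lm)            # Treat row: sigma^2 from saturated model
cmp.add <- anova(block.lm, add.lm)  # sigma^2 from additive model

# Same numerator in both tests ...
MS.treat <- tab.sat["Treat", "Mean Sq"]

# ... but different estimates of sigma^2 (48 vs 83 residual df)
sigma2.sat <- tab.sat["Residuals", "Mean Sq"]
sigma2.add <- sum(resid(add.lm)^2) / df.residual(add.lm)

F.sat <- MS.treat / sigma2.sat
F.add <- MS.treat / sigma2.add
print(c(F.sat = F.sat, F.add = F.add))
```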

Inference about specific linear combinations

Do large fish have an effect on the regeneration ratio? If so, how much? (see Display 13.13)

Standard error
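
For a linear combination \(\sum_j c_j \beta_j\) with \(\sum_j c_j = 0\) in the balanced additive model, the estimate is \(\sum_j c_j \bar{Y}_{\cdot j \cdot}\) and its standard error is \(\hat{\sigma}\sqrt{\sum_j c_j^2 / (nr)}\). The sketch below checks this against lm on synthetic data. The treatment labels (C, L, f, Lf, fF, LfF) and the weights, which average the two comparisons of plots with and without large fish, are assumptions about how Display 13.13 sets up the question; they are not given in these notes.

```r
set.seed(6)
lev <- c("C", "L", "f", "Lf", "fF", "LfF")   # assumed treatment labels
sim.df <- expand.grid(rep   = 1:2,
                      Treat = factor(lev, levels = lev),
                      Block = factor(paste0("B", 1:8)))
sim.df$Y <- rnorm(nrow(sim.df))              # made-up response

add.lm <- lm(Y ~ Block + Treat, data = sim.df)

# Assumed large-fish contrast: (fF - f)/2 + (LfF - Lf)/2
cvec <- c(C = 0, L = 0, f = -1/2, Lf = -1/2, fF = 1/2, LfF = 1/2)

# Estimate and SE from the closed-form expressions
trt.means <- as.vector(tapply(sim.df$Y, sim.df$Treat, mean))
est <- sum(cvec * trt.means)
sigma.hat <- summary(add.lm)$sigma
n.per.trt <- 2 * 8                           # n * r observations per treatment
se <- sigma.hat * sqrt(sum(cvec^2) / n.per.trt)

# Same quantities from the fitted coefficients and their covariance matrix
b <- coef(add.lm)
full <- setNames(rep(0, length(b)), names(b))
full[paste0("Treat", lev[-1])] <- cvec[lev[-1]]  # baseline level C has weight 0
est.lm <- sum(full * b)
se.lm  <- sqrt(drop(t(full) %*% vcov(add.lm) %*% full))

print(c(est = est, se = se, est.lm = est.lm, se.lm = se.lm))
```

A \(t\)-based confidence interval would then use est plus or minus qt(0.975, df.residual(add.lm)) times se.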

Some caveats about R formulae

The interaction test is straightforward to form with our usual anova function approach, but we generally cannot test for main effects this way.
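
One way to see the problem, sketched on synthetic data with hypothetical factors `A` and `B`: removing a main effect from the formula while keeping the interaction does not give a smaller model. R recodes the interaction columns so that the "reduced" fit spans the same space as the saturated one, leaving nothing to compare.

```r
set.seed(7)
sim.df <- expand.grid(rep = 1:2, B = factor(1:3), A = factor(1:4))
sim.df$Y <- rnorm(nrow(sim.df))             # made-up response

full.lm <- lm(Y ~ A * B, data = sim.df)     # A + B + A:B
noA.lm  <- lm(Y ~ B + A:B, data = sim.df)   # attempt to drop the A main effect

# Both design matrices have 4 * 3 = 12 columns
p.full <- ncol(model.matrix(full.lm))
p.noA  <- ncol(model.matrix(noA.lm))

# Identical fits: same residual df and same residual sum of squares
print(c(df.full = df.residual(full.lm), df.noA = df.residual(noA.lm)))
print(c(rss.full = deviance(full.lm), rss.noA = deviance(noA.lm)))
```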