Part 1: Group Exercise
We are interested in building a language model over a language with three words: A, B, C. Our training corpus is
AAABACBABBBCCACBCC
For the purposes of this problem, assume we do use an end token. (In class we experimented with omitting the end token to see what happened to the probabilities, but for studying for the final, let's return to the standard, correct approach of using one.)

First, train a unigram language model using maximum likelihood estimation. What are the probabilities? (Leave your answers in the form of fractions.)
P(A) =
P(B) =
P(C) =
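As a way to check your hand counts, here is a short Python sketch of unigram MLE on this corpus. It assumes the corpus is one sequence of 18 word tokens followed by a single end token (written "</s>" here), so the end token is counted in the denominator; that convention is an assumption of this sketch, not something fixed by the problem statement.

```python
from collections import Counter

# Training corpus as one token sequence; "</s>" is the end token (our convention).
corpus = list("AAABACBABBBCCACBCC") + ["</s>"]

counts = Counter(corpus)
total = len(corpus)  # 19 tokens including the end token

# MLE unigram probability: count(w) / total token count.
unigram = {w: counts[w] / total for w in counts}
```

With this convention each of A, B, C occurs 6 times out of 19 tokens.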

Next, train a bigram language model using maximum likelihood estimation. Fill in the probabilities below, leaving your answers in the form of fractions.
P(A|A) =
P(B|A) =
P(C|A) =
P(A|B) =
P(B|B) =
P(C|B) =
P(A|C) =
P(B|C) =
P(C|C) =
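The bigram counts can be checked the same way. This sketch again treats the corpus as one sequence ending in "</s>"; MLE conditions each word on the one before it, so P(w2 | w1) = count(w1 w2) / count(w1), where the context count includes every token except the last.

```python
from collections import Counter

corpus = list("AAABACBABBBCCACBCC") + ["</s>"]

# Counts of adjacent pairs, and of tokens appearing as a bigram context.
bigram_counts = Counter(zip(corpus, corpus[1:]))
context_counts = Counter(corpus[:-1])

# MLE bigram probability: count(w1 w2) / count(w1).
def p_bigram(w2, w1):
    return bigram_counts[(w1, w2)] / context_counts[w1]
```

Note that unseen bigrams (for example, B followed by the end token) get probability 0 under MLE, which matters for the perplexity questions below.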

Now evaluate your language models on the corpus
ABACABB
What is the perplexity of the unigram language model evaluated on this corpus?
What is the perplexity of the bigram language model evaluated on this corpus?
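A sketch of the perplexity computation, for checking your arithmetic. Perplexity is the inverse probability of the test corpus normalized by its length, PP = P(w1 ... wN)^(-1/N), where N here counts the end token. One assumption to flag: since the setup uses only an end token and no start token, this sketch scores the first word of the test corpus with its unigram probability; your class may use a different convention for the first word.

```python
import math
from collections import Counter

train = list("AAABACBABBBCCACBCC") + ["</s>"]
test = list("ABACABB") + ["</s>"]

uni = Counter(train)
N = len(train)
big = Counter(zip(train, train[1:]))
ctx = Counter(train[:-1])

def unigram_perplexity(tokens):
    # PP = exp(-(1/N) * sum of log probabilities)
    log_p = sum(math.log(uni[w] / N) for w in tokens)
    return math.exp(-log_p / len(tokens))

def bigram_perplexity(tokens):
    # Assumption: no start token, so the first word gets its unigram probability.
    log_p = math.log(uni[tokens[0]] / N)
    for w1, w2 in zip(tokens, tokens[1:]):
        p = big[(w1, w2)] / ctx[w1]
        if p == 0:
            return math.inf  # unseen bigram: zero probability, infinite perplexity
        log_p += math.log(p)
    return math.exp(-log_p / len(tokens))
```

Under the unsmoothed bigram model, the test corpus contains a bigram never seen in training, so its probability is zero and its perplexity is infinite; this is exactly the problem the smoothing question addresses.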

Now repeat everything above using add-1 (Laplace) smoothing.
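For checking the smoothed counts, here is a sketch of add-1 smoothing: add 1 to every count and the vocabulary size V to every denominator. One assumption to flag: this sketch includes the end token in the vocabulary, giving V = 4; if your class excludes it, the denominators change.

```python
from collections import Counter

train = list("AAABACBABBBCCACBCC") + ["</s>"]
vocab = sorted(set(train))  # {A, B, C, </s>} -> V = 4 (end token included, our convention)
V = len(vocab)

uni = Counter(train)
N = len(train)
big = Counter(zip(train, train[1:]))
ctx = Counter(train[:-1])

# Add-1 (Laplace) smoothed unigram: (count(w) + 1) / (N + V)
def p_uni_add1(w):
    return (uni[w] + 1) / (N + V)

# Add-1 smoothed bigram: (count(w1 w2) + 1) / (count(w1) + V)
def p_big_add1(w2, w1):
    return (big[(w1, w2)] + 1) / (ctx[w1] + V)
```

Notice that every bigram, including ones never seen in training, now gets nonzero probability, so the bigram perplexity on the test corpus becomes finite.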