Wednesday, February 6, 5:15 PM, 460-126
Probabilistic Knowledge in Human Language Comprehension and Production
UCSD
The talk covers two fundamental issues, one each in language
comprehension and production: what determines the difficulty of
comprehending a given word in a given sentence, and what factors
influence the choice a speaker makes when it is possible to express
a meaning in more than one way. The first half of the talk presents the
surprisal theory of processing difficulty, building on Hale (2001),
based on the premise that sentence comprehension involves the rational
and fully incremental application of probabilistic knowledge. On this
theory, the comprehender's probabilistic grammatical knowledge
determines expectations about the continuations of a sentence at
multiple structural levels; and these expectations determine the
difficulty of processing the words that are actually encountered. I
show how this theory can be applied to a number of results in online
sentence comprehension (Konieczny, 2000; Konieczny & Doering, 2003;
Jaeger et al., 2005) that are problematic for memory-oriented theories
such as the Dependency Locality Theory (Gibson 1998, 2000) or
Similarity-Based Interference (Gordon et al., 2001, 2004; Lewis et al.,
2006), yet were not covered by previous probabilistic theories because
the results do not involve resolution of structural ambiguity. Next, I
describe how surprisal can be derived in multiple ways from optimality
principles, and present results supporting the claim that processing
times in language comprehension truly are linear in negative
log-probability. I also describe recent experimental results on the
processing of extraposed relative clauses showing that a number of
results reported by Gibson & Breen (2005), originally interpreted in
terms of locality and phrasal adjacency, can be subsumed and generalized
under the rubric of surprisal.
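As a concrete illustration of the quantity at stake (a sketch, not material
from the talk), the snippet below computes per-word surprisal, -log2 P(word |
context), under a simple add-one-smoothed bigram model; the toy corpus,
smoothing choice, and function names are assumptions made purely for
illustration, and any probabilistic model that yields conditional word
probabilities could be substituted.

    import math
    from collections import Counter

    # Toy training data; a real application would estimate the model from a
    # large corpus (or a probabilistic grammar, as in the talk).
    corpus = ("the reporter who the senator attacked admitted the error "
              "the senator attacked the reporter").split()

    unigrams = Counter(corpus)
    bigrams = Counter(zip(corpus, corpus[1:]))
    vocab_size = len(unigrams)

    def bigram_prob(prev, word):
        """Add-one-smoothed estimate of P(word | prev)."""
        return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

    def surprisals(sentence):
        """(word, surprisal in bits) for each word given the preceding word."""
        words = sentence.split()
        return [(w, -math.log2(bigram_prob(prev, w)))
                for prev, w in zip(words, words[1:])]

    # Higher surprisal = less expected word = predicted to be harder to process.
    for word, bits in surprisals("the senator attacked the reporter"):
        print(f"{word:>10s}  {bits:5.2f} bits")

Under the linearity claim above, the bit values produced by such a model
would be proportional to the predicted processing times for the
corresponding words.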
The idea that probabilistic expectations drive processing difficulty
leads to the final proposal of the talk: that speakers make choices in
language production such that their utterances tend toward an optimal,
uniform level of information density. This last part of the talk
introduces the basic theory of uniform information density, and presents
an empirical study and model using the parsed Switchboard corpus to
investigate speaker choice in optional relativizer omission, such as (1)
below:
(1) How big is the family (that) you cook for __?
We find that speakers tend to use the optional relativizer "that" more often when the information density of the onset of the relative clause is higher. These results provide evidence in support of uniform information density as a locus of optimal production decisions.
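As a rough sketch of the logic behind that prediction (not the statistical
model actually fit to the Switchboard data), the snippet below inserts "that"
whenever the bare onset of the relative clause would carry more than a fixed
number of bits, spreading a high-information onset over two words; the
threshold and the example probabilities are invented for illustration.

    import math

    def rc_onset_surprisal(p_onset_given_context):
        """Information (in bits) carried by the first relative-clause word."""
        return -math.log2(p_onset_given_context)

    def choose_relativizer(p_onset_given_context, threshold_bits=6.0):
        # Insert "that" when the bare onset would be too information-dense,
        # so the same information is spread over two words instead of one.
        # The hard threshold is a stand-in for the statistical model fit in
        # the actual corpus study.
        if rc_onset_surprisal(p_onset_given_context) > threshold_bits:
            return "that"
        return ""

    # Hypothetical probabilities for the relative-clause onset ("you" in (1))
    # in a predictable vs. an unpredictable context.
    for p in (0.15, 0.005):
        rel = choose_relativizer(p)
        words = ["How big is the family"] + ([rel] if rel else []) + ["you cook for?"]
        print(f'P(onset) = {p:.3f} -> "{" ".join(words)}"')

The hard cutoff is only a stand-in for the corpus-based statistical analysis,
but it captures the same smoothing intuition: the relativizer is most useful
exactly where the relative clause would otherwise begin with an information
spike.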