Wednesday, February 6, 5:15 PM, 460-126

Probabilistic Knowledge in Human Language Comprehension and Production

Roger Levy

UCSD

The talk covers two fundamental issues, one each in language comprehension and production: what determines the difficulty of comprehending a given word in a given sentence, and what factors influence the choice that a speaker makes when it is possible to express a meaning in more than one way? The first half of the talk presents the surprisal theory of processing difficulty, building on Hale (2001), based on the premise that sentence comprehension involves the rational and fully incremental application of probabilistic knowledge. On this theory, the comprehender's probabilistic grammatical knowledge determines expectations about the continuations of a sentence at multiple structural levels; and these expectations determine the difficulty of processing the words that are actually encountered. I show how this theory can be applied to a number of results in online sentence comprehension (Konieczny, 2000; Konieczny & Doering, 2003; Jaeger et al., 2005) that are problematic for memory-oriented theories such as the Dependency Locality Theory (Gibson 1998, 2000) or Similarity-Based Interference (Gordon et al., 2001, 2004; Lewis et al., 2006), yet were not covered by previous probabilistic theories because the results do not involve resolution of structural ambiguity. Next, I describe how surprisal can be derived in multiple ways from optimality principles, and present results supporting the claim that processing times in language comprehension truly are linear in negative log-probability. I also describe recent experimental results on the processing of extraposed relative clauses showing that a number of results reported by Gibson & Breen (2005), originally interpreted in terms of locality and phrasal adjacency, can be subsumed and generalized under the rubric of surprisal.
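
For concreteness, the sketch below is a minimal Python illustration (not Levy's or Hale's actual implementation) of the core quantity: the surprisal of a word is its negative log-probability given the preceding context under some probabilistic language model, and higher surprisal predicts greater processing difficulty. The bigram-style probabilities are invented placeholders for illustration only.

    import math

    def surprisals(sentence, cond_prob):
        """Per-word surprisal in bits: -log2 P(w_i | w_1 ... w_{i-1}).

        cond_prob(word, context) may be any probabilistic language model
        returning P(word | context); on the surprisal theory, higher values
        predict greater processing difficulty at that word.
        """
        results, context = [], []
        for word in sentence:
            p = cond_prob(word, tuple(context))
            results.append((word, -math.log2(p)))
            context.append(word)
        return results

    if __name__ == "__main__":
        # Invented bigram-style probabilities, purely for illustration:
        toy = {
            ("", "the"): 0.20, ("the", "horse"): 0.01, ("horse", "raced"): 0.05,
            ("raced", "past"): 0.10, ("past", "the"): 0.30, ("the", "barn"): 0.02,
            ("barn", "fell"): 0.001,
        }
        cond = lambda w, ctx: toy.get((ctx[-1] if ctx else "", w), 0.01)
        for word, s in surprisals("the horse raced past the barn fell".split(), cond):
            print(f"{word:>6s}  {s:5.2f} bits")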

The idea that probabilistic expectations drive processing difficulty leads to the final proposal of the talk: that speakers make choices in language production such that their utterances tend toward an optimal, uniform level of information density. This last part of the talk introduces the basic theory of uniform information density, and presents an empirical study and model using the parsed Switchboard corpus to investigate speaker choice in optional relativizer omission, that is, relative clauses that can be produced either with or without an overt relativizer, as in "the person (that) I met".

We find that speakers tend to use the optional relativizer "that" more often when the information density of the onset of the relative clause is higher. These results provide evidence in support of uniform information density as a locus of optimal production decisions.
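
As a toy illustration of this logic (assumed numbers, not the Switchboard model or its estimates), the sketch below shows how producing "that" can lower the information peak at the relative-clause onset by spreading the same information over an extra word; on the uniform information density account, speakers should favor "that" precisely when that peak would otherwise be high.

    import math

    def bits(p):
        """Information content (surprisal) in bits of an event with probability p."""
        return -math.log2(p)

    # Hypothetical probabilities, invented purely for illustration.
    # Without the relativizer, the first word of the relative clause must also
    # signal that a relative clause is beginning, so it is less predictable:
    p_onset_without_that = 0.02   # P(RC-onset word | noun phrase, no "that")

    # With "that", the same information is spread over two words:
    p_that            = 0.30     # P("that" | noun phrase)
    p_onset_with_that = 0.25     # P(RC-onset word | noun phrase + "that")

    print("peak without 'that':", round(bits(p_onset_without_that), 2), "bits")  # ~5.64
    print("peak with 'that':   ",
          round(max(bits(p_that), bits(p_onset_with_that)), 2), "bits")          # 2.0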