Catalog description
Machine understanding of human language. Computational semantics
(determination of sense, event structure, thematic role, time,
aspect, synonymy/meronymy, causation, compositional semantics,
treatment of scopal operators), and computational pragmatics and
discourse (coherence relations, anaphora resolution, information
packaging, generation). Theoretical issues, online resources, and
relevance to applications including question answering,
summarization, and textual inference. Prerequisites: one of
LING180, CS224N, CS224S; and knowledge of logic (LING130A or B,
CS157, or PHIL159).
Attendance will be taken daily, with one point assigned for
each class attended. Class will begin on time and end on time; we
are obliged to finish on time, and you are obliged to arrive on
time.
We would like everyone to ask questions, offer ideas, etc., in
class. Questions and ideas sent via email to the course address
also count as participation, though we would prefer it if everyone
got involved during our class meetings.
There are eight homeworks, due before the start of meetings 2-9
of the term. (After that point, the assignments are oriented
towards final projects.)
The homeworks will depend on materials from the readings, so
you should do the readings before starting the homeworks. With the
reading done, each homework should take you 15-20 minutes (longer
if you decide to pursue the issues in greater depth, perhaps as a
lead-in to a project).
Our goals for the homeworks: (i) to raise important questions,
(ii) to foster common ground for the in-class discussions, and
(iii) to help you master central NLU concepts.
All homeworks are due by the start of class on their due date.
Submit them by email to the course address.
The final project is the main assignment of the second half of
the course. Final projects can be done in groups of 1-3 people.
They must be related in a substantive way to at least one of the
central topics of the course. The main components are
as follows:
Literature review paper (due Feb 14, 11:59 pm): a short
paper (6 pages, single-spaced) summarizing and synthesizing 5
papers in the area of your final project. Groups of two
should review 7 papers, and groups of three should review
9. The ideal is to have the same topic for your lit review and
final project, but it's possible that you'll discover in the
lit review that you hate the topic, so you can switch topics
(or groups) for the final project; your lit review will be
graded on its own terms. Tips on major things to include:
General problem/task definition: What are these
papers trying to solve? Why?
Concise summaries of the articles: Do not
simply copy the article text in full. We can read them
ourselves. Put in your own words the major contributions of
each article.
Compare and contrast: Point out the similarities and
differences of the papers. Do they agree with each other?
Are results seemingly in conflict? If the papers address
different subtasks, how are they related? (If they are not
related, then you may have made poor choices for a lit
review...). This section is probably the most
valuable for the final project.
Future work: Make several suggestions for how
the work can be extended. Are there open questions to
answer? This would presumably include how the papers relate
to your final project idea.
Project milestone: a brief report covering the following:
A summary of previous approaches (drawing on the lit review).
A summary of the current approach.
A summary of progress so far: what you have done,
what you still need to do, and any obstacles or concerns
that might prevent your project from coming to
fruition.
The final paper will generally be one of two kinds:
Research papers: These are papers where you
attempted some new research idea. This doesn't have to be
publishable research; it's totally great to do a
replication of a result you read about. Such papers
should contain clear sections describing (i) the problem
you are addressing; (ii) your hypothesis or proposed
solution (and if you are implementing someone else's
solution, where you got the idea from); (iii) alternative
solutions, or at least a baseline that you are comparing
your solution to; (iv) your methodology; (v) your
evaluation; and (vi) some discussion of what your results
imply for your hypothesis/problem.
Implementation papers: These are papers where
you code up a version of someone else's algorithm just to
learn the details of the algorithm, or do a big semantic
data labeling project. Here you want clear sections
describing (i) the task that you are replicating, the
algorithm you are implementing, or the data you are
labeling; (ii) your methodology (what you did, how you did
it); (iii) an evaluation, i.e., the experimental results;
and (iv) a discussion of what you learned.
Each student will have a total of 4 free late (calendar) days
applicable to any assignment (including the lit review and project
milestone) except the final project paper. These can be used at
any time, no questions asked. Each 24 hours or part thereof that
a homework is late uses up one full late day. Once these late days
are exhausted, any homework turned in late will be penalized 20%
per late day. Late days are not applicable to final projects. If a
group's assignment is late n days, then each group member
is charged n late days.
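To make the arithmetic concrete, here is a minimal sketch in Python of the late-day accounting described above. It is illustrative only, not official course code; the function and constant names are our own.

    import math

    FREE_LATE_DAYS = 4           # per student, for the whole term
    PENALTY_PER_LATE_DAY = 0.20  # applied once free late days run out

    def late_days_used(hours_late):
        """Each 24 hours, or part thereof, counts as one full late day."""
        return math.ceil(hours_late / 24) if hours_late > 0 else 0

    def grade_multiplier(hours_late, free_days_remaining):
        """Return (score multiplier, free late days left) for one submission.
        For a group submission, every member is charged the same number
        of late days."""
        days = late_days_used(hours_late)
        covered = min(days, free_days_remaining)
        uncovered = days - covered
        multiplier = max(0.0, 1.0 - PENALTY_PER_LATE_DAY * uncovered)
        return multiplier, free_days_remaining - covered

    # Example: a homework turned in 30 hours late by a student with one
    # free late day left costs 2 late days in total: the remaining free
    # day plus one 20% penalty day.
    print(grade_multiplier(30, 1))  # -> (0.8, 0)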
On the one hand, we want to encourage you to pursue unified
interdisciplinary projects that weave together themes from
multiple classes. On the other hand, we need to ensure that final
projects for this course are original and involve a substantial
new effort.
To try to meet both these demands, we are adopting the
following policy on joint submission: if your final project for
this course is related to your final project for another course,
you are required to submit both projects to us by our final
project due date. If we decide that the projects are too similar,
your project will receive a failing grade. To avoid this extreme
outcome, we strongly encourage you to stay in close communication
with us if your project is related to another you are submitting
for credit, so that there are no unhappy surprises at the end of
the term. Since there is no single objective standard for what
counts as "different enough", it is better to play it
safe by talking with us.
Fundamentally, we are saying that combining projects is not a
shortcut. In a sense, we are in the same position as professional
conferences and journals, which also need to watch out for
multiple submissions. You might have a look
at the current
ACL/NAACL policy, which strives to ensure that any two papers
submitted to those conferences make substantially different
contributions; that is our goal here as well.
Students who may need an academic accommodation based on the
impact of a disability must initiate the request with the Student
Disability Resource Center (SDRC) located within the Office of
Accessible Education (OAE). SDRC staff will evaluate the request
with required documentation, recommend reasonable accommodations,
and prepare an Accommodation Letter for faculty dated in
the current quarter in which the request is being made. Students
should contact the SDRC as soon as possible since timely notice is
needed to coordinate accommodations. The OAE is located at 563
Salvatierra Walk (phone: 723-1066).
Readings
Ferrucci, David; Eric Brown; Jennifer Chu-Carroll;
James Fan; David Gondek; Aditya A. Kalyanpur;
Adam Lally; J. William Murdock; Eric Nyberg; John Prager;
Nico Schlaefer; and Chris Welty. 2010.
Building Watson: an overview of the DeepQA project.
AI Magazine 31(3): 59-79.
Optional advanced reading:
McCarthy, Diana; Rob Koeling; Julie Weeds; and John Carroll. 2004.
Finding predominant word senses in untagged text.
In Proceedings of ACL,
279-286.
Barcelona, Spain: ACL.
de Marneffe, Marie-Catherine; Bill MacCartney; and Christopher D. Manning. 2006.
Generating typed dependency parses from phrase structure parses.
In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006),
449-454.
Genoa, Italy: ELRA.
Optional
de Marneffe, Marie-Catherine and Christopher Manning. 2008.
The Stanford typed dependencies representation.
In Proceedings of the COLING 2008 Workshop on Cross-Framework and Cross-Domain Parser Evaluation,
1-8.
ACL.
Banko, Michele; Michael J. Cafarella; Stephen Soderland; Matt Broadhead; and Oren Etzioni. 2007.
Open information extraction from the web.
In Proceedings of IJCAI,
2670-2676.
Optional
Fillmore, Charles J. and B. T. Atkins. 1992.
Towards a frame-based lexicon: the case of RISK.
In Adrienne Lehrer and Eva F. Kittay, eds., Frames, Fields, and Contrasts,
75-102.
Hillsdale, NJ: Erlbaum Publishers.
Incredibly useful general resource
Pang, Bo and Lillian Lee. 2008.
Opinion mining and sentiment analysis.
Foundations and Trends in Information Retrieval 2(1-2):1-135.
Tan, Chenhao; Lillian Lee; Jie Tang; Long Jiang; Ming Zhou; and Ping Li. 2011.
User-level sentiment analysis incorporating social networks.
In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1397-1405.
San Diego, CA: ACM.
Jurafsky, Daniel and James H. Martin. 2009.
Speech and Language Processing, 2nd edition.
Chapter 21, Computational discourse, pp. 1-15.
Prasad, Rashmi; Nikhil Dinesh; Alan Lee; Eleni Miltsakaki; Livio Robaldo; Aravind Joshi; and Bonnie Webber. 2008.
The Penn Discourse Treebank 2.0. In Nicoletta Calzolari,
Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, and Daniel Tapias, eds.,
Proceedings of the Sixth International Language Resources and Evaluation (LREC'08).
Marrakech, Morocco: European Language Resources Association (ELRA).
DeVault, David and Matthew Stone. 2009.
Learning to interpret utterances using dialogue history.
In Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009), 184-192.
Athens: Association for Computational Linguistics.
Allen, James; Nathanael Chambers; George Ferguson; Lucian Galescu; Hyuckchul Jung; Mary Swift; and William Taysom. 2007.
PLOW: a collaborative task learning agent.
In Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 1514-1519.
Vancouver: AAAI Press.