Due: Friday, January 30 (by noon)
Submit assignments electronically to all three teachers
(ron.kaplan "at" microsoft.com, tracy.king "at" microsoft.com, mforst "at" parc.com)
Turn in:
1. the final grammar you end up with (eng-week2.lfg for PART 1-3 and eng-week2-mltsec.lfg for PART 4);
   please name it lastname-eng-week2.lfg
2. your revised testsuite with parse statistics (eng-week2-test.lfg.new);
   please name it lastname-eng-week2-test.lfg.new
3. a rough estimate of how long this took so that we can adjust future assignments as needed
Exercises on:
PART 1: templates
PART 2: testsuites
PART 3: feature declarations
PART 4: multiple lexicon and grammar sections
Part 4 is completely separate from the other parts; if you get stuck on the other parts, try this for a change.
Start from the grammar eng-week2.lfg
Do not use punctuation or capital letters; in later grammars we will add these in.
If you put a file called xlerc in the directory with your grammar and in xlerc you put:
create-parser eng-week2.lfg
then whenever you start xle in that directory, it will automatically load eng-week2.lfg. This will save a lot of time when making and testing changes.
If you look at the lexicon in your current grammar, you will see that a lot of material is repeated. This can lead to mistakes and makes it difficult to maintain grammars because if you change an analysis you have to make the change in many places. To capture regularities, XLE/LFG has a formal device called a template.
Use the existing templates as models to redo the lexicon using templates (see the entry for orange and the templates it calls). Your resulting lexicon should have no ^ in it, although some lexical entries may call more than one template. You can use the following template names; feel free to add additional templates to group these or to have these templates call other more basic ones:
To see how a lexical entry expands, on the xle command line try:
print-lex-entry orange
When you make a change to the templates, you must restart XLE for it to take effect. Templates are like grammar rules in this respect. XLE should warn you if you forgot to restart.
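As a rough sketch of the pattern (the word papaya and the PRED and NUM templates below are illustrative assumptions, not necessarily what eng-week2.lfg contains), an entry that spells out its annotations directly can be rewritten as a call to a template that bundles them. The two papaya entries are alternatives; only one would appear in a real lexicon:
"hypothetical entry with inline annotations"
papaya N * (^ PRED)='papaya' (^ NUM)=sg.
"the same hypothetical entry rewritten as a template call"
papaya N * @(NOUN-SG papaya).
"in the templates section; PRED and NUM are assumed helper templates"
NOUN-SG(_P) = "template for singular nouns"
   @(PRED _P) @(NUM sg).
PRED(_P) = (^ PRED)='_P'.
NUM(_N) = (^ NUM)=_N.
If the analysis of singular nouns changes later, only NOUN-SG needs to be edited, not every noun entry.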
Templates can be called from the grammar rules. Look at the templates:
UP-OBJ = "annotation to assign object function"
   @(UP-GF OBJ).
UP-GF(_GF) = "generic annotation to assign a grammatical function"
   (^ _GF)=!.
In the VP rule, replace:
(^ OBJ)=!
with:
@UP-OBJ
Restart the grammar and parse:
the monkey devoured a banana
To see how your new rule expands, on the xle command line try:
print-rule VP
Create similar templates and calls for SUBJ, OBJ2, and OBL.
Also create templates and calls for CASE and for the ! $ (^ ADJUNCT) annotation.
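One possible shape for these, by analogy with UP-OBJ, is sketched below. The template name DOWN-CASE, the case value nom, and the form of the S rule are assumptions made for illustration; your grammar may organize these differently:
UP-SUBJ = "annotation to assign subject function"
   @(UP-GF SUBJ).
DOWN-CASE(_C) = "hypothetical template: assign case _C to the daughter node"
   (! CASE)=_C.
"hypothetical rule using both calls"
S --> NP: @UP-SUBJ @(DOWN-CASE nom);
      VP.
Because DOWN-CASE takes the case value as a parameter, a single template can serve every rule that assigns case.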
Turn in: Submit the final version of your grammar including the additions for PART 3 (that is, the changes for this part and PART 3 can be included in the same grammar file).
As the grammar expands, it is very easy to make changes that affect sentences in ways you did not expect. To help detect this problem, you can create a testsuite.
Look at the basic testsuite eng-week2-test.lfg (the Emacs library works best if you name your testsuite with a .lfg suffix). # is used to introduce comments. Each item to be parsed is on its own line, surrounded by blank lines. The default parse category is defined by ROOTCAT in the grammar; here it is S. If you want to parse another category, put the category name and a colon before the item (e.g., NP: a monkey).
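A minimal sketch of this layout, using only items mentioned in this assignment (the comment wording is our own):
# sentences: parsed with the default ROOTCAT (S)

the monkey devoured a banana

# noun phrases: the category is given before the item

NP: a monkey
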
In xle, try:
parse-testfile eng-week2-test.lfg
This will produce several files:
Add sentences to the testfile that will cover all the basic grammar rules. For example:
Add some NPs to test out the NP rules. For example:
Parse your new testfile. Make sure that all the items parse and get the correct number of parses (usually 1, but there may be some legitimate ambiguities which will result in 2 or more parses).
Turn in: Submit the .new version of your new testsuite.
As with changes to the rules and templates, if you change anything in the CONFIG or the feature declaration, you must restart XLE.
You need to create a feature declaration for eng-week2.lfg. (If you got stuck on PART 1 or 2, you can create a new version of the grammar for this part.) Do this in the following steps:
FEATURES (DEMO ENGLISH).
Make sure to include the ending period.
DEMO ENGLISH FEATURES (1.0)

----
DEMO ENGLISH FEATURES (1.0)

NUM.

----
You can either examine the grammar to figure out the list of
features or you can read them off of the XLE warnings.
Remember to restart XLE after any changes to the feature declaration.
Note: You do not need to add features that are listed
as GOVERNABLERELATIONS or SEMANTICFUNCTIONS. You also do not need to
list PRED (this is a system declared feature).
If you add these in as a record keeping device, that is fine, but XLE
does not require it.
regenerate "a girl laughed"
Keep adding features until XLE has no more warnings and returns (the number of CPU seconds will depend on your machine):
A girl laughed regeneration took 0.2 CPU seconds.
DEMO ENGLISH FEATURES (1.0)

NUM: -> $ { sg pl }.

----
There will be two basic formats. Features with atomic values will look like the NUM example above with the basic format:
FEAT: -> $ { val1 val2 val3 }.
Some features may only have a single value; you still need to include the {}.
Features that take f-structures as values will have the basic format:
FEAT: -> << [ FEAT1 FEAT2 FEAT3 ].
Every feature should have its values listed.
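To make the two formats concrete, here is a small sketch of a declaration section. Only NUM comes from this assignment; PERS, NTYPE, and NSYN are hypothetical features chosen for illustration and may not exist in eng-week2.lfg:
DEMO ENGLISH FEATURES (1.0)

NUM: -> $ { sg pl }.
PERS: -> $ { 3 }. "hypothetical feature with a single atomic value; the braces are still needed"
NSYN: -> $ { common proper }. "hypothetical atomic-valued feature"
NTYPE: -> << [ NSYN ]. "hypothetical feature whose value is an f-structure containing NSYN"

----
Note that NSYN needs its own declaration even though it only ever appears inside NTYPE.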
Once again, keep going until, when you run:
regenerate "a girl laughed"
you get back:
A girl laughed regeneration took 0.2 CPU seconds.
Adding ADJUNCT-TYPE
For every ADJUNCT, we want to know what type of adjunct it is. Modify the ADJUNCT template so that it takes one parameter. This parameter should be the value of a new feature ADJUNCT-TYPE. So, when called in the VP, the call to ADJUNCT should be:
@(ADJUNCT VP)
And when called in the NP, the call should be:
@(ADJUNCT NP)
And when called in S, the call should be:
@(ADJUNCT S)
The f-structure for an ADJUNCT (PP or ADV) should now look roughly like the following for yesterday in yesterday the girl laughed:
[ PRED 'laugh<(^ SUBJ)>'
  SUBJ [ ... ]
  ADJUNCT { [ PRED 'yesterday'
              ADJUNCT-TYPE S ] } ]
Make sure to update the feature table as well as the template.
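One way this could look is sketched below, assuming your ADJUNCT template currently contains just the ! $ (^ ADJUNCT) annotation from PART 1; the parameter name _TYPE is our own choice:
ADJUNCT(_TYPE) = "hypothetical one-parameter version of the ADJUNCT template"
   ! $ (^ ADJUNCT)
   (! ADJUNCT-TYPE)=_TYPE.
The matching entry in the feature declaration would then list the values passed in by the three calls:
ADJUNCT-TYPE: -> $ { S VP NP }.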
Adding TNS-ASP
Currently there are two features TENSE and ASPECT which can occur in the f-structures of verbs. Create a new feature TNS-ASP which takes the current TENSE and ASPECT features as its values. So, the f-structure of the girl is devouring a banana should look roughly like:
[ PRED 'devour<(^SUBJ)(^OBJ)>'
  SUBJ [...]
  OBJ [...]
  TNS-ASP [ TENSE pres
            ASPECT prog ] ]
In addition to modifying the templates, you will need to create a new entry in the feature declaration for TNS-ASP (note that you should not need to modify the TENSE and ASPECT declarations).
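As a sketch under the assumption that your grammar has TENSE and ASPECT templates that each take the value as a parameter (your template names and definitions may differ), the change amounts to adding TNS-ASP to the path in each template:
TENSE(_T) = "hypothetical: tense now sits inside TNS-ASP"
   (^ TNS-ASP TENSE)=_T.
ASPECT(_A) = "hypothetical: aspect now sits inside TNS-ASP"
   (^ TNS-ASP ASPECT)=_A.
The new feature declaration entry follows the f-structure format given above:
TNS-ASP: -> << [ TENSE ASPECT ].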
Turn in: Your new grammar with the feature declaration and the new ADJUNCT-TYPE and TNS-ASP features.
For this part, you should use the grammar eng-week2-mltsec.lfg. This grammar has two additional lexicon and rule sections in it:
DEMO-PLUS ENGLISH RULES (1.0) ----
occurs just after the DEMO ENGLISH RULES and:
DEMO-PLUS ENGLISH LEXICON (1.0) ----
occurs just after the DEMO ENGLISH LEXICON section at the very end of the file.
As a first step, you need to modify the RULES and LEXENTRIES listings in the CONFIG so that the DEMO-PLUS sections are more highly ranked than the DEMO ones that are already there. If you don't remember how to do this, either look in the XLE documentation or at the slides for week 2.
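A sketch of what the two CONFIG lines might look like is given below. It assumes that sections listed later take priority over sections listed earlier; please confirm this ordering against the XLE documentation or the week 2 slides before relying on it:
RULES (DEMO ENGLISH) (DEMO-PLUS ENGLISH).
LEXENTRIES (DEMO ENGLISH) (DEMO-PLUS ENGLISH).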
Add an entry for orange in the DEMO-PLUS lexicon so that orange is now both a noun and an adjective (see the entry for purple for a sample adjective entry).
The DEMO rules define an extremely simple AP (adjective phrase) rule. Modify the DEMO NP rule to allow you to parse things like:
a orange monkey devoured a purple banana
a purple orange monkey laughed
the girl devoured a orange orange
Add entries for ate and eats in the DEMO-PLUS lexicon so that they are both transitive (V-TRANS) and intransitive (V-INTRANS). You should now be able to parse:
the girl ate
the girl ate a banana
Now add a new noun of your choice to the DEMO-PLUS lexicon and make sure you can parse it.
The AP rule simply goes to A. In the DEMO-PLUS rules,
create a new AP rule that allows very to optionally appear in
front of the A.
(Hint: make very some new c-structure category
such as AMOD and then make a lexical entry for it that looks similar
to that of adverbs like today only with this new c-structure category.)
Make sure you can parse:
a very purple monkey laughed
In the DEMO-PLUS lexicon, add an entry for one other adjectival modifier similar to very; in a comment in the entry, list a sentence your grammar can parse that uses this word.
Write a rule in your DEMO-PLUS rules that says:
VPaux --> FALSE.
Figure out what effect this has on your grammar. In XLE, you can use:
print-rule VP
print-rule VPaux
to see what the rules expand to.
In a comment after the new VPaux rule, state what this basic effect was and list one sentence whose behaviour has changed with the addition of this rule.
Comments in the grammar are anything between "" (these have to be the straight up and down, non-directional quotes). There are some sample comments in the templates. Comments can be many lines long. An example from eng-week3-mltsec.lfg:
NOUN-SG(_P) = "template for singular nouns" @(PRED _P) @(NUM sg).
Turn in: The new version of your grammar with the new lexicon and rule sections and the comment about the VPaux rule in it.
If you have any questions, you can send us email (ron.kaplan "at" microsoft.com, tracy.king "at" microsoft.com, mforst "at" parc.com), call us (Ron: 650-245-6865; Tracy: 415-848-7276; Martin: 650-812-4788), or set up office hours with us.