First-Order Translation Checklist

Although first-order logic is a system used to precisely describe worlds and define various terms, in many ways it feels like a little mini programming language built into mathematics. It has variables, things that work like loops (the $\forall$ and $\exists$ quantifiers), etc. As a result, we figured it would be worth putting together the equivalent of a “style guide” for that programming language. Like a coding style guide, this handout contains advice designed to do many things: to make your formulas more readable so you can more easily fix bugs, to describe common errors so you can more easily diagnose errors when they creep up, and to provide methodologies that make it harder to introduce bugs in the first place.

Here’s the checklist we recommend you run through when working on logic translations:

Pair quantifiers with the appropriate connectives.
Use whitespace and indentation to clarify meaning.
Defer quantifiers until they’re needed.
Check the types of all terms in the formula.
Check the scoping on each variable.

The rest of this page goes into more detail about what all these items mean and why they’re there in the first place.

Pair Quantifiers With the Appropriate Connectives

In our lectures on first-order logic translations and in the Guide to Logic Translations, we stress the importance of knowing the four Aristotelian forms and how to represent each of them in first-order logic. Those forms pair the $\to$ connective with the $\forall$ quantifier and the $\land$ connective with the $\exists$ quantifier. As you saw in our first lecture on first-order logic, mixing these up and pairing $\to$ with $\exists$ or $land$ with $\forall$ can result in incorrect translations.

When you’re reading over your translations and are getting ready to submit them, please, please, please do a double-check to make sure that you’re pairing the quantifiers and connectives properly. It might seem silly, but mixing up the $\to$ and $\land$ connectives and pairing them with the wrong quantifier is one of the single most common mistakes we see, and if you aren’t trained to look for them it’s easy to end up with formulas that seem correct but which fundamentally don’t work correctly. So go quantifier by quantifier through your formulas and make sure that everything agrees properly – this will save you lots of hours “debugging” your translations!

For example, consider this attempted translation of the statement “there’s a horse with no name:”

⚠ Incorrect Translation ⚠

\[\begin{aligned} & \exists h. (Horse(h) \to \\ & \quad \forall n. (NameOf(h) \ne n) \\ &) \end{aligned}\]

The top-level quantifier here is an existential quantifier, and it’s paired with the $\to$ connective, which immediately suggests that this formula isn’t translated correctly. And indeed you can see this: imagine there’s a world where every horse has a name, in which case “there’s a horse with no name” is false. But this first-order logic formula would be true: just choose as $h$ something that isn’t a horse.

Similarly, consider this attempted translation of the statement “everybody has someone to lean on:”

⚠ Incorrect Translation ⚠

\[\begin{aligned} & \forall p. (Person(p) \land \\ & \quad \exists q. (Person(p) \land CanLeanOn(p, q)) \\ &) \end{aligned}\]

Here, at the top level we see a universal quantifier paired with the $\land$ connective, which immediately suggests that this formula isn’t correct. As before, we can see this by thinking about an example. Imagine a world where everyone does indeed have someone to lean on. Now, pick as $p$ something that isn’t a person (say, Mt. Everest). Then $Person(p)$ is false, so the inside of universally-quantified formula is false for that choice of $p$, so the overall formula is false.

There are a few times when checking these rules can be a bit tricky. For example, let’s go back to our translation of the statement “there is someone that everybody else loves,” which is shown here:

Correct Formula

\[\begin{aligned} & \exists p. (Person(p) \land \\ & \quad \forall q. (Person(q) \land q \ne p \to \\ & \quad \quad Loves(q, p) \\ & \quad ) \\ & ) \end{aligned}\]

If you look at the universal quantifier introducing $q$, you’ll see that it is applied to a statement that does indeed involve the $\land$ connective, which seems like it breaks the rule from before. However, remember that the $\land$ connective has higher precedence than $\to$, so the grouping here is

$\forall q. ((Person(q) \land q \ne p) \to (\dots)$

with the implies at the top level.

As mentioned in the Guide to Logic Translations, and as you saw in lecture, you will sometimes see the $\leftrightarrow$ connective used with the $\forall$ quantifier, especially in the context of set theory. That’s a common context where you’ll see the normal $\to$/$\forall$ pairing break down.

You will also sometimes see $\exists$ paired with $\leftrightarrow$, which looks really weird the first time you see it. This often results from taking the negation of a formula that pairs $\forall$ with $\leftrightarrow$. For example, consider this formula:

Correct Formula

\[\begin{aligned} & \exists p. (Set(S) \land \\ & \quad \forall x. (x \in S \leftrightarrow x = 137) \\ & ) \end{aligned}\]

(Question to ponder: what does this say?) If we negate this formula and push the negation as deep as possible, we get this statement:

Correct Formula

\[\begin{aligned} & \forall p. (Set(S) \to \\ & \quad \exists x. (x \in S \leftrightarrow x \ne 137) \\ & ) \end{aligned}\]

Notice that the innermost $\forall$ got flipped to an $\exists$, but the inner connective is still a $\leftrightarrow$. That’s totally fine. Remember that you can simplify $\lnot(p \leftrightarrow q)$ to $p \leftrightarrow \lnot q$, so the $\leftrightarrow$ connective “flips” to itself when it’s negated.

A great specific test you can do check if your formulas have the right quantifiers is to use the “Mount Everest” test that we did in class. Try plugging in Mount Everest in for some of the quantified variables in your formula, and see whether you accidentally make the entire formula true or the entire formula false.

Use Whitespace and Indentation to Clarify Meaning

If you’re writing out a simple formula like this one:

$\forall x_1 \in A. \forall x_2 \in A. (x_1 \ne x_2 \to f(x_1) \ne f(x_2))$

the formula can nicely and concisely fit onto a single line. But if you’re writing out something more complex, it’s important to use whitespace to make it cleaner and easier to read. For example, consider this formula, which says “every nonempty tournament has a tournament champion:”

\[\begin{aligned} & \forall T. (Tournament(T) \land (\exists p. p \in T) \to \\ & \quad \exists c. (c \in T \land \\ & \quad \quad \forall p. (p \in T \land c \ne p \to \\ & \quad \quad \quad (Beat(c, p) \lor \exists q \in T. (Beat(c, q) \land Beat(q, p)) \\ & \quad \quad ) \\ & \quad ) \\ & ) \end{aligned}\]

It’s a good execise to see why this works. As a hint, the general pattern is the following:

\[\begin{aligned} & \text{For any tournament }T\text{ that has at least one player,} \\ & \quad \text{There is a player }c\text{ (our champion) in } T \text{ where} \\ & \quad \quad \text{For any player }p\text{ in the tournament other than c,} \\ & \quad \quad \quad \text{Either } c \text{ beat } \text{ p } \text{ or } c \text{ beat some player } q \text{ who in turn beat } p. \end{aligned}\]

This formula is lengthy and dense, but by writing it out on multiple lines and by indenting to show the nesting structure, it’s significantly easier to read each piece, especially if you’re comfortable with the Aristotelian forms and can recognize what each piece says.

On the other hand, imagine that you submit something like this for your TA to grade:

⚠ Please Don't Do This! ⚠

\[\forall T. (Tournament(T) \land (\exists p. p \in T) \to \exists c. (c \in T \land \forall p. (p \in T \land c \ne p \to (Beat(c, p) \lor \exists q \in T. (Beat(c, q) \land Beat(q, p))))\]

I mean, visually, that line is so long that on my computer it doesn't even fit into the box. Here's something to check - did we miss a close parenthesis somewhere? Hard to say, isn't it?

This is the mathematical equivalent of writing a program like this one:

int main(){for(int i=0;i<5;i++){for(int j=0;j<5;j++){cout<<(i*j)<<endl;}}return 0;}

This program is technically correct, but it’s written in such a way that even a veteran programmer would have to pause and think for a bit to see what’s being said. Even if this is functionally correct, it’s ✓- style at best. Just as you wouldn’t write programs like the one above, you should not try to represent a complicated first-order logic formula with a number of different pieces all on a single line, and you should not put line breaks in unusual places.

There’s a good chance that, as you’re writing out first-order logic formulas, you will need to revise what you’ve written. If you’ve properly indented your formulas and included appropriate whitespace, this will be much easier to do. If you have to investigate a giant line of text looking for where the parentheses don’t align, this will be quite challenging!

Defer Quantifiers Until They’re Needed

In most programming languages, you have the option to declare variables at many different points in a function. It’s generally considered a good idea to declare variables as late as you possibly can. For example, the following code would be considered poor style:

int i;
int bestIndex;
int cost;
int bestCost = -1;
 				
for (i = 0; i < v.size(); i++) {
    cost = costOf(v[i]);
    if (cost > bestCost) {
        bestCost = cost;
        bestIndex = i;
    }
}        

With all the variables declared up at the top, it’s hard to tell which values are transient and are only needed once per loop iteration, which values are needed just to control the loop, and which values are supposed to persist across the entire run of the loop. A much better way to write this code would be as follows:

int bestIndex;
int bestCost = -1;
 				
for (int i = 0; i < v.size(); i++) {
    int cost = costOf(v[i]);
    if (cost > bestCost) {
        bestCost = cost;
        bestIndex = i;
    }
}     

Here, it’s clearer that bestIndex and bestCost are supposed to persist across all iterations of the loop (that’s why they’re declared outside the loop), that i is the loop counter and only exists while the loop runs, and that cost is needed once per loop iteration and only exists during a single run of the loop. In that sense, this code is easier to read. But this code is also far less error-prone than the initial one. In the original version of the code, we could, at any point, read or write any of the four variables, making it easy to read a value that hasn’t been initialized, or to read a value from a previous loop iteration that hasn’t been reset yet.

In first-order logic, you have the ability to introduce variables at different points in a formula by using quantifiers. Our advice to you is the same as in programming: don’t put all the quantifiers at the front of the formula. This makes the formula much harder to read and is very error-prone.

As an example of this, let’s look at an attempted translation of the statement “there’s a kitten that loves everyone.” The following formula does not correctly represent this idea:

⚠ Incorrect Translation ⚠

\[\exists k. \forall p. (Kitten(k) \land Person(p) \to Loves(k, p))\]

Before we go into exactly why this doesn’t work, take a minute to read over this and see if you can figure out why it’s not a proper translation. Okay, you did that? Like, you actually took a minute to try this out? Great! Let's now discuss it as a group.

The problem with this particular translation is that, the way the connectives are written out, this gets parsed as follows:

⚠ Incorrect Translation ⚠

\[\exists k. \forall p. ((Kitten(k) \land Person(p)) \to Loves(k, p))\]

Notice that this is essentially an implication wrapped in an existential quantifier, which is something that, as we talked about earlier, does not work out correctly! Specifically, imagine that we pick $k$ to be something that isn’t a kitten. That choice of $k$ makes the entire formula true (do you see why?), completely independently of whether there is a kitten that loves everyone.

On the other hand, suppose we defer the quantifiers until they’re actually needed. That gives us this formula, which is indeed correct:

Correct Formula

\[\begin{aligned} & \exists k. (Kitten(k) \land \\ & \quad \forall p. (Person(p) \to Loves(k, p)) \\ & ) \end{aligned}\]

Notice here that the existential statement wraps a statement whose top-level connective is $\top$, which matches the general pattern we’ve seen so far. Specifically, picking $k$ to be something that isn’t a kitten won’t make the inside of the formula true, since we need to find a choice of $k$ where both $k$ is a kitten and $k$ loves every person.

As a general rule, when translating statements into first-order logic, we strongly advise against trying to guess all the quantifiers up front. Instead, follow the methodology that we describe in the Guide to Logic Translations: introduce quantifiers one at a time and only at the specific point in which you actually need them. If you do decide to put all the quantifiers up front, understand that even if your answer turns out to be right, it’s still not considered good style, just as declaring all your variables up front at the top of a function can be technically correct but not good style. (Pro tip / life advice: you should not feel good about anything you do that is technically correct or technically legal.)

Check the Types of All Terms in the Formula

First-order logic draws a distinction between objects and truth values. All variables in first-order logic represent objects, and predicates applied to those functions produce truth values. You can only apply predicates or functions to objects, and you can only apply connectives or quantifiers to truth values. If you try applying a connective to an object, you end up with a syntactically invalid formula for essentially the same reason that you get an error if in a language like C or C++ you try comparing a string against an integer – the types are wrong.

Before you submit any formula, make sure that all the types work out. Start off by finding all the objects or variables in the formula. For each of them, mark them as type object. Then find the functions. The inputs to any function must also have type object, and the overall function evaluates to something of type object. Then find the predicates – their inputs must all be objects, and their output is of type truth value. Finally, look at quantifiers and propositional connectives. Their inputs must have type truth value, and their outputs are of type truth value as well.

As an example, let’s imagine that we have the following predicates and functions at our disposal:

The predicate $Puppy(x)$, which states that $x$ is a puppy.
The predicate $Cute(x)$, which states that $x$ is cute.
The predicate $Person(x)$, which states that $x$ is a person.
The function $OwnerOf(x)$, which evaluates to the owner of $x$.

Let’s suppose we want to translate the statement “everyone owns at least one cute puppy” into first-order logic and that we come up with the following statement, which turns out to be incorrect:

⚠ Incorrect Translation ⚠

\[\forall p. (Person(p) = OwnerOf(\exists d. (Puppy(d) \land Cute(d))))\]

One way to see that this is incorrect is to notice that inside the top-level quantifier, we’re comparing two terms with the equality predicate: $Person(p)$ and $OwnerOf(\dots)$. This, unfortunately, isn’t syntactically valid, because $Person(p)$ is a predicate and evaluates to a truth value, but the equality predicate can only work on objects. In that sense, we can immediately see that this formula can’t be correct because the types don’t match up properly.

There’s another type error in here. Notice that we apply the $OwnerOf$ predicate to an existentially-quantified statement $(\exists d. \dots)$. Remember that any quantified statement evaluates to a truth value (in this case, there either is a cute puppy or there isn’t), and functions can only be applied to objects. Again, this by itself is sufficient for us to see that this formula cannot be correct.

Here’s another formula that’s an incorrect way of translating the statement:

⚠ Incorrect Translation ⚠

\[\begin{aligned} & \forall p. (Person(p) \to \\ & \quad \exists d. (OwnerOf(Cute(Puppy(d))) = p) \\ & ) \end{aligned}\]

To see where the issue is, let’s look at the expression $OwnerOf(Cute(Puppy(d))) = p$. Let’s work from the inside out. The variable $d$ has type object, since it’s a quantified variable. We’re providing it as input to the predicate $Puppy$, which takes in an object and produces a truth value. But then we try to apply the predicate $Cute$, which is supposed to take in something of type object, to $Puppy(d)$, which is a truth value. As a result, this formula isn’t syntactically valid.

Be sure to “type-check” your formulas before submitting them. Otherwise, you might end up with something that looks syntactically valid, but which doesn’t represent a first-order sentence.

Check the Scoping on Each Variable

The variable introduced by a quantifier has a scope, and it’s important to make sure that any time you reference a variable that that variable is in scope. (This is something you’re hopefully familiar with from your programming experience.)

There are two general classes of scoping mistakes we see people make. The first one is to omit an important pair of parentheses. For example, suppose you want to translate the statement “there is a movie featuring both Emma Stone and Ryan Gosling.” Here’s an incorrect translation:

⚠ Incorrect Translation ⚠

\[\exists m. Movie(m) \land AppearsIn(EmmaStone, m) \land AppearsIn(RyanGosling, m)\]

The problem with this statement has to do with the precedence of the existential quantifier. Remember that existential quantifiers have high operator precedence – higher than anything else, in fact – so this expression is interpreted as

⚠ Incorrect Translation ⚠

\[(\exists m. Movie(m)) \land (AppearsIn(EmmaStone, \underline{m})) \land (AppearsIn(RyanGosling, \underline{m}))\]

and the underlined appearances of $m$ refer to a nonexistent variable $m$. It’s the programming equivalent of writing the following:

{
    Movie m;
}
AppearsIn(EmmaStone, m);   // Error! m is out of scope.
AppearsIn(RyanGosling, m); // Error! m is out of scope.

This sort of error, fortunately, is easily corrected. As a rule of thumb, any time you introduce a quantifier, surround the expression you want to quantify with parentheses, as shown here:

Correct Formula

\[\exists m. (Movie(m) \land AppearsIn(EmmaStone, m) \land AppearsIn(RyanGosling, m))\]

As you’re reviewing all of your first-order logic formulas, make sure that every time you reference a quantified variable that you’re inside of some parentheses preceded by the quantifier. If you follow our indentation guidelines from earlier (which we strongly recommend!), you can see whether you’ve done this right by making sure that you put in some parentheses right after the quantifier and then just look at the indentation.

The other scoping error we typically see involves quantifying over a variable at the wrong time. For example, suppose we want to translate the statement “every person has a puppy” into first-order logic. Here’s an incorrect way to do this:

⚠ Incorrect Trnslation ⚠

\[\exists d. (Puppy(d)) \quad \land \quad \forall p. (Person(p) \to HasPet(p, d))\]

Notice that when we write $HasPet(p, d)$ that the variable $p$ is in scope, but the variable $d$ is out of scope. As a result, this formula isn’t syntactically valid.

This type of mistake is harder to correct. The issue here is that the quantifier introducing $d$ is in the wrong place – it’s at the top-level, meaning that there’s no connection between $p$ and $d$. A better translation would be something like this:

Correct Formula

\[\begin{aligned} & \forall p. (Person(p) \to \\ & \quad \exists d. (Puppy(d) \land HasPet(p, d)) \\ & ) \end{aligned}\]

Here, the quantifier introducing the puppy $d$ is nested inside of the statement quantified over by $p$, so everything is scoped properly. You can think of this rule, in some sense, as a consequence of the advice to delay quantifiers until they’re needed. We don’t need to be talking about puppies until we’re talking about a specific person, so we won’t introduce that quantifier at the start of the formula.