Quantifier elimination

A theory admits quantifier-elimination when every assertion is logically equivalent over the theory to a quantifier-free assertion. This is quite a remarkable property when it occurs, because it reveals a severe limitation on the range of concepts that can be expressed in the theory—a quantifier-free assertion, after all, is able to express only combinations of the immediate atomic facts at hand. As a result, we are generally able to prove quantifier-elimination results for a theory only when we already have a profound understanding of it and its models, and the quantifier-elimination result itself usually leads quickly to classification of the definable objects, sets, and relations in the theory and its models. In this way, quantifier-elimination results often showcase our mastery over a particular theory and its models. So let us present a few quantifier-elimination results, exhibiting our expertise over some natural theories.

Endless dense linear orders


Consider any two rational numbers $x,y$ in the structure $\<\Q,<>$. What can one say about them? Well, we can certainly make the atomic assertions that $x<y$, that $x=y$, or that $y<x$, and exactly one of these must hold. The remarkable fact is that this essentially exhausts what can be said—every assertion about $x$ and $y$ is equivalent over this structure to a Boolean combination of these assertions.

Theorem. The theory of the rational order $\<\Q,<>$ admits elimination of quantifiers—every assertion $\varphi(x,\ldots)$ is logically equivalent in the rational order to a quantifier-free assertion.

Proof. To see this, observe simply by Cantor's categoricity theorem for countable dense linear orders that any pair $x<y$ in $\Q$ is automorphic to any other such pair $x'<y'$, and similarly for pairs with $x=y$ or $y<x$. Consequently, $\varphi(x,y)$ either holds of all pairs with $x<y$ or of none of them, of all pairs with $x=y$ or none, and of all pairs with $y<x$ or none. The assertion $\varphi(x,y)$ is therefore equivalent to the disjunction of the three atomic relations for which it is realized, including $\top$ as the disjunction of all three atomic possibilities and $\bottom$ as the empty disjunction.

More generally, a similar observation applies to assertions $\varphi(x_1,\ldots,x_n)$ with more free variables. By Cantor’s theorem, every $n$-tuple of points in $\Q$ is automorphic with any other such $n$-tuple of points having the same atomic order relations. Therefore any assertion holding of one such $n$-tuple holds of all $n$-tuples with that same atomic type, and consequently every assertion $\varphi(x_1,\ldots,x_n)$ is logically equivalent in $\<\Q,<>$ to a disjunction of those combinations of atomic relations amongst the variables $x_1,\ldots,x_n$ for which it holds. In particular, every assertion is equivalent in $\<\Q,<>$ to a quantifier-free assertion. In short, the theory of this model $\Th(\<\Q,<>)$ admits elimination of quantifiers. $\Box$

What about other endless dense linear orders? The argument we have given so far is about the theory of this particular model $\<\Q,<>$. In fact, the theory of the rational order is exactly the theory of endless dense linear orders, because this theory is complete, which one can see as an immediate consequence of the categoricity result of Cantor’s theorem and the downward Löwenheim-Skolem theorem. In my book, I have not yet proved the Löwenheim-Skolem theorem at this stage, however, and so let me give a direct proof of quantifier-elimination in the theory of endless dense linear orders, from which we can also derive the completeness of this theory.

Theorem. In the theory of endless dense linear orders, every statement is logically equivalent to a quantifier-free statement.

Proof. To clarify, the quantifier-free statement will have the same free variables as the original assertion, provided we allow $\bottom$ and $\top$ as logical constants. We argue by induction on formulas. The claim is of course already true for the atomic formulas, and it is clearly preserved under Boolean connectives. So it suffices inductively to eliminate the quantifier from $\exists x\, \varphi(x,\ldots)$, where $\varphi$ is itself quantifier-free. We can place $\varphi$ in disjunctive normal form, a disjunction of conjunction clauses, where each conjunction clause is a conjunction of literals, that is, atomic or negated atomic assertions. Since the atomic assertions $x<y$, $x=y$ and $y<x$ are mutually exclusive and exhaustive, the negation of any one of them is equivalent to the disjunction of the other two. Thus we may eliminate any need for negation. By redistributing conjunction over disjunction as much as possible, we reduce to the case of $\exists x\,\varphi$, where $\varphi$ is in disjunctive normal form without any negation. The existential quantifier distributes over disjunction, and so we reduce to the case $\varphi$ is a conjunction of atomic assertions. We may eliminate any instance of $x=x$ or $y=y$, since these impose no requirement. We may assume that the variable $x$ occurs in each conjunct, since otherwise that conjunct commutes outside the quantifier. If $x=y$ occurs in $\varphi$ for some variable $y$ not identical to $x$, then the existential claim is equivalent to $\varphi(y,\ldots)$, that is, by replacing every instance of $x$ with $y$, and we have thus eliminated the quantifier. If $x<x$ occurs as one of the conjuncts, this is not satisfiable and so the assertion is equivalent to $\bottom$. Thus we have reduced to the case where $\varphi$ is a conjunction of assertions of the form $x<y_i$ and $z_j<x$. 
If only one type of these occurs, then the assertion $\exists x\,\varphi$ is outright provable in the theory by the endless assumption and thus equivalent to $\top$. Otherwise, both types $x<y_i$ and $z_j<x$ occur, and in this case the existence of an $x$ obeying this conjunction of assertions is equivalent over the theory of endless dense linear orders to the quantifier-free conjunction $\bigwedge_{i,j}z_j<y_i$, since there will be an $x$ between them in this case and only in this case. Thus, we have eliminated the quantifier $\exists x$, and so by induction every formula is equivalent over this theory to a quantifier-free formula. $\Box$
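The final step of the proof is effectively algorithmic. Here is a small Python sketch (the function names and encoding are my own, not from the text) of the reduction of $\exists x\,\bigl(\bigwedge_j z_j<x\wedge\bigwedge_i x<y_i\bigr)$ to the quantifier-free condition $\bigwedge_{i,j}z_j<y_i$, checked against explicit witness-finding in the rational order:

```python
from fractions import Fraction
from itertools import product

def exists_between(lowers, uppers):
    """Quantifier-free test equivalent to  exists x ( /\ z_j < x  and  /\ x < y_i )
    over an endless dense linear order: true iff every lower bound is below
    every upper bound (and trivially true if either list is empty)."""
    return all(z < y for z, y in product(lowers, uppers))

def brute_witness(lowers, uppers):
    """Directly find a witness x in the rationals, using density and endlessness."""
    if not exists_between(lowers, uppers):
        return None
    if not lowers and not uppers:
        return Fraction(0)
    if not lowers:
        return min(uppers) - 1          # endlessness: no least element
    if not uppers:
        return max(lowers) + 1          # endlessness: no greatest element
    return (max(lowers) + min(uppers)) / 2   # density: midpoint works

# The quantifier-free condition agrees with actual witness-finding:
lowers, uppers = [Fraction(1), Fraction(3, 2)], [Fraction(2), Fraction(7, 4)]
x = brute_witness(lowers, uppers)
assert exists_between(lowers, uppers)
assert all(z < x for z in lowers) and all(x < y for y in uppers)
assert not exists_between([Fraction(2)], [Fraction(1)])
```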

Corollary. The theory of endless dense linear orders is complete.

Proof. If $\sigma$ is any sentence in this theory, then by the theorem above, it is logically equivalent to a quantifier-free assertion. Since $\sigma$ is a sentence and the language has no constant symbols, there are no atomic sentences except the logical constants $\bottom$ and $\top$, and so $\sigma$ is equivalent over the theory to a Boolean combination of $\bottom$ and $\top$. All such sentences are equivalent either to $\bottom$ or to $\top$, and thus either $\sigma$ is entailed by the theory or $\neg\sigma$ is, and so the theory is complete. $\Box$

Corollary. In any endless dense linear order, the definable sets (allowing parameters) are precisely the finite unions of intervals.

Proof. By intervals we mean a generalized concept allowing either open or closed endpoints, as well as rays, in any of the forms:
$$(a,b)\qquad [a,b]\qquad [a,b)\qquad (a,b]\qquad (a,\infty)\qquad [a,\infty)\qquad (-\infty,b)\qquad (-\infty,b]$$
Of course any such interval is definable, since $(a,b)$ is defined by $(a<x)\wedge(x<b)$, taking the endpoints $a$ and $b$ as parameters, and $(-\infty,b]$ is defined by $(x<b)\vee (x=b)$, and so on. Thus, finite unions of intervals are also definable by taking a disjunction.

Conversely, any putative definition $\varphi(x,y_1,\ldots,y_n)$ is equivalent to a Boolean combination of atomic assertions concerning $x$ and the parameters $y_i$. Thus, whenever it is true for some $x$ between, above, or below the parameters $y_i$, it will be true of all $x$ in that same interval, and so the set that is defined will be a finite union of intervals having the parameters $y_i$ as endpoints, with the intervals being open or closed depending on whether the parameters themselves satisfy the formula or not. $\Box$

Theory of successor

Let us next consider the theory of a successor function, as realized for example in the Dedekind model, $\<\N,S,0>$, where $S$ is the successor
function $Sn=n+1$. The theory has the following three axioms:
$$\forall x\, (Sx\neq 0)$$

$$\forall x,y\, (Sx=Sy\implies x=y)$$

$$\forall x\, \bigl(x\neq 0\implies \exists y\,(Sy=x)\bigr).$$
In the Dedekind model, every individual is definable, since $x=n$ just in case $x=SS\cdots S0$, where we have $n$ iterative applications of $S$. So this is a pointwise definable model, and hence also Leibnizian. Note the interplay between the $n$ of the object theory and $n$ of the metatheory in the claim that every individual is definable.

What definable subsets of the Dedekind model can we think of? Of course, we can define any particular finite set, since the numbers are definable as individuals. For example, we can define the set $\{1,5,8\}$ by saying, “either $x$ has the defining property of $1$ or it has the defining property of $5$ or it has the defining property of $8$.” Thus any finite set is definable, and by negating such a formula, we see also that any cofinite set—the complement of a finite set—is definable. Are there any other definable sets? For example, can we define the set of even numbers? How could we prove that we cannot? The Dedekind structure has no automorphisms, since all the individuals are definable, and so we cannot expect to use automorphisms to show that the even numbers are not definable as a set. We need a deeper understanding of definability and truth in this structure.

Theorem. The theory of a successor function admits elimination of quantifiers—every assertion is equivalent in this theory to a quantifier-free assertion.

Proof. By induction on formulas. The claim is already true for atomic assertions, since they have no quantifiers, and quantifier-free assertions are clearly closed under the Boolean connectives. So it suffices by induction to eliminate the quantifier from assertions of the form $\exists x\, \varphi(x,\ldots)$, where $\varphi$ is quantifier free. We may place $\varphi$ in disjunctive normal form, and since the quantifier distributes over disjunction, we reduce to the case that $\varphi$ is a conjunction of atomic and negated atomic assertions. We may assume that $x$ appears in each atomic conjunct, since otherwise we may bring that conjunct outside the quantifier. We may furthermore assume that $x$ appears on only one side of each atomic clause, since otherwise the statement is either trivially true as with $SSx=SSx$ or $Sx\neq SSx$, or trivially false as with $Sx=SSx$. Consider for example:
$$\exists x\,\bigl[(SSSx=y)\wedge (SSy=SSSz)\wedge (SSSSSx=SSSw)\wedge{}$$
$$\hskip1in{}\wedge (Sx\neq SSSSw)\wedge (SSSSy\neq SSSSSz)\bigr]$$
We can remove duplicated $S$s occurring on both sides of an equation. If $x=S^ky$ appears, we can get rid of $x$ and replace all its occurrences with $S^ky$. If $S^nx=y$ appears, we can add $S$s everywhere and then replace any occurrence of $S^nx$ with $y$. If only inequalities involving $x$ appear, then the statement is simply true, since the model is infinite and each inequality excludes at most one value of $x$.

For example, since the third clause in the formula above is equivalent to $SSx=w$, we may use that to omit any need to refer to $x$, and the formula overall is equivalent to
$$(Sw=y)\wedge (y=Sz)\wedge (w\neq SSSSSw)\wedge (y\neq Sz),$$ which has no quantifiers.
Since the method is completely general, we have proved that the theory of successor admits elimination of quantifiers. $\Box$
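Since the worked example is purely mechanical, one can verify the claimed equivalence by brute force over an initial segment of the Dedekind model. Here is a small Python sketch (the encoding of $S$ and the search bound are my own); note that both formulas here happen to reduce to a contradiction, since $y=Sz$ and $y\neq Sz$ both occur, and indeed both come out false everywhere:

```python
def S(n, k=1):
    """The k-fold successor in the Dedekind model <N, S, 0>."""
    return n + k

def phi_original(y, z, w):
    # exists x [ SSSx=y & SSy=SSSz & SSSSSx=SSSw & Sx!=SSSSw & SSSSy!=SSSSSz ]
    return any(
        S(x, 3) == y and S(y, 2) == S(z, 3) and S(x, 5) == S(w, 3)
        and S(x, 1) != S(w, 4) and S(y, 4) != S(z, 5)
        for x in range(0, max(y, z, w) + 6))   # any witness x must lie below this bound

def phi_eliminated(y, z, w):
    # the quantifier-free equivalent:  Sw=y & y=Sz & w!=SSSSSw & y!=Sz
    return S(w) == y and y == S(z) and w != S(w, 5) and y != S(z)

# the two formulas agree at every point of an initial segment of the model:
assert all(phi_original(y, z, w) == phi_eliminated(y, z, w)
           for y in range(8) for z in range(8) for w in range(8))
```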

Corollary. The definable sets in the Dedekind model $\<\N,S,0>$, using only the first-order language of this structure, are precisely the finite and cofinite sets.

Proof. An atomic formula in the one free variable $x$ has the form $S^mx=S^k0$ or $S^mx=S^kx$, and so it defines either a finite set or all of $\N$; and the collection of finite or cofinite sets is closed under complements, unions, and intersections. Since every formula is equivalent to a quantifier-free formula, every formula is equivalent to a Boolean combination of atomic formulas, and hence defines a finite or cofinite set. $\Box$

In particular, the concepts of being even or being odd are not definable from the successor operation in $\<\N,S,0>$, since the set of even numbers is neither finite nor cofinite.

Corollary. The theory of a successor function is complete—it is the theory of the standard model $\<\N,S,0>$.

Proof. If $\sigma$ is a sentence in the language of successor, then by the quantifier-elimination theorem it is equivalent to a quantifier-free assertion in the language with the successor function $S$ and constant symbol $0$. But the only quantifier-free sentences in this language are Boolean combinations of equations of the form $S^n0=S^k0$. Since all such equations are settled by the theory, the sentence itself is settled by the theory, and so the theory is complete. $\Box$

We saw that the three axioms displayed above are true in the Dedekind model $\<\N,S,0>$. Are there any other models of these axioms? Yes, there are. For example, we can add another $\Z$-chain of successors on the side, as with $\N+\Z$ or $\N\sqcup\Z$, although we shall see that the order is not definable. What are the definable elements in the enlarged structure? Still $0$ and all its finite successors are definable as before. But no elements of the $\Z$-chain can be definable, because we may perform an automorphism of the structure that translates elements within the $\Z$-chain by a fixed amount.

Let me prove next that the theory implies the induction axiom schema.

Corollary. The theory of successor (the three axioms) implies the induction axiom schema in the language of successor, that is, the following assertion for any assertion $\varphi(x)$:
$$\bigl[\varphi(0)\wedge\forall x\,\bigl(\varphi(x)\implies\varphi(Sx)\bigr)\bigr]\implies\forall x\,\varphi(x)$$

Proof. Consider the set defined by $\varphi(x)$ in the standard model $\<\N,S,0>$. By the earlier corollary, it must be finite or cofinite. But by the induction assumption stated in the corollary, this set contains $0$ and is closed under successor, and so $\varphi$ must hold of every number in the standard model. So the standard model thinks that $\forall x\,\varphi(x)$. But the theory of the standard model is the theory of successor, which is complete. So the theory of successor entails that $\varphi$ is universal, as desired. $\Box$

In other words, in the trivial theory of successor—the three axioms—we get the corresponding induction axiom for free.

Presburger arithmetic

Presburger arithmetic is the theory of addition on the natural numbers, that is, the theory of the structure $\<\N,+,0,1>$. The numbers $0$ and $1$ are actually definable here from addition alone, since $0$ is the unique additive identity, and $1$ is the only number $u$ that is not expressible as a sum $x+y$ with both $x\neq u$ and $y\neq u$. So we may view this model if desired as a definitional expansion of $\<\N,+>$, with addition only. The number $2$ is similarly definable as $1+1$, and indeed any number $n$ is definable as $1+\cdots+1$, with $n$ summands, and so this is a pointwise definable model and hence also Leibnizian.

What are the definable subsets? We can define the even numbers, of course, since $x$ is even if and only if $\exists y\,(y+y=x)$. We can similarly define congruence modulo $2$ by $x\equiv_2 y\iff \exists z\,\bigl[(z+z+x=y)\vee (z+z+y=x)\bigr]$. More generally, we can express the relation of congruence modulo $n$ for any fixed $n$ as follows:
$$x\equiv_n y\quad\text{ if and only if }\exists z\,\bigl[(\overbrace{z+\cdots+z}^n+x=y)\vee(\overbrace{z+\cdots+z}^n+y=x)\bigr].$$
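As a quick sanity check, one can verify in an initial segment of the model that this defining formula agrees with the usual congruence relation; here is a minimal Python sketch (the search bound for the witness $z$ is an ad hoc choice of mine, sufficient for the range tested):

```python
def cong_defined(x, y, n, bound=1000):
    """x ≡_n y  via the Presburger definition:
    exists z ( n·z + x = y  or  n·z + y = x ),  with z ranging over N."""
    return any(n * z + x == y or n * z + y == x for z in range(bound))

# agrees with ordinary congruence modulo 5 on an initial segment:
assert all(cong_defined(x, y, 5) == ((x - y) % 5 == 0)
           for x in range(30) for y in range(30))
```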
What I claim is that this exhausts what is expressible.

Theorem. Presburger arithmetic in the definitional expansion with all congruence relations, that is, the theory of the structure
$$\<\N,+,0,1,\equiv_2,\equiv_3,\equiv_4,\ldots>$$
admits elimination of quantifiers. In particular, every assertion in the language of $\<\N,+,0,1>$ is equivalent to a quantifier-free assertion in the language with the congruence relations.

Proof. We consider Presburger arithmetic in the language with addition $+$, with all the congruence relations $\equiv_n$ for every $n\geq 2$, and the constants $0$ and $1$. We prove quantifier-elimination in this language by induction on formulas. As before the claim already holds for atomic assertions and is preserved by Boolean connectives. So it suffices to eliminate the quantifier from assertions of the form $\exists x\,\varphi(x,\ldots)$, where $\varphi$ is quantifier-free. By placing $\varphi$ into disjunctive normal form and distributing the quantifier over the disjunction, we may assume that $\varphi$ is a conjunction of atomic and negated atomic assertions. Note that negated congruences are equivalent to a disjunction of positive congruences, such as in the case:
$$x\not\equiv_4 y\quad\text{ if and only if }\quad (x+1\equiv_4y)\vee(x+1+1\equiv_4y)\vee (x+1+1+1\equiv_4 y).$$
We may therefore assume there are no negated congruences in $\varphi$. By canceling like terms on each side of an equation or congruence, we may assume that $x$ occurs on only one side. We may assume that $x$ occurs nontrivially in every conjunct of $\varphi$, since otherwise this conjunct commutes outside the quantifier. Since subtraction modulo $n$ is the same as adding $n-1$ times, we may also assume that all congruences occurring in $\varphi$ have the form $kx\equiv_n t$, where $kx$ denotes the syntactic expression $x+\cdots+x$ occurring in the formula, with $k$ summands, and $t$ is a term not involving the variable $x$. Thus, $\varphi$ is a conjunction of expressions each having the form $kx\equiv_n t$, $ax+r=s$, or $bx+u\neq v$, where $ax$ and $bx$ similarly denote the iterated sums $x+\cdots+x$ and $r,s,u,v$ are terms not involving $x$.

If indeed there is a conjunct of the equality form $ax+r=s$ occurring in $\varphi$, then we may omit the quantifier as follows. Namely, in order to fulfill the existence assertion, we know that $x$ will have to solve $ax+r=s$, and so in particular $r\equiv_a s$, which ensures the existence of such an $x$, but also in this case any inequality $bx+u\neq v$ can be equivalently expressed as $abx+au\neq av$, which since $ax+r=s$ is equivalent to $bs+au\neq av+br$, and this does not involve $x$; similarly, any congruence $kx\equiv_n t$ is equivalent to $akx\equiv_{an}at$, which is equivalent to $ks\equiv_{an} kr+at$, which again does not involve $x$. Thus, when there is an equality involving $x$ present in $\varphi$, then we can use that fact to express the whole formula in an equivalent manner not involving $x$.

So we have reduced to the case $\exists x\,\varphi$, where $\varphi$ is a conjunction of inequalities $bx+u\neq v$ and congruences $kx\equiv_n t$. We can now ignore the inequalities, since if the congruence system has a solution, then it has infinitely many solutions, and so there will be an $x$ solving the congruences while also satisfying any finitely many inequalities. So we may assume that $\varphi$ is simply a list of congruences of the form $kx\equiv_n t$, and the assertion is that this system of congruences has a solution. But there are only finitely many congruences mentioned, and so by working modulo the least common multiple of the bases that occur, there are only finitely many possible values for $x$ to be checked. And so we can simply replace $\exists x\,\varphi$ with a disjunction over these finitely many values $i$, replacing $x$ in each conjunct with $1+\cdots+1$, using $i$ copies of $1$, for each $i$ up to the least common multiple of the bases that arise in the congruences appearing in $\varphi$. If there is an $x$ solving the system, then one of these values of $i$ will work, and conversely.
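This final congruence step is effectively algorithmic: solvability of the system is decided by checking the finitely many residues below the least common multiple of the bases. A small Python sketch of this check (the encoding of congruences as triples is my own):

```python
from math import lcm

def exists_solution(congruences):
    """Decide  exists x . /\_i ( k_i·x ≡ t_i  mod n_i )  by checking the
    finitely many candidate residues below the lcm of the moduli.
    Each congruence is encoded as a triple (k, t, n)."""
    L = lcm(*[n for (_, _, n) in congruences])
    return any(all((k * i - t) % n == 0 for (k, t, n) in congruences)
               for i in range(L))

system = [(2, 4, 6), (3, 0, 5)]    # 2x ≡ 4 (mod 6)  and  3x ≡ 0 (mod 5)
assert exists_solution(system)

# any solution recurs periodically, which is why inequalities x ≠ v are harmless:
sols = [x for x in range(60) if (2 * x - 4) % 6 == 0 and (3 * x) % 5 == 0]
assert len(sols) >= 2
```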

So we have ultimately succeeded in expressing $\exists x\,\varphi$ in a quantifier-free manner, and so by induction every assertion in Presburger arithmetic is equivalent to a quantifier-free assertion in the language allowing addition, congruences, and the constants $0$ and $1$. $\Box$

Corollary. The definable sets in $\<\N,+,0,1>$ are exactly the eventually periodic sets.

Proof. Every eventually periodic set is definable, since one can specify the set up to the period $p$ and then express the invariance modulo $p$; and any finite deviation from a definable set is also definable, since every individual number is definable. Conversely, every definable set is defined by a quantifier-free assertion in the language of $\<\N,+,0,1,\equiv_2,\equiv_3,\equiv_4,\ldots>$. We may place the definition in disjunctive normal form, and again replace negated congruences with a disjunction of positive congruences. For large enough values of $x$, the equalities and inequalities appearing in the definition become irrelevant, and so the definition eventually agrees with a finite union of solutions of congruence systems. Every such system is periodic with a period at most the least common multiple of the bases of the congruences appearing in it. And so every definable set is eventually periodic, as desired. $\Box$
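One can illustrate the corollary empirically: the definable set of even numbers is periodic, while the set of perfect squares fails every small candidate period on an initial segment. A Python sketch (the bounds are ad hoc, so this is an illustration rather than a proof):

```python
def is_periodic_from(A, start, p, bound):
    """Check that membership in A is invariant under adding p, beyond start
    (tested only on the initial segment below bound)."""
    return all((n in A) == (n + p in A) for n in range(start, bound - p))

evens   = {n for n in range(200) if n % 2 == 0}       # definable: exists y (y+y = x)
squares = {n * n for n in range(15) if n * n < 200}   # not Presburger-definable

# the even numbers are periodic with period 2 from the start...
assert is_periodic_from(evens, 0, 2, 200)
# ...while the squares admit no small eventual period on this segment:
assert not any(is_periodic_from(squares, s, p, 200)
               for s in range(20) for p in range(1, 20))
```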

Corollary. Multiplication is not definable in $\<\N,+,0,1>$. Indeed, the squaring operation is not definable, and neither is the divisibility relation $p\divides q$.

Proof. If we could define multiplication, or even the squaring operation, then we would be able to define the set of perfect squares, but this is not eventually periodic. Similarly, if we could define the divides relation $p\divides q$, then we could define the set of prime numbers, which is not eventually periodic. $\Box$

Real-closed field

Let us lastly consider the ordered real field $\<\R,+,\cdot,0,1,<>$. I want to mention (without proof) that a deep theorem of Tarski shows that this structure admits elimination of quantifiers: every assertion is equivalent in this structure to a quantifier-free assertion. In fact, all that is needed is that this is a real-closed field, an ordered field in which every odd-degree polynomial has a root and every positive number has a square root.

We can begin to gain insight into this fact by reaching into the depths of our high-school education. Presented with a quadratic equation $ax^2+bx+c=0$, where $a\neq 0$, we know by the quadratic formula that the solutions are $x=\left(-b\pm\sqrt{b^2-4ac}\right)/(2a)$, and in particular, there is a solution in the real numbers if and only if $b^2-4ac\geq 0$, since a negative discriminant means the solutions are nonreal complex numbers. In other words, assuming $a\neq 0$,
$$\exists x\,(ax^2+bx+c=0)\quad\text{ if and only if }\quad b^2-4ac\geq 0.$$
The key point is that this is an elimination-of-quantifiers result, since we have eliminated the quantifier $\exists x$.
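One can test this equivalence numerically. The following Python sketch checks, for a few sample coefficients with $a\neq 0$, that the sign of the discriminant correctly predicts the existence of a real root:

```python
from math import sqrt, isclose

def has_real_root(a, b, c):
    """Quantifier-free equivalent of  exists x (a·x² + b·x + c = 0),  assuming a ≠ 0."""
    return b * b - 4 * a * c >= 0

def a_root(a, b, c):
    """One root given by the quadratic formula (valid when the discriminant is ≥ 0)."""
    return (-b + sqrt(b * b - 4 * a * c)) / (2 * a)

for (a, b, c) in [(1, -3, 2), (2, 0, -8), (1, 0, 1), (3, 1, 5), (1, 2, 1)]:
    if has_real_root(a, b, c):
        x = a_root(a, b, c)
        assert isclose(a * x * x + b * x + c, 0, abs_tol=1e-9)  # the witness is a root
    else:
        # with a > 0 and negative discriminant, the minimum value (4ac - b²)/(4a)
        # of the parabola is positive, so there is indeed no real root
        assert (4 * a * c - b * b) / (4 * a) > 0
```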

Tarski’s theorem proves more generally that every assertion in the language of ordered fields is equivalent in real-closed fields to a quantifier-free assertion. Furthermore, there is a computable procedure to find the quantifier-free equivalent, as well as a computable procedure to determine the truth of any quantifier-free assertion in the theory of real-closed fields.

What I find incredible is that it follows from this that there is a computable procedure to determine the truth of any first-order assertion of Cartesian plane geometry, since all such assertions are expressible in the language of $\<\R,+,\cdot,0,1,<>$. Amazing! I view this as an incredible culmination of two thousand years of mathematical investigation: we now have an algorithm to determine by rote procedure the truth of any statement in Cartesian geometry. A counterpoint, unfortunately, is that the decision procedure is not computationally feasible, since it takes more than exponential time, and it is a topic of research to investigate the computational running time of the best algorithms.

The logic of definite descriptions

We use a definite description when we make an assertion about an individual, referring to that individual by means of a property that uniquely picks them out. When I say, “the badly juggling clown in the subway car has a sad expression” I am referring to the clown by describing a property that uniquely determines the individual to whom I refer, namely, the clown that is badly juggling in the subway car, that clown, the one fulfilling this description. Definite descriptions in English typically involve the definite article “the” as a signal that one is picking out a unique object or individual.

If there had been no clown in the subway car, then my description wouldn’t have succeeded—there would have been no referent, no unique individual falling under the description. My reference would similarly have failed if there had been a clown, but no juggling clown, or if there had been a juggling clown, but juggling well instead of badly, or indeed if there had been many juggling clowns, perhaps both in the subway car and on the platform, but all of them juggling very well (or at least the ones in the subway car), for in this case there would have been no badly juggling clown in the subway car. My reference would also have failed, in a different way, if the subway car was packed full of badly juggling clowns, for in this case the description would not have succeeded in picking out just one of them. In each of these failing cases, there seems to be something wrong or insensible with my statement, “the badly juggling clown in the subway car has a sad expression.” What would be the meaning of this assertion if there was no such clown, if for example all the clowns were juggling very well?

Bertrand Russell emphasized that when one makes an assertion involving a definite description like this, then part of what is being asserted is that the definite description has succeeded. According to Russell, when I say, “the book I read last night was fascinating,” then I am asserting first of all that indeed there was a book that I read last night and exactly one such book, and furthermore that this book was fascinating. For Russell, the assertion “the king of France is bald” asserts first, that there is such a person as the king of France and second, that the person fitting that description is bald. Since there is no such person as the king of France, Russell takes the statement to be false.

Iota expressions

In formal logic, definite descriptions are implemented with iota expressions. The term $\bigl(\iiota x\,\psi(x)\bigr)$ is intended to denote “the $x$ for which $\psi(x)$.” Such a term succeeds in referring in a model $M$ exactly when there is a unique individual with the property, that is, when
$$M\satisfies\exists!x\,\psi(x),$$
or in other words, when
$$M\satisfies\exists x\forall y\,\bigl(x=y\iff\psi(y)\bigr).$$
The value of the term $\bigl(\iiota x\,\psi(x)\bigr)$ interpreted in $M$ is this unique object fulfilling property $\psi$. The use of iota expressions is perhaps the most meaningful when this property is indeed fulfilled, that is, when the reference succeeds, and we might naturally take them otherwise to be meaningless or undefined, a failed reference.

Because the iota expressions are not always meaningful in this way, their treatment in formal logic faces many of the same issues faced by a formal treatment of partial functions, functions that are not necessarily defined on the whole domain of discourse. According to the usual semantics for first-order logic, the interpretation of a function symbol $f$ is a function defined on the whole domain of the model—in other words, we interpret function symbols with total functions.

But partial functions commonly appear throughout mathematics, and we might naturally seek a formal treatment of them in first-order logic. One immediate response to this goal is simply to point out that partial functions are already fruitfully and easily treated in first-order logic by means of their graph relations $y=g(x)$. One can already express everything one would want to express about a partial function $g$ by reference to the graph relation—whether a given point is in the domain of the function and if it is, what the value of the function is at that point and so on. In this sense, first-order logic already has a robust treatment of partial functions.

In light of that response, the dispute here is not about the expressive power of the logic, but is rather entirely about the status of terms in the language, about whether we should allow partial functions to appear as terms. To be sure, mathematicians customarily form term expressions, such as $\sqrt{x^2-3}$ or $1/x$ in the context of the real numbers $\R$, which are properly defined only on a subset of the domain of discourse, and in this sense, allowing partial functions as terms can be seen as aligned with mathematical practice.

But the semantics are a surprisingly subtle matter. The main issue is that when a term is not defined it may not be clear what the meaning is of assertions formed using that term. To illustrate the point, suppose that $e(x)$ is a term arising from a partial function or from an iota expression that is not universally defined in a model $M$, and suppose that $R$ is a unary relation that holds of every individual in the model. Do we want to say that $M\satisfies\forall x\ R\bigl(e(x)\bigr)$? Is it true that for every person the elephant they are riding is self-identical? Well, some people are not riding any elephant, and so perhaps we might say, no, that shouldn't be true, since some values of $e(x)$ are not defined, and so this statement should be false. Perhaps someone else suggests that it should be true, because $R\bigl(e(x)\bigr)$ will hold whenever $e(x)$ does succeed in its reference—in every case where someone is riding an elephant, it is self-identical. Or perhaps we want to say the whole assertion is meaningless? If we say it is meaningful but false, however, then it would seem we would want to say $M\satisfies\neg\forall x\ R\bigl(e(x)\bigr)$ and consequently also $M\satisfies\exists x\ \neg R\bigl(e(x)\bigr)$. In other words, in this case we are saying that in $M$ there is some $x$ such that $e(x)$ makes the always-true predicate $R$ false—there is a person such that the elephant they are riding is not self-identical. That seems weird and probably undesirable, since it only works because $e(x)$ must be undefined for this $x$. Furthermore, this situation seems to violate Russell's injunction that assertions involving a definite description are committed to the success of that reference, for in this case, the truth of the assertion $\exists x\ \neg R\bigl(e(x)\bigr)$ is based entirely on the failure of the reference $e(x)$.
Ultimately we shall face such decisions in how to define the semantics in the logic of iota expressions and more generally in the first-order logic of partial functions as terms.

The strong semantics for iota expressions

Let me first describe what I call the strong semantics for the logic of iota expressions. Inspired by Russell’s theory of definite descriptions, we shall define the truth conditions for every assertion in the extended language allowing iota expressions $(\iiota x\,\varphi)$ as terms. Notice that $\varphi$ itself might already have iota expressions inside it; the formulas of the extended language can be graded in this way by the nesting rank of their iota expressions. The strong semantics I shall describe here also works more generally for the logic of partial functions.

For any model $M$ and valuation $\valuation[\vec a]$, we shall define the satisfaction relation $M\satisfies\varphi\valuation[\vec a]$ and the term valuation $t^M\valuation[\vec a]$ recursively. In this logic, the interpretation of terms $t^M\valuation[\vec a]$ in a model $M$ will be merely partial, not necessarily defined in every case. Nevertheless, for a given iota expression $(\iiota x\,\varphi)$, if we already defined the semantics for $\varphi$, then the term $\bigl(\iiota x\,\varphi)^M\valuation[\vec a]$ is defined to be the unique individual $b$ in $M$, if there is one, for which $M\satisfies\varphi\valuation[b,\vec a]$, where $b$ is the value of variable $x$ in this valuation; and if there is not a unique individual with this property, then this term is not defined. An atomic assertion of the form $(Rt_1\cdots t_n)\valuation[\vec a]$ is defined to be true in $M$ if all the terms $t_i^M\valuation[\vec a]$ are defined in $M$ and $R^M(t_1^M\valuation[\vec a],\ldots,t_n^M\valuation[\vec a])$ holds. And similarly an atomic assertion of the form $s=t$ is true in $M$ with valuation $\valuation[\vec a]$ if and only if $t^M\valuation[\vec a]$ and $s^M\valuation[\vec a]$ are both defined and they are equal. Notice that if a term expression is not defined, then every atomic assertion it appears in will be false. Thus, in the atomic case we have implemented Russell’s theory of definite descriptions.

We now simply extend the satisfaction relation recursively in the usual way through Boolean connectives and quantifiers. That is, the model satisfies a conjunction $M\satisfies(\varphi\wedge\psi)\valuation[\vec a]$ just in case it satisfies both of them $M\satisfies\varphi\valuation[\vec a]$ and $M\satisfies\psi\valuation[\vec a]$; it satisfies a negation $M\satisfies\neg\varphi\valuation[\vec a]$ if and only if it fails to satisfy $\varphi$, and so on as usual with the other logical connectives. For quantifiers, $M\satisfies\forall x\ \varphi\valuation[\vec a]$ if and only if $M\satisfies\varphi\valuation[b,\vec a]$ for every individual $b$ in $M$, using the valuation assigning value $b$ to variable $x$; and similarly for the existential quantifier.
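To make the recursion concrete, here is a minimal sketch of the strong semantics over a small finite model, in Python. The encoding is my own illustrative choice, not anything from the text: predicates are Python functions on the domain, and `None` stands for a failed definite description.

```python
# Minimal sketch of the strong semantics over a finite model.
# Encoding is illustrative: a predicate is a Python function on the
# domain, and None represents a failed definite description.

def iota(domain, pred):
    """(iota x. pred): the unique d in domain with pred(d), else None."""
    witnesses = [d for d in domain if pred(d)]
    return witnesses[0] if len(witnesses) == 1 else None

def atomic(relation, *values):
    """Russellian atomic clause: false outright if any term failed to refer."""
    if any(v is None for v in values):
        return False
    return relation(*values)

domain = range(10)
is_prime = lambda n: n > 1 and all(n % k for k in range(2, n))

the_even_prime = iota(domain, lambda d: d % 2 == 0 and is_prime(d))  # refers to 2
the_odd_prime = iota(domain, lambda d: d % 2 == 1 and is_prime(d))   # fails: 3, 5, 7

print(atomic(lambda x: x < 5, the_even_prime))     # True
print(atomic(lambda x: x < 5, the_odd_prime))      # False, by failed reference
print(not atomic(lambda x: x < 5, the_odd_prime))  # True, via Tarskian negation
```

The last line already hints at the tension discussed later in this post: the Tarskian recursion declares the negation true even though the definite description inside it failed to refer.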

The strong semantics in effect combine Russell’s treatment of definite descriptions in the case of atomic assertions with Tarski’s disquotational theory to extend the truth conditions recursively to complex assertions. The strong semantics are complete—every assertion $\varphi$ or its negation $\neg\varphi$ will be true in $M$ at any valuation of the free variables, because if $\varphi$ is not true in $M$ at $\valuation[\vec a]$, then by definition, $\neg\varphi$ will be declared true. In particular, exactly one of the sentences will be true, even if they involve definite descriptions that do not refer.

No new expressive power

The principal observation to make initially about the logic of iota expressions is that it offers no new expressive power to our language. Every assertion that can be made in the language with iota expressions can be made equivalently without them. In short, iota expressions are logically eliminable.

Theorem. Every assertion in the language with iota expressions is logically equivalent in the strong semantics to an assertion in the original language.

Proof. We prove this by induction on formulas. Of course, the claim is already true for all assertions in the original language, and since the strong semantics in the iota expression logic follow the same recursion for Boolean connectives and quantifiers as in the original language, it suffices to remove iota expressions from atomic assertions $Rt_1\cdots t_n$, where some of the terms involve iota expressions. Consider for simplicity the case of $R(\iiota x\,\varphi)$, where $\varphi$ is a formula in the original language with no iota expressions. This assertion is equivalent to the assertion that $(\iiota x\,\varphi)$ exists, which we expressed above as $\exists x\forall y\, \bigl(x=y\iff\varphi(y)\bigr)$, together with the assertion that $R$ holds of that value, which can be expressed as $\forall x\,\bigl(\varphi(x)\implies Rx\bigr)$. The argument is similar if additional functions were applied on top of the iota term or if there were multiple iota terms—one simply says that the overall atomic assertion is true if each of the iota expressions appearing in it is defined and the resulting values fulfill the atomic assertion. By this means, the iota terms can be systematically eliminated from the atomic assertions in which they appear, and therefore every assertion is equivalent to an assertion not using any iota expression. $\Box$
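For instance, the two conditions from the proof can be assembled into a single formula; the equivalence below is one natural way of combining them, logically equivalent to the conjunction of the two assertions above:
$$R(\iiota x\,\varphi)\quad\text{is equivalent to}\quad \exists x\,\bigl(\forall y\,(\varphi(y)\iff y=x)\wedge Rx\bigr).$$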

In light of this theorem, perhaps there is little at stake in the dispute about whether to augment our language with iota expressions, since they add no formal expressive power.

Criticism of the strong semantics

I should like to make several criticisms of the strong semantics concerning how well it fulfills the goal of providing a logic of iota expressions based on Russell’s theory of definite descriptions.

Does not actually fulfill the Russellian theory. We were led to the strong semantics by Russell’s theory of definite descriptions, and many logicians take the strong semantics as a direct implementation of Russell’s theory. But is this right? To my way of thinking, at the very heart of Russell’s theory is the idea that an assertion involving reference by definite description carries an implicit commitment that those references are successful. Let us call this the implicit commitment to reference, and I should like to consider this idea on its own, apart from whatever Russell’s view might have been. My criticism here is that the strong semantics does not actually realize the implicit commitment to reference for all assertions.

It does fulfill the implicit commitment to reference for atomic assertions, to be sure, for we defined that an atomic assertion $At$ involving a definite description term $t$ is true if and only if the term $t$ successfully refers and the referent falls under the predicate $A$. The atomic truth conditions thus implement exactly what the Russellian implicit commitment to reference requires.

When we extend the semantics through the Tarskian compositional recursion, however, we lose that feature. Namely, if an atomic assertion $At$ is false because the definite description term $t$ has failed to refer successfully, then the Tarskian recursion declares the biconditional $At\iff At$ to be true, being the biconditional of two false assertions, and $\neg At$ and $At\implies At$ are similarly declared true in this case by the strong semantics. In all three cases, we have a sentence with a failed definite description, and yet we have declared it true anyway, violating the implicit commitment to reference.

The tension between Russell and Tarski. The issue reveals an inherent tension between Russell and Tarski, a tension between a fully realized implicit commitment to reference and the compositional theory of truth. Specifically, the examples above show that if we follow the Russellian theory of definite descriptions for atomic assertions and then apply the Tarskian recursion, we will inevitably violate the implicit commitment to reference for some compound assertions. In other words, to require that every true assertion making reference by definite description commits to the success of those references simply does not obey the Tarski recursion. In short, the implicit commitment to reference is not compositional.

Does not respect stipulative definitions. The strong semantics does not respect stipulative definitions in the sense that the truth of an assertion is not always preserved when replacing a defined predicate by its definition.

Consider the ring of integers, for example, and the sentence “the largest prime number is odd.” We could formalize this as an atomic assertion $O(\iiota x\, Px)$, where $Ox$ is the oddness predicate expressing that $x$ is odd and $Px$ expresses that $x$ is a largest prime number. Since there is no largest prime number in the integers, the definite description $(\iiota x\,Px)$ has failed, and so on the strong semantics the sentence $O(\iiota x\, Px)$ is false in the integer ring.

But suppose that we had previously introduced the oddness predicate $Ox$ by stipulative definition over the ring of integers $\<\Z,+,\cdot,0,1>$, in the plain language of rings, defining oddness via $Ox\iff \neg\exists y\, (y+y=x)$. In the original language, therefore, the assertion “the largest prime is odd” would be expressed as $\neg\exists y\, \bigl(y+y=(\iiota x\,Px)\bigr)$. Because the iota expression does not succeed, the atomic assertion $y+y=(\iiota x\,Px)$ is false for every particular value of $y$, and so the existential $\exists y$ is false, making the negation true. In this formulation, therefore, “the largest prime is odd” is true! So although the sentence was false in the definitional expansion using the atomic oddness predicate, it became true when we replaced that predicate by its definition, and so the strong semantics does not respect the replacement of a stipulatively defined predicate by its definition.

Some philosophical treatments of these kinds of cases focus on ambiguity and scope. One can introduce lambda expressions $\bigl(\lambda x\,\varphi(x)\bigr)$, for example, expressing property $\varphi$ in a manner to be treated as atomic for the purpose of applying the Russellian theory, so that $\bigl(\lambda x\,\varphi(x)\bigr)(t)$ expresses that $t$ is defined and $\varphi(t)$ holds. In simple cases, these lambda expressions amount to stipulative definitions, while in more sophisticated instances, subtle levels of dependence are expressed when $\varphi$ has other free variables not to be treated as atomic in this way by this expression. Nevertheless, the combined lambda/iota expressions are fully eliminable, as every assertion in that expanded language is equivalently expressible in the original language. To my way of thinking, the central problem here is not one of missing notation, precisely because we can already express the intended meaning without any lambda or iota expressions at all. Rather, the fundamental phenomenon is the fact that the strong semantics violates the expected logical principles for the treatment of stipulative definitions.

Does not respect logical equivalence. The strong semantics of iota expressions does not respect logical equivalence. In the integer ring $\<\Z,+,\cdot,0,1>$, there are two equivalent ways to express that a number $x$ is odd, namely,
$$\text{Odd}_0(x) = \neg\exists y\, (y+y=x)$$ $$\text{Odd}_1(x) = \exists y\, (y+y+1=x).$$
These two predicates are equivalent in the integers,
$$\<\Z,+,\cdot,0,1>\satisfies\forall x\, \bigl(\text{Odd}_0(x)\iff\text{Odd}_1(x)\bigr).$$
And yet, if we assert “the largest prime number is odd” using these two equivalent formulations, either as $\text{Odd}_0(\iiota x\,Px)$ or as $\text{Odd}_1(\iiota x\,Px)$, then we get opposite truth values. The first formulation is true, as we observed above, but the second is false, because the atomic assertion $y+y+1=(\iiota x\,Px)$ is false for every particular $y$, and so the existential fails. So this is a case where substituting an iota expression into logically equivalent assertions yields different truth values.
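The disagreement can be checked mechanically. Here is a small sketch in Python under the strong semantics’ rule that an atomic assertion with an undefined term is false; the finite range standing in for $\Z$ and the variable names are my own illustrative assumptions.

```python
# Strong semantics: an atomic assertion with a failed term is false.
# Z is a finite stand-in for the integers, for illustration only.
Z = range(-20, 21)
t = None   # the failed description (iota x. Px): no largest prime exists

def eq(a, b):
    """Atomic equality under the strong semantics."""
    return a is not None and b is not None and a == b

odd0 = not any(eq(y + y, t) for y in Z)    # Odd_0: not exists y (y+y = t)
odd1 = any(eq(y + y + 1, t) for y in Z)    # Odd_1: exists y (y+y+1 = t)
print(odd0, odd1)   # True False: equivalent predicates, opposite verdicts
```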

Does not respect instantiation. The strong semantics for iota expressions does not respect universal instantiation. Every structure will declare some instances of $[\forall x\,\varphi(x)]\implies\varphi(t)$ false. Specifically, in any structure $\forall x\, (x=x)$ is true, but if $(\iiota x\,\varphi)$ is a failing iota expression, such as $(\iiota x\, x\neq x)$, then $(\iiota x\,\varphi)=(\iiota x\,\varphi)$ is false, because this is an atomic assertion with a failing definite description. So we have a false instantiation of a true universal.

The weak semantics for iota expressions

In light of these criticisms, let me describe an alternative semantics for the logic of definite descriptions, a more tentative and hesitant semantics, yet in many respects both reasonable and appealing. Namely, the weak semantics takes the line that in order for an assertion about an individual specified by definite description to be meaningful, the description must in fact succeed in its reference—otherwise it is not meaningful. On this account, for example, the sentence “the largest prime number is odd” is meaningless in the integer ring, without a truth value, and similarly with any further sentence built from this assertion. On the weak semantics, the assertion fails to express a proposition because there is no largest prime number in the integers.

On the weak semantics, we first make a judgement about whether an assertion is meaningful before stating whether it is true or not. As before, an iota expression $(\iiota x\,\varphi)$ is undefined in a model at a valuation if there is not a unique fulfilling instance (using the weak semantics recursively in this evaluation). In the atomic case of satisfaction $(Rt_1\cdots t_n)\valuation[\vec a]$, the weak semantics judges the assertion to be meaningful only if all of the term expressions $t_i^M\valuation[\vec a]$ are defined, and in this case the assertion is true or false accordingly as to whether $R^M(t_1^M\valuation[\vec a],\ldots,t_n^M\valuation[\vec a])$ holds. If one or more of the terms is not defined, then the assertion is judged meaningless. Similarly, in the disquotational recursion through the Boolean connectives, we say that $\varphi\wedge\psi$ is meaningful only when both conjuncts are meaningful, and true exactly when both are true. And similarly for $\varphi\vee\psi$, $\varphi\implies\psi$, $\neg\varphi$, and $\varphi\iff\psi$. In each case, with the weak semantics we require all of the constituent subformulas to be meaningful in order for the whole expression to be judged meaningful, whether or not the truth values of those assertions could affect the overall truth value.
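The weak semantics’ truth judgements can be sketched as three-valued connectives, with `None` standing for “meaningless” (an encoding of my own choosing); the point is that meaninglessness propagates through every connective.

```python
# Weak semantics connectives: None means "meaningless", and any
# meaningless constituent makes the whole compound meaningless.

def w_not(p):
    return None if p is None else (not p)

def w_and(p, q):
    return None if None in (p, q) else (p and q)

def w_or(p, q):
    return None if None in (p, q) else (p or q)

def w_implies(p, q):
    return None if None in (p, q) else ((not p) or q)

phi = None   # e.g. "the largest prime number is odd" in the integers
print(w_or(phi, w_not(phi)))    # None: excluded middle not declared true
print(w_implies(phi, phi))      # None: even phi -> phi fails to be true
```

The two printed lines anticipate the observation below that the weak semantics declines to affirm some instances of the law of excluded middle and even of $\varphi\implies\varphi$.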

The compositional theory of truth implicitly defines a well-founded relation on the instances of satisfaction $M\satisfies\varphi\valuation[\vec a]$, with each instance reducing via the Tarski recursion to certain relevant earlier instances. In the weak semantics, an assertion $M\satisfies\varphi\valuation[\vec a]$ is meaningful if and only if every iota term valuation arising hereditarily in the recursive calculation of the truth values is successfully defined.

The choice to use the weak semantics can be understood as a commitment to use robust definite descriptions that succeed in their reference. For meaningful assertions, one should ensure that all the relevant definite descriptions succeed, such as by using robust descriptions with default values in cases of failure, rather than relying on the semantical rules to paper over or fix up the effects of sloppy failed references. Nevertheless, the weak and the strong semantics agree on the truth value of any assertion found to be meaningful. In this sense, being true in the weak semantics is simply a tidier way to be true, one without sloppy failures of reference.

The weak semantics can be seen as a nonclassical logic in that not all instances of the law of excluded middle $\varphi\vee\neg\varphi$ will be declared true, since if $\varphi$ is not meaningful then neither will be this disjunction. But the logic is not intuitionist, since not all instances of $\varphi\implies\varphi$ are true. Meanwhile, the weak semantics are classical in the sense that in every meaningful instance, the law of excluded middle holds, as well as double-negation elimination.

The natural semantics for iota expressions

The natural semantics is a somewhat less hesitant semantics, guided by the idea that an assertion with iota expressions or partially defined terms is meaningful when sufficiently many of those terms succeed in their reference to determine the truth value. In this semantics, we take a conjunction $\varphi\wedge\psi$ as meaningfully true if both $\varphi$ and $\psi$ are meaningfully true, and meaningfully false if at least one of them is meaningfully false. A disjunction $\varphi\vee\psi$ is meaningfully true if at least one of them is meaningfully true, and meaningfully false if both are meaningful and false. A negation $\neg\varphi$ is meaningful exactly when $\varphi$ is, and with the opposite truth value. In the natural semantics, an implication $\varphi\implies\psi$ is meaningfully true if either $\varphi$ is meaningful and false or $\psi$ is meaningful and true, and $\varphi\implies\psi$ is meaningfully false if $\varphi$ is meaningfully true and $\psi$ is meaningfully false. This formulation of the semantics enables a robust treatment of conditional assertions, such as in the ordered real field, where we might take division $y/x$ as a partial function, defined as long as the denominator is nonzero. In the natural semantics, the assertion
$$\forall x\, (0<x\implies 0<1/x)$$
comes out meaningful and true in the reals, whereas it is not meaningful in the weak semantics because $1/0$ is not defined, which might seem rather too weak because this case arises only when it is also ruled out by the antecedent. A biconditional $\varphi\iff\psi$ is meaningful if and only if both $\varphi$ and $\psi$ are meaningful, and it is true if these have the same truth value. In the natural semantics, an existential assertion $\exists x\ \varphi\valuation[\vec a]$ is judged meaningful and true if there is an instance $\varphi\valuation[b,\vec a]$ that is meaningful and true, and meaningfully false if every instance is meaningful but false. A universal assertion $\forall x\ \varphi\valuation[\vec a]$ is meaningfully true if every instance $\varphi\valuation[b,\vec a]$ is meaningful and true, and meaningfully false if at least one instance is meaningfully false.
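These clauses are essentially the strong Kleene three-valued connectives, which can be sketched as follows, again with `None` for “meaningless” (an illustrative encoding of my own).

```python
# Natural semantics connectives (strong Kleene style): None means
# "meaningless", but a compound is meaningful as soon as enough of
# its parts settle the truth value.

def n_not(p):
    return None if p is None else (not p)

def n_and(p, q):
    if p is False or q is False:
        return False                # one false conjunct suffices
    if p is True and q is True:
        return True
    return None

def n_or(p, q):
    return n_not(n_and(n_not(p), n_not(q)))

def n_implies(p, q):
    return n_or(n_not(p), q)        # false antecedent or true consequent suffices

# The guarded-division example: at x = 0 the antecedent 0 < x is false,
# so the implication is true even though 0 < 1/x is meaningless there.
print(n_implies(False, None))   # True
print(n_implies(None, None))    # None
```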

Still no new expressive power

The weak semantics and the natural semantics both address some of the weird aspects of the strong semantics by confronting head-on, and denying, the claim that assertions made about nonexistent individuals are meaningful. This could be refreshing to someone put off by any sort of claim made about the king of France or the largest prime number in the integers—such a person might prefer to regard these claims as not meaningful. And yet, just as with the strong semantics, the weak semantics and the natural semantics offer no new expressive power to the logic.

Theorem. The language of first-order logic with iota expressions in either the weak semantics or the natural semantics has no new expressive power—for every assertion in the language with iota expressions, there are formulas in the original language expressing that the given formula is meaningful, respectively in the two semantics, and others expressing that it is meaningful and true.

Proof. This can be proved by induction on formulas similar to the proof of the no-new-expressive-power theorem for the strong semantics. The key point is that the question whether a given instance of iota expression $(\iiota x\,\varphi)$ succeeds in its reference or not is expressible by means of $\varphi$ without using any additional iota terms not already in $\varphi$. By augmenting any atomic assertion in which such an iota expression appears with the assertions that the references have succeeded, we thereby express the meaningfulness of the atomic expression. Similarly, the definition of what it means to be meaningful and what it means to be true in the weak semantics or in the natural semantics was itself defined recursively, and so in this way we can systematically eliminate the iota expressions and simultaneously create formulas asserting the meaningfulness and the truth of any given assertion in the expanded iota language. $\Box$

Let me next prove the senses in which both the weak and natural semantics survive the criticisms I had mentioned earlier for the strong semantics.

Theorem.

1. Both the weak and natural semantics fulfill Russell’s theory of definite descriptions—if an assertion is true, then every definite description relevant for this was successful.
2. Both the weak and natural semantics respect Tarski’s compositional theory of truth—if an assertion is meaningful, then its truth value is determined by the Tarski recursion.
3. Both the weak and natural semantics respect stipulative definitions—replacing any defined predicate in a meaningful assertion by its definition, if meaningful, preserves the truth value.
4. Both the weak and natural semantics respect logical equivalence—if $\varphi(x)$ and $\psi(x)$ are logically equivalent ($x$ occurring freely in both) and $t$ is any term, then $\varphi(t)$ and $\psi(t)$ get the same truth judgement.
5. Both the weak and natural semantics respect universal instantiation—if $\forall x\varphi(x)$ and $\varphi(t)$ are both meaningful, then $[\forall x\varphi]\implies\varphi(t)$ is meaningful and true.

Proof. For statement (1), the notion of relevance here for the weak semantics is that of arising earlier in the well-founded recursive definition of truth, while in the natural semantics we are speaking of relevant instances sufficient for the truth calculation. In either semantics, if an assertion is declared true, then all the definite descriptions relevant for this fact were successful. A truth judgement is never made on the basis of a failed reference.

Statement (2) is immediate, since both the weak and the natural semantics are defined in a compositional manner.

Statement (3) is proved by induction. If we have introduced predicate $Rx$ by stipulative definition $\rho(x)$, which is meaningful at every point of evaluation in the model, then whenever a term $t$ is defined, the predicate $Rt$ in the model will be meaningful and have the same truth value as $\rho(t)$. So replacing $R$ with $\rho$ in any formula will give the same truth judgement.

For statement (4), suppose that $\varphi(x)$ and $\psi(x)$ are logically equivalent, meaning that $x$ occurs freely in both assertions and the model gives the same truth judgement to them at every point. If $t$ is not defined, then both $\varphi(t)$ and $\psi(t)$ are declared meaningless (this is why we need the free variable actually to occur in the formulas). If $t$ is defined, then by the assumption of logical equivalence, $\varphi(t)$ and $\psi(t)$ will get the same truth value.

Statement (5) follows immediately for the weak semantics, since if $\forall x\,\varphi(x)$ and $\varphi(t)$ are both meaningful, then in particular, if $x$ actually occurs freely in $\varphi$ then $t$ must be defined in the weak semantics, in which case $\varphi(t)$ will follow from $\forall x\,\varphi(x)$. In the natural semantics, if $\forall x\,\varphi(x)$ is meaningfully true, then $\varphi(t)$ also will be, and if it is meaningfully false, then the implication $[\forall x\,\varphi]\implies\varphi(t)$ is meaningfully true. $\Box$

Deflationary philosophical conclusions

To my way of thinking, the principal philosophical conclusion to draw in light of the no-new-expressive-power theorems is that there is nothing at stake in the debate about whether to add iota expressions to the logic or whether to use the strong or weak semantics. The debate is moot, and we can adopt a deflationary stance, because any proposition or idea that we might wish to express with iota expressions and definite descriptions in one or the other semantics can be expressed in the original language. Whether we are to use the expanded linguistic resources provided by iota expressions is ultimately a matter of mere convenience or logical aesthetics. If the logical feature or idea you wish to convey is more clearly or easily conveyed with iota expressions, then go to town! But in principle, they are eliminable.

Similarly, there is little at stake in the dispute between the weak and the strong semantics. In fact they agree on the truth judgements of all meaningful assertions. In light of this, there seems little reason not to proceed with the strong semantics, since it provides coherent truth values to the assertions lacking a truth value in the weak semantics. The criticisms I had mentioned may be outweighed simply by having a complete assignment of truth values. The question of whether assertions made about failed definite descriptions are actually meaningful can be answered by the reply: they are meaningful, because the strong semantics provide the meaning. But again, because every idea that can be expressed in these semantics can also be expressed without it, there is nothing at stake in this decision.

This blog post is adapted from my book-in-progress, Topics in Logic, an introduction to a wide selection of topics in logic for philosophers, mathematicians, and computer scientists.

Checkmate is not the same as a forced capture of the enemy king in simplified chess

In my imagination, and perhaps also in historical reality, our current standard rules of chess evolved from a simpler era with a simpler set of rules for the game. Let us call it simplified chess. In simplified chess there was the same 8×8 board with the same pieces as now moving under the same movement rules. But the winning aim was different, and simpler. Namely, the winning goal of simplified chess was simply to capture the enemy king. You won the game by capturing the opposing king, just as you would capture any other piece.

There was therefore no need to mention check or checkmate in the rules. Rather, these words described important situations that might arise during game play. Specifically, you placed your opponent in check if you were in a position to capture their king on the next move, unless they did something to prevent it; you placed your opponent in checkmate if there was indeed nothing they could do to prevent it.

In particular, in simplified chess there was no rule against leaving your king in check or even moving your king into check. Rather, this was simply an inadvisable move, because your opponent could take it and thereby win. Checkmate was merely the situation that occurred when all your moves were like this.

It is interesting to notice that it is common practice in blitz chess and bullet chess to play with something closer to simplified chess rules—it is often considered winning simply to capture the opposing king, even when there is no checkmate. This is also how the chess variant bughouse is usually played, even in official tournaments. To my way of thinking, there is a certain attractive simplicity to the rules of simplified chess. The modern chess rules might seem ridiculous for needlessly codifying into the rules a matter that could simply be played out on the board by actually capturing the king.

Part of what I imagine is that our contemporary rules could easily have evolved from simplified chess from a practice of polite game play. In order to avoid the humiliation of actually capturing and removing the opponent’s king and replacing it with one’s own piece, which indeed might even have been a lowly pawn, a custom developed to declare the game over when this was a foregone conclusion. In other words, I find it very reasonable to suppose that the winning checkmate rule simply arose from a simplified rule set by common practice of respectful game play.

I am not a chess historian, and I don’t really know whether chess did indeed evolve from a simpler version of the game like this, but it strikes me as very likely that something like this must have happened. I await comment from the chess historians. Let me add, though, that I would also find it reasonable to expect that simplified chess might have had no provision for opening pawns moving two squares instead of just one. Such a rule could arise naturally as an agreed-upon compromise to quicken the game and get to the more interesting positions more quickly. But once that rule was adopted, the en passant rule is a natural corrective to prevent abuse. I speculate that castling may have arisen similarly—perhaps many players in a community customarily took several moves, perhaps in a standard maneuver sequence, to accomplish the effect of hiding their kings away toward the corner and also bringing their rooks to the center files; the practice could simply have been codified into a single move.

My main point in this post, however, does not concern these other rules, but rather only the checkmate winning-condition rule and a certain logic-of-chess issue it brings to light.

When teaching chess to beginners, it is common to describe the checkmate winning situation in terms of the necessary possibility of capturing the king. One might say that a checkmate situation means the king is attacked, and there is nothing the opponent can do to prevent the king from being captured.

This explanation suggests a general claim in the logic of chess: a position is in checkmate (in the contemporary rules) if and only if the winning player can necessarily capture the opposing king on the next move (in simplified chess).

This claim is mostly true. In most instances, when you have placed your opponent in checkmate, you would be able to capture their king on your next move in simplified chess, since all their moves would leave the king in check, and so you could take it straight away.

But I would like to point out something I found interesting about this checkmate logic claim. Namely, it isn’t true in all cases. There is a position, I claim, that is checkmate in the modern rules, but in simplified chess, the winning player would not be able to capture the enemy king on the next move.

My example proceeds from the following (rather silly) position, with black to move. (White pawns move upward.)

Of course, Black should play the winning move: knight to c7 (Nc7), as shown here:

This places the white king in check, and there is nothing to be done about it, and so it is checkmate. Black has won, according to the usual chess rules of today.

But let us consider the position under the rules of simplified chess. Will Black be able to capture the white king? Well, after Black’s Nc7, it is now White’s turn. Remember that in simplified chess, it is allowed (though inadvisable) to leave one’s king in check at the end of one’s turn or even to move the king into check. But the trouble with this position is not that White can somehow move to avoid check, but rather simply that White has no moves at all, not even moves that leave the king in check. Therefore, in simplified chess, this position is a stalemate, rather than a Black win. In particular, Black will not actually be able to capture the White king, since White has no moves, and so the game will not proceed to that juncture.

Conclusion: checkmate in contemporary chess is not always the same thing as being necessarily able to capture the opposing king on the next move in simplified chess.

Of course, perhaps in simplified chess, one wouldn’t regard stalemate as a draw, but as a win for the player who placed the opponent in stalemate. That would be fine by me (and it would align with the rules currently used in draughts, where one loses when there is no move available), but it doesn’t negate my point. The position above would still be a Black win under that rule, sure, but still Black wouldn’t be able to capture the White king. That is, my claim would stand that checkmate (in modern rules) is not the same thing as the necessary possibility to capture the opposing king.

I had made a Tweet earlier today about this idea (below), but opted for a fuller explanation in this blog post to explain the idea more carefully.

On Twitter, user Gro-Tsen pointed out that a similar situation arises with stalemates. Namely, consider the following position, with White to play:

Perhaps Black had just played Qb5, a blunder as it turns out, since now the White king sits in stalemate, with no move. So under the usual chess rules, this is a drawn position.

But in simplified chess, according to the rules I have described, the White king is not yet actually captured, and so he is still free to move around, to a6, a8, b6, or b7 (moving into check), or to capture the black rook at b8. But in any case, whichever move White makes, Black will be able to capture the White king on the next move. So under the simplified chess rules, this position is a forced win for Black in one move, and not a draw.

Furthermore, this position therefore shows once again that checkmate (in normal rules) is not the same as the necessary possibility to capture the king (in simplified rules), since this is a position where Black has a necessary possibility to capture the White king on the next move in simplified chess, but it is not checkmate in ordinary chess, being rather stalemate.

TL;DR: Ultimately, my view is that our current rules of chess likely evolved from simplified rules and the idea that checkmate is what occurs when you have a necessary possibility of capturing the enemy king on the next move in those rules. Nevertheless, example chess positions show that these two notions are not quite exactly the same.

Fregean abstraction in set theory—a deflationary account, Italian Philosophy of Mathematics, September 2022

This will be a talk for the conference Philosophy of Mathematics: Foundations, Definitions and Axioms, the Fourth International Conference of the Italian Network for the Philosophy of Mathematics, 29 September to 1 October 2022.

Abstract. The standard set-theoretic distinction between sets and classes instantiates in important respects the Fregean distinction between objects and concepts, for in set theory we commonly take the universe of sets as a realm of objects to be considered under the guise of diverse concepts, the definable classes, each serving as a predicate on that domain of individuals. Although it is commonly held that in a very general manner, there can be no association of classes with objects in a way that fulfills Frege’s Basic Law V, nevertheless, in the ZF framework, it turns out that we can provide a completely deflationary account of this and other Fregean abstraction principles. Namely, there is a mapping of classes to objects, definable in set theory in senses I shall explain (hence deflationary), associating every first-order parametrically definable class $F$ with a set object $\varepsilon F$, in such a way that Basic Law V is fulfilled: $$\varepsilon F =\varepsilon G\iff\forall x\ (Fx\leftrightarrow Gx).$$ Russell’s elementary refutation of the general comprehension axiom, therefore, is improperly described as a refutation of Basic Law V itself, but rather refutes Basic Law V only when augmented with powerful class comprehension principles going strictly beyond ZF. The main result leads also to a proof of Tarski’s theorem on the nondefinability of truth as a corollary to Russell’s argument. A central goal of the project is to highlight the issue of definability and deflationism for the extension assignment problem at the core of Fregean abstraction.

Nonlinearity and illfoundedness in the hierarchy of consistency strength and the question of naturality, Italy (AILA), September 2022

This will be a talk for the meeting of The Italian Association for Logic and its Applications (AILA) in Caserta, Italy 12-15 September 2022.

Abstract. Set theorists and philosophers of mathematics often point to a mystery in the foundations of mathematics, namely, that our best and strongest mathematical theories seem to be linearly ordered and indeed well-ordered by consistency strength. Why should it be? The phenomenon is thought to carry profound significance for the philosophy of mathematics, perhaps pointing us toward the ultimately correct mathematical theories, the “one road upward.” And yet, we know as a purely formal matter that the hierarchy of consistency strength is not well-ordered. It is ill-founded, densely ordered, and nonlinear. The statements usually used to illustrate these features, however, are often dismissed as unnatural or as Gödelian trickery. In this talk, I aim to rebut that criticism by presenting a variety of natural hypotheses that reveal ill-foundedness in consistency strength, density in the hierarchy of consistency strength, and incomparability in consistency strength.

Set theory inside out: realizing every inner model theory in an end extension, European Set Theory Conference, September 2022

This will be a talk for the European Set Theory Conference 2022 in Turin, Italy 29 August – 2 September 2022.

Abstract. Every countable model of ZFC set theory with an inner model satisfying a sufficient theory must also have an end-extension satisfying that theory. For example, every countable model with a measurable cardinal has an end-extension to a model of $V=L[\mu]$; every model with extender-based large cardinals has an end-extension to a model of $V=L[\vec E]$; every model with infinitely many Woodin cardinals and a measurable above has an end-extension to a model of $\text{ZF}+\text{DC}+V=L(\mathbb{R})+\text{AD}$. These results generalize the famous Barwise extension theorem, of course, asserting that every countable model of ZF set theory admits an end-extension to a model of $\text{ZFC}+{V=L}$, a theorem which was simultaneously a technical culmination of Barwise’s pioneering methods in admissible set theory and infinitary logic and also one of those rare mathematical theorems that is saturated with philosophical significance. In this talk, I shall describe a new proof of the Barwise theorem that omits any need for infinitary logic and relies instead only on classical methods of descriptive set theory, while also providing the generalization I mentioned. This proof furthermore leads directly to the universal finite sequence, a $\Sigma_1$-definable finite sequence, which can be extended arbitrarily as desired in suitable end-extensions of the universe, a result holding important consequences for the nature of set-theoretic potentialism.  This work is joint with Kameryn J. Williams.

• J. D. Hamkins and K. J. Williams, “The $\Sigma_1$-definable universal finite sequence,” Journal of Symbolic Logic, 2021.
[Bibtex]
@ARTICLE{HamkinsWilliams2021:The-universal-finite-sequence,
author = {Joel David Hamkins and Kameryn J. Williams},
title = {The $\Sigma_1$-definable universal finite sequence},
journal = {Journal of Symbolic Logic},
year = {2021},
eprint = {1909.09100},
archivePrefix = {arXiv},
primaryClass = {math.LO},
doi = {10.1017/jsl.2020.59},
}

The ontology of mathematics, Japan Association for the Philosophy of Science, June 2022

I shall give the Invited Lecture for the Annual Meeting (online) of the Japan Association for the Philosophy of Science, 18-19 June 2022.

Abstract. What is the nature of mathematical ontology—what does it mean to make existence assertions in mathematics? Is there an ideal mathematical realm, a mathematical universe, that those assertions are about? Perhaps there is more than one. Does every mathematical assertion ultimately have a definitive truth value? I shall lay out some of the back-and-forth in what is currently a vigorous debate taking place in the philosophy of set theory concerning pluralism in the set-theoretic foundations, concerning whether there is just one set-theoretic universe underlying our mathematical claims or whether there is a diversity of possible set-theoretic conceptions.

Infinite Games, Frivolities of the Gods, Logic at Large Lecture, May 2022

The Dutch Association for Logic and Philosophy of the Exact Sciences (VvL) has organized a major annual public online lecture series called LOGIC AT LARGE, where “well-known logicians give public audience talks to a wide audience,” and I am truly honored to have been invited to give this year’s lecture. This will be an online event, the second of the series, scheduled for May 31, 2022 (note change in date!), and further access details will be posted when they become available. Free registration can be made on the VvL Logic at Large web page.

Abstract. Many familiar finite games admit natural infinitary analogues, which often highlight intriguing issues in infinite game theory. Shall we have a game of infinite chess? Or how about infinite draughts, infinite Hex, infinite Go, infinite Wordle, or infinite Sudoku? Let me introduce these games and use them to illustrate various fascinating concepts in the theory of infinite games.

Come enjoy the lecture, and stay for the online socializing event afterwards. Hope to see you there!

Reflection in second-order set theory with abundant urelements bi-interprets a supercompact cardinal

• J. D. Hamkins and B. Yao, “Reflection in second-order set theory with abundant urelements bi-interprets a supercompact cardinal,” Mathematics arXiv, 2022.
[Bibtex]
@ARTICLE{HamkinsYao:Reflection-in-second-order-set-theory-with-abundant-urelements,
author = {Joel David Hamkins and Bokai Yao},
title = {Reflection in second-order set theory with abundant urelements bi-interprets a supercompact cardinal},
journal = {Mathematics arXiv},
year = {2022},
note = {manuscript under review},
eprint = {2204.09766},
archivePrefix = {arXiv},
primaryClass = {math.LO},
url = {http://jdh.hamkins.org/second-order-reflection-with-abundant-urelements},
}

Abstract. After reviewing various natural bi-interpretations in urelement set theory, including second-order set theories with urelements, we explore the strength of second-order reflection in these contexts. Ultimately, we prove, second-order reflection with the abundant atom axiom is bi-interpretable and hence also equiconsistent with the existence of a supercompact cardinal. The proof relies on a reflection characterization of supercompactness, namely, a cardinal $\kappa$ is supercompact if and only if every $\Pi^1_1$ sentence true in a structure $M$ (of any size) containing $\kappa$ in a language of size less than $\kappa$ is also true in a substructure $m\prec M$ of size less than $\kappa$ with $m\cap\kappa\in\kappa$.

See also my talk at the CUNY Set Theory Seminar: The surprising strength of reflection in second-order set theory with abundant urelements

The surprising strength of reflection in second-order set theory with abundant urelements, CUNY Set Theory seminar, April 2022

This was an online talk on 15 April 2022 at 12:15 pm for the CUNY Set Theory Seminar, held on Zoom at 876 9680 2366.

Abstract. I shall give a general introduction to urelement set theory and the role of the second-order reflection principle in the second-order urelement set theories GBCU and KMU. With the abundant atom axiom, asserting that the class of urelements greatly exceeds the class of pure sets, the second-order reflection principle implies the existence of a supercompact cardinal in an interpreted model of ZFC. The proof uses a reflection characterization of supercompactness: a cardinal $\kappa$ is supercompact if and only if every second-order sentence $\psi$ true in a structure $M$ (of any size) in a language of size less than $\kappa$ is also true in a first-order elementary substructure $m\prec M$ of size less than $\kappa$. This is joint work with Bokai Yao.

Infinite Wordle and the Mastermind numbers

• J. D. Hamkins, “Infinite Wordle and the mastermind numbers,” Mathematics arXiv, 2022.
[Bibtex]
@ARTICLE{Hamkins:Infinite-Wordle-and-the-mastermind-numbers,
author = {Joel David Hamkins},
title = {Infinite Wordle and the mastermind numbers},
journal = {Mathematics arXiv},
year = {2022},
note = {Under review},
keywords = {under-review},
eprint = {2203.06804},
archivePrefix = {arXiv},
primaryClass = {math.LO},
url = {http://jdh.hamkins.org/infinite-wordle-mastermind},
}

Abstract. I consider the natural infinitary variations of the games Wordle and Mastermind, as well as their game-theoretic variations Absurdle and Madstermind, considering these games with infinitely long words and infinite color sequences and allowing transfinite game play. For each game, a secret codeword is hidden, which the codebreaker attempts to discover by making a series of guesses and receiving feedback as to their accuracy. In Wordle with words of any size from a finite alphabet of $n$ letters, including infinite words or even uncountable words, the codebreaker can nevertheless always win in $n$ steps. Meanwhile, the mastermind number $\frak{mm}$, defined as the size of the smallest winning set of guesses in infinite Mastermind for sequences of length $\omega$ over a countable set of colors without duplication, is uncountable, but its exact value turns out to be independent of ZFC, for it is provably equal to the eventually different number $\frak{d}({\neq^*})$, which is the same as the covering number of the meager ideal $\text{cov}(\mathcal{M})$. I thus place all the various mastermind numbers, defined for the natural variations of the game, into the hierarchy of cardinal characteristics of the continuum.

Infinite Wordle and the mastermind numbers, CUNY Logic Workshop, March 2022

This will be an in-person talk for the CUNY Logic Workshop at the Graduate Center of the City University of New York on 11 March 2022.

Abstract. I shall introduce and consider the natural infinitary variations of Wordle, Absurdle, and Mastermind. Infinite Wordle extends the familiar finite game to infinite words and transfinite play—the code-breaker aims to discover a hidden codeword selected from a dictionary $\Delta\subseteq\Sigma^\omega$ of infinite words over a countable alphabet $\Sigma$ by making a sequence of successive guesswords, receiving feedback after each guess concerning its accuracy. For any dictionary using the usual 26-letter alphabet, for example, the code-breaker can win in at most 26 guesses, and more generally in $n$ guesses for alphabets of finite size $n$. Meanwhile, for some dictionaries on an infinite alphabet, infinite play is required, but the code-breaker can always win by stage $\omega$ on a countable alphabet, for any fixed dictionary. Infinite Mastermind, in contrast, is a subtler game than Wordle because only the number and not the position of correct bits is given. When duplication of colors is allowed, nevertheless, the code-breaker can still always win by stage $\omega$, but in the no-duplication variation, no countable number of guesses (even transfinite) is sufficient for the code-breaker to win. I therefore introduce the mastermind number, denoted $\frak{mm}$, defined to be the size of the smallest winning no-duplication Mastermind guessing set, a new cardinal characteristic of the continuum, which I prove is bounded below by the additivity number $\text{add}(\mathcal{M})$ of the meager ideal and bounded above by the covering number $\text{cov}(\mathcal{M})$. In particular, the precise value of the mastermind number is independent of ZFC and can consistently be strictly between $\aleph_1$ and the continuum $2^{\aleph_0}$. 
In simplified Mastermind, where the feedback given at each stage includes only the numbers of correct and incorrect bits (omitting information about rearrangements), the corresponding simplified mastermind number is exactly the eventually different number $\frak{d}(\neq^*)$.

I am preparing an article on the topic, which will be available soon.
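The finite-alphabet claim in the abstract — that the code-breaker can win in at most $n$ guesses on an $n$-letter alphabet — can be sketched concretely: guessing the constant word in each letter, the "green" feedback reveals exactly the positions where the codeword has that letter, so after cycling through the alphabet the word is fully determined. The sketch below uses a simplified feedback model (only the set of matching positions for a constant guess), and the function names are my own, not from the paper:

```python
# Sketch of the n-guess strategy on a finite alphabet: each constant
# guess reveals the positions of one letter; all positions are known
# after at most n guesses.

def positions_of(codeword, letter):
    """Positional ('green') feedback for the constant guess in `letter`."""
    return {i for i, c in enumerate(codeword) if c == letter}

def crack(codeword, alphabet):
    """Recover the codeword in at most len(alphabet) constant guesses."""
    n = len(codeword)
    found = {}  # position -> letter, accumulated from feedback
    for letter in alphabet:
        for i in positions_of(codeword, letter):
            found[i] = letter
        if len(found) == n:  # word fully determined; stop early
            break
    return "".join(found[i] for i in range(n))

print(crack("hello", "abcdefghijklmnopqrstuvwxyz"))  # hello
```

The same idea lifts to infinite words over a finite alphabet, since each constant guess still determines all positions of one letter at once, which is the heart of the $n$-step bound.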

Pluralism in the ontology of mathematics, MaMuPhi, Paris, February 2022

This will be a talk for the conference L’indépendance mathématique et ses limites logiques, an instance of the MAMUPHI seminar (mathématiques – musique – philosophie), organized by Mirna Džamonja, 12 February 2022. Most talks will be in-person in Paris, but my talk will be on Zoom via https://u-pec-fr.zoom.us/j/86448599486 at 4:30 pm CET (10:30 am EST).

Abstract: What is the nature of mathematical ontology—what does it mean to make existence assertions in mathematics? Is there an ideal mathematical realm, a mathematical universe, that those assertions are about? Perhaps there is more than one. Does every mathematical assertion ultimately have a definitive truth value? I shall lay out some of the back-and-forth in what is currently a vigorous debate taking place in the philosophy of set theory concerning pluralism in the set-theoretic foundations, concerning whether there is just one set-theoretic universe underlying our mathematical claims or whether there is a diversity of possible set-theoretic conceptions.

Infinite Hex is a draw

• J. D. Hamkins and D. Leonessi, “Infinite Hex is a draw,” Mathematics arXiv, 2022.
[Bibtex]
@ARTICLE{HamkinsLeonessi:Infinite-Hex-is-a-draw,
author = {Joel David Hamkins and Davide Leonessi},
title = {Infinite Hex is a draw},
journal = {Mathematics arXiv},
year = {2022},
note = {Under review},
keywords = {under-review},
eprint = {2201.06475},
archivePrefix = {arXiv},
primaryClass = {math.LO},
url = {http://jdh.hamkins.org/infinite-hex-is-a-draw},
}

Abstract. We introduce the game of infinite Hex, extending the familiar finite game to natural play on the infinite hexagonal lattice. Whereas the finite game is a win for the first player, we prove in contrast that infinite Hex is a draw—both players have drawing strategies. Meanwhile, the transfinite game-value phenomenon, now abundantly exhibited in infinite chess and infinite draughts, regrettably does not arise in infinite Hex; only finite game values occur. Indeed, every game-valued position in infinite Hex is intrinsically local, meaning that winning play depends only on a fixed finite region of the board. This latter fact is proved under very general hypotheses, establishing the conclusion for all simple stone-placing games.

This is my second joint project with Davide Leonessi, the first being our work on Transfinite game values in infinite draughts; both projects grew out of his MSc work in MFoCS at Oxford, for which he earned a distinction in September 2021.

Here is a convenient online Hex player, for those who want to improve their game: http://www.lutanho.net/play/hex.html.

The model theory of set-theoretic mereology, Notre Dame Math Logic Seminar, February 2022

This will be a talk for the Mathematical Logic Seminar at the University of Notre Dame on 8 February 2022 at 2 pm in 125 Hayes Healy.

Abstract. Mereology, the study of the relation of part to whole, is often contrasted with set theory and its membership relation, the relation of element to set. Whereas set theory has found comparative success in the foundation of mathematics, since the time of Cantor, Zermelo and Hilbert, mereology is strangely absent. Can a set-theoretic mereology, based upon the set-theoretic inclusion relation ⊆ rather than the element-of relation ∈, serve as a foundation of mathematics? How well is a model of set theory ⟨M,∈⟩ captured by its mereological reduct ⟨M,⊆⟩? In short, how much set theory does set-theoretic mereology know? In this talk, I shall present results on the model theory of set-theoretic mereology that lead broadly to negative answers to these questions and explain why mereology has not been successful as a foundation of mathematics. (Joint work with Makoto Kikuchi)

See the research papers:

• Set-theoretic mereology
• J. D. Hamkins and M. Kikuchi, “Set-theoretic mereology,” Logic and Logical Philosophy, Special issue “Mereology and beyond, part II”, vol. 25, iss. 3, p. 285–308, 2016.
[Bibtex]
@ARTICLE{HamkinsKikuchi2016:Set-theoreticMereology,
author = {Joel David Hamkins and Makoto Kikuchi},
title = {Set-theoretic mereology},
journal = {Logic and Logical Philosophy, Special issue ``Mereology and beyond, part II''},
editor = {A.~C.~Varzi and R.~Gruszczy{\'n}ski},
year = {2016},
volume = {25},
number = {3},
pages = {285--308},
doi = {10.12775/LLP.2016.007},
eprint = {1601.06593},
archivePrefix = {arXiv},
primaryClass = {math.LO},
url = {http://jdh.hamkins.org/set-theoretic-mereology},
ISSN = {1425-3305},
MRCLASS = {03A05 (03E70)},
MRNUMBER = {3546211},
}
• The inclusion relations of the countable models of set theory are all isomorphic
• J. D. Hamkins and M. Kikuchi, “The inclusion relations of the countable models of set theory are all isomorphic,” ArXiv e-prints, 2017.
[Bibtex]
@ARTICLE{HamkinsKikuchi:The-inclusion-relations-of-the-countable-models-of-set-theory-are-all-isomorphic,
author = {Joel David Hamkins and Makoto Kikuchi},
title = {The inclusion relations of the countable models of set theory are all isomorphic},
journal = {ArXiv e-prints},
year = {2017},
note = {Manuscript under review},
eprint = {1704.04480},
archivePrefix = {arXiv},
primaryClass = {math.LO},
url = {http://jdh.hamkins.org/inclusion-relations-are-all-isomorphic},
keywords = {under-review},
}