title | subtitle | author | date | date-updated |
---|---|---|---|---|
The Logical Implication |
Languages and models of a first-order logic. |
Mort Yao |
2017-09-06 |
2017-09-09 |
Prologue. This is the summary of a set of notes I took in KU's Introduction to Mathematical Logic course, that I should have finished months ago but somehow procrastinated until recently. I basically followed Enderton's A Mathematical Introduction to Logic book (with all its conventions of mathematical notations). A few issues to address in advance:
- These notes are mainly about first-order logic. Propositional logic is assumed to be a prerequisite. The important thing to know is that its semantics (as a Boolean algebra) can be fully described by a truth table, which is obviously finite and mechanizable.
-
Naïve set theory is also a prerequisite, that is, one must accept the notion of sets unconditionally: A set is just a collection of objects, and it must exist as we describe it. We are convinced that no paradoxical use of sets will ever emerge, and that no "naïve set" could be a proper class. The formal notion of sets is built upon first-order logic, but without some informal reference to sets we can't even say what a first-order logic is.
- A set
$S$ is said to be "countable" iff there exists a bijection$f : \mathbb{N} \to S$ ; otherwise it is "uncountable".
- A set
- One must be willing to accept the validity of mathematical induction on natural numbers (or the well-ordering principle for natural numbers). Again, to formalize the induction principle we need the set theory, but we don't have it in the first place (until we introduce a first-order logic).
- On the model-theoretic part: Notes on definability and homomorphisms are omitted from this summary, not to mention ultraproducts and the Löwenheim-Skolem theorem. Model theory is a big topic in its own right and deserves an individual treatment, better with some algebraic and topological contexts. (Note that the homomorphism theorem is used in proving the completeness theorem; also the notion of definability is mentioned in the supplementary sections.)
- On the proof-theoretic part: Notes on some metatheorems are also omitted, as they are a purely technical aspect of a Hilbert-style deductive system. (One may find it convenient to prove an actual theorem with more metatheorems, but they are really not adding any extra power to our system.)
- The relation between logic and computability (i.e., Gödel's incompleteness theorems) is not discussed.
- But the meanings of "decidable" and "undecidable" are clear from the previous notes Mst. #6 (from a computer scientist's perspective).
-
Axiomatic set theory, which is another big part of the course, is not included in these notes. (Maybe I'm still too unintelligent to grasp the topic.) But it is good to know:
- First-order logic has its limitation in definability (i.e., it's not capable of ruling out non-standard models of arithmetic), until we assign to it a set-theoretic context. So set theory is often considered a foundation of all mathematics (for its expressive power).
- Axiom of Choice (AC) causes some counter-intuitive consequences, but it was shown to be consistent with ZF axioms (Gödel 1938). And there are models of ZF$\cup\lnot$AC so well as ZF$\cup$AC.
- Constructivists tend to avoid AC in mathematics. However, Henkin's proof of the completeness theorem in first-order logic assumes AC (Step II in finding a maximal consistent set). (thus it is a non-constructive proof!^[Is there a constructive approach as a replacement of Henkin's construction? https://math.stackexchange.com/questions/1462408/is-there-a-constructive-approach-as-a-replacement-of-henkins-construction])
- Intuitionistic logic is not in the scope of these course notes. (And most logic books, including the Enderton one, are not written by constructive mathematicians.) Basically, in a Hilbert-style system, a classical logic would admit all tautologies in propositional logic as Axiom Group 1. Intuitionistic logic, in contrast, rejects those tautologies that are non-constructive in a first-logic setting.
-
First-order language: A formal language consisting of the following symbols:
- Logical symbols
- Parentheses:
$($ ,$)$ . - Connective symbols:
$\to$ ,$\lnot$ . - Variables:
$v_1$ ,$v_2$ , ... - Equality symbol (optional 2-place predicate symbols):
$=$ .
- Parentheses:
- Parameters (non-logical symbols; open to interpretation)
- Universal quantifier symbol:
$\forall$ . - Predicate symbols (relation symbols):
$P_1$ ,$P_2$ , ... - Constant symbols (0-place function symbols):
$c_1$ ,$c_2$ , ... - Function symbols:
$f_1$ ,$f_2$ , ...
- Universal quantifier symbol:
- When specifying a concrete first-order language
$\mathcal{L}$ , we must say (i) whether the quality symbol is present; (ii) what the parameters are.
- Logical symbols
Remark 7.1. (Language of propositional logic) The language of propositional logic may be seen as a stripped form of first-order languages, in which parentheses, connective symbols and sentential symbols (the only parameters; may be treated as 0-place predicate symbols) are present. Intuitively, that language might seem too weak to encode our formal reasoning in all kinds of mathematics and many practical areas, so to speak.
-
Terms and formulas
- An expression is a finite sequence of symbols (i.e., a finite string). Among all expressions, we are interested in two kinds of them which we refer to as terms and formulas.
- A term is either:
- a single variable or constant symbol; or
-
$f t_1 \cdots t_n$ , where$f$ is a$n$ -place function symbol, and every$t_i$ $(1 \leq i \leq n)$ is also a term.
- A formula (or wff, well-formed formula) is either:
-
$P t_1 \cdots t_n$ , where$P$ is a$n$ -place predicate symbol (or the equality symbol$=$ ), and every$t_i$ $(1 \leq i \leq n)$ is a term; or - one of the following forms:
-
$(\lnot \psi)$ , where$\psi$ is also a formula; -
$(\psi \to \theta)$ , where$\psi$ and$\theta$ are also formulas; -
$\forall v_i \psi$ , where$v_i$ is a variable and$\psi$ is also a formula.
-
-
- A variable may occur free in a formula. A formula without any free variable is called a sentence.
Remark 7.2. (Metatheory and philosophical concerns) A first-order expression, as a finite sequence (also called a tuple), may be defined in terms of ordered pairs in axiomatic set theory. But we will not appeal to set theory in our first definitions of expressions in logic. (So far we have no notion about what a "set" formally is!)
A further concern is whether our definitions of terms and formulas are well established, that is, since we are defining the notions of terms and formulas inductively, would it be possible that there is a certain term or formula that is covered by our recursive definition, but can never be actually built using these operations? To show that first-order terms/formulas are well-defined, a beginner might try to prove these induction principles by mathematical induction on the complexities of terms/formulas, but that would rely on the fact that the set of natural numbers
For now, we must appeal to a metatheory that we can easily convince ourselves by intuition, so that we will accept these induction principles and the notion of "naïve sets" (or collections, if we don't want to abuse the formal term of sets too much). Notwithstanding, I believe that a prudent person can bootstrap theories like this without drawing in any inconsistency.
Remark 7.3. (Context freedom, unique readability and parentheses) Since the formations of first-order terms and formulas make use of context-free rules, one familiar with formal languages and automata theory might ask, "Are the set of terms/formulas context-free languages?" Generally they are not, since our set
Remark 7.4. (Abbreviations on notation) Why don't we have the existential quantifier
Remark 7.5. (Sentences and truth values) In propositional logic, we don't generally know whether a formula evaluates to true until every sentential symbol is assigned a truth value. (Sometimes we can tell the truth value with a little less information than what is required, if we apply a so-called short-circuit evaluation strategy, e.g., if
-
Structures: A structure (or an interpretation)
$\mathfrak{A}$ assigns a domain$|\mathfrak{A}|$ to the language$\mathcal{L}$ , and:- Every predicate symbol is assigned a relation
$P^\mathfrak{A} \subseteq |\mathfrak{A}|^n$ . - Every function symbol is assigned a function
$f^\mathfrak{A} : |\mathfrak{A}|^n \to |\mathfrak{A}|$ . - Every constant symbol is assigned a member
$c^\mathfrak{A}$ of the domain$\mathfrak{A}$ . - The universal quantifier symbol
$\forall$ is assigned the domain$|\mathfrak{A}|$ . (So it makes sense to say: "for all$x$ in$|\mathfrak{A}|$ ...")
- Every predicate symbol is assigned a relation
-
Satisfaction and truth
- Given a structure
$\mathfrak{A}$ and an assignment of variables$s : V \to |\mathfrak{A}|$ , we define an extension function$\bar{s} : T \to |\mathfrak{A}|$ (where$T$ is the set of all terms) that maps any term into the domain$|\mathfrak{A}|$ . - With the term valuation
$\bar{s}$ , we define recursively that a structure$\mathfrak{A}$ satisfies a formula$\varphi$ with an assignment$s$ of variables, written as$$\models_\mathfrak{A} \varphi[s]$$ If this is not the case, then$\not\models_\mathfrak{A} \varphi[s]$ and we say that$\mathfrak{A}$ does not satisfy$\varphi$ with$s$ . - For a sentence
$\sigma$ (which is just a formula with no free variables), the assignment of variables$s : V \to |\mathfrak{A}|$ does not make a difference whether$\varphi$ is satisfied by$\mathfrak{A}$ . So if$\models_\mathfrak{A} \sigma$ , we say that$\sigma$ is true in$\mathfrak{A}$ or that$\mathfrak{A}$ is a model of$\sigma$ . -
Satisfiability of formulas: A set
$\Gamma$ of formulas is said to be satisfiable iff there is a structure$\mathfrak{A}$ and an assignment$s$ of variables such that$\models_\mathfrak{A} \Gamma[s]$ .
- Given a structure
-
Logical implication and validity
- In a language
$\mathcal{L}$ , we say that a set$\Gamma$ of formulas logically implies a formula$\varphi$ , iff for every structure$\mathfrak{A}$ of$\mathcal{L}$ and every assignment$s : V \to |\mathfrak{A}|$ such that$\models_\mathfrak{A} \gamma [s]$ (for all$\gamma \in \Gamma$ ), it also holds that$\models_\mathfrak{A} \varphi [s]$ (note that$\varphi$ is not required to be a sentence):$$\Gamma \models \varphi$$ This is the analogue of tautological implication in propositional logic:$A \Rightarrow B$ , iff every truth assignment that satisfies$A$ also satisfies$B$ . - If the empty set logically implies a formula, i.e.,
$\emptyset \models \varphi$ , we write this fact simply as$\models \varphi$ and say that$\varphi$ is valid. A formula is valid iff given any assignment of variables, it is true in every structure; this is the analogue of tautologies in propositional logic: something that is considered "always true".
- In a language
Remark 7.6. (Dichotomy of semantic truthness and the liar's paradox)
It should be made clear from the definition that given a structure and an assignment, either
A problem arises with self-referential terms, woefully: Assume that we have a first-order language
Remark 7.7. (Decidability of truth/validity) In propositional logic, it is easy to see that given a truth assignment of sentential symbols, every formula can be decided for its truth or falsehood. Moreover, even without any truth assignment, one can enumerate a truth table to find out whether a given formula is a tautology. Truth and validity are decidable in propositional logic. However, this is often not the case in first-order logic: In order to decide whether a sentence is true, one needs to find the truth values of all prime formulas (i.e., formulas like
If semantic truth/validity is generally undecidable, how do we say that some formula is true in a predefined structure? Well, we can't, in most general cases, since an infinite argument of truth is a useless argument (you can't present it to someone / some Turing machine, as no physical device is capable of handling such an infinite object). Fortunately, there is a feasible way to say something is true, without appealing to any specific structures (that may give rise to unwanted undecidability), and that is called a formal deduction (also called a proof, expectedly).
-
Formal deduction: Given a set
$\Lambda$ of formulas (axioms), a set of rules of inference, we say that a set$\Gamma$ of formulas (hypotheses) proves another formula$\varphi$ , or$\varphi$ is a theorem of$\Gamma$ , iff there is a finite sequence (called a deduction of$\varphi$ from$\Gamma$ )$\langle \alpha_0, \dots, \alpha_n \rangle$ such that #.$\alpha_n$ is just$\varphi$ . #. For each$0 \leq k \leq n$ , either *$\alpha_k \in \Gamma \cup \Lambda$ ; or *$\alpha_k$ is obtained by a rule of inference from a subset of previous formulas$A \subseteq \bigcup_{0 \leq i < k} \alpha_i$ .$$\Gamma \vdash \varphi$$ - Formal systems and proof calculi: Different deductive systems made different choices on the set of axioms and rules of inferences. A natural deduction system may consist of no axiom but many rules of inference; on the contrary, a Hilbert-style system (named obviously after David Hilbert) uses many axioms but only two rules of inference. A proof calculus is the approach to formal deduction in a specified system, and as it is called a "calculus", any derivation in it must contain only a finite number of steps so as to be calculable (by a person or by a machine).
- We will use a Hilbert-style deductive system here:
-
Rules of inference
#. Modus ponens
$$\frac{\Gamma \vdash \psi \quad \Gamma \vdash (\psi \to \varphi)}{\Gamma \vdash \varphi}$$ #. Generalization$$\frac{\vdash \theta}{\vdash \forall x_1 \cdots \forall x_n \theta}$$ (where$\theta \in \Lambda$ .) -
Logical axioms: In a deductive system, axioms are better called logical axioms, to stress the fact that they are logically valid formulas in every structure, i.e., that their validity is not open to interpretation.
#. (Tautology)
$\alpha$ , where$\models_t \alpha$ . (take sentential symbols to be prime formulas in first-order logic) #. (Substitution)$\forall x \alpha \to \alpha^x_t$ , where$t$ is substitutable for$x$ in$\alpha$ . #.$\forall x (\alpha \to \beta) \to (\forall x \alpha \to \forall x \beta)$ . #.$\alpha \to \forall x \alpha$ , where$x$ does not occur free in$\alpha$ . #.$x = x$ . #.$x = y \to (\alpha \to \alpha')$ , where$\alpha$ is atomic and$\alpha'$ is obtained from$\alpha$ by replacing$x$ in zero or more places by$y$ .
-
Rules of inference
#. Modus ponens
Remark 7.8. (Validity of logical axioms)
It should be intuitively clear that all logical axioms are convincing, and that their validity can be argued without appealing to any specific model. In particular, for an axiom
Remark 7.9. (Tautological implication, logical implication and deduction)
If
It is worth noticing that even though
-
$\Gamma$ proves$\varphi$ . (There is a formal deduction that derives$\varphi$ from$\Gamma$ and axioms$\Lambda$ in our deductive system; it is finite and purely syntactical, in the sense that it does not depend on any structure or assignment of variables.) -
$\Gamma$ logically implies$\varphi$ . (For every structure and every assignment of variables,$\varphi$ holds true given$\Gamma$ . Of course, this is a semantical notion in the sense that it does involve structures and assignments of variables, which are infinite in numbers so it would be impossible for one to check this mechanically.) -
$\Gamma \cup \Lambda$ tautologically implies$\varphi$ . (We can reduce a first-order logic to propositional logic by adding logical axioms to the set of hypotheses, preserving all truths. For each prime formula, this is still a semantical notion for its truth value depends on structures / assignments of variables.)
-
Soundness a.
$\Gamma \vdash \varphi \implies \Gamma \models \varphi$ .- Proof idea:
#. For
$\varphi \in \Lambda$ , show that every logical axiom is valid (Lemma 25A in Enderton p. 132ff.), that is,$\models \varphi$ . Then trivially$\Gamma \models \varphi$ ; #. For$\varphi \in \Gamma$ , we have trivially$\Gamma \models \varphi$ ; #.$\varphi$ is obtained by generalization on variable$x$ from a valid formula$\theta$ . Since$\models \theta$ (if$\theta$ is an axiom, then this is already shown in Step 1; if$\theta$ is another generalization, then this can be shown by IH), for every structure$\mathfrak{A}$ and$a \in |\mathfrak{A}|$ ,$\models_\mathfrak{A} \theta[s(x|a)]$ , then by definition we have$\models_\mathfrak{A} \forall x \theta[s]$ . Therefore$\models \forall x \theta$ ; #.$\varphi$ is obtained by modus ponens from$\psi$ and$(\psi \to \varphi)$ . By IH we have$\Gamma \models \psi$ and$\Gamma \models (\psi \to \varphi)$ . Show that$\Gamma \models \varphi$ using Exercise 1 in Enderton p. 99. (NB. the wording in the last line of Enderton p. 131, i.e., "follows at once", seems too sloppy to me: we have not proved modus ponens semantically yet.) -
Consistency of formulas: A set
$\Gamma$ of formulas is said to be consistent iff for no formula$\varphi$ it is the case that both$\Gamma \vdash \varphi$ and$\Gamma \vdash (\lnot\varphi)$ .- By the soundness theorem, an inconsistent set
$\Gamma$ of formulas gives rise to both$\Gamma \models \varphi$ and$\Gamma \models (\lnot\varphi)$ . As discussed before, it would throw our trust on mathematical truths into fire -- we will have proved that some statement is both true and false! b. If$\Gamma$ is satisfiable, then$\Gamma$ is consistent.
(a. and b. are equivalent representations of the soundness theorem.)
- By the soundness theorem, an inconsistent set
- Proof idea:
#. For
-
Completeness a.
$\Gamma \models \varphi \implies \Gamma \vdash \varphi$ . b. If$\Gamma$ is consistent, then$\Gamma$ is satisfiable.
(a. and b. are equivalent representations of the completeness theorem.)- Proof idea: We will prove first a weaker form of b., i.e., the completeness for a countable language
$\mathcal{L}$ . Let$\Gamma$ be a consistent set of formulas. We show that it is satisfiable. #. Extend the language$\mathcal{L}$ with a countable set$\bar{C}$ of new constant symbols and get a new language $\mathcal{L}\bar{C}$; #. Given the set $\Gamma$ of $\mathcal{L}$-formulas, show that there is a set $\bar\Gamma$ of $\mathcal{L}\bar{C}$-formulas that is consistent, complete, deductively closed and Henkinized, i.e., for every formula$\exists x \varphi \in \Gamma$ there is a "witness" constant$\bar{c} \in \bar{C}$ such that$\varphi^x_\bar{c} \in \bar{\Gamma}$ ; #. Build a structure $\mathfrak{A}0$ from $\bar\Gamma$ where $|\mathfrak{A}0|$ is the set of terms of $\mathcal{L}\bar{C}$. The assignment $s$ maps every variable to itself; #. Define an equivalence relation $E$ on $|\mathfrak{A}0|$: $t E t' \iff t = t' \in \bar\Gamma$. Show by induction that for any $\mathcal{L}\bar{C}$-formula $\varphi$, $\models{\mathfrak{A}0} \varphi^* [s] \iff \varphi \in \bar\Gamma$ (where $\varphi^*$ is $\varphi$ with $=$ replaced by $E$ everywhere); #. Show by the homomorphism theorem that $\models\mathfrak{A} \varphi[s] \iff \varphi \in \bar\Gamma$ (where $\mathfrak{A} = \mathfrak{A}0 / E$); #. Restrict the structure $\mathfrak{A}$ (a model of $\mathcal{L}\bar{C}$) to$\mathcal{L}$ by dropping all new constants$\bar{C}$ . Then$\Gamma$ is satisfiable with$\mathfrak{A}$ and$s$ in$\mathcal{L}$ .
- Proof idea: We will prove first a weaker form of b., i.e., the completeness for a countable language
-
Compactness a.
$\Gamma \models \varphi \implies$ There is a finite$\Gamma_0 \subseteq \Gamma$ such that$\Gamma_0 \models \varphi$ . b. If every finite subset$\Gamma_0$ of$\Gamma$ is satisfiable, then$\Gamma$ is satisfiable.
(a. and b. are equivalent representations of the compactness theorem.)- Proof idea: A simple corollary of soundness and completeness theorems.
Remark 7.10. (Soundness and completeness) The soundness and the completeness theorems together evidence the equivalence of consistency and satisfiability of a set of formulas, or that of validity and provability of a formula. The completeness theorem is by no means an obvious result; the first proof was given by Kurt Gödel in 1930^[Gödel's original proof of the completeness theorem: https://en.wikipedia.org/wiki/Original_proof_of_G%C3%B6del%27s_completeness_theorem], but the proof that we use today is given by Leon Henkin in 1949 [@henkin1949completeness], which easily generalizes to uncountable languages.
Remark 7.11. (Completeness and incompleteness)
Note that the completeness theorem should not be confused with Gödel's incompleteness theorems. Specifically, the completeness theorem claims that (unconditionally) every formula that is logically implied by
The equality symbol
One reason is that in many first-order systems, functions (operations) are defined as axioms using equalities; we need a general way to say "something is defined as..." or just "something is..." There would be no better way of saying this rather than representing it as an equality, so we won't have the hassle of interpreting a made-up relation in every model. Consider the language of elementary number theory, in the intended model
The equality is a little special, since it is the most common relation with reflexivity, as justified by Axiom Group 5, i.e.,
Lemma 7.12. (Symmetry) If
Proof. Given
Lemma 7.13. (Transitivity) If
Proof. Given
Lemma 7.14. (Antisymmetry) If
Proof. Trivial.
Note that if any partial order is definable on a structure, the equality symbol may not be indispensable in our language, e.g., consider the language of set theory, where the equivalence of two sets
Consider again the language of elementary number theory, in the intended model
For any given
Recall that in a Peano arithmetic modeled by
Clearly, every number but
We can go even further. Let
Counter-intuitively, this is not the kind of "arithmetic" we are used to. What we are trying to do is to formulate the arithmetic for standard natural numbers that we use everyday (i.e.,
Well, the lucky part is that we can at least have the induction axiom as a first-order sentence:
Ideally, we would like to have a bunch of axioms that perfectly defines the model of arithmetic, where no non-standard part is allowed to exist, i.e., the set of numbers is well-ordered by a definable ordering relation
The intuition here is that in order to rule out any non-standard chains of numbers, we must find a 1-place predicate
- The first approach is by allowing quantification over relations. So we can say "for all such relations
$P$ , it holds that$\forall n P n$ ". Formally:$$\forall P (P \mathbf{0} \land \forall x (P x \to P \mathbf{S} x)) \to \forall n P n$$ Of course, in our previous definition of first-order languages, for$\forall v_i \psi$ to be a well-formed formula,$v_i$ is a variable such that$v_i \in |\mathfrak{N}|$ given a model$\mathfrak{N}$ ; here we have$P \subseteq |\mathfrak{N}|$ hence$P \in \mathcal{P}(|\mathfrak{N}|)$ . So in a first-order logic we would not be able to do this (we can only quantify a variable in the domain$|\mathfrak{N}|$ ). This approach leads to a second-order logic (where we can not only quantify over a plain variable in$|\mathfrak{N}|$ , but also quantify over a relation variable in the power set of the domain, i.e.,$\mathcal{P}(|\mathfrak{N}|)$ ; that gives our logic more expressive power!). - As we see, a relation is essentially a subset of
$|\mathfrak{N}|$ (thus its range is also a set); it is tempting to formulate Peano arithmetic using the notion of sets. Indeed, we can rewrite the formula in Approach 1 into the language of set theory as:$$\forall y (\emptyset \in y \land \forall x (x \in y \to S(x) \in y)) \to \forall n\ n \in y$$ where we encode the standard number$\mathbf{0}$ as$\emptyset$ ,$\mathbf{S}x$ as$S(x) = x \cup {x}$ . Clearly there is no non-standard number in this set-theoretic model. This is exactly how we define natural numbers$\mathbb{N}$ (as a minimal inductive set$\omega$ ) in set theory, and its existence is justified by the so-called axiom of infinity. Note that once we introduce the set theory (using a first-order language), we have the equivalently expressive (sometimes paradoxical) power of any arbitrary higher-order logic. Fundamentally.^[Many logicians (Kurt Gödel, Thoralf Skolem, Willard Van Quine) seem to adhere to first-order logic. And that's why.]
Books:
Herbert B. Enderton, A Mathematical Introduction to Logic, 2nd ed.
Kenneth Kunen, The Foundations of Mathematics.
Articles:
Terence Tao, "The completeness and compactness theorems of first-order logic," https://terrytao.wordpress.com/2009/04/10/the-completeness-and-compactness-theorems-of-first-order-logic/.
Asger Törnquist, "The completeness theorem: a guided tour," http://www.math.ku.dk/~asgert/teachingnotes/iml-completenessguide.pdf.
Eliezer Yudkowsky, "Godel's completeness and incompleteness theorems," http://lesswrong.com/lw/g1y/godels_completeness_and_incompleteness_theorems/.
Eliezer Yudkowsky, "Standard and nonstandard numbers," http://lesswrong.com/lw/g0i/standard_and_nonstandard_numbers/.
David A. Ross, "Fun with nonstandard models," http://www.math.hawaii.edu/~ross/fun_with_nsmodels.pdf.
Papers: