Introduction to Automated Theorem Proving
IA085: Satisfiability and Automated Reasoning
Martin Jonáš
FI MUNI, Spring 2024
Contents
Propositional satisfiability (SAT)
• (A ∨ ¬B) ∧ (¬A ∨ C)
• is it satisfiable?
Satisfiability modulo theories (SMT)
• x = 1 ∧ x = y + y ∧ y > 0
• is it satisfiable over reals?
• is it satisfiable over integers?
Automated theorem proving (ATP)
• axioms: ∀x (x + x = 0), ∀x∀y (x + y = y + x)
• do they imply ∀x∀y ((x + y) + (y + x) = 0)?
1 / 41
Big change
Today we are dealing with quantifiers!
2 / 41
Big change
Today we are dealing with quantifiers!
Today we are not dealing with theories!
2 / 41
First-order theorem proving
Problem specification
Input
• a set of hypotheses {H1, H2, . . . , Hk} that are arbitrary closed formulas
• a goal G that is arbitrary closed formula
Problem
• decide whether H1 ∧ H2 ∧ . . . ∧ Hk |= G
Notes
• not considering any background theory, only interpreted symbol is equality
(theory of UF)
3 / 41
Example
Claim If all elements of a group have order 2, the group is commutative.
4 / 41
Example
Claim If all elements of a group have order 2, the group is commutative.
Formalization in signature Σ = {=, ·, 1}
Hypotheses
• H1 = ∀x (1 · x = x ∧ x · 1 = x)
• H2 = ∀x∃y (x · y = 1)
• H3 = ∀x∀y∀z ((x · y) · z = x · (y · z))
• H4 = ∀x (x · x = 1)
Goal
• G = ∀x∀y(x · y = y · x)
Task
H1 ∧ H2 ∧ H3 ∧ H4 |= G
4 / 41
Proof by refutation
Goal
• prove H1 ∧ H2 ∧ . . . ∧ Hk |= G
Proof by refutation
• prove H1 ∧ H2 ∧ . . . ∧ Hk ∧ ¬G is unsatisfiable
5 / 41
Proving unsatisfiability
System I of inference rules
C1 C2 . . . Ck
A
Proof of unsatisfiability of set of formulas Φ is a tree with
• leaves from Φ
• inner nodes corresponding to inference rules
• root ⊥
6 / 41
Demo
7 / 41
Soundness of inferences
Sound inference rule
• if
C1 C2 . . . Ck
A
• then
C1 ∧ C2 ∧ . . . ∧ Ck |= A
8 / 41
Proving unsatisfiability
Important distinction
• Φ is unsatisfiable (Φ |= ⊥)
• Φ can be proven unsatisfiable using the proof system I (Φ ⊥)
Soundness
• Φ ⊥ implies Φ |= ⊥
• can be proven by proving soundness of each inference rule separately
Refutation completeness
• Φ |= ⊥ implies Φ ⊥
• proofs usually much harder
9 / 41
Proving unsatisfiability
Proving unsatisfiability of sets of first-order formulas
• in general undecidable
10 / 41
Proving unsatisfiability
Proving unsatisfiability of sets of first-order formulas
• in general undecidable
Challenge for the rest of the lecture
• Is the problem semi-decidable (recursively enumerable)?
• Is its complement semi-decidable (recursively enumerable)?
10 / 41
Today
Two proof systems
• resolution – sound and refutation complete for formulas without equalities
• superposition – sound and refutation complete for arbitrary formulas (with
or without equalities)
11 / 41
Notation
• variables x, y, z, . . .
• set of variables
−→
X
• constants c, d (in the Σ-signature, fixed elements of the Σ-structure)
• t[s] = a term t that can contain a subterm s
• given t[s], we denote as t[s2] the result of replacing s in t by s2
12 / 41
Normal forms
Goal
We want to convert the input formula to a conjunctive normal form.
• atomic formula = predicate symbol applied to terms (P(x, f(y), g(c)))
• literal = atomic formula or its negation
• clause = disjunction of literals with all variables quantified universally
(∀x∀y.
(
P(x, f(y), g(c)) ∨ Q(y)
)
)
• formula in CNF = conjunction of clauses
13 / 41
Negation normal form
Rationale
• we want to remove existential quantifiers
• some universal quantifiers under negations are in fact existential
Negation Normal Form (NNF)
• negations are applied only to atomic formulas
• the formula does not contain implication (→) and equivalence (↔)
14 / 41
Conversion into NNF
Conversion to NNF
1. rewrite all φ ↔ ψ to (φ → ψ) ∧ (φ ← ψ)
2. rewrite all φ → ψ to ¬φ ∨ ψ
3. apply double negation elimination, De Morgan rules, and quantifier
negations until fixed point
– rewrite ¬¬φ to φ
– rewrite ¬(φ ∧ ψ) to (¬φ) ∨ (¬ψ)
– rewrite ¬(φ ∨ ψ) to (¬φ) ∧ (¬ψ)
– rewrite ¬(∃x φ) to ∀x ¬φ
– rewrite ¬(∀x φ) to ∃x ¬φ
If the formulas are represented by DAGs, the conversion is linear.
15 / 41
Prenex normal form
Rationale
• we want to move the quantifiers to the top level (to create clauses)
Prenex Normal Form (PNF)
• formula is of form Q1x1Q2x2 . . . Qnxn φ where
• Qi ∈ {∃, ∀}
• φ is quantifier free
16 / 41
Conversion to PNF
Conversion to PNF
1. convert to NNF
2. rename bound variables to unique names
3. apply prenexing rules until fixed point
– rewrite φ ∧ (∀x ψ) to ∀x (φ ∧ ψ)
– rewrite φ ∧ (∃x ψ) to ∀x (∃ ∧ ψ)
– + symmetric variants
17 / 41
Skolem normal form
Skolem Normal Form (SNF)
• formula is of form ∀x1∀x2 . . . ∀xn φ where
• φ is quantifier free
18 / 41
Conversion to SNF
Conversion to SNF
1. convert to PNF
2. while the formula is of form
∀x1∀x2 . . . ∀xm∃y.φ,
where φ can contain quantifiers, replace y by fy(x1, x2, . . . , xm), where fy is a
new function symbol
19 / 41
Conversion to SNF
The formula skolemize(φ) is in general not equivalent to φ.
• φ = ∀x∃y (x + y = 0)
• skolemize(φ) = ∀x (x + f(x) = 0)
20 / 41
Conversion to SNF
The formula skolemize(φ) is in general not equivalent to φ.
• φ = ∀x∃y (x + y = 0)
• skolemize(φ) = ∀x (x + f(x) = 0)
Theorem
The formulas φ and skolemize(φ) are equisatisfiable.
20 / 41
Conversion to CNF
Conversion to CNF
1. convert to an equisatisfiable formula in SNF
2. obtain formula ∀
−→
X φ
3. convert φ to CNF (using distributivity or Tseitin)
4. obtain formula ∀
−→
X (C1 ∧ C2 ∧ . . . ∧ Ck)
5. obtain a set of formulas (∀
−→
X1 C1) ∧ (∀
−→
X2 C2) ∧ . . . (∀
−→
Xk Ck)
21 / 41
Conversion to CNF
Conversion to CNF
1. convert to an equisatisfiable formula in SNF
2. obtain formula ∀
−→
X φ
3. convert φ to CNF (using distributivity or Tseitin)
4. obtain formula ∀
−→
X (C1 ∧ C2 ∧ . . . ∧ Ck)
5. obtain a set of formulas (∀
−→
X1 C1) ∧ (∀
−→
X2 C2) ∧ . . . (∀
−→
Xk Ck)
(why is this correct?)
21 / 41
Conversion to CNF: improvements
In practice, conversion to full PNF is not needed.
Improvements
• convert to conjunctions of formulas in PNF, or
• convert to NNF, perform skolemization, then convert to conjunctions of
formulas in PNF
Why is this better?
22 / 41
CNF conventions
• clauses = sets of literals
• formulas = sets of clauses
• all variables are universally quantified → do not write the quantifiers
Example
• (∀P(x)) ∧ (∀y(Q(c1, y) ∨ Q(c2, y)))
• {{P(x)}, {Q(c1, y), Q(c2, y)}
23 / 41
Example
Recall the example
• H1 = ∀x (1 · x = x ∧ x · 1 = x)
• H2 = ∀x∃y (x · y = 1)
• H3 = ∀x∀y∀z ((x · y) · z = x · (y · z))
• H4 = ∀x (x · x = 1)
• C = ∀x∀y(x · y = y · x)
What is the CNF of the formula that we are trying to refute?
24 / 41
Saturation based theorem-proving
Saturation algorithm
1. start with the set of formulas Φ (not containing ⊥)
2. while there is an inference rule i ∈ I with premises from Φ and conclusion
A ∈ Φ
– if A = ⊥, return unsatisfiable
– otherwise set Φ to Φ ∪ {A} and continue
3. otherwise, there are no new inferences to add (the set Φ is saturated)
4. return satisfiable
25 / 41
Soundness and completeness
Theorem
Let I be a sound proof system. If the set of formulas Φ is satisfiable, the
saturation algorithm either returns satisfiable or does not terminate.
Not a theorem
Let I be a complete proof system. If the set of formulas Φ is unsatisfiable, the
saturation algorithm returns unsatisfiable.
Theorem
Let I be a complete proof system. If the set of formulas Φ is unsatisfiable, the
saturation algorithm returns unsatisfiable, given that rule selection is fair.
26 / 41
First-order resolution
Reminder: Propositional resolution
A ∨ C1 ¬A ∨ C2
C1 ∨ C2
In first-order logic, the above rule is still sound, but is not complete.
• {{P(x)}, {¬P(y + z)}} |= ⊥
• {{P(x)}, {¬P(y + z)}} ⊥
27 / 41
Reminder: Propositional resolution
A ∨ C1 ¬A ∨ C2
C1 ∨ C2
In first-order logic, the above rule is still sound, but is not complete.
• {{P(x)}, {¬P(y + z)}} |= ⊥
• {{P(x)}, {¬P(y + z)}} ⊥
Solution
• allow not only exactly matching complementary literals A and ¬A,
• but also literals that can be made complementary by using a suitable
instantiation of universal quantifiers
27 / 41
Substitution
Substitution
• a function from variables to terms
• θ = {x → y + 3, y → f(y)}
• not mentioned variables are not changed (θ(z) = z)
Application
• formula φ
• result φθ is result of simultaneous replacement of each x in φ by θ(x)
• (x − y)θ = (y + 3) − f(y)
• analogous tθ for terms
28 / 41
Substitution
Theorem
If C is a clause and θ is a substitution, then C |= Cθ.
Message
If we have a set of clauses Φ, it is safe to apply substitution θ to C ∈ Φ and
assume that Cθ holds.
29 / 41
Unification
Unifier of terms t and s
• substitution θ such that tθ = sθ
• analogous definition for atomic formulas
Examples
• f(g(x), y) and f(z, h(z)) are
30 / 41
Unification
Unifier of terms t and s
• substitution θ such that tθ = sθ
• analogous definition for atomic formulas
Examples
• f(g(x), y) and f(z, h(z)) are unifiable
• f(g(x), y) and f(h(z), h(z)) are
30 / 41
Unification
Unifier of terms t and s
• substitution θ such that tθ = sθ
• analogous definition for atomic formulas
Examples
• f(g(x), y) and f(z, h(z)) are unifiable
• f(g(x), y) and f(h(z), h(z)) are not unifiable
• f(f(x)) and f(c) are
30 / 41
Unification
Unifier of terms t and s
• substitution θ such that tθ = sθ
• analogous definition for atomic formulas
Examples
• f(g(x), y) and f(z, h(z)) are unifiable
• f(g(x), y) and f(h(z), h(z)) are not unifiable
• f(f(x)) and f(c) are not unifiable
30 / 41
Most-general unifier
Problem
• there are many different unifiers for a given set of pairs
• unifier of x and f(y) can be θ = {x → f(10), y → 10}, which is too specific
Most general unifier of terms t and s
• unifier θ such that every unifier ρ can be obtained as ρ = f ◦ θ for suitable
substitution f
• unique up to isomorphism (variable renaming)
• denoted mgu(t, s)
31 / 41
Resolution rule
A1 ∨ C1 ¬A2 ∨ C2
C1θ ∨ C2θ
if θ = mgu(A1, A2)
32 / 41
Resolution rule
A1 ∨ C1 ¬A2 ∨ C2
C1θ ∨ C2θ
if θ = mgu(A1, A2)
Examples
• resolvent of P(x) ∨ Q(f(x), y) and ¬P(f(z)) ∨ R(g(z, v))
32 / 41
Resolution rule
A1 ∨ C1 ¬A2 ∨ C2
C1θ ∨ C2θ
if θ = mgu(A1, A2)
Examples
• resolvent of P(x) ∨ Q(f(x), y) and ¬P(f(z)) ∨ R(g(z, v))
Theorem
The resolution inference rule is sound.
32 / 41
Resolution rule
Note
• need to allow renaming of bound variables of a clause before resolution
• example: P(x) P(y) (sound because ∀x P(x) ≡ ∀y P(y))
• why is this needed?
33 / 41
Resolution proof system
Resolution proof system
• sound for all first-order formulas
• refutation complete for first-order formulas without equality
34 / 41
Superposition calculus
Motivation
Goal
• extend the resolution proof system with rules that reason with equality
• make it refutation complete for all first-order formulas (with or without
equality)
35 / 41
Superposition rule
Naive version (not used)
(l = r) ∨ C1 L[l] ∨ C2
L[r] ∨ C1 ∨ C2
36 / 41
Superposition rule
Naive version (not used)
(l = r) ∨ C1 L[l] ∨ C2
L[r] ∨ C1 ∨ C2
General version
(l = r) ∨ C1 L[s] ∨ C2
(L[r] ∨ C1 ∨ C2)θ
if θ = mgu(l, s)
Purpose
• use the equality for substitution
36 / 41
Factoring rule
Naive version (not used)
A ∨ A ∨ C
A ∨ C
37 / 41
Factoring rule
Naive version (not used)
A ∨ A ∨ C
A ∨ C
General version
A1 ∨ A2 ∨ C
(A ∨ C)θ
if θ = mgu(A1, A2)
Purpose
• remove duplicate literals
37 / 41
Equality resolution rule
Naive version (not used)
(x = x) ∨ C
C
General version
(s = t) ∨ C
Cθ
if θ = mgu(s, t)
Purpose
• remove disequalities
38 / 41
Equality factoring rule
Naive version (not used)
(s = t) ∨ (s = t′) ∨ C
(s = t) ∨ (t = t′) ∨ C
General version
(s = t) ∨ (s′ = t′) ∨ C
((s = t) ∨ (t = t′) ∨ C)θ
if θ = mgu(s, s′)
Purpose
• case split on s = t and s = t
39 / 41
Superposition calculus
Superpositon calculus
• inference system with rules: resolution, superposition, factoring, equality
resolution, and equality factoring rules
• sound for arbitrary first-order formulas
• refutationally complete for arbitrary first-order formulas (with or without
equalities)
40 / 41
Next time
• SMT with quantifiers
• quantifier instantiation
41 / 41