Selected Algorithms for Theory Solvers IA085: Satisfiability and Automated Reasoning Martin Jonáš FI MUNI, Spring 2024 Last time CDCL solver T-solver φ SAT/UNSAT assigned variable backtracked variable T-conflict+explanation T-propagation T-lemma 1 / 38 Last time CDCL solver T-solver φ SAT/UNSAT assigned variable backtracked variable T-conflict+explanation T-propagation T-lemma 1 / 38 Last time Possible interface of the T-solver • void notifyAtom(lit) • void assignLiteral(lit) • void push() • void pop() • result checkSat() • option checkSat_approx() • list getConflictReason() • option getPropagatedLiteral() • list getExplanation(lit) 2 / 38 Last time Possible interface of the T-solver • void notifyAtom(lit) • void assignLiteral(lit) • void push() • void pop() • result checkSat() • option checkSat_approx() • list getConflictReason() • option getPropagatedLiteral() • list getExplanation(lit) 2 / 38 Equality and uninterpreted functions Theory of equality and uninterpreted functions Signature • a countable set of function symbols Σf = {f, g, h, . . .} • single predicate symbol Σp = {=} Theory • TUF is a set of all Σ-structures Idea • we only know that = is equivalence and • the symbols in Σf are interpreted as functions → for equal arguments return equal results → equality is congruence 3 / 38 Running example f(x, y) = x ∧ f(f(x, y), y) = z ∧ g(x) ̸= g(z) 4 / 38 Applications of uninterpreted functions • verification of software and hardware (represent unknown external components/functions) • abstraction of complex parts of the system • SMT-solving – overapproximation; cheaper unsatisfiability check • . . . 5 / 38 Flashbacks Flashbacks from Mathematical Foundations of Computer Science • each equivalence ∼ ⊆ X2 partitions X to a set of equivalence classes X/∼ • the equivalence class of x ∈ X is denoted [x]∼ Flashbacks from Algebra • given a set of functions Σf , an equivalence ∼ is congruence if for any f ∈ Σf of arity k and xi, yi ∈ X (x1 ∼ y1) ∧ (x2 ∼ y2) ∧ . . . ∧ (xk ∼ yk) =⇒ f(x1, x2, . . . , xk) ∼ f(y1, y2, . . . , yk) 6 / 38 Herbrand universe Herbrand universe • a set of all terms over the signature Σf • denoted T • can be turned to Σf -structure: all functions work syntactically f applied to t1 and t2 returns f(t1, t2) 7 / 38 Main theorem Theorem A set of equalities E and disequalities D is TUF satisfiable if and only if there exists a congruence ∼ on T such that • tl ∼ tr for all (tl = tr) ∈ E, and • tl ̸∼ tr for all (tl ̸= tr) ∈ D. Proof. • “⇒”: Set t1 ∼ t2 iff t1 = t2 in the model. • “⇐”: Construct model from the equality classes. 8 / 38 Subterm set Idea • terms not occurring in the formula φ are not relevant to its satisfiability • define a set Tφ of all subterms of φ • compute equivalence classes only of Tφ Example Given φ = f(x, y) = x ∧ f(f(x, y), y) = z ∧ g(x) ̸= g(z) Tφ = {x, y, z, f(x, y), g(x), g(z), f(f(x, y), y)} 9 / 38 Congruence closure Congruence closure of R ⊆ X2 • the smallest congruence R∗ that contains R • alternatively: R∗ = ∩ S is congruence ,R⊆S{S} 10 / 38 Congruence closure Congruence closure of R ⊆ X2 • the smallest congruence R∗ that contains R • alternatively: R∗ = ∩ S is congruence ,R⊆S{S} Computing congruence closure Compute the least fixed point of Ri defined by R0 = R Ri+1 = Ri ∪ idX ∪ {(x, y) ∈ X2 | (y, x) ∈ Ri} ∪ {(x, z) ∈ X2 | (x, y) ∈ Ri, (y, z) ∈ Ri} ∪ {(f(x1, . . . , xk), f(y1, . . . , yk)) ∈ X2 | (xj, yj) ∈ Ri for all 1 ≤ j ≤ Ar(f)} 10 / 38 Congruence closure Congruence closure of R ⊆ X2 • the smallest congruence R∗ that contains R • alternatively: R∗ = ∩ S is congruence ,R⊆S{S} Computing congruence closure Compute the least fixed point of Ri defined by R0 = R Ri+1 = Ri ∪ idX ∪ {(x, y) ∈ X2 | (y, x) ∈ Ri} ∪ {(x, z) ∈ X2 | (x, y) ∈ Ri, (y, z) ∈ Ri} ∪ {(f(x1, . . . , xk), f(y1, . . . , yk)) ∈ X2 | (xj, yj) ∈ Ri for all 1 ≤ j ≤ Ar(f)} Does it have to terminate? 10 / 38 More practical main theorem Theorem A formula φ = ∧ E ∧ ∧ D, where E is a set of equalities and D is a set of disequalities, is TUF satisfiable if and only if the congruence closure of {(tl, tr) | (tl = tr) ∈ E} over Tφ does not contain any (tl, tr) such that (tl ̸= tr) ∈ D. 11 / 38 Congruence closure algorithm: theoretical description Algorithm 1. start with singleton equivalence classes {t} for each t ∈ Tφ 2. for each (tl = tr) ∈ E, merge the equivalence classes [tl] and [tr] and all classes that need to be merged due to congruence 3. if at any point [tl] = [tr] for some (tl ̸= tr) ∈ D, return unsat 4. otherwise return sat 12 / 38 Congruence closure: example f(x, y) = x ∧ f(f(x, y), y) = z ∧ g(x) ̸= g(z) 13 / 38 Congruence closure algorithm: reality Questions 1. how to represent the equivalence classes? 2. how to merge the equivalence classes? 3. how to decide if two terms are in the same equivalence class? 4. we have O(|φ|) subterms, each of size O(|φ|); do we really need O(|φ|2) memory to store the set Tφ? 14 / 38 Reminder: Union-Find Union-Find • a data structure to store disjoint sets • allows creating a singleton sets, merging two sets into one, and computing a representative of a given set • internally represented by a forest: – set = tree in the forest – representative = root of a tree • each element stores its parent and rank 15 / 38 Reminder: Union-Find 1 make_singleton_set(value) { 2 return { value: value; parent = value; rank: 1} 3 } 1 find(item) { 2 repr ← item 3 while (repr ̸= repr.parent) { 4 repr = repr.parent 5 } 6 return repr 7 } 16 / 38 Reminder: Union-Find 1 union(item1, item2) { 2 repr1 ← find(item1) 3 repr2 ← find(item2) 4 if (repr1 = repr2) return 5 6 if (repr1.rank > repr2.rank) { 7 repr2.parent = repr1 8 } else if (repr1.rank < repr2.rank) { 9 repr1.parent = repr2 10 } else { 11 repr1.parent = repr2 12 repr2.rank++ 13 } 14 } 17 / 38 E-graph Efficient storage of terms with shared subterms. Nodes • constant/variable with 0 children • function symbol f of arity k with k children Example Consider f(x, y) = x ∧ f(f(x, y), y) = z ∧ g(x) ̸= g(z). 18 / 38 E-graph Efficient storage of terms with shared subterms. Nodes • constant/variable with 0 children • function symbol f of arity k with k children Example Consider f(x, y) = x ∧ f(f(x, y), y) = z ∧ g(x) ̸= g(z). We extend each node with parent pointer and rank to store equivalence classes of terms à la union-find. 18 / 38 Asserting new literals 1 def assert(t = s): 2 todo ← [(t, s)] 3 while todo not empty: 4 (u, v) ← todo.pop() 5 if find(u) = find(v): continue 6 union(u,v) 7 foreach f(u1, . . . , uk) and f(v1, . . . , vk) such that 8 ui = u, vi = v for some i and 9 find(uj ) = find(vj ) for all j 10 find(f(u1, . . . , uk)) ̸= find(f(v1, . . . , vk)): 11 todo.append((f(u1, . . . , uk), f(v1, . . . , vk))) 12 foreach (v ̸= w) ∈ inequalities: 13 if find(v) = find(w): return false 14 return true 15 16 def assert(t ̸= s): 17 if find(t) = find(s): return false 18 inequalities.append(t ̸= s) 19 return true 19 / 38 Computing explanations Idea • add explanations to each parent pointer (= edge of the E-graph) • if find(u) = find(v), the explanation is union of – sequence of explanations between u and the root find(u) and – sequence of explanations between v and the root find(v). Each union • of t and s due to assert(t = s) – explanation = the equality • of f(u1, . . . , un) and f(v1, . . . , vn) due to find(ui) = find(vi) for all 1 ≤ i ≤ n – explanation = the union of explanations of all find(ui) = find(vi) 20 / 38 Theory propagation Notation • t ∈ [s] = t is in the same subtree as s After each union merge(t, s) • propagate t′ = s′, where t′ ∈ [t] and s′ ∈ [s] • propagate t′ ̸= r, where t′ ∈ [t], r′ ∈ [r] and there exists s′ ∈ [s] with s′ ̸= r′ ∈ inequalities After assertion of t ̸= s • propagate t′ ̸= s′, where t′ ∈ [t] and s′ ∈ [s] 21 / 38 Efficient implementation Implemented in most of the existing SMT solvers. For efficient implementation and description of backtracking, see • R. Nieuwenhuis, A. Oliveras: Fast congruence closure and extensions, 2007 22 / 38 Difference logic Difference logic Difference logic • all atoms of form (x − y) ▷◁ k for ▷◁ ∈ {≤, <, ≥, >, =, ̸=} and a number k • can be over any numeric theory: – DL(Q) – DL(Z) 23 / 38 Applications of difference logic • planning • scheduling • verification of timed automata • . . . aend − astart ≥ 10 ∧ bstart − aend ≥ 0 ∧ bend − bstart ≥ 5 ∧ bend − astart ≤ 13 24 / 38 Normalization Atoms can be normalized to x − y ≤ k • x − y ≥ k y − x ≤ −k • x − y < k x − y ≤ k′ with k′ a smaller number than k (theory-dependent) • x − y > k x − y ≥ k′ with k′ a bigger number than k (theory-dependent) • x − y = k (x − y ≤ k) ∧ (x − y ≥ k) • x − y ̸= k (x − y < k) ∨ (x − y > k) (needs to be done in the original formula) Need theory solver only for φ = ∧ (xi − yj ≤ kj) 25 / 38 Running examples Example (x − y ≤ 3) ∧ (y − z ≤ −11) ∧ (x − z ≤ −1) ∧ (v − y ≤ 15) ∧ (z − v ≤ 5) ∧ (v − x ≤ 2) 26 / 38 Running examples Example (x − y ≤ 3) ∧ (y − z ≤ −11) ∧ (x − z ≤ −1) ∧ (v − y ≤ 15) ∧ (z − v ≤ 5) ∧ (v − x ≤ 2) Example (x − y ≤ 3) ∧ (y − z ≤ −7) ∧ (x − z ≤ −1) ∧ (v − y ≤ 15) ∧ (z − v ≤ 5) ∧ (v − x ≤ 2) 26 / 38 Constraint graph Given a formula φ = ∧ (xi − yj ≤ kj), we can construct a constraint graph Gφ. Nodes • variables of φ Edges • edge between x and y of weight k for each conjunct (x − y ≤ k) of φ 27 / 38 Main theorem Theorem The formula φ = ∧ (xi − yj ≤ kj) is DL-satisfiable if and only if Gφ does not contain negative cycle. 28 / 38 Main theorem Theorem The formula φ = ∧ (xi − yj ≤ kj) is DL-satisfiable if and only if Gφ does not contain negative cycle. Proof. • “⇒”: Show by induction that if there is a path between x and y in Gφ of weight k, then φ |=DL (x − y) ≤ k • “⇐”: Construct a model from shortest paths. 28 / 38 Algorithm for difference logic theory solver Algorithm 1. Construct the graph Gφ. 2. Add a new node s with edges of weight 0 to all nodes of Gφ 3. Run Bellman-Ford algorithm from s. 4. If the algorithm finds negative cycle, return unsat; otherwise return sat. 29 / 38 Computing explanations Idea • edges of Gφ = conjuncts φ • unsatisfiability reason = cycle of negative weight • unsatisfiability explanation = conjuncts on the cycle 30 / 38 Theory propagation Idea 1. Compute (or maintain) shortest paths between all pairs of vertices x and y 2. If dist(x, y) = d, propagate all x − y ≤ k with k ≥ d 31 / 38 Efficient implementation Implemented in most of the existing SMT solvers that deal with arithmetic. For efficient implementation and description of backtracking and theory propagation, see • A. Armando, C. Castellini, E. Giunchiglia, M. Maratea: A SAT-Based Decision Procedure for the Boolean Combination of Difference Constraints, SAT 2004 • S. Cotton, O. Maler: Fast and Flexible Difference Constraint Propagation for DPLL(T), SAT 2006 32 / 38 Other theories (quick overview) Linear Real Arithmetic Normalization • atoms of form a1x1 + a2x2 + . . . + akxk ≤ b Theory solver • decide satisfiability conjunctions of atoms of form a1x1 + a2x2 + . . . + akxk ≤ b and their negations • simplex algorithm • needs changes to be incremental and backtrackable, see – B. Dutetre, L. de Moura: A Fast Linear-Arithmetic Solver for DPLL(T), CAV 2006 33 / 38 Linear Integer Arithmetic Much more complicated. Combination of: • simplex on the LRA relaxation of the formula • branch and bound • cutting planes • diophantine equation solving • . . . 34 / 38 Linear Integer Arithmetic [from A.Griggio: A Practical Approach to Satisfiability Modulo Linear Integer Arithmetic] 35 / 38 Bit-vectors Combinations of 1. heavy preprocessing 2. converting all operations to Boolean circuits that compute them (usually and-inverter graphs) 3. more preprocessing 4. computing propositional formula that encodes the circuit 5. often done eagerly The conversion of bit-vector formula to the equisatisfiable propositional formula is called bit-blasting. 36 / 38 Arrays Lazy approach 1. treat read and write as uninterpreted functions 2. check UF-satisfiability 3. if unsat, return unsat 4. if sat, check whether the model satisfies array axioms – if it does, return sat – if not, add the violated axioms and check UF-satisfiability again 37 / 38 Next time • combination of theories • Nelson-Oppen algorithm 38 / 38