Formal Methods 101: Satisfiable Modulo Theories (SMT)

Introduction:

The Satisfiability Modulo Theories (SMT) problem is an extension to SAT problems in which the variables take on the values beyond just Boolean-valued variables.

SMT solvers can be thought of as finding out is the predicate is satisfiable by utilizing a combination of SAT techniques and domain theory.

A Boolean-valued logic formula (predicate) does not just have to have Boolean-valued variables The variables may be of any type (ex. the set all real numbers, ℝ). If a predicate just contains Boolean-valued variables the satisfiability of the predicate is use just referred to as the SAT problem. If a predicate contains variables of different values (such as ℝ), this is usually referred to as an SMT problem.

Example 1:
- Predicate: P = (Z ≥ X + Y) | Z, X, Y ∈ R
- This predicate is satisfiable (SAT)
- Proof:
  - Z = 5.14, X = 2.15, Z = 0.54
  - P = (5.14 ≥2.15 + 0.54)
  - P = (5.14 ≥2.69)
  - P = TRUE

Example 2:
Predicate: P = (Z ≥ X + Y) | Z, X, Y ∈ℝ
This predicate is unsatisfiable (not SAT)
Proof:
- Z = 0.00, X = 4.62, Z = -1.51
- P = (0.00 ≥4.62 + -1.51)
- P = (0.00 ≥3.11)
- P = FALSE

Solver Algorithms:

Bit Vectoring:

Bit Vectoring involves solving an SMT problem by converting the numerical predicate types (int, float, etc..) into a 32 bit (binary) representation and then running it through a SAT solver.

Example:
Predicate: P = (Z ≥ X + Y) | Z, X, Y ∈ ???
Bit vector:
- Z = Z1Z2Z3Z4Z5Z6Z7Z8Z9Z10… (symbolic binary)
- X = X1X2X3X4X5X6X7X8X9X10… (symbolic binary)
- Y = Y1Y2Y3Y4Y5Y6Y7Y8Y9Y10… (symbolic binary)
Predicate Conversation:
- P = (( X1 ∨ Y1 ) ∧ ¬Z1) ∨ (( X2 ∨ Y2 ) ∧ ¬Z2) ∨ (( X3 ∨ Y3 ) ∧ ¬Z3) ∨ …)
Ex. SAT for X1 = 1, Y1 = 1, Z1 = 0
- Ex. as a SAT Binary Solution
  - Z = 01010101010101… (binary)
  - X = 11010100101010… (binary)
  - Y = 11010101010101… (binary)

DPLL(T):

DPLL(T) mixes a combination of a DPLL SAT solver with a domain theory T in order to determine is the problem is satisfiable.
DPLL(T) works by first converting the SMT problem to a SAT problem, then replacing the Non-Boolean-valued variables by Boolean-valued variables. It then runs this expression through a classical DPLL SAT solver and checks the consistency of the solution with the domain theory T. That is, after a SAT solution is found, it looks to see if the non-Boolean-valued variable versions can agree that this is satisfiable.

Example:
DPLL(T), where T is Linear Integer Arithmetic (LIA)
- Example: P = ((X + 1 ≥ 0) ∧”(X + Y ≥ 3)”∧¬(“X – Y ≥ ” 10))
DPLL(LIA)
- Step 1: Convert expressions to Boolean variables
  - A = (X + 1 ≥ 0), B = (“X + Y ≥ 3), “C = ¬(“X – Y ≥ ” 10)
  - P =(A ∧”B”∧C)
- Step 2: Run DPLL on P to find possible SAT.
  - 1. A = TRUE
  - 2. T = P \ A = (B ∧¬C)
  - 3. B = TRUE
  - 4. T \ B = (¬C)
  - 5. ¬C = TRUE
  - 6. SAT @ A = TRUE, B = TRUE, C = FALSE
- Step 3: Use LIA to verify possible SAT
  - 1. (X + 1 ≥ 0) → (X ≥ -1)
  - 2. (“X + Y ≥ 3)”→ (X ≥ 3 – Y)
  - 3. ¬(“X – Y ≥ ” 10) → (“X – Y ” <10) → (“X ” <10+Y)
  - 4. (1.) X ≥ -1 → (assume) X = -3 → (substituted 1. into 2.) (-1 ≥ 3 – Y) → (reduce) (4 ≥ Y) → (assume) (4 = Y) (substitute 3. into 4.) (X < 10 + 4) →(reduce) (X < 14) → assume (X = 15)
  - 5. Solution: X = 15, Y = 4, therefore P his is satisfiable!!
Note: If step three did not produce a SAT solution; you would backtrack just as in the traditional DPLL algorithm until either SAT is proven, or UNSAT is proven.

Examples of Domain Theories T:
- Linear Integer Arithmetic (LIA)
- Ex. Simplex algorithm, Criss-cross algorithm
- Theory of Equality and Uninterpreted Functions (EUF)
- Congruence closure algorithm
- Theory of Linear Integer/Real Arithmetic
- Simplex algorithm
- Theory of Arrays
- Theory of Bit Vectors
- Theory of Inductive Datatypes (ID)
- Theory of Non-Linear Integer Arithmetic (incomplete procedure)
- Theory of Strings + Length constraints (incomplete procedure)
- Quantified formulas (incomplete procedure)
You can also combine domain theories in your solver:
- Ex. DPLL(LIA + ID)
Generalized to N combinations:
- Ex. DPLL(T1 + T2 + T3….TN)

Classifications of Domain Theories T Procedures:
- Sound
  - An algorithm is sound if, anytime it returns an answer, that answer is true.
  - Note: Soundness is a weak guarantee. It does not promise that A will terminate.
- Complete
  - An algorithm is complete if it guarantees to return a correct answer for any arbitrary input (or, if no answer exists, it guarantees to return failure).
- Complete Decision Procedures (for Decidable Problems) OR Incomplete Procedures (For Undecidable Problems)
  - This relates to decidability/undecidability of problems. An undecidable problem is one in which it is provable that no algorithm can be constructed that will guarantee a correct ‘yes’ or ‘no’ answer to the class of problems. (Ex. Halting Problem is undecidable).