Formal Methods 101: Satisfiable Modulo Theories (SMT)

Introduction:

  • The Satisfiability Modulo Theories (SMT) problem is an extension to SAT problems in which the variables take on the values beyond just Boolean-valued variables.
  • SMT solvers can be thought of as finding out is the predicate is satisfiable by utilizing a combination of SAT techniques and domain theory.
  • A Boolean-valued logic formula (predicate) does not just have to have Boolean-valued variables The variables may be of any type (ex. the set all real numbers, ℝ). If a predicate just contains Boolean-valued variables the satisfiability of the predicate is use just referred to as the SAT problem. If a predicate contains variables of different values (such as ℝ), this is usually referred to as an SMT problem.
  • Example 1:
    • Predicate: P = (Z ≥ X + Y) | Z, X, Y ∈ R
    • This predicate is satisfiable (SAT)
    • Proof:
      • Z = 5.14, X = 2.15, Z = 0.54
      • P = (5.14 ≥2.15 + 0.54)
      • P = (5.14 ≥2.69)
      • P = TRUE
  • Example 2:
  • Predicate: P = (Z ≥ X + Y) | Z, X, Y ∈ℝ
  • This predicate is unsatisfiable (not SAT)
  • Proof:
    • Z = 0.00, X = 4.62, Z = -1.51
    • P = (0.00 ≥4.62 + -1.51)
    • P = (0.00 ≥3.11)
    • P = FALSE

Solver Algorithms:

  1. Bit Vectoring
  2. DPLL(T)
  3. Symbolic Execution

Bit Vectoring:

  • Bit Vectoring involves solving an SMT problem by converting the numerical predicate types (int, float, etc..) into a 32 bit (binary) representation and then running it through a SAT solver.
  • Example:
  • Predicate: P = (Z ≥ X + Y) | Z, X, Y ∈ ???
  • Bit vector:
    • Z = Z1Z2Z3Z4Z5Z6Z7Z8Z9Z10… (symbolic binary)
    • X = X1X2X3X4X5X6X7X8X9X10… (symbolic binary)
    • Y = Y1Y2Y3Y4Y5Y6Y7Y8Y9Y10… (symbolic binary)
  • Predicate Conversation:
    • P = (( X1 ∨ Y1 ) ∧ ¬Z1)(( X2 ∨ Y2 ) ∧ ¬Z2)(( X3 ∨ Y3 ) ∧ ¬Z3) ∨ …)
  • Ex. SAT for X1 = 1, Y1 = 1, Z1 = 0
    • Ex. as a SAT Binary Solution
      • Z = 01010101010101… (binary)
      • X = 11010100101010… (binary)
      • Y = 11010101010101… (binary)

DPLL(T):

  • DPLL(T) mixes a combination of a DPLL SAT solver with a domain theory T in order to determine is the problem is satisfiable.
  • DPLL(T) works by first converting the SMT problem to a SAT problem, then replacing the Non-Boolean-valued variables by Boolean-valued variables. It then runs this expression through a classical DPLL SAT solver and checks the consistency of the solution with the domain theory T. That is, after a SAT solution is found, it looks to see if the non-Boolean-valued variable versions can agree that this is satisfiable.
  • Example:
  • DPLL(T), where T is Linear Integer Arithmetic (LIA)
    • Example: P = ((X + 1  ≥ 0) ∧”(X + Y  ≥ 3)”∧¬(“X – Y  ≥ ” 10))
  • DPLL(LIA)
    • Step 1: Convert expressions to Boolean variables
      • A = (X + 1  ≥ 0), B = (“X + Y  ≥ 3), “C = ¬(“X – Y  ≥ ” 10)
      • P =(A ∧”B”∧C)
    • Step 2: Run DPLL on P to find possible SAT.
      • 1. A = TRUE
      • 2. T = P \ A = (B ∧¬C)
      • 3. B = TRUE
      • 4. T \ B = (¬C)
      • 5. ¬C = TRUE
      • 6. SAT @ A = TRUE, B = TRUE, C = FALSE
    • Step 3: Use LIA to verify possible SAT
      • 1. (X + 1  ≥ 0) → (X ≥ -1)
      • 2. (“X + Y  ≥ 3)”→ (X ≥ 3 – Y)
      • 3. ¬(“X – Y  ≥ ” 10) → (“X – Y ” <10) → (“X ” <10+Y)
      • 4. (1.) X ≥ -1 → (assume) X = -3 → (substituted 1. into 2.) (-1 ≥ 3 – Y) → (reduce) (4 ≥ Y) → (assume) (4 = Y) (substitute 3. into 4.) (X < 10 + 4) →(reduce) (X < 14) → assume (X = 15)
      •  5. Solution: X = 15, Y = 4, therefore P his is satisfiable!!
  • Note: If step three did not produce a SAT solution; you would backtrack just as in the traditional DPLL algorithm until either SAT is proven, or UNSAT is proven.
  • Examples of Domain Theories T:
    • Linear Integer Arithmetic (LIA)
    • Ex. Simplex algorithm, Criss-cross algorithm
    • Theory of Equality and Uninterpreted Functions (EUF)
    • Congruence closure algorithm
    • Theory of Linear Integer/Real Arithmetic
    • Simplex algorithm
    • Theory of Arrays
    • Theory of Bit Vectors
    • Theory of Inductive Datatypes (ID)
    • Theory of Non-Linear Integer Arithmetic (incomplete procedure)
    • Theory of Strings + Length constraints (incomplete procedure)
    • Quantified formulas (incomplete procedure)
  • You can also combine domain theories in your solver:
    • Ex. DPLL(LIA + ID)
  • Generalized to N combinations:
    • Ex. DPLL(T1 + T2 + T3….TN)
  • Classifications of Domain Theories T Procedures:
    • Sound
      • An algorithm is sound if, anytime it returns an answer, that answer is true.
      • Note: Soundness is a weak guarantee. It does not promise that A will terminate.
    • Complete
      • An algorithm is complete if it guarantees to return a correct answer for any arbitrary input (or, if no answer exists, it guarantees to return failure).
    • Complete Decision Procedures (for Decidable Problems) OR Incomplete Procedures (For Undecidable Problems)
      • This relates to decidability/undecidability of problems. An undecidable problem is one in which it is provable that no algorithm can be constructed that will guarantee a correct ‘yes’ or ‘no’ answer to the class of problems. (Ex. Halting Problem is undecidable).