CERTORAMove fast and break nothing 2 0 2 2 Formal Verification of Smart Contracts with The Certora Prover Jaroslav Bendík April 2022 Blockchain and Smart Contracts Blockchain • A distributed database • Chronologically ordered data • Decentralized • Cryptographic security measures • Immutable • Usual use: digital ledger Blockchain and Smart Contracts Smart Contract • A set of functions running on Ethereum blockchain • A user can invoke some of the functions • Successful function invocations are irreversible • Unsuccessful function invocations revert • A maximum size of 24KB • Cannot be deleted/changed once deployed Blockchain • A distributed database • Chronologically ordered data • Decentralized • Cryptographic security measures • Immutable • Usual use: digital ledger CODE CERTORA PROVER SPEC Proof of all behaviors meeting the spec A rare behavior which violates the spec Formal Verification with Certora Prover Example Smart Contract contract Bank { mapping (address => uint256) public funds; function deposit (uint256 amount) public payable { funds[msg.sender] += amount; } function getFunds (address account) public view returns (uint256) { return funds[account]; } } Example Smart Contract contract Bank { mapping (address => uint256) public funds; function deposit (uint256 amount) public payable { funds[msg.sender] += amount; } function getFunds (address account) public view returns (uint256) { return funds[account]; } } Example Smart Contract contract Bank { mapping (address => uint256) public funds; function deposit (uint256 amount) public payable { funds[msg.sender] += amount; } function getFunds (address account) public view returns (uint256) { return funds[account]; } } Example Smart Contract contract Bank { mapping (address => uint256) public funds; function deposit (uint256 amount) public payable { funds[msg.sender] += amount; } function getFunds (address account) public view returns (uint256) { return funds[account]; } } Example Smart Contract contract Bank { mapping (address => uint256) public funds; function deposit (uint256 amount) public payable { funds[msg.sender] += amount; } function getFunds (address account) public view returns (uint256) { return funds[account]; } } Example Smart Contract contract Bank { mapping (address => uint256) public funds; function deposit (uint256 amount) public payable { funds[msg.sender] += amount; } function getFunds (address account) public view returns (uint256) { return funds[account]; } } How do we know that deposit increases funds by amount? Writing the Specification contract Bank { mapping (address => uint256) public funds; function deposit (uint256 amount) public payable { funds[msg.sender] += amount; } function getFunds (address account) public view returns (uint256) { return funds[account]; } } How do we know that deposit increases funds by amount? Need to first write “deposit increases funds by amount” more formally so that we can automatically check it! Specification in CVL rule deposit_ok (uint256 amount) { env e; uint256 before_deposit = getFunds (e, e.msg.sender); deposit (e, amount); uint256 after_deposit = getFunds (e, e.msg.sender); assert (after_deposit == before_deposit + amount); } Specification in CVL rule deposit_ok (uint256 amount) { env e; uint256 before_deposit = getFunds (e, e.msg.sender); deposit (e, amount); uint256 after_deposit = getFunds (e, e.msg.sender); assert (after_deposit == before_deposit + amount); } Not executable but looks like Solidity! Specification in CVL rule deposit_ok (uint256 amount) { env e; uint256 before_deposit = getFunds (e, e.msg.sender); deposit (e, amount); uint256 after_deposit = getFunds (e, e.msg.sender); assert (after_deposit == before_deposit + amount); } Specification in CVL rule deposit_ok (uint256 amount) { env e; uint256 before_deposit = getFunds (e, e.msg.sender); deposit (e, amount); uint256 after_deposit = getFunds (e, e.msg.sender); assert (after_deposit == before_deposit + amount); } Specification in CVL rule deposit_ok (uint256 amount) { env e; uint256 before_deposit = getFunds (e, e.msg.sender); deposit (e, amount); uint256 after_deposit = getFunds (e, e.msg.sender); assert (after_deposit == before_deposit + amount); } Specification in CVL rule deposit_ok (uint256 amount) { env e; uint256 before_deposit = getFunds (e, e.msg.sender); deposit (e, amount); uint256 after_deposit = getFunds (e, e.msg.sender); assert (after_deposit == before_deposit + amount); } Specification in CVL rule deposit_ok (uint256 amount) { env e; uint256 before_deposit = getFunds (e, e.msg.sender); deposit (e, amount); uint256 after_deposit = getFunds (e, e.msg.sender); assert (after_deposit == before_deposit + amount); } Inline from contract Specification in CVL rule deposit_ok (uint256 amount) { env e; uint256 before_deposit = getFunds (e, e.msg.sender); deposit (e, amount); uint256 after_deposit = getFunds (e, e.msg.sender); assert (after_deposit == before_deposit + amount); } Specification in CVL rule deposit_ok (uint256 amount) { env e; uint256 before_deposit = getFunds (e, e.msg.sender); deposit (e, amount); uint256 after_deposit = getFunds (e, e.msg.sender); assert (after_deposit == before_deposit + amount); } Must hold for ALL values of amount! Certora Verification Language Overview • Assumptions + assertions • Invariants • Ghost functions + Hooks (ghost solidity functions) • Summary functions (replace solidity functions) • CVL functions (to avoid repeated code in .spec files) • Quantifiers Certora Verification Language Overview • Assumptions + assertions • Invariants • Ghost functions + Hooks (ghost solidity functions) • Summary functions (replace solidity functions) • CVL functions (to avoid repeated code in .spec files) • Quantifiers require forall address i. forall address j. funds(i) + funds(j) <= totalFunds(); Certora Verification Language Overview • Assumptions + assertions • Invariants • Ghost functions + Hooks (ghost solidity functions) • Summary functions (replace solidity functions) • CVL functions (to avoid repeated code in .spec files) • Quantifiers require forall address i. forall address j. funds(i) + funds(j) <= totalFunds(); Certora Verification Language Overview • Assumptions + assertions • Invariants • Ghost functions + Hooks (ghost solidity functions) • Summary functions (replace solidity functions) • CVL functions (to avoid repeated code in .spec files) • Quantifiers require forall address i. forall address j. funds(i) + funds(j) <= totalFunds(); Formal Verification with Certora Prover CODE CERTORA PROVER SPEC Proof of all behaviors meeting the spec A rare behavior which violates the spec ??? Formal Verification with Certora Prover CODE SPEC Proof of all behaviors meeting the spec A rare behavior which violates the spec Code + Spec Logic SOLVERS Decompiler EVM Bytecode Compiler Logical formula Constraint Solver Z3, CVC5, Yices, Vampire Counterexamples Static analyzer TAC VC Generator TAC Certora Prover Architecture Decompiler EVM Bytecode Compiler Logical formula Constraint Solver Z3, CVC5, Yices, Vampire Counterexamples Static analyzer TAC VC Generator TAC Certora Prover Works on Bytecode Compile Solidity to get EVM Bytecode Can support other EVM languages (Vyper) Helps find compiler bugs! Compiler Bugs Found by Certora Prover Compiler Logical formula Constraint Solver Z3, CVC5, Yices, Vampire Counterexamples Static analyzer VC Generator TAC Bytecode to Three-Address Code Decompiler EVM Bytecode TAC Break down code into small simple steps One operation per TAC instruction Only a small number of instructions in TAC Easier to analyze Compiler Logical formula Constraint Solver Z3, CVC5, Yices, Vampire Counterexamples Static analyzer VC Generator TAC Bytecode to Three-Address Code Decompiler EVM Bytecode TAC Decompiler EVM Bytecode Compiler Logical formula Constraint Solver Z3, CVC5, Yices, Vampire Counterexamples VC Generator TAC Static Analysis on TAC Static analyzer TAC Even in TAC, instructions can have subtle dependencies Gather facts at various program points (e.g., points-to relation) Lower burden on subsequent steps in the pipeline Decompiler EVM Bytecode Compiler Logical formula Constraint Solver Z3, CVC5, Yices, Vampire Counterexamples VC Generator TAC Static Analysis on TAC Static analyzer TAC MyStruct memory x = { f: 1 }; MyStruct memory y = { f: 2 }; y.f = 3; assert(x.f == 1); Even in TAC, instructions can have subtle dependencies Gather facts at various program points (e.g., points-to relation) Lower burden on subsequent steps in the pipeline Decompiler EVM Bytecode Compiler Logical formula Constraint Solver Z3, CVC5, Yices, Vampire Counterexamples VC Generator TAC Static Analysis on TAC Static analyzer TAC Even in TAC, instructions can have subtle dependencies Gather facts at various program points (e.g., points-to relation) Lower burden on subsequent steps in the pipeline MyStruct memory x = { f: 1 }; MyStruct memory y = { f: 2 }; y.f = 3; assert(x.f == 1); does not affect x Decompiler EVM Bytecode Compiler Logical formula Constraint Solver Z3, CVC5, Yices, Vampire Counterexamples VC Generator TAC Static Analysis on TAC Static analyzer TAC Even in TAC, instructions can have subtle dependencies Gather facts at various program points (e.g., points-to relation) Lower burden on subsequent steps in the pipeline MyStruct memory x = { f: 1 }; MyStruct memory y = { f: 2 }; y.f = 3; assert(x.f == 1); Decompiler EVM Bytecode Compiler Logical formula Constraint Solver Z3, CVC5, Yices, Vampire Counterexamples VC Generator TAC Static Analysis on TAC Static analyzer TAC Even in TAC, instructions can have subtle dependencies Gather facts at various program points (e.g., points-to relation) Lower burden on subsequent steps in the pipeline MyStruct memory x = { f: 1 }; assert(x.f == 1); Static Analysis on TAC Cont. Analysis Type Points-to analysis --> …. Value range analysis --> .… Control-flow analysis --> Example Application reveals connections between TAC variables allows us to simplify SMT axioms ………. split the original program into several smaller programs Static analyzer TAC Decompiler EVM Bytecode Compiler Constraint Solver Z3, CVC5, Yices, Vampire Counterexamples TAC Logical formula VC Generator Generate Verification Conditions Hoare Triples Hoare Triple: {P} S {Q} If P holds before executing S, then Q holds after executing S WP (S, Q): weakest predicate such that after executing S Q holds {WP (S, Q)} S {Q} Then to prove the triple, “just” show that P ⇒ WP(S, Q) Thus, if P ⇒ WP (S, Q) then {P} S {Q} Hoare Triples Hoare Triple: {P} S {Q} If P holds before executing S, then Q holds after executing S WP (S, Q): weakest predicate such that after executing S Q holds {WP (S, Q)} S {Q} Then to prove the triple, “just” show that P ⇒ WP(S, Q) Thus, if P ⇒ WP (S, Q) then {P} S {Q} Weakest Precondition Hoare Triple: {P} S {Q} If P holds before executing S, then Q holds after executing S WP(S, Q): weakest predicate such that Q holds after executing S {WP(S, Q)} S {Q} Then to prove the triple, “just” show that P ⇒ WP(S, Q) Thus, if P ⇒ WP (S, Q) then {P} S {Q} Weakest Precondition Hoare Triple: {P} S {Q} If P holds before executing S, then Q holds after executing S WP(S, Q): weakest predicate such that Q holds after executing S {WP(S, Q)} S {Q} Then to prove the triple, just show that P ⇒ WP(S, Q) is valid Thus, if P ⇒ WP(S, Q) is valid then {P} S {Q} Weakest Precondition Computation Basic instructions: • Assertions: WP(assert A, B) = A ∧ B • Assumptions: WP(assume A, B) = A ⟹ B • Assignments = assumptions! • Sequential composition: WP(S;T, B) = WP(S, WP(T, B)) • Choice statements: WP(S[]T, B) = WP(S, B) ∧ WP(T, B) Weakest Precondition Computation Basic instructions: • Assertions: WP(assert A, B) = A ∧ B • Assumptions: WP(assume A, B) = A ⟹ B • Assignments = assumptions! • Sequential composition: WP(S;T, B) = WP(S, WP(T, B)) • Choice statements: WP(S[]T, B) = WP(S, B) ∧ WP(T, B) Weakest Precondition Computation Basic instructions: • Assertions: WP(assert A, B) = A ∧ B • Assumptions: WP(assume A, B) = A ⟹ B • Assignments = assumptions! • Sequential composition: WP(S;T, B) = WP(S, WP(T, B)) • Choice statements: WP(S[]T, B) = WP(S, B) ∧ WP(T, B) Weakest Precondition Computation Basic instructions: • Assertions: WP(assert A, B) = A ∧ B • Assumptions: WP(assume A, B) = A ⟹ B • Assignments = assumptions! • Sequential composition: WP(S;T, B) = WP(S, WP(T, B)) • Choice statements: WP(S[]T, B) = WP(S, B) ∧ WP(T, B) Weakest Precondition Computation Basic instructions: • Assertions: WP(assert A, B) = A ∧ B • Assumptions: WP(assume A, B) = A ⟹ B • Assignments = assumptions! • Sequential composition: WP(S;T, B) = WP(S, WP(T, B)) • Choice statements: WP(S[]T, B) = WP(S, B) ∧ WP(T, B) Weakest Precondition Computation Basic instructions: • Assertions: WP(assert A, B) = A ∧ B • Assumptions: WP(assume A, B) = A ⟹ B • Assignments = assumptions! • Sequential composition: WP(S;T, B) = WP(S, WP(T, B)) • Choice statements: WP(S[]T, B) = WP(S, B) ∧ WP(T, B) Weakest Precondition Computation Basic instructions: • Assertions: WP(assert A, B) = A ∧ B • Assumptions: WP(assume A, B) = A ⟹ B • Assignments = assumptions! • Sequential composition: WP(S;T, B) = WP(S, WP(T, B)) • Choice statements: WP(S[]T, B) = WP(S, B) ∧ WP(T, B) Loops Unroll specific number of iterations + 1. Either assume loop termination condition, or 2. Assert loop termination condition Verification Condition If P ⇒ WP(S, Q) is valid formula then the program satisfies the specification Verification Condition If P ⇒ WP(S, Q) is valid formula then the program satisfies the specification We check P ∧ ¬ WP(S, Q) for satisfiability (not validity!). • If P ∧ ¬ WP(S, Q) is unsatisfiable then the program satisfies the spec. • Else, if P ∧ ¬ WP(S, Q) is satisfiable, then the program might violate the spec. Static analyzer TAC Decompiler EVM Bytecode Compiler Constraint Solver Z3, CVC5, Yices, Vampire Counterexamples Generate Verification Conditions TAC Logical formula VC Generator P ∧ ¬ WP(S, Q) Turning the program + spec to logic is done! VC Generator TAC Static analyzer TAC Decompiler EVM Bytecode Compiler Using Constraint Solvers Logical formula Constraint Solver Z3, CVC5, Yices, Vampire Counterexamples Contract violated spec! Contract meets spec! P ∧ ¬ WP(S, Q) SMT Machinery Encodings of P ∧ ¬ WP(S, Q) SMT Machinery Encodings of P ∧ ¬ WP(S, Q) - Precise NIA - LIA Overraproximation SMT Machinery Encodings of P ∧ ¬ WP(S, Q) - Precise NIA - LIA Overraproximation SMT solvers - z3, cvc4, cvc5, vampire, yices - 1-4 configs per solver - Choosen configurations run in parallel SMT Machinery Encodings of P ∧ ¬ WP(S, Q) - Precise NIA - LIA Overraproximation SMT solvers - z3, cvc4, cvc5, vampire, yices - 1-4 configs per solver - Choosen configurations run in parallel SMT Machinery Encodings of P ∧ ¬ WP(S, Q) - Precise NIA - LIA Overraproximation SMT solvers - z3, cvc4, cvc5, vampire, yices - 1-4 configs per solver - Choosen configurations run in parallel NIA SMT Machinery Encodings of P ∧ ¬ WP(S, Q) - Precise NIA - LIA Overraproximation SMT solvers - z3, cvc4, cvc5, vampire, yices - 1-4 configs per solver - Choosen configurations run in parallel LIA NIA SMT Machinery Encodings of P ∧ ¬ WP(S, Q) - Precise NIA - LIA Overraproximation SMT solvers - z3, cvc4, cvc5, vampire, yices - 1-4 configs per solver - Choosen configurations run in parallel LIA NIA LIA model SMT Machinery Encodings of P ∧ ¬ WP(S, Q) - Precise NIA - LIA Overraproximation SMT solvers - z3, cvc4, cvc5, vampire, yices - 1-4 configs per solver - Choosen configurations run in parallel LIA NIA LIA model LIA Multiplication Axiomatization a*b modelled with an unintepreted function a$b LIA Multiplication Axiomatization • a$0 = 0 • a$b = b$a • a > 0, b > 0 à a$b > 0 • a > 0, b < 0 à a$b < 0 • a < 0, b > 0 à a$b < 0 • a < 0, b < 0 à a$b > 0 • a > 0, b > 0 à a$b >= a, a$b >= b • 0 <= a1 <= a2, 0 <= b1 <= b2 à a1$b1 <= a2$b2 a*b modelled with an unintepreted function a$b LIA Multiplication Axiomatization • a$0 = 0 • a$b = b$a • a > 0, b > 0 à a$b > 0 • a > 0, b < 0 à a$b < 0 • a < 0, b > 0 à a$b < 0 • a < 0, b < 0 à a$b > 0 • a > 0, b > 0 à a$b >= a, a$b >= b • 0 <= a1 <= a2, 0 <= b1 <= b2 à a1$b1 <= a2$b2 a*b modelled with an unintepreted function a$b LIA Multiplication Axiomatization • a$0 = 0 • a$b = b$a • a > 0, b > 0 à a$b > 0 • a > 0, b < 0 à a$b < 0 • a < 0, b > 0 à a$b < 0 • a < 0, b < 0 à a$b > 0 • a > 0, b > 0 à a$b >= a, a$b >= b • 0 <= a1 <= a2, 0 <= b1 <= b2 à a1$b1 <= a2$b2 a*b modelled with an unintepreted function a$b LIA Multiplication Axiomatization • a$0 = 0 • a$b = b$a • a > 0, b > 0 à a$b > 0 • a > 0, b < 0 à a$b < 0 • a < 0, b > 0 à a$b < 0 • a < 0, b < 0 à a$b > 0 • a > 0, b > 0 à a$b >= a, a$b >= b • 0 <= a1 <= a2, 0 <= b1 <= b2 à a1$b1 <= a2$b2 a*b modelled with an unintepreted function a$b LIA Multiplication Axiomatization • a$0 = 0 • a$b = b$a • a > 0, b > 0 à a$b > 0 • a > 0, b < 0 à a$b < 0 • a < 0, b > 0 à a$b < 0 • a < 0, b < 0 à a$b > 0 • a > 0, b > 0 à a$b >= a, a$b >= b • 0 <= a1 <= a2, 0 <= b1 <= b2 à a1$b1 <= a2$b2 a*b modelled with an unintepreted function a$b LIA Multiplication Axiomatization • a$0 = 0 • a$b = b$a • a > 0, b > 0 à a$b > 0 • a > 0, b < 0 à a$b < 0 • a < 0, b > 0 à a$b < 0 • a < 0, b < 0 à a$b > 0 • a > 0, b > 0 à a$b >= a, a$b >= b • 0 <= a1 <= a2, 0 <= b1 <= b2 à a1$b1 <= a2$b2 a*b modelled with an unintepreted function a$b Learned Literals Given a formula F, an SMT solver says: • F is SAT, or • F is UNSAT, or • timeout Learned Literals Given a formula F, an SMT solver says: • F is SAT, or • F is UNSAT, or • timeout, but learned F => L for some L Learned Literals Given a formula F, an SMT solver says: • F is SAT, or • F is UNSAT, or • timeout, but learned F => L for some L Example L ≡ (x = 5) ∧ (y <= 10 ∨ y > 20 )∧ (y < 100) ∧ (z = x ∨ z > 10) Learned Literals Given a formula F, an SMT solver says: • F is SAT, or • F is UNSAT, or • timeout, but learned F => L for some L Example L ≡ (x = 5) ∧ (y <= 10 ∨ y > 20 )∧ (y < 100) ∧ (z = x ∨ z > 10) Z3 Yices CVC5 CVC5 learning Learned Literals Given a formula F, an SMT solver says: • F is SAT, or • F is UNSAT, or • timeout, but learned F => L for some L Example L ≡ (x = 5) ∧ (y <= 10 ∨ y > 20 )∧ (y < 100) ∧ (z = x ∨ z > 10) Z3 Yices CVC5 CVC5 learning Adjust F Z3 Decompiler EVM Bytecode Compiler Logical formula Constraint Solver Z3, CVC5, Yices, Vampire Counterexamples Static analyzer TAC VC Generator TAC The Certora Prover Pipeline Putting It All Together https://demo.certora.com CERTORA PROVER Overflow! Certora Inc. - Founded in 2019 - 60 software engineers including 13 PhDs - Offices in Tel Aviv and Seattle - Teams: - Static analysis - SMT - Frontend - Rulewriters - Fuzzing and mutation testing - Security Engineers (white hat hackers) Certora Inc. - Founded in 2019 - 60 software engineers including 13 PhDs - Offices in Tel Aviv and Seattle - Teams: - Static analysis - SMT - Frontend - Rulewriters - Fuzzing and mutation testing - Security Engineers (white hat hackers) WE ARE HIRING full time, part time, internship (contact me, jaroslav@certora.com, or see https://www.certora.com)