IA159 Formal Methods for Software Analysis Symbolic Execution and Applications Jan Strejˇcek Faculty of Informatics Masaryk University Focus and sources focus symbolic execution automated whitebox fuzz testing bounded model checking sources J. C. King: Symbolic Execution and Program Testing, Communications of ACM, 1976. P. Godefroid, M. Y. Levin, and D. Molnar: Automated whitebox fuzz testing, NDSS 2008. Special thanks to Marek Trtík for providing me his slides. IA159 Formal Methods for Software Analysis: Symbolic Execution and Applications 2/53 Motivation 1 procedure sum (a, b, c) { 2 x = a + b; 3 y = b + c; 4 z = x + y − b; 5 return z; 6 } Testing checks that the program behaves correctly on selected inputs sum (1, 1, 1) = sum (1, 2, 3) = . . . IA159 Formal Methods for Software Analysis: Symbolic Execution and Applications 3/53 Motivation 1 procedure sum (a, b, c) { 2 x = a + b; 3 y = b + c; 4 z = x + y − b; 5 return z; 6 } Testing checks that the program behaves correctly on selected inputs sum (1, 1, 1) = 3 sum (1, 2, 3) = 6 . . . IA159 Formal Methods for Software Analysis: Symbolic Execution and Applications 4/53 Motivation 1 procedure sum (a, b, c) { 2 x = a + b; 3 y = b + c; 4 z = x + y − b; 5 return z; 6 } We can execute the program with symbols α1, α2, α3 representing arbitrary input values sum (α1, α2, α3) = IA159 Formal Methods for Software Analysis: Symbolic Execution and Applications 5/53 Motivation 1 procedure sum (a, b, c) { 2 x = a + b; 3 y = b + c; 4 z = x + y − b; 5 return z; 6 } We can execute the program with symbols α1, α2, α3 representing arbitrary input values sum (α1, α2, α3) = α1 + α2 + α3 → symbolic execution IA159 Formal Methods for Software Analysis: Symbolic Execution and Applications 6/53 Symbolic execution semantics in general Each programming language has an execution semantics describing the data objects which program variables may represent how statements manipulate data objects how control flows through the statements of a program In symbolic execution semantics real data objects can be represented by symbols basic operators of the language are extended to accept symbolic input and produce symbolic output IA159 Formal Methods for Software Analysis: Symbolic Execution and Applications 7/53 Simple programming language Consider the following programming language all program variables are of type unbounded signed integer input can be obtained by procedure parameters, global variables, or read operations arithmetic expressions may contain only operators +, −, ∗ commands: assignment := goto