Formal Verification

Formal Methods
Mathematically rigorous techniques and tools for the
  specification
  design
  verification
of software and hardware systems.
Definition 1 Formal verification is the act of proving or disproving the correctness of a system with respect to a certain formal specification or property.

Formal Verification Techniques
manual – a human tries to produce a proof of correctness
semi-automatic – theorem proving
automatic – an algorithm takes a model and a property and decides whether the model satisfies the property
We focus on automatic techniques.

Application Domains of FV
generally safety-critical systems: systems whose failure can cause death, injury, or large financial losses (e.g., aircraft, nuclear power stations)
particularly embedded systems:
  often safety-critical
  reasonably small and thus amenable to formal verification

Model Checking
automatic verification technique
user produces:
  a model of a system
  a logical formula which describes the desired properties
model checking algorithm:
  checks if the model satisfies the formula
  if the property is not satisfied, a counterexample is produced

State Space
model checking algorithms are based on state space exploration, i.e., "brute force"
the state space describes all possible behaviors of the model
state space ≈ graph:
  nodes = states of the system
  edges = transitions of the system
in order to construct the state space, the model must be closed, i.e., we need to model the environment of the system

Example: Model and State Space
[figure]

Example: Peterson’s Algorithm
flag[0], flag[1] (initialized to false) – meaning "I want to access the CS"
turn (initialized to 1) – used to resolve conflicts
Process 0:
  while (true) {
    <noncritical section>;
    flag[0] := true;
    turn := 1;
    while flag[1] and turn = 1 do {};
    <critical section>;
    flag[0] := false;
  }
Process 1:
  while (true) {
    <noncritical section>;
    flag[1] := true;
    turn := 2;
    while flag[0] and turn = 2 do {};
    <critical section>;
    flag[1] := false;
  }

Example: Peterson’s Algorithm
desired property: always, at most one
process in CS: G(¬([3][3]))

Model Checking: Steps
1. modeling: system → model
2. specification: natural language → property
3. verification: algorithm for checking whether a model satisfies the property
For real-time systems:
  modeling formalism: timed automata
  specification formalism: reachability, (timed logics)

Fischer’s Protocol
real-time mutual exclusion protocol – correctness depends on timing assumptions
simple: just 1 shared variable, arbitrary number of processes
assumptions:
  known upper bound D on the time between successive steps of the execution of a process while it attempts to access its critical section
  each process has its own timer (for delaying)

Fischer’s Protocol
id – shared variable, initialized to -1
each process has its own timer (for delaying)
for correctness it is necessary that K > D
Process i:
  while (true) {
    <noncritical section>;
    while id != -1 do {};
    id := i;
    delay K;
    if (id = i) {
      <critical section>;
      id := -1;
    }
  }

Modeling Fischer’s Protocol
How to model clocks?
How to model waiting (delay)?

Modeling Real Time Systems
Two possible models of time:
  discrete time domain
  continuous time domain

Discrete Time Domain
clocks tick at regular intervals
at each tick, something may happen
between ticks, the system only waits
choose a fixed sample period ε
all events happen at multiples of ε
simple extension of the classical model (time = new integer variable)
main disadvantage – how to choose ε ?
  big ε ⇒ model too coarse
  small ε ⇒ time fragmentation, state space too big
usage: particularly synchronous systems (hardware circuits)

Continuous Time Domain
time ≈ real number
delays may be arbitrarily small
more faithful model, suitable for asynchronous systems
model checking ≈ traversal of the state space
Problem: uncountable state space ⇒ cannot be directly handled by "brute force"

Timed Automata
extension of finite state machines with clocks
continuous real-time semantics
limited list of operations over clocks ⇒ automatic verification is feasible
allowed operations:
  comparison of a clock with a constant
  reset of a clock
  uniform flow of time (all clocks have the same rate)

What is a Timed Automaton?
an automaton with locations (states) and edges
the automaton spends time only in locations, not in edges

What is a Timed Automaton?
real-valued clocks
all clocks run at the same speed
clock constraints guard the edges

What is a Timed Automaton?
clocks can be reset when taking an edge
only a reset to value 0 is allowed

What is a Timed Automaton?
location invariants forbid staying in a location too long
invariants must be satisfied ⇒ they force taking an edge
We also add labels to edges to allow the definition of languages, behavioral equivalences, etc.

Timed Automata – Clock Constraints
Definition 2 Let C be a set of clocks. Then the set B(C) of clock constraints is defined by the following abstract syntax
  g ::= x ∼ k | g ∧ g
where x ∈ C, k ∈ N and ∼ ∈ {≤, <, =, >, ≥}.

Timed Automata
Let C be a set of clocks and let Σ be a finite set of actions.
Definition 3 A timed automaton is a 4-tuple A = (L, ℓ_0, E, I) where
  L is a finite set of locations
  ℓ_0 ∈ L is an initial location
  E ⊆ L × B(C) × Σ × 2^C × L is a finite set of edges
  I : L → B(C) assigns invariants to locations
edge = (source location, clock constraint, action, set of clocks to be reset, target location)
We omit the actions from edges if either Σ is a singleton set or the actions are not relevant (e.g.
for reachability)

Semantics: Main Idea
semantics is a transition system (states & transitions)
states given by:
  location (local state of the automaton)
  clock valuation
transitions:
  delay – only the clock valuation changes
  action – change of location

Clock Valuations
a clock valuation is a function ν : C → R+
given a set of clocks Y ⊆ C, denote by ν[Y := 0] the valuation obtained from ν by resetting the clocks in Y:
  ν[Y := 0](x) = 0 if x ∈ Y, and ν(x) otherwise
ν + d ≈ flow of time (by d units): (ν + d)(x) = ν(x) + d
ν |= g means that the valuation ν satisfies the constraint g:
  ν |= x ∼ k iff ν(x) ∼ k
  ν |= g_1 ∧ g_2 iff ν |= g_1 and ν |= g_2

Examples
let ν = (x ↦ 3, y ↦ 2.4, z ↦ 0.5)
what is ν[{y} := 0] (usually written as ν[y := 0])?
what is ν + 1.2?
does ν |= y < 3?
does ν |= x < 4 ∧ z ≥ 1?

Semantics of Timed Automata
Definition 4 The semantics of a timed automaton A is a (labeled) transition system S_A = (S, s_0, →) where
  S = L × (C → R+)
  s_0 = (ℓ_0, ν_0) where ν_0(x) = 0 for all x ∈ C
  transitions are defined by
    delay: (ℓ, ν) –δ→ (ℓ, ν + δ) for all δ ∈ R+ such that ν |= I(ℓ) and ν + δ′ |= I(ℓ) for all 0 ≤ δ′ ≤ δ
    action: (ℓ, ν) –a→ (ℓ′, ν′) iff (ℓ, g, a, Y, ℓ′) ∈ E where ν |= g, ν′ = ν[Y := 0] and ν′ |= I(ℓ′)
We write (ℓ, ν) → (ℓ′, ν′) iff (ℓ, ν) –h→ (ℓ′, ν′) for some h ∈ Σ ∪ R≥0.

Example
[figure]
What is a clock valuation? What is a state?
clock valuation: assignment of a real value to x
initial state (off, 0); another state e.g. (light, 1.4)

Notes
the semantics is infinite state (even uncountable)
the semantics is even infinitely branching
Investigated areas:
  languages – emptiness, universality, language inclusion (undecidable), ...
  equivalence checking – bisimulation of timed automata (timed and untimed), simulation, ...
  verification – reachability, (timed) temporal logics, ...

Reachability Problem
A run is a maximal sequence (i.e.
the one that cannot be prolonged) of the form (ℓ_0, ν_0) → (ℓ_1, ν_1) → ···
Definition 5
Input: a timed automaton A, a location ℓ of the automaton
Question: Does there exist a run of A which reaches ℓ?
This problem formalizes the verification of safety properties: is an erroneous state reachable?

Reachability: Attempt 1
discretization (sampled semantics)
allow time step (delay) 1
clock above maximal constant ⇒ value does not increase
finite state space
not equivalent ⇒ find a counterexample

Reachability: Attempt 2
what about time step 0.5?
what about time step 0.25?
what about time step 2^−n?

Reachability and Discretization
for each automaton there exists ε such that the sampled semantics with time step ε and the dense semantics are equivalent w.r.t. reachability
no fixed ε is sufficient for all timed automata
for more complex verification problems the sampled and dense semantics are not equivalent

Another approach?
Idea: is it necessary to distinguish the following valuations? (0.589, 1.234) and (0.587, 1.235)
some clock valuations are equivalent, as the automaton cannot distinguish between them w.r.t. reachable locations
let us find such equivalence classes (so-called regions)

Complexity of Reachability Problem
Theorem 6 The reachability problem is PSPACE-complete.
note that even decidability is not straightforward – the semantics is infinite state
decidability proved by region construction (to be discussed)
completeness proved by general reduction from linearly bounded Turing machines (not discussed)

Region Construction
Main idea:
define an equivalence ≅ on valuations so that if ν ≅ µ then the automaton “cannot distinguish between (ℓ, ν) and (ℓ, µ)”
define ≅ so that ν ≅ µ implies that for every ℓ:
  if (ℓ, ν) → (ℓ′, ν′) then (ℓ, µ) → (ℓ′, µ′) so that ν′ ≅ µ′
  if (ℓ, µ) → (ℓ′, µ′) then (ℓ, ν) → (ℓ′, ν′) so that ν′ ≅ µ′
In particular, both configurations (ℓ, ν) and (ℓ, µ) can reach the same set of locations
(Note that this equivalence is basically a bisimulation)
work with regions, i.e., equivalence classes of valuations, instead of valuations
finite number of regions
What conditions on ≅ do we need?

Preliminaries
Let d ∈ R≥0. Define
  ⌊d⌋ to be the integer part of d
  fr(d) to be the fractional part of d
Thus d = ⌊d⌋ + fr(d)
Example: ⌊42.37⌋ = 42, fr(42.37) = 0.37

Equivalence on Clock Valuations
Let c_x be the largest constant compared to a clock x (“max bound”)
Two valuations ν and µ are equivalent, ν ≅ µ, iff the following conditions are satisfied for all clocks x, y:
C1 Clock x is above its max bound in both valuations ν and µ, or it has the same integer part in both of them:
  (ν(x) > c_x ∧ µ(x) > c_x) or ⌊ν(x)⌋ = ⌊µ(x)⌋
C2 If the value of a clock is not above its max bound, then either it has zero fractional part in both ν and µ or in neither of them:
  ν(x) ≤ c_x ⇒ (fr(ν(x)) = 0 ⇔ fr(µ(x)) = 0)
C3 For two clocks that are not above their max bounds, the ordering of fractional parts must be the same in both ν and µ:
  ν(x) ≤ c_x ∧ ν(y) ≤ c_y ⇒ (fr(ν(x)) ≤ fr(ν(y)) ⇔ fr(µ(x)) ≤ fr(µ(y)))

Equivalence: Examples
Identify c_x and c_y
suppose c_x = 4, c_y = 5, c_z = 1
let (x, y, z) denote valuations, decide:
1. (0, 0.14, 0.3) ≅ (0.05, 0.1, 0.32) ?
2. (1.9, 4.2, 0.4) ≅ (2.8, 4.3, 0.7) ?
3. (0.05, 0.1, 0.3) ≅ (0.2, 0.1, 0.4) ?
4. (0.03, 1.1, 0.3) ≅ (0.05, 1.2, 0.3) ?
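The three equivalence conditions can be checked mechanically. The following Python sketch is one possible reading of C1 to C3, not a reference implementation: the names fr, region_equivalent and valuation are made up for this illustration, "above its max bound" is taken to be strict, and the guard in C2 is applied to both valuations so that the relation comes out symmetric. Exact fractions are used to avoid floating-point artifacts in the fractional parts.

```python
from fractions import Fraction
from math import floor


def fr(value):
    """Fractional part of a non-negative clock value."""
    return value - floor(value)


def region_equivalent(nu, mu, bound):
    """Check C1-C3 for valuations nu and mu (dicts: clock -> value).

    bound maps each clock x to its max bound c_x.
    """
    clocks = list(bound)
    for x in clocks:
        # C1: both strictly above the max bound, or equal integer parts.
        both_above = nu[x] > bound[x] and mu[x] > bound[x]
        if not both_above and floor(nu[x]) != floor(mu[x]):
            return False
        # C2 (assumption: guard checked on both valuations for symmetry):
        # a zero fractional part must occur in both valuations or in neither.
        if nu[x] <= bound[x] or mu[x] <= bound[x]:
            if (fr(nu[x]) == 0) != (fr(mu[x]) == 0):
                return False
    for x in clocks:
        for y in clocks:
            # C3: same ordering of fractional parts for clocks within bounds.
            if nu[x] <= bound[x] and nu[y] <= bound[y]:
                if (fr(nu[x]) <= fr(nu[y])) != (fr(mu[x]) <= fr(mu[y])):
                    return False
    return True


def valuation(x, y, z):
    """Build an exact valuation over clocks x, y, z from decimal strings."""
    return {"x": Fraction(x), "y": Fraction(y), "z": Fraction(z)}


# Max bounds and the four pairs from the examples slide.
bound = {"x": 4, "y": 5, "z": 1}
pairs = [
    (valuation("0", "0.14", "0.3"), valuation("0.05", "0.1", "0.32")),
    (valuation("1.9", "4.2", "0.4"), valuation("2.8", "4.3", "0.7")),
    (valuation("0.05", "0.1", "0.3"), valuation("0.2", "0.1", "0.4")),
    (valuation("0.03", "1.1", "0.3"), valuation("0.05", "1.2", "0.3")),
]
results = [region_equivalent(nu, mu, bound) for nu, mu in pairs]
print(results)
```

Each False in the printed list can be traced to one condition: pair 1 fails C2 on x, pair 2 fails C1 on x, and pair 3 fails C3 on the order of x and y.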
Regions
Definition 7 Classes of equivalence are called regions, denoted by [ν].
Example: suppose a TA with two clocks, c_x = 3, c_y = 2
draw all regions (since we have just 2 clocks, we can draw them in the plane)

Regions
Lemma 8 ν ≅ µ implies that for every ℓ:
  if (ℓ, ν) → (ℓ′, ν′) then (ℓ, µ) → (ℓ′, µ′) so that ν′ ≅ µ′
  if (ℓ, µ) → (ℓ′, µ′) then (ℓ, ν) → (ℓ′, ν′) so that ν′ ≅ µ′
Lemma 9 The number of regions is at most |C|! · 2^|C| · ∏_{x∈C} (2c_x + 2).

Region Graph
A region graph is a (labeled) transition system where
  states are pairs of the form (ℓ, [ν]) where ℓ is a location and ν is a valuation
  transitions are defined by (ℓ, [ν]) → (ℓ′, [ν′]) iff (ℓ, ν) → (ℓ′, ν′)
the region graph is equivalent to the semantics of A w.r.t. reachability, i.e., a location ℓ is reachable in the region graph iff it is reachable in the semantics of A
moreover, the region graph is finite and can be effectively constructed
⇒ the region graph can be used to solve the reachability problem

Operations on Regions
To construct the region graph, we need the following operations:
let time pass – go to the adjacent region at top right
intersect with a clock constraint (note that clock constraints define supersets of regions)
  if the region is in the constraint: no change
  otherwise: empty
reset a clock – go to the corresponding region

Example: Timed Automaton
[figure]

Example: Region Graph
[figure]
Here transitions that do not change location have been compressed (i.e. each transition in the above graph consists of an arbitrary number of delay transitions followed by one action transition)

Zones – More Efficient Reachability Analysis
Regions are impractical:
  too many regions
  constructed explicitly – no on-the-fly approach
Definition 10 Denote by B+(C) the set of extended clock constraints defined by
  ψ ::= x ∼ k | x − y ∼ k | ψ ∧ ψ
where x, y ∈ C, k ∈ N and ∼ ∈ {≤, <, =, >, ≥}.
Definition 11 A zone is a set of clock valuations described by an extended clock constraint g_Z ∈ B+(C):
  Z = {ν | ν |= g_Z}
A symbolic state is a pair (ℓ, Z) where ℓ is a location and Z is a zone

Zone Operations & Symbolic Transitions
  Z↑ = {ν + δ | ν ∈ Z ∧ δ ∈ R≥0}
  Z[Y := 0] = {ν[Y := 0] | ν ∈ Z}
Lemma 12 If Z is a zone, then both Z↑ and Z[Y := 0] are zones.
Symbolic transition relation ⇝ over symbolic states:
  (ℓ, Z) ⇝ (ℓ, Z↑ ∧ I(ℓ))
  (ℓ, Z) ⇝ (ℓ′, (Z ∧ g)[Y := 0] ∧ I(ℓ′)) if (ℓ, g, a, Y, ℓ′) ∈ E

Zones – Reachability
Theorem 13
  If (ℓ, Z) ⇝ (ℓ′, Z′) and ν′ ∈ Z′, then (ℓ, ν) → (ℓ′, ν′) for some ν ∈ Z
  If (ℓ, ν) → (ℓ′, ν′) with ν ∈ Z, then (ℓ, Z) ⇝ (ℓ′, Z′) with ν′ ∈ Z′
It follows that
  whenever (ℓ′, Z′) is reachable from (ℓ_0, {ν_0}), then all states of the form (ℓ′, ν′) with ν′ ∈ Z′ are reachable from (ℓ_0, ν_0),
  whenever (ℓ′, ν′) is reachable from (ℓ_0, ν_0), then (ℓ′, Z′) with ν′ ∈ Z′ is reachable from (ℓ_0, {ν_0}).

Example: Zones
[figure]

Representation by Difference Bound Matrices
Let C_0 = C ∪ {0} where 0 is the clock with constant value 0
Each zone can be described using a conjunction of constraints of the form
  x − y ≤ k
  x − y < k
where x, y ∈ C_0 and k ∈ Z
When both x − y ≤ k and x − y < k occur, take only x − y < k;
when both x − y ≺ k and x − y ≺ k′ occur (with ≺ ∈ {<, ≤}), take only x − y ≺ min{k, k′}
⇒ There are at most |C_0| · |C_0| such constraints.
Store the constraints in a difference bound matrix

Difference Bound Matrix
x < 20 ∧ y ≤ 20 ∧ y − x ≤ 10 ∧ x − y ≤ −10 ∧ z > 5
[figure: the corresponding matrix]
the matrix representation can be used to perform the necessary operations: passing of time, resetting a clock, intersection with a constraint, ...

Zone Graph: Example (source: R. Alur)
[figure]

Network of TA
interleaving semantics
handshake communication – synchronization on c! and c? pairs

Networks of TA
Let Chan be a finite set of communication channels
Assume Σ = {c! | c ∈ Chan} ∪ {c? | c ∈ Chan} ∪ N where N contains a special action τ (an internal action)
Definition 14 Consider n timed automata A_i = (L_i, ℓ_i^0, E_i, I_i). The parallel composition A = A_1 | ··· | A_n is a network of timed automata.
A location vector: ℓ̄ = (ℓ_1, ..., ℓ_n)
Invariants are composed into common invariants over location vectors: I(ℓ̄) = I_1(ℓ_1) ∧ ··· ∧ I_n(ℓ_n)

Networks of TA – Semantics
Semantics is defined by a transition system (S, s_0, →) where
  S = (L_1 × ··· × L_n) × (C → R≥0), i.e. states are of the form (ℓ̄, ν)
  s_0 = (ℓ̄_0, ν_0) where ℓ̄_0 = (ℓ_1^0, ..., ℓ_n^0) and ν_0(x) = 0 for x ∈ C
  transitions:
    delay: (ℓ̄, ν) → (ℓ̄, ν + δ) if ν + δ′ |= I(ℓ̄) for each δ′ ∈ [0, δ]
    action: ((ℓ_1, ..., ℓ_i, ..., ℓ_n), ν) → ((ℓ_1, ..., ℓ_i′, ..., ℓ_n), ν′) if there exists (ℓ_i, g, a, Y, ℓ_i′) ∈ E_i such that ν |= g, ν′ = ν[Y := 0] and ν′ |= I(ℓ_1, ..., ℓ_i′, ..., ℓ_n)
    synchronization: ((ℓ_1, ..., ℓ_i, ..., ℓ_j, ..., ℓ_n), ν) → ((ℓ_1, ..., ℓ_i′, ..., ℓ_j′, ..., ℓ_n), ν′) if there exist (ℓ_i, g_i, c?, Y_i, ℓ_i′) ∈ E_i and (ℓ_j, g_j, c!, Y_j, ℓ_j′) ∈ E_j such that ν |= g_i ∧ g_j, ν′ = ν[Y_i ∪ Y_j := 0] and ν′ |= I(ℓ_1, ..., ℓ_i′, ..., ℓ_j′, ..., ℓ_n)

UPPAAL Tool
UPPAAL is a toolbox for modeling, simulation and verification of real-time systems
Uppsala University + Aalborg University = UPPAAL
Modeling language: networks of timed automata (+ additional features)
widely used for teaching
several industrial case studies
www.uppaal.org

Functionality of UPPAAL
modeling – graphical tool for specification of timed automata, templates
simulation – simulation of the model (manual, random)
verification – verification of simple properties (restricted subset of Computation Tree Logic); a counterexample can be simulated
Java user interface and C++ verification engine

Extensions of Timed Automata (UPPAAL)
Bounded integer variables – declared as int[min,max] name where min and max are the lower and upper bound, respectively. Violating a bound leads to an invalid state that is discarded at run-time.
Arrays
Broadcast channels – One sender c! can synchronize with an arbitrary number of receivers c?. Any available receiver must synchronize. Broadcast sending is never blocking.
Fischer’s Algorithm
[figure]
With the following declarations (for 6 processes):
  int[0,6] id; const k 2; clock x
and the following parameter (for 6 processes):
  int[1,6] pid

Extensions to TA in UPPAAL
Urgent locations – time is not allowed to pass in the location, i.e., they are semantically equivalent to adding an extra clock x that is reset on all incoming edges, and having an invariant x ≤ 0 on the location
Committed locations – even more restrictive than urgent locations. A state of a network is committed if any of its locations is committed. A committed state cannot delay and the next transition must involve an outgoing edge from at least one of the committed locations ... useful in modeling atomic actions

The properties
The UPPAAL tool uses a simple fragment of CTL as a specification language
Syntax:
  E◇P | A◇P | E□P | A□P | P-->P
  P ::= A.ℓ | g_c | g_d | ¬P | P ∨ P
where
  A.ℓ – a location ℓ of an automaton A (in a given network)
  g_c – a clock constraint
  g_d – a predicate over data variables (such as v ≥ 1, or v == v − 1)

Properties
E◇P = it is possible to reach a state in which P is satisfied (written as E<>P)
A◇P = P will inevitably become true; the automaton is guaranteed to eventually reach a state in which P is true.
(written as A<>P)

Properties
A□P = P holds always and everywhere in the future (written as A[]P)
E□P = P is potentially always true; there is a run in which P is true in all states (written as E[]P)

Properties
P-->Q = P leads to Q; if P becomes true, Q will inevitably become true later on;
P-->Q ≡ A□(P imply A◇Q) (written as P-->Q)

Timed CTL – Very Briefly
Syntax of TCTL state-formulas over a set of atomic propositions AP and a set of clocks C:
  Φ ::= true | a | g | Φ ∧ Φ | ¬Φ | E Φ U^J Φ | A Φ U^J Φ
where a ∈ AP, g is a clock constraint and J is an interval in R≥0 with bounds in N

TCTL – Very Briefly
A run is divergent if its total execution time is infinite
recall that a run is a maximal path; it can be convergent if either it makes finitely many transitions, or the total length of its delays converges to a finite number
Let L be a function which assigns a set of atomic propositions to every location. For a state s = (ℓ, ν) we define a satisfaction relation |= by
  s |= true
  s |= a iff a ∈ L(ℓ)
  s |= g iff ν |= g
  s |= ¬Φ iff s ⊭ Φ
  s |= Φ_1 ∧ Φ_2 iff (s |= Φ_1) and (s |= Φ_2)
  s |= E Φ_1 U^J Φ_2 iff ω |= Φ_1 U^J Φ_2 for some divergent run ω starting in s
  s |= A Φ_1 U^J Φ_2 iff ω |= Φ_1 U^J Φ_2 for all divergent runs ω starting in s

TCTL – Very Briefly
Let ω = (ℓ_0, ν_0) –h_1→ (ℓ_1, ν_1) –h_2→ (ℓ_2, ν_2) –h_3→ ··· be a divergent run
Here each h_i is either a real number (delay), or an action of Σ
Define
  δ_i = h_i if h_i is a delay in R≥0, and δ_i = 0 otherwise, i.e., if h_i ∈ Σ
Given t ∈ R≥0 we denote by ω_t the state “visited” by ω at time t:
ω_t = (ℓ_i, ν_i + δ) where i is the maximal number s.t.
Σ_{j=1}^{i} δ_j ≤ t, and δ = t − Σ_{j=1}^{i} δ_j
ω |= Φ_1 U^J Φ_2 if there is a time t ∈ J such that
  ω_t |= Φ_2
  ω_t′ |= Φ_1 ∨ Φ_2 for all t′ < t

TCTL – Very Briefly – Examples
E true U^[0,1] a = there exists a run which reaches a location satisfying a during the first time unit
E b U^[0,1] a = there exists a run which reaches a location satisfying a at some time t ∈ [0, 1] and before that visits only states that satisfy either b or a
Define ◇^J Φ ≡ true U^J Φ and □^J Φ ≡ ¬◇^J ¬Φ
A [0,3] a
A [1,2] (E [0,1] a)
A [0,1] ¬(E [0,11] a)
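The definition of ω_t can be tried out on a finite prefix of a run. The Python sketch below is only an illustration under stated assumptions: the function name state_at and the encoding (a list of states, and a list of labels in which numbers are delays and strings are actions) are hypothetical, and the example run over locations off and light with one clock x is invented for the demonstration. A real divergent run is infinite, so the sketch raises an error when t lies beyond the given prefix.

```python
def state_at(states, labels, t):
    """Return the state omega_t visited at time t by a finite run prefix.

    states[i] is (location, valuation) with valuation a dict clock -> value;
    labels[i - 1] is h_i: a number for a delay, a string for an action.
    """
    elapsed = 0.0  # sum of delta_1 .. delta_i for the current i
    i = 0
    for k, h in enumerate(labels, start=1):
        delta_k = 0.0 if isinstance(h, str) else h  # actions take no time
        if elapsed + delta_k > t:
            break  # step k would overshoot t, so i is maximal
        elapsed += delta_k
        i = k
    else:
        # All steps consumed: the prefix must cover time t.
        if elapsed < t:
            raise ValueError("time t lies beyond the given run prefix")
    location, nu = states[i]
    delta = t - elapsed
    return location, {clock: value + delta for clock, value in nu.items()}


# Hypothetical run prefix: wait 1.5, press (resets x), wait 2.0.
states = [
    ("off", {"x": 0.0}),
    ("off", {"x": 1.5}),
    ("light", {"x": 0.0}),
    ("light", {"x": 2.0}),
]
labels = [1.5, "press", 2.0]

print(state_at(states, labels, 1.0))
print(state_at(states, labels, 1.5))
print(state_at(states, labels, 2.5))
```

Because i is chosen maximal, an action taken exactly at time t is already reflected in ω_t: at t = 1.5 the sketch reports the state after press, not the one before it.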