Formal Verification

Formal Methods
Mathematically rigorous techniques and tools for the
- specification
- design
- verification
of software and hardware systems.
Definition 1. Formal verification is the act of proving or disproving the correctness of a system with respect to a certain formal specification or property.

Formal Verification Techniques
- manual: a human tries to produce a proof of correctness
- semi-automatic: theorem proving
- automatic: an algorithm takes a model and a property and decides whether the model satisfies the property
We focus on automatic techniques.

Application Domains of FV
- generally, safety-critical systems: systems whose failure can cause death, injury, or big financial losses (e.g., aircraft, nuclear power stations)
- particularly, embedded systems:
  - often safety-critical
  - reasonably small and thus amenable to formal verification

Model Checking
- automatic verification technique
- the user produces:
  - a model of a system
  - a logical formula which describes the desired properties
- the model checking algorithm:
  - checks whether the model satisfies the formula
  - if the property is not satisfied, produces a counterexample

State Space
- model checking algorithms are based on state space exploration, i.e., "brute force"
- the state space describes all possible behaviors of the model
- state space ≈ graph:
  - nodes = states of the system
  - edges = transitions of the system
- in order to construct the state space, the model must be closed, i.e., we need to model the environment of the system

Example: Model and State Space

Example: Peterson's Algorithm
- flag[0], flag[1] (initialized to false): flag[i] means that process i wants to access the CS
- turn (initialized to 1): used to resolve conflicts

Process 0:
while (true) {
  noncritical section;
  flag[0] := true;
  turn := 1;
  while flag[1] and turn = 1 do {};
  critical section;
  flag[0] := false;
}

Process 1:
while (true) {
  noncritical section;
  flag[1] := true;
  turn := 2;
  while flag[0] and turn = 2 do {};
  critical section;
  flag[1] := false;
}

Example: Peterson's Algorithm
- desired property: always, at most one process in CS: $G\,\neg(\mathit{cs}_0 \wedge \mathit{cs}_1)$, where $\mathit{cs}_i$ holds iff process $i$ is in its critical section

Model Checking: Steps
1. modeling: system → model
2. specification: natural language → property
3. verification: algorithm for checking whether a model satisfies the property
For real-time systems:
- modeling formalism: timed automata
- specification formalism: reachability, (timed logics)

Fischer's Protocol
- real-time mutual exclusion protocol: correctness depends on timing assumptions
- simple: just 1 shared variable, arbitrary number of processes
- assumptions:
  - a known upper bound D on the time between successive steps of the execution of a process while it attempts to access its critical section
  - each process has its own timer (for delaying)

Fischer's Protocol
- id: shared variable, initialized to -1
- each process has its own timer (for delaying)
- for correctness it is necessary that K > D

Process i:
while (true) {
  noncritical section;
  while id != -1 do {};
  id := i;
  delay K;
  if (id = i) {
    critical section;
    id := -1;
  }
}

Modeling Fischer's Protocol
- How to model clocks?
- How to model waiting (delay)?

Modeling Real Time Systems
Two possible models of time:
- discrete time domain
- continuous time domain

Discrete Time Domain
- clocks tick at regular intervals
- at each tick, something may happen
- between ticks, the system only waits
- choose a fixed sample period $\varepsilon$
- all events happen at multiples of $\varepsilon$
- simple extension of the classical model (time = new integer variable)
- main disadvantage: how to choose $\varepsilon$?
  - large $\varepsilon$ ⇒ the model is too coarse
  - small $\varepsilon$ ⇒ time fragmentation, the state space is too big
- usage: particularly synchronous systems (hardware circuits)

Continuous Time Domain
- time ≈ real number
- delays may be arbitrarily small
- more faithful model, suitable for asynchronous systems
- model checking ≈ traversal of the state space
Problem: the state space is uncountable ⇒ it cannot be handled directly by "brute force"

Timed Automata
- extension of finite state machines with clocks
- continuous real-time semantics
- limited list of operations over clocks ⇒ automatic verification is feasible
- allowed operations:
  - comparison of a clock with a constant
  - reset of a clock
  - uniform flow of time (all clocks have the same rate)

What is a Timed Automaton?
- an automaton with locations (states) and edges
- the automaton spends time only in locations, not on edges
- real-valued clocks
- all clocks run at the same speed
- clock constraints guard the edges
- clocks can be reset when taking an edge
- only a reset to value 0 is allowed
- location invariants forbid staying in a location too long
- invariants must be satisfied ⇒ they force taking an edge
We also add labels to edges to allow the definition of languages, behavioral equivalences, etc.

Timed Automata: Clock Constraints
Definition 2. Let $C$ be a set of clocks. Then the set $B(C)$ of clock constraints is defined by the following abstract syntax:
  $g ::= x \bowtie k \mid g \wedge g$
where $x \in C$, $k \in \mathbb{N}$ and $\bowtie\ \in \{\le, <, =, >, \ge\}$.

Timed Automata
Let $C$ be a set of clocks and let $\Sigma$ be a finite set of actions.
Definition 3. A timed automaton is a 4-tuple $A = (L, \ell_0, E, I)$ where
- $L$ is a finite set of locations
- $\ell_0 \in L$ is an initial location
- $E \subseteq L \times B(C) \times \Sigma \times 2^{C} \times L$ is a finite set of edges
- $I : L \to B(C)$ assigns invariants to locations
edge = (source location, clock constraint, action, set of clocks to be reset, target location)
We omit the actions from edges if either $\Sigma$ is a singleton set or the actions are not relevant (e.g. for reachability).

Semantics: Main Idea
- the semantics is a transition system (states & transitions)
- states are given by:
  - a location (local state of the automaton)
  - a clock valuation
- transitions:
  - delay: only the clock valuation changes
  - action: change of location

Clock Valuations
- a clock valuation is a function $\nu : C \to \mathbb{R}_{\ge 0}$
- given a set of clocks $Y \subseteq C$, denote by $\nu[Y := 0]$ the valuation obtained from $\nu$ by resetting the clocks from $Y$:
  $\nu[Y := 0](x) = \begin{cases} 0 & x \in Y \\ \nu(x) & \text{otherwise} \end{cases}$
- $\nu + d$ ≈ flow of time (by $d$ units): $(\nu + d)(x) = \nu(x) + d$
- $\nu \models g$ means that the valuation $\nu$ satisfies the constraint $g$:
  - $\nu \models x \bowtie k$ iff $\nu(x) \bowtie k$
  - $\nu \models g_1 \wedge g_2$ iff $\nu \models g_1$ and $\nu \models g_2$

Examples
Let $\nu = (x \mapsto 3,\ y \mapsto 2.4,\ z \mapsto 0.5)$.
- what is $\nu[\{y\} := 0]$ (usually written as $\nu[y := 0]$)?
- what is $\nu + 1.2$?
- does $\nu \models y < 3$?
- does $\nu \models x < 4 \wedge z \ge 1$?

Semantics of Timed Automata
Definition 4. The semantics of a timed automaton $A$ is a (labeled) transition system $S_A = (S, s_0, \to)$ where
- $S = L \times (C \to \mathbb{R}_{\ge 0})$
- $s_0 = (\ell_0, \nu_0)$ where $\nu_0(x) = 0$ for all $x \in C$
- transitions are defined by
  - delay: $(\ell, \nu) \xrightarrow{\delta} (\ell, \nu + \delta)$ for all $\delta \in \mathbb{R}_{\ge 0}$ such that $\nu \models I(\ell)$ and $\nu + \delta' \models I(\ell)$ for all $0 \le \delta' \le \delta$
  - action: $(\ell, \nu) \xrightarrow{a} (\ell', \nu')$ iff $(\ell, g, a, Y, \ell') \in E$ where $\nu \models g$, $\nu' = \nu[Y := 0]$ and $\nu' \models I(\ell')$
We write $(\ell, \nu) \to (\ell', \nu')$ iff $(\ell, \nu) \xrightarrow{h} (\ell', \nu')$ for some $h \in \Sigma \cup \mathbb{R}_{\ge 0}$.

Example
What is a clock valuation? What is a state?
- clock valuation: assignment of a real value to $x$
- initial state $(\mathit{off}, 0)$; another state is e.g. $(\mathit{light}, 1.4)$

Notes
- the semantics is infinite state (even uncountable)
- the semantics is even infinitely branching
Investigated areas:
- languages: emptiness, universality, language inclusion (undecidable), ...
- equivalence checking: bisimulation of timed automata (timed and untimed), simulation, ...
- verification: reachability, (timed) temporal logics, ...

Reachability Problem
A run is a maximal sequence (i.e. one that cannot be prolonged) of the form $(\ell_0, \nu_0) \to (\ell_1, \nu_1) \to \cdots$
Definition 5.
Input: a timed automaton $A$, a location $\ell$ of the automaton
Question: Does there exist a run of $A$ which reaches $\ell$?
This problem formalizes the verification of safety properties: is an erroneous state reachable?

Reachability: Attempt 1
- discretization (sampled semantics)
- allow time step (delay) 1
- clock above its maximal constant ⇒ the value does not increase
- finite state space
- not equivalent ⇒ find a counterexample

Reachability: Attempt 2
- what about time step 0.5?
- what about time step 0.25?
- what about time step $2^{-n}$?

Reachability and Discretization
- for each automaton there exists $\varepsilon$ such that the sampled semantics with time step $\varepsilon$ and the dense semantics are equivalent w.r.t. reachability
- no fixed $\varepsilon$ is sufficient for all timed automata
- for more complex verification problems the sampled and dense semantics are not equivalent

Another approach?
Idea:
- is it necessary to distinguish the following valuations? $(0.589, 1.234)$ and $(0.587, 1.235)$
- some clock valuations are equivalent, as the automaton cannot distinguish between them w.r.t. reachable locations
- let us find such equivalence classes (so-called regions)

Complexity of Reachability Problem
Theorem 6. The reachability problem is PSPACE-complete.
- note that even decidability is not straightforward: the semantics is infinite state
- decidability is proved by the region construction (to be discussed)
- completeness is proved by a general reduction from linearly bounded Turing machines (not discussed)

Region Construction
Main idea:
- define an equivalence $\cong$ on valuations so that if $\nu \cong \mu$ then the automaton "cannot distinguish between $(\ell, \nu)$ and $(\ell, \mu)$"
- define $\cong$ so that $\nu \cong \mu$ implies that for every $\ell$
  - if $(\ell, \nu) \to (\ell', \nu')$ then $(\ell, \mu) \to (\ell', \mu')$ so that $\nu' \cong \mu'$
  - if $(\ell, \mu) \to (\ell', \mu')$ then $(\ell, \nu) \to (\ell', \nu')$ so that $\nu' \cong \mu'$
  In particular, both configurations $(\ell, \nu)$ and $(\ell, \mu)$ can reach the same set of locations. (Note that this equivalence is basically a bisimulation.)
- work with regions, i.e., equivalence classes of valuations, instead of valuations
- finite number of regions
What conditions on $\cong$ do we need?

Preliminaries
Let $d \in \mathbb{R}_{\ge 0}$. Define
- $\lfloor d \rfloor$ to be the integer part of $d$
- $\mathrm{fr}(d)$ to be the fractional part of $d$
Thus $d = \lfloor d \rfloor + \mathrm{fr}(d)$.
Example: $\lfloor 42.37 \rfloor = 42$, $\mathrm{fr}(42.37) = 0.37$

Equivalence on Clock Valuations
Let $c_x$ be the largest constant compared to a clock $x$ ("max bound").
Two valuations $\nu$ and $\mu$ are equivalent, $\nu \cong \mu$, iff the following conditions are satisfied for all clocks $x, y$:
C1. Clock $x$ is above its max bound in both valuations $\nu$ and $\mu$, or it has the same integer part in both of them:
  $(\nu(x) > c_x \wedge \mu(x) > c_x)$ or $\lfloor \nu(x) \rfloor = \lfloor \mu(x) \rfloor$
C2. If the value of a clock is not above its max bound, then either it has zero fractional part in both $\nu$ and $\mu$ or in neither of them:
  $\nu(x) \le c_x \Rightarrow (\mathrm{fr}(\nu(x)) = 0 \Leftrightarrow \mathrm{fr}(\mu(x)) = 0)$
C3. For two clocks that are not above their max bounds, the ordering of fractional parts must be the same in both $\nu$ and $\mu$:
  $\nu(x) \le c_x \wedge \nu(y) \le c_y \Rightarrow (\mathrm{fr}(\nu(x)) \le \mathrm{fr}(\nu(y)) \Leftrightarrow \mathrm{fr}(\mu(x)) \le \mathrm{fr}(\mu(y)))$

Equivalence: Examples
Identify $c_x$ and $c_y$.
Suppose $c_x = 4$, $c_y = 5$, $c_z = 1$ and let $(x, y, z)$ denote valuations. Decide:
1. $(0, 0.14, 0.3) \cong (0.05, 0.1, 0.32)$?
2. $(1.9, 4.2, 0.4) \cong (2.8, 4.3, 0.7)$?
3. $(0.05, 0.1, 0.3) \cong (0.2, 0.1, 0.4)$?
4. $(0.03, 1.1, 0.3) \cong (0.05, 1.2, 0.3)$?
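The three conditions can be checked mechanically. The following is a minimal Python sketch, assuming that "above its max bound" in C1 means strictly greater than $c_x$, and applying C2 whenever a clock is not above its bound in both valuations (a symmetric reading of the condition). The names `fr`, `region_equivalent`, `valuation`, `BOUND` and `PAIRS`, and the dictionary encoding of valuations, are illustrative choices and not part of the slides. Exact fractions are used so that fractional parts compare reliably.

```python
from fractions import Fraction
from math import floor


def fr(value):
    """Fractional part of a non-negative exact value."""
    return value - floor(value)


def region_equivalent(nu, mu, bound):
    """Check conditions C1-C3 for two valuations (dicts clock -> Fraction).

    `bound` maps each clock x to its max bound c_x (assumed encoding).
    """
    clocks = list(bound)
    for x in clocks:
        # C1: both strictly above the bound, or the same integer part.
        if nu[x] > bound[x] and mu[x] > bound[x]:
            continue
        if floor(nu[x]) != floor(mu[x]):
            return False
        # C2: zero fractional part in both valuations or in neither.
        if (fr(nu[x]) == 0) != (fr(mu[x]) == 0):
            return False
    # C3: same ordering of fractional parts for clocks not above their bound.
    low = [x for x in clocks if nu[x] <= bound[x]]
    for x in low:
        for y in low:
            if (fr(nu[x]) <= fr(nu[y])) != (fr(mu[x]) <= fr(mu[y])):
                return False
    return True


def valuation(x, y, z):
    """Build a valuation from decimal strings, keeping the values exact."""
    return {"x": Fraction(x), "y": Fraction(y), "z": Fraction(z)}


BOUND = {"x": 4, "y": 5, "z": 1}
PAIRS = [
    (valuation("0", "0.14", "0.3"), valuation("0.05", "0.1", "0.32")),
    (valuation("1.9", "4.2", "0.4"), valuation("2.8", "4.3", "0.7")),
    (valuation("0.05", "0.1", "0.3"), valuation("0.2", "0.1", "0.4")),
    (valuation("0.03", "1.1", "0.3"), valuation("0.05", "1.2", "0.3")),
]

for number, (nu, mu) in enumerate(PAIRS, start=1):
    print(number, region_equivalent(nu, mu, BOUND))
```

Under these assumptions the sketch reports only the fourth pair as equivalent: pair 1 violates C2 on $x$, pair 2 violates C1 on $x$, and pair 3 violates C3 on $x$ and $y$.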
Regions
Definition 7. The equivalence classes of $\cong$ are called regions; the region containing $\nu$ is denoted by $[\nu]$.
Example: suppose a TA with two clocks, $c_x = 3$, $c_y = 2$; draw all regions (since we have just 2 clocks, we can draw them in the plane).

Regions
Lemma 8. $\nu \cong \mu$ implies that for every $\ell$
- if $(\ell, \nu) \to (\ell', \nu')$ then $(\ell, \mu) \to (\ell', \mu')$ so that $\nu' \cong \mu'$
- if $(\ell, \mu) \to (\ell', \mu')$ then $(\ell, \nu) \to (\ell', \nu')$ so that $\nu' \cong \mu'$
Lemma 9. The number of regions is at most $|C|! \cdot 2^{|C|} \cdot \prod_{x \in C} (2c_x + 2)$.

Region Graph
A region graph is a (labeled) transition system where
- states are pairs of the form $(\ell, [\nu])$ where $\ell$ is a location and $\nu$ is a valuation
- transitions are defined by $(\ell, [\nu]) \to (\ell', [\nu'])$ iff $(\ell, \nu) \to (\ell', \nu')$
- the region graph is equivalent to the semantics of $A$ w.r.t. reachability, i.e., a location is reachable in the region graph iff it is reachable in the semantics of $A$
- moreover, the region graph is finite and can be effectively constructed
⇒ the region graph can be used to solve the reachability problem

Operations on Regions
To construct the region graph, we need the following operations:
- let time pass: go to the adjacent region at the top right
- intersect with a clock constraint (note that clock constraints define supersets of regions):
  - if the region is in the constraint: no change
  - otherwise: empty
- reset a clock: go to the corresponding region

Example: Timed Automaton

Example: Region Graph
Here transitions that do not change location have been compressed (i.e. each transition in the above graph consists of an arbitrary number of delay transitions followed by one action transition).

Zones: More Efficient Reachability Analysis
Regions are impractical:
- too many regions
- constructed explicitly, no on-the-fly approach
Definition 10. Denote by $B^{+}(C)$ the set of extended clock constraints defined by
  $\psi ::= x \bowtie k \mid x - y \bowtie k \mid \psi \wedge \psi$
where $x, y \in C$, $k \in \mathbb{N}$ and $\bowtie\ \in \{\le, <, =, >, \ge\}$.
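Satisfaction of an extended clock constraint extends the relation $\nu \models g$ in the obvious way: a difference atom $x - y \bowtie k$ holds iff $\nu(x) - \nu(y) \bowtie k$. The following minimal Python sketch illustrates this under an assumed encoding, where a constraint is a list of atoms `(x, y, op, k)` read as a conjunction and `y` is `None` for a plain atom $x \bowtie k$. The names `OPS`, `satisfies` and `NU` are hypothetical and only serve the illustration.

```python
import operator
from fractions import Fraction

# Comparison operators allowed in clock constraints.
OPS = {
    "<=": operator.le,
    "<": operator.lt,
    "==": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
}


def satisfies(nu, constraint):
    """Return True iff valuation `nu` satisfies every atom of `constraint`.

    Each atom is (x, y, op, k): `x op k` when y is None, else `x - y op k`.
    """
    for x, y, op, k in constraint:
        left = nu[x] if y is None else nu[x] - nu[y]
        if not OPS[op](left, k):
            return False
    return True


# The valuation nu = (x -> 3, y -> 2.4, z -> 0.5) from the Examples slide.
NU = {"x": Fraction(3), "y": Fraction("2.4"), "z": Fraction("0.5")}

# x - y <= 1 and z < 1
print(satisfies(NU, [("x", "y", "<=", 1), ("z", None, "<", 1)]))
# x - z < 2
print(satisfies(NU, [("x", "z", "<", 2)]))
```

The first constraint holds for this valuation ($3 - 2.4 = 0.6 \le 1$ and $0.5 < 1$), while the second does not ($3 - 0.5 = 2.5$).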
Definition 11. A zone is a set of clock valuations described by an extended clock constraint $g_Z \in B^{+}(C)$:
  $Z = \{\nu \mid \nu \models g_Z\}$
A symbolic state is a pair $(\ell, Z)$ where $\ell$ is a location and $Z$ a zone.

Zone Operations & Symbolic Transitions
  $Z^{\uparrow} = \{\nu + \delta \mid \nu \in Z \wedge \delta \in \mathbb{R}_{\ge 0}\}$
  $Z[Y := 0] = \{\nu[Y := 0] \mid \nu \in Z\}$
Lemma 12. If $Z$ is a zone, then both $Z^{\uparrow}$ and $Z[Y := 0]$ are zones.
Symbolic transition relation $\leadsto$ over symbolic states:
- $(\ell, Z) \leadsto (\ell, Z^{\uparrow} \wedge I(\ell))$
- $(\ell, Z) \leadsto (\ell', (Z \wedge g)[Y := 0] \wedge I(\ell'))$ if $(\ell, g, a, Y, \ell') \in E$

Zones: Reachability
Theorem 13.
- If $(\ell, Z) \leadsto (\ell', Z')$ and $\nu' \in Z'$, then $(\ell, \nu) \to (\ell', \nu')$ for some $\nu \in Z$
- If $(\ell, \nu) \to (\ell', \nu')$ with $\nu \in Z$, then $(\ell, Z) \leadsto (\ell', Z')$ with $\nu' \in Z'$
It follows that
- whenever $(\ell', Z')$ is reachable from $(\ell_0, \{\nu_0\})$, then all states of the form $(\ell', \nu')$ with $\nu' \in Z'$ are reachable from $(\ell_0, \nu_0)$,
- whenever $(\ell', \nu')$ is reachable from $(\ell_0, \nu_0)$, then some $(\ell', Z')$ with $\nu' \in Z'$ is reachable from $(\ell_0, \{\nu_0\})$.

Example: Zones

Representation by Difference Bound Matrices
Let $C_0 = C \cup \{0\}$ where $0$ is the clock with constant value 0.
Each zone can be described using a conjunction of constraints of the form
  $x - y \le k$ or $x - y < k$
where $x, y \in C_0$ and $k \in \mathbb{Z}$.
- When both $x - y \le k$ and $x - y < k$ occur, take only $x - y < k$.
- When both $x - y \preceq k$ and $x - y \preceq k'$ occur, take only $x - y \preceq \min\{k, k'\}$.
⇒ There are $|C_0| \cdot |C_0|$ such constraints.
Store the constraints in a difference bound matrix.

Difference Bound Matrix
  $x < 20 \wedge y \le 20 \wedge y - x \le 10 \wedge x - y \le -10 \wedge z > 5$
The matrix representation can be used to perform the necessary operations: passing of time, resetting a clock, intersection with a constraint, ...

Zone Graph: Example
(source: R. Alur)

Network of TA
- interleaving semantics
- handshake communication: synchronization on c! and c? pairs

Networks of TA
Let $\mathit{Chan}$ be a finite set of communication channels.
Assume $\Sigma = \{c! \mid c \in \mathit{Chan}\} \cup \{c? \mid c \in \mathit{Chan}\} \cup N$ where $N$ contains a special action $\tau$ (an internal action).
Definition 14. Consider $n$ timed automata $A_i = (L_i, \ell^i_0, E_i, I_i)$. The parallel composition $A = A_1 \mid \cdots \mid A_n$ is a network of timed automata.
- A location vector: $\vec{\ell} = (\ell_1, \ldots, \ell_n)$
- Invariants are composed into common invariants over location vectors: $I(\vec{\ell}) = I_1(\ell_1) \wedge \cdots \wedge I_n(\ell_n)$

Networks of TA: Semantics
The semantics is defined by a transition system $(S, s_0, \to)$ where
- $S = (L_1 \times \cdots \times L_n) \times (C \to \mathbb{R}_{\ge 0})$, i.e. states are of the form $(\vec{\ell}, \nu)$
- $s_0 = (\vec{\ell}_0, \nu_0)$ where $\vec{\ell}_0 = (\ell^1_0, \ldots, \ell^n_0)$ and $\nu_0(x) = 0$ for $x \in C$
- transitions:
  - delay: $(\vec{\ell}, \nu) \to (\vec{\ell}, \nu + \delta)$ if $\nu + \delta' \models I(\vec{\ell})$ for each $\delta' \in [0, \delta]$
  - internal action: $((\ell_1, \ldots, \ell_i, \ldots, \ell_n), \nu) \to ((\ell_1, \ldots, \ell_i', \ldots, \ell_n), \nu')$ if there exists $(\ell_i, g, a, Y, \ell_i') \in E_i$ with $a \in N$ such that $\nu \models g$, $\nu' = \nu[Y := 0]$ and $\nu' \models I(\ell_1, \ldots, \ell_i', \ldots, \ell_n)$
  - handshake: $((\ell_1, \ldots, \ell_i, \ldots, \ell_j, \ldots, \ell_n), \nu) \to ((\ell_1, \ldots, \ell_i', \ldots, \ell_j', \ldots, \ell_n), \nu')$ if there exist $(\ell_i, g_i, c?, Y_i, \ell_i') \in E_i$ and $(\ell_j, g_j, c!, Y_j, \ell_j') \in E_j$ such that $\nu \models g_i \wedge g_j$, $\nu' = \nu[Y_i \cup Y_j := 0]$ and $\nu' \models I(\ell_1, \ldots, \ell_i', \ldots, \ell_j', \ldots, \ell_n)$

UPPAAL Tool
- UPPAAL is a toolbox for modeling, simulation and verification of real-time systems
- Uppsala University + Aalborg University = UPPAAL
- modeling language: networks of timed automata (+ additional features)
- widely used for teaching
- several industrial case studies
- www.uppaal.org

Functionality of UPPAAL
- modeling: graphical tool for the specification of timed automata, templates
- simulation: simulation of the model (manual, random)
- verification: verification of simple properties (restricted subset of Computation Tree Logic); a counterexample can be simulated
- Java user interface and C++ verification engine

Extensions of Timed Automata (UPPAAL)
- Bounded integer variables: declared as int[min,max] name, where min and max are the lower and upper bound, respectively. Violating a bound leads to an invalid state that is discarded at run-time.
- Arrays
- Broadcast channels: one sender c! can synchronise with an arbitrary number of receivers c?. Any available receiver must synchronise. Broadcast sending is never blocking.
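The broadcast rule can be illustrated in the same style as the handshake rule. The Python sketch below is a simplification under stated assumptions, not a description of how UPPAAL is implemented: invariants of target locations are ignored, each automaton offers at most one candidate receiving edge, and guards are plain predicates over the clock valuation. The function `broadcast_successor` and the edge encoding `(guard, resets, target)` are hypothetical.

```python
def broadcast_successor(locations, valuation, sender, send_edge, receive_edges):
    """Successor state of one broadcast synchronisation on a channel c.

    locations:     tuple with the current location of every automaton
    valuation:     dict clock -> value
    sender:        index of the automaton taking the c! edge
    send_edge:     (guard, resets, target) of that c! edge
    receive_edges: dict index -> (guard, resets, target) for c? edges
                   leaving the current locations of the other automata
    Returns (locations, valuation) or None if the sender is not enabled.
    """
    guard, resets, target = send_edge
    if not guard(valuation):
        return None
    new_locations = list(locations)
    new_locations[sender] = target
    to_reset = set(resets)
    # Every receiver whose edge is enabled must take part; the others stay put.
    for index, (r_guard, r_resets, r_target) in receive_edges.items():
        if index != sender and r_guard(valuation):
            new_locations[index] = r_target
            to_reset |= set(r_resets)
    new_valuation = {x: (0 if x in to_reset else v) for x, v in valuation.items()}
    return tuple(new_locations), new_valuation


state = (("idle", "wait", "wait"), {"x": 1.0, "y": 3.0})
send = (lambda v: True, {"x"}, "sent")
receivers = {
    1: (lambda v: v["y"] >= 2, set(), "got"),   # enabled: must synchronise
    2: (lambda v: v["y"] < 2, {"y"}, "got"),    # disabled: stays in place
}
print(broadcast_successor(state[0], state[1], 0, send, receivers))
print(broadcast_successor(state[0], state[1], 0, send, {}))
```

In the first call only automaton 1 joins the sender; in the second call there is no receiver at all and the sender still moves, which is the non-blocking behaviour described above.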
Fischer's Algorithm
With the following declarations (for 6 processes):
  int[0,6] id;
  const k 2;
  clock x;
and the following parameter (for 6 processes):
  int[1,6] pid

Extensions to TA in UPPAAL
- Urgent locations: time is not allowed to pass in the location, i.e., they are semantically equivalent to adding an extra clock $x$ that is reset on all incoming edges, and having an invariant $x \le 0$ on the location
- Committed locations: even more restrictive than urgent locations. A state of a network is committed if any of its locations is committed. A committed state cannot delay, and the next transition must involve an outgoing edge from at least one of the committed locations ... useful in modeling atomic actions

The properties
The UPPAAL tool uses a simple fragment of CTL as a specification language.
Syntax:
  $E\Diamond P \mid A\Diamond P \mid E\Box P \mid A\Box P \mid P \mathbin{-\!\!\!\rightarrow} P$
  $P ::= A.\ell \mid g_c \mid g_d \mid \neg P \mid P \vee P$
where
- $A.\ell$ is a location of an automaton $A$ (in a given network)
- $g_c$ is a clock constraint
- $g_d$ is a predicate over data variables (such as v ≥ 1, or v == v − 1)

Properties
- $E\Diamond P$: it is possible to reach a state in which $P$ is satisfied (written as E<>P)
- $A\Diamond P$: $P$ will inevitably become true; the automaton is guaranteed to eventually reach a state in which $P$ is true (written as A<>P)
- $A\Box P$: $P$ holds always and everywhere in the future (written as A[]P)
- $E\Box P$: $P$ is potentially always true; there is a run in which $P$ is true in all states (written as E[]P)
- $P \mathbin{-\!\!\!\rightarrow} Q$: $P$ leads to $Q$; if $P$ becomes true, $Q$ will inevitably become true later on; $P \mathbin{-\!\!\!\rightarrow} Q \equiv A\Box(P \text{ imply } A\Diamond Q)$ (written as P-->Q)

Timed CTL: Very Briefly
Syntax of TCTL state-formulas over a set of atomic propositions $AP$ and a set of clocks $C$:
  $\Phi ::= \mathit{true} \mid a \mid g \mid \Phi \wedge \Phi \mid \neg\Phi \mid E\,\Phi\,U^{J}\,\Phi \mid A\,\Phi\,U^{J}\,\Phi$
where $a \in AP$, $g$ is a clock constraint and $J$ is an interval in $\mathbb{R}_{\ge 0}$ with bounds in $\mathbb{N}$.

TCTL: Very Briefly
A run is divergent if its total execution time is infinite. Recall that a run is a maximal path; it can be convergent if either it makes finitely many transitions, or the sum of its delays converges to a finite number.
Let $L$ be a function which assigns a set of atomic propositions to every location. For a state $s = (\ell, \nu)$ we define a satisfaction relation $\models$ by
- $s \models \mathit{true}$
- $s \models a$ iff $a \in L(\ell)$
- $s \models g$ iff $\nu \models g$
- $s \models \neg\Phi$ iff $s \not\models \Phi$
- $s \models \Phi_1 \wedge \Phi_2$ iff $(s \models \Phi_1)$ and $(s \models \Phi_2)$
- $s \models E\,\Phi_1\,U^{J}\,\Phi_2$ iff $\omega \models \Phi_1\,U^{J}\,\Phi_2$ for some divergent run $\omega$ starting in $s$
- $s \models A\,\Phi_1\,U^{J}\,\Phi_2$ iff $\omega \models \Phi_1\,U^{J}\,\Phi_2$ for all divergent runs $\omega$ starting in $s$

TCTL: Very Briefly
Let $\omega = (\ell_0, \nu_0) \xrightarrow{h_1} (\ell_1, \nu_1) \xrightarrow{h_2} (\ell_2, \nu_2) \xrightarrow{h_3} \cdots$ be a divergent run. Here each $h_i$ is either a real number (delay) or an action of $\Sigma$.
Define
  $\delta_i = \begin{cases} h_i & \text{if } h_i \text{ is a delay in } \mathbb{R}_{\ge 0} \\ 0 & \text{otherwise, i.e., } h_i \in \Sigma \end{cases}$
Given $t \in \mathbb{R}_{\ge 0}$ we denote by $\omega_t$ the state "visited" by $\omega$ at time $t$:
  $\omega_t = (\ell_i, \nu_i + \delta)$
where
- $i$ is the maximal number such that $\sum_{j=1}^{i} \delta_j \le t$
- $\delta = t - \sum_{j=1}^{i} \delta_j$
$\omega \models \Phi_1\,U^{J}\,\Phi_2$ if there is a time $t \in J$ such that
- $\omega_t \models \Phi_2$
- $\omega_{t'} \models \Phi_1 \vee \Phi_2$ for all $t' < t$

TCTL: Very Briefly: Examples
- $E\,\mathit{true}\,U^{[0,1]}\,a$: there exists a run which reaches a location satisfying $a$ during the first time unit
- $E\,b\,U^{[0,1]}\,a$: there exists a run which reaches a location satisfying $a$ at some time $t \in [0, 1]$ and before that visits only states that satisfy either $b$ or $a$
Define $\Diamond^{J}\Phi \equiv \mathit{true}\,U^{J}\,\Phi$ and $\Box^{J}\Phi = \neg\Diamond^{J}\neg\Phi$.
- A [0,3] a
- A [1,2] (E [0,1] a)
- A [0,1] ¬(E [0,11] a)
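The state $\omega_t$ used in the semantics of the timed until operator can be computed directly on a finite prefix of a run. The sketch below is a minimal Python illustration under assumptions: the run prefix is given as a list of states and a list of steps, delays are numbers and actions are strings, and the function name `state_at` as well as the action name `press` are hypothetical. The locations off and light follow the earlier example with a single clock $x$.

```python
def state_at(states, steps, t):
    """Return the state visited at time t by a finite run prefix.

    states: [(location, valuation), ...] with valuation a dict clock -> value
    steps:  [h1, h2, ...] where a number is a delay and a string an action;
            step h_i leads from states[i-1] to states[i]
    """
    if len(states) != len(steps) + 1:
        raise ValueError("a prefix with n steps needs n + 1 states")
    elapsed = 0   # sum of the delays delta_1 .. delta_i
    index = 0     # the maximal i whose delay sum does not exceed t
    for j, h in enumerate(steps, start=1):
        delay = h if isinstance(h, (int, float)) else 0
        if elapsed + delay > t:
            break
        elapsed += delay
        index = j
    else:
        # All steps were consumed: t must not lie beyond the known prefix.
        if t > elapsed:
            raise ValueError("t lies beyond the given prefix")
    location, nu = states[index]
    return location, {x: v + (t - elapsed) for x, v in nu.items()}


STATES = [
    ("off", {"x": 0.0}),
    ("off", {"x": 1.5}),
    ("light", {"x": 0.0}),
    ("light", {"x": 2.0}),
]
STEPS = [1.5, "press", 2.0]

for t in (1.0, 1.5, 2.5):
    print(t, state_at(STATES, STEPS, t))
```

At time 1.5 the sketch returns the state after the action, because $i$ is chosen maximal and actions take zero time. Checking $\Phi_1\,U^{J}\,\Phi_2$ itself additionally needs a quantification over all times $t' < t$, which this sketch does not attempt.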