paradise Parameter Identification and Model Ranking of Thomas Networks CMSB 2012 The Royal Society, London Û¡¢£¤¥¦§¨ª«¬­Æ°±²³´µ·¸¹º»¼½¾¿Ý David Šafránek Adam Streck, Juraj Kolˇcák Masaryk University Brno Hannes Klarner, Heike Siebert Freie Universität Berlin 5th October, 2012 D. Šafránek (FI MU Brno) Parameter Identification of GRNs 1 / 31 paradise Outline 1 Motivation and Background 2 Proposed Methodology and its Implementation 3 Performance Evaluation and Case Studies D. Šafránek (FI MU Brno) Parameter Identification of GRNs 2 / 31 paradise Outline 1 Motivation and Background 2 Proposed Methodology and its Implementation 3 Performance Evaluation and Case Studies D. Šafránek (FI MU Brno) Parameter Identification of GRNs 3 / 31 paradise Motivation: Learn More about Regulatory Networks Modeling tools: C. Chaouiya, et al. 2003, GINsim., H. de Jong et al. 2002, GNA. Data processing: I. Shmulevich, et al. 2002. Binary analysis and optimization-based normalization of gene expression data.; E. Dimitrova, et al. 2010. Discretization of time series data. D. Šafránek (FI MU Brno) Parameter Identification of GRNs 4 / 31 paradise From Structure to Dynamics C: A: B: R. Thomas and R. d’Ari, CRC Press 1990. Biological feedback. D. Šafránek (FI MU Brno) Parameter Identification of GRNs 5 / 31 paradise Parameterization of Regulatory Networks Target values assigned to regulatory contexts for all nodes make a PARAMETER SET (parameterization). R. Thomas and R. d’Ari, CRC Press 1990. Biological feedback. D. Šafránek (FI MU Brno) Parameter Identification of GRNs 6 / 31 paradise Dynamics as a State Transition Graph R. Thomas and R. d’Ari, CRC Press 1990. Biological feedback. D. Šafránek (FI MU Brno) Parameter Identification of GRNs 7 / 31 paradise Dynamics as a State Transition Graph R. Thomas and R. d’Ari, CRC Press 1990. Biological feedback. D. Šafránek (FI MU Brno) Parameter Identification of GRNs 8 / 31 paradise Parameter Identification Problem Number of possible parameterizations of a single node is exponential w.r.t. the node’s in-degree. (more precisely w.r.t. the number of regulatory contexts) D. Šafránek (FI MU Brno) Parameter Identification of GRNs 9 / 31 paradise Parameter Identification Problem: Solutions G. Bernot et al. in JTB 2004: Application of formal methods to biological regulatory networks: extending Thomas asynchronous logical approach with temporal logic. Corblin et al. in BioSystems 2009: A declarative constraint-based method for analyzing discrete gene regulation networks. Batt et al. in Bioinf. 2010: Efficient parameter search for qualitative models of regulatory networks using symbolic model checking. Klarner et al. in CMSB 2011: Parameter inference for asynchronous logical networks using discrete time series. Barnat et al. in IEEE/ACM TCBB 2012: On Parameter Synthesis by Parallel Model Checking. D. Šafránek (FI MU Brno) Parameter Identification of GRNs 10 / 31 paradise Parameter Identification Problem: Solutions G. Bernot et al. in JTB 2004: Application of formal methods to biological regulatory networks: extending Thomas asynchronous logical approach with temporal logic. Corblin et al. in BioSystems 2009: A declarative constraint-based method for analyzing discrete gene regulation networks. Batt et al. in Bioinf. 2010: Efficient parameter search for qualitative models of regulatory networks using symbolic model checking. Klarner et al. in CMSB 2011: Parameter inference for asynchronous logical networks using discrete time series. Barnat et al. in IEEE/ACM TCBB 2012: On Parameter Synthesis by Parallel Model Checking. D. Šafránek (FI MU Brno) Parameter Identification of GRNs 11 / 31 paradise Outline 1 Motivation and Background 2 Proposed Methodology and its Implementation 3 Performance Evaluation and Case Studies D. Šafránek (FI MU Brno) Parameter Identification of GRNs 12 / 31 paradise Our Contribution a prototype tool chain: Parsybone – https://github.com/sybila/Parsybone.git ParameterFilter – https://github.com/sybila/ParameterFilter.git distributed computation of acceptable parameterizations employing witnesses (counterexamples) to rank obtained parameterizations visualization of the results (export to Cytoscape) D. Šafránek (FI MU Brno) Parameter Identification of GRNs 13 / 31 paradise Time-series Measurement as a Dynamic Constraint Encoded in LTL: σ(1) = 5 i=1 vi = 1 σ(2) = i∈{1,2,4} vi = 1 ∧ i∈{2,5} vi = 0 σ(3) = i∈{1,2,5} vi = 1 ∧ i∈{3,4} vi = 2 ϕ = σ(1) ∧ F(σ(2) ∧ F(σ(3))) D. Šafránek (FI MU Brno) Parameter Identification of GRNs 14 / 31 paradise Time-series Measurement as a Dynamic Constraint time−series walks Encoded in LTL: σ(1) = 5 i=1 vi = 1 σ(2) = i∈{1,2,4} vi = 1 ∧ i∈{2,5} vi = 0 σ(3) = i∈{1,2,5} vi = 1 ∧ i∈{3,4} vi = 2 ϕ = σ(1) ∧ F(σ(2) ∧ F(σ(3))) D. Šafránek (FI MU Brno) Parameter Identification of GRNs 14 / 31 paradise Time-series Measurement as a Dynamic Constraint monotonicity between 1st and 2nd measurement Encoded in LTL: σ(1) = 5 i=1 vi = 1 σ(2) = i∈{1,2,4} vi = 1 ∧ i∈{2,5} vi = 0 σ(3) = i∈{1,2,5} vi = 1 ∧ i∈{3,4} vi = 2 ϕ = σ(1) ∧ (σ(1)U(σ(2) ∧ F(σ(3)))) D. Šafránek (FI MU Brno) Parameter Identification of GRNs 14 / 31 paradise Time-series Measurement as a Dynamic Constraint ? ? monotonicity between 1st and 2nd measurement Encoded in LTL: σ(1) = 4 i=2 vi = 1 σ(2) = i∈{1,2,4} vi = 1 ∧ i∈{2,5} vi = 0 σ(3) = i∈{1,2,5} vi = 1 ∧ i∈{3,4} vi = 2 ϕ = σ(1) ∧ (σ(1)U(σ(2) ∧ F(σ(3)))) D. Šafránek (FI MU Brno) Parameter Identification of GRNs 14 / 31 paradise Identifying Parameters by Model Checking Naive approaches: G. Bernot et al. in JTB 2004; H. Klarner et al. in CMSB 2011 D. Šafránek (FI MU Brno) Parameter Identification of GRNs 15 / 31 paradise Effect of Parameters on State Transition Graph 10 11 0100 10 11 0100 10 11 0100 10 11 0100 D. Šafránek (FI MU Brno) Parameter Identification of GRNs 16 / 31 paradise Identifying Parameters by Coloured Model Checking Heuristics: J. Barnat et al. in TCBB 2012; refined in the presented paper D. Šafránek (FI MU Brno) Parameter Identification of GRNs 17 / 31 paradise Identifying Parameters by Coloured Model Checking we decide on all parameterizations at once respective parameter sets are acceptable never claim Buchi automatonparameterized Kripke structure of the model [A=1,B=0] & F([A=0,B=0]) 10 11 0100 return accepting paths of the product automaton parameter sets acceptable for the dynamic constraint time−series walks of acceptable parametrizations D. Šafránek (FI MU Brno) Parameter Identification of GRNs 18 / 31 paradise Model Checking on Coloured Graphs Idea represent each parameterization by a distinct colour find accepting cycles and get colours enabling accepting paths Procedure 1 compute initial mapping of colours to states ⇒ propagate colours through the entire graph (BFS reachability) ⇒ accepting states know all colours by which they are reached 2 for each reachable accepting cycle aggregate the valid colours D. Šafránek (FI MU Brno) Parameter Identification of GRNs 19 / 31 paradise Model Checking on Coloured Graphs Implementation explicit representation of indexed parameter sets (ordered bit vectors) parameter space split to exclusive blocks equal to size of integer type each block contains “close” parameter sets data-parallel distribution: blocks evenly distributed over the cluster . . . Pi−1 Pi Pi+1 . . . . . . . . . 1 0 0 . . . . . . 0 1 0 . . . . . . 0 0 1 . . . . . . D. Šafránek (FI MU Brno) Parameter Identification of GRNs 20 / 31 paradise Parameterization Ranking: Length Cost theoretically infinitely many time-series walks fix a dynamic constraint and focus on compatible shortest walks penalize unnecessarily higher energy cost avoid complex model realizations of the constraint assign each parameterization its length cost – the length of a shortest time-series walk consider parameterizations with minimum length cost D. Šafránek (FI MU Brno) Parameter Identification of GRNs 21 / 31 paradise Parameterization Ranking: Robustness non-deterministic dynamics caused by asynchronicity how can we interpret walks with less options to walk off the “optimal path” and miss the expected final state of the time-series? the property of the model, but... another classification of parameterizations local robustness: property of a state – number of valid successors out degree global robustness: property of a walk – product of local robustness over all states of the walk model robustness: property of a parameterization – average of global robustness over all time-series walks D. Šafránek (FI MU Brno) Parameter Identification of GRNs 22 / 31 paradise Parameterization Ranking: Robustness non-deterministic dynamics caused by asynchronicity how can we interpret walks with less options to walk off the “optimal path” and miss the expected final state of the time-series? the property of the model, but... another classification of parameterizations local robustness – approximated: Prob(x) = 1 out_degree(x) global robustness: property of a walk – product of local robustness over all states of the walk model robustness: property of a parameterization – average of global robustness over all time-series walks D. Šafránek (FI MU Brno) Parameter Identification of GRNs 22 / 31 paradise Parameterization Ranking: Overall Procedure INPUT: regulatory network, initial parameter space, static and dynamic constraints OUTPUT: subset of the initial parameter space containing optimal parameterizations 1 Remove parametrizations violating static constraints 2 Compute parameterizations acceptable by dynamic constraints 3 Select parametrizations with minimal length cost 4 Select parametrizations with maximal robustness D. Šafránek (FI MU Brno) Parameter Identification of GRNs 23 / 31 paradise Visualising Results Behaviour Maps and Expression Profiles projection to the 2nd component (gene) D. Šafránek (FI MU Brno) Parameter Identification of GRNs 24 / 31 paradise Outline 1 Motivation and Background 2 Proposed Methodology and its Implementation 3 Performance Evaluation and Case Studies D. Šafránek (FI MU Brno) Parameter Identification of GRNs 25 / 31 paradise Scalability Evaluation GRN of Bacteriophage λ Lysogenic time-series Lytic time-series [Thieffry et al. 1995] conjuction of both time-series lead to 537 parametrizations required memory: ≤ 3MB Process count Average runtime 1 5.315 s 2 2.634 s 3 1.767 s 4 1.332 s 5 1.048 s 6 0.884 s 7 0.754 s 8 0.657 s 0 1 2 3 4 5 6 7 8 0 2 4 6 8 Number of processes Speedup D. Šafránek (FI MU Brno) Parameter Identification of GRNs 26 / 31 paradise Performance Evaluation GRN of Mammalian Cell Cycle [Fauré et al. 2006] a time-series with 8 measurements 6.8 · 108 initial parametrizations 3.1 · 108 acceptable parametrizations computed setup: 8 processes running on 2 CPUs with 4 cores each required memory: ≤ 15MB Process ID Runtime Result set size Process ID Runtime Result set size 1 29.07 h 38,522,403 5 29.70 h 38,523,691 2 31.08 h 38,521,943 6 28.81 h 38,523,255 3 27.22 h 38,521,656 7 29.55 h 38,522,328 4 32.32 h 38,522,343 8 28.83 h 38,523,020 D. Šafránek (FI MU Brno) Parameter Identification of GRNs 27 / 31 paradise Case Studies Bacteriophage λ1 Rat neural system2 [Thieffry et al. 1995] [Wahde et al. 2001] Init. Parameter Space 6.9 · 109 2.6 · 105 Static Constraints 8.2 · 104 162 Dynamic Constraints 537 108 Length Cost (min) 28 (length 9) 108 (length 5) Robustness (max) 3 (9.7%) 4 (75%) 1 CMSB 2012 Proceedings 2 FI MU Technical Report D. Šafránek (FI MU Brno) Parameter Identification of GRNs 28 / 31 paradise Rat Neural System: Inferring New Hypothesis [Wan 1998, Wahde 2001] Shortest paths Maximally robust paths Predicted Hypothesis Genes in cluster 4 express before the cluster 1 expression starts to degrade. D. Šafránek (FI MU Brno) Parameter Identification of GRNs 29 / 31 paradise Conclusions Achievements computational improvement in model checking-based parameter identification for Thomas networks ranking procedures for distinguishing the models Future work vast amount of data generated...what to do next? more sophisticated model ranking (biologically relevant criteria) finding commonalities in models, e.g., for refining static constraints (CMSB 2011) D. Šafránek (FI MU Brno) Parameter Identification of GRNs 30 / 31 paradise Thank you for your attention! D. Šafránek (FI MU Brno) Parameter Identification of GRNs 31 / 31