Computer-Aided Systems Biology David Šafránek Masaryk University Czech Republic Outline Q Introduction and Motivation Q Methodology • Biological Networks • Modelling Problems Q Case Study Introduction and Motivation • Biological Networks • Modelling Problems Biology Domain Roots Biology • since ancient times • empirically studies life and living organisms • studied aspects: structure, function, growth, development and evolution • used concepts: • the cell - the unit of life • the gene - the unit of inherited information • the evolution - the mechanism of species creation Biophysics and Theoretical Biology Domain Roots Biophysics • since the mid of 19th century 9 living organism = open (thermodynamic) system • the goal: why and how the living matter works? • uses mathematical apparatus • a fascinating phenomenon: homeostasis • maintain a stable condition in a changing environment • robust (up-to certain limits) Holistic Thinking and Systems Theory Phylosophical Roots reductionism 6/54 Holistic Thinking and Systems Theory Phylosophical Roots holism 'A Whole is Greater Than the Sum of Its Parts." - Aristoteles 7/54 Development of Systems Biology molecular biology 1960-70 Z genome sequencing methods sequencing genome of many organisms 1980-90 1990-1998 experimental biology discovery of gene regulation lac operon (1961) Jacob and Monod ^ac systems theory analog simulations 1960-65 mtabolic networks simulators 1970-80 in silico models (black box) 1980-90 bioinformatics databases 1990-2000 SYSTEMS BIOLOGY Hiroaki Kitano genome-wide modelling 1995-2000 'Does the flap of a butterfly's wings in Brazil set off a tornado in Texas?" Philip Merilees 8/54 Motivation: Rigorously Answer Biological Questions • biology is goal-oriented 9 biological problems typically address complex processes Examples of biological problems How the bacteria cell utilises particular nutritions? Which nutritions imply fastest growth under given conditions? Motivation: Rigorously Answer Biological Questions • biology is goal-oriented 9 biological problems typically address complex processes Examples of biological problems How the bacteria cell utilises particular nutritions? Which nutritions imply fastest growth under given conditions? The answer should fullfil specific requirements • to formulate and analyse a biological problem holistically • to give mechanistic explanation based on known facts -mechanistic means in the context of laws of physics/chemistry • to project the mechanistic details onto the genetic information From Biologist's Table 10/54 From Biologist's Table Biological Problem Why a human fibroblast cell misinterprets a certain growth factor? slide credits: Pavel Krejčí (MUNI LF) Systems Approach: The Grand Challenge Complex Organism as a System fq(e, n, l) 11/54 Systems Approach: The Grand Challenge Complex Organism as a System fg(e, n, l) 11/54 Systems Approach: A Moderate Challenge Population of Bacteria as a System for a particular set of genes G Fq : (environment exposure, nutrition) —>> growth profile slide credits: Ralf Steuer (HU Berlin) 12/54 Systems View of Processes Driving the Cell 13/54 The Cell as a Complex Interaction Network OMIM System boundary metabolite KEGG c aMAZE^ DIP BIND egulonP Metabolic networks Signalling Pathways Protein-protein Interaction protein 1 [RANSPATH Gene Pathways V genel Metabolome space metabolite3 TlGi Brenda ENZYME Prqteome space ranscriptome space A|rrayExpre Genome space gene2 gene3 slide credits: David Gilbert (Brunei Univ.) 14/54 Outline 0 Methodology • Biological Networks • Modelling Problems 15/54 The Paradigm of Computational Systems Biology Assumptions 9 The biological reality (a biophysical process) is understood as a biological system. 9 A biological system is given as a network M of biochemical components connected by chemical/physical interactions. • The components include relevant genes and gene products. 16/54 Biological Networks CRNs basic form: chemical reaction networks (CRNs) • elementary chemical reactions • represent the flow of the mass example: S + Ei r2 r3 > P + E SBGN standard notation, see https://www.sbgn.org 17/54 Biological Networks Reaction-Influence Networks • simplified form: reaction-influence networks (RINs) • chemical reactions influenced by other molecules • represent the modulated flow of the mass • example: SBGN standard notation, see https://www.sbgn.org 18/54 Biological Networks Influence Networks abstract form: influence networks (INs) • represent positive/negative influences among molecules • well fit incomplete knowledge • typically gene regulatory networks, signalling pathways example: [ml: prole in }- TF1 TF2 [ mi: prole in }- SBGN standard notation, see https://www.sbgn.org 19/54 Biological Networks are Large genetic regulatory network of E. coli see https://reactome.org for more... 20/54 ...But Organised 3. osmotic stress 4. stationary phase 5.DNA metabolism 6. superoxide DOR DOR DOR DOR 111 l&h -V V ÖS'» ÖbÖÖÖÖÖÖÖrhöÖÖ 0 ööö4>fi)B A 6 /A 6 /l\/l 0 4 legend: o transcription factor (TF) o global TF dense overlapping regulons (DOR) - postive regulation □ single input module (SIM) negative regulation A coherent feedforward loop dual regulation A incoherent feedforward loop multi-input module o single operon genetic regulatory network of E. coli slide credits: Uri Alon 21/54 The Goal of Computational Systems Biology The General Goa For a biological system given by a network M reconstruct the system's dynamics: Define a function that encodes the information (signal) processing occuring in all components of the system in time. Fj\[ : (input stimuli, environment signals) —>> response signals 22/54 Model-Based Workflow network reconstruction biological knowledge databases aMAZE Brendel ENZYME Í model validation gene reporters, DNA microarray, mass spectrometry, ... Bacterial DNA o 1 biological network hypothesis emergent properties model questions model specification Petri Net, ODEs, rule-based, process calculus, Boolean network, ... Qnadph 4.1.2.15 4.6.1.3 4.2.1.10 1.1.1.25 \ erythrose-4- -u -►()-H--H--H -NADP phosphate ^\ _______— /~\* phospho- ,—^ phosphate f ) (J «—O*—-1 h——\\*—O ATP 4 2.5.1.19 2J.1.7lY Q adp 4ä = -Al[£][S] + Aä[£S] £rH = -fci[E][S] + /t2[ES] + /t3[£S] or ^ = ^[E][S] - %[£S] -/t3[£S] computer lab model analysis static analysis, numerical simulation, analytical methods, model checking 0.5 [mm o l/ml' min]; g = 1 [1/min] | [Y]|-[Z||-[X]J-Kyz hypothesis testing, property detection, new hypothesis inference 23/54 From Networks to Models How to reconstruct the dynamics Fj^ of a network Ml O choose a modelling framework Q associate every interaction in M with a suitable kinetic rule • describes how a state of affected components changes in time O build a computational model by combining kinetic rules 24/54 Modelling Frameworks state-transition systems states: discrete molecule numbers or qualities (on/off) stochastic model Continuous-Time Markov Chains states: discrete molecule numbers abstracted time modeled I Ordinary Differential Equations states: continuous concentrations states discrete continuous 25/54 Model Construction Continuous View of CRNs deterministic continuous-time dynamics of molecule population molecules dissolved in the cell volume (a well-stirred "pool") molar concentration [M]=[mo/ • A + B^tAB biophysical law of mass action kinetics: v = k[A][B] [M-s-1] 26/54 Model Construction Continuous View of CRNs deterministic continuous-time dynamics of molecule population molecules dissolved in the cell volume (a well-stirred "pool") molar concentration [M]=[mo/ • A + B ->• AB biophysical law of mass action kinetics: v = /c[4][e] [M-s"1] d[A\ _ dt d[B] dt d[AB] dt = —v = —V = V k [s -1] is a reaction-specific parameter Waage, P.; Guldberg, CM. (1864). Studier over Affiniteten. Forhandlinger i Videnskabs-selskabet i Christiania (Transactions of the Scientific Society in Christiania) (in Danish): 35-45. 26/54 Model Construction Stochastic View of CRNs stochastic continuous-time dynamics of molecule population molecules distributed uniformly in the cell volume (well-stirred) states describe molecules number (#) A + B ->• AB biophysical law of stochastic mass action describes an expected rate of reaction event occurence: v = k-#A-#B [s-1] / #A = 5 \ / #A = 4 \ #e=2 4 #e=i \#AB = 0/ \#AB = l) k [s_1] is a reaction-specific parameter, depends on cell volume Gillespie, Daniel T. (1977). "Exact Stochastic Simulation of Coupled Chemical Reactions". The Journal of Physical Chemistry. 81 (25): 2340-2361. 27/54 Model Construction Continuous View of INs controlled synthesis of X and its pontaneous degradation so-called (sigmoidal) Hill kinetics synthesis degradation /3,7 ... production and degradation parameters; 28/54 Model Construction Continuous View of INs controlled synthesis of X and its pontaneous degradation so-called (sigmoidal) Hill kinetics synthesis degradation /3,7 ... production and degradation parameters; K ... parameter of the influence 28/54 Model Construction Qualitative View of Influence Nets - Discrete Regulatory Networks Ae {0,1,2}, ße{0,l} tAA = 2, tAB = 1 «Aß = 2 Kam = 0 KB, apply) • all kind of models can be generated • BCSL example: S{u}::KaiC::KaiC6::cyt <=> S{p}::KaiC::KaiC6::cyt executes on any serine site of any KaiC in a hexamer Other Languages • Kappa, BNGL, Chromar, LBS, ... (rule-based) • SPiM, BioSPI, ... (process-algebraic) 33/54 Required Knowledge Required Knowledge 34/54 Model Construction Dealing with Unknown Parameters • traditional approaches - parameter estimation finding a single "optimal" value fitting experimental data biological process CTreco n st r u ct i o O computational model parameters Model Construction Dealing with Unknown Parameters • traditional approaches - parameter estimation finding a single "optimal" value fitting experimental data • computer science approach - parameter synthesis finding values satisfying given dynamical properties/hypotheses biological process CTreco n st r u ct i o nj> CTo bse rva t i o rT^> I observed properties computational model parameters c^synthesis/tu n i ng^> formalisation 35/54 Temporal Logics for Biological Systems • qualitative properties (LTL, CTL, HCTL) • modalities (possibilities/necessities in future behaviour) • reachability of particular (sets of) states • temporal ordering of events, monotonicity • temporal correlations of model variables • stability (attractors, basins of attraction) o quantitative properties • deterministic (MTL, MITL, STL, STL*) o enhance modalities with (dense) time information o exact timing of events, time-bounds 9 value-freezing (HSB 2012) o stochastic (PLTL, PCTL, CSL) • probability of property satisfaction • stochasticity combined with time 36/54 Problem Formulation Parameter Synthesis for Dynamical Systems behavior constraints C ^ Model M{p) : * = f(x, P) parameter constraints Pi 0 Parameter Synthesis Problem Assume V is the admissible parameter space. Given a behaviour constraint p, parameter constraint 0/, and a parameterised model A4, find the maximal set P CV of parameterisations such that p |= 0/ and M(p) |= p for all p e P. Parameter Synthesis Workflow behaviour constraints djormalisatioru^ parameter constraints valid parameter valuations model parameterised Kripke structure 38/54 Parameterised Kripke Structures State Transition Systems with Parameters Transitions with Parameters (coloured transitions) o each parameter valuation represents one Kripke structure • shared state space, different transition space 39/54 Parameterised Kripke Structures State Transition Systems with Parameters Transitions with Parameters (coloured transitions) o each parameter valuation represents one Kripke structure • shared state space, different transition space o symbolic representation of parameters o symbolic PKS: every transition is associated with a formula 39/54 From ODE Models to Kripke Structures Rectangular Abstraction of Reaction Kinetics [Belta, Habets, Schuppen] E f ES P K2 mass action kinetics with limitation of 1 molecule per each reactant species , multi—affine ODEs d[S] dt = -k![E][S] + ki[ES\ d[E] dt d[ES] dt ---k1[E][S} + k2[ES} + k3[ES} = ME1[S] - k2[ES] - k3[ES] d[P] dt kz[ES] set discrete value domains per each variable t N i t I IS- Rectangular Abstraction of Regulatory Kinetics [de Jong, Batt] Hill kinetics piece-wise affine ODEs dx dt ~dt - = Kas (xa,9l)s (xb,9l) - -yaxa, IbXb maxa - ---------1 )--------T-------- Xb ___i maxb 40/54 Coloured Model Checking Symbolic Parameter Space dA dt = -/(i • A + k2 ■ B dB dt = ki ■ A - k2 ■ B k2 = 0.8 *l = ? ß 5 |\ \ \ N. \ ^ ^ N * * 2.5© \ \ \ \ VS. ~* \ \ \ ^ * 's:—s—s—- \ u * ■* \ \ ^ \ \ \ \ ^ \ \ \ 5 A ^stateOO^statelO := -2.5 • /ci > 0 V -2.5 • /ci + 2.5 • k2 > 0 The transition exists if and only if the formula is satisfiable. Local parameter constraints are predicates over reals. 41/54 s/User/skola/pracovny/R/vector_field_multi_dim - Shiny http://127.0.0.1:5294 I £Q Open in Browser I ^Publish number of arrows pararnetery_pRB pararnetery_pRB height of plots a.5 2 an horizontal ax in plot 1 I 1 1 I 1 1 I 1 1 I 1 10 19 22. 37 46 55 64 73 52 91 100 0 0.01 0.02 0.03 0.04 0.06 0.07 0.0Ä 0.09 0.1 0 005 0.1 0.15 02 025 0.3 0.35 04 045 0.5 200 260 320 3 AO 440 500 560 620 6S0 740 £00 coloringthreshold zoom in pRE! pRB horizontal ax in plot 2 1 I 1 1 I 1 1 I 1 1 I 1 1 I 1 1 I 1 1 I 1 1 I 1 1 I 1 1 I 0 0.01 002 0.03 0.04 0 05 0.06 0.07 0 0£ 0.09 0.1 coloring direction C horizontal C vertical (* both choose .bio file Choose File ...b/model IP SQOR.bio choose .json file Choose File 1 I 1 1 I 1 1 I 1 1 I 1 1 I 0 1.5 3 4.5 6 7.5 9 10.5 12 13 5 15 zoom in E2F1 I 1 I 1 1 I 1 1 I 1 1 I 1 1 I 0 1.5 3 4.5 £ 7.5 9 10.5 12 135 15 vertical ax in plot 1 E2F1 vertical ax in plot 2 ^ Download image ^ Download image 3.9 4.1 4.3 4.6 4.7 4.9 6.1 6.3 6.6 6.7 6.9 6.1 6.3 6.5 _j«B_ m 3 IC Ufa 9 a http://pithya.ics.muni.cz * J 1 * i 1 • 1 1 1 i -It-«: 1 1 _______i f - - - -f-3 t ■ I—» I------ i 1 1 Y T i & + ———^ <■-i 4 t-----* 1 1 1 4 i I !.----, L-----» 1.-----* J______+ 1 1 I 1 *" l J VIII □ i 8||0 6 6.0 6.2 6.4 D: 5651.52 Kbit's 20:52 _ 29.5.16 ' [CAV 2017] 42/54 • Biological Networks • Modelling Problems Q Case Study Case Study: Cell Cycle Control System Example: decision making in living cells — to divide or not to divide? decisions implemented by circuits of positive and negative interactions modelling of cell cycle since 1970 [Goldbetter et al.] 44/54 Case Study: Cell Cycle Control System Parameterised Non-Linear Mathematical Model x = f(x(t),p) f ... phase space (vector field), f : Rn x Rm Rn x ... state vector (Rn) p ... parameter vector (Rm) ^E2F1 .-5 0,2 ✓ ✓ ✓ ✓ f S f * - - \ \ ^ ^ tf ar ^ ^ if «. it *" ^ ✓ ✓ . * *■ » 1? if v V ^ t f v \ 1 f ? 1/ 4 V V \ \ \ >. 4 * 4 t t V V V 4 4,8 5,6 ,\ ,\ \. \. N \ \ > > )> 48 6,4 E2F1 4 3^2 4 4,8 5,6 X ,\ ,\ |\,1>, X, 'H >. > > 1» >> 4 i» p» i> i 7 2 p = 0.006 p = 0.012 45/54 Properties Specification: HCTL • HCTL — hybrid CTL with past • state formulae (p ::= true | p | —up | cp A cp | E ip | A ip | | x | ^x.y? | ©x.p | 3x.(^9 path formulae ip ::= X | cp U 46/54 Properties Specification: HCTL Single-state patterns • sink (stable steady state): |s. AXs yv o source (only self-loops, no other incoming): |s. AXs Multi-state patterns • state in a nontrivial SCC: Is. EXEFs • state in a final SCC (generalised sink): Is. AGEFs Relations among patterns 9 at least two sinks in the whole system: 3s3t.(@s.^t A AX s) A (@£. AX t) 47/54 Case Study: Regulation of G\/S Cell Cycle Transition M (mitosis) S phase (DNA synthesis) vCell* that division d[PRB] dt d[E2Fl] dt [E2F1] Kml+[E2Fl] JnifpRB] ~ pRB[pRB] K2m2 + [E2Flp Jn + [pRB] -0E2F1[E2F1] [Swat et al. 2004] unknown parameter: ipi A I s. AG EF s A E2F1 < 4 • 4 48/54 Case Study: Results pRB cp2 PRB = 0.0115 [0.011,0.0136] 4>pRB = 0.014 [0.0136,0.5] results agree with numerical methods up-to precision of approximation/discretisation 49/54 Parameter Synthesis Chronology Related Work • Batt et al. 2007: RoverGene, BDD/Polytopes-based approach • Batt et al. 2010: GNA, symbolic approach, piecewise affine • Grosu et al. 2011: RoverGene revisited, approximation improved • Bogomolov et al. 2015, SpaceEx, multi-affine hybrid automata Our Contribution • HIBI 2010, TCCB 2012: coloured LTL model checking, piecewise multi-affine, parallel algorithm o CMSB 2015: coloured CTL model checking, piecewise multi-affine, parallel algorithm • ATVA 2016, CMSB 2016: parameters represented in first order logic, SMT solver employed, interdependent parameters o HSB 2015, FM 2016: discrete bifurcation analysis <* CMSB 2017, ICCTCS 2018: analysis of terminal SCCs, application to cyanobacteria models • TACAS 2019: application to bifurcation analysis of TCP J 50/54 Contribution Overview parameter synthesis for discrete networks — TREMPPI (with Heike Siebert) abstract interpretation of Boolean networks (with Stefan Haar) qualitative model 1 r stochastic model approximation ^ parametric uniformisation for CTMCs robustness analysis wrt CSL abstracted time modeled continuous model SCC detection and counting Pithya robust monitoring with STL* — Parasim 51/54 Conclusions • using methods of computer science we can specify biological systems rigorously formal methods allow exhaustive exploration of models under parameter uncertainty • use of formal methods is important for synthetic biology - we want to know what we construct! 9 analysis becomes a push-button technology • applications in cyber-physical systems 9 problems: • the grand challenge not yet targeted o modellers trained in biophysics and computer science needed • scalability • we need methods giving results up to given precision instead of insisting on exact results • Machine Learning to learn Fj^l 52/54 Computer Science Luboš Brim, Marta Kwiatkowska, Jiří Barnat, Thomas Henzinger, Loíc Paulevé, Ezio Bartocci, Luca Bortolussi, Jéróme Feret, Andrzej Mizera, Alessandro Abate, Jan Van Schuppen, Milan Češka, Nikola Beneš, Stefan Haar, Heike Siebert, Hidde de Jong, Ivana Černá Biology Ralf Steuer, Louis Mahadevan, Jan Červený, Dušan Lazár, Pavel Krejčí, Stefanie Hertel, Christoff Flamm Students • Samuel Pastva, Matej Troják, Sven Dražan, Jana Dražanová, Martin Demko, Matej Hajnal • Tomáš Vejpustek, Juraj Kolčák, Jan Papoušek, Vojtěch Brůža, Juraj Nižnan, Lukrécia Mertová, Petr Dluhoš, Simon Van Goethem Parameter Exploration of Stochastic Models Gene a interactions Gene b interactions a^ a+ A aB aB + A 1 1 b^b+ B bB^bB+ B 0.05 1 A + a aA B + a aB 100; 10 100; 10 A + b^ bA B + b^ bB 100; 10 100; 10 Proteins degradation A Ya B Yb • CTMC with 1078 states and 5919 transitions • hypothesis about stability of B in low/high population • expected time spent in states with low/high population of B • formalization in CSL using cumulative rewards « /?=?[C^1000](6 < 3), /?=?[C^1000](6 > 7) 54/54 Parameter Exploration of Stochastic Models Robustness analysis - stability in low population of B State #17 (A=0, B=1, a=1, b=1, aA=0, aB=0, bA=0, bB=0) 1000 900 800 —. 700 CO V m — 600 © o o II V o 0» II DC 500 400 300 200 100 Yb = 0.05 Robustness = 52.277895 ±0.812319 PLA of robustness = 52.276979 ± 0.023224 Yb = 0.10 Robustness = 304.658784 ± 2.900726 PLA of robustness = 304.674472 ±0.191669 Yb = 0.15 Robustness = 627.991011 ± 3.398032 PLA of robustness = 628.007720 ± 0.335658 r1000 900 800 700 ■600 500 400 300 200 100 0.05 0.1 0.15 0.2 0.25 Ya 0.3 0.35 0.4 in average 3.0 • 106 reaction steps, 100 subspaces, 7 hours 0.45 0.5 54/54 Parameter Exploration of Stochastic Models Robustness analysis - stability in high population of B State #17 (A=0, B=1, a=1, b=1, aA=0, aB=0, bA=0, bB=0) Yb = 0.05 Robustness = 745.90669 ±2.289315 PLA of robustness = 745.89886 ± 0.229235 -1000 900 Yb = 0.10 Robustness = 253.440577 ±3.133557 PLA of robustness = 253.400837 ± 0.238478 Yb = 0.15 Robustness = 49.212717 ±2.494463 PLA of robustness = 49.088951 ± 0.243332 800 700 600 500 ■400 300 200 100 0.4 0.45 0.5 9 in average 3.0 • 106 reaction steps, 100 subspaces, 7 hours 54/54