CG920 Genomics
Lesson 11: Systems Biology

Jan Hejátko
Functional Genomics and Proteomics of Plants, CEITEC - Central European Institute of Technology, and National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno
hejatko@sci.muni.cz, www.ceitec.eu

Literature
Literature sources for Chapter 12:
□ Wilt, F.H., and Hake, S. (2004). Principles of Developmental Biology. (New York; London: W. W. Norton).
□ Eden, E., Navon, R., Steinfeld, I., Lipson, D., and Yakhini, Z. (2009). GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists. BMC Bioinformatics 10, 48.
□ The Arabidopsis Genome Initiative. (2000). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796-815.
□ Benitez, M., and Hejatko, J. Dynamics of cell-fate determination and patterning in the vascular bundles of Arabidopsis thaliana (submitted).
□ de Luis Balaguer, M.A., Fisher, A.P., Clark, N.M., Fernandez-Espinosa, M.G., Moller, B.K., Weijers, D., Lohmann, J.U., Williams, C., Lorenzo, O., and Sozzani, R. (2017). Predicting gene regulatory networks by combining spatial and temporal gene expression data in Arabidopsis root stem cells. Proc Natl Acad Sci U S A 114(36), E7632-E7640.

Outline
□ Definition of Systems Biology
□ Tools
  - Gene Ontology Analysis
  - Bayesian Networks
  - Molecular/Gene Regulatory Networks Modeling
  - Inferring Gene Regulatory Networks from Large Omics Datasets

Definition
Systems biology is the computational and mathematical analysis and modeling of complex biological systems. It is a biology-based interdisciplinary field of study that focuses on complex interactions within biological systems, using a holistic approach (holism instead of the more traditional reductionism) to biological research (Wikipedia).

Definition
Systems biology is the study of biological systems whose behaviour cannot be reduced to the linear sum of their parts’ functions.
Systems biology does not necessarily involve large numbers of components or vast datasets, as in genomics or connectomics, but often requires quantitative modelling methods borrowed from physics (Nature).

Definition
□ A nice explanatory video by Dr. Nathan Price, associate director of the Institute for Systems Biology: https://www.youtube.com/watch?v=OrXRl_8UFHU

Results of -omics Studies vs Biologically Relevant Conclusions
□ Results of -omics studies represent a huge amount of data, e.g. genes with differential expression. But how can biologically relevant conclusions be drawn from them?

gene       locus                sample_1  sample_2  status  value_1     value_2   log2(fold_change)  test_stat     p_value      q_value      significant
AT1G07795  1:2414285-2414967    WT        MT        OK      0           1.1804    1.79769e+308       1.79769e+308  6.88885e-05  0.000391801  yes
HRS1       1:4556891-4558708    WT        MT        OK      0           0.696583  1.79769e+308       1.79769e+308  6.61994e-06  4.67708e-05  yes
ATMLO14    1:9227472-9232296    WT        MT        OK      0           0.514609  1.79769e+308       1.79769e+308  9.74219e-05  0.000535055  yes
NRT1.6     1:9400663-9403789    WT        MT        OK      0           0.877865  1.79769e+308       1.79769e+308  3.2692e-08   3.50131e-07  yes
AT1G27570  1:9575425-9582376    WT        MT        OK      0           2.0829    1.79769e+308       1.79769e+308  9.76039e-06  6.647e-05    yes
AT1G60095  1:22159735-22162419  WT        MT        OK      0           0.688588  1.79769e+308       1.79769e+308  9.95901e-08  9.84992e-07  yes
AT1G03020  1:698206-698515      WT        MT        OK      0           1.78859   1.79769e+308       1.79769e+308  0.00913915   0.0277958    yes
AT1G13609  1:4662720-4663471    WT        MT        OK      0           3.55814   1.79769e+308       1.79769e+308  0.00021683   0.00108079   yes
AT1G21550  1:7553100-7553876    WT        MT        OK      0           0.562868  1.79769e+308       1.79769e+308  0.00115582   0.00471497   yes
AT1G22120  1:7806308-7809632    WT        MT        OK      0           0.617354  1.79769e+308       1.79769e+308  2.48392e-06  1.91089e-05  yes
AT1G31370  1:11238297-11239363  WT        MT        OK      0           1.46254   1.79769e+308       1.79769e+308  4.83523e-05  0.000285143  yes
APUM10     1:13253397-13255570  WT        MT        OK      0           0.581031  1.79769e+308       1.79769e+308  7.87855e-06  5.46603e-05  yes
AT1G48700  1:18010728-18012871  WT        MT        OK      0           0.556525  1.79769e+308       1.79769e+308  6.53917e-05  0.000374736  yes
AT1G59077  1:21746209-21833195  WT        MT        OK      0           138.886   1.79769e+308       1.79769e+308  0.00122789   0.00496816   yes
AT1G60050  1:22121549-22123702  WT        MT        OK      0           0.370087  1.79769e+308       1.79769e+308  0.00117953   0.0048001    yes

AT4G15242  4:8705786-8706997    WT        MT        OK      0.00930712  17.9056   10.9098            -4.40523      1.05673e-05  7.13983e-05  yes
AT5G33251  5:12499071-12500433  WT        MT        OK      0.0498375   52.2837   10.0349            -9.8119       0            0            yes
AT4G12520  4:7421055-7421738    WT        MT        OK      0.0195111   15.8516   9.66612            -3.90043      9.60217e-05  0.000528904  yes
AT1G60020  1:22100651-22105276  WT        MT        OK      0.0118377   7.18823   9.24611            -7.50382      6.19504e-14  1.4988e-12   yes
AT5G15360  5:4987235-4989182    WT        MT        OK      0.0988273   56.4834   9.1587             -10.4392      0            0            yes

Note: 1.79769e+308 is the largest double-precision value; it appears in rows where value_1 is 0, i.e. where the log2 fold change cannot be computed as a finite number.
Ddii et al., unpublished

Plant Vascular Tissue Development
□ Vascular tissue as a developmental model for GO analysis and MRN modeling
Lehesranta et al., Trends in Plant Sci (2010)

Hormonal Control Over Vascular Tissue Development
□ Plant hormones regulate lignin deposition in plant cell walls and xylem water conductivity
[Figure: lignified cell walls, WT vs hormonal mutant; water conductivity, WT vs hormonal mutants]

Hormonal Control Over Vascular Tissue Development
□ Transcriptional profiling via RNA sequencing
[Figure labels: WT, hormonal mutant; mRNA; cDNA; mRNA sequencing by Illumina and determination of the number of transcripts]

Results of -omics Studies vs Biologically Relevant Conclusions
□ Transcriptional profiling yielded more than 9,000 differentially regulated genes…
[Table: the same differential expression results as on the first "Results of -omics Studies" slide; Ddii et al., unpublished]

Gene Ontology Analysis
□ One possible approach is to study gene ontology, i.e. the previously demonstrated association of genes with biological processes
Ddii et al., unpublished

Gene Ontology Analysis
□ Several tools allow statistical evaluation of enrichment for genes associated with specific processes
Eden et al., BMC Bioinformatics (2009)

Bayesian Networks
□ What are Bayesian networks?
  - A probabilistic graphical model that can be used to build models from data and/or expert opinion
    https://www.youtube.com/watch?v=4fcqyzVJwHM
  - Can be used for a wide range of tasks, including prediction, anomaly detection, diagnostics, automated insight, reasoning, time series prediction and decision making under uncertainty
□ NODES
  - Each node represents a variable, such as someone's height, age or gender.
    A variable may be discrete, such as Gender = {Female, Male}, or continuous, such as someone's age.
□ LINKS
  - Added between nodes to indicate that one node directly influences the other

Bayesian Networks
[Figure: example network with NODES and LINKS/EDGES labeled]

Asia Bayesian Network
[Figure: Asia network; https://www.bayesserver.com/]

Molecular Regulatory Networks Modeling
□ Vascular tissue as a developmental model for MRN modeling
Benitez and Hejatko, PLoS One, 2013

Molecular Regulatory Networks Modeling
□ Literature search for published data and creating a small database

Interaction: A-ARRs -| CK signaling
Evidence:
  - Double and higher order type-A ARR mutants show increased sensitivity to CK.
  - Spatial patterns of A-type ARR gene expression and CK response are consistent with partially redundant function of these genes in CK signaling.
  - A-type ARRs decrease B-type ARR6-LUC.
  - Note: In certain contexts, however, some A-ARRs appear to have effects antagonistic to other A-ARRs.
References: [27] [27] [13] [27]

Interaction: AHP6 -| AHP
Evidence:
  - ahp6 partially recovers the mutant phenotype of the CK receptor WOL.
  - Using an in vitro phosphotransfer system, it was shown that, unlike the AHPs, native AHP6 was unable to accept a phosphoryl group. Nevertheless, AHP6 is able to inhibit phosphotransfer from other AHPs to ARRs.
References: [9] [9]

Benitez and Hejatko, PLoS One, 2013

Molecular Regulatory Networks Modeling
□ Formulating logical rules defining the model dynamics

Network node    Dynamical rule
CK              2 if ipt=1 and ckx=0
                1 if ipt=1 and ckx=1
                0 else
CKX             1 if barr>0 or arf=2
                0 else
AHKs            ahk=ck
AHPs            2 if ahk=2 and ahp6=0 and aarr=0
                1 if ahk=2 and (ahp6+aarr<2)
                1 if ahk=1 and ahp6<1
                0 else
B-Type ARRs     1 if ahp>0
                0 else
A-Type ARRs     1 if arf<2 and ahp>0
                0 else

Benitez and Hejatko, PLoS One, 2013

Molecular Regulatory Networks Modeling
□ Specifying mobile elements and their model behaviour

Molecular Regulatory Networks Modeling
□ Preparing the first version of the model and testing it

Molecular Regulatory Networks Modeling
□ Specifying missing interactions via informed predictions

Interaction: CK → PIN7 radial localization
Evidence: Predicted interaction (could be direct or indirect). Informed by the following data:
  - During the specification of root vascular cells in Arabidopsis thaliana, CK regulates the radial localization of PIN7.
  - Expression of PIN7:GFP and PIN7::GUS is upregulated by CK with no significant influence of ethylene.
  - In the root, CK signaling is required for the CK regulation of PIN1, PIN3, and PIN7. Their expression is altered in wol, cre1, ahk3 and ahp6 mutants.
References: [18] [18,20] [19]

Interaction: CK → APL
Evidence: Predicted interaction (could be direct or indirect).
  - Consistent with the fact that APL overexpression prevents or delays xylem cell differentiation, as do CKs.
  - Partially supported by microarray data and phloem-specific expression patterns of CK response factors.
References: [21] (TAIR, ExpressionSet: 1005823559, [22])

Molecular Regulatory Networks Modeling
□ Preparing the next version of the model and testing it
Benitez and Hejatko, PLoS One, 2013

Molecular Regulatory Networks Modeling
□ A good model should be able to simulate reality
Benitez and Hejatko, PLoS One, 2013

Molecular Regulatory Networks Modeling
□ Formulating equations describing the relationships in the model

Static nodes:
$g_n(t+1) = F_n\big(g_{n_1}(t), g_{n_2}(t), \ldots, g_{n_k}(t)\big)$
  - $g_n(t+1)$: state at time t+1
  - $g_{n_1}(t), \ldots, g_{n_k}(t)$: state at time t
  - $F_n$: logical rule function

Mobile nodes:
$g(t+1)_T[i] = H\big(g(t)[i] + D\,\big(g(t)[i+1] + g(t)[i-1] - N(g(t)[i])\big) - b\big)$
  - $g(t+1)_T[i]$: state at time t+1
  - $g(t)[i]$: amount of TDIF or MIR165 in cell i
  - $D$: proportion of the movable element
  - $b$: constant corresponding to a degradation term

Molecular Regulatory Networks Modeling
□ A good model should be able to simulate reality
Benitez and Hejatko, submitted

Molecular Regulatory Networks Modeling
□ Simulation of mutants
Benitez and Hejatko, submitted

Systems Biology in Cancer Research

miRNA/mRNA Profiling
Guo et al., Mol Med Reports, 2017

Inferring Gene Regulatory Networks
[Figure labels: proximal root meristem, distal root meristem]
Benkova and Hejatko, Plant Mol Biol (2008)

Gene Regulatory Networks
[Figure labels: Quiescent centre; Columella cell files; Columella initials; Epidermis; Cortex; Endodermis; Stele initials; Lateral root cap; Epidermis initials; Endodermis and cortex initial; proximal root meristem; distal root meristem]

Gene Regulatory Networks - GENIST
□ Inferring GRNs via GENIST
  - GEne regulatory Network Inference from SpatioTemporal data algorithm
  - Combining spatial- and time-specific gene expression profiles
Birnbaum et al., Science, 2003; de Luis Balaguer et al., PNAS, 2017

Combining Large Omics Datasets
[Figure: expression matrix, GENES vs TISSUE/TIME]

Gene Regulatory Networks - GENIST
□ Inferring GRNs via GENIST
  - Clustering of genes
    Expression similarity under various conditions/genetic backgrounds, time points, …
  - Inferring intra-cluster connections
    Selection of potential regulators and co-regulators, based on the time correlation in the change of expression and/or user specification
    Dynamic Bayesian Network modeling
Haeseleer, Computational Biology, 2005; de Luis Balaguer et al., PNAS, 2017

Gene Regulatory Networks - GENIST
[Figure labels: MODEL PREDICTION; EXPERIMENTAL VERIFICATION; indirect PAN targets; feed-back loops]
de Luis Balaguer et al., PNAS, 2017

Summary
□ Definition of Systems Biology
□ Tools
  - Gene Ontology analysis
  - Bayesian Networks
  - Molecular/Gene Regulatory Networks Modeling
  - Inferring Gene Regulatory Networks from Large Omics Datasets

Discussion
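
Worked example: simulating the logical rules
The dynamical rules listed on the slide "Formulating logical rules defining the model dynamics" can be turned into a small simulation of the static-node update, where the state at time t+1 is computed from the state at time t. What follows is only a minimal sketch, not the authors' implementation. It assumes synchronous updates (every node reads the state at time t), and it holds ipt, arf and ahp6 constant because the slide gives no rule for them. The function names step and run_to_fixed_point are illustrative.

```python
def step(s):
    """One synchronous update: every rule reads the state s at time t."""
    new = dict(s)  # ipt, arf, ahp6 have no rule on the slide: kept constant (assumption)

    # CK: 2 if ipt=1 and ckx=0; 1 if ipt=1 and ckx=1; 0 else
    if s["ipt"] == 1 and s["ckx"] == 0:
        new["ck"] = 2
    elif s["ipt"] == 1 and s["ckx"] == 1:
        new["ck"] = 1
    else:
        new["ck"] = 0

    # CKX: 1 if barr>0 or arf=2; 0 else
    new["ckx"] = 1 if (s["barr"] > 0 or s["arf"] == 2) else 0

    # AHKs: ahk=ck
    new["ahk"] = s["ck"]

    # AHPs: graded response to AHK, inhibited by AHP6 and A-type ARRs
    if s["ahk"] == 2 and s["ahp6"] == 0 and s["aarr"] == 0:
        new["ahp"] = 2
    elif s["ahk"] == 2 and s["ahp6"] + s["aarr"] < 2:
        new["ahp"] = 1
    elif s["ahk"] == 1 and s["ahp6"] < 1:
        new["ahp"] = 1
    else:
        new["ahp"] = 0

    # B-type ARRs: 1 if ahp>0; 0 else
    new["barr"] = 1 if s["ahp"] > 0 else 0

    # A-type ARRs: 1 if arf<2 and ahp>0; 0 else
    new["aarr"] = 1 if (s["arf"] < 2 and s["ahp"] > 0) else 0
    return new


def run_to_fixed_point(state, max_steps=100):
    """Apply step() until the state stops changing, or max_steps is reached
    (the latter would indicate a cycle rather than a steady state)."""
    for _ in range(max_steps):
        nxt = step(state)
        if nxt == state:
            return state
        state = nxt
    return state


# Hypothetical starting condition: cytokinin biosynthesis on, everything else off.
start = dict(ipt=1, arf=0, ahp6=0, ck=0, ckx=0, ahk=0, ahp=0, barr=0, aarr=0)
print(run_to_fixed_point(start))
```

A mutant can be mimicked in the same sketch by fixing one node at a chosen value (for example ipt=0) before running the update, which is one simple way to approach the "Simulation of mutants" step.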
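
Worked example: enrichment test for one GO term
The Gene Ontology Analysis slides state that tools evaluate statistically whether genes associated with a specific process are enriched in a gene list. One common way to do this is a one-sided hypergeometric test, sketched below. This is an illustration of the general idea and does not reproduce the procedure of any particular tool such as GOrilla. The gene identifiers and the annotation set are hypothetical placeholders, and the function names are illustrative.

```python
from math import comb


def hypergeom_tail(N, K, n, k):
    """P(X >= k) when n genes are drawn from N, of which K carry the GO term."""
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / comb(N, n)


def term_enrichment(target, background, annotated):
    """Return (k, p) for one GO term; all arguments are sets of gene IDs.

    target     - genes of interest, e.g. differentially expressed genes
    background - all genes that could have been detected
    annotated  - genes associated with the GO term
    """
    target = target & background
    annotated = annotated & background
    k = len(target & annotated)  # genes of interest that carry the term
    p = hypergeom_tail(len(background), len(annotated), len(target), k)
    return k, p


# Hypothetical toy data: 10 background genes, 4 annotated to the term,
# 3 genes of interest, 2 of which carry the term.
background = {f"gene{i}" for i in range(10)}
annotated = {"gene0", "gene1", "gene2", "gene3"}
target = {"gene0", "gene1", "gene7"}

k, p = term_enrichment(target, background, annotated)
print(k, round(p, 4))
```

When many GO terms are tested against the same gene list, the resulting p-values need a multiple-testing correction before terms are called enriched.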