MASARYK UNIVERSITY
FACULTY OF INFORMATICS
Modelling of cell signalling
pathways by using boolean
networks
BACHELOR’S THESIS
Kateřina Hemalová
Brno, 2015
Declaration
I hereby declare that this thesis is my original authorial work, which
I have worked out on my own. All sources, references and literature
used or excerpted during the elaboration of this work are properly cited
and listed with complete reference to the due source.
Kateřina Hemalová
Advisor: RNDr. David Šafránek, Ph.D.
Acknowledgement
I would like to thank my advisor RNDr. David Šafránek, Ph.D. for
guidance and many consultations. I would also like to thank Mgr.
Pavel Krejčí, Ph.D. for providing me with experimental data and Vojtěch
Havel for critical reading of this thesis.
Abstract
In this thesis, we study the characteristics of boolean networks that
mimic signalling pathways. We use the FGF (Fibroblast Growth Factor)
pathway as a case study. The exact molecular interactions and feedbacks
in the FGF pathway are unclear. Therefore, boolean modelling
is a suitable formalism for its analysis. Based on the literature, we
have built boolean models of the FGF pathway.
The activity of the FGF pathway can have either a transient or a sustained
profile. However, the molecular mechanism responsible for the
differences in the pathway activity is unknown. We have expressed
these two qualities formally in LTL. Using the NuSMV model checker,
we have analysed our models for the given qualities. Based on our analysis,
we suggest that the negative feedbacks in the FGF pathway regulate
the pathway activity.
To comprehensively visualize the model dynamics, we have implemented
a DFS algorithm which depicts in gnuplot all paths in the
state transition graph reachable from an initial node.
Keywords
Cell Signalling Pathways, Boolean Networks, FGF Pathway, Transient
Activity, Sustained Activity
Contents
1 Introduction
2 FGF2 signalling pathway
2.1 Biological background of FGF2 signalling pathway
2.2 Behaviour of FGF2 signalling
3 Methods
3.1 Boolean models
3.1.1 Boolean models definition
3.1.2 Boolean models example
3.1.3 Boolean models semantics
3.2 Linear temporal logic
3.3 GINsim
3.4 NuSMV
3.5 Implementation of DFS algorithm
3.6 Experimental data
4 Results
4.1 Analysed properties
4.2 Models
4.3 Model M1
4.3.1 M1 behaviour
4.4 Model M2
4.4.1 M2 behaviour
4.5 Model M3
4.5.1 M3 behaviour
4.6 Model M4
4.7 Model M5
4.8 Summary of analysis
5 Discussion
6 Conclusion
A Attached files
B Signalling pathways
1 Introduction
The survival of living cells depends on their ability to perceive and
correctly respond to their environment. Cells receive chemical signals
through receptors on their plasma membrane. The receptors trigger a chain
of biochemical reactions which results in a complex cellular response.
The process by which the signal is passed between particular molecules
is called a signalling pathway.
A signalling pathway can be viewed as a network of mutual interactions
among signalling molecules. It is a complex system whose behaviour
is not intuitive due to various feedbacks and crosstalks. Recently,
computational modelling has been used more often to analyse the structure
and behaviour of signalling pathways. We can use computational
models to make hypotheses about the role of signalling molecules
in diseases and to make predictions which can be experimentally
tested.
Continuous models using ODEs (Ordinary Differential Equations)
provide accurate system analysis. However, their accuracy strongly
depends on the exact molecular concentrations and reaction rates,
which are often unknown due to the limitations of today’s experimental
techniques. Therefore, in practice most of the model parameters
are based on estimation and inference. In many cases, we can
clarify the basic system mechanisms without the need for a complicated
continuous model, by using logical models which discretize the unknown
continuous parameters.
In this thesis, we study the characteristics of boolean networks
which mimic signalling pathways. We use the FGF (Fibroblast Growth
Factor) pathway as a case study. The exact molecular interactions and
feedbacks in the FGF pathway are unclear and thus boolean modelling
is a suitable formalism for its analysis.
Cells respond differently depending on how long the FGF pathway
is active [16]. The pathway activity is either transient (approximately
2 hours) or sustained (more than 12 hours). The molecular
mechanism responsible for the different duration of FGF pathway activity
is not known.
To address this issue, we have built several models of the FGF pathway
based on the literature and unpublished experimental data produced
by Pavel Krejčí’s group from the Department of Experimental
Biology MU. The models explain the behaviour of the pathway on
the logical level.
2 FGF2 signalling pathway
The FGF signalling pathway is one of the most important pathways in
human physiology and development. During the embryonic stage, it is
involved in cell proliferation, differentiation and migration. In adults,
it plays an important role in the nervous system, tissue repair and tumor
angiogenesis [10].
Considering the number of biological processes that are regulated
by the FGF pathway, it is clear that deregulation of the pathway has a
severe impact on the organism. Mutations in FGF receptors (FGFRs)
have been identified in a variety of human cancers as well as in a variety
of human skeletal dysplasias [1, 11].
The FGF protein family has 22 members, which are recognized by 4
receptors (FGFRs). In this thesis, we focus on FGF2 and its receptor
FGFR3.
2.1 Biological background of FGF2 signalling
pathway
The following section introduces the main proteins involved
in FGF2 signalling. They are summarised in Figure 2.1.
FGF2
FGF2 is an external signal which binds to specific membrane receptors
and triggers signal transduction in the cell. FGF2 activates a spectrum
of signal transduction pathways. Among those is the MAPK pathway,
which is necessary for growth arrest [16].
FGFR3
FGF receptor 3 (FGFR3) is a cell surface receptor. It has an extracellular
domain with high affinity towards FGF2 and an intracellular domain
with tyrosine kinase activity which, in the active state, gains the ability
to phosphorylate other proteins on tyrosine residues.
Figure 2.1: Schema of the FGF2 pathway [17]. FGF2 activates Erk via the
FGFR3/Ras/MAPK pathway. FGFR3 utilizes at least three adaptors
(Frs2, Gab1, and Shc) to recruit Sos in order to activate Ras. Sos
is complexed with Grb2 and also might be complexed with Shp2.
Both Frs2 and Gab1 recruit Shp2-Grb2-Sos and Grb2-Sos complexes,
whereas Shc recruits mostly Grb2-Sos.
FGF2 binds to FGFR3 and promotes the formation of an FGFR3 dimer
[14]. FGFR3 changes its conformation, which triggers its tyrosine
kinase activity [20]. This leads to transphosphorylation of key tyrosine
residues on the intracellular part of the receptor. The activated receptor
then phosphorylates multiple intracellular proteins, including the adaptor
proteins Frs2, Shc and Gab1.
The signal is propagated downstream by the recruitment of the signalling
complexes Grb2-Sos or Shp2-Grb2-Sos, which consequently activate
the MAPK pathway [17]. The complexes bind to Frs2, Shc and Gab1 or,
less frequently, directly to FGFR3.
Frs2
Fibroblast Growth Factor Receptor Substrate 2 (Frs2) functions as a
major mediator of signalling via FGFRs [18]. Figure 2.2 depicts
the structure of the Frs2 protein. It has multiple tyrosine residues which
are phosphorylated by FGFR3. These residues serve as binding sites
for the downstream molecules Grb2 and Shp2. Shp2-Frs2 binding leads
to MAPK pathway activation. Grb2-Frs2 binding mostly leads
to the activation of a different pathway (PI-3 kinase) and plays only
a secondary role in the activation of the MAPK pathway.
Negative feedback on Frs2
Apart from tyrosine residues, Frs2 is also phosphorylated
on several threonine residues. This phosphorylation is mediated by
the downstream kinase ERK and has an inhibitory effect on Frs2. Frs2
threonine phosphorylation results in reduced tyrosine phosphorylation,
decreased recruitment of downstream molecules and consequent
attenuation of the MAPK pathway [18].
Figure 2.2: Structure of Frs2 docking protein [18]. Myristyl group
(Myr) anchors the Frs2 in the plasma membrane. PTB (phosphotyrosine
binding domain) binds to FGFR3. Several tyrosine residues
function as binding sites for downstream molecules Grb2 and Shp2.
Several threonine residues are crucial for the negative feedback loop.
Shc
Shc is an adaptor protein that binds to FGFR3 through its SH2 domain
and forms a complex with Grb2-Sos under FGF2 stimulation [17].
Gab1
GRB2-associated-binding protein 1 (Gab1) is an adaptor protein that binds
the Shp2-Grb2-Sos complex under FGF2 stimulation [17].
Grb2
Growth factor receptor-bound protein 2 (Grb2) is an adaptor protein
composed of one SH2 domain and two SH3 domains [19]. The SH2 domain
binds to phosphotyrosine on receptors or adaptor proteins, e.g.
Frs2 and Shc. The SH3 domains bind to several proline-rich peptides,
namely Sos.
Sos
Son of sevenless (Sos) is a guanine nucleotide exchange factor that activates
the Ras protein by catalyzing the formation of the Ras-GTP complex [9].
Sos is constitutively bound to Grb2 in the Grb2-Sos complex.
Negative feedback on Sos
Several Sos serine residues are phosphorylated by ERK [8]. The phosphorylation
results in the disruption of the Grb2-Sos complex and the inhibition
of Grb2-Sos binding to adaptors.
Shp2
Shp2 is involved in many signalling pathways, including the FGF pathway,
and interacts with various proteins [18]. It contains two SH2 domains
which bind Frs2. FGF stimulation leads to tyrosine phosphorylation
of Shp2 and its association with the Grb2-Sos complex [2].
Ras
Ras is a GTP-binding protein which is attached to the plasma membrane.
It is active if it binds GTP and inactive if it binds GDP. Active
Ras is recognized by the Raf protein [5]. They form a complex which
leads to the activation of the MAPK pathway.
MAPK pathway
The MAPK pathway represents a major pathway utilized by FGF2
signalling [11]. It consists of the serine/threonine kinases RAF, MEK and
ERK, which transmit the signal downstream in a cascade-like manner. RAF
phosphorylates MEK, and phosphorylated MEK then phosphorylates
ERK. Phosphorylation of ERK results in the transcription of genes
involved in cell growth and division.
P-ERK provides several negative feedbacks that attenuate FGF
signalling and thereby protect the cell from excessive activation of
FGFR [8, 18].
2.2 Behaviour of FGF2 signalling
The behaviour of the FGF2 pathway over time can be measured as the level
of phosphorylated downstream molecules, e.g. phosphorylated ERK
(P-ERK) [16]. The cell response depends on how long the P-ERK level
stays increased.
In most cell types, the pathway activity is transient (approximately
2 hours). In this time range the level of P-ERK rapidly
increases and then decreases to a minimum (Figure 2.4a).
In chondrocytes, the cell type responsible for bone growth, the
pathway activity is sustained (Figure 2.4b). The P-ERK level increases
and then stays increased (with slight fluctuations) over a range of
12 hours (Figure 2.3). The sustained increase of the P-ERK level leads to
growth arrest. The molecular mechanism responsible for the differences
in the FGF2 pathway activity is not clear.
Figure 2.3: Sustained behaviour of the FGF pathway in chondrocyte-like
cells [16]. The level of P-ERK was measured during 24 hours of
incubation with FGF2. The data are expressed as the ratio between
the optical density of P-Erk1/2 and total Erk1/2.
Figure 2.4: Schematic time course of ERK activity. (a) Transient ERK
activity [23]. (b) Sustained ERK activity [27].
3 Methods
3.1 Boolean models
In this thesis, we use the boolean model definition as described
in the paper by Barnat et al. [4]. The definition comes from Thomas’
extension of boolean networks, which is implemented in the GINsim tool
[24, 22]. The framework was designed to model gene regulatory networks
(GRN), but it is nevertheless suitable for modelling signalling
pathways as well, since GRNs and signalling pathways share basic
principles. The approach has been applied to modelling signalling
pathways in the studies [25, 13].
A GRN is a collection of genes which interact with each other through
gene products (RNA or proteins). Thus it can be represented as a network
where nodes are genes and edges are regulatory interactions.
Each node takes a binary value. It is either on (the gene is active,
transcribed) or off (the gene is inactive, not transcribed).
In the case of a signalling pathway, the nodes represent signalling proteins
and the edges represent regulations, mostly phosphorylation or
dephosphorylation. The proteins are usually in the on state when they
are phosphorylated and in the off state when they are dephosphorylated,
but it can also be vice versa.
3.1.1 Boolean models deﬁnition
Here we present the boolean model definition according to Barnat
et al. [4]. The boolean model is formally described as a tuple
$B = \langle G, \sigma, \theta, \rho, L \rangle$ where
• $G = (V, E)$ is a directed regulatory graph with vertices
$V = \{g_1, \ldots, g_n\}$ denoting proteins of the signalling pathway
and a set of edges $E \subseteq V \times V$ denoting regulations,
• $\sigma(e) \in \{+, -\}$ denotes the type of regulation $e \in E$:
positive (+) or negative (−),
• $\theta(e) \in \mathbb{N}_{\geq 1}$ denotes the activation threshold of $e \in E$,
• $\rho(g_i) \in \mathbb{N}_{\geq 1}$ denotes the maximum activity level of
$g_i \in V$, determining the activity domain $\{0, \ldots, \rho(g_i)\}$,
• $L$ is the regulatory logic defined as
$L = \{K_{i,R} \mid 1 \leq i \leq n,\ R \subseteq \{v \in V \mid \langle v, g_i \rangle \in E\}\}$,
where $K_{i,R}$ denotes the target activity level of $g_i$
when regulated by all signalling proteins in $R$, $0 \leq K_{i,R} \leq \rho(g_i)$.
3.1.2 Boolean models example
The type of boolean model most suitable for modelling signalling pathways
is the purely boolean model, in which the maximum activity level of
all vertices is set to 1 [4]. Figure 3.1a depicts an example
of a simple purely boolean model
$B = \langle G, \sigma, \theta, \rho, L \rangle$ where
• $G = (V, E)$ is the graph with the vertices $V = \{A, B\}$ and the
edges $E = \{\langle A, A \rangle, \langle A, B \rangle, \langle B, A \rangle\}$,
• $\sigma(\langle A, A \rangle) = +$, $\sigma(\langle A, B \rangle) = +$,
$\sigma(\langle B, A \rangle) = -$,
• $\theta(\langle A, A \rangle) = 1$, $\theta(\langle A, B \rangle) = 1$,
$\theta(\langle B, A \rangle) = 1$,
• $\rho(A) = 1$, $\rho(B) = 1$,
• $L$ is the regulatory logic defined as the following set of rules:
$K_{A,\emptyset} = 1$
$K_{A,\{A\}} = 1$
$K_{A,\{A,B\}} = 0$
$K_{A,\{B\}} = 0$
$K_{B,\emptyset} = 0$
$K_{B,\{A\}} = 1$
The rule $K_{B,\{A\}} = 1$ declares that when the activity level of A
reaches the threshold, A activates B. The rule $K_{A,\{A\}} = 1$ declares
autoactivation of A. The rule $K_{A,\{A,B\}} = 0$ defines the activity of
A when the interactions with both A and B are active. The rules
$K_{A,\emptyset} = 1$ and $K_{B,\emptyset} = 0$ define the basal activity
levels of A and B, which correspond to the situation when all incoming
interactions are inactive.
The regulatory logic we choose here is one of $2^6 = 64$ possible
parametrizations, since each of the six regulatory rules can be set to 0 or 1.
Figure 3.1: (a) Simple boolean model example. (b) Dynamics of the
model depicted by the transition graph.
3.1.3 Boolean models semantics
The directed regulatory graph of a boolean model $B$ consists of the nodes
$g_1, \ldots, g_n$. The node $g_i$ can take one of the values
$0, \ldots, \rho(g_i)$. A state of the model $B$ is an $n$-tuple of the
values of all nodes of $B$ [4].
For example, the model $B$ from Section 3.1.2 consists of two nodes A and
B, each of which can take the value 0 or 1. Therefore, the model
$B$ can occur in four distinct states 00, 10, 11, 01. The semantics of the
model $B$ can be depicted by a transition graph where the nodes denote
the states and the edges denote the transitions between them
(Figure 3.1b). The construction of transitions is defined later in the
text.
Formally, the semantics of a model $B$ is defined as the tuple
$BTS(B) = \langle S, T, S_0 \rangle$ where
• $S = \prod_{i=1}^{n} \{0, \ldots, \rho(g_i)\}$ is the state space,
• $S_0 \subseteq S$ is the set of specified initial states,
• $T \subseteq S \times S$ is the transition relation, which is defined later.
The state space is the set of all distinct states of the model. The
transition relation determines the transitions between the states in
the transition graph. Several strategies for constructing the transition
relation have been developed [12]. In this thesis, we use the
nondeterministic asynchronous approach, which allows the activity level
of only one signalling protein to change in a single transition
[26, 15].
The transition relation is defined as follows [4]. First we denote
the level of $g_i$ in the state $s \in S$ by $l_i(s)$. Assume there is a
regulation $e = \langle g_i, g_j \rangle \in E$ between the genes $g_i, g_j$.
We say that $g_i$ is a resource for $g_j$ in $s$ if $\sigma(e) = +$ and
$l_i(s) \geq \theta(e)$, or $\sigma(e) = -$ and $l_i(s) < \theta(e)$. Let
$Re(s, g_i)$ denote the set of all resources of $g_i$ in $s$. There is a
transition $s \to s'$ according to the following rules, where the levels
of all nodes other than $u$ stay unchanged:
• If there exists $u$ such that $K_{u,Re(s,g_u)} > l_u(s)$ then
$l_u(s') = l_u(s) + 1$.
• If there exists $u$ such that $K_{u,Re(s,g_u)} < l_u(s)$ then
$l_u(s') = l_u(s) - 1$.
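As a minimal sketch of the asynchronous semantics, the following Python fragment builds the transition graph of the two-node example from Section 3.1.2. It rests on one assumption that should be read carefully: the rule index R is taken to be the set of regulators whose level reaches the threshold, regardless of the sign of the edge, which is the reading used when the example rules are explained in Section 3.1.2; under the sign-aware notion of a resource the same rules would be indexed differently. The names `active_regulators`, `successors` and `transition_graph` are illustrative and are not part of GINsim or of the thesis implementation.

```python
from itertools import product

# Example model of Section 3.1.2: A activates itself and B, B inhibits A.
NODES = ("A", "B")
RHO = {"A": 1, "B": 1}  # maximum activity levels
# Thresholds of the edges, keyed by (source, target).
THETA = {("A", "A"): 1, ("A", "B"): 1, ("B", "A"): 1}
# Regulatory logic, keyed by (target, R).
K = {
    ("A", frozenset()): 1,
    ("A", frozenset({"A"})): 1,
    ("A", frozenset({"A", "B"})): 0,
    ("A", frozenset({"B"})): 0,
    ("B", frozenset()): 0,
    ("B", frozenset({"A"})): 1,
}


def active_regulators(levels, target):
    """ASSUMPTION: R = regulators of `target` whose level reaches the edge threshold."""
    return frozenset(src for (src, dst) in THETA
                     if dst == target and levels[src] >= THETA[(src, dst)])


def successors(state):
    """Asynchronous successors: each one changes a single component by one level."""
    levels = dict(zip(NODES, state))
    result = []
    for idx, u in enumerate(NODES):
        goal = K[(u, active_regulators(levels, u))]
        if goal != levels[u]:
            step = 1 if goal > levels[u] else -1
            result.append(state[:idx] + (state[idx] + step,) + state[idx + 1:])
    return result


def transition_graph():
    """Map every state of the state space to the list of its successors."""
    states = product(*(range(RHO[g] + 1) for g in NODES))
    return {s: successors(s) for s in states}
```

Under the stated assumption, `transition_graph()` maps 00 to 10, 10 to 11, 11 to 01 and 01 to 00, with states written in the order A, B.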
3.2 Linear temporal logic
In this section we define the Linear Temporal Logic (LTL), which we
use to formally describe characteristics of boolean model behaviour.
First we denote by $P(B)$ the set of all runs $\pi$ in the transition graph
$BTS(B)$. Each infinite run $\pi = s_0, s_1, \ldots \in BTS(B)$ belongs to
$P(B)$. For each finite run $s_0, \ldots, s_n \in BTS(B)$ such that there is
no outgoing edge from $s_n$, there is an infinite run
$\pi = s_0, \ldots, s_n, s_n, s_n, \ldots \in P(B)$.
Let $\pi^i$ for $\pi = s_0, s_1, \ldots$ denote the run $s_i, s_{i+1}, \ldots$
where the first $i$ states are removed, and let $\pi_0$ denote the first
state of the run $\pi$.
The syntax of LTL is defined by the following equation in BNF:
$\varphi ::= p \mid \varphi \wedge \varphi \mid \neg\varphi \mid X\varphi \mid \varphi U \varphi,$
where $p$ belongs to the set of atomic propositions $AP$ [7]. Atomic
propositions are statements of the form $g_i \circ t$ where
$\circ \in \{=, >, <, \leq, \geq\}$, $g_i$ belongs to the set of vertices of
the regulatory graph of the model $B$ and $t$ belongs to the activity
domain $\{0, \ldots, \rho(g_i)\}$.
The semantics of LTL is defined over the set of all runs $P(B)$ as follows
[7]. We say that a run $\pi = s_0, s_1, \ldots$ satisfies an LTL formula,
written $\pi \models \varphi$, if and only if:
• $\pi \models p$ iff $p \in L(\pi_0)$ for $p \in AP$, where $L(s)$ is the
set of all atomic propositions $g_i \circ t$ such that the constraint
$g_i \circ t$ holds in the state $s$,
• $\pi \models \varphi \wedge \psi$ iff $\pi \models \varphi$ and $\pi \models \psi$,
• $\pi \models X\varphi$ iff $\pi^1 \models \varphi$,
• $\pi \models \varphi U \psi$ iff $\exists k \geq 0$ such that
$\pi^k \models \psi$ and $\pi^i \models \varphi$ holds for all
$i \in \{0, \ldots, k - 1\}$,
• $\pi \models \neg\varphi$ iff not $\pi \models \varphi$.
In addition, LTL can be extended with the temporal operators
$F\varphi \equiv \mathit{true}\, U \varphi$ and $G\varphi \equiv \neg F \neg\varphi$
and with the boolean connectives $\vee, \Rightarrow$ where
$\varphi \vee \psi \equiv \neg(\neg\varphi \wedge \neg\psi)$ and
$\varphi \Rightarrow \psi \equiv \neg\varphi \vee \psi$.
LTL can be directly interpreted over $P(B)$. Given a dynamic system
$BTS(B)$ with a particular set of initial states $S_0$, we say that
$BTS(B)$ satisfies a formula $\varphi$, written $BTS(B) \models \varphi$,
if and only if for each $s \in S_0$ all runs $\pi \in P(B)$ such that
$\pi_0 = s$ satisfy $\varphi$ [3].
The automated process of verifying that a given model satisfies a
given logical formula is called model checking [7].
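The semantics above can be sketched in Python for runs that consist of a finite prefix followed by an endlessly repeated cycle (a finite run is covered by repeating its last state). The encoding of formulae as nested tuples, the representation of atomic propositions as Python predicates over a state, and the names `holds`, `F`, `G` and `TRUE` are assumptions made for this illustration only; the sketch is not meant to reflect the implementation of any model checker.

```python
TRUE = ("ap", lambda state: True)


def F(f):
    """F f is an abbreviation of true U f."""
    return ("U", TRUE, f)


def G(f):
    """G f is an abbreviation of not F not f."""
    return ("not", F(("not", f)))


def holds(formula, prefix, cycle):
    """Decide whether the run prefix followed by cycle repeated forever satisfies formula."""
    if not cycle:
        raise ValueError("the repeated cycle must contain at least one state")
    states = list(prefix) + list(cycle)
    n, loop = len(states), len(prefix)

    def succ(i):
        # Position that follows i on the run; the last position loops back.
        return i + 1 if i + 1 < n else loop

    def sat(f):
        # Set of positions i such that the suffix starting at i satisfies f.
        op = f[0]
        if op == "ap":
            return {i for i in range(n) if f[1](states[i])}
        if op == "not":
            return set(range(n)) - sat(f[1])
        if op == "and":
            return sat(f[1]) & sat(f[2])
        if op == "X":
            inner = sat(f[1])
            return {i for i in range(n) if succ(i) in inner}
        if op == "U":
            # Least fixpoint: start from the psi-positions and step backwards
            # through positions that satisfy phi.
            left, result = sat(f[1]), sat(f[2])
            while True:
                new = {i for i in left if succ(i) in result} - result
                if not new:
                    return result
                result |= new
        raise ValueError(f"unknown operator: {op}")

    return 0 in sat(formula)
```

For instance, with `p = ("ap", lambda s: s["ERK"] > 0)`, the call `holds(F(G(p)), prefix, cycle)` checks whether ERK eventually stays active on the given run.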
3.3 GINsim
GINsim (Gene Interaction Network simulation) is a computer tool
for the modelling and simulation of GRNs which implements Thomas’
extension of boolean networks [22, 24]. GINsim relies on two
main types of graphs: logical regulatory graphs, which model regulatory
networks, and state transition graphs, which represent their
dynamical behaviour.
GINsim offers several tools to analyse the state transition graph.
The tool "search path" finds a path between two different
states of the state transition graph. The result of "search path" is the
first encountered shortest path. However, for the purposes of our
analysis we needed to extract all the paths reachable from the initial
states and visualize them in an automated manner. GINsim does not
support this functionality, therefore we implemented our own algorithm
which searches the paths by DFS. We present it in Section 3.5.
3.4 NuSMV
NuSMV is a symbolic model checker [6]. We use it to verify that a
given boolean model satisfies a given LTL formula. If the model does
not satisfy the formula, the model checker gives one counterexample,
i.e. a path which does not satisfy the formula.
Models defined in GINsim can be directly exported into a NuSMV
file [21]. Before the export, initial states can be set, which reduces
the transition graph to the paths reachable from the initial states. If the
initial states are not set, a given LTL formula is interpreted over
the full transition graph where all states are initial.
3.5 Implementation of DFS algorithm
For the purposes of our analysis we needed to extract all the paths
in the state transition graph reachable from the initial states and
visualize them in an automated manner. We implemented an algorithm which
searches the paths by DFS (Depth First Search, see Algorithm 1). The
algorithm takes as input the state transition graph $G$ and one arbitrary
initial state $s_0 \in G$. The algorithm finds all paths $s_0, \ldots, s_t$
where there is no outgoing edge from $s_t$. It also finds all pairs
p(prefix, cycle) $\in G$ where:
• prefix $= s_0, \ldots, s_n$ is a path connected to the cycle by the edge
$s_n \to s'_0$,
• cycle $= s'_0, \ldots, s'_m$ is a cycle such that $s'_m = s'_0$ and no
other state occurs twice in the cycle.
The correctness of Algorithm 1 follows from the fact that each
path occurs on the stack S exactly once. The while loop has at most
n! iterations because there are at most n! paths in a graph with n
vertices. Since there are at most n! iterations of the while loop and a single
iteration is linear in the graph size, the worst-case running time of the
algorithm grows factorially with n. The algorithm is not scalable and is
intended for small graphs. To analyse large graphs, we use a different
approach based on model checking.
Algorithm 1 DFS(G, initialNode)
S ← empty stack
S.push((initialNode, successors(initialNode)))
while S not empty do
    if second(S.top()) = empty then ¹
        S.pop(); continue
    next ← any element from second(S.top())
    remove next from second(S.top())
    if ∃n ∈ S such that first(n) = next then ²
        report cycle
    else
        S.push((next, successors(next)))
        if successors(next) = empty then
            report path
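Algorithm 1 can be transcribed into Python roughly as follows. This is an illustrative sketch and not the thesis program (which is written in C++): the function name `enumerate_paths`, the callable `successors` and the choice to return the results as two lists are assumptions made here. The handling of an initial state without successors is an edge case that Algorithm 1 does not spell out.

```python
def enumerate_paths(successors, initial):
    """Iterative DFS in the spirit of Algorithm 1.

    Returns (paths, lassos): `paths` holds the paths that end in a state
    without successors, `lassos` holds (prefix, cycle) pairs in which the
    cycle starts and ends with the same state.
    """
    paths, lassos = [], []
    stack = [(initial, list(successors(initial)))]
    if not stack[0][1]:
        paths.append([initial])  # edge case not spelled out in Algorithm 1
    while stack:
        _, pending = stack[-1]
        if not pending:
            stack.pop()
            continue
        nxt = pending.pop()  # "any element", removed from the pending list
        on_stack = [state for state, _ in stack]
        if nxt in on_stack:
            # nxt closes a cycle with a state of the current path.
            start = on_stack.index(nxt)
            lassos.append((on_stack[:start], on_stack[start:] + [nxt]))
        else:
            stack.append((nxt, list(successors(nxt))))
            if not stack[-1][1]:
                paths.append(on_stack + [nxt])
    return paths, lassos
```

A transition graph stored as a dictionary from states to successor lists can be passed as `enumerate_paths(lambda s: graph[s], initial_state)`.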
Implementation details
The implementation uses the state transition graph exported from
GINsim in a ".layout" file, where each line of the file defines two nodes
connected by a directed edge. The program parses the
".layout" file, then runs the DFS algorithm and visualizes each
path as a multiplot of the proteins’ activity time courses.
The program is written in C++11. It uses the Gnuplot-iostream interface,
which sends commands to Gnuplot. Gnuplot version 4.6 is required.
The source code was written in Visual Studio,
which is required for compiling. The program can be run from the
command line by the command "./NPF.exe [fileName].layout".
The user is asked to choose one of the nodes of the state transition
graph as the initial node. For each processed path the program
creates a file "path.data" with the plot coordinates. For each path,
a graph is generated and saved as "plot.png".
1. second(s) returns the second element of tuple s
2. first(s) returns the first element of tuple s
Figure 3.2 shows an example of the program output. The program takes as
input the state transition graph (Figure 3.2a) and the initial node 1000 and
returns all paths reachable from 1000. Only two of them are shown
here.
3.6 Experimental data
Western blot
Western blot is one of the most commonly used methods in studies
of signalling pathways. The proteins extracted from the cell lysate
are loaded on an electrophoretic gel, where they are separated by 3D
structure or by length. Then they are transferred to a nitrocellulose
or PVDF membrane, where they keep exactly the same position as they
had on the electrophoretic gel. Afterwards they are bound by stained
antibodies specific to the target protein. The protein levels are evaluated
by densitometry.
The final image consists of several bands; each band corresponds
to a single protein (Figure 3.4). Western blot is used as a qualitative
indicator of whether protein activity is up- or downregulated, based on how
intense the band is. The band intensity depends on several factors
which are stochastic. Therefore, the intensity cannot be reproduced
in two different measurements on different gels. Western blot thus
generates qualitative data: it reveals general trends in the data
rather than an absolute quantification of proteins.
Experimental data
We have been provided with western blot data by Pavel Krejčí’s group
from the Department of Experimental Biology MU. They measured
the behaviour of the FGF pathway over time as the level of the
phosphorylated downstream molecules, phosphorylated ERK (P-ERK) and
phosphorylated FRS2 (P-FRS2).
(a) The state transition graph which the program uses as an input.
(b) Path (1000,1100,1110,1010) (c) Path (1000,1001,1101,1111,1011)
Figure 3.2: Demonstrative run of the DFS implementation. The program
takes as input the state transition graph (a) and the initial
node 1000 and returns all paths reachable from the initial node. Only
two paths are shown here (b and c). The grey background denotes a
cycle. The path (b) consists of a single cycle. The path (c) ends with the
terminal state 1011, which is shown as a single-state cycle.
Western blot images (Figure 3.4) were evaluated by software to
quantify the relative amount of proteins by both the area and the intensity
of the bands in terms of optical density. Figure 3.3 depicts the
summary of 12 independent measurements of P-ERK. The data are
variable and do not allow an absolute quantification of P-ERK. We can
only extract a general trend in the data, which corresponds to sustained
P-ERK activity (Section 2.2). The level of P-ERK rapidly increases within 2
hours and slowly decreases over the range of 12 hours. The same trend
can be found in the P-FRS2 measurements (data not shown).
Given the qualitative character of the data, it is appropriate to abstract
them as boolean values.
Figure 3.3: Boxplot of 12 independent measurements of ERK activity
in hESC and RCS cells.
Figure 3.4: Example of western blot image. RCS cells were treated for
indicated time with FGF. Control comes from untreated cells.
4 Results
Based on the biological knowledge we have built several models of
the FGF signalling pathway. We have implemented the models in
GINsim and we have analysed properties of their dynamics with our
implementation of DFS and with the model checker NuSMV.
4.1 Analysed properties
We use the activity of ERK as the main indicator of the FGF pathway
behaviour. We focus on two properties of the FGF pathway, transient
and sustained ERK activity. Additionally, we analyse transient and
sustained FRS2 activity. We formally define the properties by the following
LTL formulae:
• F(ERK > 0) ∧ GF(ERK < 1) expresses transient ERK activity.
• F(FRS2 > 0) ∧ GF(FRS2 < 1) expresses transient FRS2 activity.
• FG(ERK > 0) expresses sustained ERK activity.
• FG(FRS2 > 0) expresses sustained FRS2 activity.
• FG(ERK > 0) ∧ FG(FRS2 > 0) expresses sustained ERK and
FRS2 activity.
We say that a given model B allows transient/sustained behaviour
if there is at least one run in BTS(B) which satisfies the LTL formula
describing transient/sustained ERK activity.
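For a run that consists of a finite prefix followed by an endlessly repeated cycle, the formulae above reduce to simple checks: F holds if the condition is met anywhere on the run, GF if it is met somewhere in the cycle, and FG if it is met everywhere in the cycle. The following Python sketch illustrates this; it assumes that each state is a dict from protein names to activity levels and that the runs are supplied by the caller. The function names are illustrative, and the sketch only inspects the runs it is given rather than all runs of BTS(B).

```python
def is_transient(prefix, cycle, protein):
    """F(protein > 0) and GF(protein < 1) on the run prefix, cycle, cycle, ..."""
    ever_active = any(s[protein] > 0 for s in list(prefix) + list(cycle))
    inactive_infinitely_often = any(s[protein] < 1 for s in cycle)
    return ever_active and inactive_infinitely_often


def is_sustained(prefix, cycle, protein):
    """FG(protein > 0): the protein is active in every state of the cycle."""
    return bool(cycle) and all(s[protein] > 0 for s in cycle)


def allows(runs, check, protein):
    """True if at least one of the given (prefix, cycle) runs passes `check`."""
    return any(check(prefix, cycle, protein) for prefix, cycle in runs)
```

A call such as `allows(runs, is_sustained, "ERK")` then mirrors the notion of a model allowing sustained behaviour, restricted to the supplied runs.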
4.2 Models
We present the models in the form of a schema of the regulatory graph
G = (V, E) where positive interactions are marked by the symbol → and
negative interactions by the symbol ⊣. The following two constraints
hold for all models: ∀g ∈ V : ρ(g) = 1 and ∀e ∈ E : θ(e) = 1. Therefore,
all models are purely boolean and each node takes a binary
value, either 0 or 1.
For each model we set the topology, the regulatory logic and the
initial states for the construction of the state transition graph. We explain
the model settings based on the biology of the FGF pathway.
As initial states, we choose all states where FGF is 1 because we want to
mimic the dynamics of the system in FGF-stimulated cells.
4.3 Model M1
Figure 4.1a depicts the topology of the model M1. It is a basic
model which consists of three proteins (FGF, FRS2, ERK).
The edge <FGF, FGF> simulates the constitutive stimulation of
cells by FGF. The edge <FGF, FRS2> simulates the signal transduction
through the receptor and the consequent FRS2 activation. The edge
<FRS2, ERK> is an abstraction of the signal transduction by the MAPK
phosphorylation cascade. The edge <ERK, FRS2> mimics the fact that ERK
phosphorylates FRS2 and consequently inhibits the FRS2 activity [18].
We set the basal activity of the proteins to zero by the rules
$K_{g_i,\emptyset} = 0$ because we want to fit the models to the control
western blot measurements of unstimulated cells (Section 3.6), which show
that FRS2 and ERK are inactive when there is no FGF in the environment.
Due to these settings, the activation of FRS2 results from the stimulation
by FGF. Similarly, ERK is not activated spontaneously but
is activated by FRS2. We set this by the rules $K_{FRS2,\{FGF\}} = 1$
and $K_{ERK,\{FRS2\}} = 1$.
The model is divided into two models M1.1 and M1.2, which
differ in the setting of the rule $K_{FRS2,\{FGF,ERK\}}$. M1.1 sets the
rule to 0, which means that the ERK negative feedback is stronger
than the positive influence of FGF. M1.2 sets the rule to 1, which means
that the negative feedback is not strong enough to inactivate FRS2 if
there is positive stimulation by FGF.
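The two parametrizations can be explored with a small Python sketch that builds the part of the asynchronous state transition graph reachable from the states with FGF = 1. It uses the rules of M1 as listed in Figure 4.1 and assumes, as in the description above, that a rule is selected by the set of regulators that are currently active (level 1). The helper names `rules`, `successors` and `reachable_graph` are illustrative and are not taken from GINsim or from the thesis program.

```python
NODES = ("FGF", "FRS2", "ERK")
# Incoming regulators of each node, following the edges of M1.
REGULATORS = {"FGF": ("FGF",), "FRS2": ("FGF", "ERK"), "ERK": ("FRS2",)}


def rules(k_frs2_fgf_erk):
    """Regulatory logic of M1; the argument is the rule K_FRS2,{FGF,ERK}
    (0 for M1.1, 1 for M1.2)."""
    return {
        ("FGF", frozenset()): 0,
        ("FGF", frozenset({"FGF"})): 1,
        ("FRS2", frozenset()): 0,
        ("FRS2", frozenset({"FGF"})): 1,
        ("FRS2", frozenset({"ERK"})): 0,
        ("FRS2", frozenset({"FGF", "ERK"})): k_frs2_fgf_erk,
        ("ERK", frozenset()): 0,
        ("ERK", frozenset({"FRS2"})): 1,
    }


def successors(state, k):
    """Asynchronous successors of a state given as (FGF, FRS2, ERK)."""
    levels = dict(zip(NODES, state))
    result = []
    for idx, node in enumerate(NODES):
        # ASSUMPTION: the rule index is the set of regulators at level 1.
        active = frozenset(r for r in REGULATORS[node] if levels[r] >= 1)
        goal = k[(node, active)]
        if goal != levels[node]:
            # In a purely boolean model one step reaches the target level.
            result.append(state[:idx] + (goal,) + state[idx + 1:])
    return result


def reachable_graph(k, initial_states):
    """Transition graph restricted to the states reachable from initial_states."""
    graph, todo = {}, list(initial_states)
    while todo:
        state = todo.pop()
        if state not in graph:
            graph[state] = successors(state, k)
            todo.extend(graph[state])
    return graph


INITIAL = [(1, frs2, erk) for frs2 in (0, 1) for erk in (0, 1)]
m11 = reachable_graph(rules(0), INITIAL)
m12 = reachable_graph(rules(1), INITIAL)
```

The dictionaries `m11` and `m12` can then be compared with the state transition graphs in Figure 4.1d and 4.1e.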
4.3.1 M1 behaviour
Our analysis shows that the models M1.1 and M1.2 behave differently
as a result of the regulatory logic on the node FRS2 (Figure 4.1).
Because of the steady FGF stimulation, the model M1.2 behaves as if the
feedback from ERK to FRS2 were absent. Therefore, the ERK activity is
sustained. In contrast, the model M1.1 allows the feedback from ERK to
FRS2 to take effect, which results in transient ERK activity.

Figure 4.1: (a) The regulatory graph of the model M1. The regulatory
logic of the model (b) M1.1 and (c) M1.2. The state transition graph
of the model (d) M1.1 and (e) M1.2. The components of a single state
correspond to FGF, FRS2 and ERK, in this order.

(b) M1.1
$K_{FGF,\emptyset} = 0$
$K_{FGF,\{FGF\}} = 1$
$K_{FRS2,\emptyset} = 0$
$K_{FRS2,\{FGF\}} = 1$
$K_{FRS2,\{ERK\}} = 0$
$K_{FRS2,\{FGF,ERK\}} = 0$
$K_{ERK,\emptyset} = 0$
$K_{ERK,\{FRS2\}} = 1$

(c) M1.2
$K_{FGF,\emptyset} = 0$
$K_{FGF,\{FGF\}} = 1$
$K_{FRS2,\emptyset} = 0$
$K_{FRS2,\{FGF\}} = 1$
$K_{FRS2,\{ERK\}} = 0$
$K_{FRS2,\{FGF,ERK\}} = 1$
$K_{ERK,\emptyset} = 0$
$K_{ERK,\{FRS2\}} = 1$

Figure 4.2: (a) The regulatory graph of the model M2. The regulatory
logic of the model (b) M2.1 and (c) M2.2. Rules which are not mentioned
are the same as in M1. The state transition graph of the model (d)
M2.1 and (e) M2.2. The components of a single state correspond to FGF,
FRS2, ERK and SHC, in this order.

(b) M2.1
$K_{FRS2,\{FGF,ERK\}} = 0$
$K_{ERK,\emptyset} = 0$
$K_{ERK,\{FRS2\}} = 1$
$K_{ERK,\{SHC\}} = 1$
$K_{ERK,\{FRS2,SHC\}} = 1$
$K_{SHC,\emptyset} = 0$
$K_{SHC,\{FGF\}} = 1$

(c) M2.2
$K_{FRS2,\{FGF,ERK\}} = 1$
$K_{ERK,\emptyset} = 0$
$K_{ERK,\{FRS2\}} = 1$
$K_{ERK,\{SHC\}} = 1$
$K_{ERK,\{FRS2,SHC\}} = 1$
$K_{SHC,\emptyset} = 0$
$K_{SHC,\{FGF\}} = 1$
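The behaviour of M1 described in Section 4.3.1 can be recomputed in a few lines. This is only a sketch under two assumptions of ours: every transition changes exactly one component (asynchronous updating), and the initial state is FGF = 1, FRS2 = 0, ERK = 0. The function names are hypothetical, and the code is not the NPF or NuSMV workflow used in the thesis.

```python
def targets(state, feedback_wins):
    """Target values for M1; feedback_wins=True encodes M1.1, False encodes M1.2."""
    fgf, frs2, erk = state
    frs2_target = int(fgf and not (erk and feedback_wins))
    return (fgf, frs2_target, frs2)  # FGF is kept, ERK follows FRS2


def successors(state, feedback_wins):
    """Asynchronous update: each successor changes exactly one component."""
    goal = targets(state, feedback_wins)
    return [
        state[:i] + (goal[i],) + state[i + 1:]
        for i in range(len(state))
        if state[i] != goal[i]
    ]


def follow(state, feedback_wins):
    """Follow transitions until a steady state or a repeated state is found.

    Assumes every visited state has at most one successor, which holds for
    the states of M1 reachable from (1, 0, 0).
    """
    path = [state]
    while True:
        nxt = successors(path[-1], feedback_wins)
        if not nxt:
            return "steady", path
        assert len(nxt) == 1
        if nxt[0] in path:
            return "cycle", path[path.index(nxt[0]):]
        path.append(nxt[0])


print(follow((1, 0, 0), True))   # M1.1
print(follow((1, 0, 0), False))  # M1.2
```

Under these assumptions M1.1 cycles through the states 100, 110, 111 and 101, while M1.2 settles in the steady state 111, in line with the transient and sustained ERK activity reported for the two models.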
4.4 Model M2
The model M2 (Figure 4.2a) is an extension of M1. It contains one additional
protein, SHC, which allows an alternative flow of the signal.
M2 is again divided into two models, M2.1 and M2.2, which differ in
the rule $K_{FRS2,\{FGF,ERK\}}$, set to 0 in M2.1 and to 1 in M2.2.
4.4.1 M2 behaviour
Our analysis shows that the model M2.1 allows both transient and
sustained ERK activity (Figure 4.2). The transient activity is caused
by the signal transduction via FRS2, which is cyclically switched off by
the negative feedback from ERK to FRS2. The sustained ERK activity results
from the signal transduction via SHC.
Compared to M1.1, the model M2.1 gains new properties due to
SHC, which allows an alternative flow of the signal.
Like M1.2, the model M2.2 allows only sustained behaviour
because the negative feedback on FRS2 cannot take effect
while there is positive stimulation by FGF.
4.5 Model M3
The model M3 is an extension of M2.2 (Figure 4.3). It contains one
additional negative edge <ERK,SHC>, which simulates the negative
feedback from ERK to Grb2-Sos [8].
4.5.1 M3 behaviour
Our analysis shows that M3 allows both transient and sustained
ERK activity. The transient ERK activity results from the signal
transduction via SHC, and the sustained ERK activity from the signal
transduction via FRS2.
In addition, the FRS2 activity is also sustained. Therefore, the
model fits our data (Chapter 3.6), which show that sustained ERK and
FRS2 activity occur at the same time.
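The difference between M2.1 and M3 with respect to the data can be illustrated by comparing the steady states reachable in the two models. The sketch below assumes asynchronous updating and the initial state FGF = 1 with all other components at 0. It reads the rules of Figures 4.2 and 4.3 as Boolean functions, which is our interpretation. All names are hypothetical.

```python
def make_model(frs2_rule, shc_rule):
    """Update functions over states ordered as (FGF, FRS2, ERK, SHC)."""
    return (
        lambda s: s[0],               # FGF keeps its value (input)
        frs2_rule,                    # FRS2
        lambda s: int(s[1] or s[3]),  # ERK is activated by FRS2 or SHC
        shc_rule,                     # SHC
    )


# M2.1: ERK switches FRS2 off; SHC simply follows FGF.
M2_1 = make_model(lambda s: int(s[0] and not s[2]), lambda s: s[0])
# M3: FRS2 follows FGF; ERK switches SHC off.
M3 = make_model(lambda s: s[0], lambda s: int(s[0] and not s[2]))


def successors(model, state):
    """Asynchronous successors: one component moves to its target value."""
    out = []
    for i, rule in enumerate(model):
        value = rule(state)
        if value != state[i]:
            out.append(state[:i] + (value,) + state[i + 1:])
    return out


def reachable(model, start):
    seen, stack = {start}, [start]
    while stack:
        for nxt in successors(model, stack.pop()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def steady_states(model, start):
    return sorted(s for s in reachable(model, start) if not successors(model, s))


def has_cycle(model, start):
    """Return True if some cycle is reachable from `start` (recursive DFS)."""
    on_path, done = set(), set()

    def visit(state):
        on_path.add(state)
        for nxt in successors(model, state):
            if nxt in on_path:
                return True
            if nxt not in done and visit(nxt):
                return True
        on_path.remove(state)
        done.add(state)
        return False

    return visit(start)


start = (1, 0, 0, 0)
print(steady_states(M2_1, start), has_cycle(M2_1, start))
print(steady_states(M3, start), has_cycle(M3, start))
```

Under these assumptions both models contain a reachable cycle, but the only reachable steady state of M2.1 is 1011 (ERK active, FRS2 inactive), whereas the only reachable steady state of M3 is 1110 (ERK and FRS2 both active).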
4.6 Model M4
The model M4 extends the model M3. It contains three new components:
Gab1, the complex Grb2-Sos (here GS), and the complex
Shp2-Grb2-Sos (here Shp2-GS). Gab1 binds GS and Shp2-GS, and SHC
binds GS [17]. FRS2 forms a complex with both GS and Shp2-GS.
However, according to the literature [18], the complex FRS2-GS does not
affect the activation of ERK. Therefore, it is not modelled here.
Figure 4.3: (a) The regulatory graph of the model M3. (b) The regulatory
logic of M3. Rules which are not mentioned are the same as
in M2.2. (c) The state transition graph of the model M3. The components
of a single state correspond to FGF, FRS2, ERK and SHC, in
this order.

(b) M3
$K_{FRS2,\{FGF,ERK\}} = 1$
$K_{SHC,\emptyset} = 0$
$K_{SHC,\{FGF\}} = 1$
$K_{SHC,\{ERK\}} = 0$
$K_{SHC,\{FGF,ERK\}} = 0$
The state transition graph of the model M4 has 64 nodes; therefore,
it is not shown here. We have analysed the properties of M4 using NuSMV.
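Although the graph is too large to draw, it is small enough to enumerate. The following sketch reads the rules of Figure 4.4 as Boolean functions (our interpretation), keeps FGF at 1 (our assumption, which yields 64 states over the remaining six components), and lists the steady states under asynchronous updating. The names are hypothetical, and the thesis itself analyses M4 with NuSMV, not with this code.

```python
from itertools import product

NAMES = ("FGF", "FRS2", "SHC", "Gab1", "GS", "Shp2GS", "ERK")


def targets(s):
    """Target values of all components of M4 in state `s`."""
    fgf, frs2, shc, gab1, gs, shp2gs, erk = s
    return (
        fgf,                             # FGF is an input
        fgf,                             # FRS2: FGF wins over the ERK feedback
        fgf,                             # SHC
        fgf,                             # Gab1
        int((shc or gab1) and not erk),  # GS is inhibited by ERK
        int(frs2 or gab1),               # Shp2-GS is insensitive to ERK
        int(gs or shp2gs),               # ERK
    )


def successors(s):
    """Asynchronous successors: one component moves to its target value."""
    goal = targets(s)
    return [s[:i] + (goal[i],) + s[i + 1:] for i in range(len(s)) if s[i] != goal[i]]


states = [(1,) + rest for rest in product((0, 1), repeat=6)]
steady = [s for s in states if not successors(s)]

print(len(states))                           # 64
print([dict(zip(NAMES, s)) for s in steady])
```

With this reading of the rules there is a single steady state in which FRS2, SHC, Gab1, Shp2-GS and ERK are active and GS is inactive, which is consistent with the sustained ERK and FRS2 activity listed for M4 in Table 4.1.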
4.7 Model M5
The model M5 is defined in Figure 4.5. Compared to M4, the
model M5 contains four additional components, FGFR, Ras, Raf and
MEK, which are known to be part of the FGF pathway. Therefore, the
model M5 is less abstract and more realistic than M4.
As initial states, we choose all states where FGF and FGFR are 1 because
we want to mimic the dynamics of a cell with persistently active
receptors.
The additional components do not change the properties of M4
analysed in this thesis. However, they might be important for future
extensions of the model and for the analysis of other properties of the FGF
pathway.
Figure 4.4: (a) The regulatory graph of the model M4. (b) The regulatory
logic of M4.

(b) M4
$K_{g_i,\emptyset} = 0$
$K_{FGF,\{FGF\}} = 1$
$K_{SHC,\{FGF\}} = 1$
$K_{Gab1,\{FGF\}} = 1$
$K_{FRS2,\{FGF\}} = 1$
$K_{FRS2,\{FGF,ERK\}} = 1$
$K_{FRS2,\{ERK\}} = 0$
$K_{GS,\{SHC\}} = 1$
$K_{GS,\{Gab1\}} = 1$
$K_{GS,\{SHC,Gab1\}} = 1$
$K_{GS,\{ERK,Gab1\}} = 0$
$K_{GS,\{ERK,SHC\}} = 0$
$K_{GS,\{ERK,SHC,Gab1\}} = 0$
$K_{GS,\{ERK\}} = 0$
$K_{Shp2\text{-}GS,\{FRS2\}} = 1$
$K_{Shp2\text{-}GS,\{Gab1\}} = 1$
$K_{Shp2\text{-}GS,\{FRS2,Gab1\}} = 1$
$K_{Shp2\text{-}GS,\{ERK,FRS2\}} = 1$
$K_{Shp2\text{-}GS,\{ERK,Gab1\}} = 1$
$K_{Shp2\text{-}GS,\{ERK,FRS2,Gab1\}} = 1$
$K_{Shp2\text{-}GS,\{ERK\}} = 0$
$K_{ERK,\{GS\}} = 1$
$K_{ERK,\{Shp2\text{-}GS\}} = 1$
$K_{ERK,\{GS,Shp2\text{-}GS\}} = 1$
Figure 4.5: (a) The regulatory graph of the model M5. (b) The regulatory
logic of M5. Rules which are not mentioned are the same as in
model M4.

(b) M5
$K_{g_i,\emptyset} = 0$
$K_{FGFR,\{FGF\}} = 1$
$K_{SHC,\{FGFR\}} = 1$
$K_{Gab1,\{FGFR\}} = 1$
$K_{FRS2,\{FGFR\}} = 1$
$K_{FRS2,\{FGFR,ERK\}} = 1$
$K_{FRS2,\{ERK\}} = 0$
$K_{Ras,\{GS\}} = 1$
$K_{Ras,\{Shp2\text{-}GS\}} = 1$
$K_{Ras,\{GS,Shp2\text{-}GS\}} = 1$
$K_{Raf,\{Ras\}} = 1$
$K_{MEK,\{Raf\}} = 1$
$K_{ERK,\{MEK\}} = 1$
4.8 Summary of analysis
Table 4.1 summarises the analysed properties of the models.
Model M1 is a basic regulatory unit which consists of three nodes.
It allows either sustained or transient behaviour, depending on the setting
of the logical rules on the node FRS2, which has one incoming positive
and one incoming negative interaction. M1 cannot allow both types of behaviour.
In the model M2 we have added the node SHC,
which provides an alternative signal flow. As a result, the model
M2.1 allows both types of behaviour. However, there is no run in
M2.1 that satisfies the property of sustained ERK and FRS2.
Therefore, the model M2.1 does not fit our data, which show that sustained
ERK activity occurs together with sustained FRS2 activity.
The model M3 is the simplest of our models which allows both transient
and sustained ERK activity and which contains a run
that satisfies the property of sustained ERK and FRS2. The models M4
and M5 extend M3 based on the literature. They have the same
properties as M3. However, the models should be validated with
respect to other properties beyond our analysis.
Model  ERK activity             FRS2 activity  Sustained ERK and FRS2
M1.1   transient                transient      no
M1.2   sustained                sustained      yes
M2.1   transient and sustained  transient      no
M2.2   sustained                sustained      yes
M3     transient and sustained  sustained      yes
M4     transient and sustained  sustained      yes
M5     transient and sustained  sustained      yes

Table 4.1: Properties of the models.
5 Discussion
Using GINsim, we have built seven boolean models of the FGF pathway
based on the literature and unpublished western blot data. The models
should be validated by additional in vitro experiments. We have
expressed the properties of transient and sustained behaviour of the
pathway as LTL formulae and analysed the models with an implementation
of DFS and with the NuSMV model checker. The DFS program
was useful for clearly visualizing the model behaviour. However,
it does not scale, so we used it only for the analysis of the less complex models.
The models M4 and M5 have been analysed only by NuSMV.
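The idea behind such a DFS visualisation can be sketched as follows. This is not the NPF program attached to the thesis: the adjacency-dictionary input, the function name `all_paths` and the rule for closing a path at a revisited node are assumptions made for illustration, and no gnuplot output is produced.

```python
def all_paths(graph, start):
    """Enumerate paths from `start` by depth-first search.

    A path ends at a node without outgoing edges, or as soon as it would
    revisit a node already on the path (the repeated node is appended once
    so that the cycle is visible).
    """
    paths = []

    def visit(node, path):
        nexts = graph.get(node, [])
        if not nexts:
            paths.append(path)
            return
        for nxt in nexts:
            if nxt in path:
                paths.append(path + [nxt])
            else:
                visit(nxt, path + [nxt])

    visit(start, [start])
    return paths


# Toy state transition graph of M1.1 with FGF = 1 (states as FGF, FRS2, ERK).
stg = {"100": ["110"], "110": ["111"], "111": ["101"], "101": ["100"]}
print(all_paths(stg, "100"))  # [['100', '110', '111', '101', '100']]
```

The number of such paths can grow exponentially with the number of states, which is one way to see why an approach of this kind does not scale to the larger models.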
Even though boolean modelling is a highly abstract formalism,
it is suitable for the analysis of the FGF pathway because the exact interactions
in the pathway are unclear. Modelling this system with a more
expressive formalism, e.g. ODEs, would require an enormous number
of parameters based only on estimation. Using boolean models,
we avoid this problem and clarify the basic questions about the
behaviour of the FGF pathway without knowledge of the biochemical dynamics
underlying the pathway. The models we have built can
serve as a scaffold for creating more expressive ODE models.
Our analysis implies that sustained behaviour is established
by reaching a terminal state without any outgoing edge, and that
transient behaviour is established by reaching a cycle of states in which
the pathway is periodically switched on and off. The increasing complexity
of our models allowed us to identify the regulatory rules
which model the negative feedback as a key element affecting the model
properties. By changing these regulatory rules, we can induce a switch
between transient and sustained behaviour.
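For illustration only, the two kinds of behaviour correspond to two standard LTL patterns. The formulae below are not necessarily those used in our analysis. The atomic proposition $ERK$ (ERK is active) and the names of the formulae are shorthand introduced here.

```latex
\begin{align*}
\varphi_{\mathrm{sustained}}   &= \mathbf{F}\,\mathbf{G}\; ERK \\
\varphi_{\mathrm{oscillating}} &= \mathbf{G}\,\mathbf{F}\; ERK \;\wedge\; \mathbf{G}\,\mathbf{F}\; \neg ERK
\end{align*}
```

A run ending in a terminal state with active ERK satisfies the first formula, and a run trapped in a cycle that switches ERK on and off satisfies the second. Since an LTL specification in NuSMV must hold on all runs, the existence of a run satisfying a formula $\varphi$ is typically shown by checking $\neg\varphi$ and inspecting the reported counterexample.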
Based on our analysis, we suggest two hypotheses about the mechanism
underlying the switch between transient and sustained behaviour
of the FGF signalling pathway.
• A single cell is capable of both transient and sustained behaviour
in response to FGF. The sustained behaviour is maintained by
signalling via FRS2 and the transient behaviour is maintained
by signalling via another adaptor, e.g. SHC or an unknown adaptor.
We can model this case as a single model with two adaptor
proteins which differ in their regulatory rules, as in the case of
the model M3.
• A single cell has only one behaviour profile in response to FGF,
either transient or sustained. The profile depends on the cell type.
Specialized cells differ in the proteins which they produce. Differences
in protein concentration that cross a threshold result
in a change of the regulation of signal transduction. We can
model this case as two models with the same topology and
different settings of the regulatory rules, as in the case of the
models M1.1 and M1.2.
6 Conclusion
In this thesis, we have studied the behaviour of the FGF signalling pathway
using boolean networks. In Chapter 2 we gave a
brief overview of the FGF pathway. It documents the biological background
on the basis of which we have implemented seven models in GINsim,
a software tool for modelling and simulation of boolean networks.
The activity of the FGF pathway can have either a transient or a sustained
profile. The molecular mechanism responsible for the differences
in the pathway activity is unknown. We have expressed these two
properties formally in LTL. Using the NuSMV model checker, we have
analysed our models with respect to these properties. To visualize
the model dynamics comprehensively, we have implemented a DFS algorithm which
takes as input a model simulation exported from GINsim and an
initial node and returns all paths reachable from the initial node, depicted
by gnuplot.
The analysis of our models implies that the sustained profile is established
by reaching a stable state, i.e. a terminal state without any
outgoing edge. The transient profile is caused by reaching a cyclic attractor,
i.e. a cycle of states in which the pathway is periodically switched
on and off. The switch between the transient and the sustained profile can
be induced by a change of the regulatory rules which model the negative
feedback.
Based on our results, we suggest two hypotheses about the mechanism
responsible for the differences in the pathway activity. Firstly, for
a single cell to show both profiles, there have to be two or more signalling
proteins, each maintaining a separate signal flow which
is differently affected by the negative feedback. Secondly, if a specialized
cell type has only one profile, either transient or sustained,
then all of the separate signal flows are affected to the same degree by
the negative feedback.
The models we have built can serve as a scaffold for more complex
models in future analyses. The models are suitable for creating
hypotheses about the mechanisms of signalling pathways in genetic
diseases, as we are currently doing for the signal transduction
in achondroplasia, the most common form of human dwarfism.
Bibliography
[1] I. Ahmad, T. Iwata, and H. Y. Leung. Mechanisms of FGFR-mediated
carcinogenesis. Biochimica et Biophysica Acta (BBA) - Molecular
Cell Research, 1823(4):850–860, 2012.
[2] T. Araki, H. Nawa, and B. G. Neel. Tyrosyl phosphorylation
of Shp2 is required for normal ERK activation in response to
some, but not all, growth factors. Journal of Biological Chemistry,
278(43):41677–84, 2003.
[3] J. Barnat, L. Brim, S. Cerna, S. Drazan, J. Fabrikova, J. Lanik, and
D. Safranek. BioDiVinE: A Framework for Parallel Analysis of
Biological Models. Electronic Proceedings in Theoretical Computer
Science, 15, 2009.
[4] J. Barnat, L. Brim, A. Krejci, A. Streck, D. Safranek, M. Vejnar,
and T. Vejpustek. On parameter synthesis by parallel model
checking. IEEE/ACM Transactions on Computational Biology
and Bioinformatics, 9(3):693–705, 2012.
[5] T. Bondeva, A. Balla, P. Várnai, and T. Balla. Structural determinants
of Ras-Raf interaction analyzed in live cells. Molecular
Biology of the Cell, pages 2323–2333.
[6] A. Cimatti, E. M. Clarke, E. Giunchiglia, F. Giunchiglia, M. Pistore,
M. Roveri, R. Sebastiani, and A. Tacchella. NuSMV 2: An
OpenSource tool for symbolic model checking. In Proceedings of
the 14th International Conference on Computer
Aided Verification (CAV '02), 2002.
[7] E. M. Clarke, O. Grumberg, and D. Peled. Model Checking.
1999.
[8] T. Corbalan-Garcia, S. Yang, K. R. Degenhardt, and D. Bar-Sagi.
Identification of the Mitogen-Activated Protein Kinase Phosphorylation
Sites on Human Sos1 That Regulate Interaction
with Grb2. Molecular and Cellular Biology, 16(10):5674–5682,
1996.
[9] L. Erzsébet, S. Welti, and K. Scheffzek. Inhibition and Termination
of Physiological Responses by GTPase Activating Proteins.
Physiological Reviews, 92(1):237–272, 2012.
[10] V. P. Eswarakumar, I. Lax, and J. Schlessinger. Cellular signaling
by ﬁbroblast growth factor receptors. Cytokine & Growth Factor
Reviews, 16(2):139–149, 2005.
[11] S. Foldynova-Trantirkova, W. R. Wilcox, and P. Krejci. Sixteen
years and counting: the current understanding of ﬁbroblast
growth factor receptor 3 (FGFR3) signaling in skeletal dysplasias.
Human Mutation, 33:29–41, 2012.
[12] L. Grieco, L. Calzone, I. Bernard-Pierrot, F. Radvanyi, B. Kahn-Perles,
and D. Thieffry. Boolean modeling of biological regulatory
networks: A methodology tutorial. Methods, 62(1), 2013.
[13] L. Grieco, L. Calzone, I. Bernard-Pierrot, F. Radvanyi, B. Kahn-Perles,
and D. Thieffry. Integrative modelling of the influence
of MAPK network on cancer cell fate decision. PLOS Computational
Biology, 9(10), 2013.
[14] O. A. Ibrahimi, B. K. Yeh, A. V. Eliseenkova, F. Zhang,
S. K. Olsen, M. Igarashi, S. A. Aaronson, R. J. Linhardt, and M. Mohammadi.
Analysis of mutations in fibroblast growth factor
(FGF) and a pathogenic mutation in FGF receptor (FGFR) provides
direct evidence for the symmetric two-end model for
FGFR dimerization. Mol Cell Biol, 25:671–684.
[15] H. Klarner, A. Streck, D. Safranek, J. Kolcak, and H. Siebert.
Parameter Identification and Model Ranking of Thomas Networks.
Computational Methods in Systems Biology, 2012.
[16] P. Krejci, V. Bryja, J. Pachernik, A. Hampl, R. Pogue, P. Mikikian,
and R. Wilcox. FGF2 inhibits proliferation and alters the
cartilage-like phenotype of RCS cells. Experimental Cell Research,
297:152–164, 2004.
[17] P. Krejci, B. Masri, L. Salazar, C. Farrington-Rock, H. Prats, L. M.
Thompson, and W. R. Wilcox. Bisindolylmaleimide I Suppresses
Fibroblast Growth Factor-mediated Activation of Erk MAP Kinase
in Chondrocytes by Preventing Shp2 Association with the
Frs2 and Gab1 Adaptor Proteins. Journal of Biological Chemistry,
282(5):2929–36, 2007.
[18] I. Lax, A. Wong, B. Lamothe, A. Lee, A. Frost, J. Hawes, and
J. Schlessinger. The Docking Protein FRS2 α Controls a MAP
Kinase-Mediated Negative Feedback Mechanism for Signaling
by FGF Receptors. Molecular Cell, 10:709–719, 2002.
[19] B. J. Mayer and D. Baltimore. Signaling through SH2 and SH3
domains. Trends in Cell Biology, 3:8–13.
[20] M. Mohammadi, S. K. Olsen, and O. A. Ibrahimi. Structural
basis for ﬁbroblast growth factor receptor activation. Cytokine
& Growth Factor Reviews, 16(2):107–37, 2005.
[21] P. Monteiro and C. Chaouiya. Efficient verification for logical
models of regulatory networks. Proc. 6th Intl. Conf. on Practical
Applications of Computational Biology & Bioinformatics
PACBB'12 (Salamanca, Spain), 154:259–267, 2012.
[22] A. Naldi, D. Berenguier, A. Faure, F. Lopez, D. Thieffry, and
C. Chaouiya. Logical modelling of regulatory networks with
GINsim 2.3. Biosystems, 97(2):134–9, 2009.
[23] A. D. Sharrocks. Cell Cycle: Sustained ERK Signalling Represses
the Inhibitors. Current Biology, 16:540–542, 2006.
[24] D. Thieffry and R. Thomas. Dynamical Behavior of Biological
Regulatory Networks. Bulletin of Mathematical Biology,
57(2):277–97, 1995.
[25] K. Thobe, A. Streck, H. Klarner, and H. Siebert. Model Integration
and Crosstalk Analysis of Logical Regulatory Networks.
Computational Methods in Systems Biology. Lecture Notes in
Computer Science, 8859:32–44, 2014.
[26] R. Thomas, D. Thieffry, and M. Kaufman. Dynamical behaviour
of biological regulatory networks–I. Biological role of feedback
loops and practical use of the concept of the loop-characteristic
state. Bulletin of Mathematical Biology, 57(2), 1995.
[27] S. Yamada, T. Taketomi, and A. Yoshimura. Model analysis of
difference between EGF pathway and FGF pathway. Biochemical
and Biophysical Research Communications, 314:1113–1120,
2004.
A Attached ﬁles
The following directories are attached to this thesis.
• NPF contains an implementation of the DFS algorithm.
• Ginsim contains the models implemented in GINsim.
• layout contains the state transition graphs exported from GINsim,
which can be used as input for NPF.exe.
• Nusmv contains the models in the smv format.
B Signalling pathways
The following text summarises the basic principles of signalling pathways.
It is intended for educational purposes. It was originally written in Czech.
Its chapters are numbered separately from the rest of the thesis.
1 Signalling pathways
This educational text gives a general overview of the principles of signalling pathways.
1.1 Cell communication
Cell communication in multicellular animals maintains a stable internal environment
of the organism and controls its growth and development. Cells emit a chemical signal
into their surroundings, where other cells detect it by means of receptors, specialized
proteins whose binding sites are specifically adapted to a particular signal. Signals are
very diverse. They include, for example, proteins, small peptides, amino acids,
nucleotides, steroids, retinoids, fatty acid derivatives and even the gases nitric oxide
and carbon monoxide (Alberts et al., 2008).
1.2 Receptors and signal reception
The interface for communication between the cell and its surroundings is the cytoplasmic
membrane. It consists mainly of phospholipids, which makes it impermeable to large or
hydrophilic molecules (Figure 1a) (Alberts et al., 2008). Most signals therefore cannot
pass through the cytoplasmic membrane and are received by cell-surface receptors.
A cell-surface receptor consists of an outer part, which binds signalling molecules,
a transmembrane part, which anchors the receptor in the membrane, and an inner part,
which transmits the signal in the cytoplasm. Binding of the signal changes the spatial
conformation of the receptor, and in this way the message is transferred into the cell.
In contrast, small hydrophobic molecules enter the cytoplasm or even the cell nucleus,
where they bind to intracellular receptors and activate them (Figure 1b). The active
receptors subsequently induce changes in gene transcription. Steroid and thyroid
hormones, for example, have intracellular receptors.
There are three main types of cell-surface receptors: ion-channel-coupled receptors,
G-protein-coupled receptors and enzyme-coupled receptors. After binding the signal,
ion-channel-coupled receptors allow ions to pass through the plasma membrane
(Figure 3a). The result is a rapid change in the distribution of the total charge around
the membrane. These receptors transmit information at nerve synapses.
G-protein-coupled receptors are the largest family of receptors. Inside the cell, the
receptor interacts with a G protein, which in its inactive state consists of three subunits
and binds GDP. In its active state, the G protein binds GTP and dissociates into two
parts, which activate further components of the signalling pathway. Signalling via the
G protein proceeds cyclically, because after activation the G protein deactivates itself
(Figure 2).
Enzyme-coupled receptors contain a part oriented into the cytoplasm which has
enzymatic activity. Most of these receptors are receptor tyrosine kinases (RTK),
whose enzymatic activity is the ability to catalyse the attachment of a phosphate group
to other proteins in the signalling pathway (see Section 1.3.1) (Robinson et al., 2000).
RTKs are the receptors of some growth factors, e.g. EGF and FGF (Epidermal/Fibroblast
Growth Factor). It is typical of them that they form dimers upon binding the signal
(Figure 3b) (Yarden and Ullrich, 1988).

Figure 1: Reproduced from (Alberts et al., 2008). (a) Hydrophilic molecules are not able
to pass through the cytoplasmic membrane into the cell. They are therefore detected by
cell-surface receptors anchored in the cytoplasmic membrane. After binding the signal,
the receptor changes its conformation, by which the signal is transferred into the
cytoplasm. (b) In contrast, hydrophobic molecules pass through the cytoplasmic membrane
and bind to intracellular receptors, which they thereby activate. The activated
intracellular receptors regulate gene transcription in the nucleus.
1.3 Signal transmission and processing
In the cytoplasm, the signal is transmitted and processed mainly by signalling proteins.
These proteins amplify the signal, integrate it with signals from other pathways,
distribute it into several parallel flows and carry it to an effector protein, which
triggers a specific cellular response (Figure 4a). Effectors are transcription factors
controlling the transcription of genes into mRNA, or proteins of metabolism or of the
cytoskeleton (Alberts et al., 2008).
The information spreading along signalling pathways is, in addition, regulated by
negative and positive feedbacks. Negative feedbacks ensure that an active pathway is
switched off. Without attenuation of the pathway, the cell would lose its sensitivity and
its ability to respond to a newly arriving signal. The inability to attenuate some
signalling pathways can lead to uncontrolled cell division and to cancer.
Information spreads by inducing changes in signalling proteins, most often in their
conformation. These changes activate a particular function of the protein, for example
the ability to phosphorylate other proteins. The changes can also affect the range of
proteins which the signalling protein is able to bind and form a complex with.
The following subsections describe the most common mechanisms used to induce changes
in the proteins of signalling pathways. They are phosphorylation/dephosphorylation,
GTP/GDP binding, binding of other signalling proteins, binding of second messengers
(cAMP, Ca²⁺, diacylglycerol) and ubiquitination.

Figure 2: The cycle of signalling via a G protein. Reproduced from (Khazifov and
Lattazi, 2009). The receptor interacts with a G protein composed of the subunits α, β
and γ. In the inactive state, all three subunits form a complex and the α subunit binds
a molecule of GDP. The change in receptor conformation after detection of the signal
induces a conformational change in the G protein. The G protein releases GDP, binds GTP
and dissociates into two parts, the subunit α-GTP and β-γ. Both parts diffuse away from
the receptor and activate further components of the signalling pathway. GTP
spontaneously hydrolyses to GDP and the G protein is deactivated again by the
reassociation of all three subunits.

Figure 3: Reproduced from (Alberts et al., 2008). (a) Ion-channel-coupled receptors.
(b) Receptors coupled with enzymatic (kinase) activity.

Figure 4: The principle of a signalling pathway. (a) Reproduced from (Alberts et al.,
2008). An external signal is transferred via a cell-surface receptor into the cytoplasm,
where it is propagated by signalling proteins and distributed to effector proteins. The
output of the signalling pathway is a specific cellular response, e.g. a change in
metabolism, gene expression or cell shape. (b) Reproduced from (Berg et al., 2002).
A signal from the environment, e.g. a growth factor or a hormone, is detected most often
by a cell-surface receptor. Inside the cell it is amplified and converted into other
chemical forms. The pathway can be regulated by both positive and negative feedback
reactions.
1.3.1 Phosphorylation and dephosphorylation of proteins
A common way of transmitting information is the phosphorylation and dephosphorylation
of the proteins of a signalling pathway. Phosphorylation is mediated by enzymes called
kinases, which transfer a phosphate group from ATP to an amino acid in a protein. The
most frequently phosphorylated amino acids are serine, threonine and tyrosine (Hanks and
Hunter, 1995). According to the amino acid which they modify, kinases can be divided
into two groups, serine/threonine kinases and tyrosine kinases. The antagonists of
kinases are phosphatases, which remove the phosphate group from the protein.
Phosphorylation and dephosphorylation take place on a timescale of milliseconds and
thus allow rapid activation and attenuation of a signalling pathway. Phosphorylation
most often activates the function of a protein, as shown in Figure 5a; for example, it
converts inactive kinases into active ones. However, it can also inhibit the function of
a protein. Phosphorylation often occurs at several different sites of a given protein,
where one phosphorylation can be activating and another inhibitory.
Whole cascades of phosphorylation reactions commonly occur in signalling pathways: the
first kinase is activated by phosphorylation and subsequently phosphorylates the second
kinase in the cascade, which phosphorylates the third kinase, and so on. A very
well-known example is the MAPK phosphorylation cascade (Kyriakis, 2014).

Figure 5: Reproduced from (Alberts et al., 2008). Signalling proteins act as molecular
switches between an active and an inactive state (ON/OFF) by means of (a) attachment or
removal of a phosphate group or (b) binding and hydrolysis of GTP. The hydrolysis of GTP
and the binding of GTP to the G protein occur spontaneously, but can be accelerated by
the proteins GEF and GAP.
1.3.2 G proteins in signalling pathways
G proteins switch between the active and the inactive conformation by binding GTP/GDP
(Figure 5b). GEF proteins (guanine nucleotide-exchange factors) promote the release of
GDP from the protein (Erzsébet et al., 2012). GTP then binds spontaneously, because it
is present in excess in the cytoplasm. The rate of hydrolysis of GTP to GDP is
influenced by GAP proteins (GTPase-accelerating proteins).
In Section 1.2 we introduced the large G proteins composed of three subunits, which
interact with cell receptors and dissociate into two parts. In addition, there is
a second group of G proteins. These are small proteins formed by only one subunit, which
is similar to the α subunit of the large G proteins. This group of G proteins includes
the Ras protein family, which activates the MAPK signalling cascade.
1.3.3 Second messengers and signal amplification

Figure 6: Reproduced from (Ciechanover, 2005). The proteasome is a cylindrical molecular
complex which recognizes polyubiquitinated proteins and cleaves them into short peptides.

The concentration of receptors in the membrane is insufficient to trigger a cellular
response. The signal must therefore be amplified. For this purpose, signalling pathways
often use small intracellular mediators called second messengers. They are produced in
large amounts after receptor activation, diffuse away from the site of their origin and
spread the signal further into the cytoplasm. Second messengers include cyclic AMP and
Ca²⁺. The second messenger diacylglycerol is lipid-soluble and diffuses within the
membrane. Second messengers activate some enzymes, e.g. protein kinases. Phosphorylation
cascades also contribute to signal amplification.
As a consequence of signal amplification, even a low ligand concentration triggers
a significant response inside the cell, which increases the ability of the cell to
respond to small changes in the environment.
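The amplification described above is multiplicative, which a rough arithmetic sketch can make concrete. The per-stage gains below are purely hypothetical numbers chosen for illustration, the function name is ours, and saturation and deactivation are ignored.

```python
from math import prod


def amplified_molecules(active_receptors, stage_gains):
    """Activated molecules at the end of a cascade, ignoring saturation.

    `stage_gains` lists how many downstream molecules one active molecule
    activates at each stage (hypothetical values).
    """
    return active_receptors * prod(stage_gains)


# 10 active receptors, three stages with gains 100, 10 and 10.
print(amplified_molecules(10, [100, 10, 10]))  # 100000
```

Even modest gains at each stage thus turn a handful of activated receptors into a large number of activated effector molecules.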
1.3.4 Ubiquitination
Ubiquitination, like phosphorylation, is a dynamic, reversible protein modification
(Chen and Sun, 2009). During this modification, ubiquitin, a short protein of about
70 amino acids, is attached to the target protein. Ubiquitin contains 7 amino acids to
which further ubiquitin molecules can be attached, forming chains linked in various
ways. Different types of polyubiquitination have different meanings, because they are
specifically recognized by different proteins. The best known is polyubiquitination at
lysine 48, which marks the protein for degradation in the proteasome (Figure 6).
Phosphorylation, like ubiquitination, enables interactions between proteins. The
modification of a protein serves as a tag which is specifically recognized and bound by
other proteins, leading to the formation of protein complexes which propagate the
stimulus further along the signalling pathway. Tyrosine phosphorylation is typically
recognized by the SH2 domain and ubiquitination by the UIM domain (Figure 7). Proteins
often contain several domains recognizing different tags, which increases the
combinatorial space for complex formation.
1.3.5 Protein complexes in signalling pathways
Interactions between the proteins of signalling pathways are often enabled by mediators
which keep the signalling proteins close to each other and in the correct orientation
(Figure 8). The result is precise and efficient control of the direction of information
flow in the network of signalling pathways.
There are several terms for these proteins which are commonly confused and used
synonymously. They are adaptor, scaffold, anchor and docking proteins (Buday and Tompla,
2010). Adaptors usually denote small proteins which bring together two interaction
partners, e.g. a protein kinase and its substrate. A scaffold (or anchor) protein is
larger than an adaptor and is able to bind more than two proteins of a signalling
pathway. Docking proteins also interact with more than two partners, but in addition
they are anchored in the cytoplasmic membrane.

Figure 7: Comparison of phosphorylation and ubiquitination. Reproduced from (Woelk et
al., 2007). Ubiquitination is catalysed by the ubiquitin ligases E1, E2 and E3. The
removal of ubiquitin is catalysed by enzymes called DUBs. The phosphate group is
recognized by interacting proteins by means of their SH2 domain, whereas ubiquitin is
recognized by proteins containing the UIM domain. Both modifications thus serve as
a glue which holds protein complexes together.

Figure 8: The mediators in the formation of protein complexes are adaptor,
scaffold/anchor and docking proteins. Arrows indicate mutual interactions and
modifications. Reproduced from (Buday and Tompla, 2010).
An important group of adaptors are the adaptors composed of SH2/SH3 domains, e.g. the
proteins Grb2, Shc and Crk (Buday, 1999). The protein Grb2 contains one SH2 domain,
which recognizes phosphorylated tyrosine on activated protein kinase receptors of growth
factors (e.g. EGFR, FGFR) or on docking proteins (Gab1, FRS2). Grb2 also contains two
SH3 domains, which interact with a proline-rich motif in the protein Sos. Sos is a GEF
protein which activates Ras.

Figure 9: An example of cell behaviour depending on the combination of input signals.
Reproduced from (Alberts et al., 2008). The cell can respond by growth and division or
by differentiation into another cell type. If the cell receives no survival signals,
programmed cell death (apoptosis) occurs.
1.4 Cellular response
Typically, the cells of a multicellular organism are at any given moment exposed to
hundreds of different signals from the environment, which can act on them in
a stimulatory or an inhibitory manner. For a given combination and concentration of
input signals acting on a particular cell, the output is a specific cellular response
(Figure 9). The cellular response depends on the range of receptors which the cell
produces. Cells of different tissues differ in their receptor repertoire and
consequently in their ability to respond to external stimuli.
Cells with the same receptor can respond in different ways because they process the
signal differently. The input signal itself carries very little information about the
cellular response which it will evoke.
A relatively fast cellular response (on the order of seconds to minutes) occurs when
the final effector of the signalling pathway is a metabolic enzyme or a protein of the
cytoskeleton (Alberts et al., 2008). Signalling pathways acting in this way only change
the function of proteins already existing in the cell. A slower response (on the order
of minutes to hours) occurs in signalling pathways which change gene expression. The
signal is transferred to a transcription factor, which passes into the cell nucleus,
where it changes gene expression.
2 References
Alberts B., Johnson A., Lewis J., Raff M., Roberts K., Walter P. 2008. Molecular Biology
of the Cell. Garland Science. 5th edition, chapter 15.
Berg J. M., Tymoczko J., Stryer L. 2002. Biochemistry. W H Freeman. 5th edition,
chapter 15.
Buday L. 1999. Membrane-targeting of signalling molecules by SH2/SH3 domain containing
adaptor proteins. BBA. 1422(2): 187–204.
Buday L., Tompla P. 2010. Functional classification of scaffold proteins and related
molecules. FEBS J. 277: 4348–4355.
Chen Z. J., Sun L. J. 2009. Nonproteolytic Functions of Ubiquitin in Cell Signaling. Mol
Cell. 33(3): 275–86.
Ciechanover A. 2005. Proteolysis: from the lysosome to ubiquitin and the proteasome.
Nat Rev Mol Cell Biol. 6: 79–87.
Erzsébet L., Welti S., Scheffzek K. 2012. Inhibition and Termination of Physiological
Responses by GTPase Activating Proteins. Physiol Rev. 92(1): 237–272.
Hanks S. K., Hunter T. 1995. In the Beginning, There Was Protein Phosphorylation.
JBC. 9(8): 576–596.
Khazifov K., Lattazi G. 2009. G protein inactive and active forms investigated by
simulation methods. Proteins. 75(4): 919–930.
Kyriakis J. M. 2014. In the Beginning, There Was Protein Phosphorylation. JBC. 289(14):
9460–9462.
Reece J., Campbell N. 2002. Biology. Benjamin Cummings. 5th edition.
Robinson D. R., Wu Y., Lin S. 2000. The protein tyrosine kinase family of the human
genome. Oncogene. 19(49): 5548–5557.
Woelk T., Sigismund S., Penengo L., Polo S. 2007. The ubiquitination code: a signalling
problem. Cell Division. 2(11): doi:10.1186/1747–1028–2–11.
Yarden Y., Ullrich A. 1988. Growth factor receptor tyrosine kinases. Annu Rev Biochem.
57: 443–478.