Topics in Current Genetics, Vol. 13 L. Alberghina, H.V. Westerhoff (Eds.): Systems Biology DOI 10.1007/b137746 / Published online: 12 May 2005 Springer-Verlag Berlin Heidelberg 2005 Systems biology of apoptosis Martin Bentele and Roland Eils Abstract New approaches are required for the mathematical modelling and system identification of complex signal transduction networks, which are characterized by a large number of unknown parameters and partially poorly understood mechanisms. Here, a new quantitative system identification method is described, which applies the novel concept of 'Sensitivity of Sensitivities' revealing two important system properties: high robustness and modular structures of the dependency between state variables and parameters. This is the key to reduce the system's dimensionality and to estimate unknown parameters on the basis of experimental data. The approach is applied to CD95-induced apoptosis, also called programmed cell death. Defects in the regulation of apoptosis result in a number of serious diseases such as cancer. With the estimated parameters, it becomes possible to reproduce the observed system behaviour and to predict important system properties. Thereby, a novel regulatory mechanism was revealed, i.e. a threshold between cell death and cell survival. 1 Systems biology: paradigm shift from reductionism to holism in biology? The whole is greater than the sum of its parts It is hard to believe that only 40 years ago Watson and Crick discovered the structure of DNA. Their work was based on a number of studies dating back to the work of Oswald Avery, Erwin Chagraff, and others that discovered the four letters A,C,G,T of the DNA and Chagraff's rule that the amount of nucleotides followed a simple relationship. Since the discovery of the DNA biology was mostly oriented towards a reductionist approach. Triggered by modern high-throughput sequencing technologies the code of life was deciphered letter-by-letter, word-byword. It is now well known that the early hope of the genomics age could not be met that knowledge of the genetic code would help to comprehensively understand complex biological processes from the function of genes to the pathological mechanisms of genetic diseases such as cancer. Thus, the mistrust in the worldwide sequencing projects by Chagraff, who ironically paved the way for the later discovery of the DNA, has been fulfilled: "Niemand hat uns je gelesen, niemand wird uns je lesen" (Nobody has ever read us, nobody will ever read us). After almost two decades of genome research the leitmotif of reductionism that life is just 350 Martin Bentele and Roland Eils chemistry and physics and that once we have found the smallest components of life we will be able to understand the whole is widely rejected. The obvious limitations of the reductionist's approach to biology has led to a revival of holistic approaches in biology. The approach to biology using the tools of systems engineering is not new (Bertalanffy 1973). In the first half of the last century, Jan C. Smuts laid the foundation for a holistic approach in natural sciences that has even inspired philosophers in their reflection of nature (Smuts 1926). However, success of a holistic approach has only become possible in the postgenomic era since the reductionists have provided information on the basic building blocks of many important cellular processes in a quantitative form. Accordingly, it is probably not correct to name the transition from reductionism to holism a paradigm shift in biology. Before the components of life were deciphered a holistic approach to biology was condemned to be restricted to a mystic holism that was prevalent at the beginning of the 20th century. With the help of the reductionism, systems biology has now become an analytically rigorous discipline that aims at a description of complex biological processes by mathematical models. Systems biology can, thus, be considered a new discipline in biology that puts the theoretical foundations of system level dissection of living matter into the context of modern high-throughput quantitative experimental data, mathematics and in silico simulations in order to gain a holistic view on the complex workings of life. In this chapter, we will describe a mathematical model of programmed cell death called apoptosis. Motivated by this fundamental process prevalent in almost all higher organisms we developed a variety of modern methods for systems biology that are required to tackle complex biochemical pathways involved in cellular signalling. 2 Modelling signal transduction networks A better understanding of signal transduction networks is one of the most challenging areas in systems biology. Cells show information processing by the biochemical interaction between molecules. Signals of external stimuli are, for example, passed into the nucleus to regulate gene expression, resulting in proliferation, mitosis (nuclear division), changes in metabolism or cell death (Alberts et al. 2002). Interactions like phosphorylation, exchange of smaller molecules, binding or cleavage, are the fundamental mechanisms, which form the signal transduction networks. Complexity arises from the huge number of different molecules and interactions between them. In eucaryotic cells, for example, the steadily growing number of known signalling molecule species is in the order of magnitude of 104 - 105 . Different methods of transcribing signal transduction networks into the language of mathematics exist. Dynamic pathway models are constructed using a diversity of mathematical and computational methods. Petri Nets (Reisig 1985) are well suited to describe the state transition process of distributed systems. In agentbased approaches and cellular automata (Wolfram 1994), macroscopic system Systems biology of apoptosis 351 properties emerge from the individual properties of the single entities, interacting with each other. Other methods originate from the analysis of biochemical systems ranging from the examination of steady states and flux modes to a large variety of control theories (Kell and Westerhoff 1986; Heinrich and Schuster 1996; Schilling et al. 1999). More recently, theoretical models for describing the signalling behaviour on system levels have been developed, using modular approaches (Kitano 2002; Csete and Doyle 2002; Lauffenburger 2000). Loosely speaking, simulation of signal transduction networks is either based on discrete models describing signalling as information processing, or on continuous models, where the information flux is modelled by a biochemical control system. In the latter approach, which goes back to the pioneering work of Garfinkel and Hess in the mid 60s (Garfinkel and Hess 1964; Garfinkel 1968), the reaction network is translated into a system of ordinary differential equations (Bhalla and Iyengar 1999; Mendes 1997). Today, there is a variety of sophisticated simulation methods to analyze complex biochemical reaction systems (Mendes 1993; Tomita et al. 1999; Sauro and Fell 1991). 3 CD95-induced apoptosis Programmed Cell Death, called apoptosis, is the natural and controlled death of cells, in which the cell and its nucleus shrinks, condenses, and fragments (Alberts et al. 2002; Evan and Littlewood 1998). Apoptosis is one of the most complex signalling pathways and an essential property of higher organisms. Defects in apoptosis result in a number of serious diseases such as cancer, autoimmunity, and neurodegeneration (Krammer 2000; Peter and Krammer 2003). To develop efficient therapies, fundamental questions about molecular mechanisms and regulation of apoptosis remain to be answered. Apoptosis is triggered by a number of factors, including UV-light, -radiation, chemotherapeutic drugs, growth factor withdrawal ('death by neglect'), and signalling from the death receptors (Nagata 1999; Ashkenazi and Dixit 1999). Apoptosis pathways can generally be divided into signalling via the death receptors at the membrane (extrinsic pathway) or the mitochondria (intrinsic pathway). Both pathways imply caspases as effector molecules (Salvesen 2002). Caspases belong to the family of proteases, which are enzymes that degrade proteins by hydrolyzing some of their peptide bonds (Thornberry and Lazebnik 1998). Caspases mostly exist in their inactive proforms (procaspases) and become active after getting cleaved. Various caspases are involved in both the initiation of the apoptotic process and the execution of the final apoptotic program. CD95-induced apoptosis is one of the best-studied apoptosis pathways. A detailed overview on this mechanism is for example given in Krammer (2000) and Danial and Korsmeyer (2004). 352 Martin Bentele and Roland Eils 3.1 The CD95-receptor and the DISC CD95 is a member of the death receptor family, a subfamily of the TNF-receptor superfamily (Nagata 1997). Crosslinking of the CD95-receptor either with its natural ligand CD95L or with agonistic antibodies such as anti-APO-1 induces apoptosis in sensitive cells. Upon CD95 stimulation with CD95L or anti-APO-1, the Death-Inducing Signalling Complex (DISC) is formed. The DISC consists of oligomerized CD95, the death domain-containing adaptor molecule FADD, procaspase-8, procaspase-10 and c-FLIP. The interactions between the molecules in the DISC are based on homophilic contacts. The death domain (DD) of CD95 interacts with the DD of FADD, while FADD interacts with procaspase-8 via the socalled death effector domain. Once the DISC is formed, procaspase-8 is autocatalytically cleaved: two procaspase-8 molecules bound at the DISC form the intermediate product p43/p41, followed by generation of an active caspase-8 complex p18/p10 (Lavrik et al. 2003). This process can be inhibited by c-FLIP, which binds to the DISC in various ways and blocks the latter mechanism (Krueger et al. 2001). 3.2 The caspase cascade After formation of active caspase-8, the apoptotic signalling cascade starts. Caspase-8 cleaves and activates caspase-3 and -7; caspase-3 itself activates caspase-6, which again activates caspase-8, thereby, establishing a self-amplifying activation loop. Caspase-3, -6, and -7 are involved in the execution of the death process, for example, the chromosomal degradation of DNA and, therefore, called executioner caspases, whereas the others, responsible for transferring the death signal, are referred to as initiator caspases. The DNA degradation plays an important role in the cell death process. It is started after ICAD gets cut off the CAD-ICAD complex by caspase-3 and -7, thereby, terminating the inhibition of CAD, which directly fragments the DNA (Nagata 1999). In parallel, PARP, a molecule, which repairs broken DNA strands is cleaved by executioner caspases as well. Once the DNA fragmentation process is triggered, a complete degradation of the cell starts irreversibly. 3.3 Type I versus type II cells and the regulation of apoptosis Two different CD95-signaling pathways are established in different cell types (Schmitz et al. 1999; Scaffidi et al. 1998). Type I cells are characterized by intensive DISC formation and mitochondria independent caspase-3 activation. In Type II cells, the formation of the DISC complex is reduced and the activation of caspase-3 occurs downstream of the mitochondria: the active form of caspase-8 cleaves Bid, followed by translocation of the cleavage product tBid to mitochondria, which results in the release of cytochrome-C. Subsequently, apoptosome, a complex consisting of Apaf-1, cytochrome-C and caspase-9, is formed (Zou et al. Systems biology of apoptosis 353 2002), leading to the activation of caspase-9, which then activates caspase-3, triggering the subsequent apoptotic events. Here, a feedback loop is established by caspase-2: it is activated by caspase-3 downstream of mitochondria and it cleaves Bid, which in response leads to mitochondrial cytochrome-C release. Furthermore, CD95-induced signalling is influenced and regulated by many other molecules, which mostly inhibit or amplify the apoptotic process, like XIAP, IAP1/2 and survivin (inhibitors of caspase-3,-7,-9) (Salvesen and Duckett 2002) or the BCL-2 family (Chao and Korsmeyer 1998) consisting of pro-apoptotic (e.g. Bak, Bax) and anti-apoptotic (e.g. BCL-XL) members, regulating the critical cytochrome-C release. An overview about the molecule families, which play an important role in this pathway, is given in (Westphal and Kalthoff 2003). 4 Mathematical models of apoptosis Despite the steadily increasing number of biological papers on apoptosis, mathematical models of this complex process are very scarce. In a first attempt to theoretically describe apoptotic signalling, a mathematical model including more than 20 reactions was proposed (Fussenegger et al. 2000). However, this model was based on ad hoc fixed parameters and, thus, its potential for understanding the regulation of apoptosis remains very limited. More recently, the caspase cascade was translated into a reductionist model and analytical mathematical methods were applied to evaluate the system behaviour within a wide range of parameters (Eissing et al. 2004). System identification methods like parameter estimation (Deuflhard 1983) based on reliably measured time series of data, as it has been successfully applied for chemical reaction system (Bock 1981), are suggested (Swameye et al. 2003). However, system identification is severely impaired by the high number of unknown parameters and the curse of dimensionality. Curse of dimensionality refers to the problem that the space of possible parameter value sets grows exponentially with the number of unknown parameters impairing the search for the globally most probable parameter values. The high number of unknown parameters is mainly due to the complexity of signal transduction networks and the absence of reliable quantitative information about the underlying mechanisms. Consequently, data-based studies are typically restricted to small models in which the biochemical interactions are well understood. Despite the ever-increasing number of studies on CD95-induced apoptosis, a systemic understanding of this complex signalling pathway is still missing. For this reason, we reconstructed the network topology of CD95-induced apoptosis by critically searching databases (Schacherer et al. 2001) and the literature. Molecules and reactions directly or indirectly interacting with the main components of this pathway were incorporated, leading to a network topology with more than 60 molecules and about 100 interactions. This complexity cannot be matched by experimental data at present. 354 Martin Bentele and Roland Eils Fig. 1. Model of CD95-induced apoptosis combining subsystems of different information levels. The grey scale level of the boxes corresponds to the information quality (for details see text). To tackle the high dimensionality of such systems, we developed an approach to large-scale modelling of signal transduction networks, which combines three methods (Bentele et al. 2004). Information on different levels of quality are combined in a unified form leading to the `Structured Information Models'. Then, a global approach to parametric sensitivity analysis is introduced by which the dimensionality of the system can be significantly reduced. On this basis, a clusterbased and sensitivity-controlled parameter estimation method is set up. 5 Structured information models - The information problem Information about signal transduction networks can be divided into different levels of information quality (Fig. 1). In most cases, interactions between two molecules are known on the semantic level only (e.g. A inhibits B or A activates B), thereby providing a network topology. For some well-investigated molecules and interactions the biochemical mechanism is also known (e.g. enzymatic process, formation of complexes) allowing quantitative modelling. However, information about the underlying biochemical parameters like reaction rate constants, MichaelisMenten constants or dissociation rates, is mostly missing. Even if quantitative information is available, its usability is limited if it refers to different experimental settings, cell types or states of cells. Systems biology of apoptosis 355 5.1 Network decomposition based on information quality To reduce the complexity of the model without sacrificing essential components of the network, subsystems of different information qualities were identified and incorporated: subsystems mainly consisting of interactions with well-understood biochemical mechanisms are modelled as chemical reaction systems, whereas all others are modelled as `black boxes', defined by their experimentally observed input-output behaviour. The subsystems are identified according to the following criteria: * The input/output behaviour should be measurable. * The number of input/output variables should be low. * Subsystems should represent real functional systems (e.g. mitochondria). * The information within one subsystem should be on the same level. Notably, the black boxes do not assume knowledge of the exact underlying mechanisms. Instead, they reproduce the behaviour of the respective subsystems in a simplified way. Moreover, minimum sets of state variables and `effective' parameters are introduced that do not necessarily correspond to molecule concentrations and biochemical parameters. As a consequence, the number of unknown parameters can be drastically reduced a priori. Note that this concept is motivated by a typical situation in system identification of biochemical networks: due to missing information and limited experimental data, the models should be small and restricted to those network parts which are well understood. On the other hand, parts of biochemical networks can generally not be regarded independently of their environment. Thus, black boxes are introduced to reproduce the relevant effects of the surrounding network parts on a mechanistically well-understood subsystem, rather than for system identification of the surrounding network itself, for which the data basis would be missing any- way. The degradation process of CD95-induced apoptosis is, for example, modelled as a decay function depending on a virtual state variable describing the `apoptotic activity', which is influenced by executioner caspases. This function approximates the experimental observations, thereby, requiring a few parameters only. The decomposition of the complete system into subsystems is an iterative and adaptive process. Based on new information, a subsystem might be split into further subsystems. A great advantage of the so-obtained `Structured Information Models' is that it combines heterogeneous information in one model instead of dealing with isolated models. 5.2 Combined model definition For the mathematical description of the mechanistic part of structured information models, interactions are modelled based on reaction rate equations. The state of a system is described by the concentration of l relevant signal transduction molecules (x1, ..., xl). The reaction rates depend on these concentrations and also on biochemical parameters (1, ..., r) like binding constants. To describe the tem- 356 Martin Bentele and Roland Eils poral behaviour, a system of Ordinary Differential Equations is generated as linear combinations of the reaction rates jv : = = n j rjiji xvdtdx 1 1 ),...,,(/ r , where ij denotes the stoichiometric matrix linking the reactions with the molecules affected. Given the initial concentrations, the time evolution of the reaction system can be propagated using an ODE Solver (Deuflhard and Bornemann 2002). Note that the initial concentrations xi(t=0) are often unknown and are considered unknown parameters as well. Black boxes are defined by their experimentally observed input-output behaviour. Additional state variables (xl+1, ..., xm) that do not necessarily correspond to molecule concentrations and additional parameters (r+1, ..., s) can be introduced. The q-th black box is represented by the functions fi q (x1, ..., xm, 1, ..., s, t), i=(1,...,m) that describe the changes of molecule concentrations and other state variables it affects. Boundary conditions like conservation laws have to be taken into account. Thus, the combination of subsystems modelled as chemical reaction systems and black boxes leads to an ODE system, which reads ),...,,(),...,,(/ 1 11 1 sr N q q i n j rjiji xfxvdtdx + = + == rr for ),...1( li = , ),...,,(/ 1 1 sr N q q ii xfdtdx = + = r for ),...1( mli += , where n denotes the number of reactions and N the number of black boxes. 5.3 The model of CD95-induced apoptosis A quantitative model of CD95-induced apoptosis was derived from the network topology, thereby, taking into account the information on all underlying mechanisms (Fig. 2). A detailed reaction mechanism was established for the DISC- and the caspase-system. The mechanisms at the DISC are largely described by elementary reactions, whereas the caspase cleavage process is considered an enzymatic process (e.g. Stennicke and Salvesen 1999). In principle, these interactions could have been modelled in a more simplified way. The influence of caspase-3 on Bid, for example, could have been modelled directly thereby using 'effective' parameters without accounting for the intermediate caspase-2 cleavage. However, since time series about the concentration of caspase-2 have been available, this molecule was kept in the system in order to gain more information for system identification. In contrast, many molecules and interactions with equivalent properties were replaced by 'effective' molecules and interactions based on the analysis of parameter sensitivity correlations. The molecules XIAP, IAP1/2, survivin and their interactions with caspase-3,-7, and -9 are for example reduced to one 'effective' molecule Systems biology of apoptosis 357 called IAP and the interactions are described by effective binding parameters. Details about the model are given in Bentele et al. (2004). 5.4 Black boxes Two subsystems with a significant influence on the signalling system were identified. The death process is described by the degradation of all molecules. It is modelled as an exponential decay-function dependent on the executioner caspase activity. The cytochrome-C release of the mitochondria is based on experimental observations (Goldstein et al. 2000), which describe a complete release within 5 minutes as soon as Bid reaches a certain level in comparison to Bcl-2/Bcl-XL. To model the degradation process, a virtual state variable called xapop was introduced, which quantifies the 'apoptotic activity', by which the strength of the final death process is characterized. It is assumed that the velocity of cell degradation is directly influenced by this activity. It is also assumed that the activity itself is caused by active caspase-3, -6, and -7 and that the increase of the activity runs in parallel to the experimentally observable PARP cleavage. Thus, xapop represents the activity of the apoptotic processes triggered by executioner caspases. The degradation process is modelled by a decay function depending on xapop, thus, decreasing the concentration of all molecules of the pathway. 5.5 Experimental data A set of experiments to measure time series of concentrations of 14 different molecules and complexes (see framed molecules in Fig. 2) after activation of CD95-receptors was designed. Cells were stimulated with different concentrations of agonistic anti-APO-1 antibody, also referred to as ligands in the following, for various periods of time (from 5 minutes to 4 days). Each sample was evaluated by three independent approaches. See Bentele et al. (2004) for experimental details. In a first set of experiments, time series were measured for a 'fast' activation scenario with an oversaturated ligand concentration corresponding to more than one ligand per CD95-receptor. To gain additional information about the system's dynamic, several experiments with much lower ligand concentrations were performed resulting in a slower activation of apoptosis. 6 Model reduction by sensitivity analysis The above-described model consists of 41 molecules and molecule complexes, 32 reactions, and 2 black boxes. It contains more than 50 missing parameters. Therefore, it is still too complex for reliable system identification and requires further reduction of complexity considering the limited number of data points. Here, we developed an approach for model reduction by sensitivity analysis. 358 Martin Bentele and Roland Eils Fig. 2. Structured information model of CD95-induced apoptosis. In the mechanistic part (DISC, caspases, IAP), interactions are modelled as elementary reactions including competitive inhibitions and enzymatic reactions. Receptors are activated by ligands initiating the DISC formation. After binding to the DISC binding site (DISCbs), procaspase-8 is cleaved (initiator caspase), followed by the activation of executioner caspases (3, 6, 7). PARP cleavage was chosen as experimental end-point of the pathway. The mitochondria and the degradation process, which influences all molecules, are modelled as black boxes defined by their input-output behaviour. Experimental time series were measured for the molecules framed in red. 6.1 The sensitivity matrix Parametric sensitivity analysis determines the changes of the system behaviour as a result of parameter variations (Varma et al. 1999). In a system with m state variables (x1, ..., xm) and n parameters (1, ..., n), the relative sensitivities sij = ( xi /xi) / ( j / j ) describe the relative changes of the state variables as a result of changes of the parameters. In the signal transduction systems, the state variables mostly correspond to molecule concentrations. Note that the sensitivities are time-dependent (sij = sij(t)) and that the time points, for which sensitivities are Systems biology of apoptosis 359 Fig. 3. Sensitivity Matrix. The sensitivity matrix elements ijs^ show the relative changes of each state variable i (left to right), mostly referring to molecule concentrations, with respect to relative changes of each parameter j (front to back). The indices refer to the model definition of (Bentele et al. 2004). computed, have to be chosen carefully. In Metabolic Control Analysis (Kell and Westerhoff 1986; Fell 1992) steady states depending on parameter variations are investigated, whereas in signal transduction systems, the transient behaviour is of high interest to analyse the regulation of a system. As a consequence, the complete time period, during which, for example, a signalling pathway is active and exhibits a dynamical behaviour, is relevant rather than a distinct time point: = + tt t ijtij dttss 0 0 )(^ 1 . The time point t0 corresponds to the start of the investigated scenario and t to the period, in which the system shows reactions to parameter variations. For the apoptosis system, t0 is the time point at which the pathway becomes activated by CD95-ligands and the time interval ends when the cell is completely degraded. 360 Martin Bentele and Roland Eils A sensitivity matrix with elements { ijs^ } is visualized in Fig. 3, showing two important facts: * Sensitivities are low in general indicating high robustness and a lower 'effective' dimensionality of the parameter space since many parameters have only little impact on most molecules. * Apparently, clusters can be identified that contain a subset of molecules, whose concentrations depend on a subset of parameters only. This inherent system property is an important feature for further modularization. 6.2 Local versus global sensitivity analysis Usually, sensitivities of huge models can be computed numerically only and a general relation between sensitivities )(^ r ijs and the parameter set (1, ..., n), at which the sensitivities are determined, cannot be deduced analytically. Instead, sensitivities are determined for specific points in parameter space only and are, therefore, called local sensitivities. In this study, however, sensitivity analysis is used as an essential tool for model reduction, which is required for system identification. As a consequence, it has to be performed in a virtual experiment prior to determination of parameters (see stochastic approach to sensitivity analysis be- low). Global sensitivity analysis, which provides information about sensitivities for the complete space of possible parameter values, is impaired by the high dimensionality of the parameter space. Although this situation is ubiquitous for complex biological systems, a general solution to this problem does not exist. 6.3 Stochastic approach to global sensitivity analysis In a virtual experiment, sensitivity analysis is performed for a large number of randomly chosen points in parameter space within specified ranges. The ranges are defined for each parameter type (e.g. bimolecular reaction rate constants, initial concentrations, Michaelis-Menten constants, etc.), unless more precise information was available. Thereby, the concept of `Sensitivity of Sensitivities' is introduced, which examines the robustness of sensitivities with respect to parameter variations. This is motivated by two important facts: * Biological systems often keep their system properties constant, although they are subject to high parameter fluctuations (Barkai and S 1997; Meir et al. 2002; Alon et al. 1999) suggesting that at least some sensitivities are insensitive with respect to parameter fluctuations (low Sensitivity of Sensitivities). * Structure and connectivity of biochemical networks suggests that the influence of some parameters on distant network regions is limited and that the corresponding sensitivities are extremely low - independently of the parameter values. Systems biology of apoptosis 361 Fig. 4. Sensitivity Histograms: Sensitivity of Sensitivities. Each box shows a histogram for a specific sensitivity ijs^ , computed for a large number of randomly chosen points in parameter space. Parameter and molecule indices refer to the model definition of Bentele et al. (2004). The X-axis represents the relative sensitivity values from 0 to 2 and the Y-axis corresponds to the density of occurrences. The blue plot shows the uniformly weighted distribution of sensitivities, whereas for the red one, each contribution was weighted with the Boltzmann factor, resulting in sharper and sometimes slightly shifted peaks. The histograms are exemplary for all matrix elements { ijs^ }. Typically, histograms show clear peaks close to zero - an important property for further modularization. However, distributions like C, D, J, or M are not informative for modularization. The distribution of the computed sensitivities )(^ r ijs for a large number of different points in the parameter space, }{ q r , are plotted in form of histograms for each sensitivity to show their distribution (Fig. 4). The histograms are generated in two different ways. In a first approach, all random parameter sets are equally weighted, independent of how much the resulting systems dynamics deviate from the real system. As an extension, information from experimental data was incorporated by introducing a Boltzmann factor: based on experimental time series of molecule concentrations, an objective function Eq was calculated for each parameter set q r based on the differences between the experimental and simulation data: 362 Martin Bentele and Roland Eils - = ki ik q k el iik q txx E , 2 2modexp )),(( r , exp ikx : experimental values of the concentration of molecule i at time tk, ),(mod q k el i tx r : simulated values of the concentration at time tk for parameter set q r , ik : standard deviation according to experimental data, {tk}: time points of experimental data. Then, it is assumed that the probability pq that a system with parameter set q r produces the experimental output { exp ikx } follows a Boltzmann distribution with the objective function as energy term: )/exp( kTEp qq The assumption is motivated as follows: considering Gaussian random measurement errors, each characterized by ij, the probability pq can be written as product (Gershenfeld 1999) - - ki ik q k el iik q txx p , 2 2modexp )),(( 2 1 exp r , which is equivalent to the upper Boltzmann distribution using the definition of Eq. For generation of sensitivity histograms, the Boltzmann factor was used as weighting factor: instead of counting the number of parameter sets with sensitivities within a certain sensitivity interval, the contribution of each parameter set q r is given by )/exp( kTEq- . Thereby, the statistical impact of sensitivities for parameter sets that are more consistent with the experimental observations are am- plified. Additional information can be gained by varying the factor kT . If a sensitivity value is insensitive with respect to parameter variations, particularly within the subspace of possible solutions (areas in parameter space with a low objective function), `cooling down' the system by decreasing kT will result in histograms with a much sharper peak. Sensitivity histograms showing more than one distinct peak when kT is decreased indicate that the respective sensitivity strongly depends on the exact parameter set within the parameter subspace of probable solutions. 6.4 Sensitivity of sensitivities The most crucial outcome of the 'global' sensitivity analysis approach presented here is the fact that sensitivities of sensitivities are extremely low in most cases, as shown in Fig. 4 for some exemplary sensitivity histograms. Obviously, most distributions show distinct and narrow peaks, indicating high robustness towards Systems biology of apoptosis 363 large variations of the parameter values. Whenever a sensitivity value ijs^ is close to zero, an extremely sharp peak indicates that the state variable xi is likely not to be influenced by parameter j, regardless of the exact parameter value set. By introduction of the Boltzmann weighting (Fig. 4, red line), most peaks become even sharper. Only in few cases, more than one peak remains or the distribution broadens as a consequence of the `cooling', indicating that the system runs in different modes for certain areas of the parameters space. As a consequence of the low sensitivity of sensitivities, subsets of parameters can be determined, which are unlikely to influence certain state variables. Considering the high number of sensitivities with 0^ ijs , this step is crucial to reduce the system's dimensionality even without knowledge of the true parameter values. Thus, this method provides a basis for high-dimensional parameter estimation. 7 Sensitivity-controlled parameter estimation Parameters are estimated based on experimental time series of measurable molecule concentrations using the maximum likelihood estimation (Gershenfeld 1999), which leads to the least-square problem: .min )),(( )( , 2 2modexp - = ki ik k el iik txx E r r Thus, the objective function E, defined as the sum of squares of differences between experimentally measured and simulated molecule concentrations, divided by the standard deviation in order to lower the impact of experimentally less reliable values, has to be minimized. A review about methods commonly applied for biochemical reaction systems is given in Mendes and Kell (1998). 7.1 Cluster-based parameter estimation This approach takes advantage of the fact that clusters of state variables and parameters can be identified in such a way as to have subsets of state variables whose temporal behaviour depends on a subset of parameters only. Considering the sensitivity matrix )^( ijs of m state variables, n parameters, and an average sensitivity s , it is assumed that sensitivities fulfilling