BIOINFORMATICS Vol. 19 no. 3 2003, pages 336–344 DOI: 10.1093/bioinformatics/btf851 Genetic Network Analyzer: qualitative simulation of genetic regulatory networks Hidde de Jong1,∗, Johannes Geiselmann2, C´eline Hernandez1 and Michel Page1, 3 1INRIA Rhˆone-Alpes, 655 avenue de l’Europe, Montbonnot, 38334 Saint Ismier CEDEX, France, 2PEGM, Universit´e Joseph Fourier, Grenoble and 3ESA, Universit´e Pierre Mend`es France, Grenoble Received on February 1, 2002; revised on June 5, 2002; accepted on August 12, 2002 ABSTRACT Motivation: The study of genetic regulatory networks has received a major impetus from the recent development of experimental techniques allowing the measurement of patterns of gene expression in a massively parallel way. This experimental progress calls for the development of appropriate computer tools for the modeling and simulation of gene regulation processes. Results: We present Genetic Network Analyzer (GNA), a computer tool for the modeling and simulation of genetic regulatory networks. The tool is based on a qualitative simulation method that employs coarse-grained models of regulatory networks. The use of GNA is illustrated by a case study of the network of genes and interactions regulating the initiation of sporulation in Bacillus subtilis. Availability: GNA and the model of the sporulation network are available at http://www-helix.inrialpes.fr/gna. Contact: Hidde.de-Jong@inrialpes.fr INTRODUCTION The study of genetic regulatory networks has taken a qualitative leap through the use of modern genomic techniques that allow simultaneous measurement of the expression levels of all genes of an organism. In addition to experimental tools, computer tools for the modeling and simulation of gene regulation processes will be indispensable. As most networks of interest involve many genes connected through interlocking positive and negative feedback loops, an intuitive understanding of their dynamics is difficult to obtain and may lead to erroneous conclusions. Modeling and simulation tools, with a solid foundation in mathematics and computer science, allow the behavior of large and complex systems to be predicted in a systematic way (e.g. de Jong, 2002). Several computer tools for the simulation of biochemical reaction networks by means of differential equations are currently available, such as DBsolve (Goryanin et al., ∗To whom correspondence should be addressed. 1999), GEPASI (Mendes, 1993), and PLAS (Voit, 2000). These tools can be used to simulate genetic, metabolic, and signal transduction networks described by differential equations. In addition, they allow the user to perform tasks like the stability analysis of steady states and the estimation of parameter values. The currently available tools are essentially restricted to quantitative models of reaction networks, in the sense that numerical values for the kinetic parameters and molecular concentrations need to be specified. However, since this information is usually absent, especially in the case of systems that are not well understood, the above-mentioned tools may be difficult to apply. This paper presents Genetic Network Analyzer (GNA), a computer tool for the qualitative simulation of genetic regulatory networks. GNA employs piecewise-linear (PL) differential equation models that have been well studied in mathematical biology (Glass and Kauffman, 1973; Mestl et al., 1995; Snoussi, 1989). While abstracting from the precise molecular mechanisms involved, the PL models capture essential aspects of gene regulation. Their simple mathematical form permits a qualitative analysis of the dynamics of the genetic regulatory systems to be carried out. Instead of numerical values for parameters and initial conditions, GNA asks the user to specify qualitative constraints on these values in the form of algebraic inequalities. Unlike precise numerical values, these constraints can usually be inferred from the experimental literature. GNA supports a qualitative simulation method that recasts the mathematical analysis of PL models of genetic regulatory networks in a computational form (de Jong et al., 2002b). In this paper we will give only a brief outline of the simulation method, abstracting away advanced mathematical concepts and algorithmic details. We will focus on the computer tool supporting the method and its practical use. That is, we will explain how GNA can be used by biologists and bioinformaticians to formulate models of genetic regulatory networks, simulate the behavior emerging from the interactions in the network, and interpret the simulation results in biological terms. 336 Bioinformatics 19(3) c Oxford University Press 2003; all rights reserved. Genetic Network Analyzer Readers interested in details of the qualitative simulation method are referred to (de Jong et al., 2002b). The use of GNA will be illustrated in the context of a regulatory network of biological interest, consisting of the genes and interactions regulating the initiation of sporulation in the Gram-positive soil bacterium Bacillus subtilis (Burkholder and Grossman, 2000; Grossman, 1995; Hoch, 1993). Under conditions of nutrient deprivation, B.subtilis can decide not to divide and form a dormant, environmentally resistant spore instead. The decision to either divide or sporulate is controlled by a regulatory network integrating various environmental, cell-cyle, and metabolic signals. The aim of the example is to show that GNA is able to reproduce experimental findings in the case of a large and complex network that is well-understood by molecular biologists. In the next section, we describe the PL models of genetic regulatory networks and the basics of the simulation method. We then discuss the computer tool GNA, followed by an illustration of its application to the sporulation network. The paper concludes with a discussion of the simulation tool and an outline of ideas for further work. QUALITATIVE SIMULATION OF GENETIC REGULATORY NETWORKS PL models The dynamics of genetic regulatory networks can be modeled by a class of PL differential equations originally proposed by Glass and Kauffman (1973), and generalized by Mestl et al. (1995). The equations have the form ˙xi = fi (x) − gi (x) xi , xi ≥ 0, 1 ≤ i ≤ n, (1) where x = (x1, . . . , xn) is a vector of cellular protein concentrations. The state equations (1) define the rate of change of the concentration xi as the difference of the rate of synthesis fi (x) and the rate of degradation gi (x) xi of the protein. The function fi : Rn ≥0 → R≥0 is defined as fi (x) = l∈L κil bil(x), (2) where κil is a rate parameter (κil > 0), bil : Rn ≥0 → {0, 1} a regulation function, and L a possibly empty set of indices of regulation functions. A regulation function bil is the arithmetic equivalent of a Boolean function expressing the logic of gene regulation (Mestl et al., 1995; Thomas et al., 1995). In the simplest case, fi (x j ) = κi s+(x j , θj ), where s+(x j , θj ) = 0, if x j < θj , and s+(x j , θj ) = 1, if x j > θj . The function fi (x j ) states that below a threshold concentration θj (θj > 0), gene i is not expressed, whereas above this threshold it is expressed at a rate κi . If protein J, encoded by gene j, is a negative regulator of gene i, we have fi (x j ) = κi s−(x j , θj ), with s−(x j , θj ) = 1 − s+(x j , θj ). More complex regulation functions can express the combined effects of several regulatory proteins (de Jong et al., 2002a). The use of step functions in gene regulation models has been motivated by the observation that the rate of activation of a gene, as a function of the concentration of a regulatory protein, often follows a steep sigmoidal curve. The function gi expresses the regulation of protein degradation. It is defined analogously to fi , except that we demand that gi (x) is strictly positive. In addition, in order to formally distinguish degradation rates from synthesis rates, we will denote the former by γ instead of κ. As can be easily verified, with the above definitions of fi and gi , the differential equations (1) are piecewise linear (PL) (Glass and Kauffman, 1973; Mestl et al., 1995; Snoussi, 1989). Properties of PL models The dynamical properties of the PL models can be analyzed in the n-dimensional phase space box = 1 × · · · × n, where every i , 1 ≤ i ≤ n, is defined as i = {xi ∈ R≥0 | 0 ≤ xi ≤ maxi }. maxi is a parameter denoting a maximum concentration for the protein. Given that the protein encoded by gene i has pi threshold concentrations, the n − 1-dimensional threshold hyperplanes xi = θki i , 1 ≤ ki ≤ pi , partition into hyperrectangular regions that are called domains. More precisely, a domain D ⊆ is defined by D = D1 × · · · × Dn, where every Di , 1 ≤ i ≤ n, is given by one of the following equations: Di ={xi | 0 ≤ xi < θ1 i }, Di ={xi | xi = θ1 i }, Di ={xi | θ1 i < xi < θ2 i }, . . . , (3) Di ={xi | θ pi i < xi ≤ maxi }. If for a domain D, there are some i, j, 1 ≤ i ≤ n, 1 ≤ j ≤ pi , such that Di = {xi | xi = θ j i }, then D is called a switching domain. Otherwise, D is called a regulatory domain. When evaluating the step function expressions in (2) in a regulatory domain, fi and gi reduce to sums of rate constants. More precisely, in every regulatory domain D ⊆ , fi reduces to some µD i ∈ Mi ≡ { fi (x) | 0 ≤ x ≤ max}, and gi to some νD i ∈ Ni ≡ {gi (x) | 0 ≤ x ≤ max}. Mi and Ni collect the synthesis and degradation rates of the protein in different domains of . It can be easily shown that all trajectories in D monotonically tend towards a stable steady state x = µD/νD, the target equilibrium, lying at the intersection of the n hyperplanes 337 H.de Jong et al. xi = µD i /νD i (Glass and Kauffman, 1973; Mestl et al., 1995; Snoussi, 1989). The target equilibrium level µD i /νD i of the protein concentration xi gives an indication of the strength of gene expression in D. Call (D) = {µD/νD} the target equilibrium set of D. If (D) ∩ D = {}, then all trajectories will remain in D. If not, they will leave D at some point. In switching domains, fi and gi may not be defined, because some concentrations assume their threshold value. Moreover, fi and gi may be discontinuous in switching domains. In order to cope with this problem, the system of differential equations (1) is extended into a system of differential inclusions, following an approach widely used in control theory (Gouz´e and Sari, 2002). Using this generalization, it can be shown that, in the case of a switching domain D, the trajectories either traverse D instantaneously or tend towards a target equilibrium set (D). Here, (D) is the smallest closed convex set including the target equilibria of regulatory domains having D in their boundary, intersected with the hyperplane containing D. If (D) ∩ D = {}, then the trajectories may remain in D. If not, they will leave D at some point. Qualitative constraints on parameters Most of the time, precise numerical values for the threshold and rate parameters in (1) are not available. Rather than numerical values, we specify qualitative constraints on the parameter values. These constraints, having the form of algebraic inequalities, can usually be inferred from biological data. They are exploited by the simulation method to predict the qualitative dynamics of the regulatory system. The first type of constraints, the threshold inequalities, are obtained by ordering the pi threshold concentrations of the protein encoded by gene i, i.e., 0 < θ1 i < · · · < θ pi i < maxi , (4) The threshold inequalities determine the partitioning of into regulatory and switching domains. The second type of constraints, the equilibrium inequalities, order the possible target equilibrium levels of xi in different regulatory domains D ⊆ with respect to the threshold concentrations. Biologically speaking, the equilibrium inequalities define the strength of gene expression in the domain in a qualitative way, on the scale of ordered threshold concentrations. More precisely, for every µi ∈ Mi , νi ∈ Ni , and µi , νi = 0, we specify one of the following pairs of inequalities: 0 < µi /νi < θ1 i , θ1 i < µi /νi < θ2 i , . . . θ pi i < µi /νi < maxi . (5) The equilibrium inequalities constrain the relative position of D and its target equilibrium set (D). The models of genetic regulatory networks treated by the simulation method consist of state equations (1), supplemented by parameter inequalities (4) and (5). Every such qualitative PL model corresponds to a set of quantitative PL models consisting of state equations (1) and a particular combination of numerical parameter values consistent with the parameter inequalities. Qualitative simulation Let x, defined on some time-interval [0, τ[, be a solution of a quantitative PL model describing a genetic regulatory network. Furthermore, at some time-point t, 0 ≤ t < τ, x(t) ∈ D. A qualitative description of x at t consists of the domain D, supplemented by the relative position of D and (D). We call this the qualitative state of the system. On [0, τ[ the solution traverses a sequence of domains D0, . . . , Dm in . Whenever x enters a new domain, the system makes a transition to a new qualitative state. The sequence of qualitative states corresponding to the sequence of domains is called the qualitative behavior of the system on the time-interval. Every solution of a quantitative PL model can be uniquely abstracted into a qualitative behavior. Given a qualitative PL model and initial conditions in a domain D0, the aim of qualitative simulation is to determine the possible qualitative behaviors of the system. More precisely, denoting by X the set of solutions x(t) of all quantitative PL models corresponding to the qualitative model, such that x(0) = x0 ∈ D0, the aim of qualitative simulation is to find the set of qualitative behaviors abstracting from some x ∈ X. In de Jong et al. (2002b) a simulation algorithm is described that generates a set of qualitative behaviors by recursively determining qualitative states and transitions from qualitative states, starting at the qualitative state associated with the initial domain D0. This results in a transition graph, a directed graph of qualitative states and transitions between qualitative states. The transition graph contains qualitative equilibrium states or qualitative cycles. These may correspond to equilibrium points or limit cycles reached by solutions in X, and hence indicate functional modes of the regulatory system. A sequence of qualitative states in the transition graph represents a predicted qualitative behavior of the system. It has been demonstrated that the transition graph generated by the simulation algorithm covers all qualitative behaviors abstracting from some x ∈ X (de Jong et al., 2002b). That is, whatever the exact numerical values for the parameters may be, if these values are consistent with the threshold and equilibrium inequalities specified in the qualitative PL model, the qualitative shape of the solution is described by a sequence of states in the transition graph. 338 Genetic Network Analyzer conditions initialmodel simulator results simulation model file initial conditions file viewer graph transition viewer behavior qualitative controller simulation interaction graph viewer user VisualGNA parser Fig. 1. The global architecture of GNA. GENETIC NETWORK ANALYZER (GNA) The qualitative simulation method has been implemented in Java 1.3 in the program GNA. The global architecture of GNA is shown in Figure 1. The core of the system is formed by the simulator, which generates a transition graph from a qualitative PL model and initial conditions. The input of the simulator is obtained by reading and parsing text files specified by the user. A graphical user interface (GUI), named VisualGNA, assists the user in specifying the model of a genetic regulatory network as well as in interpreting the simulation results (Figure 2). The current version of GNA consists of about 110 classes, one third of which concerns VisualGNA. We have chosen Java to implement the qualitative simulation method, because it has become a standard for object-oriented programming and is widely used for biological applications. In addition, the powerful graphical capabilities provided by Swing and other graphical libraries facilitate the development of the GUI, which is a critical component of GNA. The choice for Java is further motivated by its portability, allowing GNA to be used without special adaptations on different platforms (the program has been tested under Windows NT 4.0, 98, 2000, Solaris 6, and Linux 2.4.2). GNA is available for non-profit academic research purposes at http://www-helix.inrialpes.fr/gna. Parser The simulation of a genetic regulatory network requires the user to prepare text files containing the PL model and the initial conditions. For each protein, the model file contains declarations of the state variable and the production, degradation, and threshold parameters, as well as definitions of the state equations and threshold and equilibrium inequalities. A fragment of the model file for the sporulation network is shown in Figure 2(a). The initial conditions file contains the inequalities defining the initial domain (not shown). The input files are read and compiled by a parser generated by Java CUP. The parser transforms the information in the input file into internal representations of the model and initial conditions, which are accessed by the simulator and the GUI. Simulator The essential tasks of the simulator consist in computing the qualitative states associated with a domain and in determining the possible transitions from these states. Instead of performing extensive numerical calcultions, the simulator reaches its goal through symbolic computation, by exploiting the parameter inequalities (4)–(5) (de Jong et al., 2002b). Whereas previous versions of GNA used a simple inequality reasoner to achieve this, the current version employs an integer coding scheme that permits more efficient computation. The output of the simulator is a transition graph consisting of qualitative states and transitions between qualitative states. Various graph algorithms operate on the internal representation of this graph to detect, among other things, qualitative equilibrium states and qualitative cycles. Graphical user interface: VisualGNA VisualGNA, the GUI component of GNA, actually consists of a simulation controller, an interaction graph viewer, a transition graph viewer, and a qualitative behavior viewer (Figure 1). The simulation controller allows the user to load a model, launch a simulation, and analyze the results in a menu-driven way. The interaction graph viewer extracts the network structure from the model and visualizes it in the form of an interaction graph. The transition graph and qualitative behavior viewers help the user in interpreting the output of a qualitative simulation. The transition graph viewer displays the transition graph with the qualitative states and transitions generated by GNA. In addition, it offers the facility to focus on interesting parts of the graph by selecting, filtering, and expanding subgraphs. The qualitative behavior viewer explores the contents of the transition graph by visualizing a sequence of qualitative states as a set of concentration profiles, one for each protein. Alternatively, this information can be displayed as a sequence of projections on the interaction graph, providing an intuitive representation of the global change 339 H.de Jong et al. (a) (b) Fig. 2. Modeling and simulation of a genetic regulatory network by means of GNA. (a) Excerpt of the model file of the sporulation network (see de Jong et al. (2002a) for the complete model). The figure shows the parameter declarations, state equations, and parameter inequalities concerning the state variables xse (concentration Spo0E). (b) Screen capture of GNA. The window on the left shows a part of the state transition graph resulting from a simulation of the network under initial conditions inducing sporulation, while the window on the right visualizes the temporal sequence of qualitative states in one selected path in the state transition graph. of gene expression over time. Figure 2(b) illustrates some of the functionalities of VisualGNA (for more details, see the HTML documentation of GNA). MODELING AND SIMULATION: INITIATION OF SPORULATION IN B.SUBTILIS The use of GNA will be illustrated by modeling and simulating the regulatory network underlying the initiation of sporulation in B.subtilis. The modeling and simulation process consists of four steps: (i) collecting information on relevant genes, proteins, and regulatory interactions; (ii) describing the network by a qualitative PL model; (iii) simulating the system from appropriate initial conditions; and (iv) interpreting the simulation results. Although presented in a linear fashion below, in practice the four steps are usually carried out iteratively. Sporulation in B.subtilis is one of the best-understood model systems for prokaryotic development. However, notwithstanding the enormous amount of work devoted to the elucidation of the network of interactions underlying the sporulation process, very little quantitative data on kinetic parameters and molecular concentrations are available. The aim of the example is to show that GNA is able to reproduce the observed qualitative behavior of wild-type and mutant bacteria from a model that is a synthesis of available data in the literature (de Jong et al., 2002a). This provides a validation of the tool in the case of a large and complex genetic regulatory network. Sporulation network While nutrients are plentiful, B.subtilis cells divide as fast as possible in order to efficiently compete with their neighbors. However, when conditions become unfavorable, the bacterium protects itself by forming environmentally resistant spores (Burkholder and Grossman, 2000; Grossman, 1995; Hoch, 1993; Stragier and Losick, 1996). A graphical representation of the regulatory network controlling the initiation of sporulation is shown in Figure 3, displaying key genes and their promoters, proteins encoded by the genes, and the regulatory action of the proteins. References to the experimental literature having been used to compile the network are given in de Jong et al. (2002a). The network is centered around a phosphorelay, which integrates a variety of environmental, cell-cycle, and metabolic signals. Under conditions appropriate for sporulation, the phosphorelay transfers a phosphate to the Spo0A regulator, a process modulated by kinases and phosphatases. The phosphorelay has been simplified in this paper by ignoring intermediate steps in the transfer of phosphate to Spo0A. However, this simplification does not affect the essential function of the phosphorelay: modulating the phosphate flux as a function of the competing action of kinases and phosphatases (here KinA and Spo0E). Under conditions conducive to sporulation, such as nutrient deprivation or high population density, the concentration of Spo0A∼P may reach a threshold value above which it activates various genes that commit the bacterium to sporulation. The choice between vegetative growth and 340 Genetic Network Analyzer spo0E kinA sigF sinRsinI spo0A abrB hpr sigH Hpr Spo0E AbrB SinR phosphorelay SinI Signal KinA Spo0A Fig. 3. Network of key genes, proteins, and regulatory interactions involved in B.subtilis sporulation. The graphical representation follows the conventions proposed in Kohn (2001). sporulation in response to adverse environmental conditions is the outcome of competing positive and negative feedback loops, controlling the accumulation of Spo0A∼P (Grossman, 1995; Hoch, 1993). Modeling of the sporulation network For modeling, the graphical representation of Figure 3 has to be translated into state equations (1), supplemented by parameter inequalities (4) and (5). The Spo0E protein serves as an example to illustrate the procedure. The phosphatase Spo0E involved in the phosphorelay is encoded by the gene spo0E (Perego and Hoch, 1987). Transcription of spo0E is directed by Eσ A (Perego and Hoch, 1987) and repressed by AbrB (Perego and Hoch, 1991; Strauch et al., 1989). Spo0E degradation is not known to be regulated. Denoting by xse the Spo0E concentration, and by κse and γse the synthesis and degradation rate constants, we obtain ˙xse = κse s+ (xa, θa) s− (xab, θ1 ab) − γse xse. (6) The state equation expresses that spo0E is transcribed at a rate κse, if the concentration xab of AbrB is below the threshold θ1 ab (s−(xab, θ1 ab) = 1), while the concentration of σ A is above the threshold θa (s+(xa, θa) = 1). Four threshold concentrations are defined for Spo0E, θ1 se, . . . , θ4 se, indicating four levels of activation of the phosphorelay. The thresholds are ordered by the threshold inequalities 0 < θ1 se < θ2 se < θ3 se < θ4 se < maxse. (7) The possible synthesis and degradation rates of Spo0E are given by Mse = {0, κse} and Nse = {γse}, respectively. We will set κse/γse > θ3 se. If this were not the case, Spo0E would not be able to reach a concentration at which it can exert any influence on the sporulation decision (Perego and Hoch, 1991). In fact, above its threshold concentration θ3 se, Spo0E is able to reduce the activation of the phosphorelay to the point that the expression of σ F , and hence the initiation of sporulation, is inhibited (de Jong et al., 2002a). This motivates the following equilibrium inequalities: θ3 se < κse/γse < θ4 se. (8) Figure 2(a) shows the model file entry for Spo0E. The state equations and parameter inequalities for the other proteins in the network of Figure 3 can be formulated in an analogous way (de Jong et al., 2002a). The resulting model consists of nine state variables and two input variables. The state equations are often complex expressions of step functions, reflecting the dense pattern of interactions in the regulatory network. The 49 parameters are constrained by 58 threshold and equilibrium inequalities, the choice of which is determined by biological data. If the parameter inequalities cannot be unambiguously determined, all possible combinations have to be systematically tested. Simulation of wild-type and mutant bacteria GNA has been used to simulate the dynamics of the initiation of sporulation. Starting from initial conditions representing vegetative growth, the system is perturbed by a sporulation signal that causes KinA to autophosphorylate. Simulation of the network takes less than a few seconds to complete on a PC (500 MHz, 128 MB), and gives rise to a transition graph of 465 qualitative states. Many of these states are associated with switching domains that the system traverses instantaneously. Since the biological 341 H.de Jong et al. (a) (b) Fig. 4. Results of simulating wild-type strain under sporulation conditions. The pictures have been generated by GNA. (a) Temporal evolution of selected protein concentrations in a typical qualitative behavior leading to the spo+ phenotype. (b) Idem, but for a typical qualitative behavior leading to the spo− phenotype. relevance of the latter states is limited, they can be eliminated from the transition graph. This leads to a reduced transition graph with 82 qualitative states, part of which is shown in the left window in Figure 2(b). The transition graph contains a single qualitative equilibrium state that the system eventually reaches in all qualitative behaviors. A crucial test for the pertinence and validity of our model consists in analyzing the behavior of sporulation mutants. These can be modeled by adapting the state equations. In the case of a sinI deletion, for example, we have ˙xsi = −γsi xsi , with xsi denoting the cellular concentration of SinI. The equation expresses that in the absence of the sinI gene, SinI cannot be synthesized. Simulating the behavior of the sinI mutant under sporulation conditions produces a reduced transition graph of 31 qualitative states (not shown). In (de Jong et al., 2002a) a dozen examples of the simulation of B.subtilis sporulation mutants are discussed. Analysis of simulation results The simulated behavior of the network should reflect the essential biological characteristics of the sporulation initiation process. Analysis of the state transition graphs by means of VisualGNA allows one to investigate in detail the predicted qualitative equilibrium state, as well as qualitative behaviors leading to the equilibrium state. Consider first the state transition graph for the wildtype strain. The qualitative equilibrium state corresponds to the spo− phenotype, because the concentration of σ F , a sigma factor essential for the development of the forespore (Stragier and Losick, 1996), has not reached the threshold above which it directs the transcription of its target genes. Whereas in some qualitative behaviors the system directly reaches the qualitative equilibrium state, in others it first passes through a period of transitory sigF expression, which corresponds to the spo+ phenotype. A typical qualitative behavior for sporulation as well as for vegetative growth are shown in Figure 4. For ease 342 Genetic Network Analyzer of presentation, only four of the nine state variables are represented. The initial phosphorylation of Spo0A leads to downregulation of abrB, which favors expression of sigH and sinI. Increasing levels of σ H and SinI lead to an increase in spo0A and kinA expression. As a consequence, Spo0A∼P levels continue to rise and activate sigF after some time, as can be seen in (a). On the other hand, as shown in (b), abrB downregulation also causes the Spo0E concentration to rise, which prevents Spo0A∼P from accumulating to the point where sinI is expressed at a level sufficient to inactivate SinR. This leaves SinR repression of the σ H -dependent promoter of Spo0A intact, and thus prevents the initiation of sporulation. The transition graph faithfully represents the two possible responses to nutrient depletion that are observed for B.subtilis: either the bacterium continues vegetative growth or it enters sporulation. The initiation of sporulation is determined by positive feedback loops acting through Spo0A and KinA, and a negative feedback loop involving Spo0E. When the rate of accumulation of the kinase outpaces the rate of accumulation of the phosphatase, we observe transient expression of sigF, i.e. a spo+ phenotype (Figure 4(a)). If the kinetics of these processes are inversed, sigF is never activated and we observe a spo− phenotype (Figure 4(b)). The expression of sigF in the sporulation behavior is transitory, because the concentration of Spo0E continues to rise and therefore the concentration of Spo0A∼P eventually falls below the level required for sigF activation (Figure 4(b)). The reversion of the sporulation decision through the continuing accumulation of Spo0E is due to the fact that the genes and interactions required for later stages of sporulation have not been included in our model. In the case of the sinI deletion strain, no expression, not even transitory, of sigF occurs. That is, a sinI mutant is not predicted to sporulate, an observation supported by biological data (Bai and Mandi´c-Mulec, 1993). Deletion of sinI disrupts the positive feedback loops, because SinR repression of the σ H -dependent promoter of Spo0A cannot be relieved, thus preventing further Spo0A∼P accumulation. DISCUSSION We have presented the computer tool GNA for the qualitative simulation of genetic regulatory networks and illustrated its use in the analysis of the network of interactions controlling the initiation of sporulation in B.subtilis. GNA implements a simulation method that is based on a class of PL differential equation models described in mathematical biology (de Jong et al., 2002b). While abstracting from precise molecular mechanisms, the PL models capture essential aspects of gene regulation. Instead of giving numerical values to the parameters and initial conditions, which are usually not available, we use qualitative constraints in the form of algebraic inequalities. These are obtained by directly translating biological data into a mathematical formalism. The predictions obtained through qualitative simulation have the form of a graph of qualitative states and transitions between qualitative states. A path of qualitative states in the transition graph is a prediction of the qualitative shape of the solution obtained for some combination of parameter values consistent with the parameter inequalities. Although the predictions obtained through qualitative simulation are less precise than their numerical counterparts, it has been demonstrated that no solution of a quantitative model subsumed by the qualitative model is omitted. That is, the qualitative behavior abstracted from each such solution is covered by the transition graph (de Jong et al., 2002b). GNA thus provides guaranteed coverage of the solutions of the class of quantitative models consistent with the qualitative model, without performing exhaustive numerical simulation. The lack of quantitative information on kinetic parameters and molecular concentrations has stimulated an interest in other methods and tools for qualitative simulation (de Jong, 2002). For example, the method QSIM (Kuipers, 1994) has been used to model and simulate λ phage growth in E.coli (Heidtke and Schulze-Kremer, 1998). A major problem in the application of qualitative simulation approaches is their lack of upscalability. GNA differs from traditional qualitative simulators in that it has been tailored to a class of PL models with favorable mathematical properties. This allows it to deal with large and complex genetic regulatory networks, as illustrated by the sporulation example. In comparison with the logical method of Thomas and colleagues (Thomas et al., 1995), which is based on comparable abstractions of regulatory interactions, GNA has been developed for differential equation models. We believe that the latter formalism is intuitively clear and of large generality. Moreover, it facilitates the integration of any quantitative data becoming available through advances in measurement technologies. The graphical user interface of GNA allows the user to analyze the simulation results, by manipulating the transition graph and by exploring the predicted qualitative behaviors of the system on the level of individual protein concentrations. GNA also visualizes the interactions between the genes and proteins in the regulatory network, as described by the model. In order to facilitate the specification of the model and initial conditions, a graphical model editor is currently under development. Simulation of the sporulation network by means of GNA reveals that essential features of the initiation of sporulation can be reproduced by means of a model constructed from the experimental literature (de Jong et al., 2002a). Because sporulation in B.subtilis is one 343 H.de Jong et al. of the best-studied prokaryotic model systems, it is an excellent case study for the validation of the simulation tool. However, the real interest of tools like GNA comes from the simulation of genetic regulatory networks that are less understood and the use of the predictions thus obtained for guiding further experimental work. We are currently applying GNA in the context of studies of the global regulation of transcription in E.coli and Synechocystis. ACKNOWLEDGMENTS The authors would like to thank Gr´egory Batt, Eric Gauthier, Jean-Luc Gouz´e, S´ebastien Maza, Anne Morgat, S´ebastien Provencher, Franc¸ois Rechenmann, Marie-France Sagot, Tewfik Sari, Denis Thieffry, Ivayla Vatcheva, Alain Viari, and three anonymous referees for comments on previous versions of this paper and other contributions. REFERENCES Bai,U. and Mandi´c-Mulec,I. (1993) SinI modulates the activity of SinR, a developmental switch protein of Bacillus subtilis, by protein–protein interaction. Genes Dev., 7, 139–148. Burkholder,W.F. and Grossman,A.D. (2000) Regulation of the initiation of endospore formation in Bacillus subtilis. In Brun,Y.V. and Shimkets,L.J. (eds), Prokaryotic Development. ASM, pp. 151–166. de Jong,H., Geiselmann,J., Batt,G., Hernandez,C. and Page,M. (2002a) Qualitative simulation of the initiation of sporulation in B.subtilis. Technical Report RR-4527. INRIA. de Jong,H., Gouz´e,J.-L., Hernandez,C., Page,M., Sari,T. and Geiselmann,H. (2002b) Qualitative simulation of genetic regulatory networks using piecewise-linear models. Technical Report RR-4407. INRIA. de Jong,H. (2002) Modeling and simulation of genetic regulatory systems: a literature review. J. Comput. Biol., 9, 69–105. Glass,L. and Kauffman,S.A. (1973) The logical analysis of continuous non-linear biochemical control networks. J. Theor. Biol., 39, 103–129. Goryanin,I., Hodgman,T.C. and Selkov,E. (1999) Mathematical simulation and analysis of cellular metabolism and regulation. Bioinformatics, 15, 749–758. Gouz´e,J.-L. and Sari,T. (2002) A class of piecewise linear differential equations arising in biological models. Dyn. Syst, in press. Grossman,A.D. (1995) Genetic networks controlling the initiation of sporulation and the development of genetic competence in Bacillus subtilis. Annu. Rev. Genet., 29, 477–508. Heidtke,K.R. and Schulze-Kremer,S. (1998) Design and implementation of a qualitative simulation model of λ phage infection. Bioinformatics, 14, 81–91. Hoch,J.A. (1993) Regulation of the phosphorelay and the initiation of sporulation in Bacillus subtilis. Ann. Rev. Microbiol., 47, 441– 465. Kohn,K.W. (2001) Molecular interaction maps as information organizers and simulation guides. Chaos, 11, 1–14. Kuipers,B. (1994) Qualitative Reasoning. MIT Press. Mendes,P. (1993) GEPASI: a software package for modelling the dynamics, steady states and control of biochemical and other systems. CABIOS, 9, 563–571. Mestl,T., Plahte,E. and Omholt,S.W. (1995) A mathematical framework for describing and analysing gene regulatory networks. J. Theor. Biol., 176, 291–300. Perego,M. and Hoch,J.A. (1987) Isolation and sequence of the spo0E gene: Its role in initiation of sporulation in Bacillus subtilis. Mol. Microbiol., 1, 125–132. Perego,M. and Hoch,J.A. (1991) Negative regulation of Bacillus subtilis sporulation by the spo0E gene product. J. Bacteriol., 173, 2514–2520. Snoussi,E.H. (1989) Qualitative dynamics of piecewise-linear differential equations. Dyn. Stab. Syst., 4, 189–207. Stragier,P. and Losick,R. (1996) Molecular genetics of Bacillus subtilis. Ann. Rev. Genet., 30, 297–341. Strauch,M.A., Spiegelman,G.B., Perego,M., Johnson,W.C., Burbulys,D. and Hoch,J.A. (1989) The transition state transcription regulator abrB of Bacillus subtilis is a DNA binding protein. EMBO J., 8, 1615–1621. Thomas,R., Thieffry,D. and Kaufman,M. (1995) Dynamical behaviour of biological regulatory networks: I. Biological rule of feedback loops. Bull. Math. Biol., 57, 247–276. Voit,E.O. (2000) Computational Analysis of Biochemical Systems. Cambridge University Press, Cambridge. 344