1 Constraint-Based Modeling of Metabolic Networks
  Based on: “Genome-scale models of microbial cells: Evaluating the consequences of constraints”, Price et al. (2004)
  Tomer Shlomi
  School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
  January, 2006
2 Outline
  Metabolism and metabolic networks
  Kinetic models vs. constraint-based modeling
  Flux Balance Analysis
  Exploring the solution space
  Altering phenotypic potential: gene knockouts
3 Cellular Metabolism
  The essence of life
  Catabolism and anabolism
  The metabolic core: production of energy, anaerobic and aerobic metabolism
  Probably the best understood of all cellular networks: metabolic, PPI, regulatory, signaling
  Tremendous importance in:
    Medicine: antibiotics, metabolic disorders, liver disorders, heart disorders
    Bioengineering: efficient production of biological products
4 Metabolites and Biochemical Reactions
  Metabolite: a substance consumed or produced by metabolism, e.g. glucose, oxygen
  Biochemical reaction: the process in which two or more molecules (reactants) interact, usually with the help of an enzyme, and yield one or more products
  Example (enzyme: Glucokinase): Glucose + ATP → Glucose-6-Phosphate + ADP
6 Kinetic Models
  Dynamics of metabolic behavior over time
    Metabolite concentrations
    Enzyme concentrations
    Enzyme activity rate: depends on enzyme concentrations and metabolite concentrations
  Solved using a set of differential equations
  Impossible to model large-scale networks
    Requires specific enzyme rates data
    Too complicated
7 Constraint Based Modeling
  Provides a steady-state description of metabolic behavior
    A single, constant flux rate for each reaction
    Ignores metabolite concentrations
    Independent of enzyme activity rates
  Assumes a set of constraints on reaction fluxes
  Genome-scale models
  Flux rate units: μmol / (mg · h)
8 Constraint Based Modeling
  Find a steady-state flux distribution through all biochemical reactions
  Under the constraints:
    Mass balance: metabolite production and consumption rates are equal
    Thermodynamic: irreversibility of reactions
    Enzymatic capacity: bounds on enzyme rates
    Availability of nutrients
10 Mathematical Representation
  Stoichiometric matrix $S$: network topology with stoichiometry of biochemical reactions
  Example column for Glucokinase (Glucose + ATP → Glucose-6-Phosphate + ADP):
    Glucose  -1
    ATP      -1
    G-6-P    +1
    ADP      +1
  Constraints and the shape of the resulting solution space:
    Mass balance:   $S \cdot v = 0$     subspace of $\mathbb{R}^n$
    Thermodynamic:  $v_i > 0$           convex cone
    Capacity:       $v_i < v_{max}$     bounded convex cone
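The mass balance constraint can be checked by hand on a very small system. The sketch below is only an illustration: the glucokinase column follows the reaction on the slide, while the four exchange reactions (`Glc_in`, `ATP_in`, `G6P_out`, `ADP_out`) and the flux values are hypothetical additions so that a non-zero steady state exists.

```python
# Rows are metabolites, columns are reactions. Only glucokinase (GLK) comes
# from the slide; the four exchange columns are hypothetical additions that
# let the toy system reach a steady state.
METABOLITES = ["Glucose", "ATP", "G-6-P", "ADP"]
REACTIONS = ["GLK", "Glc_in", "ATP_in", "G6P_out", "ADP_out"]
S = [
    [-1, 1, 0, 0, 0],   # Glucose: consumed by GLK, supplied by Glc_in
    [-1, 0, 1, 0, 0],   # ATP:     consumed by GLK, supplied by ATP_in
    [1, 0, 0, -1, 0],   # G-6-P:   produced by GLK, drained by G6P_out
    [1, 0, 0, 0, -1],   # ADP:     produced by GLK, drained by ADP_out
]


def net_production(S, v):
    """Return S·v, the net production rate of every metabolite."""
    return [sum(s * f for s, f in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True when every metabolite is produced and consumed at equal rates."""
    return all(abs(rate) <= tol for rate in net_production(S, v))


balanced = [3, 3, 3, 3, 3]     # every reaction carries the same flux
unbalanced = [3, 3, 3, 1, 3]   # G-6-P is drained more slowly than it is made
print(dict(zip(METABOLITES, net_production(S, balanced))))
print(dict(zip(METABOLITES, net_production(S, unbalanced))))
```

In the second flux vector G-6-P accumulates, so that vector violates $S \cdot v = 0$ and lies outside the steady-state solution space.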
11 Growth Medium Constraints
  Exchange reactions enable the uptake of nutrients from the media and the secretion of waste products
  [Table: lower bound and upper bound for the exchange reactions G-Ex (Glucose), O-Ex (Oxygen) and Co2-Ex (CO2); Oxygen and CO2 each have an Inf bound]
12 Determination of Likely Physiological States
  How to identify plausible physiological states?
  Optimization methods
    Maximal biomass production rate
    Minimal ATP production rate
    Minimal nutrient uptake rate
  Exploring the solution space
    Extreme pathways
    Elementary modes
Optimization methods are used for several purposes,
13 Outline: Optimization Methods
  Predicting the metabolic state of a wild-type strain
    Flux Balance Analysis (FBA)
  Predicting the metabolic state after a gene knockout
    Minimization Of Metabolic Adjustment
    Regulatory On/Off Minimization
14 Biomass Production Optimization
  Metabolic demands of precursors and cofactors required for 1 g of biomass of E. coli
  Classes of macromolecules:
    Amino acids, carbohydrates
    Ribonucleotides, deoxyribonucleotides
    Lipids, phospholipids
    Sterol, fatty acids
  These precursors are removed from the metabolic network in the corresponding ratios
  We define a growth reaction as a weighted combination of the precursor fluxes, where each coefficient $c$ is the demand for that precursor per gram of biomass:
    $Z = c_{ATP} v_{ATP} + c_{NADH} v_{NADH} + c_{NADPH} v_{NADPH} + \dots$
15 Biomass Composition Issues
  Varies across different organisms
  Depends on the growth medium
  Depends on the growth rate
  The optimum does not change much with changes in composition within a class of macromolecules
  The optimum does change if the relative composition of the major macromolecules changes
16 Flux Balance Analysis (FBA)
  Successfully predicts:
    Growth rates
    Nutrient uptake rates
    Byproduct secretion rates
  Solved using Linear Programming (LP)
  Finds the flux distribution with maximal growth rate:
    $\max\ v_{gro}$                       maximize growth
    s.t. $S \cdot v = 0$                  mass balance constraints
         $v_{min} \le v \le v_{max}$      capacity constraints
  Fell et al. (1986), Varma and Palsson (1993)
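The LP on this slide can be tried on a toy problem. The following is a minimal sketch, assuming NumPy and SciPy are available; the four-reaction network, its bounds and the names `fba`, `BOUNDS` and `GROWTH` are hypothetical and are not taken from the presentation or from any genome-scale model.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not from the slides):
#   v0: uptake -> A        v1: A -> B
#   v2: A -> byproduct     v3: B -> biomass (growth reaction)
S = np.array([
    [1, -1, -1, 0],   # metabolite A
    [0, 1, 0, -1],    # metabolite B
])
BOUNDS = [(0, 10), (0, 6), (0, 1000), (0, 1000)]  # (v_min, v_max) per reaction
GROWTH = 3  # column index of the growth reaction


def fba(S, bounds, growth_index):
    """Maximize the growth flux subject to S·v = 0 and v_min <= v <= v_max."""
    c = np.zeros(S.shape[1])
    c[growth_index] = -1.0  # linprog minimizes, so negate to maximize
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun, res.x


max_growth, fluxes = fba(S, BOUNDS, GROWTH)
print("maximal growth flux:", max_growth)
print("one optimal flux distribution:", fluxes)
```

In this toy network growth is limited by the capacity of `v1`, while the uptake and byproduct fluxes are not pinned down by the optimum, so the solver returns only one of several optimal flux distributions.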
29 Alternative Optima
  The optimal FBA solution is not unique
  [Figure: three panels over the growth axis: one solution, optimal solutions, near-optimal solutions]
  Basic solutions enumeration, MILP (Lee et al., 2000)
  Flux variability analysis (Mahadevan et al., 2003)
  Hit and run sampling (Almaas et al., 2004)
  Uniform random sampling (Wiback et al., 2004)
30 What Do Multiple Solutions Represent?
  Some of the solutions probably do not represent biologically meaningful metabolic behaviors, as there are missing constraints
  Previous studies tackled this problem by:
    Incorporating additional constraints: regulatory constraints (Covert et al., 2004)
    Looking for reactions for which new constraints may significantly reduce the solution space (Wiback et al., 2004)
  [Figure labels: FBA solution space; meaningful solutions]
31 Interpretations of Metabolic Space
  Effect of exogenous factors: the metabolic space corresponds to growth in a medium under various external conditions that are beyond the model’s scope, such as stress or temperature
  Heterogeneity within a population: the metabolic space represents heterogeneous metabolic behaviors by individuals within a cell population (Mahadevan et al., 2003; Price et al., 2004)
  Alternative evolutionary paths: the metabolic space represents different metabolic states attainable through different evolutionary paths (Mahadevan et al., 2003; Fong et al., 2004)
  The three interpretations are not mutually exclusive
32 Alternative Optima: Basic Solutions Enumeration
  Lee et al., 2000
  Basic solutions: metabolic states with a minimal number of non-zero fluxes
  Different solutions differ in at least a single zero flux
  Use Mixed Integer Linear Programming
    Formulate the optimization so as to identify new solutions that are different from the previous ones
  Applicable only to small-scale models
33 Alternative Optima: Flux Variability Analysis
  Mahadevan et al., 2003
  Find metabolic states with extreme values of fluxes
  Use linear programming to minimize and maximize the flux through each reaction while satisfying all constraints:
    $\max / \min\ v_i$                    minimize and maximize each flux
    s.t. $S \cdot v = 0$                  mass balance constraints
         $v_{min} \le v \le v_{max}$      capacity constraints
         $v_{gro} = v_{opt}$              set maximal growth rate
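The min/max procedure on this slide can be sketched on a toy network. This is an illustration only, assuming NumPy and SciPy; the network, its bounds and the function names are hypothetical. The growth flux is held at its optimum through a lower bound with a small tolerance `tol`, which is a numerical convenience and not part of the published method.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not from the slides):
#   v0: uptake -> A        v1: A -> B
#   v2: A -> byproduct     v3: B -> biomass (growth reaction)
S = np.array([
    [1, -1, -1, 0],   # metabolite A
    [0, 1, 0, -1],    # metabolite B
])
BOUNDS = [(0, 10), (0, 6), (0, 1000), (0, 1000)]
GROWTH = 3


def optimum(c, S, bounds):
    """Minimize c·v subject to S·v = 0 and the flux bounds; return the minimum."""
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.fun


def flux_variability(S, bounds, growth_index, tol=1e-9):
    """Return v_opt and the (min, max) range of every flux at optimal growth."""
    n = S.shape[1]
    c = np.zeros(n)
    c[growth_index] = -1.0
    v_opt = -optimum(c, S, bounds)            # step 1: ordinary FBA
    fixed = list(bounds)
    # Step 2: require growth >= v_opt (minus a tolerance); since v_opt is the
    # maximum, this pins growth to its optimal value.
    fixed[growth_index] = (v_opt - tol, bounds[growth_index][1])
    ranges = []
    for i in range(n):
        c = np.zeros(n)
        c[i] = 1.0
        lowest = optimum(c, S, fixed)         # minimize v_i
        highest = -optimum(-c, S, fixed)      # maximize v_i
        ranges.append((lowest, highest))
    return v_opt, ranges


v_opt, ranges = flux_variability(S, BOUNDS, GROWTH)
for i, (lowest, highest) in enumerate(ranges):
    print(f"v{i}: [{lowest:.3f}, {highest:.3f}]")
```

Reactions whose range collapses to a single value are fully determined by optimal growth; the others are where the alternative optima differ.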
34 Alternative Optima: Hit and Run Sampling
  Almaas et al., 2004
  Based on a random walk inside the solution space polytope
    Choose an arbitrary solution
    Iteratively make a step in a random direction
    Bounce off the walls of the polytope in random directions
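A random walk of this kind can be sketched in a few lines. Note the assumptions: the code below is the textbook hit-and-run variant, which draws a random direction and then a uniform point on the chord through the current point, rather than the bouncing rule described on the slide; and it assumes the polytope is already written as $A x \le b$ in reduced coordinates, whereas a real flux space would first need the equality $S \cdot v = 0$ eliminated. The two-dimensional polytope is hypothetical.

```python
import math
import random


def hit_and_run(A, b, x0, n_samples, rng):
    """Sample points from {x : A x <= b}, starting from an interior point x0."""
    x = list(x0)
    dim = len(x)
    samples = []
    for _ in range(n_samples):
        # Random direction, uniformly distributed on the unit sphere.
        d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(di * di for di in d))
        d = [di / norm for di in d]
        # The line x + t*d stays inside the polytope for t_lo <= t <= t_hi.
        t_lo, t_hi = -math.inf, math.inf
        for row, bi in zip(A, b):
            ad = sum(a * di for a, di in zip(row, d))
            slack = bi - sum(a * xi for a, xi in zip(row, x))
            if ad > 1e-12:
                t_hi = min(t_hi, slack / ad)
            elif ad < -1e-12:
                t_lo = max(t_lo, slack / ad)
        # Jump to a uniformly chosen point on that chord.
        t = rng.uniform(t_lo, t_hi)
        x = [xi + t * di for xi, di in zip(x, d)]
        samples.append(x)
    return samples


# Hypothetical polytope: two fluxes, each >= 0, sharing a capacity of 10.
A = [[-1, 0], [0, -1], [1, 1]]
b = [0, 0, 10]
samples = hit_and_run(A, b, x0=[1.0, 1.0], n_samples=1000, rng=random.Random(0))
mean_v1 = sum(s[0] for s in samples) / len(samples)
print("mean of the first flux over the samples:", mean_v1)
```

The sketch requires a bounded polytope, otherwise the chord has infinite length and no uniform point can be drawn on it.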
35 Alternative Optima: Uniform Random Sampling
  Wiback et al., 2004
  The problem of uniformly sampling a high-dimensional polytope is NP-hard
  Find a tight parallelepiped that bounds the polytope
  Randomly sample solutions from the parallelepiped
  Can be used to estimate the volume of the polytope
36 Topological Methods
  Not biased by a statement of an objective
  Network-based pathways:
    Extreme Pathways (Schilling et al., 1999)
    Elementary Flux Modes (Schuster et al., 1999)
  Decomposing flux distribution into extreme pathways
  Extreme pathways defining phenotypic phase planes
  Uniform random sampling
37 Extreme Pathways and Elementary Flux Modes
  Unique set of vectors that spans a solution space
  Consists of a minimum number of reactions
  Extreme Pathways are systematically independent (convex basis vectors)
A pathway is a metabolic state which satisfies stoichiometric and thermodynamic constraints.
Extreme pathways (EP) and elementary flux modes (EM) are both unique sets of pathways.
Both types of pathways are minimal, i.e. there is no other pathway that uses only a subset of their reactions.
Only extreme pathways are systematically independent: no EP can be described as a combination of the others. The EPs are the basis of the convex space.
Each metabolic state can be described as a non-negative combination of EPs.
EP and EM are the same when all exchange fluxes are unidirectional.
Example network.
There was an ongoing debate on which method is better in describing the phenotypic potential of the network. In a recent (and not so convincing) paper by the main supporters of both methods it was agreed that the main advantage of EPs is that there are far fewer of them in a typical network and that they have a mathematical justification, being the basis of the convex space. EMs are suitable for studying network properties such as redundancy, as they include all minimal pathways.
38 Extreme Pathways and Elementary Flux Modes
  Inherent redundancy in metabolic networks (Price et al., 2002)
  Robustness to gene deletion and changes in gene expression (Stelling et al., 2002)
  Enzyme subsets (correlated reaction sets) in yeast (Papin et al., 2002)
  Design strains (Carlson et al., 2002)
  Assign functions to genes (Forster et al., 2002)
Both methods were used in numerous papers to study different network properties:
The redundancy of the network in producing different nutrients (amino acids) was studied by Price et al. They estimated the redundancy by the number of pathways that involved the synthesis of the nutrients.
Robustness of the network to gene deletion was studied by Stelling et al. They estimated robustness to gene knockouts as the percentage of EMs that are still feasible after genes are knocked out. The percentage of EMs shows high correlation with measurements of lethality. Furthermore, they have shown that the maximal growth is robust to reduction in the number of feasible pathways.
Papin et al. used EPs to find sets of reactions that are always activated together. These sets are called enzyme subsets.
40 Altering Phenotypic Potential: Gene Knockouts
  Minimization Of Metabolic Adjustment (MOMA) (Segre et al., 2002)
    The flux distribution after a knockout is close to the wild-type’s state under the Euclidean norm
  Regulatory On/Off Minimization (ROOM) (Shlomi et al., 2005)
    Minimize the number of Boolean flux changes from the wild-type’s state
  [Figure: wild-type flux distribution w and knockout flux distribution v]
Predicting the metabolic state of an organism that undergoes a gene knockout in the lab is harder than predicting the metabolic state of a wild-type strain.
Minimization Of Metabolic Adjustment (MOMA), developed by Segre, is an optimization method which assumes that the metabolic state of the knocked-out organism should be close to that of the wild-type strain, as the mutated organism did not evolve to maximize its growth. MOMA uses a Euclidean metric to measure the distance between the metabolic states of the wild-type and knocked-out strains.
Regulatory On/Off Minimization (ROOM) is based on the same assumption (that the metabolic state of the knocked-out strain should be close to that of the wild-type strain), but uses a different distance metric, based on the number of Boolean flux changes from the metabolic state of the wild-type strain.
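The difference between the two distance metrics can be made concrete with a small comparison. The flux vectors below are hypothetical and hand-picked to contrast the metrics; they are not outputs of MOMA or ROOM, and the tolerance `tol` for deciding that a flux has changed is an assumption.

```python
import math


def euclidean_distance(w, v):
    """MOMA-style distance: Euclidean norm of the flux difference v - w."""
    return math.sqrt(sum((vi - wi) ** 2 for wi, vi in zip(w, v)))


def count_flux_changes(w, v, tol=1e-9):
    """ROOM-style distance: number of fluxes that differ from the wild-type."""
    return sum(1 for wi, vi in zip(w, v) if abs(vi - wi) > tol)


# Hypothetical fluxes: uptake, main route (knocked out), two alternative
# routes, growth.
wild_type = [10, 10, 0, 0, 10]
rerouted = [10, 0, 10, 0, 10]   # all flux moves to a single alternative route
split = [10, 0, 5, 5, 10]       # flux is spread over both alternative routes

for name, v in [("rerouted", rerouted), ("split", split)]:
    print(name,
          "Euclidean:", round(euclidean_distance(wild_type, v), 2),
          "changes:", count_flux_changes(wild_type, v))
```

With these numbers the Euclidean metric ranks the split distribution as closer to the wild-type, while the count of changed fluxes ranks the single rerouting as closer.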
41 Altering Phenotypic Potential
  Explaining gene dispensability (Papp et al., 2004)
    Only 32% of yeast genes contribute to biomass production in rich media
    Considered one arbitrary optimal growth solution
  OptKnock: identify gene deletions that generate a desired phenotype (Burgard et al., 2003)
  OptStrain: identify strains which can generate desired phenotypes by adding/deleting genes (Pharkya et al., 2004)
FBA was used to study metabolic states after the knockouts of dispensable genes (genes with no observable change in growth following their knockout). They found that only 32% of the genes had non-zero flux, i.e. contribute to biomass production in rich media. In a work we did, we found that if you consider alternative non-optimal FBA solutions, 90% of the genes may contribute in rich media. Therefore, their functional role can be found in rich media without the need to look at other media.
OptKnock is an optimization method developed by Burgard which identifies configurations of genes whose knockout may cause the overproduction of desired nutrients (amino acids). This method used FBA or MOMA to predict the metabolic state of the knocked-out strain.
OptStrain is a newer method which identifies a specific strain, along with a set of genes that can be added to or removed from its genome, for the same biotechnological purposes.
43 Cellular Adaptation to Genetic and Environmental Perturbations
  Transient changes in expression levels in hundreds of genes (Gasch 2000, Ideker 2001)
  Convergence to an expression steady-state close to the wild-type (Gasch 2000, Daran 2004, Braun 2004)
  Drop in growth rates followed by a gradual increase (Fong 2004)
  [Figures: expression ratio over time (minutes); growth over generations]
There are various experiments showing that following a gene knockout or some kind of environmental perturbation there are large-scale changes in gene expression levels. For example Ideker..
However, these experiments and others have shown that following an adaptation period the cell converges to a steady-state which is close to that of the wild-type strain. For example, the figure on the left (taken from Gasch et al.) shows the average expression ratio following a perturbation in the form of a stressful environment: sharp changes in expression levels are followed by a steady-state that is close to the wild-type.
A recent experiment by Fong et al. has shown that the growth rate of the organism drops immediately following the gene knockout, then gradually increases and reaches a steady-state growth rate which is close to or even higher than that of the wild-type.
44 Regulatory On/Off Minimization (ROOM)
  Predicts the metabolic steady-state following the adaptation to the knockout
  Assumes the organism adapts by minimizing the set of regulatory changes
  Finds the flux distribution with a minimal number of Boolean flux changes
  [Figure labels: Boolean regulatory change; Boolean flux change; w; v]
Based on these findings, we’ve developed the algorithm Regulatory On/Off Minimization to predict the metabolic steady-state following the adaptation of the knocked-out strain to the knockout.
ROOM is based on the assumption that an organism adapting to a gene knockout minimizes the set of regulatory changes that it makes. Assigning an equal “cost” to each regulatory change, ROOM tries to minimize the total “cost” of adapting to the knockout.
Now, since regulatory constraints are not explicitly incorporated into the metabolic model, ROOM identifies Boolean regulatory changes implicitly by identifying Boolean changes in flux. Therefore ROOM aims to find a feasible flux distribution with a minimal number of Boolean flux changes from the flux distribution of the wild-type strain.
45 ROOM: Implementation
  Solved using Mixed Integer Linear Programming (MILP)
  Boolean variable $y_i$: $y_i = 1$ if flux $v_i$ changes from the wild-type
    $\min \sum_i y_i$                        minimize changes
    s.t. $v - y (v_{max} - w) \le w$         distance constraints
         $v - y (v_{min} - w) \ge w$         distance constraints
         $S \cdot v = 0$                     mass balance constraints
         $v_j = 0,\ j \in G$                 knockout constraints
  MILP is NP-hard
    Relax Boolean constraints: solve using LP
    Relax strict constraint of proximity to wild-type
ROOM is implemented using Mixed Integer Linear Programming (MILP), an optimization framework similar to LP that allows variables to be restricted to integer values.
We define Boolean (0/1 integer) variables $y_i$ that specify whether the i’th flux changes from the wild-type’s flux distribution. In order to minimize the number of flux changes from the wild-type’s flux distribution, ROOM is formulated so as to minimize the sum of the $y_i$.
To constrain the i’th flux to its wild-type value if and only if $y_i$ equals zero, we use the two constraints labeled “distance constraints”. When $y_i$ equals zero, the distance constraints fix $v_i$ to its wild-type value $w_i$, and when $y_i$ equals one, the distance constraints do not impose new constraints on $v_i$.
MILP is NP-hard, meaning that no polynomial-time algorithm is known and the worst-case running time of known solvers grows exponentially with the size of the problem. Since the size of the problem is determined by the number of constraints, which is in the order of hundreds, solving the problem exactly can be computationally intractable.
In our analysis we have used two relaxation methods that provide a reasonable running time:
1. One uses Linear Programming by relaxing the Boolean constraints and allowing the Boolean variables to take any value between zero and one.
2. The other relaxes the strict requirement of proximity to the wild-type flux distribution, looking for a flux distribution that minimizes only the number of significant flux changes.
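The MILP on this slide can be written down for a toy case. This is a sketch under stated assumptions, not the authors’ implementation: it solves the strict formulation (neither of the two relaxations), it relies on `scipy.optimize.milp` (available in SciPy 1.9 and later), and the four-reaction network, the wild-type vector `w` and the function name `room` are hypothetical.

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, milp


def room(S, vmin, vmax, w, knockouts):
    """Minimize the number of fluxes that differ from the wild-type fluxes w."""
    m, n = S.shape
    vmin, vmax, w = (np.asarray(a, dtype=float) for a in (vmin, vmax, w))
    eye = np.eye(n)
    # Decision vector x = [v_1..v_n, y_1..y_n]; the objective is sum(y).
    c = np.concatenate([np.zeros(n), np.ones(n)])
    # S·v = 0
    mass_balance = LinearConstraint(np.hstack([S, np.zeros((m, n))]), 0, 0)
    # v - y (v_max - w) <= w
    upper = LinearConstraint(np.hstack([eye, -np.diag(vmax - w)]), -np.inf, w)
    # v - y (v_min - w) >= w
    lower = LinearConstraint(np.hstack([eye, -np.diag(vmin - w)]), w, np.inf)
    lb = np.concatenate([vmin, np.zeros(n)])
    ub = np.concatenate([vmax, np.ones(n)])
    for j in knockouts:
        lb[j] = ub[j] = 0.0          # v_j = 0 for knocked-out reactions
    integrality = np.concatenate([np.zeros(n), np.ones(n)])  # y is integer
    res = milp(c, constraints=[mass_balance, upper, lower],
               integrality=integrality, bounds=Bounds(lb, ub))
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n], np.round(res.x[n:]).astype(int)


# Hypothetical network: v0 uptake -> A, v1 and v2 are parallel routes A -> B,
# v3 drains B into biomass. The wild-type uses only route v1.
S = np.array([[1, -1, -1, 0],
              [0, 1, 1, -1]])
w = [10, 10, 0, 10]
v, y = room(S, vmin=[0, 0, 0, 0], vmax=[10, 10, 10, 10], w=w, knockouts=[1])
print("knockout fluxes:", v)
print("changed fluxes (y):", y)
```

In this toy case the cheapest adaptation is to switch the flux to the parallel route `v2`, which changes two fluxes and leaves the growth flux at its wild-type value.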
46 Example Network
  Wild-type’s solution.
  MOMA’s solution diverges flux.
  ROOM’s solution finds a short alternative pathway.
  ROOM preserves linearity of flow
  ROOM’s solution has maximal growth
47 ROOM’s Implicit Growth Rate Maximization
  ROOM implicitly attempts to maintain the maximal possible growth rate of the wild-type organism
  A change in growth requires numerous changes in fluxes
  [Figure: metabolites M1, M2, …, Mn drained by the growth reaction into biomass]
To understand this, let’s go back to the pseudo growth reaction that represents the growth rate of the organism. The growth reaction drains out from the cell the metabolites that are required for growth. The number of such metabolites is in the order of tens. Any change in growth rate, represented as a change in flux through this reaction, requires many more flux changes to preserve the mass balance constraint. So ROOM, by using a metric that minimizes the number of flux changes, implicitly prefers solutions that do not reduce the growth rate.
Therefore, although the organism did not evolve to maximize its growth under all possible knockout configurations, the evolved regulatory mechanisms that cope with the knockout by minimizing the number of regulatory changes may work to that effect.
48 Intracellular Flux Measurements
  Intracellular flux measurements in E. coli central carbon metabolism
  Obtained using NMR spectroscopy in 13C labeling experiments
  5 knockouts: pyk, pgi, zwf, gnd, ppc in Glycolysis and Pentose Phosphate pathways
  Glucose-limited and ammonia-limited media
  FBA wild-type predictions above 90% accuracy
  Emmerling, M. et al. (2002), Hua, Q. et al. (2003), Jiao, Z. et al. (2003), Peng et al. (2004)
To compare the flux predictions obtained by ROOM, MOMA and FBA on a real metabolic network, we searched the literature for experimental flux measurements in E. coli’s central carbon metabolism.
We’ve found experimental flux measurements in 4 different knocked-out strains on the Glycolysis and Pentose-Phosphate pathways, as shown in the figure. The flux measurements were done on different glucose-limited and ammonia-limited media.
All measurements were obtained in experiments using NMR spectroscopy with isotope-labeled carbon sources.
For each experiment, we start by applying FBA to predict the flux distribution of the wild-type strain. The accuracy of the predictions was above 90% for all experiments. We note that the accuracy is calculated as the Pearson correlation between the predicted and measured fluxes.
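Since accuracy here is defined as the Pearson correlation between predicted and measured fluxes, the computation itself is short. The sketch below shows that calculation only; the flux values are hypothetical placeholders, not data from the cited experiments.

```python
import math


def pearson(predicted, measured):
    """Pearson correlation coefficient between two equally long flux vectors."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / math.sqrt(var_p * var_m)


# Hypothetical fluxes, for illustration only.
predicted_fluxes = [100.0, 70.0, 30.0, 55.0]
measured_fluxes = [100.0, 65.0, 35.0, 60.0]
print("accuracy (Pearson r):", pearson(predicted_fluxes, measured_fluxes))
```

A correlation measures whether the predicted fluxes rise and fall with the measured ones, so it is insensitive to a uniform scaling of all predictions.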
49 Knockout Flux Predictions
  ROOM flux predictions are significantly more accurate than MOMA and FBA in 5 out of 9 experiments
  ROOM steady-state growth rate predictions are significantly more accurate than MOMA
Comparing the flux predictions obtained by ROOM with those of MOMA and FBA for the knocked-out strains, we get that, in 4 out of 8 experiments, ROOM flux predictions were significantly more accurate than MOMA and FBA. This is shown in the left graph, which gives the accuracy of the predictions obtained by ROOM (red), MOMA (green) and FBA (blue). Only in 1 out of the 8 experiments is MOMA’s prediction significantly more accurate than ROOM’s (we will discuss this later on).
Furthermore, we’ve found that the growth rate predictions obtained by ROOM and FBA are significantly more accurate than MOMA in 4 out of the 8 experiments. The graph on the right shows the error in the growth rate prediction as obtained by the different algorithms. We see that ROOM’s and FBA’s errors are less than 15% in all cases, while MOMA’s error (in green) reaches 50% and 90%.
Both ROOM and MOMA predict a flux distribution of the knocked-out organism starting from a specific flux distribution of the wild-type strain. We note that starting from alternative FBA solutions for the wild-type gives almost the same results that we present here.
50 ROOM vs. MOMA
  ROOM predicts the metabolic steady-state after adaptation
    Provides accurate flux predictions
      Preserves flux linearity
      Finds alternative pathways
    Predicts steady-state growth rates
  MOMA predicts transient metabolic states following the knockout
    Provides more accurate transient growth rates
51 Additional Constraints
  Transcriptional regulatory constraints (Covert et al., 2002)
    Boolean representation of the regulatory network
    Used to predict growth and changes in expression levels, and to simulate time courses of batch cultures
  Energy balance analysis (Beard et al., 2002)
    Loops are not feasible according to thermodynamic principles, resulting in a non-convex solution space
52 Additional Constraints: Slow Changes in the Environment
  Timescales of cellular processes are shorter than those of the surrounding environment
  Generate dynamic curves to simulate batch experiments (Varma et al., 1994)