1 The Systems Biology Modelling Cycle. EBI-BioPreDyn Workshop, 12-15 May 2014, UK
2 Parameter Estimation in Large-Scale Kinetic Models of Microorganisms. Alejandro F. Villaverde, (Bio)-Process Engineering group, IIM-CSIC
3 What is a kinetic model? (I) Many biological processes are non-stationary, time-dependent, dynamic. Example: metabolism. Central carbon metabolism (CCM) of E. coli: Chassagnole et al., Biotechnol. Bioeng. 79(1), 2002
4 What is a kinetic model? (II) Kinetic model: mathematical model of a dynamic system. It includes mathematical expressions of the rates at which the biochemical reactions take place: equations that describe fluxes as a function of concentrations.
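As a minimal sketch of "fluxes as a function of concentrations", the following toy example simulates a single reaction S -> P with a Michaelis-Menten rate law. The reaction, the parameter values, and the use of the explicit Euler method are illustrative assumptions and are not taken from the talk.

```python
def michaelis_menten(s, vmax, km):
    """Reaction rate (flux) as a function of the substrate concentration s."""
    return vmax * s / (km + s)


def simulate(s0, p0, vmax, km, t_end, dt=0.01):
    """Integrate dS/dt = -v, dP/dt = +v with the explicit Euler method."""
    s, p = s0, p0
    steps = int(round(t_end / dt))
    trajectory = [(0.0, s, p)]
    for k in range(1, steps + 1):
        v = michaelis_menten(s, vmax, km)  # flux at the current concentrations
        s, p = s - dt * v, p + dt * v      # mass balance for S and P
        trajectory.append((k * dt, s, p))
    return trajectory


# Hypothetical initial concentrations and kinetic parameters.
traj = simulate(s0=10.0, p0=0.0, vmax=1.0, km=0.5, t_end=5.0)
t_final, s_final, p_final = traj[-1]
print(f"t={t_final:.2f}  S={s_final:.3f}  P={p_final:.3f}")
```

The state variables are the concentrations, and the model output is their time course, which is the kind of prediction a purely stoichiometric description cannot give.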
5 Example: kinetic model of E. coli’s CCM Mass balance equations:
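For reference, mass balances of this kind are commonly written in the general form below. This is an assumption about the type of equation shown on the slide, not a transcription of it: $C_i$ denotes the concentration of metabolite $i$, $\nu_{ij}$ the stoichiometric coefficient of metabolite $i$ in reaction $j$, $r_j$ the rate of reaction $j$ (a function of concentrations $C$ and parameters $p$), and $\mu$ a specific growth rate accounting for dilution.

```latex
\frac{dC_i}{dt} = \sum_{j} \nu_{ij}\, r_j(C, p) - \mu\, C_i
```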
6 Example: kinetic model of E. coli’s CCM (in COPASI)
7 Why use kinetic models? Think of an example application: an industrial fermentation process. We would like to understand (and ideally improve) how a particular metabolite is produced in a bioreactor. This is a dynamic process: different events can affect the outcome. “Genome-scale kinetic models of metabolism are important for rational design of the metabolic engineering required for industrial biotechnology applications. They allow one to predict the alterations needed to optimize the flux or yield of the compounds of interest, while keeping the other functions of the host organism to a minimal, but essential, level.” (“Large-scale metabolic models: From reconstruction to differential equations”, K Smallbone, P Mendes. Industrial Biotechnology 2013, 9: 179–184)
8 Kinetic models vs. GEMs. GEMs = “GEnome-scale Metabolic models”. GEMs focus on stoichiometry, not dynamics: they include a large set of reactions, without kinetic detail. Constraint-based methods (FBA…) use GEMs to calculate steady-state fluxes, which is why GEMs are also called constraint-based models. However, GEMs cannot predict how behavior emerges from dynamic concentration changes of cellular components; for this, kinetic models are needed. Because GEMs do not define the kinetics of every rate expression, they can make predictions for steady state but cannot describe dynamic adaptations. That is, they cannot model the temporal evolution of the concentrations.
9 Kinetic models from GEMs. It is possible to start from a constraint-based model to build a kinetic model. Procedure: (1) start with a network stoichiometry; (2) add generic rate laws (linlog, Michaelis-Menten-like kinetics); (3) estimate unknown kinetic constants. Smallbone & Mendes presented a pipeline for creating thermodynamically consistent kinetic models, using limited data and ensuring consistency with known data and kinetic constants (“Large-scale metabolic models: From reconstruction to differential equations”, K Smallbone, P Mendes. Industrial Biotechnology 2013, 9: 179–184).
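The first two steps of this procedure can be sketched as follows: a stoichiometric matrix N is combined with generic Michaelis-Menten-like rate laws to obtain the right-hand side dx/dt = N v(x, p). The three-species chain A -> B -> C and all parameter values are hypothetical, and this is not the Smallbone & Mendes pipeline itself.

```python
# Stoichiometric matrix N: rows = species (A, B, C),
# columns = reactions (r1: A -> B, r2: B -> C).
N = [
    [-1, 0],
    [1, -1],
    [0, 1],
]


def rates(x, params):
    """Generic irreversible Michaelis-Menten-like rate law for each reaction."""
    a, b, _ = x
    (vmax1, km1), (vmax2, km2) = params
    return [vmax1 * a / (km1 + a), vmax2 * b / (km2 + b)]


def rhs(x, params):
    """Right-hand side of the kinetic model: dx/dt = N * v(x, params)."""
    v = rates(x, params)
    return [sum(N[i][j] * v[j] for j in range(len(v))) for i in range(len(N))]


# Hypothetical concentrations and (vmax, km) pairs; in practice the kinetic
# constants are the unknowns to be estimated in step (3).
dxdt = rhs([2.0, 1.0, 0.0], [(1.0, 1.0), (0.5, 1.0)])
print(dxdt)
```

Only the function `rates` carries kinetic information; the matrix N is exactly what a constraint-based model already provides.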
10 What is a “large-scale” kinetic model? Large-scale (LS) models have (at least) dozens of reactions and species, and hundreds of parameters. Example: E. coli’s CCM model, with 18 species (= dynamic states), 30 reactions, and 139 parameters.
11 Which models of microorganisms exist, and where to find them? Several LS kinetic models of microorganisms have been built, mostly for E. coli and S. cerevisiae. Talk by P. Mendes on Thursday: “Large-scale kinetic models of E. coli and yeast”. Model building takes time and resources. Are there (LS) kinetic models available? Yes! See databases, e.g. BioModels and CellML (although most of these models are not really LS).
12 BioPreDyn-bench: collection of benchmark problems for parameter estimation (PE) in LS models. Includes: yeast (metabolic); 2 x E. coli (metabolic; metabolic + transcriptional regulation); CHO (metabolic); D. melanogaster (development); generic signaling network. Available on the web (very soon!) in Matlab, AMIGO, COPASI, C, and SBML, including ready-to-run implementations.
14 So why are kinetic models not widely used (yet)? Kinetic models are very useful, but… still an exception in biotech applications. Problem: incomplete knowledge of regulatory interactions and kinetic parameters. This leads to limited accuracy of predictions. Parameter estimation (PE) is one of the ways of addressing this problem.
15 How to build a kinetic model? Model building steps: (1) define the purpose of the model; (2) establish the network structure (“wiring diagram”) of the model; (3) determine kinetic rate expressions (model structure = network structure + kinetics); (4) determine the parameters; (5) validate the model. Purpose: typical questions are: Why do we model? What do we want to use the model for? What type of behavior should the model be able to explain? (“Kinetic models in industrial biotechnology – Improving cell factory performance”, J Almquist, M Cvijovic, V Hatzimanikatis, J Nielsen, M Jirstrand. Metabolic Engineering 2014)
16 Parameter determination. Parameter values are sometimes established one by one, either from targeted experiments measuring them directly or from other types of a priori information on individual parameter values. In contrast, parameter values can also be determined simultaneously using parameter estimation (PE) methods. Parameter estimation is posed as an optimization problem (see the previous talk by Eva Balsa Canto).
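To illustrate what "parameter estimation as an optimization problem" means, here is a minimal sketch in which the cost is the sum of squared differences between model output and data, and the estimate is the parameter value that minimizes it. The exponential decay model, the noise-free synthetic data, and the coarse grid scan are assumptions chosen for brevity; real problems use the optimization methods discussed in the following slides.

```python
import math


def model(t, k, x0=1.0):
    """Hypothetical one-parameter model: exponential decay x(t) = x0 * exp(-k t)."""
    return x0 * math.exp(-k * t)


times = [0.0, 1.0, 2.0, 3.0, 4.0]
k_true = 0.7
data = [model(t, k_true) for t in times]  # noise-free synthetic "measurements"


def cost(k):
    """Least-squares cost: sum of squared residuals between model and data."""
    return sum((model(t, k) - y) ** 2 for t, y in zip(times, data))


# Coarse scan over candidate values of k in [0.01, 2.00]; the minimizer is the estimate.
candidates = [i / 100 for i in range(1, 201)]
k_best = min(candidates, key=cost)
print(k_best, cost(k_best))
```

All parameters are estimated simultaneously through the same cost function, in contrast to the one-by-one determination mentioned above.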
18 Overview of PE methods. Local vs. global: local methods converge to the closest optimum; when several optima exist, global optimization (GO) methods must be used. Deterministic vs. stochastic: deterministic GO methods guarantee that the solution is the global optimum, but the computational effort is very high; stochastic GO methods do not guarantee the global optimality of the solution, but they are frequently capable of finding excellent solutions in reasonable computation times.
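The local vs. global distinction can be seen on a one-dimensional function with two minima. In the sketch below, plain gradient descent (standing in for a local method) ends at whichever minimum is closest to its starting point, while a simple multi-start strategy recovers the better one. The test function, step size, and starting points are illustrative assumptions; multi-start is used here only as the simplest possible global strategy.

```python
def f(x):
    """Toy multimodal cost: local minimum near x = 1.13, global minimum near x = -1.30."""
    return x**4 - 3 * x**2 + x


def df(x):
    """Derivative of f."""
    return 4 * x**3 - 6 * x + 1


def local_descent(x0, step=0.01, iters=5000):
    """Fixed-step gradient descent: converges to the optimum closest to x0."""
    x = x0
    for _ in range(iters):
        x -= step * df(x)
    return x


# A single local search started on the right gets trapped in the local minimum.
local_from_right = local_descent(2.0)

# Multi-start: run the local method from several points and keep the best result.
starts = [-2.0, -1.0, 0.0, 1.0, 2.0]
best = min((local_descent(x0) for x0 in starts), key=f)
print(local_from_right, best)
```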
19 Parameter estimation: optimization methods. LOCAL NLP solvers: converge to the optimum closest to the initial guess; may end up in local solutions. GLOBAL NLP solvers. Branch and bound (BB or B&B) is a general algorithm for finding optimal solutions of various optimization problems, especially in discrete and combinatorial optimization. A branch-and-bound algorithm consists of a systematic enumeration of all candidate solutions, where large subsets of fruitless candidates are discarded en masse, by using upper and lower estimated bounds of the quantity being optimized. Hill climbing is a mathematical optimization technique which belongs to the family of local search. It is an iterative algorithm that starts with an arbitrary solution to a problem, then attempts to find a better solution by incrementally changing a single element of the solution. In a genetic algorithm, a population of candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem is evolved toward better solutions. Each candidate solution has a set of properties (its chromosomes or genotype) which can be mutated and altered.
20 Metaheuristics. Heuristic: procedure based on expert knowledge, not on formal analysis. Metaheuristic: general-purpose heuristic method designed to guide an underlying problem-specific heuristic. A metaheuristic is therefore a general algorithmic framework which can be applied to different optimization problems with relatively few modifications. Metaheuristic approaches are a particularly efficient class of stochastic GO methods. They combine mechanisms for exploring the search space and exploiting the obtained knowledge.
21 PE in LS kinetic models in biology. The difficult problem of PE of LS kinetic models: nonlinear systems; multi-modal problems (several local minima); need of time-series data (usually scarce); lack of identifiability; overfitting; aligning the model with the data…; computational issues (integrators, tolerances, …); different timescales (stiffness); CPU times can be very large (days, weeks…). Hence the use of stochastic (or hybrid) GO methods (metaheuristics). Smallbone & Mendes, 2013: inherent stiffness of genome-scale models. Since cells need to produce some metabolites at a much higher rate than others, metabolic processes will necessarily be taking place at different timescales. As such, systems biology tools are needed that can robustly simulate models of this size and with these numerical instabilities.
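Why stiffness matters for the choice of integrator can be shown on the standard linear test equation dy/dt = -lam * y with a fast rate constant. In this sketch the explicit Euler method becomes unstable at a step size for which the implicit (backward) Euler method remains stable. The value of lam and the step size are illustrative assumptions, and production solvers such as LSODA or CVODES use far more sophisticated schemes.

```python
def explicit_euler(lam, y0, dt, steps):
    """Explicit Euler for dy/dt = -lam * y: y_next = y + dt * (-lam * y)."""
    y = y0
    for _ in range(steps):
        y = y + dt * (-lam * y)
    return y


def implicit_euler(lam, y0, dt, steps):
    """Implicit Euler for dy/dt = -lam * y: y_next = y / (1 + lam * dt)."""
    y = y0
    for _ in range(steps):
        y = y / (1.0 + lam * dt)
    return y


lam, dt, steps = 1000.0, 0.01, 20  # fast process, comparatively large time step
y_explicit = explicit_euler(lam, 1.0, dt, steps)
y_implicit = implicit_euler(lam, 1.0, dt, steps)
print(y_explicit, y_implicit)  # the explicit result diverges, the implicit one decays
```

The true solution decays to essentially zero; an explicit method would need a step size below 2/lam to stay stable, which becomes prohibitive when fast and slow processes coexist in one model.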
22 Some classic, stochastic, nature-inspired GO methods. Genetic Algorithms: a population of candidate solutions is evolved toward better solutions; each candidate solution has a set of properties (chromosomes) which can be mutated. Swarm intelligence (Ant Colony Optimization, Particle Swarm…): mimics the movement of agents in a swarm. Simulated Annealing: mimics the annealing process in metallurgy, i.e. slow cooling of a material to produce crystals (the temperature controls the probability of accepting worse solutions). Etc.
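A minimal sketch of simulated annealing, to make the role of the temperature concrete: a worse candidate is accepted with probability exp(-(increase in cost) / temperature), and the temperature is lowered at each iteration. The test function, Gaussian proposal, geometric cooling schedule, and all numeric settings are assumptions for illustration only.

```python
import math
import random


def f(x):
    """Toy multimodal cost with two minima (hypothetical test function)."""
    return x**4 - 3 * x**2 + x


def simulated_annealing(f, x0, lower, upper, t0=1.0, cooling=0.995,
                        iters=2000, seed=0):
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    temp = t0
    for _ in range(iters):
        # Propose a random neighbour, kept inside the bounds.
        cand = min(upper, max(lower, x + rng.gauss(0.0, 0.5)))
        fc = f(cand)
        # Always accept improvements; accept worse points with a probability
        # that shrinks as the temperature decreases.
        if fc <= fx or rng.random() < math.exp(-(fc - fx) / temp):
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = x, fx
        temp *= cooling  # geometric cooling schedule
    return best_x, best_f


best_x, best_f = simulated_annealing(f, x0=2.0, lower=-3.0, upper=3.0)
print(best_x, best_f)
```

At high temperature the search can climb out of a local minimum; as the temperature drops it behaves more and more like a local search.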
23 Some classic, stochastic, nature-inspired GO methods doi: /j.eee
24 Some classic, stochastic, nature-inspired GO methods
25 PE methods: the eSS family. Scatter Search (SS): population-based metaheuristic (Glover 1977). Main differences with genetic algorithms (GA): SS orients its exploration systematically, relative to a set of reference points (RefSet), which allows it to exploit the information gathered by each solution. Besides, SS includes the Improvement Method (local search). Five-method template: Diversification Generation Method; Improvement Method; Reference Set Update Method; Subset Generation Method; Solution Combination Method.
26 PE methods: eSS. Diversification Generation Method: generates solutions. Improvement Method: enhances solutions. RefSet Update Method: selects a reference set of solutions (according to quality or diversity). Subset Generation Method: produces subsets of solutions from the RefSet. Solution Combination Method. (“Scatter search for chemical and bio-process optimization”, JA Egea, M Rodríguez-Fernández, JR Banga, R Martí. J Glob Optim (2007) 37:481–503)
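The five-method template can be sketched in a few dozen lines. This is a deliberately simplified illustration, not eSS or any published implementation: the RefSet is updated by quality only (no diversity criterion), subsets are all pairs, combination is a random point on the line through two solutions, and the improvement method is a basic coordinate search. The objective function and all settings are hypothetical.

```python
import itertools
import random


def objective(x):
    """Hypothetical test objective with minimum 0 at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


def clip(x, lo, hi):
    return [min(h, max(l, xi)) for xi, l, h in zip(x, lo, hi)]


def diversification(rng, n, lo, hi):
    """Diversification Generation Method: uniform random points in the box."""
    return [[rng.uniform(l, h) for l, h in zip(lo, hi)] for _ in range(n)]


def improve(f, x, lo, hi, step=0.1, iters=200):
    """Improvement Method: coordinate-wise local search with step halving."""
    x, fx = list(x), f(x)
    for _ in range(iters):
        moved = False
        for i in range(len(x)):
            for delta in (step, -step):
                y = list(x)
                y[i] += delta
                y = clip(y, lo, hi)
                fy = f(y)
                if fy < fx:
                    x, fx, moved = y, fy, True
        if not moved:
            step *= 0.5  # refine the search once no neighbour improves
    return x, fx


def update_refset(pool, size):
    """Reference Set Update Method: keep the `size` best solutions (quality only)."""
    return sorted(pool, key=lambda s: s[1])[:size]


def combine(rng, a, b, lo, hi):
    """Solution Combination Method: random point on the line through a and b."""
    w = rng.uniform(-0.5, 1.5)
    return clip([ai + w * (bi - ai) for ai, bi in zip(a, b)], lo, hi)


def scatter_search(f, lo, hi, refset_size=5, generations=10, seed=0):
    rng = random.Random(seed)
    pool = [improve(f, x, lo, hi) for x in diversification(rng, 20, lo, hi)]
    refset = update_refset(pool, refset_size)
    for _ in range(generations):
        children = []
        # Subset Generation Method: all pairs of RefSet members.
        for (a, _), (b, _) in itertools.combinations(refset, 2):
            child = combine(rng, a, b, lo, hi)
            children.append(improve(f, child, lo, hi))
        refset = update_refset(refset + children, refset_size)
    return refset[0]


best_x, best_f = scatter_search(objective, lo=[-5.0, -5.0], hi=[5.0, 5.0])
print(best_x, best_f)
```

On this convex toy problem the local search alone would already find the minimum; the point of the sketch is only to show how the five methods fit together.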
27 PE methods: the eSS family. Enhanced Scatter Search (eSS): advanced implementation of the SS metaheuristic. It combines SS with local methods (hybrid methodology) to accelerate convergence to the optimum, includes several improvements of the original method, and was developed for parameter estimation in LS biological problems. Egea JA, Martí R, Banga JR: An evolutionary method for complex-process optimization. Computers and Operations Research 2010, 37(2):315–324.
29 PE methods: the eSS family, extensions and implementations. CeSS: parallel cooperative version of eSS. SSmGO toolbox: eSS in Matlab. AMIGO: includes eSS, in Matlab. MEIGO: includes eSS & CeSS in Matlab & R (& Python interface to R). COPASI also includes SS in its latest release. SS implementation in C presented at this workshop (poster).
30 Example: PE of a LS kinetic model (I). LS kinetic model of yeast (UNIMAN): largest model included in BioPreDyn-bench (B1), with 1759 parameters, 285 reactions, and 276 species. Implementation difficulties: numerical problems, namely integration errors (LSODA in COPASI, CVODES in MATLAB).
31 Example: PE of a LS kinetic model (II). Ready-to-run implementations in AMIGO and COPASI. PE settings: parameter bounds [0.2×nominal, 5×nominal]; in AMIGO: eSS + DHC; max. time allowed = 1 week. Results: see next slides.
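The bound setting above can be sketched as follows: each parameter is allowed to vary between one fifth and five times its nominal value. The nominal values are hypothetical, and drawing the initial guess log-uniformly within the bounds is an assumption made here for illustration, not a setting reported in the talk.

```python
import math
import random


def bounds_from_nominal(nominal, factor=5.0):
    """Bounds [nominal / factor, nominal * factor]; factor=5 gives [0.2x, 5x]."""
    lower = [p / factor for p in nominal]
    upper = [p * factor for p in nominal]
    return lower, upper


def sample_log_uniform(rng, lower, upper):
    """Draw one guess per parameter, uniform in log space (requires positive bounds)."""
    return [math.exp(rng.uniform(math.log(l), math.log(u)))
            for l, u in zip(lower, upper)]


nominal = [0.1, 2.0, 50.0]  # hypothetical nominal parameter values
lower, upper = bounds_from_nominal(nominal)
guess = sample_log_uniform(random.Random(0), lower, upper)
print(lower, upper, guess)
```

Sampling in log space gives the same weight to halving a parameter as to doubling it, which suits bounds that are defined multiplicatively.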
35 Final remarks. Kinetic modeling: adequate modeling framework for dynamic systems. LS kinetic models are not widely used in systems biology yet, due to uncertainties, which limit applicability. Parameter estimation is necessary to address this issue. PE in LS kinetic models is problematic (and costly). Stochastic or hybrid GO methods are preferred. Tomorrow, 10:30h: practical session on PE and OED.
36 Recommended recent bibliography:
On kinetic models:
“Kinetic models in industrial biotechnology – Improving cell factory performance”. J Almquist, M Cvijovic, V Hatzimanikatis, J Nielsen, M Jirstrand. Metabolic Engineering 2014
“Advancing metabolic models with kinetic information”. H Link, D Christodoulou, U Sauer. Current Opinion in Biotechnology 2014, 29:8–14
“Modeling metabolic systems: the need for dynamics”. H-S Song, F DeVilbiss, D Ramkrishna. Current Opinion in Chemical Engineering 2013, 2:373–382
Yeast model (and others):
“Yeast 5 – an expanded reconstruction of the Saccharomyces cerevisiae metabolic network”. BD Heavner, K Smallbone, B Barker, P Mendes, LP Walker. BMC Systems Biology 2012, 6: 55
“Large-scale metabolic models: From reconstruction to differential equations”. K Smallbone, P Mendes. Industrial Biotechnology 2013, 9: 179–184
“BioPreDyn-bench: a suite of benchmark problems for dynamic modelling in systems biology”. AF Villaverde, D Henriques, K Smallbone, S Bongard et al. In preparation
PE methods:
“An evolutionary method for complex-process optimization”. JA Egea, R Martí, JR Banga. Computers and Operations Research 2010, 37(2):315–324
“A cooperative strategy for parameter estimation in large scale systems biology models”. AF Villaverde, JA Egea, JR Banga. BMC Systems Biology 2012, 6: 75
“MEIGO: an open-source software suite based on metaheuristics for global optimization in systems biology and bioinformatics”. JA Egea, D Henriques, T Cokelaer, AF Villaverde et al. BMC Bioinformatics 2014, arXiv:
37 Thanks for your attention. Now it’s dinner time!