Modeling and Associated Visualization Needs A Trilogy in Four Parts.


1 Modeling and Associated Visualization Needs A Trilogy in Four Parts

2 The Acts: Not in Chronological Order
Overview of the G2P cyberinfrastructure
Systems biology models (bottom up)
–Viz needs: Multivariate Dynamics, Inner Space, & Sensitivity Analysis
Ecophysiological models (top down)
–Viz needs: The same, plus Outer Space
Statistical models (non-mechanistic)
–Viz needs: Help & fast!!

3 Solving the G2P problem means developing a methodology that lets one start with some species & trait that one knows very little about and end with the ability to quantitatively predict trait scores for target genotype/environment combinations.
(Diagram labels: Ignorance; Prediction; Tools; Acquire data; Elicit hypotheses; Testing; Build quantitative models)
To work, such a methodology must be cyber-enabled.

4 (Diagram labels: User inferred; Seq data; Expression data; Metabolic data; Whole plant data; Environment data; Visualization; DI; Experiment; Modeling and Statistical Inference; Hypothesis; Super-user; Developer)

5 Systems Biology Models

6 Modeling a single gene
(Equation annotations: amount of gene product at time t; controlled by the amounts of upstream regulatory gene products; some fraction of M degrades per unit time; temperature)
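The single-gene balance annotated on this slide (synthesis controlled by upstream regulators, minus a fraction of M degrading per unit time) can be sketched as a small simulation. This is a minimal sketch, not the presenters' model: the function names `hill_activation` and `simulate_gene`, the Hill form chosen for regulation, and every parameter value are assumptions for illustration, and the temperature dependence is left out.

```python
def hill_activation(regulator, k_half=1.0, n=2.0):
    """Fraction of maximal synthesis for a given regulator amount (assumed Hill form)."""
    return regulator**n / (k_half**n + regulator**n)


def simulate_gene(regulator, m0=0.0, v_max=1.0, k_deg=0.1, dt=0.01, t_end=100.0):
    """Forward-Euler integration of dM/dt = v_max * f(regulator) - k_deg * M.

    All parameter values are hypothetical; `regulator` is held constant.
    """
    m = m0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        influx = v_max * hill_activation(regulator)  # regulated synthesis
        efflux = k_deg * m                           # a fixed fraction of M degrades
        m += dt * (influx - efflux)
    return m


if __name__ == "__main__":
    # With a constant regulator, M relaxes toward influx / k_deg.
    print(simulate_gene(regulator=1.0))
```

With a constant regulator the product level settles near `v_max * f(regulator) / k_deg`, which is the point where influx and degradation balance.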

7 Linking multiple genes…
(Diagram labels: Promoter Region; Transcription Factor; “A” Gene Codons; “B” Gene Codons; RNAP; DNA; Transcription; Translation; Prot. Syn.)

8 A “Bathtub” Model
Transcription Factors modulate reading
Temperature modulates all rates
Other Gene Products affect degradation

9 What is a “product”?
–RNAs: messenger (mRNA) & otherwise
–Some models do not distinguish mRNA & protein (e.g., when time scales are long)
–Some models individually represent mRNA, cytosolic protein, and nuclear protein
–Some models will separate products by tissue/organ (e.g., leaves, phloem, meristem)
–Many models include metabolites & protein complexes
–Basic equation is still the same (influx - efflux)

10 Input Activation: Hill Function; Linear; Constant Frac.; Michaelis-Menten; Mass Action; Etc.
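The rate-law shapes named on this slide can be written as one-line functions. The sketch below uses the standard textbook forms; the function names, argument names, and default parameter values are assumptions for illustration and do not come from the presentation.

```python
def hill(x, v_max=1.0, k_half=1.0, n=2.0):
    """Hill function: sigmoidal, saturating response to regulator amount x."""
    return v_max * x**n / (k_half**n + x**n)


def michaelis_menten(x, v_max=1.0, k_m=1.0):
    """Michaelis-Menten: hyperbolic saturation (a Hill function with n = 1)."""
    return v_max * x / (k_m + x)


def mass_action(a, b, k=1.0):
    """Mass action: rate proportional to the product of two reactant amounts."""
    return k * a * b


def linear(x, slope=1.0):
    """Linear: rate proportional to a single input."""
    return slope * x


def constant_fraction(m, frac=0.1):
    """Constant fraction: a fixed fraction of the pool m is removed per unit time."""
    return frac * m


if __name__ == "__main__":
    for x in (0.1, 1.0, 10.0):
        print(x, hill(x), michaelis_menten(x), linear(x))
```

Any of these can serve as the influx or efflux term of the basic balance equation; which one is used for a given gene is a modeling choice.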

11 Temperature effects

12 One form of temperature effect (Abstracted from Ellgaard et al. 1999)
(Diagram labels: Endoplasmic reticulum; Chaperone (folds/QC); Folded protein packaged to go; Bad protein (unreleased))

13 Temperature effects
Input Activation: Hill Function; Linear; Mass Action; Michaelis-Menten; Constant Frac.; Etc.

14 A close up – the diurnal clock
Barak et al., 2000 Locke et al., of 13 equations ? Locke et al., 2005
(Equation annotations: Hill function; Mass action; Michaelis-Menten; Influx - Efflux; mRNA; Translation; Net transport into nucleus; Environmental effect (light))

15 (S. Brady)

16

17 Sensitivity Analysis & Sloppy Systems Each letter is a power of two in sensitivity

18 Stiff & Sloppy Directions
(Figure labels: Parameter 1; Parameter 2; Optimum goodness-of-fit; “Stiff” direction; “Sloppy” direction)
All parameter combinations inside this ellipse yield essentially identical goodness-of-fit values.
Sloppy/Stiff ca
The “ellipses” may be “hyper-pancakes” with 15 to 30 sloppy directions. How can these be meaningfully visualized??
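One common way to find stiff and sloppy directions is to take the eigenvectors of the Hessian of the goodness-of-fit function at the optimum. The two-parameter sketch below assumes a quadratic approximation of the fit surface; the function names and the Hessian entries are invented for illustration and are not taken from any model in the presentation.

```python
import math


def eigen_sym_2x2(a, b, d):
    """Eigen-decomposition of the symmetric matrix [[a, b], [b, d]].

    Returns ((lam_big, stiff_vec), (lam_small, sloppy_vec)) with unit vectors.
    """
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    lam_big, lam_small = mean + radius, mean - radius
    # Orientation of the eigenvector belonging to the larger eigenvalue.
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    stiff = (math.cos(theta), math.sin(theta))
    sloppy = (-math.sin(theta), math.cos(theta))
    return (lam_big, stiff), (lam_small, sloppy)


def axis_ratio(lam_big, lam_small):
    """Sloppy/stiff axis-length ratio of a constant-cost ellipse.

    Under the quadratic approximation each semi-axis scales as 1 / sqrt(eigenvalue).
    """
    return math.sqrt(lam_big / lam_small)


if __name__ == "__main__":
    # Hypothetical Hessian of the cost with respect to two parameters.
    (lam_big, stiff), (lam_small, sloppy) = eigen_sym_2x2(100.5, 99.5, 100.5)
    print("stiff direction", stiff, "curvature", lam_big)
    print("sloppy direction", sloppy, "curvature", lam_small)
    print("axis ratio", axis_ratio(lam_big, lam_small))
```

In a model with tens of parameters the same idea applies to the full Hessian, and the spread of eigenvalues over many orders of magnitude is what makes the "hyper-pancake" hard to draw.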

19 Sloppy directions in a clock model GIGANTEA ? 71 parameters reduced to 46 parameters

20 Ecophysiological Models…
…come in three flavors
–Environmental physics models (1945 to present)
–Crop simulation models (1965 to present)
–Geochemical cycling models (blend the characteristics of both of the above; are more recent)
…are now poised to contribute to the G2P problem via a top-down approach

21 What is the focus of models in Environmental Physics?
Mimics conditions inside a uniform plant canopy;
The typical setting is an agricultural field;
– Includes plant-related, edaphic (soil), and meteorological inputs;
Based on physical principles;
– Conservation of matter and energy; convection, conduction;
– Some plant processes – gas exchange, photosynthesis, respiration;
Plant structure consists of leaves, stems, roots;
Time horizon typically a few days with time steps on the order of minutes.
Ergo plants often do not grow.

22 Environmental Physics Models:
1D or Bulk approach;
Big Leaf / Big Root submodels;
Bucket soil submodels;
Resistance analogs used for the atmospheric environment;
Limited prediction of soil or canopy scalar variables;
Many empirical relationships;
Nebulous controlling variables (e.g., canopy resistance to vapor flux);
Poor plant/environment feedback.
(Diagram labels: Atmosphere; Big Leaf; Big Root; Bucket of Soil)
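A "bucket" soil submodel treats the whole root zone as a single water store. The sketch below is one minimal form of that idea, not the formulation used by any particular model in the talk; the function name `bucket_step`, the units, and the capacity value are assumptions.

```python
def bucket_step(storage_mm, rain_mm, et_demand_mm, capacity_mm=150.0):
    """Advance a single-bucket soil water store by one time step.

    Returns (new_storage, actual_et, overflow), all in mm of water.
    """
    storage = storage_mm + rain_mm
    # Evapotranspiration cannot remove more water than the bucket holds.
    actual_et = min(et_demand_mm, storage)
    storage -= actual_et
    # Anything above capacity spills out as runoff or drainage.
    overflow = max(0.0, storage - capacity_mm)
    storage -= overflow
    return storage, actual_et, overflow


if __name__ == "__main__":
    storage = 100.0
    # Hypothetical daily (rain, ET demand) pairs in mm.
    for rain, demand in [(0.0, 5.0), (60.0, 3.0), (0.0, 6.0)]:
        storage, et, spill = bucket_step(storage, rain, demand)
        print(storage, et, spill)
```

The single store is what makes the approach cheap, and also why it gives limited prediction of within-soil variables: there is no depth profile to predict.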

23 Environmental Physics Models:
Multi-layer atmosphere, soil, and canopy;
“Scaled leaf” approach within canopy layers;
Relationships between photosynthesis, transpiration, and biophysics (e.g., stomatal action);
Use finite difference methods to compute soil heat, water, and gas flows;
Incorporate root density functions and soil physical properties.
(Diagram labels: Atmosphere Layers: T_AIR, VPD, CO2, wind speed profiles; Canopy Layers (Sunlit, Shade): T_CANOPY, VPD, CO2 canopy profiles; Soil Layers: T_SOIL profiles; Rooting Profile)
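The finite difference approach to soil heat flow can be illustrated with one explicit time step of the 1D heat equation on a uniform grid of soil layers. This is a minimal sketch under simplifying assumptions (constant thermal diffusivity, fixed temperatures at the surface and at the bottom boundary); the function name and the numerical values are hypothetical, and multi-layer models typically use more elaborate schemes.

```python
def step_soil_temperature(temps, surface_temp, bottom_temp, diffusivity, dz, dt):
    """One explicit (forward-time, centered-space) step of dT/dt = D * d2T/dz2.

    temps: temperatures of the interior soil layers, top to bottom (deg C)
    diffusivity: thermal diffusivity D (m^2/s); dz in m; dt in s
    """
    r = diffusivity * dt / dz**2
    if r > 0.5:
        raise ValueError("unstable step: need diffusivity * dt / dz**2 <= 0.5")
    # Pad with the fixed boundary temperatures so every layer has two neighbors.
    padded = [surface_temp] + list(temps) + [bottom_temp]
    return [
        padded[i] + r * (padded[i - 1] - 2.0 * padded[i] + padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


if __name__ == "__main__":
    temps = [10.0, 10.0, 10.0]
    # Hypothetical values: 5 cm layers, 60 s steps, warm surface.
    for _ in range(60):
        temps = step_soil_temperature(temps, 20.0, 10.0, 5e-7, 0.05, 60.0)
    print(temps)
```

The stability check is the reason time steps in these models are on the order of minutes: thinner layers force shorter steps.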

24 What is a Crop Growth Model?
Mimics one “average plant” at a field or smaller scale;
The plant environment is an agricultural production setting;
– Includes cultural- and production-related I/O variables;
– Includes varietal, edaphic, and meteorological inputs;
Based on physiological processes;
– Photosynthesis, respiration, transpiration, nutrient uptake, carbon partitioning, growth, and phenological development;
Plant structure consists of leaves, stems, roots, & grain;
Annual time horizon with daily or hourly time steps.

25 What is the current status of Crop Growth Models?
Skillful models can account for ca. 70% of yield variance;
Ongoing work focuses on refinement and applications;
– Problems being researched include methods for estimating cultivar and soil characteristics on an operational scale;
Model structures and approaches have matured;
Recent physical theory may not be emphasized;
Physical theory does not seem to improve predictions.
Interestingly, incorporating crop growth model components into physical models does not guarantee improved predictability either, even though physical scientists recognize knowledge of the plant as limiting.

26 Special case: Geochemical cycling models
Used to model “ecosystem services” and/or “land surface processes” inside general circulation models;
Blend of both kinds of models;
Includes plant-related, edaphic, and meteorological inputs;
Based on physical principles;
– Conservation of matter and energy; convection, conduction;
– Some plant processes – gas exchange, photosynthesis, respiration;
Plant structure consists of leaves, stems, roots;
Time horizon of years with time steps on the order of minutes (depends on spatial scale).

27 Main points:
Neither current crop growth models nor environmental physics models adequately depict plant process control mechanisms;
This accounts for the failure of models to mimic the plasticity of real plants across different environments;
The information needed to remedy this situation is emerging from the genomic sciences;
Incorporating this information requires a reorganization of crop models.

28 New Crop Growth Model Concept Energy Water N Physical Submodel [CPAI] [KE60] Sensors Control Submodel

29 Viz needs for ecophysiological models and G2P components
Largely the same as for systems biology models – multivariate dynamics in spatially discrete plant parts
Note that our “G2P solution” specifies predicting trait scores in non-constant environments.
–That most directly refers to the outdoors
–Therefore geographic variation must also be considered

30 A hazy shade of winter… One frame of a movie comparing the standard deviation of flowering time for the Columbia strain of A. thaliana germinating on each day. Projected by the gene-based model of Wilczek et al, The standard deviation is over five years (left, , real data; right, , A1B climate scenario.)

31 Statistical genetic methods I
Can be used to
–Predict phenotypes based on genotypes
–Locate regions of the genome likely to contain genes controlling particular phenotypes
Can be used when
–Knowledge of gene mechanisms is lacking
Big Caveat
–The mathematical form of the G2P relationship is just assumed to be linear
–… and the data & models elaborated until the job gets done to adequate accuracy

32 Statistical genetic methods II
Why does it work?
–Because there are sufficient regimes of near linearity buried in mechanistic network equations that general linear statistical models have levels of predictive skill useful for some purposes (e.g., crop breeding)
–Rest assured that there are limits to what should be expected of these models
How does it work?

33 What are genetic markers? Position within gene Aligned DNA sequences of 25 different genetic lines Single nucleotide polymorphism (SNP) (Data from the Purugganan Lab)

34 Different sibling lines will have different marker combinations The DNA sequence for line 1 has the same sequence as parent “B” at the location of marker “g17286”… …but in line 8 the DNA matches parent “A” at that location

35 Many different linear models Genome Wide Association Finding quantitative trait loci (QTL) Find markers i, j, and k such that is a good fit etc….
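The simplest member of this family of linear models regresses the phenotype on one marker at a time and ranks markers by how much variance each explains. The sketch below assumes markers are coded 0 (matches parent "A") or 1 (matches parent "B"); the function names and coding are illustrative choices, not the specific model the slide's missing formula showed.

```python
def fit_single_marker(genotypes, phenotypes):
    """Least-squares fit of phenotype = b0 + b1 * genotype.

    Returns (b0, b1, r_squared).
    """
    n = len(genotypes)
    mean_x = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(genotypes, phenotypes))
    if sxx == 0 or syy == 0:
        # Marker does not vary across lines, or phenotype is constant.
        return mean_y, 0.0, 0.0
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1, (sxy * sxy) / (sxx * syy)


def scan_markers(marker_table, phenotypes):
    """Fit every marker separately and sort by variance explained (best first)."""
    fits = {name: fit_single_marker(g, phenotypes) for name, g in marker_table.items()}
    return sorted(fits.items(), key=lambda item: item[1][2], reverse=True)
```

A usage example with invented data for eight sibling lines (the second marker name and all values are made up):

```python
markers = {
    "g17286": [1, 1, 0, 0, 1, 0, 1, 0],
    "marker_2": [0, 1, 0, 1, 0, 1, 0, 1],
}
trait = [12.0, 11.0, 8.0, 9.0, 13.0, 7.0, 12.0, 8.0]
for name, (b0, b1, r2) in scan_markers(markers, trait):
    print(name, b0, b1, r2)
```

Models with several markers, or with products of markers for interactions, extend the same least-squares idea to more columns.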

36 What a QTL analysis output looks like. This is a “1d-scan” – i.e. X_{m,j} (Buckler et al, Science, 2009)

37 Two Stat Inf Viz Problems Higher order scans e.g. –Remember SNP numbers can be in the 150K to 3M range. eQTL viz problems –Can be 30K phenotypes… –…and higher order scans
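A quick count shows why higher order scans are a visualization problem as well as a computational one. The sketch assumes that an order-k scan fits one model for every unordered set of k markers, repeated for each phenotype; the function name `n_tests` is hypothetical.

```python
import math


def n_tests(n_markers, order, n_phenotypes=1):
    """Number of model fits when every unordered set of `order` markers is tested."""
    return math.comb(n_markers, order) * n_phenotypes


if __name__ == "__main__":
    for snps in (150_000, 3_000_000):
        print(snps, "SNPs, 1d scan:", n_tests(snps, 1))
        print(snps, "SNPs, 2d scan:", n_tests(snps, 2))
    # eQTL case: a 1d scan repeated for 30K expression phenotypes.
    print("eQTL 1d scan:", n_tests(150_000, 1, n_phenotypes=30_000))
```

At 150K SNPs a pairwise scan already involves about 1.1e10 marker pairs, and at 3M SNPs about 4.5e12, so any plot has to summarize rather than show every test.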

38 eQTL Analysis – Looking for Regulators
(Diagram labels: Promoter Region; Transcription Factor; “A” Gene Codons; “B” Gene Codons; RNAP; DNA; Prot. Syn.)
Let “Pheno” be the amount of mRNA (expression) produced by gene “B”. This could be different in lines that varied either in the promoter of “B” or in lines that had differences in the coding region of gene “A”. These are called “cis” and “trans” effects, respectively.
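In practice the cis/trans distinction is often approximated by position: an eQTL that maps close to the gene whose expression it affects is labeled cis, anything else trans. The sketch below shows that positional proxy only; the function name, the coordinate convention, and the 1 Mb window are assumptions, not values from the presentation.

```python
def classify_eqtl(gene_chrom, gene_pos, eqtl_chrom, eqtl_pos, cis_window=1_000_000):
    """Label an eQTL 'cis' if it lies within `cis_window` bp of its target gene.

    Positional proxy only: it does not prove the variant is in the gene's
    own promoter, and the window size is an arbitrary illustrative choice.
    """
    same_chrom = gene_chrom == eqtl_chrom
    if same_chrom and abs(gene_pos - eqtl_pos) <= cis_window:
        return "cis"
    return "trans"


if __name__ == "__main__":
    # Hypothetical coordinates for gene "B" and two candidate eQTL.
    print(classify_eqtl("II", 5_200_000, "II", 5_350_000))
    print(classify_eqtl("II", 5_200_000, "IV", 9_000_000))
```

When eQTL position is plotted against gene position, the cis calls fall along the diagonal and loci that control many distant genes show up as trans hotspots.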

39 Massive eQTL Variation
75% of all genes have at least 1 eQTL
(Figure: position of eQTL for each of 15,771 genes, arranged by physical order; chromosomes I, II, III, IV, V; QTL effect: Bay +, Bay -; annotations: Cis Diagonal, Trans Hotspot) (D. Kliebenstein)

40 eQTL Viz Problems… How to plot interaction effects? That is, X_{m,j} X_{m,k}, and a gazillion phenotypes

41 Questions? Virtual soybean simulations from Han et al. 2007

42 Crop Model Example: CROPGRO-Soybean [KB00]
Modular structure
Hierarchical
Plant processes & parts
Standardized input & output files
FORTRAN
Modules not shown
–Water balance
–Photosynthesis
–Soil Nitro. & Temp
–Weather
Extensive research support
International use

