Activist Data Mining (as Applied to Carbon:Nitrogen Sensing in Plants) Dennis Shasha.

1 Activist Data Mining (as Applied to Carbon:Nitrogen Sensing in Plants) Dennis Shasha

2 New York University Department of Biology Gloria Coruzzi Mike Chou Andrew Kouranov Laurence Lejay Bud Mishra Marco Antoniotti Marc Rejali Courant Institute of Math & Computer Sciences Dennis Shasha

3 [Diagram labels: Light, Photosynthesis, Sugar, NH4+, Amino Acids (Asp, Glu, Asn, Gln)]

4 Light, Carbon and Amino acids differentially regulate N-assimilation genes. [Diagram labels: Light, Carbon, GS2, Gln (C:N = C5:N2); Light, Carbon, Amino acids, AS1, Asn (C:N = C4:N2)]

5 Goal: Figure out the circuit for many genes. Identify Arabidopsis mutants defective in C:N sensing. Forward genetics: selections for C:N sensing mutants. Reverse genetics: mutants in candidate C:N signaling genes. Ultimate goal: virtual plant… (frankenfoods). A multi-factor approach to C:N sensing in plants: identify how a combination of interactions of “inputs” (Light, Carbon, & Nitrogen) affects gene regulation, using combinatorial design and genome chip analysis.

6 A Combinatorial Approach to discovering interactions. Inputs: * Light * Starvation to Various Nutrients * Carbon * Inorganic N (NO3/NH4) * Organic N (Glu) * Organic N (Gln). If inputs take binary values (a first approximation), 6 binary (+/-) inputs = 2^6 = 64 input combinations (or treatments). Use combinatorial design to reduce the number of treatment combinations required to effectively cover the experimental space.

7 Combinatorial design generates a subset of the 64 treatments that gives a “good” approximation of the entire experimental space. For every pair of “inputs”, all four combinations of binary values are tested. Example: NO3 and Carbon have four possible combinations: +NO3 +Carbon; +NO3 -Carbon; -NO3 +Carbon; -NO3 -Carbon. Each of these pairwise combinations is present in at least one treatment of the experiments predicted by combinatorial design. ACTIVIST DATA MINING: Don’t study the experiments (only). Change them.
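
The pairwise-coverage idea on this slide can be sketched in code. The following is a minimal illustration using a simple greedy heuristic, which is an assumption of this sketch and not necessarily the procedure used in the talk. The input names and the function `greedy_pairwise` are hypothetical labels for the six binary inputs listed on slide 6.

```python
from itertools import combinations, product

# Hypothetical names for the six binary inputs from slide 6.
INPUTS = ["light", "starve", "carbon", "inorganic_n", "glu", "gln"]


def pairs_of(treatment):
    """All (input_i, input_j, value_i, value_j) pairs that one treatment covers."""
    idx = range(len(treatment))
    return {(i, j, treatment[i], treatment[j]) for i, j in combinations(idx, 2)}


def greedy_pairwise(n_inputs):
    """Greedy heuristic: repeatedly pick the treatment covering the most uncovered pairs."""
    candidates = list(product((0, 1), repeat=n_inputs))  # all 2^n treatments
    remaining = set()
    for t in candidates:
        remaining |= pairs_of(t)  # every pair that must be covered
    design = []
    while remaining:
        best = max(candidates, key=lambda t: len(pairs_of(t) & remaining))
        design.append(best)
        remaining -= pairs_of(best)
    return design


design = greedy_pairwise(len(INPUTS))
print(f"{len(design)} treatments instead of {2 ** len(INPUTS)}")
for t in design:
    # Print each treatment as +input / -input, as on the slide.
    print(" ".join(("+" if v else "-") + name for name, v in zip(INPUTS, t)))
```

A greedy search does not guarantee the smallest possible design, but it always terminates with every pair of inputs seen in all four +/- combinations.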

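8 EXPT 1, PIVOT: LIGHT. “Combinatorial design” predicts 12 conditions to test the effect of Light in all combinations of Starvation, Carbon, and Nitrogen. [Table residue; some cells were lost in extraction. Columns: LANE, LIGHT, STARVE, CARBON, NO3/NH4, GLU, GLN. Surviving row fragments: 1 LIGHT N; Y 0 L 0 H; 3 Y L 0 H 0; 4 N L L H H; 5 N 0 0 H H; 6 N L L 0 0; 7 DARK N; Y 0 L 0 H; 9 Y L 0 H 0; 10 DARK N L L H H; 11 DARK N 0 0 H H; 12 DARK N L L 0 0]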

9 “Pivot” analysis of gene expression data from C:N treatments. Find “minimal pairs” of treatments that are the same except in one input (e.g. Light) to measure its effect on a dependent variable (gene) (e.g. AS1). Example row: PIVOT = LIGHT; Dependent Variable (Gene) = AS1; EFFECT = repress; Evidence (minimal pair treatments) = 4_8; LITE = L_D; STARVE = N; CARBON = L; NO3 = 0; GLU = H. Analyze a series of minimal pair treatments using one input (e.g. Light) as a “pivot”, to determine the effect of light on a dependent variable (e.g. AS1) under a variety of carbon and nitrogen combinations. If the effect is consistent, it is likely always true.
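
Finding minimal pairs is mechanical once treatments are tabulated. The sketch below is illustrative only: the function name `minimal_pairs` and the dictionary layout are assumptions, lanes 4 and 8 take their levels from the example row on this slide, and lane 9 is a made-up treatment added to show a non-matching case.

```python
from itertools import combinations


def minimal_pairs(treatments, pivot):
    """Return lane pairs whose treatments differ in the pivot input and nothing else."""
    pairs = []
    for a, b in combinations(sorted(treatments), 2):
        ta, tb = treatments[a], treatments[b]
        differing = [name for name in ta if ta[name] != tb[name]]
        if differing == [pivot]:
            pairs.append((a, b))
    return pairs


# Lane -> level of each input. Lanes 4 and 8 follow the slide's example row
# (light L vs. D, starve N, carbon L, NO3 0, GLU H); lane 9 is hypothetical.
treatments = {
    4: {"light": "L", "starve": "N", "carbon": "L", "no3": "0", "glu": "H"},
    8: {"light": "D", "starve": "N", "carbon": "L", "no3": "0", "glu": "H"},
    9: {"light": "D", "starve": "Y", "carbon": "L", "no3": "0", "glu": "H"},
}

print(minimal_pairs(treatments, "light"))   # lanes that differ only in light
print(minimal_pairs(treatments, "starve"))  # lanes that differ only in starvation
```

Comparing a gene's expression across each returned pair then isolates the effect of the pivot in that context.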

10 LITE represses AS1 & induces GS2 under a variety of C:N conditions. (Evidence = minimal pair treatments.)

PIVOT  dependent  EFFECT   Evidence  LITE  STARVE  CARBON  NO3/NH4  GLU
LIGHT  AS1        repress  1_5       L_D   Y       0       0        0
LIGHT  AS1        repress  2_6       L_D   Y       L       0        0
LIGHT  AS1        repress  3_7       L_D   Y       L       L        0
LIGHT  AS1        repress  4_8       L_D   N       L       0        H
LIGHT  AS1        repress  10_14     L_D   N       0       0        0
LIGHT  AS1        repress  11_15     L_D   Y       L       0        0
LIGHT  AS1        repress  12_16     L_D   Y       L       L        0
LIGHT  AS1        repress  13_17     L_D   Y       L       0        H
LIGHT  GS2        induce   1_5       L_D   Y       0       0        0
LIGHT  GS2        induce   2_6       L_D   Y       L       0        0
LIGHT  GS2        induce   3_7       L_D   Y       L       L        0
LIGHT  GS2        induce   4_8       L_D   Y       L       0        H
LIGHT  GS2        induce   10_14     L_D   N       0       0        0
LIGHT  GS2        induce   11_15     L_D   N       L       0        0
LIGHT  GS2        induce   12_16     L_D   N       L       L        0
LIGHT  GS2        induce   13_17     L_D   Y       L       0        H

11 GLU induces AS1 & represses GS2 under a variety of conditions. (Evidence = minimal pair treatments.)

PIVOT  Gene  EFFECT   Evidence  LIGHT  STARVE  Carbon  NO3/NH4  GLU
GLU    AS1   induce   2_4       L      Y       L       0        0_H
GLU    AS1   induce   6_8       D      Y       L       0        0_H
GLU    AS1   induce   15_17     D      Y       L       0        0_H
GLU    AS1   induce   19_21     D      N       L       0        0_H
GLU    AS1   induce   23_25     L      N       L       0        0_L
GLU    AS1   induce   26_28     L      Y       L       0        0_L
GLU    AS1   induce   30_32     L      Y       L       0        0_L
GLU    GS2   repress  2_4       L      Y       L       0        0_H
GLU    GS2   repress  6_8       D      Y       L       0        0_H
GLU    GS2   repress  11_13     L      Y       L       0        0_H
GLU    GS2   repress  15_17     D      Y       L       0        0_H
GLU    GS2   repress  19_21     D      N       L       0        0_H
GLU    GS2   repress  20_22     D      N       L       L        0_H
GLU    GS2   repress  23_25     L      N       L       0        0_L
GLU    GS2   repress  30_32     L      Y       L       0        0_L

12 Underlying Method: combinatorial design. Combinatorial design: inspired by work in software testing by David Cohen, Siddhartha Dalal, Michael Fredman, and Gardner Patton at Bellcore/Telcordia. Their problem: how to choose a good set of inputs to a program to discover whether it has any bugs. Not program coverage, but input coverage. Not all input combinations, but all value combinations for every pair of input variables. Hypothesis: every input combination should give the same output: no error. If this holds for the designed subset, then the program is judged OK.

13 Underlying Method: combinatorial design 2. Scientific question: does input X induce (or, respectively, repress) the output? If so, then X should induce regardless of the other inputs. So choose X = low together with a combinatorial design of the other inputs, then choose X = high together with the same combinatorial design of the other inputs. If, for each context c in the design, (high, c) has more output than (low, c), where the two form a minimal pair, then X is inductive.
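
The test on this slide reduces to comparing outputs across minimal pairs. Here is a small sketch under stated assumptions: `classify_pivot` is a hypothetical helper, contexts are written as tuples of the other inputs' levels, and the expression values are made-up numbers used only to exercise the logic.

```python
def classify_pivot(readings):
    """Classify input X from {context: (output at X=low, output at X=high)}."""
    if not readings:
        raise ValueError("need at least one minimal pair")
    ups = [c for c, (low, high) in readings.items() if high > low]
    downs = [c for c, (low, high) in readings.items() if high < low]
    if len(ups) == len(readings):
        return "induces"      # (high, c) > (low, c) in every context c
    if len(downs) == len(readings):
        return "represses"    # (high, c) < (low, c) in every context c
    return "mixed"            # not uniform; candidate for adaptive follow-up


# Made-up expression levels for three contexts of the other inputs.
readings = {
    ("Y", "0", "0"): (1.0, 2.5),
    ("Y", "L", "0"): (0.8, 1.9),
    ("N", "L", "H"): (1.2, 3.0),
}
print(classify_pivot(readings))
```

Real expression data would need a noise threshold or a statistical test in place of the strict comparisons used here.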

14 Underlying Methods: adaptive design. What happens when X isn’t uniformly inductive or repressive? Suppose X normally shows induction but occasionally shows repression. That is, for most c values, (low, c) vs. (high, c) shows induction, but for one context c’, (low, c’) vs. (high, c’) shows repression. Then study the differences between c’ and the inducing c values that are closest to it, and design experiments to reduce those differences.
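
One way to read "closest" is Hamming distance between contexts, meaning the number of inputs whose levels differ. That reading is an assumption of this sketch, as are the function names and the example contexts, which are illustrative and not taken from the experiments.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two contexts differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_normal_contexts(effects):
    """For each exceptional context, list the closest normal contexts and the differing positions."""
    majority = Counter(effects.values()).most_common(1)[0][0]
    normal = [c for c, e in effects.items() if e == majority]
    odd = [c for c, e in effects.items() if e != majority]
    report = {}
    for c_odd in odd:
        best = min(hamming(c_odd, c) for c in normal)
        report[c_odd] = [
            (c, [i for i, (x, y) in enumerate(zip(c_odd, c)) if x != y])
            for c in normal
            if hamming(c_odd, c) == best
        ]
    return report


# Hypothetical contexts: (starve, carbon, inorganic N, glu) -> observed effect of X.
effects = {
    ("Y", "0", "0", "0"): "induce",
    ("Y", "L", "0", "0"): "induce",
    ("Y", "L", "L", "0"): "induce",
    ("N", "L", "0", "H"): "repress",
}
for c_odd, neighbours in nearest_normal_contexts(effects).items():
    print(c_odd, "->", neighbours)
```

The differing positions suggest which follow-up treatments to run: change one of those inputs at a time, moving from the normal context toward the exceptional one.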

15 Conclusions About Methodology. Design, don’t wait: use the data you are given, sure, but don’t be shy about asking for more. Combinatorial design can help test a hypothesis: e.g., 10 three-valued variables require 3^10 = 59,049 experiments to cover the whole space. Combinatorial design can reduce this to 27. Adaptation is easy: study differences between normal cases and abnormal ones to discover fine structure.
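
The slide does not say how its 27-run design was obtained. As one possible illustration, a standard orthogonal-array construction over arithmetic mod 3 gives 27 runs in which every pair of the 10 variables takes all 9 value combinations. The function names below are hypothetical, and this construction is not claimed to be the one used in the talk.

```python
from itertools import combinations, product


def directions_mod3():
    """The 13 nonzero vectors in {0,1,2}^3 whose first nonzero entry is 1."""
    result = []
    for v in product(range(3), repeat=3):
        nonzero = [c for c in v if c != 0]
        if nonzero and nonzero[0] == 1:
            result.append(v)
    return result


def design_27(num_factors=10):
    """27 runs; factor k of run x is the dot product of direction k with x, mod 3."""
    dirs = directions_mod3()[:num_factors]
    return [
        tuple(sum(a * b for a, b in zip(d, x)) % 3 for d in dirs)
        for x in product(range(3), repeat=3)
    ]


def covers_all_pairs(runs, levels=3):
    """True if every pair of factors shows all levels*levels value combinations."""
    for i, j in combinations(range(len(runs[0])), 2):
        if len({(r[i], r[j]) for r in runs}) != levels * levels:
            return False
    return True


design = design_27(10)
print(len(design), "runs versus", 3 ** 10, "for the full space")
print("all pairs covered:", covers_all_pairs(design))
```

The construction works because any two distinct directions are linearly independent mod 3, so each pair of factors sweeps all nine combinations across the 27 runs.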

