1 Detecting selection using phylogeny

2 Evaluation of prediction methods
- Comparing our results to experimentally verified sites

Our prediction gives:    Positive (hit)                 Negative
Is the prediction correct?
  True                   True-positive                  True-negative
  False                  False-positive (false alarm)   False-negative (miss)

3 Calibrating the method
- All methods have a parameter (cutoff) that can be calibrated to improve the accuracy of the method.
- For example: the E-value cutoff in BLAST

4 Calibrating the E-value cutoff

Our prediction gives:    Positive (hit)                                  Negative
Is this a homolog?
Is the prediction correct?
  True                   True-positive (real homolog)                    True-negative (real non-homolog)
  False                  False-positive (false alarm: not a homolog)     False-negative (missed a homolog)

5 Calibrating the E-value
- What will happen if we raise the E-value cutoff (for instance, work with all hits whose E-value is < 10)?

Our prediction gives:    Positive (hit)                 Negative
Is the prediction correct?
  True                   True-positive                  True-negative
  False                  False-positive (false alarm)   False-negative (miss)

6 Calibrating the E-value
- On the other hand, what will happen if we lower the E-value cutoff (look only at hits with an E-value < 10^-8)?

Our prediction gives:    Positive (hit)                 Negative
Is the prediction correct?
  True                   True-positive                  True-negative
  False                  False-positive (false alarm)   False-negative (miss)

7 Improving prediction
- Trade-off between specificity and sensitivity

8 Sensitivity vs. specificity
- Sensitivity = True positive / (True positive + False negative)
  The denominator represents all the proteins that are really homologous. Sensitivity measures how well we hit real homologs.
- Specificity = True negative / (True negative + False positive)
  The denominator represents all the proteins that are really NOT homologous. Specificity measures how well we avoid real non-homologs.
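The two ratios on this slide can be written as a minimal Python sketch. The function names and the example counts are illustrative assumptions, not part of the original presentation.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of the real homologs that were reported as hits."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of the real non-homologs that were correctly rejected."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical counts: 8 homologs found, 2 missed; 9 non-homologs rejected, 1 false alarm.
    print("sensitivity:", sensitivity(true_pos=8, false_neg=2))
    print("specificity:", specificity(true_neg=9, false_pos=1))
```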

9 Effect of the cutoff
- Raising the E-value cutoff to 10: sensitivity ↑, specificity ↓
- Lowering the E-value cutoff to 10^-8: sensitivity ↓, specificity ↑
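The trade-off can be seen on a toy example. The sketch below uses made-up (E-value, is-real-homolog) pairs, not real BLAST output, and the names `TOY_HITS` and `sens_spec` are assumptions introduced only for illustration.

```python
# Each entry: (E-value of the hit, whether it is truly a homolog). Invented data.
TOY_HITS = [
    (1e-40, True), (1e-15, True), (1e-6, True), (2.0, True),
    (1e-3, False), (0.5, False), (5.0, False), (50.0, False),
]


def sens_spec(hits, cutoff):
    """Return (sensitivity, specificity) when hits with E-value < cutoff count as positive."""
    tp = fp = tn = fn = 0
    for evalue, is_homolog in hits:
        predicted_positive = evalue < cutoff
        if predicted_positive and is_homolog:
            tp += 1
        elif predicted_positive and not is_homolog:
            fp += 1
        elif not predicted_positive and is_homolog:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    for cutoff in (10, 1e-8):
        sens, spec = sens_spec(TOY_HITS, cutoff)
        print(f"cutoff={cutoff:g}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

On this toy data the permissive cutoff catches every homolog but lets in most non-homologs, while the strict cutoff rejects every non-homolog but misses half of the homologs.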

10 Functional prediction in proteins (purifying and positive selection)

11 Darwin: the theory of natural selection
- Adaptive evolution: favorable traits will become more frequent in the population

12 Adaptive evolution
- Natural selection favors a single allele, so the allele frequency shifts continuously in one direction

13 Kimura: the theory of neutral evolution
- Neutral evolution: most molecular changes do not change the phenotype
- Selection operates to preserve a trait (no change)

14 Purifying selection
- Stabilizes a trait in a population:
  Small babies → more illness
  Large babies → more difficult birth
  …
  → Baby weight is stabilized around 3-4 kg

15 Purifying selection (conservation): the molecular level
- Histone H3

16 Synonymous vs. non-synonymous substitutions
Purifying selection: excess of synonymous substitutions

17 Synonymous vs. non-synonymous substitutions
Purifying selection: excess of synonymous substitutions
Synonymous substitution: GUU → GUC (Val → Val)
Non-synonymous substitution: GUU → GCU (Val → Ala)
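A short sketch of how a single codon substitution could be classified, assuming the standard genetic code. The table is built from the conventional U, C, A, G ordering, and the function name `classify_substitution` is a hypothetical helper, not something from the presentation.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(before: str, after: str) -> str:
    """Label a codon change as synonymous or non-synonymous.

    Accepts RNA or DNA codons (T is treated as U).
    """
    aa_before = CODON_TABLE[before.upper().replace("T", "U")]
    aa_after = CODON_TABLE[after.upper().replace("T", "U")]
    return "synonymous" if aa_before == aa_after else "non-synonymous"


if __name__ == "__main__":
    # The two examples from the slide.
    print("GUU -> GUC:", classify_substitution("GUU", "GUC"))
    print("GUU -> GCU:", classify_substitution("GUU", "GCU"))
```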

18 Conservation as a means of predicting function
- Infer the rate of evolution at each site
- Low rate of evolution → constraints on the site to prevent disruption of function: active sites, protein-protein interactions, etc.

19 Conservation as a means of predicting function

Site             1 2 3 4 5 6 7
Human            D M A A H A M
Chimp            D E A A G G C
Cow              D Q A A W A P
Fish             D L A A C A L
S. cerevisiae    D D G A F A A
S. pombe         D D G A L G E

20 Which site is more conserved?

Site             1 2 3 4 5 6 7
Human            D M A A H A M
Chimp            D E A A G G C
Cow              D Q A A W A P
Fish             D L A A C A L
S. cerevisiae    D D G A F A A
S. pombe         D D G A L G E
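One naive way to answer the question is to count the distinct residues in each alignment column. The sketch below does only that; it is a simplification chosen for illustration and is not the rate-based score used by ConSurf.

```python
# The toy alignment from the slide.
ALIGNMENT = {
    "Human": "DMAAHAM",
    "Chimp": "DEAAGGC",
    "Cow": "DQAAWAP",
    "Fish": "DLAACAL",
    "S. cerevisiae": "DDGAFAA",
    "S. pombe": "DDGALGE",
}


def distinct_residues_per_site(alignment):
    """Return, for each column, the number of different residues observed."""
    columns = zip(*alignment.values())
    return [len(set(column)) for column in columns]


if __name__ == "__main__":
    for site, count in enumerate(distinct_residues_per_site(ALIGNMENT), start=1):
        print(f"site {site}: {count} distinct residue(s)")
```

Sites 1 and 4 are invariant under this count, while sites 3 and 6 tie with two residues each, so a simple count cannot rank them against each other.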

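21 Use phylogenetic information

Site             1 2 3 4 5 6 7
Human            D M A A H A M
Chimp            D E A A G G C
Cow              D Q A A W A P
Fish             D L A A C A L
S. cerevisiae    D D G A F A A
S. pombe         D D G A L G E

Residues at the tree tips (Human, Chimp, Cow, Fish, S. cerevisiae, S. pombe):
Site 6: A G A A A G
Site 3: A A A A G G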
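Sites 3 and 6 in the alignment above each contain four A's and two G's, yet they differ once the tree is taken into account. The sketch below counts the minimum number of changes per site with Fitch parsimony. The tree topology is an assumption (human and chimp together, then cow, then fish, with the two yeasts as a sister group), and parsimony is used here as a simple stand-in, not as the method the ConSurf server applies.

```python
ALIGNMENT = {
    "Human": "DMAAHAM",
    "Chimp": "DEAAGGC",
    "Cow": "DQAAWAP",
    "Fish": "DLAACAL",
    "S. cerevisiae": "DDGAFAA",
    "S. pombe": "DDGALGE",
}

# Assumed rooted species tree, written as nested pairs.
TREE = (((("Human", "Chimp"), "Cow"), "Fish"), ("S. cerevisiae", "S. pombe"))


def fitch(node, states):
    """Return (candidate ancestral states, minimum changes) for the subtree at node."""
    if isinstance(node, str):  # leaf: its state is the observed residue
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:  # children can agree: no extra change needed here
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def min_changes(alignment, site):
    """Minimum number of substitutions at a 1-based alignment site on TREE."""
    states = {species: seq[site - 1] for species, seq in alignment.items()}
    return fitch(TREE, states)[1]


if __name__ == "__main__":
    for site in (3, 6):
        print(f"site {site}: at least {min_changes(ALIGNMENT, site)} change(s)")
```

Under this assumed tree, site 3 is explained by a single change on the branch separating the yeasts from the animals, whereas site 6 needs two independent changes, so site 3 is the more conserved of the two.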

22 Prediction of conserved residues by estimating evolutionary rates at each site
ConSurf/ConSeq web servers

23 Working process
1. Input a protein with a known 3D structure (PDB ID or file provided by the user)
2. Find homologous protein sequences (PSI-BLAST)
3. Perform multiple sequence alignment (removing duplicates)
4. Construct an evolutionary tree
5. Calculate the conservation score for each site
6. Project the results on the 3D structure

24 The KcsA potassium channel
- An outstanding mystery: how does the KcsA potassium channel conduct only K+ ions and not Na+?

25 The KcsA potassium channel structure
- The structure of the KcsA channel was solved in 1998
- KcsA is a homotetramer with a four-fold symmetry axis about its pore.

26 The KcsA potassium selectivity filter
- The selectivity filter identifies water molecules bound to K+
- When water is bound to Na+: no passage

27 Conservation analysis of KcsA
- Use ConSurf to study KcsA conservation

28 ConSurf results

29 ConSeq
- ConSeq performs the same analysis as ConSurf but displays the results on the sequence.
- Predicts whether each residue is buried or exposed
- Exposed & conserved → functionally important site
- Buried & conserved → structurally important site

30 ConSeq analysis
- Exposed & conserved → functionally important site
- Buried & conserved → structurally important site
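The rule on this slide amounts to a two-way lookup. The sketch below encodes it with a hypothetical `predict_role` function; how ConSeq actually decides that a residue is "conserved" or "exposed" is not shown in the slides, so both flags are taken as given inputs here.

```python
def predict_role(conserved: bool, exposed: bool) -> str:
    """Apply the slide's rule of thumb to one residue.

    conserved and exposed are assumed to come from an upstream analysis.
    """
    if not conserved:
        return "no prediction"
    return "functionally important" if exposed else "structurally important"


if __name__ == "__main__":
    for conserved, exposed in [(True, True), (True, False), (False, True)]:
        print(f"conserved={conserved}, exposed={exposed}: {predict_role(conserved, exposed)}")
```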

31 Positive selection & drug resistance

32 Darwin: the theory of natural selection
- Adaptive evolution: favorable traits will become more frequent in the population

33 Adaptive evolution on the molecular level

34 Adaptive evolution on the molecular level
Look for changes which confer an advantage

35 Naïve detection
- Observe the multiple sequence alignment: variable regions = adaptive evolution??

36 Naïve detection
- The problem: how do we know which sites are simply under no selection pressure ("non-important" sites) and which are under adaptive evolution?

37 Solution: look at the DNA
Synonymous vs. non-synonymous substitutions

38 Solution: look at the DNA
- Purifying selection: Syn > Non-syn
- Adaptive evolution = positive selection: Non-syn > Syn
- Neutral evolution: Syn = Non-syn

39 Also known as … Ka/Ks (or dN/dS, or ω)
Ka = non-synonymous substitution rate; Ks = synonymous substitution rate
- Purifying selection: Ka < Ks (Ka/Ks < 1)
- Neutral evolution: Ka = Ks (Ka/Ks = 1)
- Positive selection: Ka > Ks (Ka/Ks > 1)
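The three cases can be expressed as a small function. This is a sketch of the slide's rule only: the function name and the `tolerance` parameter are assumptions, and it takes already-estimated Ka and Ks values as input. A real analysis (for example with Selecton) relies on statistical testing and not on a bare comparison with 1.

```python
def classify_selection(ka: float, ks: float, tolerance: float = 0.0) -> str:
    """Classify the selection regime from the Ka/Ks ratio (omega).

    tolerance is a hypothetical band around 1 within which the ratio
    is treated as neutral; the default of 0 reproduces the slide exactly.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    omega = ka / ks
    if omega < 1 - tolerance:
        return "purifying"
    if omega > 1 + tolerance:
        return "positive"
    return "neutral"


if __name__ == "__main__":
    # Illustrative values only.
    for ka, ks in [(0.1, 0.5), (0.2, 0.2), (0.6, 0.3)]:
        print(f"Ka={ka}, Ks={ks}: {classify_selection(ka, ks)}")
```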

40 Examples of positive selection
- Proteins involved in the immune system
- Proteins involved in the host-pathogen interaction 'arms race'
- Proteins following gene duplication
- Proteins involved in reproduction systems

41 Selecton: a server for the detection of purifying and positive selection
http://selecton.bioinfo.tau.ac.il

42 Detecting drug resistance using Selecton

43 HIV: molecular evolution paradigm
Rapidly evolving virus:
1. High mutation rate (low fidelity of reverse transcriptase)
2. High replication rate

44 HIV protease
- Protease is an essential enzyme for viral replication
- Drugs against protease are always part of the "cocktail"

45 Ritonavir inhibitor
- Ritonavir (RTV) is a specific protease inhibitor (drug)
- Chemical formula: C37H48N6O5S2

46 Drug resistance
No drug → drug: adaptive evolution (positive selection)

47 Selecton was used to analyse HIV-1 protease gene sequences from patients who were treated with RTV only

49 Example: HIV protease
- Primary mutations
- Secondary mutations
- Novel predictions (experimental validation)

50 Summary
- Sequence analysis can provide valuable information about protein function
- Conservation on the amino acid level: http://consurf.tau.ac.il
- Positive "Darwinian" selection and purifying selection: http://selecton.bioinfo.tau.ac.il
