Centre for Integrative Bioinformatics VU. Master Course Sequence Alignment, Lecture 11: Database Searching Issues (2)

1 Centre for Integrative Bioinformatics VU. Master Course Sequence Alignment, Lecture 11: Database Searching Issues (2)

4 Example
C; family: zinc finger -- CCHH-type
C; class: small
C; reordered by kitschorder 1.0a
C; reordered by kitschorder 1.0a
C; last update 7/9/98
>P1;1zaa1
structureX:1zaa: 3 :C: 33 :C:zinc-finger (ZIF268, domain 1):Mus musculus:2.10:18.20
------RPYACPVESCDRRFSRSDELTRHI-RI-HTGQK*
>P1;1zaa2
structureX:1zaa: 34 :C: 61 :C:zinc-finger (ZIF268, domain 2):Mus musculus:2.10:18.20
-------PFQCRI--CMRNFSRSDHLTTHI-RT-HTGEK*
>P1;1zaa3
structureX:1zaa: 62 :C: 87 :C:zinc-finger (ZIF268, domain 3):Mus musculus:2.10:18.20
-------PFACDI--CGRKFARSDERKRHT-KI-HLR--*
>P1;1ard
structureN:1ard: 102 : : 130 : :zinc-finger (transcription factor ADR1):Saccharomyces cerevisiae:-1.00:-1.00
------RSFVCEV--CTRAFARQEHLKRHY-RS-HTNEK*
>P1;1znf
structureN:1znf: 1 : : 25 : :zinc-finger (XFIN, 31st domain):Xenopus laevis:-1.00:-1.00
--------YKCGL--CERSFVEKSALSRHQ-RV-HKN--*
>P1;2drp2
structureX:2drp: 137 :A: 165:A:zinc-finger (tramtrack, domain 2):Drosophila melanogaster:2.80:19.30
----NVKVYPCPF--CFKEFTRKDNMTAHV-KIIHK---*
>P1;3znf
structureN:3znf: 1 : : 30 : :zinc-finger (enhancer binding protein):Homo sapiens:-1.00:-1.00
------RPYHCSY--CNFSFKTKGNLTKHMKSKAHSKK-*
>P1;5znf
structureN:5znf: 1 : : 30 : :zinc-finger (ZFY-6T):Homo sapiens:-1.00:-1.00
------KTYQCQY--CEYRSADSSNLKTHIKTK-HSKEK*
You can also look at superposed structures.

15 Sensitivity and Specificity: medical world

            Disease +                   Disease -                      Total
Test +      9,990 True Positive (TP)    990 False Positive (FP)        All with Positive Test: TP+FP
Test -      10 False Negative (FN)      989,010 True Negative (TN)     All with Negative Test: FN+TN
Total       All with Disease: 10,000    All without Disease: 990,000   Everyone: TP+FP+FN+TN

Positive Predictive Value = TP/(TP+FP) = 9,990/(9,990+990) = 91%
Negative Predictive Value = TN/(FN+TN) = 989,010/(10+989,010) = 99.999%
Sensitivity = TP/(TP+FN) = 9,990/(9,990+10) = 99.9%
Specificity = TN/(FP+TN) = 989,010/(989,010+990) = 99.9%
Pre-Test Probability = (TP+FN)/(TP+FP+FN+TN) = 10,000/1,000,000 = 1% (in this case = prevalence)
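The ratios on this slide can be checked with a few lines of code. The following is a minimal sketch, not part of the original lecture; the function name `diagnostic_metrics` and the dictionary keys are hypothetical names chosen for illustration. It takes the four cell counts of the 2x2 table and returns each ratio as a fraction.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return the ratios defined on the slide as fractions between 0 and 1.

    The function and key names are illustrative, not from the lecture.
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),    # fraction of diseased people who test positive
        "specificity": tn / (fp + tn),    # fraction of healthy people who test negative
        "ppv": tp / (tp + fp),            # positive predictive value
        "npv": tn / (fn + tn),            # negative predictive value
        "prevalence": (tp + fn) / total,  # pre-test probability in this example
    }


# Cell counts taken from the slide.
metrics = diagnostic_metrics(tp=9990, fp=990, fn=10, tn=989010)
for name, value in metrics.items():
    print(f"{name}: {value:.3%}")
```

Although sensitivity and specificity are both 99.9%, the positive predictive value is only about 91% because the disease is rare (1% prevalence). The same effect arises in database searching, where true homologues are rare compared with the size of the database.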

19 Structure-based function prediction SCOP (http://scop.berkeley.edu/) is a protein structure classification database in which proteins are grouped into a hierarchy of families, superfamilies, folds and classes, based on their structural and functional similarities.

20 Structure-based function prediction SCOP hierarchy – the top level: 11 classes

21 Structure-based function prediction (figure panels: All-alpha protein; Coiled-coil protein; All-beta protein; Alpha-beta protein; Membrane protein)

22 Structure-based function prediction SCOP hierarchy – the second level: 800 folds

23 Structure-based function prediction SCOP hierarchy - third level: 1294 superfamilies

24 Structure-based function prediction SCOP hierarchy - fourth level: 2327 families

25 Structure-based function prediction Using a sequence-structure alignment method, one can predict that a protein belongs to a SCOP family, superfamily or fold. Proteins predicted to be in the same SCOP family are orthologous. Proteins predicted to be in the same SCOP superfamily are homologous. Proteins predicted to be in the same SCOP fold are structurally analogous. (Figure labels: folds, superfamilies, families)
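The rule on this slide (the deepest SCOP level two proteins share determines the relationship inferred between them) can be sketched in code. This is an illustration under assumptions, not the lecture's method: it assumes each protein has already been assigned a dotted identifier of the form class.fold.superfamily.family, in the style of SCOP's sccs strings, which the slides do not mention. The identifiers in the usage lines are placeholders, and all function names are hypothetical.

```python
# Levels of the SCOP hierarchy, ordered from top to bottom.
LEVELS = ["class", "fold", "superfamily", "family"]

# Relationship inferred at each shared level, as stated on the slide.
RELATIONSHIP = {
    "family": "orthologous",
    "superfamily": "homologous",
    "fold": "structurally analogous",
}


def deepest_shared_level(sccs_a, sccs_b):
    """Return the deepest level at which two dotted identifiers agree, or None."""
    shared = None
    for level, a, b in zip(LEVELS, sccs_a.split("."), sccs_b.split(".")):
        if a != b:
            break
        shared = level
    return shared


def inferred_relationship(sccs_a, sccs_b):
    """Map the deepest shared level to the relationship named on the slide."""
    level = deepest_shared_level(sccs_a, sccs_b)
    return level, RELATIONSHIP.get(level, "no relationship inferred")


# Placeholder identifiers, for illustration only.
print(inferred_relationship("a.1.1.2", "a.1.1.2"))  # same family
print(inferred_relationship("a.1.1.2", "a.1.1.1"))  # same superfamily only
print(inferred_relationship("a.1.1.2", "a.1.2.1"))  # same fold only
```

Sharing only the class is treated here as carrying no inferred relationship, since the slide defines one only for family, superfamily and fold.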

26 Note: the numbers do not add up in every profile column, because only a selection of the sequences in the MSA and of the amino acids represented in the profile is shown.
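A small sketch shows why a partially displayed profile does not add up. It is not the profile from the slide: the alignment `toy_msa` is invented for illustration, the function names are hypothetical, and plain residue counts stand in for whatever weighting the original profile used.

```python
from collections import Counter


def column_counts(msa):
    """Count every symbol (residues and gaps) in each alignment column."""
    if len({len(seq) for seq in msa}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [Counter(column) for column in zip(*msa)]


def displayed_profile(msa, shown_residues):
    """Keep only the counts for the residues chosen for display."""
    return [
        {aa: counts.get(aa, 0) for aa in shown_residues}
        for counts in column_counts(msa)
    ]


# Invented four-sequence alignment, for illustration only.
toy_msa = ["ACD-", "ACE-", "GCDK", "ACDK"]

full = column_counts(toy_msa)
partial = displayed_profile(toy_msa, shown_residues="ACD")

for i, (f, p) in enumerate(zip(full, partial)):
    print(f"column {i}: full total = {sum(f.values())}, displayed total = {sum(p.values())}")
```

Every column of the full count table sums to the number of sequences (4), but the displayed columns sum to 3, 4, 3 and 0 because residues G, E and K and the gaps are left out. Showing only a subset of the sequences lowers the totals in the same way.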

30 (Figure labels: A, B, A, B, B, C, C, D)

35 Conserved hypotheticals
>P00001 Conserved hypothetical
A substantial fraction of genes in sequenced genomes encodes 'conserved hypothetical' proteins, i.e. those that are found in organisms from several phylogenetic lineages but have not been functionally characterized.

