Determine ORF and BLASTP

Presentation on theme: "Determine ORF and BLASTP"— Presentation transcript:

1 Determine ORF and BLASTP
WSSP Chapter 9 Determine ORF and BLASTP atttaccgtg ttggattgaa attatcttgc atgagccagc tgatgagtat gatacagttt tccgtattaa taacgaacgg ccggaaatag gatcccgatc atgattgctt caatattttc acttcaatga ttggttctaa gcattcgaat gcgtacccgt ttgattaata tttccatttc tgtcccagtt tttaattttc atttcttttg gttaaaaaat tcccagtctc ttgaatgctt ttctaaaatc tttaattcaa ttatttatta gaatcttctg ttttgagaac tttgtaatgt aattaaataa tttgatgaaa tgattatgaa tgcgaataaa ttattaattt accgtgctga ttggattgaa attatcttgc atgagccagc tgatgagtat gatacagttt tccgtattaa taacgaacgg ccggaaatag gatcccgatc atgattgctt caatattttc acttcaatga ttggttctaa gcattcgaat gcgtacccgt ttgattaata tttccatttc tgtcccagtt tttaattttc atttcttttg gttaaaaaat tcccagtctc ttgaatgctt ttctaaaatc tttaattcaa ttatttatta gaatcttctg ttttgagaac tttgtaatgt aattaaataa tttgatgaaa tgattatgaa tgcgaataaa ttattaattt accgtgttgg attgaaggta attatcttgc atgagccagc tgatgagtat gatacagttt

2 © 2014 WSSP

3 Steps and terms used in protein expression
1st ATG in mRNA p 9-1 © 2014 WSSP

4 Cloning the cDNA library
p 9-1 © 2014 WSSP

5 Possible reading frames
© 2014 WSSP

6 Possible types of clones in the cDNA library
© 2014 WSSP

7 Link to Toolbox translation program
DSAP Define ORF page: Link to Toolbox translation program p 9-3 © 2014 WSSP

8 Toolbox: DNA Sequence Translation Program
PolyA tail at 3’ end Reading frames p 9-3 © 2014 WSSP

9 EX Reading Frame Longest ORF Translation stop p 9-3 © 2014 WSSP
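
A minimal Python sketch of what a translation tool of this kind does: translate the DNA in each of the three forward reading frames and report the longest stretch with no stop codon. It assumes the standard genetic code and forward frames only; the function names are illustrative and are not taken from the Toolbox program.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna, frame):
    """Translate dna in forward frame 1, 2, or 3.

    '*' marks a stop codon; 'X' marks a codon with an unknown base.
    """
    seq = dna.upper()[frame - 1:]
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def longest_open_stretch(protein):
    """Return the longest run of residues not interrupted by a stop."""
    return max(protein.split("*"), key=len)


def three_frames(dna):
    """Return {frame: translation} for frames +1, +2, +3."""
    return {frame: translate(dna, frame) for frame in (1, 2, 3)}
```

For example, `translate("TCAATCCGT", 1)` gives `"SIR"`. The longest open stretch is only a candidate: the BLASTx matches still decide which frame is the real one.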

10 Which one of these would be the correct ORF?
A) B) Rule #1: If downstream of a stop codon, translation of the protein MUST start with an M (MET) p 9-3 © 2014 WSSP
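
Rule #1 can be sketched in a few lines of Python. This is an illustration under assumptions, not the DSAP procedure itself: the input is a translated reading frame with `*` for stops, and the function name is hypothetical. A stretch that follows a stop is trimmed to its first M. The stretch before any stop is kept whole, because a partial clone may begin in the middle of the protein.

```python
def candidate_orfs(protein):
    """List (start_index, peptide) candidates in one translated frame.

    protein: translation of a single reading frame, '*' marks stops.
    start_index is the 0-based position of the first residue in protein.
    """
    candidates = []
    position = 0
    for i, segment in enumerate(protein.split("*")):
        if i == 0:
            # No upstream stop: the ORF may start before the clone does.
            if segment:
                candidates.append((position, segment))
        else:
            # Downstream of a stop: translation must start with M.
            first_m = segment.find("M")
            if first_m != -1:
                candidates.append((position + first_m, segment[first_m:]))
        position += len(segment) + 1  # +1 skips the '*'
    return candidates
```

With the made-up frame `"LEGRM*AAMKL*QQ"`, the candidates are `(0, "LEGRM")` and `(8, "MKL")`; the final `QQ` is dropped because it follows a stop and contains no M.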

11 Could this ORF code for the protein?
© 2014 WSSP

12 Does this region match the BLASTX matches?
Region of DNA that codes for the highlighted region in the protein sequence BLASTx p 9-4 © 2014 WSSP

13 Could the DNA code for a partial protein?
© 2014 WSSP

14 Does this region match the BLASTX matches?
Region of DNA that codes for the highlighted region in the protein sequence BLASTx p 9-4 © 2014 WSSP

15 Examine +2 reading frame
Could this be the ORF that codes for the protein? A) Yes B) No p 9-4 © 2014 WSSP

16 Does this region match the BLASTX matches?
Region of DNA that codes for the highlighted region in the protein sequence BLASTx p 9-4 © 2014 WSSP

17 Examine +3 reading frame
Could this be the ORF that codes for the protein? A) Yes B) No WHY? p 9-5 © 2014 WSSP

18 Examine +2 reading frame
Which type of clone? A) B) C) Can not tell p 9-4 © 2014 WSSP

19 An example of a partial coding sequence
Similar Seq. © 2014 WSSP

20 Is this a partial ORF cDNA clone?
What about this region? © 2014 WSSP

21 The first part of the protein may not have matches because it is not conserved.
[Diagram: Query and Sbjct bars marking the region of similarity; coordinate labels 2, 60, 410, 475] © 2014 WSSP

22 What type of clone would this sequence be?
C) Can not tell p 9-4 © 2014 WSSP

23 The BLASTx helps determine which reading frame is correct
It also helps suggest the start point p 9-6 © 2014 WSSP

24 Based on the BLASTX information shown, which ORF is correct for EX1.14?
- Both Neither Can not tell © 2014 WSSP

25 Choose the reading frame and paste in the protein sequence
Do not include the * (stop codon) in the protein sequence. Make sure the ORF range includes the bases that code for the stop codon. p 9-7 © 2014 WSSP
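
The two instructions on this slide pull in opposite directions, so a small worked calculation may help. The sketch below assumes 1-based, inclusive DNA coordinates and a protein sequence pasted without the `*`; the function name and the example numbers are illustrative only.

```python
def orf_dna_range(orf_start, protein):
    """Return the 1-based inclusive (start, end) DNA range of an ORF.

    orf_start: position of the first base of the first codon.
    protein:   translated sequence WITHOUT the trailing '*'.
    The stop codon is counted as part of the ORF, so the range
    covers len(protein) + 1 codons.
    """
    if "*" in protein:
        raise ValueError("remove the * (stop codon) from the protein sequence")
    codons_including_stop = len(protein) + 1
    return orf_start, orf_start + 3 * codons_including_stop - 1
```

For instance, a 93-residue protein whose first codon begins at base 11 gives `orf_dna_range(11, "A" * 93) == (11, 292)`: 279 coding bases plus 3 bases for the stop codon.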

26 The Five Commandments of DSAP
I. The stop codon is part of the ORF.

27 DSAP BLASTp page p 9-8 © 2014 WSSP

28 Paste in protein sequence
NCBI BLASTp page Paste in protein sequence p 9-8 © 2014 WSSP

29 BLASTp results of EX1.14 +2 ORF
Link to Conserved Domain Database p 9-9 © 2014 WSSP

30 BLASTp results of EX1.14 +1 ORF
© 2014 WSSP

31 BLASTp results of EX1.13 +3 ORF
No matches © 2014 WSSP

32 Enter BLASTp data into table
M * Protein AAAAAA Possible DNA Clones AAAAAA AAAAAA p 9-10 © 2014 WSSP

33 Does the EX1.14 clone likely code for the start of the protein?
BLASTP alignment A) Yes B) No © 2014 WSSP

34 Suppose the cDNA was missing the first 13 bp
Does this DNA code for the start of the protein?
>gi| |ref|NP_ | dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
Score = 139 bits (351), Expect = 8e-32
Identities = 63/81 (77%), Positives =
Query  1   MPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVVGSSFGCFFTHKK  60
           MP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVVGS FGC+ TH K
Sbjct  13  MPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVVGSGFGCYITHSK  72
Query  61  GSFIYFRLETLHFLIFKGAAA  81
           GSFIYFRLE+L FL+FKGAAA
Sbjct  73  GSFIYFRLESLRFLVFKGAAA  93
© 2014 WSSP

35 Suppose the cDNA was missing the first 13 bp
Did they choose the correct ORF?
>gi| |ref|NP_ | dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
Score = 139 bits (351), Expect = 8e-32
Identities = 63/81 (77%), Positives =
Query  1   MPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVVGSSFGCFFTHKK  60
           MP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVVGS FGC+ TH K
Sbjct  13  MPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVVGSGFGCYITHSK  72
Query  61  GSFIYFRLETLHFLIFKGAAA  81
           GSFIYFRLE+L FL+FKGAAA
Sbjct  73  GSFIYFRLESLRFLVFKGAAA  93
© 2014 WSSP

36 Suppose the cDNA was missing the first 13 bp
Did they choose the correct ORF?
BLASTP starting here
>gi| |ref|NP_ | dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
Score = 139 bits (351), Expect = 8e-32
Identities = 63/81 (77%), Positives =
Query  1   MPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVVGSSFGCFFTHKK  60
           MP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVVGS FGC+ TH K
Sbjct  13  MPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVVGSGFGCYITHSK  72
Query  61  GSFIYFRLETLHFLIFKGAAA  81
           GSFIYFRLE+L FL+FKGAAA
Sbjct  73  GSFIYFRLESLRFLVFKGAAA  93
BLASTP starting here
>gi| |ref|NP_ | dynein light chain LC6, flagellar outer arm [Zea mays]
Score = 156 bits (395), Expect = 6e-37
Identities = 72/92 (78%), Positives = 82/92 (89%), Gaps = 0/92 (0%)
Query  1  LEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVVG  60
          LEG+A VEDTDMP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVVG
Sbjct  2  LEGKAVVEDTDMPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVVG  61
Query  61 SSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA  92
          S FGC+ TH KGSFIYFRLE+L FL+FKGAAA
Sbjct  62 SGFGCYITHSKGSFIYFRLESLRFLVFKGAAA  93
© 2014 WSSP
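
The comparison on this slide can also be made numerically. The sketch below assumes the Sbjct coordinates and Length are read directly from each BLASTp alignment; the function name is illustrative.

```python
def subject_coverage(sbjct_start, sbjct_end, sbjct_length):
    """Summarize how much of the matched protein an alignment covers.

    Coordinates are 1-based and inclusive, as printed by BLAST.
    """
    aligned = sbjct_end - sbjct_start + 1
    return {
        "aligned": aligned,
        "missing_at_start": sbjct_start - 1,
        "missing_at_end": sbjct_length - sbjct_end,
        "fraction": aligned / sbjct_length,
    }


# Values from the two alignments above (Length=93).
starting_at_m = subject_coverage(13, 93, 93)      # protein begun at the M
starting_at_base1 = subject_coverage(2, 93, 93)   # protein begun at the first codon
```

Starting at the M leaves 12 residues of the Sbjct protein unmatched at the start, while starting at the first codon leaves only 1. That difference suggests the coding region continues upstream of the M in this clone.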

37 Q3 Does the DNA sequence code for the likely start of the protein?
gi| |ref|NP_ | 50S ribosomal protein L20 [Zea mays]
Length=122
Score = 207 bits (528), Expect = 2e-52
Identities = 100/113 (88%), Positives = 106/113 (93%), Gaps = 0/113 (0%)
Query  MNKEKILKLAKGFRGRAKNCIRIARERVEKALQYSYRDRRNKKRDMRSLWIQRINAATRL  60
       MNK KI KLAKGFRGRAKNCIRIARERVEKALQYSYRDRRNKKRDMRSLWI+RINA TRL
Sbjct  MNKGKIFKLAKGFRGRAKNCIRIARERVEKALQYSYRDRRNKKRDMRSLWIERINAGTRL  60
Query  HAVNYGTFMHGLLRENVQLNRKVLSELSMHEPHSFKALVDVSRAAFPGNRQIP  113
       H VNYG FMHGL++EN+QLNRKVLSELSMHEP+SFKALVDVSR AFPGNR +P
Sbjct  HGVNYGNFMHGLMKENIQLNRKVLSELSMHEPYSFKALVDVSRNAFPGNRPVP  113
Yes No Can not tell from data
© 2014 WSSP

38 Q4 Does the DNA sequence code for the likely end of the protein?
gi| |ref|NP_ | 50S ribosomal protein L20 [Zea mays]
Length=122
Score = 207 bits (528), Expect = 2e-52
Identities = 100/113 (88%), Positives = 106/113 (93%), Gaps = 0/113 (0%)
Query  MNKEKILKLAKGFRGRAKNCIRIARERVEKALQYSYRDRRNKKRDMRSLWIQRINAATRL  60
       MNK KI KLAKGFRGRAKNCIRIARERVEKALQYSYRDRRNKKRDMRSLWI+RINA TRL
Sbjct  MNKGKIFKLAKGFRGRAKNCIRIARERVEKALQYSYRDRRNKKRDMRSLWIERINAGTRL  60
Query  HAVNYGTFMHGLLRENVQLNRKVLSELSMHEPHSFKALVDVSRAAFPGNRQIP  113
       H VNYG FMHGL++EN+QLNRKVLSELSMHEP+SFKALVDVSR AFPGNR +P
Sbjct  HGVNYGNFMHGLMKENIQLNRKVLSELSMHEPYSFKALVDVSRNAFPGNRPVP  113
Yes No Can not tell from data
© 2014 WSSP

39 Q3 Does the DNA sequence code for the likely start of the protein?
BLASTp
>gi| |ref|NP_ | 50S ribosomal protein L20 [Zea mays]
Length=122
Score = 124 bits (310), Expect = 4e-27
Identities = 57/68 (83%), Positives = 63/68 (92%), Gaps = 0/68 (0%)
Query  MRSLWIQRINAATRLHAVNYGTFMHGLLRENVQLNRKVLSELSMHEPHSFKALVDVSRAA  60
       MRSLWI+RINA TRLH VNYG FMHGL++EN+QLNRKVLSELSMHEP+SFKALVDVSR A
Sbjct  MRSLWIERINAGTRLHGVNYGNFMHGLMKENIQLNRKVLSELSMHEPYSFKALVDVSRNA  105
Query  FPGNRQIP  68
       FPGNR +P
Sbjct  FPGNRPVP  113
Yes No Can not tell from data
[Diagram: Query and Sbjct bars; coordinate labels 1, 45, 68, 113, 84, 122]
© 2014 WSSP

40 Q3 Does the DNA sequence code for the likely start of the protein?
BLASTp
gi| |ref|NP_ | 50S ribosomal protein L20 [Zea mays]
Length=122
Score = 124 bits (310), Expect = 4e-27
Identities = 57/68 (83%), Positives = 63/68 (92%), Gaps = 0/68 (0%)
Query  MRSLWIQRINAATRLHAVNYGTFMHGLLRENVQLNRKVLSELSMHEPHSFKALVDVSRAA  60
       MRSLWI+RINA TRLH VNYG FMHGL++EN+QLNRKVLSELSMHEP+SFKALVDVSR A
Sbjct  MRSLWIERINAGTRLHGVNYGNFMHGLMKENIQLNRKVLSELSMHEPYSFKALVDVSRNA  105
Query  FPGNRQIP  68
       FPGNR +P
Sbjct  FPGNRPVP  113
BLASTx
GENE ID: LOC | 50S ribosomal protein L20 [Zea mays]
Length=122
Score = 146 bits (369), Expect = 7e-34
Identities = 68/79 (86%), Positives = 74/79 (93%), Gaps = 0/79 (0%)
Frame = +1
Query  SYRDRRNKKRDMRSLWIQRINAATRLHAVNYGTFMHGLLRENVQLNRKVLSELSMHEPHS  180
       SYRDRRNKKRDMRSLWI+RINA TRLH VNYG FMHGL++EN+QLNRKVLSELSMHEP+S
Sbjct  SYRDRRNKKRDMRSLWIERINAGTRLHGVNYGNFMHGLMKENIQLNRKVLSELSMHEPYS  94
Query  FKALVDVSRAAFPGNRQIP  237
       FKALVDVSR AFPGNR +P
Sbjct  FKALVDVSRNAFPGNRPVP  113
Yes No Can not tell from data
© 2014 WSSP

41 Compare the BLASTx and BLASTp results for EX1.14:
Are the matches to the same proteins? p 9-11 © 2014 WSSP

42 Compare the BLASTx and BLASTp results for EX1.14:
Are the e-values similar? p 9-11 © 2014 WSSP

43 Do they start and stop at the same place in the Sbjct sequence?
Compare the BLASTx and BLASTp results for EX1.14: Are the alignments similar?
BLASTx
>ref|NP_ | dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
Query  MLEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVV  190
       MLEG+A VEDTDMP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVV
Sbjct  MLEGKAVVEDTDMPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVV  60
Query  GSSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA  289
       GS FGC+ TH KGSFIYFRLE+L FL+FKGAAA
Sbjct  GSGFGCYITHSKGSFIYFRLESLRFLVFKGAAA  93
BLASTp
>gi| |ref|NP_ | dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
Score = 158 bits (400), Expect = 2e-37
Query  1   MLEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVV  60
           MLEG+A VEDTDMP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVV
Sbjct  1   MLEGKAVVEDTDMPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVV  60
Query  61  GSSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA  93
           GS FGC+ TH KGSFIYFRLE+L FL+FKGAAA
Sbjct  61  GSGFGCYITHSKGSFIYFRLESLRFLVFKGAAA  93
p 9-12 © 2014 WSSP
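
BLASTx numbers the Query in DNA bases while BLASTp numbers it in amino acids, so the two Query ranges cannot be compared directly. The sketch below converts between them, assuming an ungapped alignment on the forward strand. The BLASTx Query start is not shown in this transcript, so it is inferred here from the end coordinate and the residue count; the function names are illustrative.

```python
def infer_query_start(query_end, residues):
    """Infer the first DNA base of a BLASTx alignment from its last base."""
    return query_end - 3 * residues + 1


def blastx_query_summary(query_start, query_end):
    """Convert BLASTx DNA coordinates to a forward frame and residue count."""
    n_bases = query_end - query_start + 1
    if n_bases % 3 != 0:
        raise ValueError("aligned DNA length is not a multiple of 3")
    return {"frame": (query_start - 1) % 3 + 1, "residues": n_bases // 3}


# BLASTx alignment above: 60 + 33 = 93 residues, Query ends at base 289.
start = infer_query_start(289, 93)
summary = blastx_query_summary(start, 289)
```

Under these assumptions the BLASTx match begins at base 11 in frame +2 and spans 93 residues, the same span as Query 1 to 93 in the BLASTp alignment. Both alignments also cover Sbjct 1 to 93.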

44 Compare the BLASTx and BLASTp results for another clone:
Did the student select the correct protein sequence? Yes No Can not tell © 2014 WSSP

45 Compare the BLASTx and BLASTp results: Are the matches to the same proteins?
Yes No Can not tell © 2014 WSSP

46 Compare the BLASTx and BLASTp results: Are the e-values similar?
Yes No Can not tell © 2014 WSSP

47 Did the student choose the correct protein sequence?
BLASTp
gi| |gb|EES | hypothetical protein SORBIDRAFT_09g
Length=122
Score = 211 bits (538), Expect = 1e-53, Method: Compositional matrix adjust.
Identities = 100/113 (88%), Positives = 106/113 (93%), Gaps = 0/113 (0%)
Query  MNKEKILKLAKGFRGRAKNCIRIARERVEKALQYSYRDRRNKKRDMRSLWIQRINAATRL  70
       MNK KI KLAKGFRGRAKNCIRIARERVEKALQYSYRDRRNKKRDMRSLWI+RINA TRL
Sbjct  MNKGKIFKLAKGFRGRAKNCIRIARERVEKALQYSYRDRRNKKRDMRSLWIERINAGTRL  60
Query  HAVNYGTFMHGLLRENVQLNRKVLSELSMHEPHSFKALVDVSRAAFPGNRQIP  123
       H VNYG FMHGL++EN+QLNRKVLSELSMHEP+SFKALVDVSR AFPGNR +P
Sbjct  HGVNYGNFMHGLMKENIQLNRKVLSELSMHEPYSFKALVDVSRNAFPGNRPVP  113
BLASTx
gi| |gb|EES | hypothetical protein SORBIDRAFT_09g
Length=122
Score = 207 bits (528), Expect = 5e-52
Identities = 100/113 (88%), Positives = 106/113 (93%), Gaps = 0/113 (0%)
Frame = +3
Query  MNKEKILKLAKGFRGRAKNCIRIARERVEKALQYSYRDRRNKKRDMRSLWIQRINAATRL  248
       MNK KI KLAKGFRGRAKNCIRIARERVEKALQYSYRDRRNKKRDMRSLWI+RINA TRL
Sbjct  MNKGKIFKLAKGFRGRAKNCIRIARERVEKALQYSYRDRRNKKRDMRSLWIERINAGTRL  60
Query  HAVNYGTFMHGLLRENVQLNRKVLSELSMHEPHSFKALVDVSRAAFPGNRQIP  407
       H VNYG FMHGL++EN+QLNRKVLSELSMHEP+SFKALVDVSR AFPGNR +P
Sbjct  HGVNYGNFMHGLMKENIQLNRKVLSELSMHEPYSFKALVDVSRNAFPGNRPVP  113
Yes No Can not tell from data
#1 Mistake in Selecting an ORF
© 2014 WSSP

48 Compare the BLASTx and BLASTp results for a different clone:
Did the student select the correct protein sequence? Yes No Can not tell © 2014 WSSP

49 Compare the BLASTx and BLASTp results: Are the matches to the same proteins?
Yes No Can not tell © 2014 WSSP

50 Compare the BLASTx and BLASTp results: Are the e-values similar?
Yes No Can not tell © 2014 WSSP

51 Did the student choose the correct protein sequence?
BLASTp
gi| |ref|NP_ | 50S ribosomal protein L20 [Zea mays]
Length=122
Score = 124 bits (310), Expect = 4e-27
Identities = 57/68 (83%), Positives = 63/68 (92%), Gaps = 0/68 (0%)
Query  MRSLWIQRINAATRLHAVNYGTFMHGLLRENVQLNRKVLSELSMHEPHSFKALVDVSRAA  60
       MRSLWI+RINA TRLH VNYG FMHGL++EN+QLNRKVLSELSMHEP+SFKALVDVSR A
Sbjct  MRSLWIERINAGTRLHGVNYGNFMHGLMKENIQLNRKVLSELSMHEPYSFKALVDVSRNA  105
Query  FPGNRQIP  68
       FPGNR +P
Sbjct  FPGNRPVP  113
BLASTx
GENE ID: LOC | 50S ribosomal protein L20 [Zea mays]
Length=122
Score = 146 bits (369), Expect = 7e-34
Identities = 68/79 (86%), Positives = 74/79 (93%), Gaps = 0/79 (0%)
Frame = +1
Query  SYRDRRNKKRDMRSLWIQRINAATRLHAVNYGTFMHGLLRENVQLNRKVLSELSMHEPHS  180
       SYRDRRNKKRDMRSLWI+RINA TRLH VNYG FMHGL++EN+QLNRKVLSELSMHEP+S
Sbjct  SYRDRRNKKRDMRSLWIERINAGTRLHGVNYGNFMHGLMKENIQLNRKVLSELSMHEPYS  94
Query  FKALVDVSRAAFPGNRQIP  237
       FKALVDVSR AFPGNR +P
Sbjct  FKALVDVSRNAFPGNRPVP  113
Yes No Can not tell from data
#2 Mistake in Selecting an ORF
© 2014 WSSP

52 Data already entered in!
DSAP Review Page Data already entered in! © 2014 WSSP

53 DSAP Review Page p. 7-17 © 2014 WSSP

54 Do NOT use Toolbox to determine the 5’ UTR!!!
© 2014 WSSP

55 The Five Commandments of DSAP
I. The stop codon is part of the ORF. II. The start of the 5’ UTR is always the first base.

56 Do NOT use Toolbox to determine the 3’ UTR!!!
© 2014 WSSP

57 Determine ranges of 5’ UTR and 3’ UTR by highlighting the ranges in the DSAP cDNA text box
© 2014 WSSP

58 What should you do if your clone is a partial?
5’ UTR No 5’ UTR © 2014 WSSP

59 The Five Commandments of DSAP
I. The stop codon is part of the ORF. II. The start of the 5’ UTR is always the first base. III. If the clone is a partial, there is no 5’ UTR.

60 An example of a partial coding sequence
Similar Seq. © 2014 WSSP

61 The first bases are part of the reading frame
 ?   S   I   R
XGC TCA ATC CGT
© 2014 WSSP

62 The Five Commandments of DSAP
I. The stop codon is part of the ORF. II. The start of the 5’ UTR is always the first base. III. If the clone is a partial, there is no 5’ UTR. IV. If the clone is a partial, the start of the ORF is always the first base.

63 What should you do if you get these results?
BLASTX © 2014 WSSP

64 Why is my cDNA noncoding?
[Diagram labels: Genomic DNA, ORF, RNA AAAAAAA, cDNA (Partial) AAAAAAA]
Recent genome-wide RNA sequencing studies show that more than 10% of polyA RNAs are non-coding. © 2014 WSSP

65 If your DNA is non-coding, enter the entire sequence as 3’ UTR
© 2014 WSSP

66 The Five Commandments of DSAP
I. The stop codon is part of the ORF. II. The start of the 5’ UTR is always the first base. III. If the clone is a partial, there is no 5’ UTR. IV. If the clone is a partial, the start of the ORF is always the first base. V. If the clone is non-coding, the entire DNA is considered 3’ UTR.

67 The Five Commandments of DSAP
I. The stop codon is part of the ORF. II. The start of the 5’ UTR is always the first base. III. If the clone is a partial, there is no 5’ UTR. IV. If the clone is a partial, the start of the ORF is always the first base. V. If the clone is non-coding, the entire DNA is considered 3’ UTR. © 2014 WSSP
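
The five commandments can be read as a small set of range rules. The Python sketch below is one way to encode them, assuming 1-based inclusive coordinates and that "partial" means the clone is missing the start of the protein. The function name, the clone-type labels, and the return format are illustrative and are not part of DSAP.

```python
def dsap_ranges(seq_length, clone_type, orf_start=None, orf_end=None):
    """Return 1-based inclusive ranges for the 5' UTR, ORF, and 3' UTR.

    clone_type: "full", "partial", or "noncoding" (hypothetical labels).
    orf_end must be the last base of the stop codon (commandment I).
    A region that does not exist is returned as None.
    """
    if clone_type == "noncoding":
        # V: the entire DNA is considered 3' UTR.
        return {"5utr": None, "orf": None, "3utr": (1, seq_length)}
    if orf_end is None:
        raise ValueError("orf_end is required for coding clones")
    if clone_type == "partial":
        orf_start = 1      # IV: the ORF starts at the first base.
        five_utr = None    # III: a partial has no 5' UTR.
    elif clone_type == "full":
        if orf_start is None:
            raise ValueError("orf_start is required for full-length clones")
        # II: the 5' UTR starts at the first base.
        five_utr = (1, orf_start - 1) if orf_start > 1 else None
    else:
        raise ValueError("unknown clone_type: " + repr(clone_type))
    three_utr = (orf_end + 1, seq_length) if orf_end < seq_length else None
    return {"5utr": five_utr, "orf": (orf_start, orf_end), "3utr": three_utr}
```

With made-up numbers, a 400-base full-length clone whose ORF spans bases 11 to 292 gives a 5’ UTR of 1 to 10 and a 3’ UTR of 293 to 400. A 400-base non-coding clone gives a single 3’ UTR of 1 to 400.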

