Sequence Search and Analysis SPE 1653 (703) 308-2923.


1 Sequence Search and Analysis Christopher.low@uspto.gov SPE 1653 (703) 308-2923

2 Biosequence Patent Search
Mission Impossible?
Mission Difficult?

3 Sample Searchable Public Databases
National Center for Biotechnology Information (NCBI) Entrez
– www.ncbi.nlm.nih.gov
European Bioinformatics Institute (EBI)
– www.ebi.ac.uk
DNA DataBank of Japan (DDBJ)
– www.ddbj.nig.ac.jp
SwissProt, PIR, etc. do not cover patents

4 NCBI Entrez
NCBI GenBank
– In collaboration with EMBL and DDBJ
Databases from other producers
– SwissProt, TrEMBL, PDB, PIR, etc.
Bibliographic databases
– E.g., PubMed (MEDLINE)
NCBI BLAST® sequence searching

5 EMBL-EBI on the Web
EMBL databases
– EMBL Nucleotide Database (i.e. GenBank)
– Translated EMBL (TrEMBL)
Databases from other producers
– SwissProt, PDB, etc.
Many sequence search options: FASTA, NCBI-BLAST, WU-Blast, Smith-Waterman

6 DDBJ via the Web
DDBJ databases
– DNA DataBank of Japan (i.e. GenBank)
– Protein Mutant Database (PMD)
Databases from other producers
– Protein Databank (PDB)
Several sequence search options: FASTA, BLAST, Smith-Waterman

7 USPTO
Nucleic Acid Databases
– GenEmbl (GenBank)
– N-Genseq
– ESTs
Protein Databases
– Protein Databank (PDB)
– SwissProt
– A-Genseq

8 Searched Sequence
HIV protease
PQITLWQAPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVAQYDQILIEICGHKAIGTVLVGPTPVNIIGANLLTQIGCT
Default parameters selected

9 Searched Sequence – Results - A
Database: Protein sequences derived from the Patent division of GenBank
78 Hits
gb|AAN27487.1| Sequence 17 from patent US 6440730
Length = 1003
Score = 191 bits (486), Expect = 4e-50
Identities = 93/96 (96%), Positives = 93/96 (96%)
Query: 1   PQITLWQAPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVAQYD 60
           PQITLWQ PLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKV QYD
Sbjct: 57  PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVGQYD 116
Query: 61  QILIEICGHKAIGTVLVGPTPVNIIGANLLTQIGCT 96
           QILIEICGHKAIGTVLVGPTPVNIIG NLLTQIGCT
Sbjct: 117 QILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCT 152
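
The "Identities = 93/96 (96%)" figure on this slide can be checked directly from the two aligned rows. The sketch below is only an illustration: the function name is hypothetical, the rows are copied from the alignment above, and truncating the percentage to a whole number is an assumption chosen because it reproduces the 96% shown (93/96 is 96.875%).

```python
# Query and subject rows copied from the alignment on this slide (no gaps).
QUERY = (
    "PQITLWQAPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVAQYD"
    "QILIEICGHKAIGTVLVGPTPVNIIGANLLTQIGCT"
)
SUBJECT = (
    "PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVGQYD"
    "QILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCT"
)


def identity_stats(query, subject):
    """Count identical columns in an ungapped alignment of equal-length rows.

    Returns (identities, aligned_length, mismatched 1-based query positions).
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    mismatches = [
        i + 1 for i, (q, s) in enumerate(zip(query, subject)) if q != s
    ]
    return len(query) - len(mismatches), len(query), mismatches


identities, aligned, mismatches = identity_stats(QUERY, SUBJECT)
# Assumption: the report truncates to a whole percent (rounding would give 97).
percent = 100 * identities // aligned

print(f"Identities = {identities}/{aligned} ({percent}%)")
print("Mismatched query positions:", mismatches)
```

The three mismatched positions are the alanines in the searched sequence that the patent sequence does not share, which are also the blank columns in the middle row of the alignment.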

10 Searched Sequence – Results - B
Database: Protein Data Base (PDB)
75 Hits
gi|230577|pdb|2HVP| HIV-1 Protease
Length = 99
Score = 172 bits (437), Expect = 1e-44
Identities = 93/96 (96%), Positives = 93/96 (96%)
Query: 1   PQITLWQAPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVAQYD 60
           PQITLWQ PLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKV QYD
Sbjct: 1   PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVRQYD 60
Query: 61  QILIEICGHKAIGTVLVGPTPVNIIGANLLTQIGCT 96
           QILIEICGHKAIGTVLVGPTPVNIIG NLLTQIGCT
Sbjct: 61  QILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCT 96

11 Searched Sequence – Results - C
Title: TITLE OF YOUR APPLICATION GOES HERE
Perfect score: 521
Sequence: 1 PQITLWQRPLVTIKIGGQLK..........TPVNIIGRNLLTQIGCTLNF 99
Scoring table: BLOSUM62
               Gapop 10.0, Gapext 0.5
Searched: 908470 seqs, 133250620 residues
Total number of hits satisfying chosen parameters: 908470
Minimum DB seq length: 0
Maximum DB seq length: 2000000000
Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
Database: A_Geneseq_101002:*
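
The "Perfect score: 521" line appears to be the score the 99-residue query would get when aligned against itself under the BLOSUM62 scoring table. The sketch below checks that reading. It assumes the standard BLOSUM62 diagonal values, and it takes the full query from the alignment rows shown on the later result slides, since the sequence is abbreviated here. The function name is hypothetical.

```python
# Self-match (diagonal) scores of the standard BLOSUM62 matrix.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6,
    "H": 8, "I": 4, "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4,
    "T": 5, "W": 11, "Y": 7, "V": 4,
}

# Full 99-residue query, copied from the Qy rows of the result slides.
QUERY = (
    "PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVRQYD"
    "QILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNF"
)


def self_score(seq):
    """Score of a sequence aligned to itself: sum of diagonal entries."""
    return sum(BLOSUM62_DIAGONAL[aa] for aa in seq)


perfect = self_score(QUERY)
print(f"length = {len(QUERY)}, perfect score = {perfect}")
```

Because a self-alignment has no gaps, the Gapop and Gapext penalties do not enter this calculation.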

12 Searched Sequence – Results - D
RESULT 1
ID   AAU77767 standard; Protein; 99 AA.
AC   AAU77767;
DT   05-JUN-2002 (first entry)
DE   Human immunodeficiency virus type 1 (HIV-1) related protein #1.
KW   Human immunodeficiency virus type 1; HIV-1; protease.
OS   Unidentified.
PN   KR98066681-A.
PD   15-OCT-1998.
PF   28-JAN-1997; 97KR-0002361.
PR   28-JAN-1997; 97KR-0002361.
PA   (GLDS ) LG CHEM LTD.
PI   Kwon YD, Lee TG;
DR   WPI; 1999-598487/51.
PT   Mutated human immunodeficiency virus type 1 (HIV-1) protease
PT   and process for preparing the same -
PS   Example 3; Page 10; 18pp; Korean.
CC   The invention relates to a mutated human immunodeficiency
CC   virus type 1 (HIV-1) protease and a process for preparing the
CC   mutants. This sequence represents a human immunodeficiency
CC   virus associated protein described in the invention.
SQ   Sequence 99 AA;

13 Searched Sequence – Results - E
Pred. No. is the number of results predicted by chance to have a score greater than or equal to the score of the result being printed, and is derived by analysis of the total score distribution.
SUMMARIES
Result No.  Score  % Query Match  Length  DB  ID        Description
-------------------------------------------------------------
 1          521    100.0          99      20  AAU77767  Human immunodefici
15          516     99.0          177     11  AAR05744  HIV-1 protease gen
RESULT 1
SQ Sequence 99 AA;
Query Match 100.0%; Score 521; DB 20; Length 99;
Best Local Similarity 100.0%; Pred. No. 2.6e-58;
Matches 99; Conservative 0; Mismatches 0; Indels 0; Gaps 0;
Qy   1 PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVRQYD 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVRQYD 60
Qy  61 QILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNF 99
       |||||||||||||||||||||||||||||||||||||||
Db  61 QILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNF 99

14 Searched Sequence – Results - F
Pred. No. is the number of results predicted by chance to have a score greater than or equal to the score of the result being printed, and is derived by analysis of the total score distribution.
SUMMARIES
Result No.  Score  % Query Match  Length  DB  ID        Description
-------------------------------------------------------------
 1          521    100.0          99      20  AAU77767  Human immunodefici
15          516     99.0          177     11  AAR05744  HIV-1 protease gen
RESULT 15
SQ Sequence 177 AA;
Query Match 99.0%; Score 516; DB 11; Length 177;
Best Local Similarity 99.0%; Pred. No. 2.3e-57;
Matches 98; Conservative 1; Mismatches 0; Indels 0; Gaps 0;
Qy   1 PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVRQYD 60
       ||||||||||||||||||||||||||||||||||||:|||||||||||||||||||||||
Db  56 PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKMIGGIGGFIKVRQYD 115
Qy  61 QILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNF 99
       |||||||||||||||||||||||||||||||||||||||
Db 116 QILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNF 154
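
The figures reported for Result 15 can be reproduced from the aligned rows on this slide. The sketch below rests on stated assumptions: it uses the standard BLOSUM62 values (diagonal entries plus the N/S entry of 1), and it assumes "Query Match" is the score divided by the perfect score and "Best Local Similarity" is matches divided by aligned length. Both assumptions fit the numbers shown, but neither is confirmed by the slides. Helper names are hypothetical.

```python
# Self-match (diagonal) scores of the standard BLOSUM62 matrix.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6,
    "H": 8, "I": 4, "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4,
    "T": 5, "W": 11, "Y": 7, "V": 4,
}
# Only the one off-diagonal BLOSUM62 entry this alignment needs (N vs S).
BLOSUM62_OFF_DIAGONAL = {frozenset("NS"): 1}

# Qy and Db rows copied from this slide (Db residues 56-154 of AAR05744).
QUERY = (
    "PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVRQYD"
    "QILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNF"
)
DB_REGION = (
    "PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKMIGGIGGFIKVRQYD"
    "QILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNF"
)


def pair_score(a, b):
    """Substitution score for one aligned column (KeyError if not tabulated)."""
    if a == b:
        return BLOSUM62_DIAGONAL[a]
    return BLOSUM62_OFF_DIAGONAL[frozenset((a, b))]


score = sum(pair_score(q, d) for q, d in zip(QUERY, DB_REGION))
perfect = sum(BLOSUM62_DIAGONAL[a] for a in QUERY)
matches = sum(q == d for q, d in zip(QUERY, DB_REGION))

query_match = f"{100 * score / perfect:.1f}"
local_similarity = f"{100 * matches / len(QUERY):.1f}"

print(f"Score {score} of {perfect}; Query Match {query_match}%")
print(f"Matches {matches}; Best Local Similarity {local_similarity}%")
```

The single N-to-S substitution at query position 37 is the column marked ":" in the alignment. It is counted as "Conservative" rather than as a mismatch, and it lowers the score by 5 (from 6 for N/N to 1 for N/S).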

15 Sample Claims
– A polypeptide having HIV protease activity
– An isolated polypeptide having HIV protease activity
– An isolated polypeptide comprising SEQ ID NO: 1
– An isolated polypeptide consisting essentially of SEQ ID NO: 1
– An isolated polypeptide consisting of SEQ ID NO: 1
– A peptide fragment having HIV protease activity
– A peptide fragment of SEQ ID NO: 1 with HIV protease activity
– An epitope of ten amino acids in length of SEQ ID NO: 1 capable of binding to an antibody to SEQ ID NO: 1
– An isolated polypeptide or fragment thereof of SEQ ID NO: 1 wherein one or more amino acid residues have been substituted, deleted, or inserted and which polypeptide retains HIV protease enzymatic activity

16 Acknowledgements
STIC / Toby Port and David Schreiber
TC 1600 / James Martinell

