BIOINFORMÁTICA UFMG. Genomics and Bioinformatics: ESTs, even if redundant. Complete genome or death! 1995 2000.

Presentation transcript:

1 BIOINFORMÁTICA UFMG A T G C

2 Genomics and Bioinformatics. ESTs, even if redundant. Complete genome or death!

3 The end of an EST
[Diagram labels: AUG, (A)200, (T)18, cDNA (- strand), cDNA (+ strand), (A)18, ATG, 5' EST, 3' EST, start, end]
[Sequence fragments: ATCATGACTTACGGGCGCGCGATxxxxxx; GGCGCGCGATATCCxxxx; AAATTTATTATCCxxxxx; AAATTTATTATCCATCTACGxxxx]
A snapshot of a new transcriptome [otorrin...] [...damonh...]

4 Life after PHRED 15

Query: 469  TTAGGAGGATCGTTTTTAGAATCCCCTGCAACGTTACCACGGTGGATTTCACTGACTGCG 528
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1038 ttaggaggatcgtttttagaatcccctgcaacgttaccacggtggatttcactgactgcg 979

Query: 529  ACGTTCTTAACGTTGAATCCAACGTTGCTACCAgggagagcctcagtaagtgcttcatga 588
            ||||||||||||||||| || |||||||||||||||||| ||||||||||||||||||||
Sbjct: 978  acgttcttaacgttgaagcccacgttgctaccagggagaccctcagtaagtgcttcatga 919

Query: 589  tgcatttcgacagaattgacttcagtcgacaaaccttgcggagcaaaagtgacgaccata 648
            |||||||||||||| |||||||||| |||| ||||||||||| |||||||||||||||||
Sbjct: 918  tgcatttcgacagacttgacttcagccgaccaaccttgcggaccaaaagtgacgaccata 859

Query: 649  ccaggcttgatgataccagtttcaacgc 676
            ||||||||||||||||||||||||||||
Sbjct: 858  ccaggcttgatgataccagtttcaacgc 831

Query: untrimmed read. Sbjct: published sequence.
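The agreement between the untrimmed read and the published sequence can be expressed as percent identity over an aligned block. The following is a minimal Python sketch using the 529 to 588 block from the slide. The function name `percent_identity` is hypothetical, and the sketch assumes equal-length, gap-free aligned strings, compared case-insensitively.

```python
def percent_identity(query: str, subject: str) -> tuple[int, int, float]:
    """Return (matches, length, percent identity) for two aligned, gap-free strings."""
    if len(query) != len(subject):
        raise ValueError("aligned strings must have equal length")
    # Compare position by position, ignoring the upper/lower case used in the BLAST output.
    matches = sum(q.upper() == s.upper() for q, s in zip(query, subject))
    return matches, len(query), 100.0 * matches / len(query)


# Aligned block Query 529-588 vs Sbjct 978-919, copied from the slide.
query = "ACGTTCTTAACGTTGAATCCAACGTTGCTACCAgggagagcctcagtaagtgcttcatga"
sbjct = "acgttcttaacgttgaagcccacgttgctaccagggagaccctcagtaagtgcttcatga"

matches, length, identity = percent_identity(query, sbjct)
print(f"{matches}/{length} identical ({identity:.1f}%)")
```

For this block the sketch counts 57 identical positions out of 60, which agrees with the three gaps in the match line of the alignment.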

5 When PHRED meets BLAST
pUC18 (published sequence)
Sequencing reaction: single pool distributed over 3 96-well plates, 3 MegaBACE, 3 reads each, reads total
Processing: MegaBLAST (BLASTn, SWAT)
Phred
–trim: a chromatogram analyzer
–trim_alt: increasing trim_cutoff from 1% up to 25%
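To illustrate why raising trim_cutoff keeps more of each read, here is a Mott-style trimming pass in Python. This is a sketch under stated assumptions, not phred's source code: the function `mott_trim`, the default cutoff of 0.05, and the toy quality values are all hypothetical, chosen only to show the effect of the sweep from 1% to 25%.

```python
def mott_trim(quals, cutoff=0.05):
    """Return (start, end) of the best-scoring segment, end exclusive.

    Each base contributes (cutoff - error probability); the running sum is
    reset to zero whenever it drops to zero or below, and the segment with
    the highest running sum is kept.
    """
    best_sum, best = 0.0, (0, 0)
    running, start = 0.0, 0
    for i, q in enumerate(quals):
        running += cutoff - 10 ** (-q / 10)  # Phred score -> error probability
        if running <= 0:
            running, start = 0.0, i + 1      # restart after a low-quality stretch
        elif running > best_sum:
            best_sum, best = running, (start, i + 1)
    return best


# Hypothetical read: poor quality at both tips, good quality in the middle.
quals = [8, 9, 12, 30, 35, 40, 40, 38, 30, 14, 12, 9, 6]

for cutoff in (0.01, 0.05, 0.10, 0.25):
    start, end = mott_trim(quals, cutoff)
    print(f"trim_cutoff={cutoff:.2f}: keep bases {start}..{end - 1} ({end - start} bases)")
```

With these toy values, the kept segment grows from 6 bases at a 1% cutoff to 12 bases at 25%: a looser cutoff adds tip bases at the price of a higher error rate in those added bases.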

6 The end of an EST. PHRED 10 (10% error): only losses
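The equivalence between PHRED 10 and a 10% error rate follows from the standard Phred definition Q = -10 log10(p), where p is the estimated probability that a base call is wrong. A small Python sketch of the conversion in both directions; the function names are illustrative.

```python
import math


def phred_to_error(q: float) -> float:
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def error_to_phred(p: float) -> float:
    """Convert an error probability (0 < p <= 1) to a Phred quality score."""
    return -10 * math.log10(p)


for q in (10, 15, 20):
    print(f"PHRED {q}: {phred_to_error(q):.2%} error")
```

By the same formula, PHRED 15 corresponds to an error probability of about 3.2% and PHRED 20 to 1%.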

7 BIOINFORMÁTICA UFMG A T G C

8 Error occurrence
[Chart: error occurrence for trim_cutoff values from 1% to 25%; vertical scale 0% to 30%. Series: total miscall, stepwise miscall. Labels: trimmed reads, % error in sequence, added bases, % error in the tip. Annotated values: 16%, 17%, 3%.]

9 Virtual pUC18 protein: STOP = *
>protein_puc18
RQGFPSHDVVKRRPVPSLHACRSTLEDPRVPSSNS*SWS*LFPV*NCYPLTIPHNIRAGS
IKCKAWGA**VS*LTLIALRSLPAFQSGNLSCQLH**IGQRAGRGGLRIGRSSASSLTDS
LRSVVRLRRAVSAHSKAVIRLSTESGDNAGKNM*AKGQQKARNRKKAALLAFFHRLRPPD
EHHKNRRSSQRWRNPTGL*RYQAFPPGSSLVRSPVPTLPLTGYLSAFLPSGSVALSHSSR
CRYLSSV*VVRSKLGCVHEPPVQPDRCALSGNYRLESNPVRHDLSPLAAATGNRISRARY
VGGATEFLKWWPNYGYTRRTVFGICALLKPVTFGKRVGSS*SGKQTTAGSGGFFVCKQQI
TRRKKGSQEDPLIFSTGSDAQWNENSR*GILVMRLSKRIFT*ILLN*K*SFKSI*SIYE*
TWSDSYQCLISEAPISAICLFRSSIVA*LPVV*ITTIREGLPSGPSAAMIPRDPRSPAPD
LSAINQPAGRAERRSGPATLSASIQSINCCREARVSSSPVNSLRNVVAIATGIVVSRSSF
GMASFSSGSQRSRRVT*SPMLCKKAVSSFGPPIVVRSKLAAVLSLMVMAALHNSLTVMPS
VRCFSVTGEYSTKSF*E*CMRRPSCSCPASIRDNTAPHSRTLKVLIIGKRSSGRKLSRIL
PLLRSSSM*PTRAPN*SSASFTFTSVSG*AKTGRQNAAKKGIRATRKC*ILILFLFQYY*
SIYQGYCLMSGYIFECI*KNKQIGVPRTFPRKVPPDV*ETIIIMTLTYKNRRITRPFRLA
RFGDDGENL*HMQLPETVTACL*ADAGSRQARQGASAGVGGCRGWLNYAASEQIVLRVHH
MRCEIPHRCVRRKYRIRRHSPFRLRNCWEGRSVRASSLLRQLAKGGCAARRLSWV
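A "virtual protein" with `*` marking stops is what results from translating a nucleotide sequence straight through in a single reading frame. The slide does not say how `protein_puc18` was produced, so the Python sketch below only illustrates the general idea: the function `translate`, the `X` placeholder for unrecognized codons, and the short example sequence are assumptions, and the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, frame: int = 0) -> str:
    """Translate DNA in one frame without stopping at stop codons.

    Stops are written as '*'; codons containing anything other than
    A, C, G or T are written as 'X'; a trailing partial codon is ignored.
    """
    seq = dna.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Hypothetical example sequence, not taken from pUC18.
print(translate("ATGGCCTAAGGT"))           # frame 0
print(translate("ATGGCCTAAGGT", frame=1))  # frame 1
```

Translating through stops rather than halting at the first one is what lets a protein-level search still score the portion of a read downstream of a miscall.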

10 tBLASTn (BLASTx) maximize with PHRED
[Chart axes: Trim_cutoff parameter value (%); BLASTx score]

11 Summarizing
PHRED meets BLAST as errors in tip are 16%
Molecules carry 3% global error
And scores for EST vs aa comparisons maximize
Real life: crossmatch ends with X's
Authors:
–Fabiano Peixoto (CENAPAD)
–Francisco Prosdocimi (Lab Biodiversidade)
–Maurício Mudado (Lab Biodados)

12 pUC18 virtual protein

