PAVE Overview
  Assembler and SNP finder: Will
  WEB annotate: Will
  WEB display: Will
  Java annotate: Cari
  Java viewPAVE: Cari and Mark
  Java cmpPAVE: Mark
  Data organization, overall design: Cari
1
454 MSU 3
Illumina NCGR 4
454+Ilm RedRice (OlR) 6
  Total 454: 17,624
  Total Illumina: 21,083
  Need to compute Illumina low coverage (~singletons) from expression level.
454+Ilm Ginger (ZoR) 8
PAVE assembler
  Assemble Sanger reads with mate-pairs, retaining mate-pairs in contigs
  Assemble 454 reads: can handle ~500,000 easily by burying redundancy
  Assemble consensus sequences from 454 and Illumina
  SNPs: most approaches use two confirming bases, but with 454 there are far too many false positives due to homopolymers. So a 'p-value' is computed based on the number of confirming ESTs, the depth of ESTs at the base, and the estimated base-call error.
  Script to add expression level
9
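The slide does not give the formula behind the SNP 'p-value', so the following is only a minimal sketch under one plausible assumption: every read at the base miscalls independently with the same error rate, and the score is the binomial probability of seeing at least the observed number of confirming ESTs by error alone. The function name `snp_p_value` and its parameters are hypothetical and are not taken from PAVE.

```python
from math import comb


def snp_p_value(confirming: int, depth: int, error_rate: float) -> float:
    """Upper-tail binomial probability P(X >= confirming).

    ASSUMPTION (not from the slides): X ~ Binomial(depth, error_rate),
    i.e. each EST covering the base is miscalled independently with
    probability `error_rate`. A small value suggests the variant is
    unlikely to be explained by base-call error alone.
    """
    if not 0 <= confirming <= depth:
        raise ValueError("confirming must be between 0 and depth")
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    # Sum the binomial probabilities for k = confirming .. depth.
    return sum(
        comb(depth, k) * error_rate**k * (1.0 - error_rate) ** (depth - k)
        for k in range(confirming, depth + 1)
    )


if __name__ == "__main__":
    # Hypothetical numbers: 3 confirming ESTs at depth 20, 1% base-call error.
    print(snp_p_value(3, 20, 0.01))
```

Under this assumption, a higher estimated error rate (as might be used inside a 454 homopolymer run) raises the score for the same counts, so more confirming ESTs are needed before a SNP is called.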
Web PAVE 10
15 Ginger with Illumina expression levels
UniProt table 17
18 Unitrans tables
Immediate Future
  Streamline and combine web and Java annotation
  cmpPAVE
    Show alignments of:
      a protein to all its UniTrans
      a UniTrans to all its proteins
    Self-blast all UniTrans, create clusters of cliques, with display and filters similar to UniProt
    Incorporate GO, EC, etc.
  viewPAVE
    Show alignment of all proteins for a UniTrans
    Show coverage of reads as a histogram
    Cluster similar UniTrans (instead of Pairs)
    Incorporate GO and EC (i.e. same functionality as Web)
  Web PAVE
    Remove count of ESTs; instead indicate protein coverage
  How much further we extend…
20
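The "clusters of cliques" item could be sketched as follows. This is not the planned cmpPAVE algorithm, which the slides do not describe; it is a simple greedy partition, assuming the self-blast has already been reduced to a list of pairs of UniTrans ids that hit each other. The function name `greedy_clique_clusters` and the ids in the example are hypothetical.

```python
from collections import defaultdict


def greedy_clique_clusters(hits):
    """Partition ids into cliques, given (id_a, id_b) pairs that hit each other.

    ASSUMPTION: hits are treated as symmetric, and a greedy heuristic is
    used (seed with the most-connected id first). The result is a valid
    set of cliques but not necessarily the largest possible ones.
    """
    neighbors = defaultdict(set)
    for a, b in hits:
        if a == b:
            neighbors[a]  # a self-hit only registers the id
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)

    unassigned = set(neighbors)
    clusters = []
    # Highest degree first; ties broken by id so the output is deterministic.
    for seed in sorted(neighbors, key=lambda n: (-len(neighbors[n]), n)):
        if seed not in unassigned:
            continue
        clique = [seed]
        unassigned.discard(seed)
        for cand in sorted(neighbors[seed] & unassigned):
            # A candidate joins only if it hits every current member.
            if all(cand in neighbors[member] for member in clique):
                clique.append(cand)
                unassigned.discard(cand)
        clusters.append(sorted(clique))
    return clusters


if __name__ == "__main__":
    pairs = [("u1", "u2"), ("u2", "u3"), ("u1", "u3"), ("u3", "u4")]
    print(greedy_clique_clusters(pairs))
```

In the example, u4 hits only u3, so it is left in its own cluster instead of joining the u1/u2/u3 clique. Requiring all-against-all hits keeps clusters tight, at the cost of splitting groups whose members are linked only through an intermediate sequence.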
22 NRO: would need to reduce hits to D6NDn_9POAL, because one could have a best of 42 and another of 43…
Clustering
  Best clustering for paralogs in viewPAVE and orthologs in cmpPAVE:
    Same protein region
    ProDom and/or Motif
    EC
    GO/GO Slim
23