Phytome A Data Analysis Pipeline presented by Jason Phillips.


1 Phytome A Data Analysis Pipeline presented by Jason Phillips

2 High Level Flow Chart: Retrieve Unigenes → Translate Unigenes → Families

3 Main Outline ● Unigenes (Where'd they come from, where'd they go?) ● Translation (methods and procedures) ● Building Families (the power of together-ness)

4 phytome » Unigene ● What are? ● Where from? ● Nine Species ● Arabidopsis, a special case ● Storage

5 phytome » Unigene » What Are? Combined ESTs that overlap

6 phytome » Unigene » Where From? ● TIGR ● Other sources?

7 phytome » Unigene » Nine Species

8 phytome » Unigene » Arabidopsis Highly annotated... Highly sequenced... Highly translated...

9 phytome » Unigene » Storage
species   count
-------------------
ghir      24350
mcry      8455
osat      60778
hann      20520
mtru      36976
lesc      31012
ljap      11025
lsat      21960
atha      27170
-------------------
total:    242246

10 phytome » Translation ● Methods ● Estwise ● Estscan ● FrameFinder ● Procedure ● Numbers

11 phytome » Translation » methods ● EST-WISE: homologies via BLAST (sprot + trembl) ● ESTSCAN: ab initio ● FRAMEFINDER: ab initio
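To make the translation step concrete, here is a minimal sketch of a naive six-frame translation that keeps the longest stop-free stretch. It is only an illustration and not what EST-WISE, ESTScan, or FrameFinder actually do; those tools use homology or statistical models. The function names and the choice of the standard genetic code are assumptions made for this example.

```python
# Naive six-frame translation (illustration only; NOT the algorithm used by
# EST-WISE, ESTScan, or FrameFinder). Assumes the standard genetic code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def longest_open_frame(seq):
    """Return the longest stop-free peptide over all six reading frames."""
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for offset in range(3):
            for segment in translate(strand[offset:]).split("*"):
                if len(segment) > len(best):
                    best = segment
    return best


peptide = longest_open_frame("ATGGCCAAAGGGTTTTAA")
```

A longest-frame heuristic like this can pick the wrong strand or frame on short or error-prone ESTs, which is one reason to compare several dedicated methods as the slide does.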

12 phytome » Translation » procedure ● EST-WISE (Mac OSX Cluster) – blast swiss prot: 10.3 hours, 35 nodes (~15 days) – blast trembl: 35.7 hours, 35 nodes (~52 days) ● ESTSCAN (Mustard) ● FrameFinder (Mustard)
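The figures in parentheses appear to be wall-clock time multiplied by node count, expressed in days. That reading is an assumption, but the arithmetic is consistent with the slide, as this small sketch shows (the helper name `node_days` is hypothetical):

```python
def node_days(wall_clock_hours, nodes):
    """Convert wall-clock hours on a cluster into single-node days."""
    return wall_clock_hours * nodes / 24.0


# Values taken from the slide; the interpretation is an assumption.
swissprot_days = node_days(10.3, 35)  # roughly 15 days
trembl_days = node_days(35.7, 35)     # roughly 52 days
```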

13 phytome » Translation » numbers
242,246 Unigenes
method        translated   not translated
-----------   ----------   --------------
ESTWISE       151,830      90,416
ESTSCAN       226,988      15,258
FRAMEFINDER   242,242      4

14 phytome » Families ● Relationships ● Clustering ● Numbers

15 phytome » Families » Relationships Blast everything against everything: sequences → blastable db of sequences
query       sbjct       e-value
---------   ---------   -------
mtru302     ljap4523    1e-29
mtru302     lesc25072   1e-26
mtru302     hann20270   5e-24
osat59606   osat59606   1e-157
osat59606   osat4002    1e-96
osat59606   atha25166   1e-88
..............
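As a rough sketch of how a three-column hit list like the one on this slide could be read into memory, the snippet below parses whitespace-separated `query sbjct e-value` rows. The column layout, the `1e-29` notation for e-values, the `Hit` type, and the `max_evalue` cutoff are all assumptions for illustration; the actual file format used by the pipeline is not shown in the slides.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    sbjct: str
    evalue: float


def parse_hits(text, max_evalue=1e-10):
    """Yield non-self hits at or below a (hypothetical) e-value cutoff."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        query, sbjct, raw = fields
        try:
            evalue = float(raw)
        except ValueError:
            continue  # header or rule line, not a data row
        if query != sbjct and evalue <= max_evalue:
            yield Hit(query, sbjct, evalue)


SAMPLE = """\
query      sbjct      e-value
---------  ---------  -------
mtru302    ljap4523   1e-29
mtru302    lesc25072  1e-26
mtru302    hann20270  5e-24
osat59606  osat59606  1e-157
osat59606  osat4002   1e-96
osat59606  atha25166  1e-88
"""

hits = list(parse_hits(SAMPLE))
```

Self-hits such as `osat59606` against itself are dropped here because they carry no information about relationships between different sequences.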

16 phytome » Families » Relationships But we have 4 sets of sequences! ● tblastx: 242,246 nucleotides ● blastp: 151,830 estwise ● blastp: 226,988 estscan ● blastp: 242,242 framefinder Which method do we trust?

17 phytome » Families » Relationships 4 data sets... 4 family interpretations tb ew es ff ~3 days, 28 nodes (~84 days) ~1/4 day, 21 nodes (~5 days) BLAST OFF!

18 phytome » Families » Relationships BLAST RESULTS
Method   size     no blast   no trans   attrition
------   ------   --------   --------   ---------
tb       242246   153        0          153
ew       151830   22         90416      90438
ff       242242   24563      4          24567
es       226988   1345       15258      16603
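The attrition column on this slide is consistent with the sum of the "no blast" and "no trans" columns, and each method's size plus its "no trans" count adds up to the 242,246 starting unigenes. A short sketch that checks both relationships (the dictionary layout and helper name are made up for this example):

```python
TOTAL_UNIGENES = 242246

# method: (size, no_blast, no_trans, attrition), copied from the slide
RESULTS = {
    "tb": (242246, 153, 0, 153),
    "ew": (151830, 22, 90416, 90438),
    "ff": (242242, 24563, 4, 24567),
    "es": (226988, 1345, 15258, 16603),
}


def attrition(no_blast, no_trans):
    """Sequences lost either to failed translation or to having no BLAST hit."""
    return no_blast + no_trans


computed = {
    method: attrition(no_blast, no_trans)
    for method, (_size, no_blast, no_trans, _reported) in RESULTS.items()
}
```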

19 phytome » Families » Clustering TRIBE MCL evalue gene

20 phytome » Families » Clustering TRIBE MCL evalue gene

21 phytome » Families » Clustering blast results → tribe mcl → families
query       sbjct       evalue
---------   ---------   ------
atha7499    atha8483    6e-78
atha7499    atha7503    4e-90
osat23081   atha10704   8e-78
osat23081   osat36667   8e-78
atha1072    atha5059    2e-68
atha1072    lsat15421   2e-60
atha1072    lsat21190   1e-102
atha1072    atha5059    9e-54
...............
fam id   member
------   ---------
...........
4035     atha7499
4035     atha7503
4035     atha8483
4036     atha10704
4036     osat23081
4036     osat36667
4037     atha1072
4037     atha5059
4037     lsat15421
4037     lsat21190
.....................
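To illustrate how pairwise hits turn into families, here is a deliberately simplified sketch that groups sequences by connected components (single linkage) using union-find. This is a swapped-in technique, not TRIBE-MCL: Markov clustering weights edges by e-value and can split loosely connected groups that single linkage would merge. The e-value notation, the `max_evalue` cutoff, and the function name are assumptions for this example.

```python
def build_families(hits, max_evalue=1e-50):
    """Group sequence ids into families by single linkage (NOT TRIBE-MCL)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for query, sbjct, evalue in hits:
        root_q, root_s = find(query), find(sbjct)  # registers both ids
        if evalue <= max_evalue:
            parent[root_q] = root_s

    families = {}
    for member in list(parent):
        families.setdefault(find(member), set()).add(member)
    return sorted(sorted(members) for members in families.values())


# Pairs from the slide, with e-values written in assumed 1e-NN notation.
SLIDE_HITS = [
    ("atha7499", "atha8483", 6e-78),
    ("atha7499", "atha7503", 4e-90),
    ("osat23081", "atha10704", 8e-78),
    ("osat23081", "osat36667", 8e-78),
    ("atha1072", "atha5059", 2e-68),
    ("atha1072", "lsat15421", 2e-60),
    ("atha1072", "lsat21190", 1e-102),
    ("atha1072", "atha5059", 9e-54),
]

families = build_families(SLIDE_HITS)
```

On this small excerpt the three resulting groups have the same members as families 4035, 4036, and 4037 on the slide, but that agreement should not be expected on the full data set.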

22 phytome » Families » Clustering blast results (tb, ff, es, ew) → TRIBE MCL → families (tb, ff, es, ew)

23 phytome » Families » Clustering Let's look at some histograms!

24 What should we do next round?

