Weixi Zhong. Mentor: Dr. Andrew Cameron, Center for Computational Regulatory Genomics, California Institute of Technology.

1 Weixi Zhong. Mentor: Dr. Andrew Cameron, Center for Computational Regulatory Genomics, California Institute of Technology

2
- Set up an accessible database for the Eucidaris tribuloides transcriptome
- Compare the quality of E. tribuloides RNA sequence assemblies
- Choose the best assembly
- Create a sequence database
- Create a web interface to access the database
- Facilitate future E. tribuloides gene studies
- Share findings on the E. tribuloides transcriptome
- Extensions after further research (e.g. more search options, feedback, etc.)

3
- Strongylocentrotus purpuratus
  - Only echinoderm with a fully sequenced genome
  - Evolutionarily closer to humans than many other model organisms used in developmental biology
- Eucidaris tribuloides
  - Distant relative of S. purpuratus (~275 million years)
  - Useful in comparative studies
Image courtesy of SpBase (http://www.spbase.org/)

4 Gene regulatory differences?
[Images: S. purpuratus and E. tribuloides gastrulae]
*Red arrows point to mesenchyme cells, which develop later in E. tribuloides than in other sea urchins; circles indicate the location of the blastopore.
Microscope images of sea urchin gastrulae courtesy of Dr. Andrew Cameron

5
- No available E. tribuloides genome
- Assemble the transcriptome instead
[Flow diagram: early Et gastrula → RNA → cDNA → Solexa reads → Velvet assembly → quality comparison → database → expression studies]

6
- High-throughput short-read sequencing technology
[Diagram: cDNA (...AGGTCTTAC...) → sequenced reads]

7
- De novo genome assembly software developed by Daniel Zerbino and Ewan Birney at the European Bioinformatics Institute (EMBL-EBI) in the UK
- Contig: the sequence of a set of contiguous overlapping reads
- Contigs from a single Velvet run are assumed to be unique and non-overlapping
[Diagram: information from Solexa reads → contigs. The overlapping fragments AGCAT, GCATA, CATAC, ATACC, TACCT, ACCTG, CCTGT, CTGTA, TGTAA join into the contig AGCATACCTGTAA]
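
The diagram on this slide can be reproduced with a few lines of Python. This is only a minimal sketch of the idea of joining fragments that overlap by all but one base; it is not Velvet's implementation, which builds and simplifies a de Bruijn graph. The function name `assemble_linear_path` is made up for illustration.

```python
def assemble_linear_path(kmers):
    """Join an ordered list of k-mers where each one overlaps the
    previous one by k-1 bases (a single unbranched path)."""
    if not kmers:
        raise ValueError("need at least one k-mer")
    contig = kmers[0]
    for kmer in kmers[1:]:
        overlap = len(kmer) - 1
        # The end of the growing contig must match the start of the next k-mer.
        if contig[-overlap:] != kmer[:overlap]:
            raise ValueError(f"{kmer} does not overlap the contig end")
        contig += kmer[-1]  # each new k-mer contributes exactly one new base
    return contig


# Fragments taken from the slide.
slide_kmers = ["AGCAT", "GCATA", "CATAC", "ATACC", "TACCT",
               "ACCTG", "CCTGT", "CTGTA", "TGTAA"]
print(assemble_linear_path(slide_kmers))
```

With the slide's nine fragments this yields the 13-base contig shown in the diagram. Real read sets contain errors, repeats, and branches, which is why an assembler needs a graph rather than a simple chain.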

8
- Assess the quality of an assembly using its length distribution: N50 and 90% complexity calculations
- N50: the length of the shortest contig such that the summed length of all contigs of equal or greater length constitutes at least 50% of the total length of all contigs*
- 90% complexity: similar, assuming unique contigs
*N50 definition based on the definition by Jeremy Leipzig (http://jermdemo.blogspot.com/2008/11/calculating-n50-from-velvet-output.html)
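
The N50 definition above translates directly into code. The following is a minimal sketch, not the script used in the project; the function name `nxx` and the example lengths are invented for illustration. Treating "90% complexity" as the same calculation with a 90% threshold is an assumption based on the slide's word "similar".

```python
def nxx(contig_lengths, fraction=0.5):
    """Return the length of the shortest contig such that contigs of that
    length or longer cover at least `fraction` of the total assembly length.
    fraction=0.5 gives N50."""
    ordered = sorted(contig_lengths, reverse=True)
    if not ordered:
        raise ValueError("no contigs given")
    threshold = fraction * sum(ordered)
    running_total = 0
    for length in ordered:
        running_total += length
        if running_total >= threshold:
            return length


# Made-up contig lengths: total 32, so the N50 threshold is 16.
lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(nxx(lengths))        # N50
print(nxx(lengths, 0.9))   # assumed reading of the 90% statistic
```

A higher N50 means more of the assembly sits in long contigs, which is why it is a convenient single number for comparing Velvet runs made with different parameters.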

9
- Use the S. purpuratus proteome as a reference
- Map contigs to the proteome
- Using the proteome “removes” silent-mutation differences between genes
- Record metadata: count of matches, annotated matches, unique matches

GLEAN3_00299:   LEU-MET-TYR-PHE-GLU-GLY-CYS-LEU-LYS
S. purpuratus:  CTC-ATG-TAC-TTC-GAG-GGA-TGC-TTG-AAG
E. tribuloides: TTG-ATG-TAT-TTT-GAA-GGA-TGC-CTG-AAA
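
The codon example on this slide can be checked in a few lines: the two nucleotide sequences differ at several positions, yet both translate to the same peptide. This sketch uses the standard genetic code and is only an illustration of why a protein reference hides silent mutations; the project itself reports blastx scores for its matches, and the helper names here are made up.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES),
    AMINO_ACIDS,
))


def translate(dna):
    """Translate a DNA string (length a multiple of 3) into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def count_differences(seq_a, seq_b):
    """Count positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


# Sequences from the slide, with the codon separators removed.
sp = "CTC-ATG-TAC-TTC-GAG-GGA-TGC-TTG-AAG".replace("-", "")
et = "TTG-ATG-TAT-TTT-GAA-GGA-TGC-CTG-AAA".replace("-", "")

print(translate(sp), translate(et))   # same peptide for both species
print(count_differences(sp, et))      # nucleotide-level differences
```

A nucleotide-level comparison would penalize every one of those differences, while the protein-level comparison sees a perfect match.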

10
- Create a database using PostgreSQL
  - User information table
  - Contig information table
  - Gene information table
  - Contig-gene match information table
  - Sequences
- Write a webpage to access the database
  - Ability to search using both species
  - Display in text and graphical formats
*Sample database site courtesy of Autumn Yuan
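
One possible shape for these tables is sketched below. This is an assumption-heavy illustration, not the project's schema: every table name, column name, and data row is hypothetical (the columns are loosely based on the fields shown on the later interface slides), and Python's built-in `sqlite3` module stands in for PostgreSQL so that the example runs without a server.

```python
import sqlite3

# Hypothetical schema; the real PostgreSQL tables are not shown in the slides.
SCHEMA = """
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE contigs (
    contig_id INTEGER PRIMARY KEY,
    name      TEXT UNIQUE NOT NULL,
    length    INTEGER NOT NULL,
    coverage  REAL,
    sequence  TEXT NOT NULL
);
CREATE TABLE genes (
    gene_id INTEGER PRIMARY KEY,
    spu_id  TEXT UNIQUE NOT NULL,
    name    TEXT
);
CREATE TABLE contig_gene_matches (
    contig_id    INTEGER NOT NULL REFERENCES contigs(contig_id),
    gene_id      INTEGER NOT NULL REFERENCES genes(gene_id),
    blastx_score REAL,
    e_value      REAL,
    PRIMARY KEY (contig_id, gene_id)
);
"""


def build_demo_db():
    """Create an in-memory database with one made-up contig-gene match."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO contigs VALUES (1, 'contig_1', 13, 4.2, 'AGCATACCTGTAA')")
    conn.execute("INSERT INTO genes VALUES (1, 'SPU_000001', 'example gene')")
    conn.execute("INSERT INTO contig_gene_matches VALUES (1, 1, 55.0, 1e-10)")
    return conn


def contigs_for_gene(conn, spu_id):
    """List contigs matching a gene, best (lowest) e-value first."""
    query = """
        SELECT c.name, c.length, m.blastx_score, m.e_value
        FROM contig_gene_matches AS m
        JOIN contigs AS c ON c.contig_id = m.contig_id
        JOIN genes   AS g ON g.gene_id   = m.gene_id
        WHERE g.spu_id = ?
        ORDER BY m.e_value
    """
    return conn.execute(query, (spu_id,)).fetchall()


print(contigs_for_gene(build_demo_db(), "SPU_000001"))
```

Keeping matches in their own table lets one contig match several genes and one gene match several contigs, which is what searching "using both species" requires.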

11 Eucidaris tribuloides RNA Sequence Database
[Site map: Search; Sp genome search results; Contig information popup; Gene match information graphical display; Et contigs search results; Match details popup; Tabular display; Search history; Change display order]

12
- Conduct research using database information
- Share data with researchers through the website
- Add functionality to the website as research findings evolve

13 Special thanks to:
- The SoCalBSI faculty and staff: Dr. Jamil Momand, Dr. Sandy Sharp, Dr. Nancy Warter-Perez, Dr. Wendie Johnston, Dr. Beverly Krilowicz, and Ronnie Cheng
- My mentor: Dr. Andrew Cameron
- The CCRG staff: Autumn Yuan, Dong He, Dave Felt
- All the SoCalBSI interns
Funded by:

15 Search using either species; choose search criterion; enter search terms; narrow down results; examples

16 Sp gene name with link to match display page; official identifier for this sequence in the Sp genome; link to result page for all Et contigs that match this gene; link to SpBase page for this gene

17 Contig name, with link to popup with contig information; contig length; top SPU matches for this contig; corresponding genes if they exist, with link to display page; Blastx score and e-value

18 Contig name; contig length; contig coverage; contig sequence

19 Gene name; basic gene information and links to more comprehensive webpages; change display format; alignment; link to popup with detailed alignment

20 Change display format; tabulated alignment summary; link to popup with detailed alignment

21 Alignment details; contig information
