Transcriptome analysis


1 Transcriptome analysis
Edouard Severing

2 Overview
Introduction: Transcriptome complexity
Transcriptome reconstruction
  Without a genome
  With a genome
Transcript abundances
Differential expression
Transcript abundance models (Maximum likelihood)

3 Gene-expression/Phenotypes
What are the gene expression differences that underlie these phenotypic differences? Gene expression is measured by assessing the abundance of mRNA molecules.

4 Transcriptome vs. genome
Initial assumption: N protein-coding genes, N mRNA molecules, N proteins.
This assumption is based on studies that were performed on bacterial systems.

5 Complexity and gene count
[Figure: gene counts; labels not recoverable]

6 Transcriptome vs. genes in eukaryotes
Current view: N protein-coding genes no longer implies N mRNA molecules, and the number of proteins is an open question. What happens here?

7 Splicing
[Diagram: a gene and its pre-mRNA (exon, intron, exon), each drawn 5’ to 3’; splicing removes the intron and joins the exons]

8 Alternative splicing II (Alternative splicing)
[Diagram: the same pre-mRNA (5’ to 3’) spliced in two different ways, giving two different mRNAs]

9 Complexity and AS
Humans: 90% of genes have AS. Plants: 42% of genes have AS.
The average number of transcripts produced per gene is also higher in humans than in plants.

10 Extremes Dscam gene produces over 35,000 transcripts

11 AS type difference
In humans, exon skipping is the most frequent AS event type.
In plants, intron retention is the most frequent AS event type.

12 RNA editing (Base modification)
[Diagram: primary transcript (predicted sequence), 5’ to 3’, converted by RNA editing into the edited transcript (observed sequence) with individual bases changed]
Difficulty: distinguishing genuine RNA editing from sequencing errors.

13 Translation or decay
A large fraction (>30%) of the transcripts of protein-coding genes are degraded by the nonsense-mediated decay (NMD) pathway. The position of the stop codon is used to predict whether a transcript is likely to be degraded by the NMD pathway.

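14 NMD target prediction
[Diagram: pre-mRNA spliced into mRNA with the exon/exon junctions marked; the open reading frame runs from the start codon (M) to the stop codon, and d marks the distance between the stop codon and the last exon/exon junction]
Transcripts containing a stop codon more than 55 nt upstream of the last exon/exon junction are predicted to be targets of the NMD pathway.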
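
The 55 nt rule can be sketched in a few lines of Python. This is only an illustration: the function names, the use of exon lengths in transcript order, and the choice to measure the distance from the end of the stop codon are assumptions made here, not something the slide specifies.

```python
def last_junction_position(exon_lengths):
    """Transcript coordinate of the last exon/exon junction, or None
    for a single-exon transcript (which has no junction)."""
    if len(exon_lengths) < 2:
        return None
    # The last junction sits where the final exon begins.
    return sum(exon_lengths[:-1])


def is_predicted_nmd_target(exon_lengths, stop_codon_end, threshold=55):
    """Apply the distance rule to one spliced transcript.

    exon_lengths   : exon lengths (nt) in 5' to 3' order
    stop_codon_end : 0-based position just past the stop codon,
                     in transcript coordinates (assumed convention)
    threshold      : minimum distance (nt) upstream of the last junction
    """
    junction = last_junction_position(exon_lengths)
    if junction is None:
        return False
    distance = junction - stop_codon_end
    return distance > threshold


if __name__ == "__main__":
    exons = [200, 150, 300]  # last junction at position 350
    print(is_predicted_nmd_target(exons, stop_codon_end=250))
    print(is_predicted_nmd_target(exons, stop_codon_end=300))
```

A stop codon located in the last exon gives a negative distance, so such transcripts are never flagged by this sketch.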

15 Remember
The number of unique mRNA molecules is much larger than the number of genes.
A large fraction of the mRNA molecules is degraded by the NMD pathway.
NMD provides a means to regulate gene expression at the post-transcriptional level.

16 Transcriptome analysis
Reconstruction of the expressed transcripts from the (fragmented) sequencing data
  Without a reference genome: Trinity, Trans-ABySS and Velvet
  With a reference genome: Cufflinks, Scripture
Determining the relative abundances of the predicted transcripts (Cufflinks)
Differential analysis (Cufflinks)
  Gene expression
  Alternative splicing

17 Without genome I

18 Without genome II

19 With a genome (Spliced alignment)
[Diagram: mRNA (5’ to 3’) aligned to the genome]

20 With a genome

21 With Genome II

22 Assignment: Transcriptome reconstruction
Mapping of reads to the genome using TopHat
Reconstruction of the transcriptome using Cufflinks
BLAST analysis of the assembly result

23 Your login
barshap berryk cizara dennisv dirkv dunyac giorgiot heleenw hildam ioannism jitskel joelk kamleshs leilas luigif mushtar patricial peterve roberte seyeda taox tristanj weic xiaoxues yanickh
Everyone has the same password: wvdABcv12

24 Change password
ssh <yourlogin>@137.224.100.201 (enter your password)
passwd (change it to a new password, then type the new password again)
exit

25 Details
ssh -X <yourlogin>@137.224.100.212
cd /mnt/geninf15/work/bif_course_2012
Assignments are in assignment.txt

26 Estimating Expression levels
Estimating expression levels would be easy if only full-length transcripts were recovered. However, we have transcript fragments. Simply counting the number of reads mapping to a gene or transcript is not good enough (normalization is needed). The number of fragments that can be produced from a transcript depends not only on its abundance but also on its length.

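27 Expression levels
FPKM (fragments per kilobase of transcript per million mapped fragments) is analogous to RPKM (reads per kilobase of transcript per million mapped reads): FPKM counts one fragment where RPKM counts one read.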
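
As a rough illustration of the length and depth normalization behind FPKM, the sketch below uses the plain textbook formula, count × 10^9 / (length × total mapped fragments). The function name and the example numbers are made up for this sketch, and Cufflinks itself estimates abundances with a likelihood model rather than this direct calculation.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    fragment_count         : fragments assigned to this transcript
    transcript_length_bp   : transcript length in base pairs
    total_mapped_fragments : all mapped fragments in the sample
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total fragment count must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


if __name__ == "__main__":
    total = 10_000_000  # hypothetical sequencing depth
    # Twice as long and twice as many fragments: same expression level.
    short = fpkm(500, 2_000, total)
    long_ = fpkm(1_000, 4_000, total)
    print(short, long_)
```

The two example transcripts differ in raw counts only because of their length, which is why raw counts alone would be misleading.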

28 Back to gene level expression (I)

29 Back to gene level expression (II)

30 Differential expression analysis
A gene is differentially expressed under two conditions if its expression difference is statistically significant, that is, larger than you would expect from random natural variation. To estimate the variance it is important to have experimental replicates. (Variation between biological replicates is larger than that between technical replicates.)
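
To make the role of replicates concrete, here is a minimal sketch that compares two conditions with Welch's t statistic. This is a generic stand-in chosen for illustration, not the test Cuffdiff performs, and the expression values are invented.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two groups of replicate measurements."""
    if len(sample_a) < 2 or len(sample_b) < 2:
        raise ValueError("each condition needs at least two replicates")
    # Squared standard error of each group mean.
    se2_a = variance(sample_a) / len(sample_a)
    se2_b = variance(sample_b) / len(sample_b)
    standard_error = math.sqrt(se2_a + se2_b)
    if standard_error == 0:
        raise ValueError("no variation between replicates")
    return (mean(sample_a) - mean(sample_b)) / standard_error


if __name__ == "__main__":
    # Same mean difference (10 units), different replicate spread.
    tight = welch_t([10.0, 12.0, 11.0], [20.0, 22.0, 21.0])
    noisy = welch_t([5.0, 11.0, 17.0], [15.0, 21.0, 27.0])
    print(abs(tight) > abs(noisy))
```

The same difference in means yields a much weaker statistic when the replicates vary more, which is why the variance estimate, and therefore replication, matters.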

31 Expression assignment
Estimate the expression levels of predicted transcripts / genes in Arabidopsis roots and flower buds (Cufflinks).
Differential expression analysis of transcript abundances in Arabidopsis roots and flower buds (Cuffdiff).

