
# LECTURE 2 Splicing graphs / Annotated transcript expression estimation


Transcriptomics
Study of RNA biology: transcription and alternative splicing.
[Figure: a gene on the genome and its alternatively spliced transcripts, with abundances 1203x, 87x and 234x.] How to measure these?

RNA-seq split-read alignment
[Figure: RNA-seq reads are aligned to the gene on the genome.] More on this in Thursday's study group.

Splicing graph
Find paths that best explain the graph, e.g. paths with abundances 1203x, 87x and 234x. We'll study this problem next week.
[Figure: splicing graph labeled with the values 220, 1511, 1198, 1303, 1603, 1508, 1380, 1597 and 95.]

Annotated transcript expression estimation
Assume we know the possible transcripts a, b and c, but their relative abundances are unknown.
[Figure: the splicing graph labeled with the values 220, 1511, 1198, 1303, 1603, 1508, 1380, 1597 and 95; each of the transcripts a, b and c is marked ?x.]
Least squares problem: minimize

$$
\begin{aligned}
f(a,b,c) = {} & (1603-(a+b+c))^2 + (95-b)^2 + (1511-(a+c))^2 \\
& + (1508-(a+c))^2 + (220-c)^2 + (1198-a)^2 \\
& + (1380-(a+b))^2 + (1303-(a+b))^2 + (1597-(a+b+c))^2
\end{aligned}
$$
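As a minimal sketch, the objective f(a, b, c) can be written as a table that pairs each observed value with the transcripts appearing in its term. The pairing below is read off term by term from the formula above; the names `OBSERVATIONS` and `f` are chosen here for illustration and are not from the lecture.

```python
# Each observed value, paired with the transcripts whose abundances
# appear in the corresponding squared term of f(a, b, c).
OBSERVATIONS = [
    (1603, "abc"), (95, "b"), (1511, "ac"), (1508, "ac"), (220, "c"),
    (1198, "a"), (1380, "ab"), (1303, "ab"), (1597, "abc"),
]


def f(a, b, c):
    """Sum of squared differences between observed and explained values."""
    abundance = {"a": a, "b": b, "c": c}
    return sum(
        (observed - sum(abundance[t] for t in transcripts)) ** 2
        for observed, transcripts in OBSERVATIONS
    )


# Compare a naive guess (taking a, b, c from the terms that contain
# only one transcript) with another candidate.
print(f(1198, 95, 220))
print(f(1237, 103, 257))
```

A smaller value of `f` means the candidate abundances explain the observed values better.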

Least squares problem
f(a,b,c) attains its minimum where all partial derivatives of f are zero:

$$
\begin{aligned}
f'_a(a,b,c) = {} & 2(a+b+c-1603) + 2(a+c-1511) + 2(a+c-1508) + 2(a-1198) \\
& + 2(a+b-1380) + 2(a+b-1303) + 2(a+b+c-1597) \\
f'_b(a,b,c) = {} & 2(a+b+c-1603) + 2(b-95) + 2(a+b-1380) + 2(a+b-1303) + 2(a+b+c-1597) \\
f'_c(a,b,c) = {} & 2(a+b+c-1603) + 2(a+c-1511) + 2(a+c-1508) + 2(c-220) + 2(a+b+c-1597)
\end{aligned}
$$

Setting each derivative to zero, dividing by 2 and collecting terms gives a linear system:

$$
\begin{aligned}
7a + 4b + 4c &= 10100 \\
4a + 5b + 2c &= 5978 \\
4a + 2b + 5c &= 6439
\end{aligned}
$$

To solve it: Google "linear equations solver", click the first link, copy-paste the system, click "solve the system", and copy-paste the result:

This system has a unique solution, which is { a = 21032/17, b = 5260/51, c = 13097/51 }.
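As an alternative to an online solver, here is a small sketch that builds the same linear system from the terms of f and solves it exactly with Python's `fractions` module. Cramer's rule is used only because the system is 3x3; the helper names `det3` and `solve3` are illustrative, not from the lecture.

```python
from fractions import Fraction

# Observed values paired with the transcripts in each squared term of f.
OBSERVATIONS = [
    (1603, "abc"), (95, "b"), (1511, "ac"), (1508, "ac"), (220, "c"),
    (1198, "a"), (1380, "ab"), (1303, "ab"), (1597, "abc"),
]
NAMES = "abc"

# Coefficient of unknown v in the equation for unknown u: the number of
# terms containing both. Right-hand side: sum of values in terms containing u.
A = [[sum(1 for _, ts in OBSERVATIONS if u in ts and v in ts) for v in NAMES]
     for u in NAMES]
rhs = [sum(obs for obs, ts in OBSERVATIONS if u in ts) for u in NAMES]


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(matrix, right):
    """Solve a 3x3 system exactly with Cramer's rule."""
    d = det3(matrix)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = right[r]
        solution.append(Fraction(det3(replaced), d))
    return solution


a, b, c = solve3(A, rhs)
print(A, rhs)
print(a, b, c)
print(round(a), round(b), round(c))
```

Working with exact fractions avoids floating-point rounding, so the result can be compared directly with the fractions quoted above.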

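Solution
Rounded to the nearest integer: a = 1237x, b = 103x, c = 257x.
[Figure: the splicing graph labeled with the values 220, 1511, 1198, 1303, 1603, 1508, 1380, 1597 and 95; transcripts a, b and c are marked 1237x, 103x and 257x.]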

Without annotation
$2^n$ possible transcripts for a gene with $n$ exons.
Solve least squares for each combination of possible transcripts and select the combination with the best solution: $2^{2^n}$ combinations to consider.
In next week's Thursday study group we study an algorithm that solves the same problem in polynomial time! Before that we study an easier problem, where the goal is just to predict transcripts, not their expression levels.
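To get a feel for how quickly the brute-force search space grows, the following sketch enumerates every subset of exons as a candidate transcript and counts the combinations of candidates. It assumes, as the count on the slide implies, that every subset of the n exons (including the empty one) is a candidate; the function name is illustrative.

```python
from itertools import combinations


def candidate_transcripts(n):
    """All subsets of exons 1..n, each kept in genomic order."""
    exons = range(1, n + 1)
    return [subset for k in range(n + 1) for subset in combinations(exons, k)]


for n in range(1, 6):
    transcripts = len(candidate_transcripts(n))
    # Every subset of the candidate transcripts is one combination to test.
    print(n, transcripts, 2 ** transcripts)
```

Already for a gene with five exons there are 32 candidate transcripts and more than four billion combinations of them, which is why the brute-force approach is impractical.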

Study group this Thursday
An algorithm for split-read alignment.
Input: maximal exact matches between the genome and an RNA-sequencing read, e.g. ACGATCATCGCT vs. ACGAGATCCGCTAGT.
Such alignment anchors can be computed efficiently using methods from the Biological Sequence Analysis course.
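For illustration, here is a naive sketch that finds the maximal exact matches between the two example strings. It compares all position pairs in quadratic time, so it is not one of the efficient methods the slide refers to. The slide does not say which string is the genome and which is the read, so they are simply called `x` and `y`; the minimum length of 3 is an arbitrary choice made here.

```python
def maximal_exact_matches(x, y, min_len=3):
    """Return sorted (i, j, length) triples with x[i:i+length] == y[j:j+length]
    that cannot be extended to the left or to the right."""
    matches = []
    for i in range(len(x)):
        for j in range(len(y)):
            if x[i] != y[j]:
                continue
            # Skip positions that only continue a match started further left.
            if i > 0 and j > 0 and x[i - 1] == y[j - 1]:
                continue
            # Extend to the right for as long as the characters agree.
            length = 0
            while (i + length < len(x) and j + length < len(y)
                   and x[i + length] == y[j + length]):
                length += 1
            if length >= min_len:
                matches.append((i, j, length))
    return sorted(matches)


x = "ACGATCATCGCT"
y = "ACGAGATCCGCTAGT"
for i, j, n in maximal_exact_matches(x, y):
    print(i, j, x[i:i + n])
```

Each reported triple is one alignment anchor: a start position in each string and the length of the shared substring.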

Study group this Thursday
Output: a consistent split-read alignment that covers as much of the initial local alignments as possible.
[Figure: an alignment spanning Exon 1, Exon 2 and Exon 3.]
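One simple way to make "consistent" and "covering maximally" concrete is colinear chaining: choose anchors that appear in the same order in the read and in the genome, do not overlap, and together cover the most bases. The sketch below is a quadratic dynamic program under those assumptions. It is not necessarily the algorithm covered in the study group, and the anchor coordinates are hypothetical.

```python
def best_chain(anchors):
    """anchors: (read_start, genome_start, length) exact matches.
    Returns (covered_length, chain) for the best colinear, non-overlapping chain."""
    anchors = sorted(anchors)
    best = []  # best[k]: best (covered length, chain) ending at anchors[k]
    for k, (r, g, n) in enumerate(anchors):
        candidate = (n, [anchors[k]])
        for j in range(k):
            rj, gj, nj = anchors[j]
            # Anchor j must end before anchor k starts in both sequences.
            if rj + nj <= r and gj + nj <= g:
                total = best[j][0] + n
                if total > candidate[0]:
                    candidate = (total, best[j][1] + [anchors[k]])
        best.append(candidate)
    return max(best, key=lambda entry: entry[0], default=(0, []))


# Hypothetical anchors: three exon-like matches and one spurious match.
anchors = [(0, 100, 10), (10, 250, 8), (18, 400, 12), (5, 900, 6)]
covered, chain = best_chain(anchors)
print(covered, chain)
```

The gaps between consecutive genome positions in the chosen chain correspond to the introns that the read is split across.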
