
Published by Merilyn Horn. Modified over 4 years ago.

1
LECTURE 2 Splicing graphs / Annotated transcript expression estimation

3
**Transcriptomics: study of RNA biology**

Transcription, alternative splicing. How to measure these?

[Figure labels: gene, genome; transcript expression levels 1203x, 87x, 234x.]

4
**RNA-seq split read alignment**

[Figure labels: gene, genome, align …]

More on this in Thursday's study group.

5
**Splicing graph**

We’ll study this problem next week.

Find paths that best explain the graph, e.g. paths with expression levels 1203x, 87x and 234x.

[Figure: splicing graph with coverage values 220, 1511, 1198, 1303, 1603, 1508, 1380, 1597, 95.]

6
**Annotated transcript expression estimation**

Assume we know the possible transcripts a, b and c, but their relative abundances are unknown (each marked ?x in the figure).

[Figure: splicing graph with coverage values 220, 1511, 1198, 1303, 1603, 1508, 1380, 1597, 95.]

Least squares problem: minimize

$$
\begin{aligned}
f(a,b,c) ={} & (1603-(a+b+c))^2 + (95-b)^2 + (1511-(a+c))^2 \\
 & + (1508-(a+c))^2 + (220-c)^2 + (1198-a)^2 \\
 & + (1380-(a+b))^2 + (1303-(a+b))^2 + (1597-(a+b+c))^2
\end{aligned}
$$
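The objective can be evaluated directly. The following is a minimal sketch, not part of the original slides: the name `OBSERVATIONS` and the encoding of each term as a string of transcript letters are illustrative assumptions, read off the terms of f(a,b,c).

```python
# Hypothetical encoding (not from the slides): each entry pairs an observed
# coverage with the transcripts among "a", "b", "c" that appear in that term of f.
OBSERVATIONS = [
    (1603, "abc"), (95, "b"), (1511, "ac"), (1508, "ac"), (220, "c"),
    (1198, "a"), (1380, "ab"), (1303, "ab"), (1597, "abc"),
]


def f(a, b, c):
    """Sum of squared differences between observed and predicted coverage."""
    level = {"a": a, "b": b, "c": c}
    return sum(
        (observed - sum(level[t] for t in users)) ** 2
        for observed, users in OBSERVATIONS
    )


# A naive guess that copies the coverage of one term per transcript ...
print(f(1198, 95, 220))
# ... fits worse than a guess that balances all nine terms.
print(f(1237, 103, 257))
```

Comparing a few guesses this way shows why a systematic minimization is needed: fitting three terms exactly leaves large errors in the other six.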

7
**Least squares problem**

f(a,b,c) attains its minimum when all partial derivatives of f are zero.

$$
\begin{aligned}
f'_a(a,b,c) ={} & 2(a+b+c-1603) + 2(a+c-1511) + 2(a+c-1508) + 2(a-1198) \\
 & + 2(a+b-1380) + 2(a+b-1303) + 2(a+b+c-1597) \\
f'_b(a,b,c) ={} & 2(a+b+c-1603) + 2(b-95) + 2(a+b-1380) + 2(a+b-1303) + 2(a+b+c-1597) \\
f'_c(a,b,c) ={} & 2(a+b+c-1603) + 2(a+c-1511) + 2(a+c-1508) + 2(c-220) + 2(a+b+c-1597)
\end{aligned}
$$

Setting each derivative to zero, dividing by 2 and collecting terms gives a linear system:

$$
\begin{aligned}
7a + 4b + 4c &= 10100 \\
4a + 5b + 2c &= 5978 \\
4a + 2b + 5c &= 6439
\end{aligned}
$$

Google: "linear equations solver", click the first link, copy-paste the system, click "solve the system", copy-paste the result:

This system has a unique solution, which is { a = 21032/17, b = 5260/51, c = 13097/51 }.
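Instead of an online solver, the same system can be derived and solved exactly in a few lines. This is a minimal sketch using only the Python standard library. The names `ROWS`, `normal_equations` and `solve` are illustrative and not from the slides. The 0/1 indicators are read off the terms of f(a,b,c).

```python
from fractions import Fraction

# Hypothetical encoding: (observed coverage, indicators for transcripts a, b, c).
ROWS = [
    (1603, (1, 1, 1)), (95, (0, 1, 0)), (1511, (1, 0, 1)),
    (1508, (1, 0, 1)), (220, (0, 0, 1)), (1198, (1, 0, 0)),
    (1380, (1, 1, 0)), (1303, (1, 1, 0)), (1597, (1, 1, 1)),
]


def normal_equations(rows):
    """Build M x = r, the system obtained by setting all partial derivatives to zero."""
    n = len(rows[0][1])
    M = [[Fraction(0)] * n for _ in range(n)]
    r = [Fraction(0)] * n
    for observed, ind in rows:
        for i in range(n):
            r[i] += ind[i] * observed
            for j in range(n):
                M[i][j] += ind[i] * ind[j]
    return M, r


def solve(M, r):
    """Gauss-Jordan elimination over exact fractions (assumes a unique solution)."""
    n = len(r)
    aug = [row[:] + [rhs] for row, rhs in zip(M, r)]
    for col in range(n):
        # Pick a row with a non-zero entry in this column and move it into place.
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for i in range(n):
            if i != col:
                factor = aug[i][col]
                aug[i] = [vi - factor * vc for vi, vc in zip(aug[i], aug[col])]
    return [aug[i][n] for i in range(n)]


M, r = normal_equations(ROWS)
a, b, c = solve(M, r)
print(a, b, c)                       # exact fractions
print(float(a), float(b), float(c))  # decimal approximations
```

Exact fractions are used here so the result can be compared digit for digit with the solver output quoted above. For larger problems a floating-point least-squares routine would be the more usual choice.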

8
**Solution**

a ≈ 1237x, b ≈ 103x, c ≈ 257x

[Figure: splicing graph with coverage values 220, 1511, 1198, 1303, 1603, 1508, 1380, 1597, 95.]

9
**Without annotation**

There are $2^n$ possible transcripts for a gene with $n$ exons.

Solve least squares for each combination of possible transcripts and select the combination with the best solution: $2^{2^n}$ combinations to consider.

In next week's Thursday study group we study an algorithm that solves the same problem in polynomial time!

Before that we study an easier problem, where the goal is just to predict transcripts, not their expression levels.
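To get a feel for how quickly this brute-force search space grows, the two counts can be tabulated for small n. This is only an illustrative sketch, and the function name `search_space` is hypothetical.

```python
def search_space(n_exons):
    """Return (candidate transcripts, combinations of candidates) for n exons."""
    candidates = 2 ** n_exons        # every subset of exons is a potential transcript
    combinations = 2 ** candidates   # every subset of candidates is a potential answer
    return candidates, combinations


for n in range(1, 7):
    candidates, combinations = search_space(n)
    print(f"n={n}: {candidates} candidate transcripts, {combinations} combinations")
```

Already for a handful of exons the number of combinations is far beyond what can be enumerated, which is why a polynomial-time algorithm matters.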

10
**Study group this Thursday**

An algorithm for split-read alignment.

Input: maximal exact matches between the genome and an RNA-sequencing read, e.g. ACGATCATCGCT vs. ACGAGATCCGCTAGT.

Such alignment anchors can be computed efficiently using methods from the Biological Sequence Analysis course.
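To make the input concrete, here is a naive sketch that lists maximal exact matches between two short strings. It is a quadratic-time illustration only, not the efficient index-based methods the slide refers to. The function name, the `min_len` parameter and the `(start in s, start in t, length)` output format are assumptions made for this example.

```python
def maximal_exact_matches(s, t, min_len=2):
    """Return (i, j, length) triples with s[i:i+length] == t[j:j+length]
    that cannot be extended to the left or to the right."""
    mems = []
    for i in range(len(s)):
        for j in range(len(t)):
            if s[i] != t[j]:
                continue
            # Skip starts that could be extended to the left (not left-maximal).
            if i > 0 and j > 0 and s[i - 1] == t[j - 1]:
                continue
            # Extend to the right as far as the characters agree.
            k = 0
            while i + k < len(s) and j + k < len(t) and s[i + k] == t[j + k]:
                k += 1
            if k >= min_len:
                mems.append((i, j, k))
    return mems


# The two strings from the slide; only matches of length at least 4 are kept.
for i, j, k in maximal_exact_matches("ACGATCATCGCT", "ACGAGATCCGCTAGT", min_len=4):
    print(i, j, "ACGATCATCGCT"[i:i + k])
```

The length threshold filters out the many short matches that occur by chance over a four-letter alphabet.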

11
**Study group this Thursday**

Output: a consistent split-read alignment that maximally covers the initial local alignments.

[Figure labels: Exon 1, Exon 2, Exon 3.]
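One simple way to read "consistent" and "covering maximally" is as a chaining problem: pick anchors that appear in the same order in the read and in the genome, do not overlap, and together cover as many read positions as possible. The sketch below uses that simplified formulation with a quadratic dynamic program. It is an assumption made for illustration and may differ from the algorithm covered in the study group. The anchor format `(read_start, genome_start, length)` and the name `best_chain` are hypothetical.

```python
def best_chain(anchors):
    """Select a highest-coverage chain of non-overlapping, co-linear anchors.

    anchors: list of (read_start, genome_start, length).
    Returns (covered_length, chain) with the chain in read order.
    """
    if not anchors:
        return 0, []
    anchors = sorted(anchors)
    n = len(anchors)
    best = [length for _, _, length in anchors]  # best coverage of a chain ending at k
    prev = [None] * n                            # predecessor of k in that chain
    for k in range(n):
        rk, gk, lk = anchors[k]
        for p in range(k):
            rp, gp, lp = anchors[p]
            # p may precede k only if it ends before k starts in both sequences.
            if rp + lp <= rk and gp + lp <= gk and best[p] + lk > best[k]:
                best[k] = best[p] + lk
                prev[k] = p
    end = max(range(n), key=best.__getitem__)
    chain = []
    k = end
    while k is not None:
        chain.append(anchors[k])
        k = prev[k]
    chain.reverse()
    return best[end], chain


# Three made-up anchors that could correspond to three exons, plus one inconsistent anchor.
print(best_chain([(0, 100, 30), (30, 500, 25), (55, 900, 20), (10, 2000, 40)]))
```

Large jumps in the genome coordinate between consecutive anchors are allowed here on purpose, since in a split-read alignment they correspond to introns.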
