Published by Francesca Antill. Modified over 2 years ago.

1
Bioinformatics Algorithms
Genomics, Proteomics, Cellomics/Cytomics

2
Bioinformatics Algorithms
Sequence analysis, phylogeny, pattern recognition, structure prediction, expression analysis, image analysis

3
Associative arrays (hashes)
Associative arrays are similar to normal arrays, with one important difference: instead of using numbers to index each element, you can use more meaningful strings. Associative arrays ("hash variables") are preceded by the percent sign (%); a single element is accessed with $ and curly braces.

    %favourite_cats = ( mary, "Fluffy", jim, "Tibby", fred, "Lucky" );
    $favourite_cats{jim} = "Felix";

    %colour_purple = ( r => 255, g => 0, b => 255 );

    %product = ("Super Widget" => 39.99, "Wonder Widget" => 49.99, "Mega Widget" => );
    $product{"Super Widget"} = 29.99;

    $item = "Super Widget";
    $price = 39.99;
    $product{$item} = $price;
    print $product{"Super Widget"};    # prints 39.99
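As a small illustration of why string keys are convenient in sequence work, the sketch below counts nucleotides with a hash. The DNA string and the variable names are made up for this example and are not from the slides.

```perl
use strict;
use warnings;

# Hypothetical DNA string, used only for illustration.
my $dna = "ATGCGATACG";

# Each nucleotide letter becomes a key; the value is its running count.
my %count;
$count{$_}++ for split //, $dna;

# Walk the keys in sorted order and print each count.
for my $base (sort keys %count) {
    print "$base: $count{$base}\n";
}

# exists() tests for a key without creating it.
print "no N in sequence\n" unless exists $count{N};
```

The same idea (one hash entry per letter) carries over to counting letters per alignment column.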

4
Building a scoring matrix
1. Collect all known sequences for the region of interest.
2. Align all sequences (using multiple sequence alignment).
3. Compute the frequency of each nucleotide at each position (position-specific probability matrix, PSPM).
4. Incorporate the background frequency of each nucleotide (position-specific scoring matrix, PSSM).
5. Score the query sequence with the PSSM.
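The counting and frequency steps can be sketched as follows. This is a minimal sketch, assuming the sequences are already aligned and of equal length; the four-sequence alignment and all variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical toy alignment: four aligned sequences of equal length.
my @alignment = qw(ACGT ACGA ATGT ACCT);
my @bases     = qw(A T C G);
my $N         = scalar @alignment;       # number of sequences
my $len       = length $alignment[0];    # number of alignment columns

# Positional counts: $n[$j]{$base} = occurrences of $base in column $j.
my @n;
for my $seq (@alignment) {
    my @chars = split //, $seq;
    $n[$_]{ $chars[$_] }++ for 0 .. $len - 1;
}

# Frequencies f = n / N (the PSPM); letters never seen in a column get 0.
my @f;
for my $j (0 .. $len - 1) {
    for my $base (@bases) {
        $f[$j]{$base} = ($n[$j]{$base} // 0) / $N;
    }
}

# Print one row per nucleotide and one column per position.
for my $base (@bases) {
    my @row = map { sprintf("%.2f", $f[$_]{$base}) } 0 .. $len - 1;
    print join("\t", $base, @row), "\n";
}
```

An array of hashes keeps the column index numeric while the nucleotide stays a readable string key.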

5
Positional count

6
Frequency matrix (rows: A, T, C, G)

7
N: total number of sequences (15 in this example)
n_i,j: number of times nucleotide i was observed at position j of the alignment
f_i,j = n_i,j / N: frequency of letter i at position j
Prior probability p_i: a priori probability of letter i. In this example p_A = 0.3, p_T = 0.3, p_G = 0.2, p_C = 0.2 (overall frequency of the letters within the Drosophila melanogaster genome).
Normalized with background probability: a positive weight_i,j means that the frequency of letter i at position j of the alignment is higher than the a priori probability of this letter.
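The weight formula itself did not survive in this transcript. One common choice that matches the description (positive when the observed frequency exceeds the prior) is a log-odds weight, log2(f / p). The sketch below assumes that form and additionally assumes a pseudocount of 1 per letter so that unobserved letters do not produce log(0). The counts are hypothetical; only N = 15 and the background probabilities come from the slide.

```perl
use strict;
use warnings;

# Background probabilities from the slide.
my %p = (A => 0.3, T => 0.3, G => 0.2, C => 0.2);

# Hypothetical counts for three alignment positions, N = 15 per column.
my $N = 15;
my @n = (
    { A => 12, T => 1, G => 1,  C => 1 },
    { A => 0,  T => 0, G => 15, C => 0 },
    { A => 3,  T => 9, G => 2,  C => 1 },
);

# Assumed weight: log2(f / p), with f smoothed by a pseudocount of 1
# per letter (4 letters, hence N + 4 in the denominator).
sub weight {
    my ($count, $prior) = @_;
    my $f = ($count + 1) / ($N + 4);
    return log($f / $prior) / log(2);
}

# Build the weight matrix: $w[$j]{$letter}.
my @w;
for my $j (0 .. $#n) {
    $w[$j]{$_} = weight($n[$j]{$_}, $p{$_}) for keys %p;
}

# Score a query by summing the weight of its letter at each position.
sub score {
    my ($query) = @_;
    my @chars = split //, $query;
    my $total = 0;
    $total += $w[$_]{ $chars[$_] } for 0 .. $#chars;
    return $total;
}

printf "%s\t%.2f\n", $_, score($_) for qw(AGT CCC);
```

With these toy counts a query that follows the frequent letters (AGT) scores above zero, while one built from rare letters (CCC) scores below zero.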

8
Shapiro-Senapathy Scoring Schema
RNA splice junctions of different classes of eukaryotes: sequence statistics and functional implications in gene expression. (Nucleic Acids Res Sep 11;15(17): )
Figure: exon/intron boundary, 5’ to 3’; matrix rows A, C, G, T; sequence shown: AGGTA/GAGT
Score = 100 × (t - min_t) / (max_t - min_t), with max_t = 595 and min_t = 47

9
Scoring a sequence
Sequence: AAGTGAGT
t = 509
min_t = 47, max_t = 595
Score = 100 × (509 - 47) / (595 - 47) = 84.3
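The normalization used in this worked example can be checked with a few lines of Perl. The function name ss_score is a hypothetical helper for this sketch; the numbers are the ones given on the slide.

```perl
use strict;
use warnings;

# Rescale a raw total t to the range 0..100, given the smallest and
# largest totals the matrix can produce.
sub ss_score {
    my ($t, $min_t, $max_t) = @_;
    return 100 * ($t - $min_t) / ($max_t - $min_t);
}

# Values from the slide: t = 509, min_t = 47, max_t = 595.
printf "%.1f\n", ss_score(509, 47, 595);    # prints 84.3
```

A total equal to min_t maps to 0 and a total equal to max_t maps to 100, so scores from different matrices become comparable.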
