Computer Programming for Biologists, Class 7, Nov 27th, 2014, Karsten Hokamp


1 Computer Programming for Biologists, Class 7, Nov 27th, 2014, Karsten Hokamp
http://bioinf.gen.tcd.ie/GE3M25/programming

2 Hash Variables: Description
- associative arrays
- list of key/value pairs
- values and keys are scalars
- access values by key names
- Great for look-ups!

3 Hash Variables: Look-up Table
Look-up table in real life for translation (genetic code):
    AAA  K
    AAC  N
    AAG  K
    AAU  N
    …    …
    UUG  L
    UUU  F
In Perl use a hash variable:
    %genetic_code = (
        'AAA' => 'K',
        'AAC' => 'N',
        'AAG' => 'K',
        'AAU' => 'N',
        …
        'UUG' => 'L',
        'UUU' => 'F'
    );
Keys are unique!
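A minimal sketch of how such a look-up table could be used for translation. It assumes only the six codons shown on the slide (a full table has 64 entries). The helper name `translate` and the choice of 'X' for unknown codons are illustrative assumptions, not part of the course material.

```perl
use strict;
use warnings;

# Partial codon table: only the six codons shown on the slide.
my %genetic_code = (
    'AAA' => 'K', 'AAC' => 'N', 'AAG' => 'K',
    'AAU' => 'N', 'UUG' => 'L', 'UUU' => 'F',
);

# Hypothetical helper: translate an RNA string codon by codon.
# Codons missing from the table become 'X' (an assumption);
# a trailing incomplete codon is ignored.
sub translate {
    my ($rna) = @_;
    my $protein = '';
    for (my $i = 0; $i + 3 <= length($rna); $i += 3) {
        my $codon = substr($rna, $i, 3);
        $protein .= exists($genetic_code{$codon}) ? $genetic_code{$codon} : 'X';
    }
    return $protein;
}

print translate('AAAUUUUUG'), "\n";   # prints KFL
```

Each codon is simply used as the key, so no chain of if/elsif tests is needed.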

4 Hash Variables: Examples
    %bases = ('a', 'purine', 'c', 'pyrimidine', 'g', 'purine', 't', 'pyrimidine');
    %complement = ('a' => 't', 'c' => 'g', 'g' => 'c', 't' => 'a');
    %letters = (1, 'a', 2, 'b', 3, 'c', 4, 'd');
Hashes: lists with a special relationship between each pair of elements!
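The examples above mix two notations. The short sketch below suggests that the comma form and the `=>` form build the same hash, and shows what "keys are unique" means in practice. The variable names `%with_commas` and `%with_arrows` are made up for illustration.

```perl
use strict;
use warnings;

# The same hash written twice: once as a plain list, once with '=>'.
my %with_commas = ('a', 'purine', 'c', 'pyrimidine');
my %with_arrows = ('a' => 'purine', 'c' => 'pyrimidine');

foreach my $base (sort keys %with_commas) {
    print "$base: $with_commas{$base} / $with_arrows{$base}\n";
}

# Keys are unique: a repeated key keeps only the last value assigned.
my %letters = (1, 'a', 2, 'b', 1, 'z');
my $count = keys %letters;                 # number of keys (scalar context)
print "$count keys, 1 => $letters{1}\n";   # prints: 2 keys, 1 => z
```

The `=>` form is usually easier to read because it makes each key/value pair visible.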

5 Hash Variables: Storing Data
    # count frequency of nucleotides:
    my $As = 0;
    my $Cs = 0;
    my $Gs = 0;
    my $Ts = 0;
    foreach my $nuc (split //, $dna) {
        if ($nuc eq 'A') {
            $As++;
        } elsif ($nuc eq 'C') {
            $Cs++;
        } elsif ($nuc eq 'G') {
            $Gs++;
        } elsif ($nuc eq 'T') {
            $Ts++;
        }
    }

6 Hash Variables: Storing Data
    # count frequency of nucleotides:
    my %freq = ();
    foreach my $nuc (split //, $dna) {
        $freq{$nuc}++;
    }

7 Hash Variables: Storing Data
    # count frequency of nucleotides:
    my %freq = ();
    foreach my $nuc (split //, 'ACTTGGGT') {
        $freq{$nuc}++;
    }
    key  value
    A    1
    C    1
    G    3
    T    3
- keys are stored in no specific order
- auto-initialisation with '' or 0
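Auto-initialisation can be checked directly. The sketch below is an illustration only; the `%seq` hash and the key 'gene1' are hypothetical. It shows a missing key being treated as 0 by `++` and as '' by `.=`, with no warning even under `use warnings`.

```perl
use strict;
use warnings;

my %freq;

# Before the first increment the key is not in the hash at all.
my $before = exists($freq{'G'}) ? 'yes' : 'no';

# '++' treats the missing value as 0, so no explicit "= 0" is needed.
$freq{'G'}++;
$freq{'G'}++;

# '.=' treats a missing value as the empty string in the same way.
my %seq;                      # hypothetical hash: sequence name => sequence
$seq{'gene1'} .= 'ACT';
$seq{'gene1'} .= 'TGG';

print "G present before counting: $before\n";   # no
print "G count: $freq{'G'}\n";                  # 2
print "gene1: $seq{'gene1'}\n";                 # ACTTGG
```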

8 Hash Variables: Scalar vs Hash
    $As = 0;    # As: 0
    $Cs = 0;    # Cs: 0
    $Gs = 0;    # Gs: 0
    $Ts = 0;    # Ts: 0

9 Hash Variables: Scalar vs Hash
    $As = 0; $As++;    # As: 1
    $Cs = 0; $Cs++;    # Cs: 1
    $Gs = 0; $Gs++;    # Gs: 1
    $Ts = 0; $Ts++;    # Ts: 1

10 Hash Variables: Scalar vs Hash
    $As = 0; $As++;    # As: 1
    $Cs = 0; $Cs++;    # Cs: 1
    $Gs = 0; $Gs++;    # Gs: 1
    $Ts = 0; $Ts++;    # Ts: 1
    %freq = ();
    $freq{'Gs'}++;
[Diagram: hash freq with keys Cs, As, Gs, Ts and the value 1]

11 Computer Programming for Biologists: Exercises
Practical: http://bioinf.gen.tcd.ie/GE3M25/programming/class7

12 Hash Variables: Accessing Elements
General: $value = $hash{$key};
Special functions: keys and values
    # get complement of a base
    my $new_base = $complement{$base};
    # get amino acid for a codon
    my $aa = $genetic_code{$codon};
    # list all the aa's that occurred (loop through all keys)
    foreach my $aa (keys %list) {
        print "$aa was found!\n";
    }
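As a possible illustration of `keys`, `values` and single look-ups together, the sketch below uses the `%complement` hash from the earlier examples. The helper `reverse_complement` is a hypothetical name, and copying unknown bases unchanged is an assumption.

```perl
use strict;
use warnings;

my %complement = ('a' => 't', 'c' => 'g', 'g' => 'c', 't' => 'a');

my @bases    = sort keys %complement;     # a c g t
my @partners = sort values %complement;   # a c g t

# Hypothetical helper: look up each base of the reversed sequence.
sub reverse_complement {
    my ($dna) = @_;
    my $result = '';
    foreach my $base (reverse split //, lc $dna) {
        # bases without an entry (e.g. 'n') are copied unchanged
        $result .= exists($complement{$base}) ? $complement{$base} : $base;
    }
    return $result;
}

print "keys: @bases\n";
print "values: @partners\n";
print reverse_complement('aacgt'), "\n";   # prints acgtt
```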

13 Hash Variables: Retrieving a key/value pair
    $freq = $freq{'Gs'};
    print "Gs: $freq\n";
Output:
    Gs: 3
[Diagram: hash %freq with keys Cs, As, Gs, Ts and the value 3]

14 Hash Variables: Retrieving a key/value pair
    $nuc = 'Gs';
    print "$nuc: $freq{$nuc}\n";
Output:
    Gs: 3
[Diagram: hash %freq with keys Cs, As, Gs, Ts and the value 3]

15 Hash Variables: Retrieving a key/value pair
    foreach my $nuc (keys %freq) {
        print "$nuc: $freq{$nuc}\n";
    }
Output (keys come back in no specific order):
    Cs: 1
    Ts: 3
    Gs: 3
    As: 1
[Diagram: hash %freq with keys Cs, As, Gs, Ts and the value 3]

16 Hash Variables: Retrieving a key/value pair
    foreach my $nuc (sort keys %freq) {
        print "$nuc: $freq{$nuc}\n";
    }
Output (keys sorted alphabetically):
    As: 1
    Cs: 1
    Gs: 3
    Ts: 3
[Diagram: hash %freq with keys Cs, As, Gs, Ts and the value 3]
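Sorting by key is one option. A frequency report may read better when sorted by value instead. The sketch below is one way this might be done, using the counts from the slides. The tie-break on key name is an added assumption so that the order is reproducible.

```perl
use strict;
use warnings;

my %freq = ('As' => 1, 'Cs' => 1, 'Gs' => 3, 'Ts' => 3);

# Highest count first; ties are broken alphabetically by key.
my @by_count = sort { $freq{$b} <=> $freq{$a} || $a cmp $b } keys %freq;

foreach my $nuc (@by_count) {
    print "$nuc: $freq{$nuc}\n";
}
# prints Gs: 3, Ts: 3, As: 1, Cs: 1 (one per line)
```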

17 Hash Variables: Checking for keys/values
    # does the key exist?
    if (exists $hash{$key}) { }
    # does the key have a defined value?
    if (defined $hash{$key}) { }
    # does the key have a true value (not undef, 0, '0' or '')?
    if ($hash{$key}) { }
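The three tests give different answers for values such as 0, '' and undef. The sketch below runs all three on a small made-up hash. The helper `check` and the key names are illustrative only.

```perl
use strict;
use warnings;

my %hash = ('one' => 1, 'zero' => 0, 'empty' => '', 'undefined' => undef);

# Hypothetical helper: returns the three test results as a string of 1s and 0s
# in the order exists, defined, true.
sub check {
    my ($key) = @_;
    my $exists  = exists($hash{$key})  ? 1 : 0;
    my $defined = defined($hash{$key}) ? 1 : 0;
    my $true    = $hash{$key}          ? 1 : 0;
    return "$exists$defined$true";
}

foreach my $key ('one', 'zero', 'empty', 'undefined', 'missing') {
    printf "%-10s exists/defined/true = %s\n", $key, check($key);
}
# one 111, zero 110, empty 110, undefined 100, missing 000
```

For frequency counts this matters: a count of 0 is a defined value that fails the plain truth test.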

18 Computer Programming for Biologists: Exercises
Use hashes in your sequence analysis tool for:
- reporting frequencies of nucleotides or amino acids
- reporting the GC content
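One possible starting point for the GC content exercise, not a model solution. The helper name `gc_content` is hypothetical, and the sketch assumes that every character of the sequence counts towards the total length.

```perl
use strict;
use warnings;

# Hypothetical helper: GC content of a DNA string as a percentage.
sub gc_content {
    my ($dna) = @_;
    my %freq;
    foreach my $nuc (split //, uc $dna) {
        $freq{$nuc}++;
    }
    my $total = length($dna);
    return 0 if $total == 0;                 # avoid division by zero
    # '|| 0' covers sequences that contain no G or no C at all
    my $gc = ($freq{'G'} || 0) + ($freq{'C'} || 0);
    return 100 * $gc / $total;
}

printf "GC content: %.1f%%\n", gc_content('ACTTGGGT');   # prints GC content: 50.0%
```

The same `%freq` hash can also be used to print the nucleotide frequencies.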

