Teaching Materials by Ivan Ovcharenko


1 Teaching Materials by Ivan Ovcharenko <IVOvcharenko@lbl
Teaching Materials by Ivan Ovcharenko part 3

2 part 2
File management: How to deal with files? Read data from a file? Write to a file?
Regular expressions: How to compare strings? Extract substrings? Substitute characters?
Practical example: reverse complement of a DNA sequence

3 File management: Windows Graphical User Interface
1. Open 2. Edit 3. Save

4 Perl file management structure is simpler: the Edit step becomes Read or Write
1. Open file 2. READ or WRITE data (line by line) 3. Close file

5 How to open and close the file "data.txt" from a Perl program?

    # open the data.txt file for READING
    open (FILE, "< data.txt");

    # close the file specified by the FILE filehandle
    close (FILE);

FILE is the filehandle. This name is used everywhere later in the program whenever we deal with this file.
The symbol before the file name gives the direction of the file data flow:
< - READ from a file
> - WRITE to a file
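The open call above does not report a failure, for example when data.txt does not exist. A minimal sketch of a more defensive variant, assuming the three-argument form of open and a filehandle stored in an ordinary variable; the helper name can_read and the file names are hypothetical and used only for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the file can be opened for reading, 0 otherwise.
sub can_read {
    my ($path) = @_;
    # Three-argument open: the mode ("<") is kept separate from the file name.
    open(my $fh, "<", $path) or return 0;
    close($fh);
    return 1;
}

# Hypothetical file name, used only for illustration.
my $path = "data.txt";
if (can_read($path)) {
    print "$path can be opened for reading\n";
} else {
    # $! holds the operating system's error message from the failed open.
    print "Cannot open $path: $!\n";
}
```

Checking the return value of open is what makes a missing or unreadable file visible right away instead of producing an empty result later.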

6 Writing "Hello everyone" to the "tmp.txt" file:

    #!/usr/local/bin/perl
    open (FILE, "> tmp.txt");
    print FILE "Hello everyone\n";
    close (FILE);
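Opening with > replaces any previous content of the file. A short sketch of the difference between > (overwrite) and >> (append), assuming a scratch file created with the core File::Temp module so that no real file is touched; the variable names are illustrative:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file that is deleted when the program exits.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
close($tmp_fh);

# ">" creates the file or empties it before writing.
open(my $out, ">", $tmp_name) or die "Cannot write $tmp_name: $!";
print $out "Hello everyone\n";
close($out) or die "Cannot close $tmp_name: $!";

# ">>" keeps the existing content and adds to the end.
open($out, ">>", $tmp_name) or die "Cannot append to $tmp_name: $!";
print $out "Hello again\n";
close($out) or die "Cannot close $tmp_name: $!";

# Read the file back to confirm that both lines are there.
open(my $in, "<", $tmp_name) or die "Cannot read $tmp_name: $!";
my @lines = <$in>;
close($in);
print @lines;
```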

7 Reading data from a file

    #!/usr/local/bin/perl
    # open file data.txt for reading
    open (FILE, "< data.txt");

    # read the file line by line and print it out to the screen
    while ($line = <FILE>) {
        print "$line";
    }

    # close file
    close (FILE);

The while loop is analogous to the for loop. All of its body statements are executed repeatedly as long as the condition in parentheses is true.
$line = <FILE> reads the next line from the file specified by the filehandle FILE. At the end of the file it returns an undefined value, which ends the loop.
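The same reading loop can be tried without any file on disk. A minimal sketch, assuming an in-memory filehandle (open on a reference to a string) in place of data.txt; the sample text and variable names are made up for illustration:

```perl
use strict;
use warnings;

# Sample content standing in for a file (illustrative data only).
my $text = "first line\nsecond line\nthird line\n";

# Opening a reference to a string gives a filehandle that reads from memory.
open(my $fh, "<", \$text) or die "Cannot open in-memory file: $!";

my @numbered;
my $count = 0;
while (my $line = <$fh>) {
    chomp($line);                      # drop the trailing "\n"
    $count++;
    push @numbered, "$count: $line";   # keep each line with its number
}
close($fh);

print "$_\n" for @numbered;
```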

8 Example. Calculating a sum of numbers in the file data.txt

Contents of data.txt:

    1
    18
    23
    2

    #!/usr/local/bin/perl
    $sum = 0;
    open (FILE, "< data.txt");
    while ($line = <FILE>) {
        chomp($line);
        $sum = $sum + $line;
    }
    close (FILE);
    print "Sum of the numbers in data.txt file is $sum\n";

The chomp command removes the "\n" (new line) symbol from the end of the string.

Output:

    Sum of the numbers in data.txt file is 44
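The sum example assumes that every line holds a number. A hedged sketch of one way to skip blank or non-numeric lines, assuming looks_like_number from the core Scalar::Util module and an in-memory filehandle in place of data.txt; the extra blank and "abc" lines are invented test data:

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Same numbers as in data.txt, plus one blank and one non-numeric line (invented).
my $data = "1\n18\n\n23\nabc\n2\n";
open(my $fh, "<", \$data) or die "Cannot open in-memory file: $!";

my $sum     = 0;
my $skipped = 0;
while (my $line = <$fh>) {
    chomp($line);
    if (looks_like_number($line)) {
        $sum += $line;      # add only lines that are valid numbers
    } else {
        $skipped++;         # count everything else
    }
}
close($fh);

print "Sum is $sum (skipped $skipped lines)\n";
```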

9 Operations with strings

    $string = "Hello everybody\n";

    # concatenating strings
    $strA = "Hello ";
    $strB = "everybody\n";
    $string = $strA . $strB;

    # length of the string -> number of characters inside
    $strLen = length ($strA);   # $strLen = 6;

    # extracting a part of a string
    $strA = "Hello everybody\n";
    $strB = substr ($strA, 2, 5);
    print "$strB";

Output:

    llo e

substr ($string, $offset, $n) extracts $n characters from the string $string, starting at the position $offset (the first position in a string is 0, not 1!)
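A few more substr calls on the same text, as a small sketch: leaving out the length takes everything up to the end of the string, and a negative offset counts from the end. The variable names are illustrative.

```perl
use strict;
use warnings;

my $str = "Hello everybody";

my $middle = substr($str, 2, 5);   # 5 characters starting at position 2
my $tail   = substr($str, 6);      # from position 6 to the end
my $last4  = substr($str, -4);     # the last 4 characters
my $len    = length($str);         # number of characters in the string

print "$middle\n$tail\n$last4\n$len\n";
```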

10 Calculate the length of every line in a file named a.txt

    #!/usr/local/bin/perl
    # open the a.txt file
    open (INP, "< a.txt");

    # read the file line by line (from the INP filehandle opened above)
    while ($line = <INP>) {
        chomp($line);
        $lineLength = length($line);
        print "$lineLength\n";
    }

    # close the file
    close (INP);

11 Comparing strings

eq tests whether two strings are equal; ne tests whether they differ.

    $strA = "AAA";
    $strB = "BBB";
    $strC = "bbb";

    if ($strA eq $strB) {
        print "true\n";
    }
    ?

    if ($strB ne $strC) {
        print "true\n";
    }
    ?

    $strA = "AAAbbb";
    $strC = "bbb";
    if ($strC eq substr($strA,3,3)) {
        print "true\n";
    }
    ?
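Perl has separate operators for comparing strings (eq, ne, lt, gt) and numbers (==, !=, <, >). A minimal sketch of how the two families can disagree on the same values; the variable names and sample values are chosen only for illustration:

```perl
use strict;
use warnings;

my $x = "1.0";
my $y = "1";

# eq compares character by character; == compares numeric values.
my $same_text   = ($x eq $y) ? "yes" : "no";
my $same_number = ($x == $y) ? "yes" : "no";

# lt compares as text ("1" sorts before "9"); < compares as numbers.
my $text_order   = ("10" lt "9") ? "yes" : "no";
my $number_order = (10 < 9)      ? "yes" : "no";

print "same text: $same_text, same number: $same_number\n";
print "text order: $text_order, number order: $number_order\n";
```

For DNA sequences and other text, the string operators are the ones to use.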

12 Modifying strings

Each operation below is shown applied to the original string "AAAxCTT".

    $strA = "AAAxCTT";

    # substitute the first 'x' symbol by an 'N' symbol in string $strA
    $strA =~ s/x/N/;

    # substitute all 'A' symbols by 'G' symbols (g = global substitution)
    $strA =~ s/A/G/g;

    # substitute all 'A's by 'G's, all 'T's by 'A's
    $strA =~ tr/AT/GA/;
    print "$strA \n";

Output of the tr example:

    GGGxCAA

Note: tr/// substitutes only single symbols, while s/// substitutes strings:

    $strA =~ s/AAA/123/;

Result:

    123xCTT
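Both s///g and tr/// also return how many symbols they matched, which is handy for counting. A small sketch, assuming work on copies so that the original string stays unchanged; the variable names are illustrative:

```perl
use strict;
use warnings;

my $seq = "AAAxCTT";

# tr with an empty replacement list changes nothing and returns the match count.
my $a_count = ($seq =~ tr/A//);

# Copy first, then modify the copy; s///g returns the number of substitutions.
my $copy     = $seq;
my $replaced = ($copy =~ s/A/G/g);

# Copy and translate in one statement; $seq itself is left untouched.
(my $translated = $seq) =~ tr/AT/GA/;

print "A count: $a_count\n";
print "after s///g: $copy ($replaced substitutions)\n";
print "after tr///: $translated\n";
```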

13 Example. Convert the string "Robert has 2 brothers and 2 sisters" to "John has 3 brothers and 3 sisters"

    $string = "Robert has 2 brothers and 2 sisters";

    # 2 --> 3
    $string =~ tr/2/3/;

    # Robert --> John
    $string =~ s/Robert/John/;
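tr/2/3/ changes every "2" symbol, including digits inside longer numbers. A hedged sketch of the difference on a modified sentence (the "12 sisters" variant is invented for illustration), using s/// with \b word boundaries to change only a standalone 2:

```perl
use strict;
use warnings;

# Invented variant of the sentence, containing a multi-digit number.
my $original = "Robert has 2 brothers and 12 sisters";

# tr replaces every '2' symbol, so 12 turns into 13 as well.
(my $with_tr = $original) =~ tr/2/3/;

# \b marks a word boundary, so only the standalone 2 is replaced.
(my $with_s = $original) =~ s/\b2\b/3/g;

print "$with_tr\n";
print "$with_s\n";
```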

14 Reverse complement of a DNA sequence. Input file "seq1.fasta". Output file "RCseq1.fasta".

    #!/usr/local/bin/perl
    # open file seq1.fasta for reading & RCseq1.fasta for writing
    open (INP, "< seq1.fasta");
    open (OUT, "> RCseq1.fasta");

    # read the sequence and store it into the string
    # variable $seq, skip the header line
    $seq = "";
    <INP>;   # reading the header line and discarding it
    while ($line = <INP>) {
        chomp($line);
        $seq = $seq . $line;
    }

    # reverse complement the sequence
    $seq = reverse($seq);
    $seq =~ tr/ACTGactg/TGACtgac/;

    # output the $seq sequence in lines of 50 characters, as in fasta format
    $offset = 0;
    while ($offset < length($seq)) {
        $subSeq = substr ($seq, $offset, 50);
        $offset = $offset + 50;
        print OUT "$subSeq\n";
    }

    # close files
    close (INP);
    close (OUT);
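The two core steps of this program, reverse complementing and splitting into fixed-width lines, can be sketched as small subroutines that are easy to try on short strings. The subroutine names reverse_complement and wrap_sequence and the sample sequences are hypothetical, chosen for illustration:

```perl
use strict;
use warnings;

# Reverse the sequence, then swap each base for its complement.
sub reverse_complement {
    my ($seq) = @_;
    my $rc = scalar reverse($seq);   # scalar context reverses the characters
    $rc =~ tr/ACTGactg/TGACtgac/;
    return $rc;
}

# Split a sequence into pieces of at most $width characters.
sub wrap_sequence {
    my ($seq, $width) = @_;
    my @lines;
    for (my $offset = 0; $offset < length($seq); $offset += $width) {
        push @lines, substr($seq, $offset, $width);
    }
    return @lines;
}

# Short sample sequence (illustrative), wrapped at 4 characters per line.
my $rc = reverse_complement("ATGCaacg");
print "$_\n" for wrap_sequence($rc, 4);
```

Characters other than A, C, T and G (for example N) pass through tr unchanged, so they are reversed but not complemented.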

