

5.1 Previously on... PERL course (let's practice some more loops)


1 5.1 Previously on... PERL course (let's practice some more loops)

2 5.2 FASTA: Analyzing complex input

Overall design: read the FASTA file (several sequences). For each sequence:

1. Read the FASTA sequence
   1.1. Read the FASTA header
   1.2. Read each line until the next FASTA header
2. Do something with the sequence
   2.1. Compute G+C content
   2.2. Print header and G+C content

Let's see how it's done…

[Flowchart: Start, Read line, Save header, Read line, "Header or end of input?" (No: Concatenate to sequence, Read line; Yes: Do something), "End of input?" (No: Save header; Yes: End)]

3 5.3

    # 1. Read FASTA sequence
    $fastaLine = <STDIN>;
    while (defined $fastaLine) {
        # 1.1. Read FASTA header (without the leading ">" and the newline)
        chomp $fastaLine;
        $header = substr($fastaLine,1);
        $fastaLine = <STDIN>;

        # 1.2. Read sequence until next FASTA header
        $seq = "";    # start every record with an empty sequence
        while ((defined $fastaLine) and (substr($fastaLine,0,1) ne ">" )) {
            $seq .= $fastaLine;
            $fastaLine = <STDIN>;
        }

        # 2. Do something...
        # 2.1 compute $gcContent
        print "$header: $gcContent\n";
    }
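Step 2.1 in the code above is left open. One possible way to fill it in is sketched below, assuming the G+C content is wanted as a fraction of the sequence length. The helper name gc_content is hypothetical and does not come from the slides.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the slides): fraction of G and C in a sequence.
sub gc_content {
    my ($seq) = @_;
    $seq =~ s/\s+//g;                  # drop the newlines kept from the FASTA lines
    my $length = length $seq;
    return 0 if $length == 0;          # avoid dividing by zero on an empty record
    my $gc = ($seq =~ tr/GCgc//);      # tr with an empty replacement list only counts
    return $gc / $length;
}

# Two sequence lines as they would be concatenated in $seq
print gc_content("ATGC\nGGCC\n"), "\n";   # prints 0.75
```

Inside the loop this could be used as $gcContent = gc_content($seq); just before the print statement.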

4 5.4 Class exercise 4a

1. Write a script that reads lines of names and expenses:

    Yossi 6.10,16.50,5.00
    Dana 21.00,6.00
    Refael 6.10,24.00,7.00,8.00
    END

   For each line print the name and the sum. Stop when you reach "END".

   Output:

    Yossi 27.6
    Dana 27
    Refael 45.1

2. Change your script to read names and expenses on separate lines. Identify lines with numbers by a "+" sign as the first character in the string:

    Yossi
    +6.10
    +16.50
    +5.00
    Dana
    +21.00
    +6.00
    Refael
    +6.10
    +24.00
    +7.00
    +8.00
    END

   Sum the numbers while there is a '+' sign before them.

   Output:

    Yossi 27.6
    Dana 27
    Refael 45.1

5 5.5 Class exercise 4a

3. (Home Ex. 2 Q. 5) Write a script that reads several protein sequences in FASTA format, and prints the name and length of each sequence. Start with the example code from the last lesson.
4*. Write a script that reads several DNA sequences in FASTA format, and prints FASTA output of the sequences whose header starts with 'Chr07'.
5**. Write a script that reads several DNA sequences in FASTA format, and prints FASTA output of the sequences whose header contains 'Chr07'.

6 5.6 Reading and writing files

7 5.7 Reading files

Open a file for reading, and link it to a filehandle:

    open(IN, "<EHD.fasta");

And then read lines from the filehandle, exactly like you would from <STDIN>:

    my $line = <IN>;
    my @inputLines = <IN>;
    foreach $line (@inputLines)...

Every filehandle opened should be closed:

    close(IN);

Always check that the open didn't fail (e.g. if a file by that name doesn't exist):

    open(IN, "<$file") or die "can't open file $file";
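The open, read, close sequence can be put together as in the sketch below. It swaps in two things the slides do not use: a lexical filehandle ($in) with the three-argument form of open, and $! in the die message to report why the open failed. The helper name read_lines and the temporary file are assumptions made only so the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the lines of a file with newlines removed.
sub read_lines {
    my ($file) = @_;
    open(my $in, '<', $file) or die "can't open file $file: $!";
    my @lines;
    while (defined(my $line = <$in>)) {   # stops at end of input
        chomp $line;
        push @lines, $line;
    }
    close($in);
    return @lines;
}

# Demo on a temporary file so the sketch runs anywhere
my ($tmp, $tmpName) = tempfile(UNLINK => 1);
print $tmp ">seq1\nACGT\n";
close($tmp);
print join("|", read_lines($tmpName)), "\n";   # prints >seq1|ACGT
```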

8 5.8 Writing to files

Open a file for writing, and link it to a filehandle:

    open(OUT, ">EHD.analysis") or die...

NOTE: If a file by that name already exists it will be overwritten!

You can also append lines to the end of an existing file:

    open(OUT, ">>EHD.analysis") or die..

Print to a file (in both cases):

    print OUT "The mutation is in exon $exonNumber\n";

Note that there is no comma after the filehandle (OUT).
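The difference between ">" and ">>" can be checked with a small experiment. This is only a sketch: write_line and slurp are hypothetical helpers, the file lives in a temporary directory, and the three-argument open with a lexical filehandle is used instead of the bareword form shown above.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: open $path in the given mode ('>' or '>>') and print one line.
sub write_line {
    my ($mode, $path, $text) = @_;
    open(my $out, $mode, $path) or die "can't open $path: $!";
    print $out "$text\n";              # no comma after the filehandle
    close($out);
}

# Hypothetical helper: return the whole file as one string.
sub slurp {
    my ($path) = @_;
    open(my $in, '<', $path) or die "can't open $path: $!";
    my $text = join('', <$in>);
    close($in);
    return $text;
}

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/EHD.analysis";
write_line('>',  $file, "first");
write_line('>',  $file, "second");     # '>' overwrites: "first" is gone
write_line('>>', $file, "third");      # '>>' appends after "second"
print slurp($file);                    # prints "second" and "third" on two lines
```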

9 5.9 File Test Operators

You can ask questions about a file or a directory name (not a filehandle):

    if (-e $name) {
        print "The file $name exists!\n";
    }

    -e $name    exists
    -r $name    is readable
    -w $name    is writable by you
    -z $name    has zero size
    -s $name    has non-zero size (returns size)
    -f $name    is a file
    -d $name    is a directory
    -l $name    is a symbolic link
    -T $name    is a text file
    -B $name    is a binary file (opposite of -T)
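A few of these tests can be combined as in the sketch below. The helper name describe and the temporary file are assumptions for illustration only. The size printed by the second call depends on the platform's line endings, so it is not shown.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: summarize what the file tests say about a name.
sub describe {
    my ($name) = @_;
    return "missing"    unless -e $name;
    return "directory"  if -d $name;
    return "empty file" if -z $name;
    my $size = -s $name;               # -s returns the size in bytes
    return "file of $size bytes";
}

my ($tmp, $tmpName) = tempfile(UNLINK => 1);
print describe($tmpName), "\n";        # prints "empty file"
print $tmp "ACGT\n";
close($tmp);
print describe($tmpName), "\n";        # now a file with non-zero size
print describe("."), "\n";             # prints "directory"
```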

10 5.10 Working with paths

Always use a full path name; it is safer and clearer to read:

    open( IN, '<D:\workspace\Perl\p53.fasta' );

Remember to use \\ in double quotes:

    open( IN, "<D:\\workspace\\Perl\\$name.fasta" );

(Usually) you can also use /:

    open( IN, "<D:/workspace/Perl/$name.fasta" );
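The quoting rules can be verified without opening any file. In this sketch the helper name fasta_path is hypothetical. It builds the double-quoted path and compares it with the single-quoted spelling.

```perl
use strict;
use warnings;

# Hypothetical helper: build the path with double quotes, so $name is
# interpolated and every backslash has to be written as \\.
sub fasta_path {
    my ($name) = @_;
    return "D:\\workspace\\Perl\\$name.fasta";
}

my $single = 'D:\workspace\Perl\p53.fasta';   # single quotes: backslashes kept as typed
my $double = fasta_path("p53");

print "$double\n";                            # prints D:\workspace\Perl\p53.fasta
print "identical\n" if $single eq $double;    # prints identical
```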

11 5.11 Reading files: example

    $line = <STDIN>;
    chomp $line;
    # Each iteration processes one input line and prints its output
    while ($line ne "END") {
        # Separate name and numbers
        @nameAndNums = split(/ /, $line);
        $name = $nameAndNums[0];
        @nums = split(/,/, $nameAndNums[1]);
        $sum = 0;
        # Sum numbers
        foreach $num (@nums) {
            $sum = $sum + $num;
        }
        print "$name $sum\n";
        # Read next line
        $line = <STDIN>;
        chomp $line;
    }

Input:

    Yossi 6.10,16.50,5.00
    Dana 21.00,6.00
    Refael 6.10,24.00,7.00,8.00
    END

Output:

    Yossi 27.6
    Dana 27
    Refael 45.1

12 5.12 Reading files: example

    open(IN, '<D:\perl_ex\in.txt') or die "can't open input file";
    $line = <IN>;
    chomp $line;
    # Each iteration processes one input line and prints its output
    while ($line ne "END") {
        # Separate name and numbers
        @nameAndNums = split(/ /, $line);
        $name = $nameAndNums[0];
        @nums = split(/,/, $nameAndNums[1]);
        $sum = 0;
        # Sum numbers
        foreach $num (@nums) {
            $sum = $sum + $num;
        }
        print "$name $sum\n";
        # Read next line
        $line = <IN>;
        chomp $line;
    }
    close(IN);

Input:

    Yossi 6.10,16.50,5.00
    Dana 21.00,6.00
    Refael 6.10,24.00,7.00,8.00
    END

Output:

    Yossi 27.6
    Dana 27
    Refael 45.1

13 5.13 Reading files: example

    open(IN, '<D:\perl_ex\in.txt') or die "can't open input file";
    open(OUT,'>D:\perl_ex\out.txt') or die "can't open output file";
    $line = <IN>;
    chomp $line;
    # Each iteration processes one input line and prints its output
    while ($line ne "END") {
        # Separate name and numbers
        @nameAndNums = split(/ /, $line);
        $name = $nameAndNums[0];
        @nums = split(/,/, $nameAndNums[1]);
        $sum = 0;
        # Sum numbers
        foreach $num (@nums) {
            $sum = $sum + $num;
        }
        print OUT "$name $sum\n";
        # Read next line
        $line = <IN>;
        chomp $line;
    }
    close(IN);
    close(OUT);

Input:

    Yossi 6.10,16.50,5.00
    Dana 21.00,6.00
    Refael 6.10,24.00,7.00,8.00
    END

Output:

    Yossi 27.6
    Dana 27
    Refael 45.1

14 5.14 Class exercise 5a

1. Change the script for class exercise 4a.2 to read the lines from an input file (instead of reading lines from the keyboard).
2. Now, in addition, write the output of the previous question to a file named 'D:\perl_ex\class.ex.4a2.out' (instead of printing to the screen).
3*. Now, before opening 'D:\perl_ex\class.ex.4a2.out', check if it exists, and if so, print a message that the output file already exists, and exit the script.
4*. Change the script for class exercise 4a.3 to receive two strings from the user: 1) the name of a FASTA file, 2) the name of an output file. Then read from the FASTA file given by the user, and write to the output file also supplied by the user.

15 5.15 Passing information using command-line arguments

16 5.16 Command line arguments

It is common to give arguments (separated by spaces) within the command-line for a program or a script:

    > perl -w findProtein.pl D:\perl_ex\in.fasta 2 430

They will be stored in the array @ARGV:

    foreach my $arg (@ARGV){
        print "$arg\n";
    }

Output:

    D:\perl_ex\in.fasta
    2
    430

@ARGV: 'D:\perl_ex\in.fasta' '2' '430'

17 5.17 Command line arguments

It is common to give arguments (separated by spaces) within the command-line for a program or a script:

    > perl -w findProtein.pl D:\my perl\in.fasta 2 430

They will be stored in the array @ARGV:

    foreach my $arg (@ARGV){
        print "$arg\n";
    }

Output:

    D:\my
    perl\in.fasta
    2
    430

@ARGV: 'D:\my' 'perl\in.fasta' '2' '430'

The space inside the path splits it into two separate arguments.

18 5.18 Command line arguments

It is common to give arguments (separated by spaces) within the command-line for a program or a script:

    > perl -w findProtein.pl "D:\my perl\in.fasta" 2 430

They will be stored in the array @ARGV:

    foreach my $arg (@ARGV){
        print "$arg\n";
    }

Output:

    D:\my perl\in.fasta
    2
    430

@ARGV: 'D:\my perl\in.fasta' '2' '430'

Quoting the path on the command line keeps it as a single argument.

19 5.19 Command line arguments

It is common to give arguments (separated by spaces) within the command-line for a program or a script:

    > perl -w findProtein.pl D:\perl_ex\in.fasta D:\perl_ex\out.txt

They will be stored in the array @ARGV:

    my $inFile = $ARGV[0];
    my $outFile = $ARGV[1];

Or more simply:

    my ($inFile,$outFile) = @ARGV;
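When a script expects a fixed number of arguments, it can help to check the count before using them. The sketch below is one way to do that. The helper name parse_args and the usage message are assumptions, not part of the course code.

```perl
use strict;
use warnings;

# Hypothetical helper: insist on exactly two arguments and return them.
sub parse_args {
    my @args = @_;
    die "Usage: perl findProtein.pl <input file> <output file>\n"
        unless @args == 2;
    my ($inFile, $outFile) = @args;
    return ($inFile, $outFile);
}

# In a real script you would pass @ARGV; a literal list keeps the sketch runnable as-is.
my ($inFile, $outFile) = parse_args('D:\perl_ex\in.fasta', 'D:\perl_ex\out.txt');
print "in:  $inFile\nout: $outFile\n";
```

Calling parse_args(@ARGV) with a missing argument stops the script with the usage message instead of failing later in open.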

20 5.20 Command line arguments in Eclipse

21 5.21 Command line arguments in Eclipse

22 5.22 Reading files - example

Reminder: the class exercise of 3 days ago.

Input:

    Yossi 6.10,16.50,5.00
    Dana 21.00,6.00
    Refael 6.10,24.00,7.00,8.00
    END

Output:

    Yossi 27.6
    Dana 27
    Refael 45.1

23 5.23 Reading files: example

    $line = <STDIN>;
    chomp $line;
    # Each iteration processes one input line and prints its output
    while ($line ne "END") {
        # Separate name and numbers
        @nameAndNums = split(/ /, $line);
        $name = $nameAndNums[0];
        @nums = split(/,/, $nameAndNums[1]);
        $sum = 0;
        # Sum numbers
        foreach $num (@nums) {
            $sum = $sum + $num;
        }
        print "$name $sum\n";
        # Read next line
        $line = <STDIN>;
        chomp $line;
    }

Input:

    Yossi 6.10,16.50,5.00
    Dana 21.00,6.00
    Refael 6.10,24.00,7.00,8.00
    END

Output:

    Yossi 27.6
    Dana 27
    Refael 45.1

24 5.24 Reading files: example

    my ($inFileName) = @ARGV;
    open(IN, "<$inFileName") or die "can't open $inFileName";
    $line = <IN>;
    chomp $line;
    # Each iteration processes one input line and prints its output
    while ($line ne "END") {
        # Separate name and numbers
        @nameAndNums = split(/ /, $line);
        $name = $nameAndNums[0];
        @nums = split(/,/, $nameAndNums[1]);
        $sum = 0;
        # Sum numbers
        foreach $num (@nums) {
            $sum = $sum + $num;
        }
        print "$name $sum\n";
        # Read next line
        $line = <IN>;
        chomp $line;
    }
    close(IN);

Input:

    Yossi 6.10,16.50,5.00
    Dana 21.00,6.00
    Refael 6.10,24.00,7.00,8.00
    END

Output:

    Yossi 27.6
    Dana 27
    Refael 45.1

25 5.25 Reading files: example

    my ($inFileName, $outFileName) = @ARGV;
    open(IN, "<$inFileName") or die "can't open $inFileName";
    open(OUT, ">$outFileName") or die "can't open $outFileName";
    $line = <IN>;
    chomp $line;
    # Each iteration processes one input line and prints its output
    while ($line ne "END") {
        # Separate name and numbers
        @nameAndNums = split(/ /, $line);
        $name = $nameAndNums[0];
        @nums = split(/,/, $nameAndNums[1]);
        $sum = 0;
        # Sum numbers
        foreach $num (@nums) {
            $sum = $sum + $num;
        }
        print OUT "$name $sum\n";
        # Read next line
        $line = <IN>;
        chomp $line;
    }
    close(IN);
    close(OUT);

Input:

    Yossi 6.10,16.50,5.00
    Dana 21.00,6.00
    Refael 6.10,24.00,7.00,8.00
    END

Output:

    Yossi 27.6
    Dana 27
    Refael 45.1

26 5.26 Reading files: example

    my ($inFileName, $outFileName) = @ARGV;
    open(IN, "<$inFileName") or die "can't open $inFileName";
    open(OUT, ">$outFileName") or die "can't open $outFileName";
    $line = <IN>;
    chomp $line;
    # Each iteration processes one input line and prints its output
    while (defined $line) {
        # Separate name and numbers
        @nameAndNums = split(/ /, $line);
        $name = $nameAndNums[0];
        @nums = split(/,/, $nameAndNums[1]);
        $sum = 0;
        # Sum numbers
        foreach $num (@nums) {
            $sum = $sum + $num;
        }
        print OUT "$name $sum\n";
        # Read next line
        $line = <IN>;
        chomp $line;    # at end of input $line is undef here, so chomp warns under -w
    }
    close(IN);
    close(OUT);

Input:

    Yossi 6.10,16.50,5.00
    Dana 21.00,6.00
    Refael 6.10,24.00,7.00,8.00

Output:

    Yossi 27.6
    Dana 27
    Refael 45.1

27 5.27 Reading files: example

    my ($inFileName, $outFileName) = @ARGV;
    open(IN, "<$inFileName") or die "can't open $inFileName";
    open(OUT, ">$outFileName") or die "can't open $outFileName";
    $line = <IN>;
    # Each iteration processes one input line and prints its output
    while (defined $line) {
        chomp $line;    # chomp only after $line is known to be defined
        # Separate name and numbers
        @nameAndNums = split(/ /, $line);
        $name = $nameAndNums[0];
        @nums = split(/,/, $nameAndNums[1]);
        $sum = 0;
        # Sum numbers
        foreach $num (@nums) {
            $sum = $sum + $num;
        }
        print OUT "$name $sum\n";
        # Read next line
        $line = <IN>;
    }
    close(IN);
    close(OUT);

Input:

    Yossi 6.10,16.50,5.00
    Dana 21.00,6.00
    Refael 6.10,24.00,7.00,8.00

Output:

    Yossi 27.6
    Dana 27
    Refael 45.1

28 5.28 Class exercise 5b

1. Change the script of class exercise 5a.2 so that it receives the input and output file names as arguments.
2*. Write a script that receives a number of numeric arguments and prints their sum. For example, for the arguments 10 20 30 40 the output is 100.

