Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Perl Syntax: substitution s// and character replacement tr//

Similar presentations


Presentation on theme: "1 Perl Syntax: substitution s// and character replacement tr//"— Presentation transcript:

1 1 Perl Syntax: substitution s// and character replacement tr//

2 2 Substitution Pattern matching is useful for finding or indexing items, but to modify the data, substitution is required. Substitution searches a string for a PATTERN and, if found, replaces it with REPLACEMENT. $line =~ s/PATTERN/REPLACEMENT/; Returns a value equal to the number of times the pattern was found and replaced. $result = $line =~ s/PATTERN/REPLACEMENT/;

3 3 s/// $_ = "one two"; s/^([^ ]+) +([^ ]+)/$2 $1/; # $_ = "green scaly dinosaur"; s/(\w+) (\w+)/$2, $1/; # s/^/huge, /; # s/,.*een//;# s/green/red/; # s/\w+$/($`!)$&/; # s/\s+(!\W+)/$1 /; # s/huge/gigantic/g; # $_= "fred barney"; if(s/fred/george/) {#s/// returns true if sucessful print "replaced fred with george\n";

4 4 s/// $_ = "one two"; s/^([^ ]+) +([^ ]+)/$2 $1/; #swap first 2 words $_ = "green scaly dinosaur"; s/(\w+) (\w+)/$2, $1/; # "scaly, green dinosaur" s/^/huge, /; #huge, scaly, green dinosaur s/,.*een//;#huge dinosaur s/green/red/; #fails s/\w+$/($`!)$&/; # huge (huge !)dinosaur next line ^^ --match _, !, and ) s/\s+(!\W+)/$1 /; # huge (huge!) dinosaur -- replace '_!)' with !)_ s/huge/gigantic/; #gigantic (huge!) dinosaur $_= "fred barney"; if(s/fred/george/) {#s/// returns true if sucessful print "replaced fred with george\n";

5 5 Character Replacement A similar operation to substitution is character replacement. tr/CHARACTER SEARCH LIST/REPLACEMENT LIST/ -- note that this is not pattern matching, but character matching -- returns number of characters replaced $line =~ tr/a-z/A-Z/; $count_CG = $line =~ tr/CG/CG/; $line =~ tr/ACGT/TGCA/; $line =~ s/A/T/g; # be CAREFUL $line =~ s/C/G/g; # this turns your sequence into all A|C $line =~ s/G/C/g; $line =~ s/T/A/g;

6 6 Character Replacement Flags tr/SEARCH_LIST/REPLACEMENT_LIST /c -- complement the SEARCHLIST -- SEARCH_LIST is comprised of all characters NOT in SEARCH_LIST tr/ / /d -- delete found but unreplaced characters tr / / /s -- squash duplicate replaced characters -- sequences (or runs) of characters replaced, are squashed down to a single character

7 7 Character Replacement while($line = ) { $count_CG = $line =~ tr/CG/CG/; $count_AT = $line =~ tr/AT/AT/; } $total = $count_CG + $count_AT; $percent_CG = 100 * ($count_CG/$total); print “The sequence was $percent_CG CG-rich.\n”;

8 8 Pattern Matching Flags gm// s// (not tr// ) match globally, find all occurrences iignore case mmatch multiple lines as continuous string s treat string as single line.

9 9 Examples $_ " AtttcgAtggctaaaAtttgctt" s/A/a/g;#atttcgatggctaaaatttgctt s/^\s+//; #strip leading white space s/\s+$//; #strip trailing white space Binding operator $string = " opps"; $string =~ s/^\s+//; # "opps"

10 10 Upper/Lower Case \U -- everything that follows is upper case \L -- what follows is lower case \u -- single character upper \l -- single character lower $_ = "I saw Barney with Fred"; s/(fred|barney)/\U$1/gi; I saw BARNEY with FRED

11 11 #!/usr/bin/perl # newNaive.pl # # Here is an example that shows how the "match" # returns a "true" -- so that on the "IF" control structure, # execution precedes into the block # # The $` is a special variable that "remembers" all of the # string that was passed over by the pattern matching engine. # # Using the length() function, the position of the match is determined, # and printed # $_ = "CCCATGATG"; if(/ATG/) { print "Found sequence at position ".length($`)."\n"; }

12 12 ` #!/usr/bin/perl # newNaive2.pl # # Here is an example that shows how the "match" # returns a "true" -- so that on the "IF" control structure, # execution precedes into the block # # The $` is a special variable that "remembers" all of the # string that was passed over by the pattern matching engine. # # Using the length() function, the position of the match is determined, # and printed # $_ = "CCCATAATTTAGTTTT"; if(/ATG/) { print "Found Start codong $& at position ".length($`)."\n"; } elsif (m/TAG/) { print "found stop codon $& at position ".length($`)."\n"; print "There are ".scalar(length($&)+length($'))." nucleotides after the stop, including the stop codon\n"; } elsif (m/TAA/) { print "found stop codon $& at position ".length($`)."\n"; print "There are ".scalar(length($&)+length($'))." nucleotides after the stop, including the stop codon\n"; } elsif (m/TGA/) { print "found stop codon $& at position ".length($`)."\n"; print "There are ".scalar(length($&)+length($'))." nucleotides after the stop, including the stop codon\n"; } else { print "Start/stop codons not found in $_\n"; }

13 13 #!/usr/bin/perl # sub.pl # # Example where I match "with" and " " and one or more # word characters # Then I replace all of that "with word" with "against 'word'" # # The $1 corresponds to the first set of parentheses. # $_="He's out bowling with Fred tonight"; s/with (\w+)/against $1/; print "$_\n";

14 14 #!/usr/bin/perl -w # bind3.pl # # Here's an example that takes a unix path ($file) # and copies it to anothe variable ($filename) # Then, we search for one or more of any character {.+} # followed by a "/" character -- but we have to use the # escape metacharacter "\" so that we don't end the match {\/}. # Finally, we are looking for one or more non-white spaces {(\S+) # at the end -- to pull off the the last file name "FOUND" # $path = "/home/tabraun/test/bob/FOUND"; $filename = $path; $filename =~ s/.+\/(\S+)/$1/; print "$filename\n";

15 15 #!/usr/bin/perl # randomSeq.pl # # Don't get too uptight over this line -- it is just setting # a "seed" for the rand() fuction with a value that approximates # a random number. If you must know, it takes a prccess ID ($$), # shifts its bit left 15 times, then add the process ID to the shifted # value, then does an bit-wise XOR (^) with the current time(). # print "Enter length of sequence to generate:"; $length = ; srand(time() ^ ($$ + ($$ << 15)) ); while($length) { # stay in loop until have generated enough sequence $rand = int rand(4); # Interger number between (0-3) inclusive $rand =~ tr/0123/ACTG/; $length = $length-1; #decrease loop counter $seq = $seq. $rand; #keep the nucleotide I just created } # Since I am out of the loop, I must be done print "$seq\n";

16 16 End


Download ppt "1 Perl Syntax: substitution s// and character replacement tr//"
Ads by Google