1 Topics
Quiz 1
Homework Review
Programming Assignment # 1
Perl shortcuts
Declaring variables and Scope
Subroutines
    passing arguments
    array references
Programming Methods
Top Down Design
Bottom Up Coding and Testing
Debugging
Reading manuals and help pages
Plain old documentation (POD)
Lab time
BINF 634 FALL 2015

2 Acknowledgements
Thanks to John Grefenstette for allowing me to use these slides as a starting point for tonight’s lecture.

3 Some Humor
Perl can be powerful

4 Perl Shortcuts
Any simple statement can be followed by a single modifier right before the ; or closing }
    STATEMENT if EXPR
    STATEMENT unless EXPR
    STATEMENT while EXPR
    STATEMENT until EXPR
Example:
    $ave = $ave/$n unless $n == 0;
Same as:
    unless ($n == 0) { $ave = $ave/$n }
What does this do?
    $x = 0;
    print $x++, "\n" until $x == 10;
Output:
    0
    1
    2
    3
    4
    5
    6
    7
    8
    9

5 Perl Shortcuts
Any simple statement can be followed by a single modifier
    STATEMENT foreach LIST
STATEMENT is evaluated for each item in LIST, with $_ set to the current item.
    @A = qw/One two three four/;
    print "$_\n" foreach @A;
Output:
    One
    two
    three
    four
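The statement modifiers on the last two slides can be combined in one short script. The following is a minimal sketch rather than part of the original lecture; the variable names (@lengths, $total, @seen) are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical data for illustration.
my @lengths = (3, 5, 4);
my $n       = scalar @lengths;
my $total   = 0;

# foreach modifier: $_ is set to each element in turn.
$total += $_ foreach @lengths;

# unless modifier: guard against division by zero.
my $ave = 0;
$ave = $total / $n unless $n == 0;

# until modifier: the condition is tested before each execution.
my $x = 0;
my @seen;
push @seen, $x++ until $x == 3;

print "average: $ave\n";    # average: 4
print "seen: @seen\n";      # seen: 0 1 2
```

Each modifier applies to exactly one simple statement; anything longer belongs in an ordinary block.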

6 Perl Shortcuts
Predefined Perl functions may be used with or without parentheses around their arguments:
    $next = shift @A;
    open FILE, $filename or die "Can't open $filename";
    @chars = split //, $word;
    @fields = split /:/, $line;
Many Perl functions assume $_ if their argument is omitted:
    @A = qw/One two three four/;
    print length, " $_\n" foreach @A;
Output:
    3 One
    3 two
    5 three
    4 four

7 Scope of variables
my variables can be accessed only until the end of the enclosing block (or until the end of the file, if outside any block).
It's best to declare a variable in the smallest possible scope:
    if ($x < $y) { my $tmp = $x; $x = $y; $y = $tmp }
Variables declared in a control-flow statement are visible only within the associated block:
    my @seq_list = qw/ATT TTT GGG/;
    my $sequence = "NNN";
    for my $sequence (@seq_list) {
        $sequence .= "TAG";
        print "$sequence\n";
    }
    print "$sequence\n";
Output:
    ATTTAG
    TTTTAG
    GGGTAG
    NNN
Are these two different variables?
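To see the scoping rule in isolation, here is a small sketch (not from the original slides; $label, @log and @seqs are hypothetical names). It shows that a my variable declared in an inner block is a separate variable that hides the outer one, and that a foreach loop variable is an alias for each array element.

```perl
use strict;
use warnings;

my $label = "outer";        # visible to the end of the file
my @log;

{
    my $label = "inner";    # a new variable that hides the outer one
    push @log, $label;
}                           # the inner $label goes out of scope here

push @log, $label;          # the outer $label is unchanged

# The loop variable is an alias for each element, so changes made
# through it modify the array itself.
my @seqs = qw/ATT TTT/;
for my $seq (@seqs) {
    $seq .= "TAG";
}

print "@log\n";     # inner outer
print "@seqs\n";    # ATTTAG TTTTAG
```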

8 Subroutines
Advantages of Subroutines
    Shorter code
    Easier to test
    Easier to understand
    More reliable
    Faster to write
    Re-usable

9 Subroutines
Defining a subroutine:
    sub name { BLOCK }
Arguments are accessed through the array @_
Subroutine values are returned by:
    return VALUE
Subroutines may be defined anywhere in the file, but are usually placed at the end
They can be arranged alphabetically or by functionality

10 Passing Parameters Into Subroutines
Values are passed into subroutines using the special array @_
How do we know that this is an array?
The shortened name of this argument is _
It contains all of the scalars passed into the subroutine

11 Pass by Value
    #!/usr/bin/perl -w
    # A driver program to test a subroutine that
    # uses pass by value
    use strict;
    use warnings;
    my $i = 2;
    simple_sub($i);
    print "In main program, after the subroutine call, \$i equals $i\n\n";
    exit;
    sub simple_sub {
        my($i)=@_;
        $i += 100;
        print "In subroutine simple_sub, \$i equals $i\n\n";
    }
Output:
    In subroutine simple_sub, $i equals 102
    In main program, after the subroutine call, $i equals 2
Why are the two values different?
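One way to answer the question on this slide: the subroutine works on a copy because of the line my($i)=@_. The elements of @_ themselves are aliases for the caller's variables, so a subroutine that modifies $_[0] directly does change the caller's value. The sketch below contrasts the two; the subroutine names are hypothetical and not from the lecture.

```perl
use strict;
use warnings;

# Copies its argument first, so the caller's variable is untouched.
sub add_100_copy {
    my ($i) = @_;
    $i += 100;
    return $i;
}

# Works on $_[0] directly; $_[0] is an alias for the caller's variable.
sub add_100_in_place {
    $_[0] += 100;
    return;
}

my $i = 2;
my $result = add_100_copy($i);
print "after add_100_copy:     \$i = $i, result = $result\n";   # $i = 2, result = 102

add_100_in_place($i);
print "after add_100_in_place: \$i = $i\n";                     # $i = 102
```

Copying @_ into my variables at the top of every subroutine, as the slides do, avoids this kind of accidental change.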

12
    #!/usr/bin/perl
    use strict;
    use warnings;
    # File: min.pl
    my $a = <STDIN>; chomp $a;
    my $b = <STDIN>; chomp $b;
    $small = min($a, $b);
    print "min of $a and $b is $small\n";
    exit;
    sub min {
        my ($n, $m) = @_;    # @_ is the array of parameters
        if ($n < $m) { return $n }
        else { return $m }
    }
There is a bug in this program as written. Can you find it? How would you fix it to produce the output indicated below?
    % min.pl
    123
    45
    min of 123 and 45 is 45
Answer: $small is not declared. Under use strict it must be declared, for example with my $small = min($a, $b);

13
    #!/usr/bin/perl
    use strict;
    use warnings;
    # File: min_max.pl
    ## Subroutines can return lists
    my $a = <STDIN>; chomp $a;
    my $b = <STDIN>; chomp $b;
    my ($small, $big) = min_max($a, $b);
    print "max of $a and $b is $big\n";
    print "min of $a and $b is $small\n";
    exit;
    sub min_max {
        my ($n, $m) = @_;    # @_ is the array of parameters
        if ($n < $m) { return ($n, $m) }
        else { return ($m, $n) }
    }
Output:
    % min_max.pl
    123
    45
    max of 123 and 45 is 123
    min of 123 and 45 is 45

14 Passing arguments
All arguments are passed in a single list
    @a = qw/ This will all /;
    $b = "end";
    @c = qw/ up together /;
    @c = foo(@a, $b, @c);
    print "@c\n";
    sub foo {
        my @args = @_;
        return @args;
    }
Output:
    This will all end up together

15 Array Flattening
    #!/usr/bin/perl -w
    # A driver program to test a subroutine that
    # illustrates array flattening
    use strict;
    use warnings;
    my @i = ('1', '2', '3');
    my @j = ('a','b','c');
    print "In main program before calling subroutine: i = " . "@i\n";
    print "In main program before calling subroutine: j = " . "@j\n";
    reference_sub(@i, @j);
    print "In main program after calling subroutine: i = " . "@i\n";
    print "In main program after calling subroutine: j = " . "@j\n";
    exit;
    sub reference_sub {
        my (@i, @j) = @_;
        print "In subroutine : i = " . "@i\n";
        print "In subroutine : j = " . "@j\n";
        push(@i, '4');
        shift(@j);
    }
Output:
    In main program before calling subroutine: i = 1 2 3
    In main program before calling subroutine: j = a b c
    In subroutine : i = 1 2 3 a b c
    In subroutine : j =
    In main program after calling subroutine: i = 1 2 3
    In main program after calling subroutine: j = a b c

16 Passing by Value Versus Passing by Reference
Passing by Value
    Pass a copy of the variable
    Changes made to the variable in the subroutine do not affect the value of variables in the main body
    Can cause array flattening
Passing by Reference
    Pass a reference (pointer) to the variable
    Must be dereferenced when used in the subroutine
    This is the cure for array flattening

17 Perl References - I
A reference is a scalar variable that refers to (points to) another variable
So a reference might refer to an array
    $aref = \@array;    # $aref now holds a reference to @array
    $xy = $aref;        # $xy now holds a reference to @array
    # Lines 2 and 3 working together do the same thing as line 1
    $aref = [ 1, 2, 3 ];
    @array = (1, 2, 3);
    $aref = \@array;
http://perl.plover.com/FAQs/references.html

18 Perl References - II
http://perl.plover.com/FAQs/references.html

19 Dereferencing
${$aref}[3] is too hard to read, so you can write $aref->[3] instead
Additional helpful discussions can be found at http://oreilly.com/catalog/advperl/excerpt/ch01.html
http://perl.plover.com/FAQs/references.html
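The dereferencing forms mentioned above can be tried side by side. This is a small illustrative sketch (the array contents and variable names are made up), showing that the block form and the arrow form reach the same element and that writing through the reference changes the original array.

```perl
use strict;
use warnings;

my @array = (10, 20, 30, 40);
my $aref  = \@array;

my $fourth_a = ${$aref}[3];        # block form
my $fourth_b = $aref->[3];         # arrow form, same element
my $count    = scalar @{$aref};    # number of elements
my $last_idx = $#{$aref};          # index of the last element

$aref->[0] = 11;                   # changes @array itself

print "fourth: $fourth_a $fourth_b\n";      # fourth: 40 40
print "count: $count, last index: $last_idx\n";
print "array: @array\n";                    # array: 11 20 30 40
```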

20 Passing by Reference
    #!/usr/bin/perl -w
    # A driver program to test a subroutine that
    # passes by reference
    use strict;
    use warnings;
    my @i = ('1', '2', '3');
    my @j = ('a','b','c');
    print "In main program before calling subroutine: i = " . "@i\n";
    print "In main program before calling subroutine: j = " . "@j\n";
    reference_sub(\@i, \@j);
    print "In main program after calling subroutine: i = " . "@i\n";
    print "In main program after calling subroutine: j = " . "@j\n";
    exit;
    sub reference_sub {
        my ($i, $j) = @_;
        print "In subroutine : i = " . "@$i\n";
        print "In subroutine : j = " . "@$j\n";
        push(@$i, '4');
        shift(@$j);
    }
Output:
    In main program before calling subroutine: i = 1 2 3
    In main program before calling subroutine: j = a b c
    In subroutine : i = 1 2 3
    In subroutine : j = a b c
    In main program after calling subroutine: i = 1 2 3 4
    In main program after calling subroutine: j = b c

21 Array references
To pass more than one list to a subroutine, use references to the arrays
    @a = qw/ This will all /;
    $b = "end";
    @c = qw/ up together /;
    # this passes in references to the arrays
    bar(\@a, $b, \@c);    # \@a is a reference (pointer) to @a
    sub bar {
        my ($x, $b, $z) = @_;    # @_ has three items
        # dereference first argument
        my @A = @$x;             # @$x is the array referenced by $x
        # dereference third argument
        my @C = @$z;
        print "@A\n";
        print "$b\n";
        print "@C\n";
    }
Output:
    This will all
    end
    up together
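References work in the other direction too: a subroutine that needs to hand back more than one list can return array references, which avoids flattening on the way out. The following is a sketch under that assumption; split_by_length and its arguments are hypothetical names, not something from the course materials.

```perl
use strict;
use warnings;

# Hypothetical helper: split sequences into two lists by length.
# Takes an array reference and a cutoff; returns two array references.
sub split_by_length {
    my ($seqs, $cutoff) = @_;
    my (@short, @long);
    for my $seq (@$seqs) {
        if (length($seq) < $cutoff) { push @short, $seq }
        else                        { push @long,  $seq }
    }
    return (\@short, \@long);
}

my @dna = qw/ATT GGGCCC TT ACGTACGT/;
my ($short, $long) = split_by_length(\@dna, 4);
print "short: @$short\n";    # short: ATT TT
print "long:  @$long\n";     # long:  GGGCCC ACGTACGT
```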

22 Program Design
Input -> Algorithm -> Output
Q. What is the form of the input data?
Q. How will the program get it?
    - interactive
    - command line
    - parameter file
Q. How will the program process the data to compute the desired output?
Q. How will the output be formatted and delivered?
    Specified by user requirements

23 Program Design
Design Top Down
    Identify the inputs
    Understand the requirements for the output
    Design an overall algorithm for computing the output
    Express overall method in pseudocode
    Refine pseudocode until each step forms a well-defined subroutine
Test Bottom Up
    Write and debug subroutines one at a time
    Start with “utility” subroutines that will be used by other subroutines
    Test each subroutine with input data that gives known results
    Include subroutines that help debugging, such as printing routines for data structures

24 Pseudocode
High level, informal program
No details
Example: print out length statistics and overall nucleotide usage statistics for a file of sequences
Input:
    get sequences from DNAfile
Algorithm:
    for each DNA sequence,
        get length statistics
        count each type of nucleotide
Output:
    print length statistics
    print nucleotide usage statistics

25 Pseudocode
Keep pseudocode in perl program as comments
    # get sequences from DNAfile
    # for each DNA sequence,
    #     get length statistics
    #     count each type of nucleotide
    # print length statistics
    # print nucleotide usage statistics

26 Refinement
Refine pseudocode into more detailed steps:
Input:
    get name of DNAfile
    open DNAfile
    read lines from DNAfile, putting DNA sequences in a list
Algorithm:
    for each DNA sequence in the list
        get length and update statistics
        count each type of nucleotide in the sequence
Output:
    print length statistics
    print nucleotide usage statistics

27 Algorithm Refinement
Try to express complex tasks using Perl control structures (e.g. loops) until the inner subtasks form well-defined tasks that can each be done by a single subroutine.
Algorithm:
    for each DNA sequence in the list
        get length and update statistics
        count each type of nucleotide in the sequence
Refined:
    for each DNA sequence in the list
        get length and update statistics
        for each base
            count the occurrence of that base in the sequence
Now write a subroutine to count any base in any sequence

29
    #!/usr/bin/perl
    # File: sub1.pl
    # subroutine to count A's in DNA
    use warnings;
    use strict;
    my $a;
    my $dna = "tagATAGAC";
    $a = count_A($dna);
    print "$dna\n";
    print "a: $a\n";
    exit;
    #########################################
    # subroutine to count A's in DNA
    #
    sub count_A {
        # @_ is the list of parameters
        my ($dna) = @_;    # array context assignment
        my $count;
        # tr returns number of matches
        $count = ($dna =~ tr/Aa//);
        return $count;
    }
Output:
    tagATAGAC
    a: 4
After you've written a subroutine, ask yourself if it can be made a bit more general

30
    #!/usr/bin/perl
    # File: sub2.pl
    # subroutine to count any letter in DNA
    use warnings;
    use strict;
    my ($a, $c, $g, $t);
    my $dna = "tagATAGAC";
    $a = count_base('A', $dna);
    $t = count_base('T', $dna);
    $c = count_base('C', $dna);
    $g = count_base('G', $dna);
    print "$dna\n";
    print "a: $a t: $t c: $c g: $g\n";
    exit;
    #########################################
    #
    # subroutine to count any letter in DNA
    #
    sub count_base {
        my( $base, $dna ) = @_;
        my( $count );
        $count = ($dna =~ s/$base//ig);
        return $count;
    }
Output:
    tagATAGAC
    a: 4 t: 2 c: 1 g: 2
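With a general counting subroutine available, the refined pseudocode from the earlier slides can be turned into a small program. The sketch below is one possible version, not the lecture's solution. It makes two assumptions: the sequences come from a hard-coded list (@sequences is hypothetical) instead of DNAfile, and count_base counts with a global match in list context rather than the s/// used on the slide, so that a base with no matches yields 0.

```perl
use strict;
use warnings;

# count occurrences of one base in one sequence (case-insensitive)
sub count_base {
    my ($base, $dna) = @_;
    my $count = () = $dna =~ /\Q$base\E/gi;   # number of matches
    return $count;
}

# hypothetical input; in the real program these come from DNAfile
my @sequences = qw/ATTGC ggcat TTTTAA/;

my ($total_len, $min_len, $max_len) = (0, undef, undef);
my %usage = (A => 0, C => 0, G => 0, T => 0);

# for each DNA sequence in the list
for my $dna (@sequences) {
    # get length and update statistics
    my $len = length $dna;
    $total_len += $len;
    $min_len = $len if !defined $min_len || $len < $min_len;
    $max_len = $len if !defined $max_len || $len > $max_len;

    # for each base, count the occurrence of that base in the sequence
    $usage{$_} += count_base($_, $dna) foreach sort keys %usage;
}

# print length statistics
printf "sequences: %d  min: %d  max: %d  mean: %.2f\n",
    scalar @sequences, $min_len, $max_len, $total_len / @sequences;

# print nucleotide usage statistics
print "$_: $usage{$_}\n" foreach sort keys %usage;
```

The pseudocode survives as comments, so the program structure can still be read at a glance.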

31 Program Design: Managing Complexity
Understand inputs and outputs
Use pseudocode to refine your algorithm
Use divide-and-conquer to turn big problems into manageable pieces
    within a chromosome, process one gene at a time
    within each gene, process one reading frame at a time
    within each reading frame, process one ORF at a time
Pick data structures that make algorithms easier
    this gets easier with experience!
Write subroutines to
    transform one data object to another, for example:
        dna (string) to reading frame (array of codons)
        reading frame to orf
    perform some well defined task
    compute some statistics on a single data object
    produce final output format
Write small programs (drivers) to test each subroutine before combining them together

32 Some Good Programming References
Niklaus Wirth, Algorithms + Data Structures = Programs (Prentice-Hall Series in Automatic Computation), hardcover
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein, Introduction to Algorithms, hardcover

33 Read The Fine Manual (RTFM)
The more you read manuals, the easier it will be
For each function we have covered tonight, read the corresponding description in Ch. 29 of Wall
If you find something in the manual you don't understand, look it up (or ask someone)
Learn to use the online help pages, e.g.,
    % perldoc -f join
To see a list of online tutorials, see
    % perldoc perl
For example:
    % perldoc perlstyle
The interface is somewhat vi-like

34 Debugging Strategies
Before running the program, always run
    % perl -c prog
Read the warnings and error messages from the compiler carefully
Always use strict and use warnings
Basic strategy: bottom-up debugging
    Test and debug one subroutine at a time
Insert print statements
    to figure out where a program fails
    to print values of variables
    Comment out when not needed - don't remove!
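One way to follow the "comment out, don't remove" advice is to route debugging output through a small printing subroutine controlled by a flag. This is a sketch of that idea only; $DEBUG and debug_array are hypothetical names, and the codon example is made up for illustration.

```perl
use strict;
use warnings;

# Set to 0 to silence the debugging output without deleting it.
my $DEBUG = 1;

# Hypothetical helper: print a labelled array to STDERR when debugging.
sub debug_array {
    my ($label, $aref) = @_;
    return unless $DEBUG;
    print STDERR "DEBUG $label (", scalar @$aref, " items): @$aref\n";
    return;
}

# Split a DNA string into codons and inspect the result.
my @codons = ('ATGGCCTAA' =~ /(...)/g);
debug_array('codons', \@codons);
print "first codon: $codons[0]\n";    # first codon: ATG
```

Writing the debug output to STDERR keeps it separate from the program's real output.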

35 Starting the Debugger
    [binf:~/binf634/workspace/binf634_book_examples] jsolka% perl -d example-6-4.pl
    Loading DB routines from perl5db.pl version 1.28
    Editor support available.
    Enter h or `h h' for help, or `man perldebug' for more help.
    main::(example-6-4.pl:11):      my $dna = 'CGACGTCTTCTAAGGCGA';
    DB

36 Getting Help Within the Debugger - I
    DB h
    List/search source lines:               Control script execution:
      l [ln|sub]  List source code            T           Stack trace
      - or .      List previous/current line  s [expr]    Single step [in expr]
      v [line]    View around line            n [expr]    Next, steps over subs
      f filename  View source in file         <CR/Enter>  Repeat last n or s
      /pattern/ ?patt?   Search forw/backw    r           Return from subroutine
      M           Show module versions        c [ln|sub]  Continue until position
    Debugger controls:                        L           List break/watch/actions
      o [...]     Set debugger options        t [expr]    Toggle trace [trace expr]
      <[<]|{[{]|>[>] [cmd] Do pre/post-prompt b [ln|event|sub] [cnd] Set breakpoint
      ! [N|pat]   Redo a previous command     B ln|*      Delete a/all breakpoints
      H [-num]    Display last num commands   a [ln] cmd  Do cmd before line
      = [a val]   Define/list an alias        A ln|*      Delete a/all actions
      h [db_cmd]  Get help on command         w expr      Add a watch expression
      h h         Complete help page          W expr|*    Delete a/all watch exprs
      |[|]db_cmd  Send output to pager        ![!] syscmd Run cmd in a subprocess
      q or ^D     Quit                        R           Attempt a restart

37 Getting Help Within the Debugger - II
    Data Examination:     expr     Execute perl code, also see: s,n,t expr
      x|m expr       Evals expr in list context, dumps the result or lists methods.
      p expr         Print expression (uses script's current package).
      S [[!]pat]     List subroutine names [not] matching pattern
      V [Pk [Vars]]  List Variables in Package.  Vars can be ~pattern or !pattern.
      X [Vars]       Same as "V current_package [Vars]".  i class inheritance tree.
      y [n [Vars]]   List lexicals in higher scope.  Vars same as V.
      e     Display thread id     E Display all thread ids.
    For more help, type h cmd_letter, or run man perldebug for all docs.

38 Stepping Through Statements With the Debugger
    main::(example-6-4.pl:11):      my $dna = 'CGACGTCTTCTAAGGCGA';
    DB p $dna
    DB
    DB n
    main::(example-6-4.pl:12):      my @dna;
    DB l
    12==>   my @dna;
    13:     my $receivingcommittment;
    14:     my $previousbase = '';
    15
    16:     my$subsequence = '';
    17
    18:     if (@ARGV) {
    19:         my$subsequence = $ARGV[0];
    20      }else{
    21:         $subsequence = 'TA';
    DB p $dna
    CGACGTCTTCTAAGGCGA

39 Using the Perl Debugger
    DB n n
    main::(example-6-4.pl:13):      my $receivingcommittment;
    DB n
    main::(example-6-4.pl:14):      my $previousbase = '';
    DB n
    main::(example-6-4.pl:16):      my$subsequence = '';
    DB n
    main::(example-6-4.pl:18):      if (@ARGV) {
    DB n
    main::(example-6-4.pl:21):          $subsequence = 'TA';
    DB n
    main::(example-6-4.pl:24):      my $base1 = substr($subsequence, 0, 1);

40 Using the Perl Debugger
    DB n
    main::(example-6-4.pl:25):      my $base2 = substr($subsequence, 1, 1);
    DB n
    main::(example-6-4.pl:28):      @dna = split ( '', $dna );
    DB p $base1
    T
    DB p $base2
    A
    DB
    DB n
    main::(example-6-4.pl:39):      foreach (@dna) {
    DB p @dna
    CGACGTCTTCTAAGGCGA
    DB p "@dna"
    C G A C G T C T T C T A A G G C G A
    DB

41 Examining the Loop
    DB l 39-52
    39==>   foreach (@dna) {
    40:         if ($receivingcommittment) {
    41:             print;
    42:             next;
    43          } elsif ($previousbase eq $base1) {
    44:             if ( /$base2/ ) {
    45:                 print $base1, $base2;
    46:                 $recievingcommitment = 1;
    47              }
    48          }
    49:         $previousbase = $_;
    50      }
    51
    52:     print "\n";
    DB
    DB b 40

42 Clearing Breakpoints and Exiting the Debugger
    DB c
    main::(example-6-4.pl:40):          if ($receivingcommittment) {
    DB p
    C
    DB B
    Deleting a breakpoint requires a line number, or '*' for all
    DB q
For additional discussions please see Ch. 20 of Wall or Ch. 6 of Tisdall

43 Modules and Libraries - I
We will have more to say about this later
We will collect subroutines into handy files called modules or libraries
We tell the Perl compiler to utilize a particular module with the “use” command

44 Modules and Libraries - II
Modules end in .pm
    BeginPerlBioinfo.pm
The last line in a module must be
    1;
So we would access this module by putting the line
    use BeginPerlBioinfo;
If the Perl compiler can’t find it you may have to tell it the path
    use lib '/home/tisdall/book';
    use BeginPerlBioinfo;
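To make the module layout concrete, here is a minimal sketch of what a module and its caller might look like. The module name SeqUtils, the subroutine revcom, and the path in the comment are hypothetical. The two parts are shown in a single file so the example runs as is; in practice the first part would live in its own file, SeqUtils.pm.

```perl
use strict;
use warnings;

# --- contents of a hypothetical file SeqUtils.pm -------------------
package SeqUtils;

# reverse complement of a DNA string
sub revcom {
    my ($dna) = @_;
    my $rc = reverse $dna;              # scalar context: reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;       # complement each base
    return $rc;
}

1;   # a module file must end with a true value

# --- contents of the calling program -------------------------------
package main;

# With SeqUtils.pm in a separate file, the program would start with:
#   use lib '/path/to/modules';   # only if Perl cannot find the file
#   use SeqUtils;

print SeqUtils::revcom('ATGC'), "\n";   # GCAT
```

The subroutine is called by its fully qualified name here because this sketch does not export anything into the caller's namespace.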

45 POD (Ch. 26 in Wall)
Plain Old Documentation produces self-documenting programs
Comments can be extracted and formatted by external programs called translators
Keeps program documentation consistent with external documentation
pod text begins with "=identifier" at the start of a line, but only where the compiler is expecting a new statement
All text is ignored by the compiler until the next line starting with "=cut"
Various translators produce formatted documentation
    perldoc, pod2text, pod2html, pod2latex, etc.
    details of the format depend on the identifier

46
    =pod

    Put any number of lines of comments here. They will appear in the
    proper format when processed by pod translators.

    =cut

    # program text goes here

    =begin comment

    The identifier indicates which translator should process this text.
    This text will be ignored by all translators. Use this for internal
    documentation only.

    =end comment

    =cut

    # more program text...

    =head1 Section Heading

    text goes here, for example:

    =head1 SYNOPSIS

    usage: fasta.pl fastafile

    =over

    This starts a list:

    =item *

    First item in a list.

    =item *

    Second item.

    =back

    =cut

47 An Example Program
    #!/usr/bin/perl

    =head1 NAME

    arglist.pl

    =head1 AUTHOR

    Jeff Solka

    =head1 SYNOPSIS

    usage: arglist.pl arg1 arg2...

    =head1 DESCRIPTION

    Echoes out the command line arguments.

    =over

    =item *

    First item in a list.

    =item *

    Second item.

    =back

    =cut

    ### main program
    print "The arguments are: @ARGV\n";
    exit;

48 Our Program in Action
    [binf:fall09/binf634/mycode] jsolka% arglist.pl cat
    The arguments are: cat

49 pod2text Acting on Our Program
    [binf:fall09/binf634/mycode] jsolka% pod2text arglist.pl
    NAME
        arglist.pl
    AUTHOR
        Jeff Solka
    SYNOPSIS
        usage: arglist.pl arg1 arg2...
    DESCRIPTION
        Echoes out the command line arguments.
        * First item in a list.
        * Second item.
See Ch. 26 for other formatting tricks.

50 perldoc Acting on Our Program
    [binf:fall09/binf634/mycode] jsolka% perldoc arglist.pl > arglist.mp
    [binf:fall09/binf634/mycode] jsolka% cat arglist.mp
    ARGLIST(1)    User Contributed Perl Documentation    ARGLIST(1)
    NAME
        arglist.pl
    AUTHOR
        Jeff Solka
    SYNOPSIS
        usage: arglist.pl arg1 arg2...
    DESCRIPTION
        Echoes out the command line arguments.
        o First item in a list.
        o Second item.
    perl v5.8.8    2009-09-20    ARGLIST(1)

51 pod2html Acting on Our Program
    [binf:fall09/binf634/mycode] jsolka% pod2html arglist.pl > arglist.html
    /usr/bin/pod2html: no title for arglist.pl.
    [binf:fall09/binf634/mycode] jsolka% cat arglist.html
    arglist.pl
    NAME
    AUTHOR
    SYNOPSIS
    DESCRIPTION

52 html Output (cont.)
    NAME
        arglist.pl
    AUTHOR
        Jeff Solka
    SYNOPSIS
        usage: arglist.pl arg1 arg2...
    DESCRIPTION
        Echoes out the command line arguments.
        First item in a list.
        Second item.

53 A Link to the Autogenerated Website
Here it is

54 Readings
Read Tisdall Chapters 7 and 8
HW3: Exercises 7.2 and 7.3
Don’t forget to turn in Program 1 next week.
Don’t forget about Quiz # 2 next week.

