1 Introduction to Perl Bioinformatics

2 What is Perl?
Practical Extraction and Report Language
A scripting language
Components:
  - an interpreter
  - scripts: text files created by the user that describe a sequence of steps to be performed by the interpreter

3 Installation
Create a Perl directory under C:\
Either download AP.msi from the course website (http://curry.ateneo.net/~jpv/BioInf07/) and execute it (installs into the C:\Perl directory),
or download AP.zip and unzip it into C:\Perl
Reset the path variable first (or edit C:\autoexec.bat) so that you can execute scripts from MSDOS:
    C> path=%path%;c:\Perl\bin

4 Writing and Running Perl Scripts
Create/edit the script (extension: .pl)
    C> edit first.pl
Execute the script
    C> perl first.pl
* Tip: place your scripts in a separate work directory
    # my first script
    print "Hello World";
    print "this is my first script";

5 Perl Features
  - Statements
  - Strings
  - Numbers and Computation
  - Variables and Interpolation
  - Input and Output
  - Files
  - Conditions and Loops
  - Pattern Matching
  - Arrays and Lists

6 Statements
A Perl script is a sequence of statements
Examples of statements:
    print "Type in a value";
    $value = <>;
    $square = $value * $value;
    print "The square is ", $square, "\n";

7 Comments
Lines that start with # are ignored by the Perl interpreter
    # this is a comment line
Within a line, the characters that follow # are also ignored
    $count = $count + 1; # increment $count

8 Strings
String: a sequence of characters (text)
In Perl, strings should be surrounded by quotes
    'I am a string'
    "I am a string"
Special characters are specified through escape sequences (preceded by a \ )
    "a newline\n and a tab\t"

9 Numbers
Integers are specified as a sequence of digits: 6, 453
Decimal numbers: 33.2, 6.04E24 (scientific notation)

10 Variables
Variable: named storage for values (such as strings and numbers)
Names are preceded by a $
Sample use:
    $count = 5;          # assignment statement
    $message = "Hello";  # another assignment
    print $count;        # print the value of a variable

11 Computation
Fundamental arithmetic operations: + - * /
Others:
    **   exponentiation
    ()   grouping
Example (try this out as a Perl script):
    $x = 4;
    $y = 2;
    $z = (3 + $x) ** $y;
    print $z, "\n";
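
To see why the grouping parentheses matter in the example above, here is a small sketch that computes the expression with and without them. The variable names $grouped and $ungrouped are made up for this illustration, and `use strict`, `use warnings`, and `my` are additions that the slides do not cover.

```perl
use strict;      # not covered in the slides: requires variables to be declared
use warnings;

my $x = 4;       # 'my' declares a variable (needed under 'use strict')
my $y = 2;

# Parentheses force the addition to happen before the exponentiation.
my $grouped = (3 + $x) ** $y;      # (3 + 4) squared

# Without parentheses, ** binds tighter than +, so only $x is squared.
my $ungrouped = 3 + $x ** $y;      # 3 + (4 squared)

print "grouped:   $grouped\n";     # prints 49
print "ungrouped: $ungrouped\n";   # prints 19
```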

12 Interpolation
Given the following script:
    $x = "Smith";
    print "Good morning, Mr. $x";
    print 'Good morning, Mr. $x';
Strings quoted with "" perform expansions on variables.
Escape characters like \n are also interpreted when strings are quoted with "", but not when they are quoted with ''.
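
The difference between the two quoting styles can be checked with a short sketch like the one below. It stores each string in a variable first ($double and $single are names chosen for this illustration) so that the two results can be compared; `use strict` and `my` are additions beyond the slides.

```perl
use strict;
use warnings;

my $x = "Smith";

# Double quotes: $x is replaced by its value and \n becomes a newline.
my $double = "Good morning, Mr. $x\n";

# Single quotes: the text is kept exactly as typed, including $x and \n.
my $single = 'Good morning, Mr. $x\n';

print $double;          # Good morning, Mr. Smith
print $single, "\n";    # Good morning, Mr. $x\n
```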

13 Input and Output
Output:
  - print function
  - Escape characters
  - Interpolation
Input:
  - Bracket operator (e.g., $line = <>; )
  - Not typed (takes in strings or numbers)

14 Input Files
Opening a file:
    open INFILE, 'data.txt';
Input:
    $line = <INFILE>;
Closing a file:
    close INFILE;

15 Output Files
Opening:
    open OUTFILE, '>result.txt';
Or, to append:
    open OUTFILE, '>>result.txt'; # append
Output:
    print OUTFILE "Hello";
Closing files:
    close OUTFILE;
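
As a rough sketch of how the file operations fit together, the script below writes a file, appends to it, and reads it back line by line. The file name result_demo.txt is hypothetical, and the `or die` checks, the array @lines with `push`, and `unlink` are additions that the slides do not cover.

```perl
use strict;
use warnings;

my $file = 'result_demo.txt';    # hypothetical file name for this sketch

# '>' creates the file, or empties it if it already exists.
open(OUTFILE, ">$file") or die "Cannot write $file: $!";
print OUTFILE "Hello\n";
close OUTFILE;

# '>>' keeps the existing content and adds to the end.
open(OUTFILE, ">>$file") or die "Cannot append to $file: $!";
print OUTFILE "World\n";
close OUTFILE;

# Read the file back, one line per pass through the loop.
my @lines;
open(INFILE, $file) or die "Cannot read $file: $!";
while ( defined( my $line = <INFILE> ) ) {
    chomp $line;                 # drop the newline at the end of the line
    push @lines, $line;          # collect the line in an array
}
close INFILE;

unlink $file;                    # remove the demo file again
print join(", ", @lines), "\n";  # prints: Hello, World
```

Checking the result of open is worth the extra words: if the file cannot be opened, later reads and prints on that handle fail with only a warning, or silently when warnings are off.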

16 Conditions
Can execute statements conditionally
Syntax:
    if ( condition )
    {
        statement
        statement
        …
    }
Example:
    if ( $num > 1000 )
    {
        print "Large";
    }

17 If - Else
    $num = <>;
    if ( $num > 1000 )
    {
        print "Large number\n";
    }
    else
    {
        print "Small number\n";
    }
    print "Thanks\n";

18 Loops
Repetitive execution
Syntax:
    while ( condition )
    {
        statement
        statement
        …
    }
Example:
    $count = 0;
    while ( $count < 10 )
    {
        print "counting-", $count;
        $count = $count + 1;
    }

19 Conditions
( expr symbol expr )
Numbers:
    ==   equal
    !=   not equal
    <    less than
    <=   less than or equal
    >    greater than
    >=   greater than or equal
Strings:
    eq  ne  lt  gt  le  ge
    =~   pattern match
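
The numeric and string operators can give different answers for the same pair of values, which the following sketch tries to show. The variable names are invented for this illustration, and `use strict` and `my` are additions beyond the slides.

```perl
use strict;
use warnings;

my $first  = "10";
my $second = "9";

# Numeric comparison: 10 is greater than 9.
my $numeric = "no";
if ( $first > $second ) { $numeric = "yes"; }

# String comparison goes character by character: "1" sorts before "9",
# so "10" is NOT greater than "9" as a string.
my $string = "no";
if ( $first gt $second ) { $string = "yes"; }

# Equality behaves the same way: 1.0 and 1 are equal as numbers,
# but "1.0" and "1" are different strings.
my $num_equal = "no";
if ( "1.0" == "1" ) { $num_equal = "yes"; }
my $str_equal = "no";
if ( "1.0" eq "1" ) { $str_equal = "yes"; }

print "10 >  9     : $numeric\n";     # yes
print "10 gt 9     : $string\n";      # no
print "1.0 == 1    : $num_equal\n";   # yes
print "1.0 eq 1    : $str_equal\n";   # no
```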

20 Functions
    length $str    returns the number of characters in $str
    defined $str   tests whether $str has a defined value (useful for testing whether $line = <>; succeeded)
    chomp $str     removes the trailing newline from $str, if there is one (useful because $line = <>; includes the newline character)
    print $var     displays $var on the output device
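
A quick way to get a feel for these functions is to apply them to a string that looks like a line of input. In the sketch below the string "ACGT\n" stands in for a line read with <>, and all variable names are made up for the illustration; `use strict` and `my` are additions beyond the slides.

```perl
use strict;
use warnings;

my $line = "ACGT\n";          # stands in for a line read with <>, newline included

my $before = length $line;    # four letters plus the newline
chomp $line;                  # drops the trailing newline
my $after = length $line;

chomp $line;                  # no newline left, so nothing changes
my $again = length $line;

my $missing;                  # declared but never given a value
my $status = "undefined";
if ( defined $missing ) { $status = "defined"; }

print "$before $after $again $status\n";   # prints: 5 4 4 undefined
```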

21 Pattern Matching
=~ is a condition that checks whether a string matches a pattern
Simplest case: the pattern specifies a search substring
Example:
    if ( $s =~ /bio/ ) …
holds TRUE if $s is "molecular biology", "bioinformatics", "the bionic man";
FALSE if $s is "chemistry", "bicycle", "a BiOpsy"
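
The six sample strings on this slide can be run through the pattern with a short script such as the one below. The `foreach` loop and the list of strings are conveniences that the slides do not introduce, and $matches is a counter added for the illustration.

```perl
use strict;
use warnings;

my @samples = ( "molecular biology", "bioinformatics", "the bionic man",
                "chemistry", "bicycle", "a BiOpsy" );

my $matches = 0;
foreach my $s (@samples) {          # visit each sample string in turn
    if ( $s =~ /bio/ ) {
        print "match:    $s\n";
        $matches = $matches + 1;
    }
    else {
        print "no match: $s\n";
    }
}

print "$matches of 6 matched\n";    # prints: 3 of 6 matched
```

"a BiOpsy" fails because matching is case-sensitive by default: the pattern asks for lowercase b, i, o in a row.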

22 Special pattern matching characters
    \w   word character (letter, digit, or underscore)
    \d   digit
    \s   space character (space, tab, \n)
Example:
    if ( $s =~ /\w\w\s\d\d\d/ ) …
holds TRUE for "CS 123 course", "Take Ma 101 today";
FALSE for "Only 1 number here"

23 Special pattern matching characters
    .    any character
    ^    beginning of string/line
    $    end of string or line
Example:
    if ( $s =~ /^\d\d\d\ss..r/ ) …
holds TRUE for "300 spartans";
FALSE for "all 100 stars"
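
The following sketch applies the \w \s \d pattern and the anchored pattern to the sample strings, and also tries the second pattern without its ^ to show what the anchor contributes. The counters and result variables are made up for this illustration, and `foreach` is not covered in the slides.

```perl
use strict;
use warnings;

# Pattern: two word characters, one space character, three digits.
my $course_hits = 0;
foreach my $s ( "CS 123 course", "Take Ma 101 today", "Only 1 number here" ) {
    if ( $s =~ /\w\w\s\d\d\d/ ) {
        $course_hits = $course_hits + 1;
    }
}

# Pattern: three digits, a space, then s, any two characters, r.
my $text       = "all 100 stars";
my $anchored   = "no";
my $unanchored = "no";
if ( $text =~ /^\d\d\d\ss..r/ ) { $anchored   = "yes"; }   # must start the string
if ( $text =~ /\d\d\d\ss..r/ )  { $unanchored = "yes"; }   # may start anywhere

my $spartans = "no";
if ( "300 spartans" =~ /^\d\d\d\ss..r/ ) { $spartans = "yes"; }

print "course pattern hits: $course_hits\n";            # 2
print "300 spartans, with ^: $spartans\n";              # yes
print "all 100 stars, with ^: $anchored\n";             # no
print "all 100 stars, without ^: $unanchored\n";        # yes
```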

24 Groups and Quantifiers
    [xyz]   character set (any one of x, y, z)
    |       alternatives
    *       zero or more
    +       1 or more
    ?       0 or 1
    {M}     exactly M repetitions
    {M,N}   between M and N repetitions
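
As a sketch of how a few of these symbols combine, the script below checks some properties of a DNA string. The sequences are made up for the illustration, the start and stop codons used are the standard ones, and the variable names are invented here; `use strict` and `my` are additions beyond the slides.

```perl
use strict;
use warnings;

my $dna = "ATGAAACCCGGGTTTTAG";    # made-up sequence for illustration
my $bad = "ATGNNNTAG";             # contains characters other than A, C, G, T

my ( $only_bases, $bad_bases, $starts, $stops, $t_run, $a_run ) =
   ( "no", "no", "no", "no", "no", "no" );

# [ACGT]+ between ^ and $: one or more bases and nothing else.
if ( $dna =~ /^[ACGT]+$/ ) { $only_bases = "yes"; }
if ( $bad =~ /^[ACGT]+$/ ) { $bad_bases  = "yes"; }

# | chooses between alternatives; the parentheses limit its reach.
if ( $dna =~ /^(ATG|GTG)/ )     { $starts = "yes"; }   # begins with a start codon
if ( $dna =~ /(TAA|TAG|TGA)$/ ) { $stops  = "yes"; }   # ends with a stop codon

# {M} and {M,N} count repetitions of the preceding character.
if ( $dna =~ /T{4}/ )     { $t_run = "yes"; }          # four Ts in a row
if ( $dna =~ /GA{2,3}C/ ) { $a_run = "yes"; }          # G, two or three As, C

print "only bases: $only_bases, bad string: $bad_bases\n";
print "start: $starts, stop: $stops\n";
print "TTTT: $t_run, GA{2,3}C: $a_run\n";
```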

25 NCBI file Example
    /VERSION\s+(\S+)\s+GI:(\S+)/
Matches a version line
Parentheses group characters for future retrieval
$1 holds the version identifier matched by the first group; $2 holds the number after "GI:"
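
A minimal sketch of the pattern in use is shown below. The input line is a made-up placeholder in the shape of a VERSION line, not a real record, and the variables $version and $gi are names chosen for this illustration.

```perl
use strict;
use warnings;

# Made-up sample line; a real script would read it from a file.
my $line = "VERSION     XX000000.1  GI:0000000\n";

my $version = "";
my $gi      = "";
if ( $line =~ /VERSION\s+(\S+)\s+GI:(\S+)/ ) {
    $version = $1;    # text matched by the first pair of parentheses
    $gi      = $2;    # text matched by the second pair
}

print "version=$version gi=$gi\n";   # prints: version=XX000000.1 gi=0000000
```

Copying $1 and $2 into named variables right after the match is a common habit, since the next successful match replaces their contents.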

