CSC 4630 Meeting 7 February 7, 2007.


1 CSC 4630 Meeting 7 February 7, 2007

2 More Scripting Languages
awk, named for Aho, Weinberger, and Kernighan.
The script is embedded in an implicit nested looping control structure:
  for each input file do
    for each line of the file do
      for each pattern {action} statement, in program order, do
        if pattern matches line then action
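
A small check of the evaluation order, as a sketch with made-up input: two statements are applied to two input lines. The output is grouped by line, which shows that awk reads each line once and tests it against every statement in program order.

```shell
# Two pattern {action} statements (both with the pattern omitted)
# applied to two illustrative input lines, "a" and "b".
out=$(printf 'a\nb\n' | awk '{ print "first:", $0 } { print "second:", $0 }')
echo "$out"
```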

3 awk Programs
Generally a sequence of pattern {action} statements.
If {action} is missing, matched lines are printed (meaning written to STDOUT).
If pattern is missing, the action is carried out for all lines.

4 Running awk Programs
Short one, composed at the keyboard with little thought:
  awk 'program' file1 file2 …
Note that awk can take a sequence of files as input.
Long one, composed in an editor:
  awk -f progfile file1 file2 …
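
The two ways of running a program can be compared side by side. This is only a sketch: the scratch directory and the file names f1, f2, and prog.awk are invented for the illustration.

```shell
# Hypothetical scratch files; the names f1, f2 and prog.awk are illustrative only.
dir=$(mktemp -d)
printf 'a b\n' > "$dir/f1"
printf 'c d\n' > "$dir/f2"
printf '{ print $2 }\n' > "$dir/prog.awk"

# Short program given on the command line, inside single quotes.
inline=$(awk '{ print $2 }' "$dir/f1" "$dir/f2")

# The same program read from a file with -f.
fromfile=$(awk -f "$dir/prog.awk" "$dir/f1" "$dir/f2")

echo "$inline"
echo "$fromfile"
rm -r "$dir"
```

Both invocations process f1 and then f2, so they print the same two lines.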

5 awk's View of Files
Input to awk consists of text files, divided into lines.
Each line is divided into fields by blanks or tabs (the default separator).
Each field is referenced by its position in the line: $1, $2, $3, …
$0 refers to the entire line.

6 Examples
awk '{print $1}' names
  Print the first field in each line of the names file.
awk '/M/' names
  Print each line of the names file that contains an upper case M.
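
The slides do not show the contents of the names file, so the sketch below feeds both commands three invented lines on standard input instead.

```shell
# Invented sample data standing in for the names file.
names='Mary Smith
bob Jones
Martin Lee'

# First field of every line.
first=$(printf '%s\n' "$names" | awk '{print $1}')

# Every line containing an upper case M (no action, so the line is printed).
with_m=$(printf '%s\n' "$names" | awk '/M/')

echo "$first"
echo "$with_m"
```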

7 Some Built-In Variables
NR, number of the current line of input (runs sequentially over all input files)
NF, number of fields in the current line
FS, the field separator
  FS = "\t" sets the separator to tab only
  FS = ":" sets the separator to colon
FNR, number of the current line (record) in the current input file (resets when a new input file is opened)
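
A minimal sketch of how NR, FNR, NF, and FS behave across two input files. The files f1 and f2 and their colon-separated contents are assumptions made up for the example.

```shell
# Hypothetical colon-separated input files.
dir=$(mktemp -d)
printf 'a:1\nb:2\n' > "$dir/f1"
printf 'c:3\n' > "$dir/f2"

# FS is set before any input is read; NR keeps counting across files,
# FNR restarts at 1 when f2 is opened.
out=$(awk 'BEGIN { FS = ":" } { print NR, FNR, NF, $2 }' "$dir/f1" "$dir/f2")
echo "$out"
rm -r "$dir"
```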

8 Examples
{print NR, NF}
{print NR, $0}
{print $NF}
NR == 10
NF != 3

9 Patterns
Special patterns
  BEGIN: action is done once, before any lines of the input file(s) are read
  END: action is done once, after the last file has been processed
Relational expressions between strings or numbers
  Arguments are treated as numbers, if possible
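
BEGIN and END are commonly paired with an ordinary statement that accumulates a value. The sketch below, on three made-up input numbers, prints a marker before reading input and the line count and sum afterwards.

```shell
# BEGIN runs before the first line is read, END after the last one.
# sum starts at 0 automatically; NR in END is the total number of lines read.
out=$(printf '3\n4\n5\n' | awk 'BEGIN { print "start" } { sum += $1 } END { print NR, sum }')
echo "$out"
```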

10 Comparison Operators
<   less than
>   greater than
<=  less than or equal to
>=  greater than or equal to
==  equal to
!=  not equal to
~   matches
!~  does not match

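11 Regular Expressions
Enclosed in / /
A bare /regex/ pattern is matched against the entire line and succeeds if it matches anywhere in it.
A match against a single field is specified as $3 ~ /Ab/, for example.
Special symbols: \ ^ $ . [ ] * + ? ( ) |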

12 Examples
/Asia/
/^.$/
/a\$/
/\t/
$2 !~ /^[0-9]+$/
/(apple|cherry) (pie|tart)/ (note the space)
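
Two of these patterns can be tried on invented input, as a sketch. The first keeps lines whose second field is not a string of digits; the second needs the space between the two alternations to be present in the line.

```shell
# Lines whose second field is NOT made up entirely of digits.
non_numeric=$(printf 'x 12\ny 1a\nz 7\n' | awk '$2 !~ /^[0-9]+$/')

# "cherrytart" has no space, so it does not match.
desserts=$(printf 'apple pie\ncherrytart\ncherry tart\n' | awk '/(apple|cherry) (pie|tart)/')

echo "$non_numeric"
echo "$desserts"
```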

13 C Escape Sequences
\b    backspace
\f    formfeed
\n    newline
\r    carriage return
\t    tab
\ddd  character whose ASCII value in octal is ddd
\"    quotation mark
\c    any other character c, literally

14 Actions
Mini C-like programs; can extend over several lines.
Statements are terminated by semicolons or newlines.
Statements are grouped with braces { }.
Variables are either floating point numbers or strings.
Variables are automatically declared and initialized:
  strings are initialized to "", the empty string
  numbers are initialized to 0

15 Assignment Statements
Simple version: v = e
  The variable or field named by v is assigned the value of expression e.
Assignment operators: v op= e means v = v op e
  Legal values of op are + - * / % ^
  Used because the compound form can run faster in interpreted code (v is evaluated only once).
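
A short sketch of the assignment operators, applied first to a variable and then to a field. The starting values are arbitrary.

```shell
# v goes 7 -> 10 -> 20 -> 2 -> 8 through +=, *=, %= and ^=.
v=$(awk 'BEGIN { v = 7; v += 3; v *= 2; v %= 6; v ^= 3; print v }')

# Assigning to a field rebuilds $0 with the new value.
doubled=$(printf 'a 5\n' | awk '{ $2 *= 2; print }')

echo "$v"
echo "$doubled"
```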

16 Increment Operators
Borrowed from C: prefix or postfix ++ or --
Example: x = 3. What is the value of k?
  k = x++
  k = ++x
  k = x--
  k = --x
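
The question can be answered by running each assignment. The sketch assumes x is reset to 3 before every statement (the slide does not say whether the four statements are independent) and prints k followed by the resulting x.

```shell
# Assumption: x is reset to 3 before each of the four statements.
out=$(awk 'BEGIN {
    x = 3; k = x++; print k, x   # postfix: k gets the old value
    x = 3; k = ++x; print k, x   # prefix: k gets the new value
    x = 3; k = x--; print k, x
    x = 3; k = --x; print k, x
}')
echo "$out"
```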

17 Arithmetic Functions
sin(x)      assumes x is in radians
cos(x)      assumes x is in radians
atan2(y,x)  arctangent of y/x, with range from -pi to pi
exp(x)      exponential
log(x)      natural logarithm of x, so x > 0
sqrt(x)     square root of x, so x >= 0
int(x)      truncates the fractional part
rand()      returns a random number in [0,1); takes no argument
srand(x)    sets the seed for rand to x
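
A few of these functions can be checked with a BEGIN-only program, as a sketch. The seed 42 is an arbitrary choice; the point is only that reseeding with the same value repeats the sequence.

```shell
# int truncates toward zero; sqrt, exp and log on easy arguments.
basic=$(awk 'BEGIN { print int(3.7), int(-3.7), sqrt(16), exp(0), log(1) }')

# atan2(0, -1) is pi, shown here to four decimal places.
pi=$(awk 'BEGIN { printf "%.4f\n", atan2(0, -1) }')

# Reseeding with the same value (42 is arbitrary) repeats the random sequence,
# and rand() stays within [0,1).
random=$(awk 'BEGIN {
    srand(42); a = rand()
    srand(42); b = rand()
    same = (a == b)
    in_range = (a >= 0 && a < 1)
    print same, in_range
}')

echo "$basic"
echo "$pi"
echo "$random"
```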

18 Strings
Literal values are enclosed in double quotes:
  "abc"  "Wildcats rule"  "20 bananas"
Concatenation is represented by juxtaposition:
  s = "Villanova"
  t = "Wildcats"
  {print s t}

19 String Functions
"Standard" string operations (cf. head, tail, firstfew, lastfew, allbut)
length(s)      length of s; length with no argument means length($0)
index(s,t)     if t is a substring of s, returns the position of its first character; returns 0 otherwise
substr(s,p)    returns the substring of s from position p to the end (positions start at 1); returns the empty string if p > length(s)
substr(s,p,n)  returns the substring of length at most n starting at position p
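
The string operations can be exercised on the strings used in the concatenation example. This is a sketch; the search strings "nova" and "xyz" are chosen only for illustration.

```shell
out=$(awk 'BEGIN {
    s = "Villanova"; t = "Wildcats"
    print s t                # concatenation by juxtaposition
    print length(s)          # number of characters in s
    print index(s, "nova")   # 1-based position of the first match
    print index(s, "xyz")    # 0 when there is no match
    print substr(s, 6)       # from position 6 to the end
    print substr(s, 1, 4)    # four characters starting at position 1
}')
echo "$out"
```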

20 String Functions (2)
"Editing" functions (r is a regular expression)
sub(r,s)     replace r by s in the current record (first occurrence only)
sub(r,s,t)   replace r by s in t (first occurrence only)
gsub(r,s)    replace r by s in the current record (globally)
gsub(r,s,t)  replace r by s in t (globally)
In all cases, the return value is the number of substitutions made.
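
A sketch contrasting sub and gsub on the same made-up string, followed by the two-argument form of gsub acting on the current record.

```shell
# sub changes only the first match; gsub changes every match.
# Both return the number of substitutions made.
counts=$(awk 'BEGIN {
    t = "banana"; n1 = sub(/an/, "AN", t)
    u = "banana"; n2 = gsub(/an/, "AN", u)
    print n1, t, n2, u
}')

# With two arguments, gsub edits the current record ($0).
record=$(printf 'a-b-c\n' | awk '{ gsub(/-/, "+"); print }')

echo "$counts"
echo "$record"
```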

21 Control Structures
if (<expression>) <s1> else <s2>
<expression> can be any expression; true is defined to be non-zero or non-null.
<s1> and <s2> can be any group of statements.
Note the critical parentheses that separate the conditional expression from <s1>.

22 Control Structures (2)
while (<expression>) <s1>
Same rules as for if-then-else.
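
The if and while forms can be combined in one action. The sketch below walks over the fields of a single made-up input line with while and classifies each one with a chained if-else.

```shell
out=$(printf '5 -2 0\n' | awk '{
    i = 1
    while (i <= NF) {            # visit every field of the line
        if ($i > 0)
            print $i, "positive"
        else if ($i < 0)
            print $i, "negative"
        else
            print $i, "zero"
        i++
    }
}')
echo "$out"
```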

23 Control Structures (3)
for (<e1>;<e2>;<e3>) <s1>
  is equivalent to <e1>; while (<e2>) {<s1>;<e3>}
for (k in <array>) <s1>
  loops over the subscripts of an array, but the order in which the subscripts are visited is unspecified and depends on the implementation.
Careful: awk allows general subscripting. Strings can be used as subscripts.
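
Both loop forms in a short sketch on invented input. Because the traversal order of for (k in array) is not fixed, the word counts are piped through sort to get a stable listing.

```shell
# String subscripts: count how often each word occurs, then list the counts.
# sort is used only because the for-in order is not defined.
counts=$(printf 'apple\npear\napple\n' |
    awk '{ count[$1]++ } END { for (k in count) print k, count[k] }' | sort)

# Classic three-part for loop: print the fields of a line in reverse order.
reversed=$(printf 'a b c\n' |
    awk '{ line = $NF; for (i = NF - 1; i >= 1; i--) line = line " " $i; print line }')

echo "$counts"
echo "$reversed"
```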

24 Control Structures (4)
"Go to" structures
break     when executed within a for or while statement, causes an immediate exit from the loop
continue  when executed within a for or while statement, causes immediate execution of the next iteration
next      causes the next line (record) of the input file to be read and the sequence of pattern {action} statements to be executed on it
exit      causes the program to jump to the END pattern, execute its action, and stop
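
A sketch of next and exit on four made-up lines. The words skip, keep, and stop in the first field are invented markers: a skip line is dropped, and the stop line ends input processing, after which the END action still runs.

```shell
# "skip" lines are abandoned with next; "stop" ends processing with exit.
# The line after "stop" is never read, but END is still executed.
out=$(printf 'skip 1\nkeep 2\nstop 3\nkeep 4\n' |
    awk '$1 == "skip" { next } $1 == "stop" { exit } { print $2 } END { print "done" }')
echo "$out"
```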

