Shell Scripting Awk (part1)

Awk Programming Language
A standard Unix language geared toward text processing and creating formatted reports.
It is very valuable to seismologists because it uses floating point math, unlike integer-only bash, and is designed to work with columnar data.
Its syntax is similar to C and bash.
It is one of the most useful Unix tools at your command.

Awk:
considers text files as records (lines) made up of fields (columns)
performs floating point & integer arithmetic and string operations
uses loops and conditionals
lets you define your own functions (subroutines)
can execute Unix commands within scripts and process the results

Versions
awk: the original awk
nawk: new awk, dates to 1987
gawk: GNU awk, has more powerful string functionality
The CERI Unix system has all three. You want to use nawk. I suggest adding this line to your .cshrc file:
alias awk 'nawk'
In OS X, awk is already nawk, so no changes are necessary.

Command line functionality
You can call awk from the command line in two ways:
awk [options] '{ commands }' variables infile(s)
awk -f scriptfile variables infile(s)
Or you can create an executable awk script. The -f on the #! line is needed so that nawk reads the file as its program:
% cat << EOF > test.awk
#!/usr/bin/nawk -f
some set of commands
EOF
% chmod 755 test.awk
% ./test.awk
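
As a rough illustration of the two calling forms, the sketch below runs the same one-line program inline and from a script file. The file names first.awk and data.txt and their contents are made up for the example, and it is written for a POSIX sh rather than csh.

```shell
# Hypothetical script and data, written to a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/first.awk" << 'EOF'
# print the first field of every record
{ print $1 }
EOF
printf 'alpha 1\nbeta 2\n' > "$tmpdir/data.txt"

# Form 1: program text given on the command line
inline_out=$(awk '{ print $1 }' "$tmpdir/data.txt")
# Form 2: program read from a file with -f
file_out=$(awk -f "$tmpdir/first.awk" "$tmpdir/data.txt")

echo "$inline_out"
echo "$file_out"
rm -r "$tmpdir"
```

Both forms should print the same two lines, since only the place the program text comes from differs.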

How it treats text
awk commands are applied to every record (line) of a file.
It is designed to separate the data in each line into fields.
Essentially, each field becomes a member of an array, so that the first field is $1, the second field is $2, and so on.
$0 refers to the entire record.
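
A small sketch of the field model, using a made-up record (the values are not from the slides):

```shell
# Hypothetical six-column record.
line='2001 09 11 36.36 -89.50 8.2'

# Print the first field, the fourth field, then the whole record.
fields=$(printf '%s\n' "$line" | awk '{ print $1; print $4; print $0 }')
echo "$fields"
```

The last line printed is the unchanged input record, which is what $0 holds.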

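Field Separator
The default field separator is one or more white spaces.
[Example record with its fields labeled $1, $2, ..., $10 and onward; the numeric values did not survive in this transcript, only the string field "ehb".]
Notice that the fields may be integers, floating point numbers (they have a decimal point) or strings. Nawk is generally smart enough to figure out how to use them.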

Field Separator
The field separator may be modified by resetting the FS built-in variable, or with the -F option on the command line.
Look at the passwd file:
% head -n1 /etc/passwd
root:x:0:1:Super-User:/:/sbin/sh
The separator is ":", so reset it.
% awk -F":" '{ print $1, $3 }' /etc/passwd
root 0

print
One of the most common commands used in awk scripts is print.
awk is not sensitive to white space in the commands, so two fields placed side by side are concatenated:
% awk -F":" '{ print $1 $3 }' /etc/passwd
root0
Two solutions to this:
% awk -F":" '{ print $1 " " $3 }' /etc/passwd
% awk -F":" '{ print $1, $3 }' /etc/passwd
root 0
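
The difference between side-by-side fields and comma-separated fields can be checked without touching the real /etc/passwd. This sketch feeds awk the passwd line quoted on the slide:

```shell
# The /etc/passwd line shown on the slide, used as test input.
entry='root:x:0:1:Super-User:/:/sbin/sh'

# Adjacent fields are concatenated with nothing in between.
joined=$(printf '%s\n' "$entry" | awk -F":" '{ print $1 $3 }')
# A comma inserts the output field separator (a space by default).
spaced=$(printf '%s\n' "$entry" | awk -F":" '{ print $1, $3 }')

echo "$joined"
echo "$spaced"
```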

Any literal text, string or numeric, can be output explicitly by enclosing it in double quotes.
Assume a starting file like so (most of the sample record is missing from this transcript):
ehb FEQ x
% awk '{print "latitude:",$9,"longitude:",$10,"depth:",$11}' SUMA.loc
latitude: longitude: depth: 15.0
latitude: longitude: depth: 30.0
latitude: longitude: depth: 20.0

Unlike the shell, awk does not evaluate variables within strings. The second line, for example, could not be written:
{ print "$8\t$3" }
as it would print the literal text $8 and $3 separated by a tab. Inside quotes, the dollar sign is not a special character. Outside, it corresponds to a field.
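
A minimal sketch of the point about quotes, using a made-up two-field input line:

```shell
# Inside an awk string, $2 and $1 are just characters.
literal=$(printf 'a b\n' | awk '{ print "$2\t$1" }')
# Outside the quotes they are field references.
real=$(printf 'a b\n' | awk '{ print $2 "\t" $1 }')

echo "$literal"
echo "$real"
```

The first command prints the dollar signs as written; the second prints the two fields swapped, with a tab between them.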

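You can specify a newline in two ways (same starting file as before):
% awk '{print "latitude:",$9; print "longitude:",$10}' SUMA.loc
% awk '{print "latitude:",$9"\n","longitude:",$10}' SUMA.loc
latitude:
longitude:
In the second form the comma after "\n" inserts the output field separator, so the longitude line starts with a space.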

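A trick
If a field is composed of both numbers and letters, you can multiply the field by 1 to strip the letters: awk converts the leading numeric part of the field and ignores whatever follows it.
(Most of the numeric values in the listing below are missing from this transcript.)
% head test.tmp
/09/09 03:32: N W
/09/08 23:11: N W
/09/08 19:44: N W 8.2
% awk '{print $4,$4*1}' test.tmp
N N N 36.36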
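
Since the data in the listing is incomplete, here is the same trick on made-up values. The fields 36.36N and 89.50W are assumptions chosen to resemble latitude and longitude columns:

```shell
# Hypothetical fields with a trailing hemisphere letter.
converted=$(printf '36.36N 89.50W\n' | awk '{ print $1, $1*1, $2*1 }')
echo "$converted"

# A field that starts with a letter has no leading number, so it becomes 0.
leading=$(printf 'N36.36\n' | awk '{ print $1*1 }')
echo "$leading"
```

Note that the trick only works when the number comes first, and that 89.50 is printed as 89.5 because the result is now a number rather than the original text.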

Selective execution
awk recognizes regular expressions and conditionals, which can be used to selectively execute awk procedures on certain records.
% awk -F":" '/root/ { print $1, $3 }' /etc/passwd   # reg expr
root 0
or within our example script:
#!/usr/bin/nawk -f
/root/ { print $1 }

if statements are also very useful
% awk -F":" '{ if ($1=="root") print $1, $3 }' /etc/passwd
root 0
or within our example script:
{ if ($1=="root") { print $1 } }
Note: this particular if syntax is a bit different from your reading, which suggested
% awk -F":" '$1=="root" { print $1, $3 }' /etc/passwd
The syntax I use is more explicit and more like C or perl, so I essentially have to remember less syntax.
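
The two styles select the same records. A quick check on a small stand-in for /etc/passwd (the first line is from the slide, the second line is made up):

```shell
passwd_sample='root:x:0:1:Super-User:/:/sbin/sh
daemon:x:1:1::/:'

# Explicit if statement inside the action
with_if=$(printf '%s\n' "$passwd_sample" | awk -F":" '{ if ($1 == "root") print $1, $3 }')
# Condition written as a pattern in front of the action
with_pattern=$(printf '%s\n' "$passwd_sample" | awk -F":" '$1 == "root" { print $1, $3 }')

echo "$with_if"
echo "$with_pattern"
```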

Floating Point Arithmetic
awk does floating point math!!!!!
You can think of it as storing all variables as strings; when math operators are applied, it converts the strings to floating point numbers if the strings consist of numeric characters.
The reading calls these stringy variables.
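
To make the contrast with the shell concrete, this sketch divides 7 by 2 both ways and then adds two made-up fields read as text:

```shell
# Shell arithmetic is integer only.
shell_result=$((7 / 2))
# awk arithmetic is floating point.
awk_result=$(awk 'BEGIN { print 7 / 2 }')
# Fields arrive as text and are converted when used in arithmetic.
field_sum=$(printf '1.5 2.25\n' | awk '{ print $1 + $2 }')

echo "$shell_result $awk_result $field_sum"
```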

Arithmetic Operators
Basic arithmetic is left to right associative, except ^, which is right to left associative.
+ : addition
- : subtraction
* : multiplication
/ : division
% : remainder or modulus
^ : exponent
plus the other standard C programming operators
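
A short sketch exercising a few of these operators, including one case each of left and right associativity:

```shell
# 7 % 3      remainder
# 2 ^ 10     exponent
# 2 ^ 3 ^ 2  groups as 2 ^ (3 ^ 2)
# 10 - 4 - 3 groups as (10 - 4) - 3
arith=$(awk 'BEGIN { print 7 % 3, 2 ^ 10, 2 ^ 3 ^ 2, 10 - 4 - 3 }')
echo "$arith"
```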

Assignment Operators
= : set variable equal to the value on the right
+= : set variable equal to itself plus the value on the right
-= : set variable equal to itself minus the value on the right
*= : set variable equal to itself times the value on the right
/= : set variable equal to itself divided by the value on the right
%= : set variable equal to the remainder of itself divided by the value on the right
^= : set variable equal to itself raised to the power of the value on the right
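
A common use of += is a running total. The sketch below sums the three depth values that appear in the earlier output listing and also prints their mean; the END block and the use of NR there go slightly beyond what the slides have introduced so far.

```shell
# Depth values taken from the slide output above.
stats=$(printf '15.0\n30.0\n20.0\n' | awk '{ sum += $1 } END { print sum, sum / NR }')
echo "$stats"
```

An awk variable that has never been assigned behaves as 0 in arithmetic, so sum needs no initialization.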

Unary Operations
A unary expression contains one operand and one operator.
++ : increment the operand by 1
  if ++ occurs after the operand (x++), the original value of the operand is used in the expression and then it is incremented
  if ++ occurs before the operand (++x), the incremented value of the operand is used in the expression
-- : decrement the operand by 1
+ : unary plus maintains the value of the operand, +x = x
- : unary minus negates the value of the operand, -x = -1*x
! : logical negation returns 1 if the operand is false (zero or an empty string) and 0 if it is true
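
The difference between the two increment forms, and the result of logical negation, can be seen in a small sketch (the variable names are arbitrary):

```shell
unary=$(awk 'BEGIN {
    x = 5; y = x++;     # y gets the old value, then x is incremented
    print x, y
    x = 5; z = ++x;     # x is incremented first, z gets the new value
    print x, z
    a = !0; b = !1      # negation of false is 1, negation of true is 0
    print a, b
}')
echo "$unary"
```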

Relational Operators
Return 1 if true and 0 if false!!! This is the opposite of the bash test command, whose exit status is 0 for true.
Relational operators should not be chained; combine comparisons with && or || instead.
< : test for less than
<= : test for less than or equal to
> : test for greater than
>= : test for greater than or equal to
== : test for equal to
!= : test for not equal
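
A side-by-side sketch of the two conventions. The awk comparisons are assigned to variables first because a bare > inside a print statement is read as output redirection:

```shell
# awk: a true comparison evaluates to 1, a false one to 0.
awk_true=$(awk 'BEGIN { lt = (3 < 5); print lt }')
awk_false=$(awk 'BEGIN { gt = (3 > 5); print gt }')

# test ([ ]): a true comparison gives exit status 0, a false one gives 1.
test_true=0;  [ 3 -lt 5 ] || test_true=$?
test_false=0; [ 3 -gt 5 ] || test_false=$?

echo "awk:  $awk_true $awk_false"
echo "test: $test_true $test_false"
```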

Boolean (Logical) Operators
Boolean operators return 1 for true and 0 for false.
&& : logical AND; tests that both expressions are true; left to right associative
|| : logical OR; tests that one or both of the expressions are true; left to right associative
! : logical negation; tests that the expression is false

Unlike bash, the comparison and relational operators don't have different syntax for strings and numbers, i.e., awk uses == for both, where test uses = (or ==) for strings and -eq for numbers.

Comparison Operators
~ : pattern match
!~ : pattern does not match
&& : logical AND
|| : logical OR
== : equals (numeric or string)
!= : does not equal (numeric or string)
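
A sketch combining pattern matching with a numeric comparison. The three-line catalog is made up for the example; only the "ehb" label and the depth values echo the slides:

```shell
catalog='ehb 15.0
isc 30.0
ehb 20.0'

# Records whose first field matches /^ehb/ and whose second field is at least 20
deep_ehb=$(printf '%s\n' "$catalog" | awk '$1 ~ /^ehb/ && $2 >= 20 { print $1, $2 }')
# Records whose first field does not match /ehb/
not_ehb=$(printf '%s\n' "$catalog" | awk '$1 !~ /ehb/ { print $1 }')
# The same > operator compares numerically when both fields look like numbers
bigger=$(printf '10 9\n' | awk '{ b = ($1 > $2); print b }')

echo "$deep_ehb"
echo "$not_ehb"
echo "$bigger"
```

The last result is 1 because 10 and 9 are compared as numbers; compared as strings, "10" would sort before "9".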

Built-In Variables
FS : field separator
NR : record number, another useful built-in awk variable. It takes on the current record (line) number, starting from 1.
% awk -F":" '{ if (NR==1) print $1, $3 }' /etc/passwd
root 0
This is useful when headers are present in a file.
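
For the header case mentioned above, the usual move is the opposite test: skip record 1 and process the rest. A sketch on a made-up table with one header line:

```shell
# Hypothetical table; the first record is a header.
table='lat lon depth
36.1 -89.5 15.0
35.9 -90.1 30.0'

body=$(printf '%s\n' "$table" | awk 'NR > 1 { print NR, $3 }')
echo "$body"
```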

RS : record separator; specifies where the current record ends and the next begins. The default is "\n", or newline. A useful option is "", which makes blank lines separate the records.
OFS : output field separator; the default is " ", a single space
ORS : output record separator; the default is "\n", or newline
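
Two small sketches of these variables, both on made-up input. The first changes OFS to produce comma-separated output; the second sets RS to "" so that each blank-line-separated block is one record. Setting them in a BEGIN block, which runs before any input is read, is an assumption about style rather than something the slides prescribe.

```shell
# OFS is placed wherever print has a comma.
csv=$(printf 'a b c\n' | awk 'BEGIN { OFS = "," } { print $1, $2, $3 }')
echo "$csv"

# With RS = "", a record is a whole block of lines and
# newlines inside the block also separate fields.
blocks=$(printf 'name one\nvalue 1\n\nname two\nvalue 2\n' |
         awk 'BEGIN { RS = "" } { print NR ": " $2 }')
echo "$blocks"
```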

NF : number of fields in the current record; think of this as awk looking ahead to the next RS to count the number of fields in advance
FILENAME : stores the current filename
OFMT : output format for numbers; for example, OFMT="%.6f" would make print show non-integer numbers with six decimal places (the default is "%.6g", and integer values are still printed as integers)
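
A sketch of NF and OFMT on made-up input. Because NF is a number, $NF refers to the last field of the record:

```shell
# NF differs per record; $NF is the last field.
counts=$(printf 'a b c\nd e\n' | awk '{ print NF, $NF }')
echo "$counts"

# OFMT affects non-integer numbers only.
formatted=$(awk 'BEGIN { OFMT = "%.2f"; print 3.14159, 42 }')
echo "$formatted"
```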

Accessing shell variables in nawk
There are 3 methods to access shell variables inside a nawk script...

1. Assign the shell variables to awk variables after the body of the script, but before you specify the input file.
VAR1=3
VAR2="Hi"
awk '{print v1, v2}' v1=$VAR1 v2=$VAR2 input_file
3 Hi
Note that I am sneaking in the concept of awk variables here (v1, v2).

There are a couple of constraints with this method:
- Shell variables assigned using this method are not available in the BEGIN section.
- If variables are assigned after a filename, they will not be available when processing that filename.
awk '{print v1, v2}' v1=$VAR1 file1 v2=$VAR2 file2
In this case, v2 is not available to awk when processing file1.
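
The BEGIN constraint can be seen directly. In this sketch the input comes from standard input (the "-" operand) and consists of one made-up line:

```shell
VAR1=3

# v1 is assigned on the command line, after the program text.
seen=$(printf 'x\n' |
       awk 'BEGIN { print "BEGIN sees [" v1 "]" }
                  { print "body sees [" v1 "]" }' v1="$VAR1" -)
echo "$seen"
```

The BEGIN line shows empty brackets because the assignment has not been processed yet; the body line shows the value.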

Also note: awk variables are referred to by just their name (no $ in front).
awk '{print v1, v2, NF, NR}' v1=$VAR1 file1 v2=$VAR2 file2

2. Use the -v switch to assign the shell variables to awk variables. This works with nawk, but not with all flavours of awk.
nawk -v v1=$VAR1 -v v2=$VAR2 '{print v1, v2}' input_file

3. Protect the shell variables from awk by enclosing them with "'" (i.e. double quote - single quote - double quote).
awk '{print "'"$VAR1"'", "'"$VAR2"'"}' input_file
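
Methods 2 and 3 can be compared on the same values. This sketch uses plain awk and a single empty input line from echo instead of input_file; it assumes an awk that supports -v, as nawk and gawk do.

```shell
VAR1=3
VAR2="Hi"

# Method 2: -v assigns before the program starts, so BEGIN can see the value too.
with_v=$(echo | awk -v v1="$VAR1" -v v2="$VAR2" '{ print v1, v2 }')
in_begin=$(awk -v v1="$VAR1" 'BEGIN { print v1 }')

# Method 3: the shell pastes the values into the program text itself.
spliced=$(echo | awk '{ print "'"$VAR1"'", "'"$VAR2"'" }')

echo "$with_v"
echo "$in_begin"
echo "$spliced"
```

With method 3 the value becomes part of the awk source, so a value containing a double quote or a backslash can change or break the program; -v is usually the safer choice when it is available.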