CIS 218 – Advanced UNIX: (g)awk

Slide 2: Overview
– awk is a programming language.
– awk borrows syntax from grep and sed for handling numbers and text.
– awk provides field-level addressability, and substring functions reach within a field (word).
– awk works field by field: it reads each record (line) and splits it into fields.

Slide 3: awk command syntax
There are two ways to execute an awk program/script:
– awk [-F field-separator] 'program' target-file
– awk [-F field-separator] -f program.file target-file
From our discussion of sed, and Refrigerator Rule No. 5, I would hope you are firmly committed to the second form!
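The two invocation forms can be compared side by side. This is a minimal sketch, not the course's own example: the data file scores.txt, the program file names.awk, and the colon-separated layout are all made up for illustration.

```shell
# Work in a scratch directory so nothing is left behind.
tmp=$(mktemp -d)

# Hypothetical sample data: name:score, colon-separated.
printf 'alice:90\nbob:72\n' > "$tmp/scores.txt"

# Form 1: the program is given inline, in single quotes.
inline=$(awk -F: '{ print $1 }' "$tmp/scores.txt")

# Form 2: the program lives in its own file and is loaded with -f.
printf '{ print $1 }\n' > "$tmp/names.awk"
fromfile=$(awk -F: -f "$tmp/names.awk" "$tmp/scores.txt")

echo "$inline"
echo "$fromfile"
rm -rf "$tmp"
```

Both forms print the same names; the second keeps the program in a file that can be edited, commented, and reused.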

Slide 4: awk Variables
A number of built-in awk variables are very useful:
– FS (the input field separator; defaults to white space)
– OFS (the output field separator; can be critical)
– NR (the number of records read so far; a sequential counter)
– NF (the number of fields in the current record)
– FILENAME (the name of the current target file)

Slide 5: awk Variables (cont.)
– $0 (the entire line as read from the target file)
– $n (where n is the number of a field in the record; this is how we get field-level addressability in awk)
nawk, gawk, etc. give us more variables. The two most significant are:
– ARGC (the count of the command-line arguments)
– ARGV (an array of the command-line arguments)
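As a rough illustration of the variables above, the sketch below sets FS and OFS and prints NR, NF, $1, and $0 for each record. The input lines are invented for the example.

```shell
out=$(printf 'alice:90:A\nbob:72\n' |
  awk 'BEGIN { FS = ":"; OFS = " | " }   # set separators before any input is read
       { print NR, NF, $1, $0 }')        # commas in print are replaced by OFS
echo "$out"
# prints:
# 1 | 3 | alice | alice:90:A
# 2 | 2 | bob | bob:72
```

Note that $0 still shows the original colons: OFS only appears between the comma-separated items of print here.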

Slide 6: Parts of a program
All programs are composed of one or more of the following three constructs:
– sequence (a series of instructions, one following the next, executed sequentially)
– selection (the ability of the code to decide which instructions to execute; conditional execution)
– iteration (looping, so that selected code is repeated over and over)

Slide 7: awk Program Format
awk programs are composed of pattern {action} pairs (actions must be enclosed in French braces {}).
– A pattern without a corresponding action takes the default action, print $0.
– An action without a corresponding pattern is applied to every line.
– Each input line is submitted to every pattern/action pair.

Slide 8: awk Program Format (cont.)
Placement of the open French brace is critical.
– Brace on the same line as the pattern: both actions are executed for lines matching the pattern.
    pattern {
        action 1
        action 2
    }
– Brace on the line after the pattern: lines matching the pattern are printed, and both actions are performed on every line!
    pattern
    {
        action 1
        action 2
    }
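The effect of brace placement can be checked directly. The following is a small sketch with made-up input (two lines, apple and banana) and a single action instead of two.

```shell
data='apple
banana'

# Brace on the pattern line: the action runs only for matching lines.
same_line=$(printf '%s\n' "$data" | awk '/apple/ {
    print "match: " $0
}')

# Brace on the next line: the pattern stands alone (default action, print $0),
# and the braces become a separate action applied to every line.
next_line=$(printf '%s\n' "$data" | awk '/apple/
{
    print "match: " $0
}')

echo "$same_line"
echo "---"
echo "$next_line"
```

The first run prints one line, "match: apple". The second prints "apple", then "match: apple", then "match: banana".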

Slide 9: Patterns
In an awk program, the pattern is the selection tool that decides which actions are applied to which lines. Patterns can be:
– relational expressions
– regular expressions
– magic patterns (BEGIN and END)

Slide 10: Relational Expression patterns
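The body of this slide did not survive in the transcript. As a stand-in, here is a hedged sketch of relational patterns using standard awk comparison operators (>=, ==) on fields and on NR; the names and scores are invented.

```shell
out=$(printf 'alice 90\nbob 72\ncarol 85\n' |
  awk '$2 >= 85    { print $1 " passed with " $2 }   # numeric comparison on a field
       $1 == "bob" { print $1 " needs a retake" }    # string comparison on a field
       NR == 3     { print "third record: " $0 }')   # comparison on a built-in variable
echo "$out"
# prints:
# alice passed with 90
# bob needs a retake
# carol passed with 85
# third record: carol 85
```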

Slide 11: Regular Expression patterns
– Must be enclosed in slashes: /RE/
– Anchors apply to the entire line if the RE is used as the only pattern.
– Remember, you can use regular expressions in relational patterns with ~ and !~ to apply them to fields.
– Both true regular expressions and fixed patterns can be used as REs in awk.
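A short sketch of the three uses described above: a bare /RE/ applied to the whole line, and ~ and !~ applied to a single field. The colon-separated input is a made-up, simplified imitation of a password file.

```shell
out=$(printf 'root:/bin/sh\ndaemon:/usr/sbin/nologin\nbash:/bin/bash\n' |
  awk -F: '/^root/        { print "line starts with root: " $0 }    # anchor applies to the line
           $2 ~ /sh$/     { print "shell field ends in sh: " $1 }   # anchor applies to field 2
           $2 !~ /^\/bin/ { print "not under /bin: " $1 }')         # field 2 does not match
echo "$out"
# prints:
# line starts with root: root:/bin/sh
# shell field ends in sh: root
# not under /bin: daemon
# shell field ends in sh: bash
```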

Slide 12: Pre/Post Processing
There are two magic patterns in awk:
– BEGIN {the associated action is performed before the target file is opened}
– END {the associated action is performed after the target file is successfully closed}
Both are coded in UPPER CASE.
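A minimal sketch of pre- and post-processing: BEGIN prints a header and initializes a counter, the middle action runs once per input line, and END reports the results. The numbers are invented test data.

```shell
out=$(printf '10\n20\n30\n' |
  awk 'BEGIN { print "start"; total = 0 }    # runs once, before any input
       { total += $1 }                        # runs for every input line
       END { print "records:", NR; print "total:", total }')  # runs once, after all input
echo "$out"
# prints:
# start
# records: 3
# total: 60
```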

Slide 13: # comments
– Like most scripting languages, awk uses # to indicate a comment.
– awk scripts should be well documented.
– Comments should explain what you are doing and why.

Slide 14: print
– The print command is the simple output tool for awk; basically an "echo".
– You can direct print to send its data to a file with the > operator.
– Generally print is used for simple output or debugging output.

Slide 15: printf
– Similar in concept to the "C" language function.
– The format of a printf command is: printf("formatting string", variables)
– The formatting characters correspond to the variables one for one, in order.
– Each formatting character is prefixed by %.

Slide 16: printf (cont.)
The formatting specifiers contain the following characters:
– - indicates that the data should be left justified
– n indicates the minimum width of the field
– .n indicates the maximum width of the field (for a string; for a floating-point number it is the number of digits after the decimal point)
"%-5s" indicates a string field, left justified, with a minimum width of 5 bytes.

Slide 17: printf formatting characters

Slide 18: printf spacing characters
There are two characters available to change the spacing of your text:
– \n inserts a newline character. You must use this if you want your output to occur on successive lines.
– \t inserts a tab character.
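The printf slides can be pulled together in one sketch. It assumes the common conversion characters s (string), d (integer), and f (floating point), and uses | as a visible column marker; the input data is made up.

```shell
out=$(printf 'alice 90.5\nbob 7\n' |
  awk '{
      # %-8s  : string, left justified, at least 8 wide
      # %5d   : integer, right justified, at least 5 wide
      # %6.2f : floating point, at least 6 wide, 2 digits after the point
      # %.3s  : string, at most 3 characters
      printf("%-8s|%5d|%6.2f|%.3s\n", $1, $2, $2, $1)
  }')
echo "$out"
# prints:
# alice   |   90| 90.50|ali
# bob     |    7|  7.00|bob
```

Without the \n at the end of the format string, both records would run together on one line.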

Slide 19: getline
– getline reads a line of input on request; here it is used to read from the keyboard.
– It can also capture the results of a command, but this form is seldom used.
– Read from the keyboard using: getline variable < "/dev/tty"
– If you don't supply a variable, awk will read into $0 and overwrite the current record, so in most cases you want to use a variable.
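Reading from /dev/tty needs an interactive terminal, so the sketch below substitutes a scratch file (a hypothetical extra.txt) for the keyboard and also shows the command form. The redirection syntax is the same as in the slide; only the source differs.

```shell
tmp=$(mktemp -d)
printf 'first line\nsecond line\n' > "$tmp/extra.txt"

out=$(awk -v extra="$tmp/extra.txt" 'BEGIN {
    # Read one line from a file into a variable; $0 is left alone.
    if ((getline line < extra) > 0)
        print "from file: " line
    close(extra)

    # Capture the first output line of a command.
    cmd = "echo hello"
    if ((cmd | getline reply) > 0)
        print "from command: " reply
    close(cmd)
}')
echo "$out"
rm -rf "$tmp"
# prints:
# from file: first line
# from command: hello
```

Checking that getline returned a value greater than 0 guards against end of file and read errors.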

Slide 20: rand() srand()
– The rand() function generates pseudo-random numbers in the range 0 to 1 (0 ≤ n < 1).
– Given the same seed, it will always generate the same series of numbers.
– srand() is used to supply a new seed to rand().
– If you don't supply srand() a value, it uses the current time as the seed.
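A small sketch of seeding: the shell function roll (a name invented here) seeds the generator and prints five simulated die rolls. Running it twice with the same seed should give the same series.

```shell
roll() {
    awk -v seed="$1" 'BEGIN {
        srand(seed)
        # int(rand() * 6) + 1 maps a value in [0,1) onto the integers 1..6.
        for (i = 1; i <= 5; i++)
            printf "%d ", int(rand() * 6) + 1
        print ""
    }'
}

first=$(roll 42)
second=$(roll 42)
echo "$first"
echo "$second"
```

The actual numbers depend on the awk implementation, so none are shown; what matters is that the two lines match.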

Slide 21: system()
– The system() function allows you to execute system commands within an awk script.
– You must enclose the system command in quotation marks.
– You cannot capture the output from the system() function within the script, but you can capture the return code.
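To illustrate capturing the return code, this sketch runs the standard true and false commands, chosen only because their exit statuses are predictable.

```shell
out=$(awk 'BEGIN {
    rc_ok = system("true")      # a command that succeeds
    rc_fail = system("false")   # a command that fails
    print "true returned", rc_ok
    print "false returned", rc_fail
}')
echo "$out"
```

The first value is 0 and the second is non-zero, which is enough to branch on success or failure inside the script.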

Slide 22: length()
– The length([argument]) function returns the length of the argument in bytes.
– If you give length() a number, it returns the length of the number's string form (the number of digits for an integer).
– If you don't give length() an argument, it will use $0 by default.

Slide 23: index()
– The index(string,target) function returns the position (counting from 1) of the first occurrence of the target within the string, or 0 if the target does not occur.
– The index() function is often used to set the boundary for the substr() function.

Slide 24: substr()
– The substr(string,start[,length]) function returns the part of the string beginning at position start and continuing for length bytes.
– If you don't give it a length, it returns everything from start to the end of the string.
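The string functions from the last three slides work well together. As a sketch, the made-up input user=alice is cut at the = sign: index() finds the boundary and substr() takes each side.

```shell
out=$(echo 'user=alice' | awk '{
    pos = index($0, "=")          # position of the first "="
    key = substr($0, 1, pos - 1)  # everything before it
    val = substr($0, pos + 1)     # everything after it, to the end of the string
    print key, val, length(val), length()   # length() with no argument uses $0
}')
echo "$out"
# prints: user alice 5 10
```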

Slide 25: split()
– Use split(string, array[, separator]) to divide a string into parts at each separator, storing the resulting parts in the array.
– If you don't code a separator, the function uses the field separator (FS) to parse the string.
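A brief sketch of both forms of split(), with an invented input line. The function's return value is the number of parts it stored.

```shell
out=$(echo '2024-01-15 alpha beta' | awk '{
    n = split($1, parts, "-")     # explicit separator: split the date on "-"
    m = split($0, words)          # no separator: FS (white space) is used
    print n, parts[1], parts[2], parts[3]
    print m, words[3]
}')
echo "$out"
# prints:
# 3 2024 01 15
# 3 beta
```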

Slide 26: if
– Besides using patterns, if gives us another way to perform selection.
– The format of an if statement is: if (condition) {verb(s)} [else {verb(s)}]
– If you have more than one verb, they must be enclosed in French braces.

Slide 27: if conditions

Slide 28: if
A sample if
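The slide's own sample is not part of this transcript. In its place, here is a hedged sketch of an if/else with made-up scores and a pass mark of 60; the if branch has two verbs, so it needs braces, while the single-verb else does not.

```shell
out=$(printf 'alice 90\nbob 55\n' | awk '{
    if ($2 >= 60) {
        status = "pass"     # two verbs, so braces are required
        passed++
    } else
        status = "fail"     # one verb, braces optional
    print $1, status
}
END { print "passed:", passed }')
echo "$out"
# prints:
# alice pass
# bob fail
# passed: 1
```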

Slide 29: exit
– The input file is closed.
– Control is transferred to the action associated with the END magic pattern, if there is one.
– Generally used as a bailout in case of catastrophic errors.
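The behaviour above can be seen in a small sketch: the invented marker line STOP triggers exit with status 2, the remaining input is never read, and END still runs.

```shell
status=0
out=$(printf '1\n2\nSTOP\n3\n' |
  awk '$1 == "STOP" { exit 2 }              # bail out; the line "3" is never processed
       { sum += $1 }
       END { print "sum so far:", sum }') || status=$?
echo "$out"
echo "exit status: $status"
# prints:
# sum so far: 3
# exit status: 2
```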

Slide 30: for loop
– This is a counted loop; it executes until the counter reaches the target value.
– The counter can increment (count up) or decrement (count down).
– It also works with the elements of an array.
– Multiple verbs must be enclosed in { }.

Slide 31: for loop example
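The original example is missing from the transcript. The sketch below, with invented input, shows a counted loop that decrements (printing the fields in reverse) and the array form of for.

```shell
out=$(echo 'a b c' | awk '{
    # Counted loop: walk the fields from last to first.
    for (i = NF; i >= 1; i--) {
        rev = rev $i
        if (i > 1) rev = rev " "
    }
    print rev

    # Array form: visit every index of an array (the order is unspecified).
    split("x y z", letters)
    for (k in letters) count++
    print count
}')
echo "$out"
# prints:
# c b a
# 3
```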

Slide 32: while loop
– The while loop is an example of conditional execution.
– The loop cycles as long as the specified condition is true.
– A while loop always checks the condition before it executes, so the body may never run.
– Multiple verbs must be enclosed in { }.

Slide 33: while loop example
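The original example is missing from the transcript. As a stand-in, this sketch counts how many times a number (100, chosen arbitrarily) can be halved before it reaches 1.

```shell
out=$(echo 100 | awk '{
    n = $1
    steps = 0
    while (n > 1) {         # tested before every pass, including the first
        n = int(n / 2)
        steps++
    }
    print steps
}')
echo "$out"
# prints: 6
```

With an input of 1 the condition is false at the start, the body never runs, and the result is 0.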

Slide 34: do/while
– Even though it has a while in it, this is an example of until logic: the body runs once before the condition is tested.
– Until logic is shunned by conscientious coders.
– 'Nuff said.

Slide 35: break
– Used to exit from a loop.
– Control is passed to the line following the end of the loop.
– Causes an exit from the loop but NOT the awk script. If you want to bail out of the whole script, use the exit command.

Slide 36: break example
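The original example is missing from the transcript. In this sketch, with invented numbers, the loop adds up fields and breaks at the first negative value; the script itself keeps running afterwards.

```shell
out=$(echo '4 8 -1 15 16' | awk '{
    for (i = 1; i <= NF; i++) {
        if ($i < 0)
            break           # leave the loop; fields 4 and 5 are never added
        sum += $i
    }
    # Execution continues here, on the line after the loop.
    print "sum before first negative:", sum
    print "stopped at field", i
}')
echo "$out"
# prints:
# sum before first negative: 12
# stopped at field 3
```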

Slide 37: continue
– Causes awk to skip the rest of the body of the loop for the current value.
– In a for loop, the counter is incremented and the next cycle of the loop is started.
– In a while loop, the condition is tested again and the next iteration of the loop starts.

Slide 38: continue example
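The original example is missing from the transcript. This sketch reuses the same invented numbers as a break illustration would, but skips the negative value instead of stopping at it.

```shell
out=$(echo '4 8 -1 15 16' | awk '{
    for (i = 1; i <= NF; i++) {
        if ($i < 0)
            continue        # skip this field; the loop moves on to the next one
        sum += $i
    }
    print "sum of non-negative fields:", sum
}')
echo "$out"
# prints: sum of non-negative fields: 43
```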

Slide 39: next
– Causes the script to start over from the first pattern/action pair.
– Takes the next element (record) from standard input or the target file.
– Like exit, this command affects the whole script.

Slide 40: next example
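The original example is missing from the transcript. A common use of next, sketched here with invented input, is to discard comment lines and blank lines before any other pattern/action pair sees them.

```shell
out=$(printf '# comment\nalice 90\n\nbob 72\n' |
  awk '/^#/ || NF == 0 { next }      # skip comments and blank lines entirely
       { print NR ": " $1 }')        # only reached by the remaining records
echo "$out"
# prints:
# 2: alice
# 4: bob
```

NR keeps counting the skipped records, which is why the output shows 2 and 4 rather than 1 and 2.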