Introduction to Awk

Awk is a convenient and expressive programming language that can be applied to a wide variety of computing and data manipulation tasks.


Awk
- Works well on record-type data
- Reads input file(s) a line at a time
- Parses each line into fields
- Performs user-defined tests against each line, performs actions on matches

Other Common Uses
- Input validation
  - Does every record have the same # of fields?
  - Do values make sense (negative time, hourly wage > $1000, etc.)?
- Filtering out certain fields
- Searches
  - Who got a zero on lab 3?
  - Who got the highest grade?
- Many others
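
A few of these checks can be sketched in a line of awk each. The file name grades.txt and its layout (a name followed by scores for labs 1 to 3, so lab 3 is field 4) are assumptions made for illustration; they are not part of the slides.

```shell
cd "$(mktemp -d)"

# Hypothetical data: name, then lab 1-3 scores (the last line is short on purpose)
cat > grades.txt <<'EOF'
ann 90 85 0
bob 70 95 88
cy 60 75
EOF

# Input validation: report every record that does not have exactly 4 fields
awk 'NF != 4 { print "line " NR ": expected 4 fields, got " NF }' grades.txt

# Search: who got a zero on lab 3? (NF == 4 guards against a missing field,
# which would otherwise compare equal to 0)
awk 'NF == 4 && $4 == 0 { print $1 }' grades.txt

# Search: who got the highest lab 1 grade? (assumes grades are positive)
awk '$2 > max { max = $2; who = $1 } END { print who, max }' grades.txt
```

The three commands print `line 3: expected 4 fields, got 3`, `ann`, and `ann 90` respectively.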

Invocation
- Can write little one-liners on the command line (very handy). Print the 3rd field of every line:
    $ awk '{ print $3 }' input.txt
- Execute an awk script file:
    $ awk -f script.awk input.txt
- Or, use this sha-bang as the first line, and give your script execute permissions:
    #!/bin/awk -f
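
The three invocation styles can be tried side by side. This is a minimal sketch: the file names input.txt, script.awk, and third.awk and the sample data are assumptions, and the shebang line is built from `command -v awk` because the location of awk differs between systems (/bin/awk on some, /usr/bin/awk on others).

```shell
cd "$(mktemp -d)"
printf 'a b c\nd e f\n' > input.txt

# 1. One-liner on the command line
awk '{ print $3 }' input.txt

# 2. The same program stored in a script file and run with -f
printf '{ print $3 }\n' > script.awk
awk -f script.awk input.txt

# 3. Executable script: shebang as the first line, then chmod +x
printf '#!%s -f\n{ print $3 }\n' "$(command -v awk)" > third.awk
chmod +x third.awk
./third.awk input.txt
```

Each of the three runs prints the third field of both lines: `c`, then `f`.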

Form of an AWK program
- AWK programs are entries of the form:
    pattern { action }
- pattern: some test, looking for a pattern (regular expression) or a C-like condition
  - if null, the action is applied to every line
- action: a statement or set of statements
  - if not provided, the default action is to print the entire line, much like grep

Form of an AWK program
- Input files are parsed, a record (line) at a time
- Each line is checked against each pattern, in order
- There are 2 special patterns:
  - BEGIN: true before any records are read
  - END: true at end of input (after all records have been read)
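
A small program that uses every kind of rule described here may make the order of evaluation easier to see. The file name rules.awk and the three input lines are assumptions for illustration.

```shell
cd "$(mktemp -d)"

cat > rules.awk <<'EOF'
BEGIN   { print "start" }              # true once, before any record is read
/^b/    { print "starts with b:", $1 } # regular-expression pattern plus action
$2 >= 2                                # C-like condition, no action: prints the whole line
END     { print NR, "records" }        # true once, after all records have been read
EOF

printf 'alpha 1\nbeta 2\ngamma 3\n' | awk -f rules.awk
```

The output is `start`, `starts with b: beta`, `beta 2`, `gamma 3`, `3 records`, one per line. The record `beta 2` matches two patterns, so both rules fire for it, in program order.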

Awk Features
- Patterns can be regular expressions or C-like conditions.
- Each line of the input is matched against the patterns, one after the next. If a match occurs, the corresponding action is performed.
- Input lines are parsed and split into fields, which are accessed by $1,…,$NF, where NF is a variable set to the number of fields.
- The variable $0 contains the entire line, and by default lines are split by white space (blanks, tabs).
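
Field splitting can be observed directly with a one-liner. The two input lines are made up for illustration; the second contains a double space to show that, with the default separator, runs of white space count as a single separator while $0 keeps the line unchanged.

```shell
printf 'one two three\nfour  five\n' |
awk '{ print NF ": first=" $1 ", last=" $NF ", whole=[" $0 "]" }'
```

This prints `3: first=one, last=three, whole=[one two three]` followed by `2: first=four, last=five, whole=[four  five]`.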

Variables
- Not declared, nor typed
- No character type
- Only strings and floats (support for ints)
- $n refers to the nth field (where n is some integer value)
    # prints each field on the line
    for( i=1; i<=NF; ++i ) print $i
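
How an untyped variable behaves depends on how it is used. The following sketch (the file name vars.awk and the variable names are arbitrary) shows the same value acting as a number and as a string.

```shell
cd "$(mktemp -d)"

cat > vars.awk <<'EOF'
BEGIN {
    x = "3"          # assigned a string
    y = x + 4        # used in arithmetic, so x is treated as the number 3
    z = x " apples"  # used in concatenation, so x is treated as the string "3"
    print y
    print z
    print n + 0      # a variable that was never set acts as 0 in arithmetic
    print 7 / 2      # numbers are floating point: no integer truncation
}
EOF

awk -f vars.awk
```

The program prints `7`, `3 apples`, `0`, and `3.5`.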

Some Built-in Variables
- FS: the input field separator
- OFS: the output field separator
- NF: # of fields; changes w/each record
- NR: the # of records read (so far). So, the current record #
- FNR: the # of records read so far, reset for each named file
- $0: the entire input line
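
The difference between NR and FNR only shows up with more than one input file. In this sketch the file names and contents are assumptions, and FILENAME is a further standard built-in variable (the name of the current input file) that the slide does not list.

```shell
cd "$(mktemp -d)"
printf 'a:b\nc:d\n' > first.txt
printf 'e:f:g\n' > second.txt

# FS splits the input on ":", OFS is placed between the items print outputs
awk 'BEGIN { FS = ":"; OFS = " | " } { print FILENAME, NR, FNR, NF, $1 }' first.txt second.txt
```

The output is `first.txt | 1 | 1 | 2 | a`, `first.txt | 2 | 2 | 2 | c`, `second.txt | 3 | 1 | 3 | e`: NR keeps counting across files while FNR restarts at 1.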

Example
    $ cat emp.data
    Beth
    Dan
    Kathy
    Mark
    Mary
    Susie
Print pay for those employees who actually worked:
    $ awk '$3>0 {print $1, $2*$3}' emp.data
    Kathy 40
    Mark 100
    Mary 121
    Susie 76.5
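
The numeric columns of emp.data did not survive in this transcript. The sketch below assumes the layout name, hourly rate, hours worked, with values taken from the well-known emp.data example in The AWK Programming Language; they are an assumption, though they do reproduce the output shown on the slide.

```shell
cd "$(mktemp -d)"

# Assumed contents: name, hourly rate, hours worked
cat > emp.data <<'EOF'
Beth 4.00 0
Dan 3.75 0
Kathy 4.00 10
Mark 5.00 20
Mary 5.50 22
Susie 4.25 18
EOF

# Pattern $3 > 0 selects employees with hours; the action prints name and pay
awk '$3 > 0 { print $1, $2 * $3 }' emp.data
```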

Example: CSV file
    $ cat students.csv
    smith,john,js12
    jones,fred,fj84
    bee,sue,sb23
    fife,ralph,rf86
    james,jim,jj22
    cook,nancy,nc54
    banana,anna,ab67
    russ,sam,sr77
    loeb,lisa,guitarHottie
    $ cat get s.awk
    #!/bin/awk -f
    BEGIN { FS = "," }
    { printf( "%s's is: $2, $3 ); }
    $ get s.awk students.csv
    john's is:
    fred's is:
    sue's is:
    ralph's is:
    jim's is:
    nancy's is:
    anna's is:
    sam's is:
    lisa's is:
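
The script name and the printf format string on this slide are incomplete in the transcript, so the script cannot be run as shown. A minimal working sketch of the same idea follows; the name ids.awk, the word "id", and the choice to print field 3 unchanged are assumptions, not the original author's text.

```shell
cd "$(mktemp -d)"
printf 'smith,john,js12\njones,fred,fj84\n' > students.csv

cat > ids.awk <<'EOF'
BEGIN { FS = "," }                       # split records on commas
{ printf("%s's id is: %s\n", $2, $3) }   # printf needs an explicit newline
EOF

awk -f ids.awk students.csv
```

With these two records it prints `john's id is: js12` and `fred's id is: fj84`. Keeping the program in a file avoids having to escape the apostrophe inside a single-quoted shell argument.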

Example: output separator
    $ cat out.awk
    #!/bin/awk -f
    BEGIN { FS = ","; OFS = "-*-"; }
    { print $1, $2, $3; }
    $ out.awk students.csv
    smith-*-john-*-js12
    jones-*-fred-*-fj84
    bee-*-sue-*-sb23
    fife-*-ralph-*-rf86
    james-*-jim-*-jj22
    cook-*-nancy-*-nc54
    banana-*-anna-*-ab67
    russ-*-sam-*-sr77
    loeb-*-lisa-*-guitarHottie

Flow Control
- Awk syntax is much like C
- Same loops, if statements, etc.
- AWK: Aho, Weinberger, Kernighan
- Kernighan and Ritchie wrote The C Programming Language (Ritchie designed C itself)
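
The C-like control statements can be seen together in one short action. This is only an illustrative sketch; the file name flow.awk, the variable names, and the input numbers are made up.

```shell
cd "$(mktemp -d)"

cat > flow.awk <<'EOF'
{
    n = $1
    if (n % 2 == 0) kind = "even"; else kind = "odd"

    total = 0
    for (i = 1; i <= n; i++) total += i          # sum of 1..n

    steps = 0
    while (n > 1) { n = int(n / 2); steps++ }    # number of halvings to reach 1

    print $1, kind, total, steps
}
EOF

printf '3\n10\n' | awk -f flow.awk
```

For the inputs 3 and 10 this prints `3 odd 6 1` and `10 even 55 3`.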

Associative Arrays
- Awk also supports arrays that can be indexed by arbitrary strings. They are implemented using hash tables.
    Total["Sue"] = 100;
- It is possible to loop over all indices that have currently been assigned values.
    for (name in Total) print name, Total[name];

Example using Associative Arrays
    $ cat scores
    Fred 90
    Sue 100
    Fred 85
    Sam 70
    Sue 98
    Sam 50
    Fred 70
    $ cat total.awk
    { Total[$1] += $2}
    END { for (i in Total) print i, Total[i]; }
    $ awk -f total.awk scores
    Sue 198
    Sam 120
    Fred 245
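
The order in which `for (i in Total)` visits the indices is unspecified, so the order of the output lines can differ between awk implementations. When a stable order matters, one option is to pipe the result through sort. The sketch below does that and, as an assumed extension, keeps a second array to compute each person's average.

```shell
cd "$(mktemp -d)"

cat > scores <<'EOF'
Fred 90
Sue 100
Fred 85
Sam 70
Sue 98
Sam 50
Fred 70
EOF

# total[] accumulates the scores, count[] the number of scores per name
awk '{ total[$1] += $2; count[$1]++ }
     END { for (name in total) print name, total[name], total[name] / count[name] }' scores | sort
```

After sorting, the lines are `Fred 245 81.6667`, `Sam 120 60`, and `Sue 198 99`.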

Useful one-liners
- Line count:
    awk 'END {print NR}'
- grep:
    awk '/pat/'
- head:
    awk 'NR<=10'
- Add line #s to a file:
    awk '{print NR, $0}'
    awk '{ printf( "%5d %s\n", NR, $0 )}'
- Many more. See the resources tab on the course webpage for links to more examples.