9 The sed Editor Mauro Jaskelioff (based on slides by Gail Hopkins)



Introduction
sed is a Stream Editor
Designed to edit files in a batch fashion
  – Not interactive
Often used for text substitution
When you have multiple changes to make to one or more files:
  – Write down the changes in an editing script
  – Apply the script to all the files

What does sed do?
Used to edit input streams
  – Input stream can be from a file, from a pipe or from the keyboard
Produces results on standard output
  – …but results can be put in a file or sent through a pipe

Typical Uses of sed
Editing one or more files automatically
  – E.g. replace all occurrences of a string within a file with a different string
Simplifying repetitive edits to multiple files
  – E.g. perform the same operation on lots of similar files

How Does sed Work?
Each line of input is copied, one at a time, into an internal buffer known as a “pattern space”
All editing commands in a sed script are applied, in order, to each line of input (in the buffer)
Editing commands are applied to every input line as it passes through the buffer
  – Unless line addressing is used to restrict the lines affected

How does sed Work? (2)
If a sed command changes the input, the next command will apply to this new (changed) line of input, not the original one
  – More on this later!
sed script:
    s/caterpillars/spiders/
    s/crawl/run/
Pattern space:
    Furry caterpillars crawl slowly     (line as read)
    Furry spiders crawl slowly          (after s/caterpillars/spiders/)
    Furry spiders run slowly            (after s/crawl/run/)
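A minimal sketch of the behaviour on this slide, using the slide's own two commands. The input line is fed in with printf rather than read from a file, and the variable names are only for illustration.

```shell
# Two substitutions applied, in order, to one line of input.
line='Furry caterpillars crawl slowly'

# The second command sees the line as already changed by the first.
result=$(printf '%s\n' "$line" | sed -e 's/caterpillars/spiders/' -e 's/crawl/run/')

printf '%s\n' "$result"
```

The line printed is the final state of the pattern space, after both commands have run.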

How does sed Work? (3)
When sed edits an input file, the original input file is unchanged
  – The editing commands modify a copy of each original line of input
  – When sed outputs the result, it is the copy that is sent to STDOUT (or redirected to a file)
sed keeps a separate buffer, known as the “hold space”
  – Can be used to save data for later retrieval
  – For most edits this isn’t needed - only if a command refers to it
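To check that the input file really is left alone, something like the following could be tried. The temporary directory and the file name creatures are assumptions made for this sketch.

```shell
# Create a throwaway input file (hypothetical name and contents).
tmpdir=$(mktemp -d)
printf 'ant\nworm\n' > "$tmpdir/creatures"

# sed writes the edited copy to standard output...
edited=$(sed 's/ant/flea/' "$tmpdir/creatures")
# ...while the file on disk keeps its original contents.
original=$(cat "$tmpdir/creatures")

printf 'edited:\n%s\noriginal:\n%s\n' "$edited" "$original"
rm -r "$tmpdir"
```

To keep the edited text, the output would have to be redirected to a different file.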

How to Run sed from the Command Line
sed [-n] [-e] 'command' file(s)
  – For specifying an editing command on the command line
  – E.g.:
        sed 's/ant/flea/g' myCreaturesFile
        sed -e 's/ant/flea/g' -e 's/worm/slug/g' myCreaturesFile
    (what does this mean??? - more about sed commands shortly…)
sed [-n] -f scriptfile file(s)
  – For specifying a scriptfile containing sed commands
  – E.g.:
        sed -f myScript myCreaturesFile
If no file is specified, sed reads from STDIN
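A sketch of the command-line forms, run against a one-line sample file invented for the purpose. It also shows sed reading from STDIN when no file is named.

```shell
tmpdir=$(mktemp -d)
printf 'one ant, one worm\n' > "$tmpdir/myCreaturesFile"   # sample data (assumed)

# One command on the command line (-e is optional when there is only one).
single=$(sed 's/ant/flea/g' "$tmpdir/myCreaturesFile")
# Several commands: each one gets its own -e.
multi=$(sed -e 's/ant/flea/g' -e 's/worm/slug/g' "$tmpdir/myCreaturesFile")
# No file named: sed reads standard input.
piped=$(printf 'ant\n' | sed 's/ant/flea/g')

printf '%s\n%s\n%s\n' "$single" "$multi" "$piped"
rm -r "$tmpdir"
```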

The -n flag
sed can be given a -n option
  – This tells sed NOT to write the contents of the pattern space to stdout by default:
        sed -n 's/ant/flea/g' myCreaturesFile
  – Another way of specifying this is to put #n on the first line of a sed script
Why do we want to stop sed’s output?
  – We can then tell sed to print specific lines of output, rather than every line:
        sed -n 's/swan/coot/p' myCreaturesFile
  – NOTE the p in the above example…
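The effect of -n together with the p flag can be seen by running the same substitution both ways. The three input lines are made up for this sketch.

```shell
input=$(printf 'swan\nduck\nswan lake\n')

# Default: every line is printed, changed or not.
default_out=$(printf '%s\n' "$input" | sed 's/swan/coot/')
# With -n and the p flag: only lines where the substitution happened.
quiet_out=$(printf '%s\n' "$input" | sed -n 's/swan/coot/p')

printf 'default:\n%s\nwith -n and p:\n%s\n' "$default_out" "$quiet_out"
```

With -n but without the p flag, nothing at all would be printed.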

sed Regular Expressions
sed uses regular expressions
The format of these is very similar to those used by grep

sed Regular Expressions
Symbol   Matches                                        Example
^        Beginning of line                              /^He/     Line starts with He
$        End of line                                    /nd$/     Line ends in nd
.        Any single character                           /./       Would match a, b, 1, 2, and so on…
*        0 or more occurrences of preceding character   /we*/     Matches w, we, wee, weee, etc…
\?       0 or 1 occurrence of preceding character       /we\?/    Matches w or we (GNU sed; a plain ? is an ordinary character unless extended regular expressions are switched on with sed -E)
[ ]      Any character enclosed in [ ]                  [abc]     Matches a, b or c
[^]      Any character NOT enclosed in [ ]              [^abc]    Matches d, e, f, etc. but NOT a, b or c

sed Regular Expression (2)
Symbol           Matches                                                   Example
\{m,n\}          m-n repetitions of preceding character                    x\{1,3\}    Matches x, xx or xxx
\{m,\}           m or more repetitions of preceding character              y\{4,\}     Matches yyyy, yyyyy, yyyyyy, etc…
\{,n\}           n or fewer (possibly 0) repetitions of preceding character   we\{,5\}    Matches weeeee, weeee, weee, wee, we or w (a GNU extension; the portable spelling is \{0,n\})
\{n\}            Exactly n repetitions of preceding character              z\{6\}      Matches zzzzzz
\(expression\)   Group operator or region of interest                      SEE LATER EXAMPLE
\n               nth group                                                 SEE LATER EXAMPLE
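A few of the symbols from the two tables, tried on made-up input. This is only a sketch: the word list is an assumption, and the patterns are anchored with ^ and $ so that a line is printed only when the whole line matches.

```shell
words=$(printf 'w\nwe\nwee\nweee\n')

# we* : w followed by zero or more e, so every line in the list matches.
star=$(printf '%s\n' "$words" | sed -n '/^we*$/p')
# we\{2,3\} : w followed by two or three e.
interval=$(printf '%s\n' "$words" | sed -n '/^we\{2,3\}$/p')
# \(ab\)\1 : the group "ab" followed by the same text again.
backref=$(printf 'abab\n' | sed 's/\(ab\)\1/X/')

printf '%s\n---\n%s\n---\n%s\n' "$star" "$interval" "$backref"
```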

sed Commands - Syntax
sed instructions consist of addresses and editing commands
They have the general form:
    [address[,address]][!]command [arguments]
  – NOTE: here, [] denotes something is optional
  – address[,address]: zero, one or two addresses
  – !: if present then it means anything NOT in the address(es) stated
  – command: the sed command to be executed
  – arguments: optional arguments to the command
Therefore: if the address of the command matches the line in the pattern space (internal buffer), the command is applied to that line

sed Addresses
A sed command can have 0, 1 or 2 addresses
An address in a sed command can be:
  – A line number
  – The symbol $ (meaning the last line)
  – A regular expression enclosed in slashes (/regex/)
Therefore, an address can be thought of as “something that matches” in the pattern space

sed Addresses (2)
If no address is specified:
  – The command applies to each input line
If one address is specified:
  – The command applies to any line matching the address
  – REMEMBER: an address can be a regular expression!
If two comma-separated addresses are specified:
  – The command applies to the first matching line and all succeeding lines up to and including a line matching the second address
If an address followed by ! is specified:
  – The command applies to all lines that DO NOT match the address
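Each kind of address can be tried with the p command and -n, so that only the addressed lines are printed. The five-line input is invented for this sketch.

```shell
input=$(printf 'one\ntwo\nthree\nfour\nfive\n')

by_number=$(printf '%s\n' "$input" | sed -n '2p')          # line 2 only
last_line=$(printf '%s\n' "$input" | sed -n '$p')          # the last line
by_regex=$(printf '%s\n' "$input" | sed -n '/^t/p')        # lines starting with t
by_range=$(printf '%s\n' "$input" | sed -n '2,/four/p')    # line 2 up to a line matching four
negated=$(printf '%s\n' "$input" | sed -n '/^t/!p')        # lines NOT starting with t

printf '%s\n' "$by_number" "$last_line" "$by_regex" "$by_range" "$negated"
```

Note that a range may mix address types, as in 2,/four/ above.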

sed Commands
Consist of a single letter or symbol
  – They tell sed to “do something” to the text at the address specified
  – E.g.:
      s means substitute
      g is a flag to the s command. It means global, or all occurrences of… (more on this later)
          sed 's/ant/flea/g' myCreaturesFile
      …means substitute all occurrences of the string ant with the string flea in the file myCreaturesFile
  – …in this example, no address is specified and so sed applies the command to every line of input

sed Commands (2)
Another example:
    sed -n '/^squirrel/,/^swift/p' myCreaturesFile
Print everything between the line starting squirrel and the line starting swift, inclusive
Here, there are 2 addresses, both are regular expressions:
  – /^squirrel/ The first address is the first line matching “squirrel” at the start of the line
  – /^swift/ The second address is the next line after that matching “swift” at the start of the line
  – REMEMBER: regular expressions are written between / and /
sed therefore prints the first matching line (with squirrel at the start) and all succeeding lines up to and including a line matching the second address (with swift at the start)

sed Commands (3)
An example using !
    sed '/aardvark/!d' myCreaturesFile
  – Delete any line that doesn’t contain the text “aardvark” in the file myCreaturesFile
An example using line numbers:
    sed '5s/wombat/womble/g' myCreaturesFile
  – Substitute all occurrences of wombat with womble on line 5

Putting more than one sed Element in a Command
An example of two elements together:
  – Input file:
        a, a, ants on my arm
        a, a, ants on my arm
        a, a, ants on my arm
        they’re causing me alarm!
  – Command:
        sed -e 's/ant/flea/g' -e 's/alarm/to itch/g' myCreaturesFile
  – Output:
        a, a, fleas on my arm
        a, a, fleas on my arm
        a, a, fleas on my arm
        they’re causing me to itch!

Putting more than one sed Element in a Command (2)
Input file:
    At the top of the tree there were 4 parrots and 2 lizards
Command:
    sed -e 's/parrot/lizard/g' -e 's/lizard/koala/g' myCreaturesFile
Output from sed:
    At the top of the tree there were 4 koalas and 2 koalas
Why???

…because sed read in the line in the file and executed:
  – s/parrot/lizard/g
…to produce the text:
    At the top of the tree there were 4 lizards and 2 lizards
sed then performed the command:
  – s/lizard/koala/g
  – …on this new edited line to produce:
    At the top of the tree there were 4 koalas and 2 koalas
REMEMBER from previously:
  – If a sed command changes the input, the next command will apply to this new (changed) line of input, not the original one
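Because each command works on the result of the one before, the order of the commands matters. A small sketch using the slide's own line; the reversed ordering is an addition for comparison.

```shell
line='At the top of the tree there were 4 parrots and 2 lizards'

# Order from the slide: parrots become lizards, then ALL lizards become koalas.
chained=$(printf '%s\n' "$line" | sed -e 's/parrot/lizard/g' -e 's/lizard/koala/g')
# Reversed order: the original lizards are renamed first, so the two stay distinct.
reordered=$(printf '%s\n' "$line" | sed -e 's/lizard/koala/g' -e 's/parrot/lizard/g')

printf '%s\n%s\n' "$chained" "$reordered"
```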

Summary of sed Commands (4)
Basic Editing
    a\   append text after a line
    c\   replace text
    i\   insert text before a line
    d    delete lines
    s    substitute
    y    translate characters
Line Information
    =    display line number of a line
    p    display the line
    l    display control characters in ascii
Yanking and Putting
    h    copy into hold space; clear out what’s there
    H    copy into hold space; append to what’s there
    g    get the hold space back; wipe out the destination line
    G    get the hold space back; append to the pattern space
    x    exchange contents of hold space and pattern space
Input/Output Processing
    n    skip current line and go to line below
    r    read another file’s contents into the output stream
    w    write input lines to another file
    q    quit the sed script
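A handful of the commands in this summary, each tried on the same three made-up lines. This is a sketch rather than a tour of every command; the last example is a well-known one-liner that uses the hold space to print the input in reverse order.

```shell
input=$(printf 'ant\nbee\ncat\n')

# y: translate characters a->A, b->B, c->C.
translated=$(printf '%s\n' "$input" | sed 'y/abc/ABC/')
# q: quit after line 2 (the line is still printed before quitting).
first_two=$(printf '%s\n' "$input" | sed '2q')
# =: print the line number; addressed with $ it gives the number of lines.
line_count=$(printf '%s\n' "$input" | sed -n '$=')
# G and h: on every line but the first, append the hold space to the
# pattern space; then copy the pattern space to the hold space; on the
# last line, print the accumulated (reversed) text.
reversed=$(printf '%s\n' "$input" | sed -n '1!G;h;$p')

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' "$translated" "$first_two" "$line_count" "$reversed"
```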

Examples of commonly used sed Commands
s
    sed 's/dog/cat/' myfile
        substitute the first occurrence of dog with cat on each line of myfile
    sed 's/dog/cat/g' myfile
        substitute all occurrences of dog with cat in myfile
    sed 's/dog/cat/4' myfile
        on every line in myfile with at least 4 “dog” strings, substitute the 4th occurrence of dog with cat
    sed '1,2s/dog/cat/g' myfile
        substitute all occurrences of dog with cat in the first 2 lines of myfile ONLY
    sed '/dog/,/cat/s/.*//' myfile
        look for a line containing the text dog, followed later by a line containing the text cat. Blank out those lines plus all lines (possibly more than one) in between; the lines become empty but are not deleted. Repeat until end of file myfile. s/.*// means substitute all text found for an empty string
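The s examples can be compared side by side on a single line that contains dog five times. The input is invented for this sketch.

```shell
line='dog dog dog dog dog'

first=$(printf '%s\n' "$line" | sed 's/dog/cat/')      # first occurrence only
all=$(printf '%s\n' "$line" | sed 's/dog/cat/g')       # every occurrence
fourth=$(printf '%s\n' "$line" | sed 's/dog/cat/4')    # the 4th occurrence only

# A range of lines blanked out with s/.*//: the lines stay, but empty.
blanked=$(printf 'a\ndog\nb\ncat\nc\n' | sed '/dog/,/cat/s/.*//')

printf '%s\n%s\n%s\n---\n%s\n' "$first" "$all" "$fourth" "$blanked"
```

To remove the lines in a range altogether, the d command would be used in place of s/.*//.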

Examples of commonly used sed Commands (2)
d
    sed '1,2d' myfile
        delete lines 1 and 2 of myfile
    sed '5d' myfile
        delete the fifth line from myfile
    sed '/^#/d' myfile
        delete all lines starting with # in myfile
p
    sed -n '/BEGIN/,/END/p' myfile
        find a line containing BEGIN and print that line and all following lines up to and including a line containing END. Note: if there is no END, sed will still print all text after BEGIN due to its stream oriented nature - it doesn’t know there is no END until it gets to the end of the file!
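The d and p examples, run on a small made-up input, including the case where the closing END never appears.

```shell
input=$(printf '# header\nalpha\nBEGIN\nbeta\nEND\ngamma\n')

without_first_two=$(printf '%s\n' "$input" | sed '1,2d')
no_comments=$(printf '%s\n' "$input" | sed '/^#/d')
block=$(printf '%s\n' "$input" | sed -n '/BEGIN/,/END/p')
# No END in the input: the range simply runs to the end of the stream.
open_ended=$(printf 'x\nBEGIN\ny\nz\n' | sed -n '/BEGIN/,/END/p')

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' \
    "$without_first_two" "$no_comments" "$block" "$open_ended"
```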

Flags to commands
sed commands can be given flags. We have already seen the substitute command with the g flag:
  – s/lizard/koala/g
    (g is a flag to the s command. It tells s to substitute ALL occurrences of…)
Other flags to s are:
  – n - replace the nth occurrence of pattern with replacement text (n is a number)
    e.g. sed 's/dog/cat/4' myfile
  – p - print pattern space to stdout if substitution successful
    e.g. sed -n 's/dog/cat/p' myfile

Flags to Commands (2)
  – w filename - write the pattern space to filename if substitution successful
    e.g. sed 's/dog/cat/w resultsfile' myfile
NOTE: here there should be exactly ONE SPACE between the w and resultsfile (this is the form that works everywhere)
resultsfile will contain only those lines that sed applied the substitution to
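A sketch of the w flag in action. The sample lines and the temporary directory are assumptions; the sed command itself is the one from the slide, run from inside that directory so that resultsfile is created there.

```shell
tmpdir=$(mktemp -d)
printf 'dog\nbird\nhotdog\n' > "$tmpdir/myfile"

# Standard output still receives every line...
shown=$(cd "$tmpdir" && sed 's/dog/cat/w resultsfile' myfile)
# ...while resultsfile receives only the lines that were changed.
saved=$(cat "$tmpdir/resultsfile")

printf 'stdout:\n%s\nresultsfile:\n%s\n' "$shown" "$saved"
rm -r "$tmpdir"
```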

Running sed from a Script
sed commands can be put in a file called a script
E.g. script.sed:
    # this is my sed script
    s/horse/cow/g
    s/chicken/duck/g
    s/newt/lizard/g
(the line starting with # is a comment in sed)
…and run from the command line:
    $ sed -f script.sed myCreaturesFile
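The script on this slide can be tried end to end as follows. The script text is the slide's own; the contents of myCreaturesFile and the temporary directory are assumptions made for the sketch.

```shell
tmpdir=$(mktemp -d)

# Write the script file; the quoted 'EOF' stops the shell touching its contents.
cat > "$tmpdir/script.sed" <<'EOF'
# this is my sed script
s/horse/cow/g
s/chicken/duck/g
s/newt/lizard/g
EOF

printf 'a horse, a chicken and a newt\n' > "$tmpdir/myCreaturesFile"
result=$(sed -f "$tmpdir/script.sed" "$tmpdir/myCreaturesFile")

printf '%s\n' "$result"
rm -r "$tmpdir"
```

Inside a script file the commands need no shell quoting, which is one reason longer edits are easier to keep in a file.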

Piping to and from sed (and a much more complicated example!)
The UNIX who command gives an output:
    $ who
    zliybbs  pts/5  Apr 8 19:11
    zliybsj2 pts/6  Apr 8 18:42 (ss-226-host39.nottingham.edu.cn)
    zliybyk2 pts/9  Apr 6 14:30
    zliybyk2 pts/10 Apr 6 14:31
    zliybbs  pts/11 Apr 8 19:15 ( )
    zliybyy2 pts/12 Apr 8 20:10
    zliybwj  pts/15 Apr 6 14:34
    zuczpd   pts/17 Apr 6 14:44
    zuczpd   pts/18 Apr 6 14:44
    zuczpd   pts/19 Apr 6 14:44 (ss-226-host67.nottingham.edu.cn)
    zuczpd   pts/20 Apr 6 14:45 (ss-226-host67.nottingham.edu.cn)
    zlizmj   pts/1  Apr 9 08:49 ( )

Piping to and from sed (2) (and a much more complicated example!)
If we wanted to extract only the machine names from this output, we could use the following command:
    who | sed -n 's/.*(\(.*\))/\1/p'
What ON EARTH does this mean???? ☺

who | sed -n 's/.*(\(.*\))/\1/p'
  – who |    Take the output from the UNIX who command and pipe it onto sed
  – .*(      Take everything up to and including the open bracket “(” …
  – \(       This denotes the start of a region of interest
  – .*       Take everything after the open bracket “(” up to, but not including, the close bracket “)” and keep it for future referencing in a region of interest
  – \)       This denotes the end of a region of interest
  – )        The close bracket itself
  – \1       …and substitute it all with the region of interest that was saved earlier, referenced as number 1 (REMEMBER from earlier: \n means nth group)
  – -n … p   Print only the lines where the substitution succeeded

Piping to and from sed (3)
If we then wanted to sort the result into alphabetical order, we could pipe it onto sort:
    who | sed -n 's/.*(\(.*\))/\1/p' | sort
We could then redirect the whole output to a file:
    who | sed -n 's/.*(\(.*\))/\1/p' | sort > machines.txt
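Since the output of who differs from machine to machine, the pipeline can be tried on canned text instead. The sketch below reuses four lines of the sample output from the earlier slide; feeding them in with printf in place of who is an assumption made so that the result is repeatable.

```shell
# Four lines taken from the sample who output.
who_output='zliybbs  pts/5  Apr 8 19:11
zliybsj2 pts/6  Apr 8 18:42 (ss-226-host39.nottingham.edu.cn)
zuczpd   pts/19 Apr 6 14:44 (ss-226-host67.nottingham.edu.cn)
zuczpd   pts/20 Apr 6 14:45 (ss-226-host67.nottingham.edu.cn)'

# Keep only the text inside the brackets, then sort it.
# Lines without brackets fail the substitution and so are not printed.
machines=$(printf '%s\n' "$who_output" | sed -n 's/.*(\(.*\))/\1/p' | sort)

printf '%s\n' "$machines"
```

A machine with several sessions appears several times; piping through sort -u instead of sort would list each name once.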

An Example of Data Manipulation using sed
Suppose we had a file names.txt in the form forename:surname (with a colon in between):
    Steve:Bradford
    Saun:Higgins
    Gail:Hopkins
    Sara:Mead
    Fred:Smith
    Henry:Taylor
…and we wanted to reverse the names so that they were in the order surname,forename (with a comma in between)…

An Example of Data Manipulation using sed (2)
    sed -e 's/\(.*\):\(.*\)/\2,\1/' names.txt
…would produce the following output:
    Bradford,Steve
    Higgins,Saun
    Hopkins,Gail
    Mead,Sara
    Smith,Fred
    Taylor,Henry
EXPLANATION: This uses regions of interest. It puts the forename in a region of interest and then puts the surname in another region of interest. It then outputs the second region of interest followed by the first.
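The swap only gives tidy output when there are no spaces around the colon, because any spaces are captured inside the regions of interest. A possible variant, offered as a sketch and not as part of the original slides, first squeezes out optional spaces around the colon and then applies the slide's command unchanged.

```shell
names='Steve:Bradford
Saun : Higgins
Gail : Hopkins'

# First command: remove any spaces on either side of the colon.
# Second command: the swap from the slide.
reversed=$(printf '%s\n' "$names" |
    sed -e 's/ *: */:/' -e 's/\(.*\):\(.*\)/\2,\1/')

printf '%s\n' "$reversed"
```

This relies on the second command seeing the line as already changed by the first.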

Using Different Delimiters
Often, / is used in sed scripts as a delimiter
However, other characters can be used as delimiters instead
  – sed takes the character that immediately follows the s as the delimiter
All of these are therefore equally viable:
    s/horse/cow/g
    s,horse,cow,g
    s:horse:cow:g
    s$horse$cow$g
Why would we want a different delimiter?

Using Different Delimiters (2)
Suppose we had an HTML file which we wanted to convert to XHTML
  – We therefore want to change all occurrences of each unclosed empty tag to its self-closing form (which ends in />), and so on…
  – Written as s/…/…/g, the slash in the replacement would clash with the delimiter; written as s:…:…:g it does not
Here we have used : as a delimiter because there are slashes in the data
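A sketch of why a different delimiter helps. The path example and the br tag are illustrative assumptions, not taken from the slides; with / as the delimiter every slash in the data has to be escaped, with : none do.

```shell
path_line='docs live in /usr/local/doc'

# With / as the delimiter, each slash in the pattern and replacement is escaped.
escaped=$(printf '%s\n' "$path_line" | sed 's/\/usr\/local/\/opt/')
# With : as the delimiter, the same edit needs no escaping.
readable=$(printf '%s\n' "$path_line" | sed 's:/usr/local:/opt:')

# The same idea for markup: close an empty tag in the XHTML style.
xhtml=$(printf 'one<br>two\n' | sed 's:<br>:<br />:g')

printf '%s\n%s\n%s\n' "$escaped" "$readable" "$xhtml"
```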

sed Tries to Match the Longest Expression!
Suppose we had an HTML file and we wanted to remove all the markup
We could instruct sed to find a ‘<’ character, followed by any characters, followed by a ‘>’ character:
    sed -e 's/<.*>//g' UST.html
This would remove much more than the markup: on a line holding several tags, the text between them disappears too
Why??

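sed Tries to Match the Longest Expression! (2)
…because sed tries to find the longest expression that matches: on each line, <.*> stretches from the first ‘<’ to the last ‘>’, swallowing any text in between
…instead, we need to specify that sed looks for a ‘<’ character, followed by any number of characters that are not ‘>’, followed by a ‘>’ character:
    sed -e 's/<[^>]*>//g' UST.html
sed will then match each tag on its own, one after another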
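The difference between the two patterns shows up on any line with more than one tag. The HTML line below is a hypothetical sample, not the contents of UST.html.

```shell
html='<p>Welcome to the <b>UST</b> website.</p>'

# Greedy: <.*> runs from the first < to the last >, i.e. the whole line.
greedy=$(printf '%s\n' "$html" | sed 's/<.*>//g')
# Bounded: <[^>]*> cannot run past the first >, so each tag matches on its own.
careful=$(printf '%s\n' "$html" | sed 's/<[^>]*>//g')

printf 'greedy:  [%s]\ncareful: [%s]\n' "$greedy" "$careful"
```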

Character Classes - POSIX Compliant sed
Often in sed you want to specify a regular expression that contains white space (TABs, spaces, etc.)
POSIX compliant sed offers a simple way of doing this with a character class:
    sed 's/[[:space:]]//g' myfile
Character classes give you a way of specifying, within a regular expression, types of characters to search for

Character Classes (2)
[:alnum:]    Alphanumeric [a-z A-Z 0-9]
[:alpha:]    Alphabetic [a-z A-Z]
[:blank:]    Spaces or tabs
[:cntrl:]    Any control characters
[:digit:]    Numeric digits [0-9]
[:graph:]    Any visible characters (no whitespace)
[:lower:]    Lower-case [a-z]
[:print:]    Non-control characters
[:punct:]    Punctuation characters
[:space:]    Whitespace
[:upper:]    Upper-case [A-Z]
[:xdigit:]   Hex digits [0-9 a-f A-F]
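A few of these classes in use, on input invented for this sketch. Note that a class name such as [:space:] is written inside a bracket expression, hence the double brackets.

```shell
line='Room 42b, ready!'

no_space=$(printf '%s\n' "$line" | sed 's/[[:space:]]//g')        # drop spaces
digits_masked=$(printf '%s\n' "$line" | sed 's/[[:digit:]]/#/g')  # hide digits
no_punct=$(printf '%s\n' "$line" | sed 's/[[:punct:]]//g')        # drop , and !
# [:space:] covers tabs as well as spaces.
no_tab=$(printf 'a \tb\n' | sed 's/[[:space:]]//g')

printf '%s\n%s\n%s\n%s\n' "$no_space" "$digits_masked" "$no_punct" "$no_tab"
```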

Summary
An introduction to sed
Format of sed statements
Addresses
Types of command
Putting sed inside a script
Some more advanced examples of sed