CS 497C – Introduction to UNIX Lecture 29: - Filters Using Regular Expressions – grep and sed Chin-Chih Chang
Regular Expressions
egrep's extended set includes two special characters, + and ?. They are often used in place of * to restrict the matching scope.
+ matches one or more occurrences of the previous character.
? matches zero or one occurrence of the previous character.
$ egrep "true?man" emp.lst
This pattern matches both truman and trueman.
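To see how + and ? narrow a match compared with *, here is a minimal sketch. It assumes a small hypothetical file, names.txt, because the contents of emp.lst are not shown in these slides. It also uses grep -E, which accepts the same extended expressions as egrep on current systems.

```shell
# Hypothetical sample data (not the lecture's emp.lst).
printf '%s\n' 'truman' 'trueman' 'trueeman' 'trman' > names.txt

# ? : zero or one "e" after "tru"
grep -E 'true?man' names.txt    # prints truman and trueman

# + : one or more "e" after "tru"
grep -E 'true+man' names.txt    # prints trueman and trueeman

# * : zero or more "e", the least restrictive of the three
grep -E 'true*man' names.txt    # prints truman, trueman and trueeman
```

None of the three patterns matches trman, because the literal "tru" must be present in every case.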
Regular Expressions
The |, ( and ) characters can be used to search for multiple patterns.
$ egrep 'wood(house|cock)' emp.lst
sed is a multipurpose tool which combines the work of several filters. Designed by Lee McMahon, it is derived from the ed line editor. sed is used to perform noninteractive operations.
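The alternation pattern shown with egrep above can be tried as follows. This is a sketch on a hypothetical file, surnames.txt, with invented entries; grep -E is used as the equivalent of egrep.

```shell
# Hypothetical sample data (not the lecture's emp.lst).
printf '%s\n' 'woodhouse' 'woodcock' 'woodland' 'sengupta' 'dasgupta' > surnames.txt

# Grouping with ( ) limits the alternatives to the part after "wood".
grep -E 'wood(house|cock)' surnames.txt    # prints woodhouse and woodcock

# Without grouping, | separates two complete patterns.
grep -E 'sengupta|dasgupta' surnames.txt   # prints sengupta and dasgupta
```

The line woodland is skipped by the first command because neither alternative follows "wood".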
sed: The Stream Editor
sed has numerous features, almost bordering on a programming language, although many of its functions have been taken over by perl. Everything in sed is an instruction. An instruction combines an address for selecting lines with an action to be taken on them:
sed options 'address action' file(s)
The address and action are enclosed within single quotes.
sed: The Stream Editor
The components of a sed instruction are shown below:
sed '1,$ s/^bold/BOLD/g' foo
Here 1,$ is the address and s/^bold/BOLD/g is the action.
You can have multiple instructions in a single sed command, each with its own address and action components. Addressing in sed is done in two ways:
– By line number (like 3,7p).
– By specifying a pattern (like /From:/p).
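The address and action split can be sketched on a small file. The contents of foo below are invented for illustration, and the use of -e to supply several instructions is one standard way to do it rather than something these slides spell out.

```shell
# Hypothetical contents for foo.
printf '%s\n' 'bold text, bold again' 'not bold here' 'bold' > foo

# Line address 1,$ (every line) with a substitution as the action.
# ^ anchors the match, so only a leading "bold" is changed.
sed '1,$ s/^bold/BOLD/g' foo
# BOLD text, bold again
# not bold here
# BOLD

# Pattern address: act only on lines that begin with "not".
sed '/^not/ s/bold/BOLD/' foo
# bold text, bold again
# not BOLD here
# bold

# Two instructions, each with its own address and action.
sed -e '1s/^bold/BOLD/' -e '3s/^bold/Bold/' foo
# BOLD text, bold again
# not bold here
# Bold
```

In every case the unaddressed lines still appear in the output, since sed passes them through unchanged.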
Line Addressing
In the first form, the address specifies either a single line or a pair of line numbers (3,7) to select a group of contiguous lines. The second form uses one or two patterns. In either case, the action (p, the print command) is appended to the address. You can simulate head -3 with the 3q instruction, in which 3 is the address and q is the quit action.
Line Addressing
$ sed '3q' emp.lst
sed uses the p (print) command to print the addressed lines.
$ sed '1,2p' emp.lst
By default, sed prints all lines on the standard output in addition to the lines affected by the action, so the addressed lines are printed twice.
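The quit and print behaviour described above can be checked on a four-line file. This is a sketch with a hypothetical sample.lst; the final command uses sed's standard -n option, which is not covered in the slides above and is included only to contrast with the doubled output.

```shell
# Hypothetical sample data (not the lecture's emp.lst).
printf '%s\n' 'line1' 'line2' 'line3' 'line4' > sample.lst

# Quit after line 3: same result as head -n 3.
sed '3q' sample.lst
# line1
# line2
# line3

# p prints lines 1 and 2, and sed's default output prints every line as well,
# so lines 1 and 2 appear twice.
sed '1,2p' sample.lst
# line1
# line1
# line2
# line2
# line3
# line4

# -n suppresses the default output, leaving only what p prints.
sed -n '1,2p' sample.lst
# line1
# line2
```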