Regular Expressions and Grep


1 Regular Expressions and Grep
Michael Hoffman

2 What will be covered
Regular expressions
Grep and its command line syntax
Grep usage
A brief look at regular expressions comes first, since grep depends heavily on them.

3 Regular Expressions
Dictate patterns which match strings
Used with grep for finer-grained searches
Several syntax variants: POSIX syntax, Perl syntax
- Regular expressions are patterns that match sets of strings.
- There are several variants of regular expression syntax, but most fall into two families: POSIX syntax and Perl syntax. By default, grep uses POSIX syntax.

4 Regular Expression Syntax
A string matches a regular expression only when all of its conditions are fulfilled
Lone characters in a regular expression are literals
Most power comes from metacharacters
- A string matches a regular expression only when every condition in the expression is fulfilled.
- Ordinary characters in a regular expression are literals. That is, they match only themselves.
- Most of the power of regular expressions comes from metacharacters: characters or short character sequences that carry a special meaning, such as repetition or "any character", instead of matching themselves.

5 Common Metacharacters
Metacharacter   Description and examples
.               A period matches any single character except a newline.
[ ]             A bracket expression matches any single character listed between the brackets. If the first character is ^, it matches any single character NOT listed.
                [abcd] matches a, b, c, or d
                [a-z] matches any character from a to z
                [^abcd] matches any character except a, b, c, or d
*               Matches the previous item zero or more times.
                [ab]* matches the empty string, a, b, aa, bb, ab, ba, and so on
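The metacharacters above can be tried directly from a shell. This is a minimal sketch, assuming a POSIX shell with grep available; the sample words and the variable names are invented for illustration.

```shell
# Sample input, one word per line (invented for illustration).
words='cat
cot
ct
bat
coat'

# '.' matches exactly one character: c, any one character, then t.
dot=$(printf '%s\n' "$words" | grep 'c.t')

# '[ao]' matches exactly one character from the listed set.
set=$(printf '%s\n' "$words" | grep 'c[ao]t')

# '[^c]' matches one character that is NOT c; here, the one before "at".
negated=$(printf '%s\n' "$words" | grep '[^c]at')

# '*' repeats the previous item zero or more times, so "ct" matches too.
star=$(printf '%s\n' "$words" | grep 'c[ao]*t')

printf 'dot:\n%s\nset:\n%s\nnegated:\n%s\nstar:\n%s\n' \
    "$dot" "$set" "$negated" "$star"
```

Note that grep prints every line that contains a match somewhere, which is why coat is selected by '[^c]at' (through its "oat").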

6 Common Metacharacters (cont)
Metacharacter   Description and examples
?               Makes the previous item optional: it is matched at most once.
+               Matches the previous item one or more times.
{n}             Matches the previous item exactly n times.
                A{3} matches only AAA
{n,m}           Matches the previous item at least n times, but no more than m times. m may be omitted; some implementations, such as GNU grep, also allow n to be omitted.
                a{3,5} matches aaa, aaaa, and aaaaa
                a{1,} matches a, aa, etc.
                a{,2} matches the empty string, a, and aa
In grep's default (basic) syntax, ?, +, { and } are literal characters; the forms shown here require the -E option (extended syntax).
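The repetition operators can be checked the same way. This is a rough sketch, assuming a POSIX shell and a grep that accepts -E (extended syntax) and -x (whole-line match); the sample lines and variable names are invented for illustration.

```shell
# Sample input (invented for illustration).
lines='a
aa
aaa
aaaa
color
colour'

# -E selects extended regular expressions, where ? + and {} work unescaped.
# -x requires the whole line to match, so the repetition counts are exact.
optional=$(printf '%s\n' "$lines" | grep -E -x 'colou?r')   # the u is optional
one_plus=$(printf '%s\n' "$lines" | grep -E -x -c 'a+')     # count of lines made only of a's
exactly3=$(printf '%s\n' "$lines" | grep -E -x 'a{3}')      # exactly three a's
range=$(printf '%s\n' "$lines" | grep -E -x 'a{2,3}')       # two or three a's
at_least=$(printf '%s\n' "$lines" | grep -E -x 'a{3,}')     # three or more a's

printf '%s\n' "$optional" "$one_plus" "$exactly3" "$range" "$at_least"
```

Without -x, 'a{3}' would also select aaaa, because that line contains three consecutive a's.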

7 Bracket Expressions
Class       Equivalent to (in the C locale)
[:alnum:]   [a-zA-Z0-9]
[:alpha:]   [a-zA-Z]
[:digit:]   [0-9]
[:lower:]   [a-z]
[:upper:]   [A-Z]
Bracket expressions in grep can also use named character classes, which refer to ranges of characters. A character class must itself appear inside a bracket expression, for example [[:digit:]].

8 Regular Expression Examples
Expression        Matches
[^b]at            Any three-character string ending in "at" except bat
[hc]+at           hat, cat, chat, ccchat, but NOT at
[hc]?at           hat, cat, at
[[:alnum:]]{2}    Any two-character alphanumeric combination: aa, bb, a1, Ac
The first expression matches any single character except b, followed by the literals a and t.
The second matches any number of h's and c's, but at least one is required, which is why at, which has neither an h nor a c, is not a match.
The third matches at most one h or c. Since the question mark does not require its item to be present, at is also a match.
The fourth matches any two alphanumeric characters. Note that the character class must be enclosed in a second set of brackets.
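The four expressions can be run against a small word list to confirm the matches described. This is a sketch under stated assumptions: a POSIX shell, grep with -E for the + ? and {} operators, and -x so that each line must match in full. The word list and variable names are invented for illustration.

```shell
# Word list, one candidate per line (invented for illustration).
words='at
bat
cat
hat
chat
ccchat
a1
Ac
a-'

# Any single character except b, then "at".
first=$(printf '%s\n' "$words" | grep -x '[^b]at')

# One or more h's and c's, then "at".
second=$(printf '%s\n' "$words" | grep -E -x '[hc]+at')

# At most one h or c, then "at".
third=$(printf '%s\n' "$words" | grep -E -x '[hc]?at')

# Exactly two alphanumeric characters; note the double brackets.
fourth=$(printf '%s\n' "$words" | grep -E -x '[[:alnum:]]{2}')

printf 'first:\n%s\nsecond:\n%s\nthird:\n%s\nfourth:\n%s\n' \
    "$first" "$second" "$third" "$fourth"
```

The fourth expression also selects the plain word at, since both of its characters are alphanumeric, while a- is rejected.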

9 Grep
Standard on all Unix systems, widely ported to others
Originally a feature of the Unix editor ed
Global – Regular Expression – Print
Case sensitive
grep options pattern input_file_names
Grep is standard on all Unix and Unix-like systems, and implementations have been ported to many other systems. Grep was originally a feature of the Unix text editor ed, which was one of the first programs to use regular expressions. Its name comes from the ed command that invoked the feature, g/re/p: Global Regular Expression Print. Like most Unix utilities, grep is case sensitive, although this can be changed with a command line option. Grep is invoked with the syntax shown above, where options are optional command line arguments, pattern is the regular expression to search for, and input_file_names is the list of files to search.

10 Grep command line options
Long option            Short option   Effect
--regexp=              -e             Specifies a pattern to be used for the search.
--file=                -f             Specifies a file to read patterns from.
--ignore-case          -i             Ignores case in both the pattern and the searched files.
--invert-match         -v             Inverts the behavior of grep, printing all lines NOT matching the pattern.
--word-regexp          -w             Returns only matches that form whole words.
--line-regexp          -x             Returns only matches that match the whole line.
--count                -c             Instead of printing matches, prints the number of matching lines in each input file.
--files-without-match  -L             Prints the name of each file which does NOT contain the pattern.
--files-with-matches   -l             Prints the name of each file containing a match.
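The effect of these options is easiest to see on a small file. The following is a sketch only, assuming a POSIX shell, mktemp, and a grep that supports -L (GNU and BSD grep do); the file names a.txt and b.txt, their contents, and the variable names are invented for illustration.

```shell
# Work in a throwaway directory so the sketch does not touch real files.
dir=$(mktemp -d)
printf 'Hat\nhat\nwhat\nthat hat\n' > "$dir/a.txt"
printf 'nothing here\n' > "$dir/b.txt"

ignore=$(grep -i -c 'hat' "$dir/a.txt")             # -i: all four lines count
invert=$(grep -v 'hat' "$dir/a.txt")                # -v: lines WITHOUT "hat"
word=$(grep -w 'hat' "$dir/a.txt")                  # -w: "hat" as a whole word
whole=$(grep -x 'hat' "$dir/a.txt")                 # -x: the line is exactly "hat"
multi=$(grep -c -e 'Hat' -e 'what' "$dir/a.txt")    # -e: either pattern may match
with=$(cd "$dir" && grep -l 'hat' a.txt b.txt)      # -l: files that contain a match
without=$(cd "$dir" && grep -L 'hat' a.txt b.txt)   # -L: files with no match

printf '%s\n' "$ignore" "$invert" "$word" "$whole" "$multi" "$with" "$without"
rm -rf "$dir"
```

The -w result includes the line "that hat" because its second word is hat, even though the "hat" inside "that" does not qualify.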

11 Grep command line options (cont)
Long option       Short option   Effect
--only-matching   -o             Displays only the matching text instead of the entire line.
--no-messages     -s             Suppresses errors about nonexistent or unreadable files.
--recursive       -r, -R         Every directory given as an input file is entered and read recursively.
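A short sketch of these three options follows, assuming a POSIX shell, mktemp, and a grep that supports -o and -r (GNU and BSD grep do). The directory layout, file names, and file contents are invented for illustration.

```shell
# Build a small throwaway source tree.
dir=$(mktemp -d)
mkdir "$dir/source"
printf 'int main(void)\n' > "$dir/source/main.c"
printf 'call somefunc(1); somefunc(2);\n' > "$dir/source/util.c"

# -o prints each matching piece on its own line instead of the whole line.
# In grep's default syntax, ( and ) are literal characters.
only=$(grep -o 'somefunc([0-9])' "$dir/source/util.c")

# -r descends into the directory; -l lists the files that contain a match.
recursive=$(cd "$dir" && grep -r -l 'somefunc' source)

# -s hides the error message for the missing file. grep may still exit
# with a non-zero status because of it, so the status is ignored here.
quiet=$(grep -s 'main' "$dir/missing.c" "$dir/source/main.c" 2>&1 || true)

printf '%s\n' "$only" "$recursive" "$quiet"
rm -rf "$dir"
```

Because two file names are given in the last command, grep prefixes the matching line with the name of the file it came from.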

12 Examples Using grep
grep Michael authors.txt
grep -i POSIX software.txt
grep -i -w hat story.txt
grep -l 'int main(' *.c
grep -c -r 'somefunc(' *.c source
grep 'w.*t' story.txt
The first example searches for all occurrences of the name Michael in the file authors.txt. Because grep is case sensitive, michael with a lowercase m, or any other combination of cases, will not be matched.
The second searches for the string POSIX in the file software.txt. Note that we have explicitly turned off case sensitivity with the -i switch.
The third searches for the word hat in story.txt. Again, we turn off case sensitivity. We also supply the -w switch to indicate that we want to match the word hat, but not words containing it, like what or hate.
The fourth searches all the .c files in the current directory for the string int main(. The lowercase -l switch changes the output from displaying each line where a match is found to displaying the names of the files where matches are found.
The fifth searches all the .c files in the current directory and recursively searches all the files in the folder source. The -c switch changes the output from showing the matches found to displaying the number of matching lines for each file.
The final example finds every line in story.txt that contains a w followed, anywhere later on the same line, by a t.
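Two of these examples can be reproduced without any files by piping text into grep. This is a minimal sketch, assuming a POSIX shell; the story text stands in for story.txt and is invented for illustration, as are the variable names.

```shell
# Stand-in for the contents of story.txt (invented for illustration).
story='What a fine hat
I hate rain
wait for it
wet'

# -i ignores case and -w requires a whole word, so "What" and "hate"
# do not count; only the line containing the word "hat" is printed.
hat=$(printf '%s\n' "$story" | grep -i -w 'hat')

# 'w.*t' selects whole lines in which a lowercase w is followed,
# somewhere later on the line, by a t.
wt=$(printf '%s\n' "$story" | grep 'w.*t')

printf 'hat:\n%s\nwt:\n%s\n' "$hat" "$wt"
```

The first line of the story is not selected by 'w.*t' because its only W is uppercase and the search is case sensitive.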

