Published by Lester Webb. Modified over 9 years ago.
1
Unix Filters
Text processing utilities
2
Filters
Filter commands are Unix commands that serve dual purposes:
- standalone
- used with other commands and pipes
Why are filters important?
- The output from a command may be overwhelming. Recall the output from:
    # command to list all of the files in the /home directory
    ls -l /home
  Filters are a way of reducing output to only the pertinent data.
- We may wish to organize the data presented.
- We may wish to reformat the data presented.
3
Filter Commands
Filter commands include:
- head
- tail
- more
- sort
- uniq
- wc
- diff
- cut
- paste
- cmp
- comm
- tr
- cat
- grep
- sed
- awk
4
head
head outputs the first part of files.
Format: head [options] filename
Selected options:
- -n : print the first n lines (the default is 10)
Examples:
  head -5 abc    displays the first 5 lines of file abc (equivalently: head -n 5 abc)
  head abc       displays the first 10 lines of file abc
5
tail
tail outputs the last part of files.
Format: tail [options] filename
Selected options:
- -n : print the last n lines (the default is 10)
Example:
  head abc | tail -5    prints lines 6 through 10 of file abc (assuming abc has at least 10 lines)
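A minimal sketch of head and tail working together, assuming a 12-line sample file. The file name abc comes from the slides; its contents and the scratch directory are made up for illustration.

```shell
# Build a 12-line sample file named abc in a scratch directory (contents are hypothetical).
tmp=$(mktemp -d)
i=1
while [ "$i" -le 12 ]; do
  echo "line $i"
  i=$((i + 1))
done > "$tmp/abc"

first3=$(head -n 3 "$tmp/abc")               # first three lines
last2=$(tail -n 2 "$tmp/abc")                # last two lines
middle=$(head -n 10 "$tmp/abc" | tail -n 5)  # lines 6 through 10

printf '%s\n' "$first3" "$last2" "$middle"
rm -r "$tmp"
```

The pipeline form works because head trims the file to its first 10 lines before tail keeps the last 5 of what remains.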
6
more
more displays files on a page-by-page basis.
Format: more [options] filename
Options (selected):
- -n : number of lines to show at a time
- -d : prompt with help text, and display a message instead of ringing the bell when an invalid key is pressed
- +/pattern : begin at pattern
Commands to use while in more:
- space : advance a page at a time
- return : advance a line at a time
- q : quit more
7
more Examples
  more abc        lists file abc one page/line at a time
  ls -l | more    lists directory contents one page/line at a time
8
sort
sort sorts lines of text files; usually (but not always) used with redirection.
Format: sort [options] inputfiles
Options (selected):
- -b : ignore leading blanks
- -f : ignore case
- -r : sort descending rather than ascending
- -M : sort using month order (i.e. Jan comes before Feb)
- -n : sort numerically rather than by string
- -u : unique, eliminate duplicate lines
- -tchar : use char as field separator (instead of whitespace)
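The difference between string order and -n is easiest to see on a small list of numbers. This is a sketch under assumptions: the file name nums and its contents are hypothetical, and LC_ALL=C is set only to make the string ordering predictable.

```shell
# Hypothetical sample data: five numbers, one per line, with one duplicate.
tmp=$(mktemp -d)
printf '%s\n' 10 9 100 9 2 > "$tmp/nums"

as_text=$(LC_ALL=C sort "$tmp/nums")             # compared as strings: "10" sorts before "2"
as_number=$(LC_ALL=C sort -n "$tmp/nums")        # compared by numeric value
desc_unique=$(LC_ALL=C sort -n -r -u "$tmp/nums") # numeric, descending, duplicates removed

printf 'string order:  %s\n' "$(echo $as_text)"
printf 'numeric order: %s\n' "$(echo $as_number)"
printf 'desc, unique:  %s\n' "$(echo $desc_unique)"
rm -r "$tmp"
```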
9
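sort
We can also sort files by specifying fields. When using fields to sort, you may also use characters within a field.
The field specifier has the following format: +number1 -number2 (start after skipping number1 fields, stop at the end of field number2).
Examples:
- sort +0 -1 file1 (uses the first field to sort the file)
- sort -n +4 -5 file1 (sorts on the fifth field, values in numerical order)
- sort -t, +1 -2 file1 (uses , as the delimiter instead of white space)
Note: the +number1 -number2 form is obsolete and current versions of sort may reject it. The equivalent -k forms are sort -k 1,1 file1, sort -n -k 5,5 file1 and sort -t, -k 2,2 file1.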
10
Sort Fields
field1  field2     field3      field4    field5  field6  field7
0102    Smith,     Bob         May       12,     1992    2
3055    Ye,        Chan        April     1,      1987    8
2337    McFadden,  Mabel       December  12,     1991    21
8441    Gupta,     Soumyaroop  May       1,      1992    19
1198    Crockett,  Bob         June      5,      1989    1
11
sort
When sorting on multiple fields, you may also use the -k pos1[,pos2] option. To indicate the format of the key to be sorted, place the format letter adjacent to the field number. A key with no pos2 runs to the end of the line; write, for example, -k 2,2n to restrict it to one field.
Examples:
Command: sort -t" " -k 2n -k 5M -k 6 filename
Result:  sorts filename using a space delimiter: by the 2nd field in numeric order, then by the 5th field in month-name order, then by the 6th field
Command: sort -t"," -k 3,4 filename
Result:  sorts filename on the 3rd and 4th comma-separated fields
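A sketch of -k with per-key format letters, assuming a small comma-separated file. The file name people.csv and its columns (id, name, month, amount) are hypothetical. -M is a common extension rather than a POSIX option, so availability is an assumption too.

```shell
# Hypothetical comma-separated records: id,name,month,amount
tmp=$(mktemp -d)
cat > "$tmp/people.csv" <<'EOF'
0102,Smith,May,2
3055,Ye,Apr,8
2337,McFadden,Dec,21
8441,Gupta,May,19
1198,Crockett,Jun,1
EOF

# Key 4 only, compared numerically.
by_amount=$(LC_ALL=C sort -t, -k 4,4n "$tmp/people.csv")

# Key 3 in month order first; ties (the two May rows) are broken by key 4 numerically.
by_month=$(LC_ALL=C sort -t, -k 3,3M -k 4,4n "$tmp/people.csv")

printf '%s\n' "$by_amount"
echo
printf '%s\n' "$by_month"
rm -r "$tmp"
```

Writing each key as N,N keeps the comparison inside one field instead of letting it run to the end of the line.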
12
uniq
uniq removes duplicate adjacent lines from a file.
To ensure that all of the data is unique you should sort the file first (recall that sort can also produce unique values with the -u option).
Format: uniq [options] filename
Options (selected):
- -c : count the instances
- -d : print only the duplicates (once)
- -u : print only the unique lines
13
uniq Examples
uniq -u abc
  prints only the lines of file abc that are not repeated, assuming duplicates are next to each other. (Without -u, uniq prints every line once; the second and later copies of a repeated line are not printed.)
uniq -c abc
  prints each line once, preceded by the number of times it occurred, assuming duplicates are next to each other
sort abc | uniq -c
  puts abc in order, then prints each distinct line with a count of how many times it appears
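The effect of sorting first can be checked on a small file whose duplicates are not adjacent. This is only a sketch; the file name fruit and its contents are hypothetical.

```shell
# Hypothetical input: duplicates exist, but none of them are adjacent.
tmp=$(mktemp -d)
printf '%s\n' apple pear apple fig pear apple > "$tmp/fruit"

# Without sorting, uniq finds no adjacent duplicates and keeps all six lines.
unsorted=$(uniq "$tmp/fruit" | wc -l | tr -d ' ')

counts=$(LC_ALL=C sort "$tmp/fruit" | uniq -c)    # each distinct line with its count
repeated=$(LC_ALL=C sort "$tmp/fruit" | uniq -d)  # lines that occur more than once
singles=$(LC_ALL=C sort "$tmp/fruit" | uniq -u)   # lines that occur exactly once

echo "lines kept without sorting: $unsorted"
printf '%s\n' "$counts"
echo "repeated: $(echo $repeated)"
echo "singles:  $singles"
rm -r "$tmp"
```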
14
wc
wc prints the number of newlines, words, and bytes in files.
Format: wc [options] filename
Options:
- -c : print the number of bytes in the file (the same as the number of characters for single-byte text; -m counts characters)
- -w : print the number of words in the file
- -l : print the number of lines in the file
If no options are given, wc prints newlines, words, and bytes, in that order.
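A short sketch of the three counts on a two-line file. The file name notes and its contents are hypothetical. Reading from standard input keeps the file name out of the output, and tr strips the padding that some versions of wc add.

```shell
# Hypothetical two-line file: "one two" and "three".
tmp=$(mktemp -d)
printf 'one two\nthree\n' > "$tmp/notes"

lines=$(wc -l < "$tmp/notes" | tr -d ' ')
words=$(wc -w < "$tmp/notes" | tr -d ' ')
bytes=$(wc -c < "$tmp/notes" | tr -d ' ')   # 8 bytes for "one two\n" plus 6 for "three\n"

echo "lines=$lines words=$words bytes=$bytes"
rm -r "$tmp"
```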
15
cut
cut extracts (cuts out) selected columns or fields from each line.
Format: cut [options] filename
Options:
- -c list* : the list of columns (character positions) to cut
- -f list* : the list of fields to cut
- -dchar : the delimiter for fields. Only one delimiter may be specified (tab is the default)
- -s : suppress lines without delimiters
* Lists are specified using integers separated by commas; ranges are written with a hyphen (-). Examples: 1,5-7 means units 1, 5, 6, 7, and 1,5- means unit 1 plus unit 5 and beyond to the end of the line.
16
cut Examples:
cut -d: -f5 /etc/passwd
  (print only the 5th field, delimited by a colon (:), of the /etc/passwd file)
cut -d" " -f2 shuffled
  (print only the 2nd field, delimited by a space, from file shuffled)
cut -c4-8 shuffled
  (print only columns 4 through 8 of each line in shuffled)
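The same options can be tried without touching the real /etc/passwd. The sketch below assumes a small passwd-style file; the file name accounts and every record in it are invented for illustration.

```shell
# Hypothetical passwd-style records, plus one line that has no colon at all.
tmp=$(mktemp -d)
cat > "$tmp/accounts" <<'EOF'
alice:x:1000:1000:Alice Liddell:/home/alice:/bin/sh
bob:x:1001:1001:Bob Stone:/home/bob:/bin/sh
nodelimiter
EOF

names=$(cut -s -d: -f5 "$tmp/accounts")      # 5th field; -s drops the line without a colon
shells=$(cut -s -d: -f1,7 "$tmp/accounts")   # fields 1 and 7, still joined by the delimiter
prefix=$(cut -c1-3 "$tmp/accounts")          # character columns 1 through 3 of every line
no_s=$(cut -d: -f1 "$tmp/accounts")          # without -s, the colon-free line passes through whole

printf '%s\n' "$names" "$shells" "$prefix" "$no_s"
rm -r "$tmp"
```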
17
paste
The paste command merges the lines of one or more files into vertical columns separated by a tab.
Example: paste testfile testfile2 > outputfile
Contents of outputfile (the two columns are separated by a tab):
this is firstline    this is testfile2
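A small sketch of paste on two two-line files. The file names left and right and their contents are hypothetical. The second call uses -d to swap the default tab for a comma.

```shell
# Two hypothetical input files with two lines each.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/left"
printf '1\n2\n' > "$tmp/right"

tabbed=$(paste "$tmp/left" "$tmp/right")        # line N of each file joined by a tab
comma=$(paste -d, "$tmp/left" "$tmp/right")     # same pairing, comma as the separator

printf '%s\n' "$tabbed" "$comma"
rm -r "$tmp"
```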
18
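diff, cmp and comm
The diff command compares two files line by line and prints the differences between them.
Syntax: diff [options] files
Options include: -b (ignore changes in the amount of white space), -w (ignore all white space), -i (ignore case)
The cmp command compares two files byte by byte. It has two options: -l (list every differing byte) and -s (silent; report only through the exit status).
The comm command compares two sorted files line by line and reports the lines unique to each file and the lines common to both.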
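The three commands answer slightly different questions about the same pair of files. This is a sketch assuming two short, already sorted lists; the file names old and new and their contents are hypothetical.

```shell
# Two hypothetical sorted lists that share two lines.
tmp=$(mktemp -d)
printf '%s\n' apple banana cherry > "$tmp/old"
printf '%s\n' banana cherry date > "$tmp/new"

# comm prints three columns; -1, -2 and -3 suppress the matching column.
common=$(comm -12 "$tmp/old" "$tmp/new")     # lines present in both files
only_old=$(comm -23 "$tmp/old" "$tmp/new")   # lines only in the first file
only_new=$(comm -13 "$tmp/old" "$tmp/new")   # lines only in the second file

# cmp -s prints nothing; its exit status says whether the files are identical.
if cmp -s "$tmp/old" "$tmp/new"; then same=yes; else same=no; fi

# diff exits non-zero when the files differ, so "|| true" keeps a strict shell going.
changes=$(diff "$tmp/old" "$tmp/new" || true)

echo "common: $(echo $common)"
echo "only in old: $only_old, only in new: $only_new, identical: $same"
printf '%s\n' "$changes"
rm -r "$tmp"
```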
19
tr: translating characters
The tr command manipulates individual characters in a character stream.
Format: tr [options] string1 string2
When executed, the program reads from standard input and writes to standard output. It takes two sets of characters as parameters and replaces each occurrence of a character in the first set (string1) with the corresponding character from the second set (string2).
20
Examples:
$ tr "aeiou" "AEIOU" < computer.txt
$ tr -d '|/' < shortlist | head -3
$ tr '|' '\012' < shortlist | head -6
$ tr '|/' '~-' < shortlist | head -3
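The files computer.txt and shortlist are not shown in the slides, so the sketch below feeds tr two made-up strings instead. It is meant only to show what each form of the command does to its input.

```shell
# Hypothetical inputs standing in for computer.txt and one record of shortlist.
text='computer science'
record='smith|sales|12/12/52'

upper_vowels=$(printf '%s\n' "$text" | tr 'aeiou' 'AEIOU')   # map each vowel to its capital
stripped=$(printf '%s\n' "$record" | tr -d '|/')            # delete every | and /
one_per_line=$(printf '%s\n' "$record" | tr '|' '\012')     # turn each | into a newline (octal 012)
swapped=$(printf '%s\n' "$record" | tr '|/' '~-')           # | becomes ~ and / becomes -

printf '%s\n' "$upper_vowels" "$stripped" "$one_per_line" "$swapped"
```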