CS 497C – Introduction to UNIX Lecture 25: - Simple Filters Chin-Chih Chang

1 CS 497C – Introduction to UNIX Lecture 25: - Simple Filters Chin-Chih Chang

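2 sort: Ordering a File The sort command orders the lines of a file and can sort on individual fields, or on columns within those fields. When sort is invoked without options, it compares entire lines in ASCII collating sequence. The -t option names the field delimiter, so you can sort the file on any field. For example, to sort on the fifth field: sort -t: +4 /etc/passwd (+4 means "skip four fields"; this is the older notation, and the POSIX equivalent is sort -t: -k5 /etc/passwd).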
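A minimal sketch of sorting on the fifth field, using the POSIX -k notation. The sample file and its three accounts are hypothetical stand-ins for /etc/passwd, and LC_ALL=C is set as an assumption so that the comparison really is plain ASCII order whatever the user's locale.

```shell
# Hypothetical sample data in /etc/passwd format (not a real system file).
tmp=$(mktemp -d)
cat > "$tmp/passwd.sample" <<'EOF'
root:x:0:0:Super User:/root:/bin/sh
julie:x:508:100:Julie Andrews:/home/julie:/bin/sh
henry:x:501:100:Henry Blofeld:/home/henry:/bin/ksh
EOF

# -t: sets the delimiter; -k5 starts the sort key at field 5,
# the same starting point as the older "+4" notation.
by_name=$(LC_ALL=C sort -t: -k5 "$tmp/passwd.sample")
printf '%s\n' "$by_name"

rm -rf "$tmp"
```

The lines come out ordered by the name field (Henry, Julie, Super User) rather than by login name.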

3 sort: Ordering a File You can sort on more than one field. To make the fifth field the primary key and the first field the secondary key: sort -t: +4 -5 +0 /etc/passwd (POSIX form: sort -t: -k5,5 -k1 /etc/passwd). With the -n (numeric) option, you can sort in numeric sequence: sort -t: +2 -3 -n group (POSIX form: sort -t: -k3,3 -n group). The -u (unique) option lets you purge duplicate lines from a file.
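The difference between a text key, a numeric key, and a two-key sort can be sketched as follows. The records are hypothetical, the -k notation is the POSIX form of the older +pos syntax, and LC_ALL=C is an assumption made so that text comparison is byte-wise.

```shell
# Hypothetical records: login:password:uid:gid:comment
tmp=$(mktemp -d)
cat > "$tmp/users" <<'EOF'
julie:x:25:100:Staff
root:x:0:0:Admin
henry:x:100:100:Staff
EOF

# Primary key: field 5 only (-k5,5); secondary key: field 1 only (-k1,1).
two_keys=$(LC_ALL=C sort -t: -k5,5 -k1,1 "$tmp/users")

# Field 3 compared as text: "100" sorts before "25".
by_text=$(LC_ALL=C sort -t: -k3,3 "$tmp/users")

# Field 3 compared numerically with -n: 0, 25, 100.
by_uid=$(LC_ALL=C sort -t: -n -k3,3 "$tmp/users")

printf 'two keys:\n%s\ntext:\n%s\nnumeric:\n%s\n' "$two_keys" "$by_text" "$by_uid"
rm -rf "$tmp"
```

Limiting each key with a stop field (the ",5" in -k5,5) matters: without it the key runs to the end of the line and the secondary key is rarely consulted.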

4 sort: Ordering a File An example of -u in a pipeline: cut -d':' -f3 shortlist | sort -u | tee des.lst The -o (output) option writes the result to a file: sort -o sortedlist +3 list (POSIX form: sort -o sortedlist -k4 list). You can check whether a file has actually been sorted with the -c (check) option. To merge files that are already sorted, use the -m option: sort -m foo1 foo2 foo3
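A short sketch of the -u, -o, -c, and -m options working together. The file names and contents are hypothetical; -c reports its result through the exit status, so it is tested with if.

```shell
tmp=$(mktemp -d)
printf '%s\n' director manager director executive > "$tmp/desig"

# -u drops duplicate lines while sorting; -o names the output file.
LC_ALL=C sort -u -o "$tmp/desig.sorted" "$tmp/desig"
unique=$(cat "$tmp/desig.sorted")

# -c only checks the order: exit status 0 means the file is already sorted.
if LC_ALL=C sort -c "$tmp/desig.sorted"; then sorted_ok=yes; else sorted_ok=no; fi
if LC_ALL=C sort -c "$tmp/desig" 2>/dev/null; then unsorted_ok=yes; else unsorted_ok=no; fi

# -m merges inputs that are each already sorted, without re-sorting them.
printf '%s\n' apple cherry > "$tmp/foo1"
printf '%s\n' banana date > "$tmp/foo2"
merged=$(LC_ALL=C sort -m "$tmp/foo1" "$tmp/foo2")

printf '%s\n' "$unique" "sorted file ok: $sorted_ok" "original ok: $unsorted_ok" "$merged"
rm -rf "$tmp"
```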

5 tr: Translating Characters The tr (translate) command translates characters and can be used to change the case of letters. The syntax is: tr options expression1 expression2 < standard input You can use tr to replace the : with a | (pipe), and the / with a \ (backslash): tr ':/' '|\\' < /etc/group (the backslash is doubled because tr treats \ as an escape character). We can change the case of the first three lines from lower to upper: head -3 /etc/group | tr '[a-z]' '[A-Z]'
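Both translations can be tried on a single line without touching /etc/group. The sample line is hypothetical, and printf is used only to feed tr its standard input.

```shell
# Hypothetical input line; tr always reads standard input.
line='root:x:0:/root'

# Map ':' to '|' and '/' to '\'. Inside tr, '\\' means one literal backslash.
swapped=$(printf '%s\n' "$line" | tr ':/' '|\\')

# Map each lowercase letter to the uppercase letter at the same position.
upper=$(printf '%s\n' "$line" | tr '[a-z]' '[A-Z]')

printf '%s\n' "$swapped" "$upper"
```

The two expressions are matched character by character, which is why they should normally be the same length.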

6 tr: Translating Characters The -d (delete) option is used to delete characters. The -s (squeeze) option is used to compress multiple consecutive occurrences of a character into one: tr -s ' ' < shortlist The -c (complement) option complements (negates) the set of characters in the expression. You can also use octal values in tr; for example, \012 is the newline character: tr '|' '\012' < shortlist
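The options above can be sketched on one hypothetical record. Combining -c with -d, as in the third command, is an assumption about how the two options are commonly paired, not something the slide states.

```shell
# Hypothetical record with padded fields.
record='2233|charles harris  |g.m.   |sales'

# -s squeezes each run of spaces down to a single space.
squeezed=$(printf '%s\n' "$record" | tr -s ' ')

# -d deletes every occurrence of the listed characters.
no_pipes=$(printf '%s\n' "$record" | tr -d '|')

# -c complements the set, so -cd deletes everything EXCEPT digits and newline.
digits=$(printf '%s\n' "$record" | tr -cd '0-9\n')

# Octal \012 is newline: each field ends up on its own line.
fields=$(printf '%s\n' "$record" | tr -s ' ' | tr '|' '\012')

printf '%s\n' "$squeezed" "$no_pipes" "$digits" "$fields"
```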

7 uniq: Locate Repeated and Nonrepeated Lines uniq removes duplicate lines, but only when they are adjacent, so you usually sort the file first and pipe the output to uniq: sort dept.lst | uniq - (here - stands for standard input). The -u (unique) option selects only nonduplicate lines. The -d (duplicate) option selects only the repeated ones. The -c (count) option displays the frequency of all lines.
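A small sketch comparing plain uniq with -u, -d, and -c on the same sorted input. The contents of dept.lst here are hypothetical.

```shell
tmp=$(mktemp -d)
printf '%s\n' sales admin sales marketing admin sales > "$tmp/dept.lst"

# uniq only compares adjacent lines, so the input is sorted first.
distinct=$(LC_ALL=C sort "$tmp/dept.lst" | uniq)      # one copy of every line
once=$(LC_ALL=C sort "$tmp/dept.lst" | uniq -u)       # lines that occur exactly once
repeated=$(LC_ALL=C sort "$tmp/dept.lst" | uniq -d)   # one copy of each repeated line
counts=$(LC_ALL=C sort "$tmp/dept.lst" | uniq -c)     # every line prefixed by its count

printf '%s\n' "$distinct" "--" "$once" "--" "$repeated" "--" "$counts"
rm -rf "$tmp"
```

The exact padding in front of the counts printed by -c varies between implementations, so scripts that parse it are safer splitting on whitespace.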

8 nl: Line Numbering The nl command numbers only logical lines, that is, lines that are not empty. nl uses the tab as the default delimiter between the number and the line, but we can change it to : with the -s option. You can also set the width of the number field with -w: nl -w1 -s: calc.lst
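The effect of -w and -s, and the skipping of empty lines, can be seen on three lines of hypothetical input supplied through a pipe instead of calc.lst.

```shell
# Three input lines; the middle one is empty and receives no number.
numbered=$(printf 'alpha\n\nbeta\n' | nl -w1 -s:)
printf '%s\n' "$numbered"
```

The first and last output lines are 1:alpha and 2:beta; the empty line between them is passed through unnumbered, so beta is line 2 rather than line 3.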

9 dos2unix and unix2dos: DOS and UNIX Files UNIX and DOS files differ in structure. Lines in DOS are terminated by the carriage return and linefeed characters, while a UNIX line ends with only a linefeed. Some UNIX systems feature two utilities, dos2unix and unix2dos, for converting files between DOS and UNIX. For example: unix2dos catalog.html catalog.html, or in a pipeline: cat *.html | unix2dos > combined.html
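Where dos2unix and unix2dos are not installed, the same conversion can be approximated with standard filters. This sketch is a substitute technique, not those utilities themselves: it uses tr to strip carriage returns and awk to add them back, on a hypothetical two-line file.

```shell
tmp=$(mktemp -d)

# Hypothetical DOS-style file: every line ends in carriage return + linefeed.
printf 'line one\r\nline two\r\n' > "$tmp/dos.txt"

# DOS to UNIX: delete every carriage return (octal 015).
tr -d '\015' < "$tmp/dos.txt" > "$tmp/unix.txt"

# UNIX to DOS: write each line back with CR LF at the end.
awk '{ printf "%s\r\n", $0 }' "$tmp/unix.txt" > "$tmp/dos_again.txt"

dos_bytes=$(( $(wc -c < "$tmp/dos.txt") ))
unix_bytes=$(( $(wc -c < "$tmp/unix.txt") ))
if cmp -s "$tmp/dos.txt" "$tmp/dos_again.txt"; then roundtrip=same; else roundtrip=differs; fi

echo "DOS: $dos_bytes bytes, UNIX: $unix_bytes bytes, round trip: $roundtrip"
rm -rf "$tmp"
```

The UNIX copy is two bytes shorter, one carriage return per line. Note that tr -d removes every carriage return, including any that are not part of a line ending.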

10 spell (ispell): Check Your Spellings spell is used to spell-check a document. The command reads a file and lists all the words that it recognizes as misspelled. The -b (british) option uses the British dictionary. Linux has an interactive spell-checking program, ispell. When used with the -l option, ispell works noninteractively like spell.

11 Applying the Filters A three-stage operation is shown below: –Cut out the third field with cut -d'|' -f3 shortlist. –Sort it next with sort. –Finally, run uniq -c on the sorted output. These steps can be combined in a pipeline: cut -d'|' -f3 shortlist | sort | uniq -c To output the manual page in a plaintext format: man ls | col -b >
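The three-stage cut, sort, and uniq -c pipeline can be sketched end to end as follows. The records in shortlist are hypothetical, with the designation assumed to be the third |-delimited field.

```shell
tmp=$(mktemp -d)

# Hypothetical employee records: id|name|designation|department
cat > "$tmp/shortlist" <<'EOF'
2233|charles harris|g.m.|sales
9876|bill johnson|director|production
5678|robert dylan|d.g.m.|marketing
2365|john woodcock|director|personnel
5423|barry wood|chairman|admin
1006|gordon lightfoot|director|sales
EOF

# Stage 1: cut extracts field 3. Stage 2: sort groups equal values together.
# Stage 3: uniq -c collapses each group and prefixes it with its count.
freq=$(cut -d'|' -f3 "$tmp/shortlist" | LC_ALL=C sort | uniq -c)
printf '%s\n' "$freq"

rm -rf "$tmp"
```

With this data, director is counted three times and each of the other designations once.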
