CS 497C – Introduction to UNIX Lecture 25: - Simple Filters Chin-Chih Chang


sort: Ordering a File

The sort command orders the lines of a file, either by whole line or by individual fields and character positions within those fields. When sort is invoked without options, lines are compared in their entirety, in ASCII collating sequence. With the -t option you name the field delimiter, which lets you sort the file on any field. For example, to sort /etc/passwd on the fifth field (older versions of sort write this key as +4):

    sort -t: -k5 /etc/passwd
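A minimal sketch of the difference between a whole-line sort and a field sort. The file passwd.sample and its contents are hypothetical stand-ins for /etc/passwd, and LC_ALL=C is an assumption added here so that the ordering is plain ASCII regardless of the locale in use.

```shell
# Hypothetical sample data in passwd-like format (user:pw:uid:gid:comment).
tmp=$(mktemp -d)
printf '%s\n' \
  'carol:x:1003:100:Zed Carol' \
  'alice:x:1001:100:Mia Alice' \
  'bob:x:1002:100:Abe Bob' > "$tmp/passwd.sample"

# No options: whole lines are compared, so the user name decides the order.
by_line=$(LC_ALL=C sort "$tmp/passwd.sample")

# -t: sets the delimiter; -k5,5 restricts the key to the fifth field.
by_field=$(LC_ALL=C sort -t: -k5,5 "$tmp/passwd.sample")

printf '%s\n\n%s\n' "$by_line" "$by_field"
rm -r "$tmp"
```

In the second listing the order follows the comment field (Abe, Mia, Zed), not the user name.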

sort: Ordering a File

You can sort on more than one field. If the primary key is the fifth field and the secondary key is the first field:

    sort -t: -k5,5 -k1,1 /etc/passwd

With the -n (numeric) option, you can sort in numeric sequence, for instance on the third field (the numeric group ID) of a group file:

    sort -t: -k3,3 -n group

The -u (unique) option lets you purge duplicate lines from a file.
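The following sketch shows multiple keys, numeric ordering, and -u side by side. The files staff and group and their contents are hypothetical, and LC_ALL=C is assumed so the text comparisons are plain ASCII.

```shell
tmp=$(mktemp -d)

# Hypothetical records: name:department
printf '%s\n' 'bob:sales' 'alice:sales' 'carol:admin' > "$tmp/staff"
# Primary key is field 2; ties are broken by field 1.
two_keys=$(LC_ALL=C sort -t: -k2,2 -k1,1 "$tmp/staff" | paste -sd' ' -)

# Hypothetical group-like records: name:password:gid:
printf '%s\n' 'dev:x:100:' 'ops:x:20:' 'web:x:3:' 'ops:x:20:' > "$tmp/group"
# Without -n the third field is compared as text, so 100 sorts before 20.
as_text=$(LC_ALL=C sort -t: -k3,3 "$tmp/group" | cut -d: -f3 | paste -sd' ' -)
# With -n the same field is compared by numeric value.
as_number=$(LC_ALL=C sort -t: -k3,3 -n "$tmp/group" | cut -d: -f3 | paste -sd' ' -)
# -u keeps one copy of each duplicated line.
unique=$(LC_ALL=C sort -u "$tmp/group" | cut -d: -f1 | paste -sd' ' -)

printf '%s\n' "$two_keys" "$as_text" "$as_number" "$unique"
rm -r "$tmp"
```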

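sort: Ordering a File

The -u option is often used in a pipeline. For example, to extract the third field of shortlist, remove duplicates, and save the result in des.lst while still displaying it:

    cut -d'|' -f3 shortlist | sort -u | tee des.lst

sort uses the -o (output) option to write the result to a file. Here the key is the fourth field (+3 in the older notation):

    sort -o sortedlist -k4 list

You can check whether a file has actually been sorted with the -c (check) option. To merge files that are already sorted, use the -m option:

    sort -m foo1 foo2 foo3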
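A short sketch of -o, -c, and -m working together. The file names and contents are hypothetical; LC_ALL=C is assumed for predictable ordering. Note that -c reports its result through the exit status, so it is tested with if.

```shell
tmp=$(mktemp -d)
printf '%s\n' pear apple fig > "$tmp/list"

# -o writes the sorted result to the named file instead of standard output.
LC_ALL=C sort -o "$tmp/sortedlist" "$tmp/list"

# -c exits with status 0 if the input is already sorted, nonzero otherwise.
if LC_ALL=C sort -c "$tmp/sortedlist"; then sorted_ok=yes; else sorted_ok=no; fi
if LC_ALL=C sort -c "$tmp/list" 2>/dev/null; then unsorted_ok=yes; else unsorted_ok=no; fi

# -m merges inputs that are each already sorted, without re-sorting them.
printf '%s\n' banana cherry > "$tmp/other"
merged=$(LC_ALL=C sort -m "$tmp/sortedlist" "$tmp/other" | paste -sd' ' -)

printf '%s\n' "$sorted_ok" "$unsorted_ok" "$merged"
rm -r "$tmp"
```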

tr: Translating Characters

The tr (translate) command translates characters and can be used to change the case of letters. It reads only from standard input. The syntax is:

    tr options expression1 expression2 < standard input

You can use tr to replace the : with a | (vertical bar), and the / with a \ (backslash). The backslash is doubled because tr treats a single backslash as an escape character:

    tr ':/' '|\\' < /etc/group

We can change the case of the first three lines from lower to upper:

    head -3 /etc/group | tr '[a-z]' '[A-Z]'
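A minimal sketch of both translations, using hypothetical one-line inputs fed through printf instead of /etc/group so that the result does not depend on the system.

```shell
# Each character of the first set maps to the character at the same position
# in the second set: ':' becomes '|', and '/' becomes a backslash.
translated=$(printf 'root:/bin/sh\n' | tr ':/' '|\\')

# Lowercase letters map onto the corresponding uppercase letters.
upper=$(printf 'staff:x:10:\n' | tr '[a-z]' '[A-Z]')

printf '%s\n' "$translated" "$upper"
```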

tr: Translating Characters

The -d (delete) option is used to delete characters. The -s (squeeze) option is used to compress multiple consecutive occurrences of a character into one:

    tr -s ' ' < shortlist

The -c (complement) option complements (negates) the set of characters in the expression. You can also use octal values in tr. For example, \012 is the newline character, so this command puts each |-delimited field on its own line:

    tr '|' '\012' < shortlist
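The options can be tried on small inputs. This is a sketch with hypothetical strings; combining -c with -d, as in the third command, is one common use of the complement and is shown here as an illustration.

```shell
# -s squeezes each run of spaces down to a single space.
squeezed=$(printf 'a   b    c\n' | tr -s ' ')

# -d deletes every occurrence of the listed characters.
deleted=$(printf '12/12/52\n' | tr -d '/')

# -c complements the set: with -d, everything that is NOT a digit is deleted.
digits_only=$(printf 'tel: 555-1234\n' | tr -cd '0-9')

# \012 is the octal value of the newline, so each field lands on its own line.
split=$(printf 'a|b|c\n' | tr '|' '\012')

printf '%s\n' "$squeezed" "$deleted" "$digits_only" "$split"
```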

uniq: Locate Repeated and Nonrepeated Lines

uniq removes duplicate lines, but only when they are adjacent. The usual approach is therefore to sort the file and pipe the output to uniq:

    sort dept.lst | uniq -

The -u (unique) option selects only the nonduplicated lines. The -d (duplicate) option selects only the repeated ones. The -c (count) option displays the frequency of each line.
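A sketch of the four behaviours on a hypothetical dept.lst. Because the column width that uniq -c uses for the count can differ between implementations, the counts are normalized with awk before being stored.

```shell
tmp=$(mktemp -d)
printf '%s\n' sales admin sales hr admin sales > "$tmp/dept.lst"

# Sorting first makes equal lines adjacent, which is what uniq needs.
LC_ALL=C sort "$tmp/dept.lst" > "$tmp/sorted"

all=$(uniq "$tmp/sorted" | paste -sd' ' -)               # one copy of every line
only_single=$(uniq -u "$tmp/sorted" | paste -sd' ' -)    # lines that occur once
only_repeated=$(uniq -d "$tmp/sorted" | paste -sd' ' -)  # lines that occur more than once
counts=$(uniq -c "$tmp/sorted" | awk '{ print $1 ":" $2 }' | paste -sd' ' -)

printf '%s\n' "$all" "$only_single" "$only_repeated" "$counts"
rm -r "$tmp"
```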

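nl: Line Numbering

The nl command numbers only logical lines, that is, lines that contain text; empty lines are not numbered by default. nl uses the tab as the default delimiter between the number and the line, but we can change it to the : with the -s option. You can also set the width of the number field with -w:

    nl -w1 -s: calc.lst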
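A small sketch of -w and -s on hypothetical input that includes an empty line. Only the numbered lines are captured, since the way an unnumbered line is padded is not something this example relies on.

```shell
# The empty second line is passed through without a number,
# so the line "two" is numbered 2, not 3.
numbered=$(printf 'one\n\ntwo\n' | nl -w1 -s: | grep ':' | paste -sd' ' -)

# With -w3 the number is right-justified in a field three characters wide.
wide=$(printf 'one\n' | nl -w3 -s:)

printf '%s\n' "$numbered" "$wide"
```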

dos2unix and unix2dos: DOS and UNIX Files

UNIX and DOS files differ in structure. Lines in DOS are terminated by the carriage return and linefeed characters, while a UNIX line uses only the linefeed. Some UNIX systems feature two utilities, dos2unix and unix2dos, for converting files between DOS and UNIX. In the form shown here, the first file name is the input and the second is the output, so this command converts catalog.html in place:

    unix2dos catalog.html catalog.html

Both utilities can also be used as filters:

    cat *.html | unix2dos > combined.html
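Since dos2unix and unix2dos are not installed everywhere, the sketch below illustrates the same line-ending difference with awk and tr instead. These are stand-ins chosen for this example, not what the two utilities do internally, and the file names are hypothetical. Note that tr -d '\r' removes every carriage return, not only those at line ends.

```shell
tmp=$(mktemp -d)
printf 'line one\nline two\n' > "$tmp/unix.txt"

# UNIX to DOS: end every line with carriage return + linefeed.
awk '{ printf "%s\r\n", $0 }' "$tmp/unix.txt" > "$tmp/dos.txt"

# DOS to UNIX: delete the carriage returns, leaving only the linefeeds.
tr -d '\r' < "$tmp/dos.txt" > "$tmp/back.txt"

# The DOS copy is larger by one byte per line.
unix_bytes=$(($(wc -c < "$tmp/unix.txt")))
dos_bytes=$(($(wc -c < "$tmp/dos.txt")))

# cmp -s is silent and succeeds only if the two files are identical.
if cmp -s "$tmp/unix.txt" "$tmp/back.txt"; then round_trip=same; else round_trip=different; fi

printf '%s\n' "$unix_bytes" "$dos_bytes" "$round_trip"
rm -r "$tmp"
```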

spell (ispell): Check Your Spelling

spell is used to spell-check a document. The command reads a file and lists every word that it does not recognize as correctly spelled. The -b (British) option uses the British dictionary. Linux has an interactive spell-checking program, ispell. When used with the -l option, ispell works noninteractively like spell.

Applying the Filters

A three-stage operation is shown below:

– Cut out the third field with cut -d'|' -f3 shortlist.
– Sort it next with sort.
– Finally, run uniq -c on the sorted output.

This can be done together using a pipeline:

    cut -d'|' -f3 shortlist | sort | uniq -c

To output the manual page in a plaintext format:

    man ls | col -b > ls.man
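The three stages can be sketched end to end as follows. The records written to shortlist are hypothetical (id|name|designation|department), LC_ALL=C is assumed for the sort order, and the counts are normalized with awk because the padding printed by uniq -c can vary.

```shell
tmp=$(mktemp -d)
printf '%s\n' \
  '1001|ann|g.m.|sales' \
  '1002|ben|d.g.m.|product' \
  '1003|cal|g.m.|marketing' \
  '1004|dee|director|personnel' \
  '1005|eli|director|sales' \
  '1006|fay|g.m.|admin' > "$tmp/shortlist"

# Stage 1: cut extracts the third field.
# Stage 2: sort brings equal designations together.
# Stage 3: uniq -c counts each run of identical lines.
frequency=$(cut -d'|' -f3 "$tmp/shortlist" | LC_ALL=C sort | uniq -c |
  awk '{ print $1 ":" $2 }' | paste -sd' ' -)

printf '%s\n' "$frequency"
rm -r "$tmp"
```

Without the sort stage, uniq -c would count only adjacent repeats, so the two g.m. lines separated by d.g.m. would be reported separately.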