
1 © 2006 KDnuggets
2b: Unix Tools for Web Log Analysis

152.152.98.11 - - [16/Nov/2005:16:32:50 -0500] "GET /jobs/ HTTP/1.1" 200 15140 "http://www.google.com/search?q=salary+for+data+mining&hl=en&lr=&start=10&sa=N" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
252.113.176.247 - - [16/Feb/2006:00:06:00 -0500] "GET / HTTP/1.1" 200 12453 "http://www.yisou.com/search?p=data+mining&source=toolbar_yassist_button&pid=400 740_1006" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; MyIE2)"
252.113.176.247 - - [16/Feb/2006:00:06:00 -0500] "GET /kdr.css HTTP/1.1" 200 145 "http://www.kdnuggets.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; MyIE2)"
252.113.176.247 - - [16/Feb/2006:00:06:00 -0500] "GET /images/KDnuggets_logo.gif HTTP/1.1" 200 784 "http://www.kdnuggets.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; MyIE2)"

2 Unix
Unix is a very powerful operating system, with a rich set of tools especially suitable for log analysis.
Many flavors: Linux, Cygwin (Unix for Windows), SunOS, …
We use basic commands that should work the same on all flavors.
We only explain basic options. For more, see man (the manual page). E.g. man sort gives the details for the sort command.

3 Unix: cat – print a file
cat file: print the file to standard output

Examples use the file a.txt:

    ip75 09:56:00 /index.html
    ip75 09:56:00 /a.jpg
    ip02 09:56:01 /
    ip75 09:56:02 /b.jpg
    ip30 09:56:02 /test.htm
    ip02 09:56:03 /software.html

4 Unix: cat – print a file
cat file: print the file to standard output

    % cat a.txt
    ip75 09:56:00 /index.html
    ip75 09:56:00 /a.jpg
    ip02 09:56:01 /
    ip75 09:56:02 /b.jpg
    ip30 09:56:02 /test.htm
    ip02 09:56:03 /software.html

Here % is the Unix prompt, cat a.txt is the command, and the remaining lines are its output.
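To follow along, the sample file can be recreated from the listing on the slide. This is only a sketch: it writes a.txt into the current directory, so run it in a scratch directory if a file of that name already exists.

```shell
# Create the six-line sample file used in the examples.
# The quoted 'EOF' keeps the shell from expanding anything inside the here-document.
cat > a.txt <<'EOF'
ip75 09:56:00 /index.html
ip75 09:56:00 /a.jpg
ip02 09:56:01 /
ip75 09:56:02 /b.jpg
ip30 09:56:02 /test.htm
ip02 09:56:03 /software.html
EOF

# Print it back to standard output, as on the slide.
cat a.txt
```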

5 Unix: head – first n lines
head -n file: print the first n lines from file; if n is omitted, print the first 10 lines.

Example:

    % head -2 a.txt
    ip75 09:56:00 /index.html
    ip75 09:56:00 /a.jpg

6 Unix: cut – select a column
cut -d"d" -fn file: extract column n (or a set of columns) from file, where columns are separated by the delimiter character d

Example:

    % cut -d" " -f1 a.txt
    ip75
    ip75
    ip02
    ip75
    ip30
    ip02
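The "set of columns" case might look like the sketch below. The variable names are just for illustration; -f1,3 picks a list of fields and -f2- picks field 2 through the end of the line.

```shell
# Recreate the sample file from the slides so the snippet runs on its own.
printf '%s\n' \
  'ip75 09:56:00 /index.html' \
  'ip75 09:56:00 /a.jpg' \
  'ip02 09:56:01 /' \
  'ip75 09:56:02 /b.jpg' \
  'ip30 09:56:02 /test.htm' \
  'ip02 09:56:03 /software.html' > a.txt

# Fields 1 and 3 (IP and page), skipping the time column.
ip_and_page=$(cut -d" " -f1,3 a.txt)
echo "$ip_and_page"

# Field 2 onward (time and page).
time_and_page=$(cut -d" " -f2- a.txt)
echo "$time_and_page"
```

The selected fields are printed joined by the same delimiter, so the first line of the first result is "ip75 /index.html".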

7 Unix: sort – sort a file
sort file: sort the file in ascending order

Example:

    % sort a.txt
    ip02 09:56:01 /
    ip02 09:56:03 /software.html
    ip30 09:56:02 /test.htm
    ip75 09:56:00 /a.jpg
    ip75 09:56:00 /index.html
    ip75 09:56:02 /b.jpg

8 Unix: sort – sort a file, 2
sort -t"d" -k n file: sort by field # n, where fields are separated by the delimiter character d

Example:

    % sort -t" " -k 3 a.txt
    ip02 09:56:01 /
    ip75 09:56:00 /a.jpg
    ip75 09:56:02 /b.jpg
    ip75 09:56:00 /index.html
    ip02 09:56:03 /software.html
    ip30 09:56:02 /test.htm

9 Unix: | (pipe) – combine commands
command1 | command2: send the output of command1 to be the input to command2

Example:

    % sort -t" " -k 3 a.txt | head -1
    ip02 09:56:01 /

10 Unix: uniq – unique lines
uniq -c file: keep one copy of each run of identical adjacent lines, which is why the file should be sorted first; the -c option also produces a count for each line

Example: the following commands get a unique list of IP addresses, and then the same list with counts

    % cut -d" " -f1 a.txt | sort | uniq
    ip02
    ip30
    ip75
    % cut -d" " -f1 a.txt | sort | uniq -c
    2 ip02
    1 ip30
    3 ip75
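A common next step is to rank the counts. The sketch below is not from the slides: it adds a second sort with the standard -r (reverse) and -n (numeric) options, and the variable names are only illustrative.

```shell
# Recreate the sample file from the slides so the snippet runs on its own.
printf '%s\n' \
  'ip75 09:56:00 /index.html' \
  'ip75 09:56:00 /a.jpg' \
  'ip02 09:56:01 /' \
  'ip75 09:56:02 /b.jpg' \
  'ip30 09:56:02 /test.htm' \
  'ip02 09:56:03 /software.html' > a.txt

# Count requests per IP, then order by count, largest first.
ranked=$(cut -d" " -f1 a.txt | sort | uniq -c | sort -rn)
echo "$ranked"

# The first line of the ranking is the busiest IP (ip75, with 3 requests).
busiest=$(echo "$ranked" | head -n 1)
echo "$busiest"
```

The amount of padding in front of each count differs between Unix flavors, so scripts that read this output should not depend on exact column positions.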

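11 Unix: wc – word/line count
wc file: count lines, words, and characters in file
wc -l file: count only lines
Note: -l is ell (lowercase L), not the digit one.

    % cat a.txt | wc
    6 18 143
    % cat a.txt | wc -l
    6

The character count includes line-ending characters: 143 corresponds to a.txt saved with Windows-style (CRLF) line endings; with Unix (LF) line endings the same six lines total 137 characters.

To count the number of unique values in the first column:

    % cut -d" " -f1 a.txt | sort | uniq | wc -l
    3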
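The same pipeline carries over to real log entries. The sketch below uses three of the entries from the first slide, saved under the hypothetical name access.log. It assumes the fields are separated by single spaces, in which case the client address is field 1 and the requested path is field 7.

```shell
# Three log entries copied from the first slide (access.log is an assumed file name).
cat > access.log <<'EOF'
152.152.98.11 - - [16/Nov/2005:16:32:50 -0500] "GET /jobs/ HTTP/1.1" 200 15140 "http://www.google.com/search?q=salary+for+data+mining&hl=en&lr=&start=10&sa=N" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
252.113.176.247 - - [16/Feb/2006:00:06:00 -0500] "GET /kdr.css HTTP/1.1" 200 145 "http://www.kdnuggets.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; MyIE2)"
252.113.176.247 - - [16/Feb/2006:00:06:00 -0500] "GET /images/KDnuggets_logo.gif HTTP/1.1" 200 784 "http://www.kdnuggets.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; MyIE2)"
EOF

# Number of distinct client addresses (field 1): two in this sample.
unique_ips=$(cut -d" " -f1 access.log | sort | uniq | wc -l)
echo "$unique_ips"

# Requested paths (field 7), one per line.
pages=$(cut -d" " -f7 access.log)
echo "$pages"
```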

12 Unix: sed – stream editor
sed command [file...]: a very powerful stream editor.

E.g. to change the first "a" on each line of a.txt to "b", we can use

    % cat a.txt | sed 's/a/b/'

Add the g flag, as in 's/a/b/g', to change every "a" on the line.

To change "/index.html" to "/", use

    % cat a.txt | sed 's/index.html//'
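A small sketch of how the substitute command behaves on single lines from a.txt. The variable names are illustrative. In a sed pattern an unescaped dot matches any character, so the last command escapes it to match a literal dot.

```shell
line='ip30 09:56:02 /test.htm'

# Without a flag, only the first match on the line is replaced.
first_only=$(echo "$line" | sed 's/t/T/')
echo "$first_only"    # ip30 09:56:02 /Test.htm

# With the g flag, every match on the line is replaced.
every=$(echo "$line" | sed 's/t/T/g')
echo "$every"         # ip30 09:56:02 /TesT.hTm

# Removing "index.html" leaves just "/"; \. matches a literal dot only.
root=$(echo 'ip75 09:56:00 /index.html' | sed 's/index\.html//')
echo "$root"          # ip75 09:56:00 /
```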

13 Unix: gzip, gunzip, zcat – compress or expand files
gzip file: compress file
gunzip file: uncompress file
zcat file: uncompress and cat file

Log files are generally very large and stored in compressed form. The zcat command allows you to process them without uncompressing them.
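As a sketch, a compressed log can be fed straight into the pipelines from the earlier slides. The example uses gzip -dc, which does the same job as zcat and behaves the same on flavors where zcat expects a different file suffix; on Linux or Cygwin, zcat a.txt.gz can be used in its place.

```shell
# Recreate the sample file from the slides so the snippet runs on its own.
printf '%s\n' \
  'ip75 09:56:00 /index.html' \
  'ip75 09:56:00 /a.jpg' \
  'ip02 09:56:01 /' \
  'ip75 09:56:02 /b.jpg' \
  'ip30 09:56:02 /test.htm' \
  'ip02 09:56:03 /software.html' > a.txt

# Compress to a.txt.gz; -c writes to standard output, so a.txt itself is kept.
gzip -c a.txt > a.txt.gz

# Count lines without creating an uncompressed copy on disk.
line_count=$(gzip -dc a.txt.gz | wc -l)
echo "$line_count"

# Count unique IP addresses directly from the compressed file.
unique_ips=$(gzip -dc a.txt.gz | cut -d" " -f1 | sort | uniq | wc -l)
echo "$unique_ips"
```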

14 Unix: man – manual page
man command: print the manual page for command

You can usually also find a manual page by googling for: unix man command

