CIS392 Sp 03 Assign#1
CIS392 Text Processing, Retrieval, and Mining, Spring 03
Instructor: Dr. Y. F. Brook Wu
BOW toolkit:


Logging in to AFS
On campus: go to a computer lab in GITC.
At home (assuming everyone has Windows at home): make sure the Internet connection has been established. Click Start, then Run, and type "telnet afs1.njit.edu" (without the quotes). The first screen shows some useful information.
Enter your user name and password.
If your account doesn't work: call the help desk; they can reset your password for you.

Useful UNIX commands
Note: all filenames and commands on a UNIX system are case sensitive.
General syntax: command [option] argument
Options modify the way a command works, and they are optional. Arguments are usually files; sometimes they are optional too.
Ex: rm -r directory_name

Note
MS PowerPoint turns two "-" typed next to each other into a single long dash, so the BOW and UNIX commands in these slides may not show their dashes exactly as they must be typed. Please refer to the BOW help file and the UNIX documentation for their actual usage.

Useful UNIX commands
man (manual), ex: man ls (manual for the ls command)
cd (change directory)
ls (list files and attributes)
dir (list files)
mkdir (create a directory)
rm (delete a file)
rm -fr directory_name (delete the whole directory and the files inside it)

Useful UNIX commands
rmdir (remove an empty directory)
cp (copy)
pwd (print the current working directory)
pico (a text editor)
more filename (read a plain text file one screen at a time; press the space bar to continue and "q" to quit)
quota (show disk space usage and limits)
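The commands listed above can be tried safely in a throwaway directory before using them on real files. The following is a minimal sketch, not part of the original slides: the scratch location and the names demo, a.txt, and b.txt are hypothetical, and mktemp is assumed to be available on the system.

```shell
# Work in a scratch directory so nothing in your account is touched.
# (mktemp and all file names below are illustrative assumptions.)
scratch="$(mktemp -d)"
cd "$scratch"
pwd                          # print the current working directory
mkdir demo                   # create a directory
echo "hello" > demo/a.txt    # make a small file to work with
cp demo/a.txt demo/b.txt     # copy a file
ls -l demo                   # list files and attributes
rm demo/a.txt                # delete a single file
rm -fr demo                  # delete the directory and the files inside it
ls                           # prints nothing: demo is gone
```

Note that rm -fr does not ask for confirmation, so it is worth double-checking the directory name before pressing Enter.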

More useful UNIX commands
D/Academic_Computing/Manuals/UNIX/UNIX.html

How to create your home page on the AFS system?
Help info: http://www-ec.njit.edu/ec_info/newuser/web/web.html
Execute this command at the UNIX prompt: /usr/ec/bin/home.page.setup
Your URL:

Overview of Retrieval Experiment
Create a sub-directory for CIS392 assignments under ~your_user_name/public_html
Create 3 sub-directories under the above directory for the 3 automatic indexing activities
Perform 3 automatic indexing activities with 3 different options

Overview of Retrieval Experiment (cont)
Perform 3 retrievals for each of the above 3 automatic indexing activities
Analyze how different indexing options affect retrieval
Make an html page to present your results

Creating sub-directories
Change directory to public_html by typing: cd public_html
mkdir cis392 (now you've created a directory for your CIS392 retrieval assignments)
cd cis392 (go inside the cis392 directory)

Creating three sub-directories
mkdir model1 (this directory stores results from the default settings: no stemming, and stop words removed)
mkdir model2 (this directory stores results from the following settings: no stemming, and stop words INCLUDED)
mkdir model3 (this directory stores results from the following settings: stemming, and stop words removed)
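The directory setup described above can be sketched as a short script. This is only an illustration: BASE is a hypothetical variable that defaults to a scratch directory so the commands can be tried first, and mkdir -p is used instead of plain mkdir so that re-running the script does not fail. For the real assignment, BASE would be your own public_html directory.

```shell
# BASE is an illustrative stand-in for ~/public_html; it defaults to a
# scratch directory created with mktemp (assumed to be available).
BASE="${BASE:-$(mktemp -d)}"
mkdir -p "$BASE/cis392"      # directory for the CIS392 retrieval assignments
cd "$BASE/cis392"            # go inside the cis392 directory
for model in model1 model2 model3; do
    mkdir -p "$model"        # one directory per indexing configuration
done
ls                           # lists model1, model2 and model3
```

Running it as BASE=~/public_html sh script_name would create the same layout in your web directory.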

URL of your retrieval experiment
ec.njit.edu/~yourusername/cis392/cis392re.html
See a sample page created by Prof Wu: ec.njit.edu/~wu/cis392/cis392re.html

Getting Access to BOW and Test Collection
There are three directories under ~wu/IR_Tools:
bow (the BOW system; to execute BOW, change directory to ~wu/IR_Tools/bow/bin)
som (the self-organizing map program; do NOT use it now!)
tc (the test collection, Library and Information Science Abstracts; the text is under ~wu/IR_Tools/tc/lisa/text/group0 to group5)

Test Collection: LISA
The sample queries are stored in ~wu/IR_Tools/tc/lisa/LISA.QUE
The relevant documents corresponding to the queries are stored in ~wu/IR_Tools/tc/lisa/LISA.REL ("-1" marks the end of an entry.)

Operating Arrow of BOW
Read the information on BOW's web site (again, the URL is listed in the "Resources" section of the class syllabus).
Read Arrow's help file (available on the syllabus page; you should print a copy of the help file).

Automatic Indexing
To begin the retrieval tasks, you first need to index the whole document collection. Specify the lexing options (stop word removal and/or stemming) at this time.
arrow -d ~yourusername/public_html/cis --index ~wu/IR_Tools/tc/lisa/text/*
The * sign is a wildcard that represents all files and directories under ~wu/IR_Tools/tc/lisa/text

Automatic Indexing
The -d parameter specifies the directory where the statistics produced by indexing are stored. (You have to specify this directory both when you index and when you retrieve documents.)
The path after --index specifies the location of the text collection.
The default lexing settings of the above task are: NO stemming performed, and stop words REMOVED.
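To make the three indexing configurations concrete, the sketch below prints (but does not run) one indexing command per model directory. It is an assumption-laden illustration rather than the assignment's official commands: the helper index_cmd is hypothetical, the model directories follow the layout created earlier in the slides, and the lexing flags --no-stoplist and --use-stemming are taken from the BOW library's usual option names and should be confirmed against Arrow's help file before use.

```shell
# Dry run: prints the commands instead of executing them.
# index_cmd is a hypothetical helper; $1 is the model directory and
# $2 is an assumed lexing flag (empty for the default settings).
index_cmd() {
    model="$1"
    flags="$2"
    echo "arrow -d ~yourusername/public_html/cis392/$model ${flags:+$flags }--index ~wu/IR_Tools/tc/lisa/text/*"
}
index_cmd model1 ""                # defaults: no stemming, stop words removed
index_cmd model2 "--no-stoplist"   # assumed flag: keep stop words
index_cmd model3 "--use-stemming"  # assumed flag: turn stemming on
```

Printing the commands first makes it easy to compare them with the help file and to spot a mistyped dash before anything is written into the model directories.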

Query assigned for retrieval
Please refer to the retrieval experiment section of the online syllabus to see which query you get for the experiment. (IS392/CIS392-Sp03.htm)

Retrieval
First specify where the indexing statistics are stored, and then the query to be performed.
arrow -d ~yourusername/public_html/cis392/model1 --num-hits-to-show=25 --query > ~yourusername/public_html/cis392/model1/retrieved_docs
The greater-than sign (>) redirects the output to the file named after it, which determines the output filename and where it is stored.
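Since the same retrieval has to be repeated against each indexed model, the retrieval command above can be generated for all three directories. The sketch below is a dry run under stated assumptions: retrieve_cmd is a hypothetical helper, "yourusername" is the placeholder from the slides, and the output file name retrieved_docs is reused from the example command.

```shell
# Dry run: prints one retrieval command per model instead of executing it.
# retrieve_cmd is a hypothetical helper; $1 is the model directory name.
retrieve_cmd() {
    dir="~yourusername/public_html/cis392/$1"
    echo "arrow -d $dir --num-hits-to-show=25 --query > $dir/retrieved_docs"
}
for model in model1 model2 model3; do
    retrieve_cmd "$model"
done
```

Keeping each output file inside its own model directory means the three result lists cannot overwrite one another.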

Presenting your RE
Create a page named cis392re.html under your ~/public_html/cis392 directory.
This page should contain several pieces of information, see: html

Presenting your RE
You can create this html page with the pico editor in UNIX (if you know basic html tags), Microsoft Word (save the file in html format), or Netscape Composer. If you use an html editor, you might need FTP software.
Before the due date: please check all items on your html page and make sure all of them are displayed properly.
After the due date: do not make changes. I can check when the files were last updated.