Programming for WWW (ICE 1338) Lecture #4, July 2, 2004. In-Young Ko (iko.AT. icu.ac.kr), Information and Communications University (ICU)


July 2, Programming for WWW (Lecture#4), In-Young Ko, Information Communications University

Announcements
- Our TA
  - Name: Mr. Trinh Minh Cuong
  - minhcuong.AT. icu.ac.kr
  - Office: F641
  - Office Hours: Tuesday 11-12PM, Thursday 2-4PM
- Please send the instructor your team information
- Please send the instructor your information for creating a Unix account
- Submit your homework#1 (a URL or HTML source) by tomorrow

Review of the Previous Lecture
- Cascading Style Sheets
- Web-based Information Integration
  - Examples
  - Information Mediators
  - Information Wrappers (Web Wrappers)

Contents of Today's Lecture
- Basic UNIX Commands
- More on Web-based Information Integration
- JavaScript

UNIX Operating System
- A multi-user, multi-tasking operating system
- Developed by Ken Thompson and Dennis Ritchie at Bell Labs in the early 1970s
- Success factors of UNIX
  - Written in a high-level language (C): improves readability and portability
  - Support for primitives (system calls): permits complex programs to be built efficiently
  - A hierarchical file system: easy maintenance
  - Hides the machine architecture from the user: allows programs to run on different machines

Architecture of UNIX Systems
[Figure: layered diagram with the Hardware at the center, surrounded by the Kernel, then programs such as sh, who, a.out, date, wc, grep, ed, vi, ld, as, comp, cpp and nroff, with cc and other application programs in the outer layer]

Basic UNIX Shell Commands
- cd: Changes to the named directory
- pwd: Displays the current working directory
- ls: Lists the contents of the current directory
- ls -l: Same as above, but lists more information
- mkdir: Makes a directory
- rmdir: Removes a directory
- cat: Concatenates files or shows a file's contents
- cp: Copies a file
- mv: Renames a file or moves it to a different directory
- rm: Removes a file
- logout: Terminates a Unix shell session
- man: Accesses the manual pages

Publishing Web Pages on the Server
- Copy your files to the 'public_html' directory under your home directory on the server
- Use FTP to copy your files from a local directory to the server directory:
    ftp vega.icu.ac.kr    (log in with your user ID)
    cd public_html
    lcd d:\myweb
    put index.html        (or: mput *.html)
    quit
- Your homepage is now accessible from http://vega.icu.ac.kr/~yourid

Connections Between Web Clients and Servers
- A Web server is a daemon process that executes in the background waiting for some event to occur
- [Figure: a Web browser (Connect, Write, Read) communicating with a Web server (Listen on port 80, Accept, Process, Return)]

Sockets
- A socket is an end point for communication between two machines
- A socket is an association of a protocol, address and process to an end point of communication
- [Figure: the browser/server connection diagram (Connect, Write, Read; Listen 80, Accept, Process, Return) with the sockets labeled]
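The Listen / Accept / Process / Return and Connect / Write / Read steps in the diagram can be sketched with Java's standard socket classes. This is only an illustrative sketch, not the code used in the lecture: the class name `SocketSketch`, its helper methods, and the reply text are hypothetical, and the server listens on an ephemeral loopback port rather than port 80 so that the example runs without special privileges.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class SocketSketch {

    /** Server side: accept one connection, read one request line, send one reply line. */
    static Thread serveOnce(ServerSocket server) {
        Thread worker = new Thread(() -> {
            try (Socket client = server.accept();                          // Accept
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream()));
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                String request = in.readLine();                            // Process
                out.println("You asked for: " + request);                  // Return
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        worker.start();
        return worker;
    }

    /** Client side: connect, write one request line, read one reply line. */
    static String request(int port, String line) {
        try (Socket sk = new Socket(InetAddress.getLoopbackAddress(), port); // Connect
             PrintWriter out = new PrintWriter(sk.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(sk.getInputStream()))) {
            out.println(line);                                             // Write
            return in.readLine();                                          // Read
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs one complete exchange; port 0 asks the OS for any free port. */
    static String roundTrip(String line) {
        try (ServerSocket server =
                     new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) { // Listen
            serveOnce(server);
            return request(server.getLocalPort(), line);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("GET /index.html"));
    }
}
```

A real Web server would loop on `accept()` and hand each connection to a worker, which is what lets it run as a background daemon waiting for requests.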

Accessing Web Contents from Java Programs via Sockets

    import java.net.*;
    import java.io.*;
    …
    // Socket creation (host: name of the Web server)
    Socket sk = new Socket(host, 80);

    // Write request
    OutputStream os = sk.getOutputStream();
    PrintWriter pw = new PrintWriter(os);
    pw.println("GET /index.html");
    pw.println();
    pw.flush();

    // Read results
    InputStream is = sk.getInputStream();
    InputStreamReader ips = new InputStreamReader(is);
    BufferedReader in = new BufferedReader(ips);
    String line;
    while ((line = in.readLine()) != null) {
        System.out.println(line);
    }

Accessing Web Contents from Java Programs via URL Connections

    import java.net.*;
    import java.io.*;
    …
    // URL object creation (address: URL of the page to fetch)
    URL url = new URL(address);

    // URL connection creation
    URLConnection urlc = url.openConnection();

    // Read results
    InputStream is = urlc.getInputStream();
    InputStreamReader ips = new InputStreamReader(is);
    BufferedReader in = new BufferedReader(ips);
    String line;
    while ((line = in.readLine()) != null) {
        System.out.println(line);
    }

Java String Manipulation Methods for Result Parsing
- int indexOf(String str, int fromIndex)
- int lastIndexOf(String str, int fromIndex)
- boolean startsWith(String prefix)
- boolean endsWith(String suffix)
- boolean matches(String regex)
- String[] split(String regex)
- String substring(int beginIndex, int endIndex)
- String toLowerCase()
- String toUpperCase()

Web Wrapper for Naver.com
[Figure labels: URL, Summary, Title]

Result Parsing Strategies
- Structure-based Parsing
  - Analyzes Web pages based on tag hierarchies
  - Cannot be used for ill-formed HTML documents
- Pattern-based Parsing
  - Searches for a unique string pattern to locate a result item
  - Needs to identify such unique string patterns first

Structure-based Result Parsing
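One way to picture structure-based parsing is to load the page into a tag tree and walk it. The sketch below is an assumption-laden illustration, not the lecture's implementation: it uses the JDK's XML DOM parser, so it only accepts well-formed (XHTML-like) markup, and the class name `StructureParseSketch`, the method `extractLinks`, and the sample pages are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class StructureParseSketch {

    /**
     * Returns the href of every <a> element found in the tag tree,
     * or an empty Optional when the markup is not well-formed.
     */
    static Optional<List<String>> extractLinks(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(markup)));
            NodeList anchors = doc.getElementsByTagName("a");
            List<String> links = new ArrayList<>();
            for (int i = 0; i < anchors.getLength(); i++) {
                links.add(((Element) anchors.item(i)).getAttribute("href"));
            }
            return Optional.of(links);
        } catch (Exception e) {
            // The parser rejects ill-formed input, e.g. unclosed tags.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String wellFormed = "<html><body><ul>"
                + "<li><a href=\"http://example.com/a\">A</a></li>"
                + "<li><a href=\"http://example.com/b\">B</a></li>"
                + "</ul></body></html>";
        String illFormed = "<html><body><li><a href=\"http://example.com/a\">A</body></html>";

        System.out.println(extractLinks(wellFormed)); // both links are found
        System.out.println(extractLinks(illFormed));  // Optional.empty
    }
}
```

The second call shows the limitation named on the previous slide: a single unclosed tag is enough for a strict tree parser to give up on the whole page.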

Pattern-based Result Parsing
1. Find a unique pattern to locate a result item
   - e.g., “ <font” in the Naver result pages
2. Find the prefix and suffix patterns to extract an information piece (e.g., URL, title, summary) from the result item
   - e.g., “a href=” to extract a URL from a result line
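The two steps above can be sketched with `indexOf` and `substring` alone. The marker string, the prefix/suffix pair, and the sample page below are hypothetical stand-ins chosen for illustration; a real wrapper would use whatever strings turn out to be unique in the target site's result pages.

```java
import java.util.ArrayList;
import java.util.List;

public class PatternParseSketch {

    // Hypothetical patterns, not the ones used for any real site.
    static final String ITEM_MARKER = "<li class=\"result\">";
    static final String URL_PREFIX = "a href=\"";
    static final String URL_SUFFIX = "\"";

    /** Returns the text between the first prefix and the following suffix, or null. */
    static String between(String text, String prefix, String suffix) {
        int start = text.indexOf(prefix);
        if (start < 0) {
            return null;
        }
        start += prefix.length();
        int end = text.indexOf(suffix, start);
        return end < 0 ? null : text.substring(start, end);
    }

    /** Step 1: locate each result item. Step 2: cut the URL out of the item. */
    static List<String> extractUrls(String page) {
        List<String> urls = new ArrayList<>();
        int pos = page.indexOf(ITEM_MARKER);
        while (pos >= 0) {
            int next = page.indexOf(ITEM_MARKER, pos + ITEM_MARKER.length());
            String item = page.substring(pos, next < 0 ? page.length() : next);
            String url = between(item, URL_PREFIX, URL_SUFFIX);
            if (url != null) {
                urls.add(url);
            }
            pos = next;
        }
        return urls;
    }

    public static void main(String[] args) {
        // Deliberately ill-formed: the <li> and <a> tags are never closed.
        String page = "<ul><li class=\"result\"><a href=\"http://example.com/one\">One"
                + "<li class=\"result\"><a href=\"http://example.com/two\">Two</ul>";
        System.out.println(extractUrls(page));
        // prints [http://example.com/one, http://example.com/two]
    }
}
```

Because it never builds a tag tree, this approach still works on ill-formed pages, at the cost of breaking whenever the site changes the strings it relies on.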

Java Implementation of Web Wrapper

    public void WebWrapper(String host, String path, String query,
                           int startIndex, int pageSize) {
        try {
            // Query translation
            String address = "http://" + host + path + "?where=webkr"
                    + "&query=" + query
                    + "&start=" + startIndex + "1"
                    + "&display=" + pageSize;
            URL url = new URL(address);
            URLConnection urlc = url.openConnection();
            urlc.setRequestProperty("Accept", "*/*");
            urlc.setRequestProperty("User-Agent", "Mozilla/4.0");
            InputStream is = urlc.getInputStream();
            InputStreamReader ips = new InputStreamReader(is);
            BufferedReader in = new BufferedReader(ips);
            String line;
            // Parsing results
            while ((line = in.readLine()) != null) {
                //System.out.println(line);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
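The query-translation step can be isolated into a small helper, which also makes it easy to URL-encode the user's query so that spaces and non-ASCII characters survive the trip. This is a sketch under assumptions: the `http://` scheme, the UTF-8 encoding, the class name `QuerySketch`, and the example host and path are not taken from the slide, while the `where`, `query`, `start` and `display` parameter names are.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class QuerySketch {

    /** Translates a user query into a search URL with the slide's parameter names. */
    static String buildAddress(String host, String path, String query,
                               int startIndex, int pageSize) {
        try {
            return "http://" + host + path + "?where=webkr"
                    + "&query=" + URLEncoder.encode(query, "UTF-8") // assumed encoding
                    + "&start=" + startIndex + "1"                  // e.g. 0 -> "01", 1 -> "11"
                    + "&display=" + pageSize;
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical host and path, used only to show the resulting URL shape.
        System.out.println(buildAddress("search.example.com", "/search", "web robot", 0, 10));
    }
}
```

A wrapper for a real search site would have to use the character encoding that site expects, which may not be UTF-8.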

Web Robots
- A Web robot is a program (agent) that collects information while following all the links on a Web page
- Web Robots = Crawlers = Spiders
- Web search engines use Web robots to collect and index Web documents
- A tag to tell Web robots not to index a page: the robots meta tag with the noindex directive
- Crawling methods:
  - Breadth-first crawling
  - Depth-first crawling

Breadth First Crawlers

Depth First Crawlers
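The difference between the two crawling orders is easiest to see on a small link graph. The sketch below replaces real page fetching with an in-memory map from a page to the pages it links to; the class name `CrawlSketch`, the page names, and the link structure are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CrawlSketch {

    /** A tiny stand-in for the Web: A links to B and C, B to D and back to A, C to E. */
    static Map<String, List<String>> sampleLinks() {
        Map<String, List<String>> links = new HashMap<>();
        links.put("A", Arrays.asList("B", "C"));
        links.put("B", Arrays.asList("D", "A"));
        links.put("C", Arrays.asList("E"));
        return links;
    }

    /** Breadth-first: visit every page at one link distance before going deeper. */
    static List<String> breadthFirst(Map<String, List<String>> links, String start) {
        List<String> order = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            String page = queue.remove();
            order.add(page);
            for (String next : links.getOrDefault(page, Collections.emptyList())) {
                if (seen.add(next)) {      // skip pages that are already queued or visited
                    queue.add(next);
                }
            }
        }
        return order;
    }

    /** Depth-first: follow the first link as far as possible, then back up. */
    static List<String> depthFirst(Map<String, List<String>> links, String start) {
        List<String> order = new ArrayList<>();
        visit(links, start, new HashSet<>(), order);
        return order;
    }

    private static void visit(Map<String, List<String>> links, String page,
                              Set<String> seen, List<String> order) {
        if (!seen.add(page)) {
            return;                        // already visited, which also breaks link cycles
        }
        order.add(page);
        for (String next : links.getOrDefault(page, Collections.emptyList())) {
            visit(links, next, seen, order);
        }
    }

    public static void main(String[] args) {
        System.out.println(breadthFirst(sampleLinks(), "A")); // [A, B, C, D, E]
        System.out.println(depthFirst(sampleLinks(), "A"));   // [A, B, D, C, E]
    }
}
```

In both versions the set of seen pages is what keeps the crawler from looping forever on the B to A back-link.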

Web-based Information Management Applications (Example Scenario)
- Goal: identify recurring disaster areas in China, e.g. locations of floods
- Input: a Web document collection about 'China disasters'
- Steps:
  - For each map layer displayed, get the set of place names and classify the documents based on the place names
  - Classify documents based on the disaster types mentioned
  - Cross-product between place names and the disaster-type categories
  - Plot the document clusters on the map to figure out the major flooding areas

Web-based Information Management Applications (Example App. Design)
[Figure: component diagram with Keyword Editor, Keyword Extractor, Search Engines, Place Name Generator, Place Name Extractor, Product Categories and Mapping Clusters, some of them pipelined components. Legend: sequential connection, pipelined connection. Annotation: generate multiple sets of place names]

Problems in Composing Large-scale Information Management Applications
- Time-consuming to explore and test a large number of options
- Hard to choose appropriate services for collections
- Hard to quickly substitute and test a service within a sequence of steps
- Difficulties of capturing and reusing shared patterns of information management steps
- Difficult to record and recurrently perform information management steps
- Necessity of extracting abstract patterns of information management steps and reusing them
- Hard to cope with dynamic aspects of Web resources

Characteristics of Large-scale Information Management Tasks
- Incremental development of information management steps for an abstract task goal
- Recurrent executions of the steps
- Evolving requirements of users
- Shared patterns of management steps
- Collection-based information processing
- Dynamic aspects of information sources and services
- Large and growing number of component services

Improvement Goals
- Significantly reduce construction time, keeping costs low
- Enable very rapid construction/adaptation of new applications
- Provide static and run-time diagnostic tools, facilitating debugging and performance tuning tasks
- Overall goal: Rapid Composition and Reconfiguration of Large-scale Custom Applications

JavaScript
- The goal of JavaScript is to provide programming capability at both the client and server ends of a Web connection
- Originally developed by Netscape, as LiveScript
- Became a joint venture of Netscape and Sun in 1995, renamed JavaScript
- Now standardized by the European Computer Manufacturers Association as ECMA-262 (also ISO 16262)
- User interactions with HTML documents in JavaScript use the event-driven model of computation

    <html>
    <head>
    <title>ICE1338</title>
    <style>
    <!--
      p { font-size: 12pt; color: blue; background-color: yellow }
      h2, h3 { font-size: 16pt; color: red; font-style: oblique }
    -->
    </style>
    <script>
      function displayDate() {
        alert("Today's date is: " + new Date() + "!!");
      }
    </script>
    </head>
    <body>
    <br/>
    Programming for WWW
    </body>
    </html>

[Figure callout: A Popup Window]

JavaScript vs. Java
- Both share similar syntax
- JavaScript is a scripting language, not a programming language
- JavaScript is an interpreter-based language
- JavaScript is dynamically typed
- JavaScript does not support class-based inheritance
- JavaScripts are usually embedded in HTML documents

General Syntax of JavaScript
- Direct embedding of JavaScript code:
    <script>
      -- JavaScript script --
    </script>
- Indirect JavaScript specification: a script element whose src attribute names an external script file
- Identifier form: begin with a letter or underscore, followed by any number of letters, underscores, and digits
- Case sensitive
- 25 reserved words, plus future reserved words
- Comments: both // and /* … */

Document Object Model
- "A platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure and style of documents"
- Example HTML document:
    <html><head>
    <title>My Document</title>
    </head><body>
    <h1>Header</h1>
    <p>Paragraph</p>
    </body></html>
- Script that updates the header:
    var header = document.getElementsByTagName("H1").item(0);
    header.firstChild.data = "A dynamic document";
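Because the DOM is language-neutral, the two JavaScript statements on this slide have a close counterpart in Java's `org.w3c.dom` binding. The sketch below is an illustration under assumptions rather than part of the lecture: it parses a small well-formed document with the JDK's XML parser (so the tag name is matched case-sensitively as `h1`), and the class name `DomSketch` and the sample markup are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomSketch {

    /** Replaces the text of the first h1 element and returns the element's new text. */
    static String rewriteHeader(String markup, String newText) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
            // Same call chain as the JavaScript: getElementsByTagName(...).item(0)
            Node header = doc.getElementsByTagName("h1").item(0);
            // Counterpart of header.firstChild.data = "..."
            header.getFirstChild().setNodeValue(newText);
            return header.getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("markup could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        String markup = "<html><head><title>My Document</title></head>"
                + "<body><h1>Header</h1><p>Paragraph</p></body></html>";
        System.out.println(rewriteHeader(markup, "A dynamic document"));
        // prints: A dynamic document
    }
}
```

The method names differ slightly between the two language bindings (`firstChild.data` versus `getFirstChild().setNodeValue`), but the tree being navigated is the same.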

DOM Specification
- e.g.,

Screen Outputs
- The model for the browser display window is the Window object
- Properties:
  - window.document
  - window.screenLeft
  - window.screenTop
  - …
- Methods:
  - alert
  - confirm
  - prompt