ES 101-02. Module 5 Uniform Resource Locators, Hypertext Transfer Protocol, & Common Gateway Interface.

1 ES 101-02. Module 5 Uniform Resource Locators, Hypertext Transfer Protocol, & Common Gateway Interface

2 This Lecture Uniform Resource Locators (URL) Hypertext Transfer Protocol (HTTP) Common Gateway Interface (CGI)

3 Definitions We previously discussed the Domain Name System, or DNS –Distributed database hosted by DNS servers –Maps mnemonic host names to IP addresses Names are easier for humans to remember –Universal registration, i.e. every domain name on the Internet is unique In order to find resources on a particular server, we must introduce the concept of a URL

4 Uniform Resource Locators A URL tells a client browser where a resource is located and which protocol to use to reach it; it can also carry search data to a server for further processing URLs are a scheme for specifying Internet resources using a single line of printable ASCII characters –No control characters are allowed The URL structure and syntax allows the web client to reach resources through the major Internet protocols that run over TCP –File Transfer Protocol (FTP) –Hypertext Transfer Protocol (HTTP) –Etc. URLs can also be used within HTML documents to provide “links” to other documents

5 URL Contents A URL contains the following: –Protocol to use when accessing the server, e.g. HTTP –Internet domain name (or IP address) of the host on which the server is running –Port number of the server application (optional) –Location of the resource in the directory structure Example of a URL: http://www.cern.ch/hypertext/WWW/RDBgate/Implementation.html

6 URL Contents (cont’d) The previous URL references the file: Implementation.html This file is located in the directory: /hypertext/WWW/RDBgate, which is located on the server www.cern.ch The protocol used is HTTP Note that this is an exact reference. Abbreviated references are allowed under certain conditions.
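
The breakdown on the last two slides can be checked mechanically. The following is a minimal sketch using Python's standard urllib.parse module; the variable names are illustrative only and are not part of the lecture.

```python
from urllib.parse import urlsplit

# Split the example URL from the slides into its components.
url = "http://www.cern.ch/hypertext/WWW/RDBgate/Implementation.html"
parts = urlsplit(url)

protocol = parts.scheme   # protocol used to access the server
host = parts.hostname     # domain name of the host running the server
port = parts.port         # None here, because the URL names no explicit port

# Everything before the last "/" is the directory, the rest is the file.
directory, _, filename = parts.path.rpartition("/")

print(protocol, host, port, directory, filename)
```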

7 Allowed Characters in URLs Every URL must be written using printable ASCII characters This ensures that URLs can be sent by electronic mail –Many mail programs would mishandle control characters However, non-printable and other disallowed characters can still be included in a URL by using a character encoding scheme

8 ASCII/IRA Character Set

9 Character Encoding Any character can be represented by the three-character sequence %xy, where “xy” is the hexadecimal code of the character of interest Because “%” introduces an encoded character, a literal “%” cannot appear on its own in a URL and must be written as %25 There are other characters that cannot appear literally: –“Space” and “TAB” characters and double quotation marks (“) must always be encoded –“Slash” is reserved as the path separator, so it must be encoded as %2F when it is meant as data rather than as a separator
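
As an illustration of the %xy scheme, here is a small sketch using quote and unquote from Python's standard urllib.parse module. The sample file name is made up for the example.

```python
from urllib.parse import quote, unquote

# A hypothetical file name containing a space, a TAB and a percent sign.
raw = "my file\t100%.html"

# quote() replaces each disallowed character with %xy (hexadecimal code).
encoded = quote(raw)

# By default quote() leaves "/" alone because it is the path separator;
# passing safe="" encodes it as well, for when the slash is data.
slash_as_data = quote("a/b", safe="")

# unquote() reverses the encoding.
decoded = unquote(encoded)

print(encoded)
print(slash_as_data)
```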

10 Ports and IP Addresses Port numbers are usually “assumed” if they are not specified in the URL, and the IP address is looked up from the domain name via DNS However, a port can be included within a URL without causing problems: http://www.address.edu:80/path/subdir/file.ext If a port number is not included in the URL, the client assumes the default port for that protocol As an example, using the “HTTP” protocol implies port “80”

11 Ports and IP Addresses (cont’d) Numeric IP addresses can be used in place of domain names: http://132.206.9.22/pathname You could also include the username and password in the URL –This is not recommended, since the password is not encrypted. Very bad security practice!!
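
A short sketch of how a client might pick the port, assuming a small table of default ports (80 for HTTP, 21 for FTP). The function name effective_port and the credentials in the last URL are hypothetical and serve only as illustration.

```python
from urllib.parse import urlsplit

# Assumed table of well-known default ports for the protocols in this lecture.
DEFAULT_PORTS = {"http": 80, "ftp": 21}

def effective_port(url):
    """Return the explicit port if the URL has one, else the protocol default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)

print(effective_port("http://www.address.edu:80/path/subdir/file.ext"))
print(effective_port("http://132.206.9.22/pathname"))
print(effective_port("http://www.site.edu:3232/cgi-bin/srch"))

# A username and password embedded in a URL are plainly readable by anyone
# who sees the URL, which is why the practice is discouraged.
with_login = urlsplit("ftp://user:secret@internet.address.edu/path/")
print(with_login.username, with_login.password)
```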

12 Partial URLs If you are within a given HTML document, it is not necessary to specify the complete URL Any information not included in the URL is assumed to be the same as that used to access the current document Partial URLs are very useful when constructing large collections of HTML documents that will be kept “together” –Links between the documents keep working if the whole collection is moved to a different folder or server –Caveat: If documents are moved relative to one another, or a partial URL points outside the collection, the links will not work
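
The way a browser fills in the missing parts of a partial URL can be sketched with urljoin from Python's standard urllib.parse module. The base document is the example URL used earlier; the partial URLs (Overview.html and so on) are invented for the illustration.

```python
from urllib.parse import urljoin

# URL of the document currently being viewed.
base = "http://www.cern.ch/hypertext/WWW/RDBgate/Implementation.html"

# Same directory: protocol, host and directory are taken from the base.
same_dir = urljoin(base, "Overview.html")

# "../" steps up one directory before descending again.
sibling_dir = urljoin(base, "../Daemon/User.html")

# A leading "/" keeps only the protocol and host of the base.
from_root = urljoin(base, "/index.html")

print(same_dir)
print(sibling_dir)
print(from_root)
```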

13 URL Forms Let’s look at a couple of examples: –File Transfer Protocol (FTP) –Hypertext Transfer Protocol (HTTP)

14 FTP URLs FTP URLs designate the files and directories that are accessible using the FTP protocol In the absence of any username and password, anonymous FTP access is assumed –This connects you to the server as user “anonymous” with a password equal to your email address –Examples: ftp://internet.address.edu/path/ ftp://ftp.prenhall.com/pub/esm/computer_science.s-041/stallings/Figures/DCC7e_PDF_Figures/CHAP-02/ –Note that the final “slash” indicates a directory The web browser would display this URL as a directory of contents

15 FTP Directory Example

16 HTTP URLs HTTP URLs designate files, directories, or server-side programs that are accessible using the HTTP protocol –Example: http://www.site.edu:3232/cgi-bin/srch This example references the program “srch” at the site www.site.edu, accessible through the HTTP server, using Port = 3232 An HTTP URL must always point to a file, a program, or a directory –A directory is indicated by terminating the URL with a “slash” –Example: http://www.site.edu/htmldocs/ Note the slash

17 HTTP History HTTP is a protocol utilized for transmitting information with the efficiency necessary for making hypertext “jumps” It is documented in the IETF standards as RFC 2616 (since superseded by RFC 9110 and RFC 9112) It is a transaction-oriented, client/server protocol The most common use of HTTP is to handle communications between a web browser (client), and a web server –Other examples: Accessing a CD using HTTP To provide reliability, HTTP utilizes TCP

18 TCP/IP Architecture

19 Hypertext Transfer Protocol In order to develop interactive HTML documents, we need to first review the interaction between a WWW client (browser) and an HTTP server –A web site is a directory of interactive HTML documents and programs This interaction involves two distinct, but closely related issues –HTTP communication methods –How a HTTP server handles a client request

20 HTTP Communication Methods HTTP provides a number of communication methods, such as: –GET, POST, HEAD, etc. These methods allow a client to receive information from the server, and send information to the server

21 HTTP Request Handling If the client requests a file, the server simply locates the file and sends it to the client –If the file is not available, an error message is returned to the client Consider the situation when the client wants to send information to the server for more complicated processing –The HTTP server software does not do this processing, but hands it off to another program via the Common Gateway Interface (cgi-bin) –The program that receives the processing request is referred to as a “gateway program” –This implies that there are two interfaces to the HTTP server HTTP client interactions CGI interactions

22 Gateway Programs Gateway programs can be referenced using URLs When the HTTP server needs to activate the program, it invokes the CGI mechanism to pass the data to the target program The gateway program acts on the data, and returns the result to the HTTP server In order to understand the CGI program, we must first discuss the HTTP protocol After this discussion, we will cover the CGI

23 HTTP Overview HTTP is an Internet-based, client/server protocol that has been designed for the rapid and efficient delivery of HTML documents The client can make multiple concurrent requests of the HTTP server –Each request is processed individually –The server has no recollection of previous connections –This type of protocol is “stateless” Statelessness is a very important feature of HTTP –Speeds up processing of requests

24 HTTP Communications All HTTP communications utilize 8-bit characters This allows the safe transmission of any type of data, such as HTML documents An HTTP connection has four stages: –Open the connection –Request –Response –Close the connection

25 HTTP Open Connection The client contacts the server at the correct IP address, using TCP Port 80 unless the URL specifies a different port Note that the DNS servers allow mapping mnemonic names to IP addresses TCP Port 80 is a “well known” port

26 TCP Well Known Ports

27 HTTP Request The client sends a message to the server requesting service. The client request contains HTTP request headers that define the “method” requested for the transaction The request header is followed by information about the capabilities of the client, followed by the data to be sent to the HTTP server, if any
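
To make the request stage concrete, here is a minimal sketch that assembles the text of a request message: the method line, a few headers describing the client, a blank line, and then any data. The function name build_request and the header values are assumptions chosen for illustration; real browsers send many more headers.

```python
def build_request(method, path, host, body=b""):
    """Assemble an HTTP request message as bytes (illustrative only)."""
    lines = [
        f"{method} {path} HTTP/1.0",      # request line: method, resource, version
        f"Host: {host}",                  # which site is being addressed
        "User-Agent: sketch-client/0.1",  # hypothetical client name
        "Accept: text/html",              # what the client is able to display
    ]
    if body:
        # Tell the server how many bytes of data follow the headers.
        lines.append(f"Content-Length: {len(body)}")
    # A blank line separates the headers from the data.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body

get_request = build_request("GET", "/htmldocs/", "www.site.edu")
post_request = build_request("POST", "/cgi-bin/srch", "www.site.edu", b"q=cgi")
print(get_request.decode("ascii"))
```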

28 HTTP Response The server sends a response to the client The response is composed of “response headers” describing the state of the transaction The response header is then followed by any data required for the client
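
The response has the mirror-image layout: a status line, response headers, a blank line, and then the data. The sketch below splits a hand-written sample response into those parts; parse_response and the sample message are hypothetical and not taken from a real server.

```python
def parse_response(raw):
    """Split a raw HTTP response into status code, reason, headers and data."""
    # The first blank line separates the headers from the data.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body

sample = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>Hello</p>\n"
)
code, reason, headers, body = parse_response(sample)
print(code, reason, headers["content-type"], len(body))
```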

29 HTTP Close Connection The connection is closed by either the server or the client once the response has been delivered

30 HTTP Procedure The procedure outlined previously implies that only a single download or process can be handled per connection This has some implications regarding handling of a request Consider the following scenarios: –Single Transaction per Connection –Statelessness of the Connection

31 Single Transaction per Connection Suppose HTTP is utilized to access an HTML document that contains ten different images As a result, retrieving the document requires 11 distinct connections –One request for the HTML document –Ten additional requests for the images
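
The count of 11 can be reproduced with a small sketch that scans a page for image tags, using Python's standard html.parser module. It assumes one connection per request and that every image is distinct; the class and function names, and the generated test page, are made up for the example.

```python
from html.parser import HTMLParser

class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a document."""

    def __init__(self):
        super().__init__()
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.image_urls.append(src)

def connections_needed(html):
    """One connection for the document plus one per image it references."""
    collector = ImageCollector()
    collector.feed(html)
    return 1 + len(collector.image_urls)

# A hypothetical page containing ten different images.
images = "".join(f'<img src="img{i}.gif">' for i in range(10))
page = f"<html><body>{images}</body></html>"
print(connections_needed(page))  # 11
```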

32 Statelessness of the Connection Suppose a user retrieves a “fill-in” HTML form from the HTTP server The user would then enter their username and password in order to access restricted data After the client submits the form data, the HTTP server hands off the information to the CGI program The CGI program then processes the data, and returns it as an HTML document, which is then delivered to the client Note that the HTTP server would not retain any knowledge of this connection. The state information would be included in the form data
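
Because the server remembers nothing between connections, anything it needs to know must travel with each request. The sketch below assumes a hypothetical form with a username field and a step counter; handle_form stands in for the gateway program and keeps no memory between calls.

```python
from urllib.parse import parse_qs, urlencode

def handle_form(body):
    """Stateless handler: everything it knows arrives in the form data."""
    fields = parse_qs(body)
    user = fields.get("username", [""])[0]
    step = int(fields.get("step", ["1"])[0])
    # The reply hands the next step back to the client, which must return
    # it with its next submission (for example in a hidden form field).
    return {"username": user, "step": step + 1}

first_submission = urlencode({"username": "alice", "step": 1})
reply = handle_form(first_submission)

# The client echoes the state back; the server itself stored nothing.
second_submission = urlencode(reply)
print(handle_form(second_submission))
```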

33 Eavesdropping Recall that all HTML information is passed back and forth between the client and server in unencrypted ASCII character format This implies that a machine on the network path could “listen” to the Port 80 traffic sent between the HTTP server and the client If security is required, a secure form of HTTP must be used (HTTPS) –Secure communication is beyond the scope of this course

34 Common Gateway Interface This is the standard method for communication between HTTP servers, and server-side gateway programs When access to a gateway program is required, the HTTP server uses the CGI mechanism to start the program and pass it any data required When the processing is finished, the program's output is passed back to the HTTP server Gateway programs can be compiled programs or scripts, written in almost any language –High-level languages: C, C++, Pascal –Scripting languages: perl, tcl, Unix shell, etc.

35 Common Gateway Interface (cont’d) Gateway programs reside in the “/cgi-bin” folder of the server
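
A minimal sketch of a gateway program, assuming the usual CGI conventions: the server supplies the request data through environment variables (REQUEST_METHOD, QUERY_STRING, CONTENT_LENGTH) and standard input, and the program writes a header, a blank line, and an HTML document. The function run_gateway and the search field name "q" are hypothetical; the demo passes the environment in as a plain dictionary so that it runs without a web server.

```python
import io
from html import escape
from urllib.parse import parse_qs

def run_gateway(environ, stdin):
    """Build the gateway program's output from CGI-style inputs."""
    if environ.get("REQUEST_METHOD") == "POST":
        # POST data arrives on standard input (simplified: read as text).
        length = int(environ.get("CONTENT_LENGTH") or 0)
        data = stdin.read(length)
    else:
        # GET data arrives in the part of the URL after the "?".
        data = environ.get("QUERY_STRING", "")
    term = parse_qs(data).get("q", [""])[0]
    body = f"<html><body><p>You searched for: {escape(term)}</p></body></html>\n"
    # A header, then a blank line, then the document for the server to relay.
    return "Content-Type: text/html\r\n\r\n" + body

# Under a real server the inputs would be os.environ and sys.stdin.
demo_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "q=cgi+%3Ctest%3E"}
output = run_gateway(demo_env, io.StringIO(""))
print(output)
```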

36 Next Lecture(s) This presentation concludes our discussion of HTTP/URLs, which are Layer 5 constructs The next topic of discussion will be on utilities that are of use in web development, and HTML At the conclusion of these lectures, we will discuss how to use these tools to build a web site

