1
Internet Engineering Course
Web Servers
2
Introduction
A company needs to provide various web services:
  Hosting intranet applications
  The company web site
  Various internet applications
It therefore needs to run an HTTP server.
First we look at what the HTTP protocol is.
Then we talk about Web servers, and Apache as the leading Web server application.
3
The World Wide Web (WWW)
A global hypertext system.
Initially developed in 1989 by Tim Berners-Lee at the European Laboratory for Particle Physics (CERN) in Switzerland, to give a geographically dispersed group of scientists an easy way to share and edit research documents.
In 1993 it started to grow rapidly, mainly because NCSA developed a Web browser called Mosaic (an X Window-based application):
  The first graphical interface to the Web
  More convenient browsing
A flexible way for people to navigate through worldwide resources on the Internet and retrieve them.
4
Web Browsers
A Web browser provides access to a Web server.
Basic components:
  HTML interpreter
  HTTP client, used to retrieve HTML pages
Some browsers also support FTP, NNTP, POP, SMTP, …
5
Web Servers
Definitions:
  A computer responsible for accepting HTTP requests from clients and serving them Web pages.
  A computer program that provides the above-mentioned functionality.
Common features:
  Accepting HTTP requests from the network
  Providing an HTTP response to the requester, typically consisting of an HTML document
  Usually capable of logging client requests and server responses
6
Web Servers cont.
Returned content:
  Static: comes from an existing file.
  Dynamic: generated on the fly by some other program or script called by the Web server.
Path translation:
  The server translates the path component of a URL into a local file system resource.
  The path specified by the client is relative to the server's root directory.
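Path translation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name translate_path and the root directory /var/www are hypothetical, and real servers apply further rules (aliases, index files, per-user directories).

```python
import posixpath
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def translate_path(url_path: str, doc_root: str) -> PurePosixPath:
    """Map the path component of a URL onto a file under doc_root."""
    # Keep only the path: drop any query string or fragment, decode %xx escapes.
    path = unquote(urlsplit(url_path).path)
    # Normalise the path, then discard empty, "." and ".." segments so the
    # result can never point outside the document root.
    segments = posixpath.normpath(path).split("/")
    safe = [s for s in segments if s not in ("", ".", "..")]
    return PurePosixPath(doc_root).joinpath(*safe)
```

For example, translate_path("/docs/index.html?x=1", "/var/www") gives /var/www/docs/index.html, and a request for "/../../etc/passwd" stays inside /var/www.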
7
Basic Client/Server Architecture in WWW
[Figure: Overall organization of the Web.]
The basic operation is to fetch documents.
The client issues requests, and the browser displays the document.
The server is responsible for retrieving the document from its local file system.
Client/server communication is based on the HTTP protocol.
8
Dynamic Content
Parts of documents may be specified via scripts or programs.
Client-side (executed on the client machine, e.g., within the browser):
  Client-side script: a script embedded in the HTML document
  Applet: a pre-compiled program passed to the client
Server-side (executed on the server machine):
  Server-side script embedded in the document
  Servlet: a pre-compiled program executed within the server's address space
  CGI scripts
9
Common Gateway Interface (CGI)
[Figure: The principle of using server-side CGI programs.]
CGI allows documents to be generated dynamically, "on the fly".
It provides a standard way for a Web server to execute a program using user-provided data as input.
To the server, a CGI program appears as the program responsible for fetching the requested document.
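A rough Python sketch of what a server does when it hands a request to a CGI program: it starts a separate process, passes the user-provided data through environment variables defined by the CGI convention (QUERY_STRING, REQUEST_METHOD), and takes whatever the program writes to standard output as the response. The function name run_cgi is hypothetical, and only a GET request with a query string is covered.

```python
import os
import subprocess


def run_cgi(program_args: list, query_string: str) -> bytes:
    """Run a CGI program as a separate process and return its raw output."""
    # The server passes request data to the program via environment variables.
    env = dict(
        os.environ,
        GATEWAY_INTERFACE="CGI/1.1",
        REQUEST_METHOD="GET",
        QUERY_STRING=query_string,
    )
    # Whatever the program prints (headers, blank line, body) is the response.
    result = subprocess.run(program_args, env=env, capture_output=True, check=True)
    return result.stdout
```

The program itself can be written in any language; it only needs to read the environment and print headers, a blank line, and a body.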
10
Architectural Overview
[Figure: Architectural details of a client and server in the Web.]
Document fetch (and possibly server-side script): steps 2b-3b
Execute CGI script (separate process): steps 2c-3c-4c
Execute servlet program (runs within the server): steps 2a-3a-4a
11
HTTP protocol
Defines the communication between a Web server and a client.
Used to deliver virtually all files and other data (collectively called resources) on the World Wide Web.
A browser is an HTTP client because it sends requests to an HTTP server (Web server).
The standard (and default) port for HTTP servers to listen on is 80, though they can use any port.
12
Structure of HTTP transactions
A request/response, text-based protocol.
Format of an HTTP message:
<initial line, different for request vs. response>
Header1: value1
Header2: value2
Header3: value3

<optional message body goes here, like file contents or query data; it can be many lines long, or even binary data>
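The message layout can be made concrete with a small parser. This is a minimal sketch, assuming the whole message is already in memory as bytes; parse_http_message is an illustrative name, not a library function, and repeated or folded headers are not handled.

```python
def parse_http_message(raw: bytes):
    """Split an HTTP message into (initial line, headers, body)."""
    # The first blank line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    initial_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Each header line has the form "Name: value"; names are case-insensitive.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return initial_line, headers, body
```

The same function works for requests and responses, because only the initial line differs between the two.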
13
The Format of a Request
request line:   method sp URL sp version cr lf
header lines:   header : value cr lf
                ...
                header : value cr lf
blank line:     cr lf
Entity Body
14
Request Example
GET /index.html HTTP/1.1 [CRLF]
Accept: image/gif, image/jpeg [CRLF]
User-Agent: Mozilla/4.0 [CRLF]
Host: [CRLF]
Connection: Keep-Alive [CRLF]
[CRLF]
15
Request Example
Request line (method, request URL, version):
GET /index.html HTTP/1.1
Headers:
Accept: image/gif, image/jpeg
User-Agent: Mozilla/4.0
Host:
Connection: Keep-Alive
[blank line here]
16
The Format of a Response
status line:    version sp status code sp phrase cr lf
header lines:   header : value cr lf
                ...
                header : value cr lf
blank line:     cr lf
Entity Body
17
Response Example
HTTP/1.0 200 OK
Date: Fri, 31 Dec 1999 23:59:59 GMT
Content-Type: text/html
Content-Length: 1354

<html>
<body>
<h1>Hello World</h1>
(more file contents)
. . .
</body>
</html>
18
Response Example
Status line (version, status code, reason phrase):
HTTP/1.0 200 OK
Headers:
Date: Fri, 31 Dec 1999 23:59:59 GMT
Content-Type: text/html
Content-Length: 1354
Message body:
<html>
<body>
<h1>Hello World</h1>
(more file contents)
. . .
</body>
</html>
19
Initial line
A typical initial request line:
  GET /path/to/file/index.html HTTP/1.0
Initial response line (the status line):
  HTTP/1.0 200 OK
  HTTP/1.0 404 Not Found
Status code:
  1xx indicates an informational message only
  2xx indicates success of some kind
  3xx redirects the client to another URL
  4xx indicates an error on the client's part
  5xx indicates an error on the server's part
Common status codes:
  200 OK
  404 Not Found
  301 Moved Permanently
  302 Moved Temporarily
  303 See Other (HTTP 1.1 only)
  500 Server Error
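A short Python sketch of how a client might take a status line apart and decide which class of response it received. The function names are illustrative; the class labels simply restate the 1xx to 5xx ranges.

```python
def parse_status_line(line: str):
    """Split a status line into (version, numeric code, reason phrase)."""
    # The reason phrase may itself contain spaces, so split at most twice.
    version, code, phrase = line.split(" ", 2)
    return version, int(code), phrase


def status_class(code: int) -> str:
    """Return the class of a status code based on its first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")
```

A client usually branches on the class first and only then on specific codes such as 301 or 404.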
20
Header lines
Typical request headers:
  From: the address of the requester
  User-Agent: identifies the client program, for example User-Agent: Mozilla/3.0Gold
Typical response headers:
  Server: identifies the server software, for example Server: Apache/1.2b3-dev
  Last-Modified: the date and time the resource was last modified, given in GMT
21
Message body
In a response, this is where the requested resource is returned to the client (the most common use of the message body), or perhaps explanatory text if there's an error.
In a request, this is where user-entered data or uploaded files are sent to the server.
If an HTTP message includes a body, there are usually header lines in the message that describe the body. In particular:
  The Content-Type: header gives the MIME type of the data in the body, such as text/html or image/gif.
  The Content-Length: header gives the number of bytes in the body.
22
MIME Media types
MIME stands for Multipurpose Internet Mail Extensions.
HTTP sends the media type of the file using the Content-Type: header.
Some important media types are:
  text/plain, text/html
  image/gif, image/jpeg
  audio/basic, audio/wav
  model/vrml
  video/mpeg, video/quicktime
  application/*: application-specific data that does not fall under any other MIME category, e.g. application/octet-stream
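As an illustration, the two headers that describe a body can be derived with Python's standard mimetypes module. This is a sketch only: body_headers is a hypothetical helper, and guessing the type from the file extension is one common approach, not the only one.

```python
import mimetypes


def body_headers(filename: str, body: bytes) -> dict:
    """Build the Content-Type and Content-Length headers for a body."""
    # guess_type() maps a file extension to a media type, or None if unknown.
    media_type, _encoding = mimetypes.guess_type(filename)
    return {
        # Fall back to the generic binary type when the extension is unknown.
        "Content-Type": media_type or "application/octet-stream",
        # Content-Length counts bytes, not characters.
        "Content-Length": str(len(body)),
    }
```

Calling body_headers("index.html", b"<h1>Hello World</h1>") yields Content-Type text/html and Content-Length 20.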
23
Sample HTTP exchange
To retrieve the file at a URL whose path is /path/file.html:
Request:
GET /path/file.html HTTP/1.0
From:
User-Agent: HTTPTool/1.0
[blank line here]
Response:
HTTP/1.0 200 OK
Date: Fri, 31 Dec 1999 23:59:59 GMT
Content-Type: text/html
Content-Length: 1354

<html>
<body>
<h1>Happy New Millennium!</h1>
(more file contents)
</body>
</html>
24
HTTP methods
GET
  Requests a resource by URL.
HEAD
  Just like a GET request, except it asks the server to return the response headers only, and not the actual resource (i.e. no message body).
  This is useful for checking characteristics of a resource without actually downloading it, thus saving bandwidth.
POST
  A POST request is used to send data to the server to be processed in some way, for example by a CGI script.
  There's a block of data sent with the request, in the message body. There are usually extra headers to describe this message body, like Content-Type: and Content-Length:.
  The request URI is not a resource to retrieve; it's usually a program to handle the data you're sending.
  The HTTP response is normally program output, not a static file.
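To make the POST description concrete, here is a minimal Python sketch that assembles a POST request carrying form data. The function name, the path /cgi-bin/form.cgi and the host example.com are placeholders chosen for illustration.

```python
from urllib.parse import urlencode


def build_post_request(path: str, host: str, form: dict) -> bytes:
    """Assemble a POST request whose body holds URL-encoded form fields."""
    # The form data travels in the message body, not in the URL.
    body = urlencode(form).encode("ascii")
    head = (
        f"POST {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        # These two headers describe the body that follows.
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body
```

With the form {"name": "Ada", "lang": "en"} the body is name=Ada&lang=en and the Content-Length header is 16.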
25
HTTP 1.1
It is a superset of HTTP 1.0. Improvements include:
  Faster response, by allowing multiple transactions to take place over a single persistent connection.
  Faster response and great bandwidth savings, by adding cache support.
  Faster response for dynamically generated pages, by supporting chunked encoding, which allows a response to be sent before its total length is known.
  Efficient use of IP addresses, by allowing multiple domains to be served from a single IP address.
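Chunked encoding is easy to show in code. In this Python sketch each chunk is written as its size in hexadecimal, a CRLF, the data, and another CRLF, and a zero-size chunk ends the body. The function names are illustrative, and optional trailers after the last chunk are not handled.

```python
def encode_chunked(pieces) -> bytes:
    """Encode an iterable of byte strings as an HTTP/1.1 chunked body."""
    out = bytearray()
    for piece in pieces:
        if piece:  # an empty chunk would be mistaken for the terminator
            out += f"{len(piece):X}\r\n".encode("ascii") + piece + b"\r\n"
    out += b"0\r\n\r\n"  # zero-size chunk marks the end of the body
    return bytes(out)


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the body from a chunked-encoded byte string."""
    body = bytearray()
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        # The size line may carry extensions after ';', which are ignored here.
        size = int(data[pos:eol].split(b";")[0], 16)
        if size == 0:
            return bytes(body)
        start = eol + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data
```

Because each chunk carries its own length, the sender never needs to know the total size in advance.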
26
Manually Experimenting with HTTP
>telnet eng.ui.ac.ir 80
Trying …
Connected to eng.ui.ac.ir
Escape character is ‘^]’.
27
Sending a Request
> GET /~ladani/index.htm HTTP/1.0
[blank line]
28
The Response
HTTP/1.1 200 OK
Date: Fri, 29 Feb 2008 08:23:33 GMT
Server: Apache/ (CentOS)
Last-Modified: Wed, 07 Nov :27:44 GMT
ETag: "6ccb6-741c-43e55e05a5000"
Accept-Ranges: bytes
Content-Length: 29724
Connection: close
Content-Type: text/html; charset=WINDOWS-1256

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<meta name=" GENERATOR" content="Microsoft FrontPage 5.0">
….
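The same experiment can be scripted instead of typed into telnet. The Python sketch below opens a TCP connection, sends an HTTP/1.0 GET request followed by a blank line, and reads until the server closes the connection. The function name http_get_raw is hypothetical, and the added Host header is an assumption that some servers need; it is not part of the telnet session shown here.

```python
import socket


def http_get_raw(host: str, path: str, port: int = 80, timeout: float = 10.0) -> bytes:
    """Send a bare HTTP/1.0 GET request and return the raw response bytes."""
    # Request line, one header, then the blank line that ends the request.
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            # With HTTP/1.0 the server closes the connection after the response,
            # so an empty read means the response is complete.
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

The returned bytes contain the status line, the headers, a blank line, and the body, exactly as they would appear in the telnet window.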
29
GET /~ladani/index.htm HTTP/1.0
HTTP/1.1 200 OK
(HTML code)
30
GET /~ladani/no-such-page.htm HTTP/1.0
HTTP/1.1 404 Not Found
(HTML code)
31
HTTP/1.1 without Host Header
GET /index.html HTTP/1.1
HTTP/1.1 400 Bad Request
(HTML code)
Why is it a Bad Request? Because the request uses HTTP/1.1 but has no Host header, which HTTP/1.1 requires.
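The rule can be expressed as a small check on the server side. This Python sketch returns the status code a server would choose based only on the request line and the presence of a Host header; the function name check_request is illustrative and all other validation is left out.

```python
def check_request(raw: bytes) -> int:
    """Return 400 for a malformed request or an HTTP/1.1 request without Host."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3:
        return 400  # request line must be: method, URL, version
    version = parts[2]
    # Header names are case-insensitive, so compare them in lower case.
    names = {line.split(":", 1)[0].strip().lower() for line in lines[1:] if ":" in line}
    if version == "HTTP/1.1" and "host" not in names:
        return 400  # HTTP/1.1 requires a Host header
    return 200
```

The Host header is what lets one IP address serve several domains, which is why HTTP/1.1 makes it mandatory while HTTP/1.0 does not.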
32
Session-persistent State
What does session-persistent state mean?
  State information that is preserved between browsing sessions.
  Information that is stored semi-permanently (i.e., on disk) for later access.
Why was the calculator example not session-persistent?
  The sum, current display, etc. were not preserved if we went to a different website and back to the calculator.
33
Why session-persistence?
User-based customizations:
  MyYahoo, E*Trade, etc.
Long transactions:
  Electronic shopping carts
  Order preparation
Server-side state maintenance:
  Large amounts of state information that you don't want to pass back and forth.
34
Cookie Overview
HTTP cookies are a mechanism for creating and using session-persistent state.
Cookies are simple string values that are associated with a set of URLs.
Servers set cookies using an HTTP header.
The client transmits the cookie as part of the HTTP request whenever an associated URL is visited in the future.
35
Anatomy of a cookie
A cookie has 6 parts:
  Name
  Value
  Domain
  Path
  Expiration
  Security flag
Name and Value are required; the others have default values.
36
Setting a cookie
A cookie is set using the "Set-Cookie" header in an HTTP response.
The string value of the Set-Cookie header is parsed into semicolon-separated fields that define the different parts of the cookie.
The cookie is stored by the client.
37
Sending cookies
Every time a client makes an HTTP request, it tests every stored cookie for a match.
A cookie matches if all of the following hold:
  The cookie domain is a suffix of the URL's server name.
  The cookie expiration has not passed.
  The cookie path is a prefix of the URL path.
  If the cookie's security flag is on, the connection is secure.
If a match is made, the name/value pair of the cookie is sent in the "Cookie" header of the request.
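The matching test a client performs can be sketched in Python as follows. The Cookie class and cookie_matches function are hypothetical names. One deliberate difference from a plain suffix test: the domain check here only matches on a label boundary, so that a cookie for ui.ac.ir does not match a host such as xui.ac.ir.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Cookie:
    name: str
    value: str
    domain: str
    path: str = "/"
    expires: Optional[datetime] = None  # None means no expiration was given
    secure: bool = False


def cookie_matches(cookie: Cookie, host: str, path: str,
                   is_https: bool, now: datetime) -> bool:
    """Decide whether a stored cookie should be sent with a request."""
    domain = cookie.domain.lstrip(".")
    # Domain must equal the host or be a suffix of it at a dot boundary.
    domain_ok = host == domain or host.endswith("." + domain)
    # Cookie path must be a prefix of the requested path.
    path_ok = path.startswith(cookie.path)
    not_expired = cookie.expires is None or now < cookie.expires
    # A cookie with the security flag is only sent over a secure connection.
    secure_ok = is_https or not cookie.secure
    return domain_ok and path_ok and not_expired and secure_ok
```

Every matching cookie contributes one name=value pair to the Cookie request header.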
38
Setting a Cookie
Full cookie:
Set-Cookie: my_cookie=This is my cookie value; domain=.eng.ui.ac.ir; path=/~ladani; expires=Thu, 06-Mar-08 12:00:00 GMT
A response can have more than one Set-Cookie header, or can combine more than one cookie in one header by separating them with a comma (,).
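A rough Python sketch of how a client could split the value of a Set-Cookie header into its parts. The function name parse_set_cookie is illustrative, only the six parts named in these slides are recognised, and a header that combines several cookies with commas is not handled.

```python
def parse_set_cookie(header_value: str) -> dict:
    """Split a Set-Cookie header value into the parts of one cookie."""
    fields = [f.strip() for f in header_value.split(";") if f.strip()]
    # The first field is always name=value.
    name, _, value = fields[0].partition("=")
    cookie = {"name": name.strip(), "value": value.strip(), "secure": False}
    for field in fields[1:]:
        key, _, val = field.partition("=")
        key = key.strip().lower()
        if key == "secure":
            cookie["secure"] = True  # the security flag has no value
        elif key in ("domain", "path", "expires"):
            cookie[key] = val.strip()
    return cookie
```

Parts that are missing from the header are simply absent from the result, and the client then applies its defaults.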
39
Cookie Matching
Biggest misunderstanding:
  Servers do not RETRIEVE cookies!
  Servers RECEIVE cookies previously planted.
Step 1:
  Some response by the server installs the cookie with a "Set-Cookie" header.
  The client saves the cookie to disk.
40
Cookie Matching
Step 2:
  The browser goes to some page which matches a previously received cookie.
  The cookie name and value are sent in the request as the "Cookie" HTTP header.
Step 3:
  The CGI program detects the presence of the cookie and uses it.
  Where is the cookie info? In the environment variable HTTP_COOKIE.
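Inside a CGI program written in Python, reading the cookie could look like the sketch below. It uses the standard http.cookies module; the helper name cookies_from_environ and the cookie names in the example are made up for illustration.

```python
import os
from http.cookies import SimpleCookie


def cookies_from_environ(environ=os.environ) -> dict:
    """Return the cookies the server passed to a CGI program."""
    jar = SimpleCookie()
    # The server copies the request's Cookie header into HTTP_COOKIE.
    jar.load(environ.get("HTTP_COOKIE", ""))
    return {name: morsel.value for name, morsel in jar.items()}
```

Given HTTP_COOKIE set to "session_id=abc123; theme=dark", the function returns both pairs as a dictionary; with no cookie it returns an empty dictionary.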
41
Where are cookies stored on client?
In client-specific locations; there is no standard.
The latest IE stores them in a folder called "Temporary Internet Files", with each cookie in a separate file.
Netscape stores them in "cookies.txt".
42
Typical Cookie Usages
Cookies as Database Index:
  The most common use of cookies.
  State information is kept in some sort of database and the cookie acts as an index.
Cookies as State Variables:
  The name of the cookie is like a variable name.
  The value of the cookie is the state information.
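The "database index" usage can be sketched in Python with an in-memory dictionary standing in for the database. The SessionStore class is hypothetical: the server keeps the state, and only a random identifier would travel in the cookie.

```python
import secrets


class SessionStore:
    """Server-side state keyed by an identifier carried in a cookie."""

    def __init__(self):
        self._sessions = {}  # stands in for a real database

    def create(self, state: dict) -> str:
        # A random, hard-to-guess identifier becomes the cookie value.
        session_id = secrets.token_hex(16)
        self._sessions[session_id] = dict(state)
        return session_id

    def lookup(self, session_id: str):
        # Returns the stored state, or None for an unknown identifier.
        return self._sessions.get(session_id)
```

The server would send the identifier returned by create() in a Set-Cookie header and call lookup() with the value it receives back on later requests.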
43
Cookie Security
The security flag restricts when the browser will send a cookie back to the server.
It requires a "secure" connection, for example with https in effect.
What does this mean about when the cookie was set?
44
First Web Server
Berners-Lee wrote two programs:
  A browser called WorldWideWeb
  The world's first Web server, which ran on NeXTSTEP
The machine is on exhibition at CERN's public museum.
45
Most Famous Web Servers
Apache HTTP Server, from the Apache Software Foundation
Internet Information Services (IIS), from Microsoft
Google Web Server (GWS), started from May 2007
Lighttpd, which powers several popular Web 2.0 sites like YouTube, Wikipedia and Meebo
46
Web Servers Usage – Statistics
The most popular Web servers used for public Web sites are tracked by the Netcraft Web Server Survey.
Details are given in the Netcraft Web Server Reports.
Apache has been the most popular since April 1996.
Currently (February 2008), about:
  50.93% Apache
  35.56% Microsoft (IIS, PWS, etc.)
  5.16% Google
  0.99% Lighttpd
47
Web Servers Usage – Statistics cont.
[Figure: Total Sites Across All Domains, through February 2008]
48
Web Servers Usage – Statistics cont.
[Figure: Market Share for Top Servers Across All Domains, through February 2008]
49
Web Servers Usage – Statistics cont.
[Figure: Totals for Active Servers Across All Domains, through February 2008]
50
Apache (A PAtCHy) Web Server
Origins: NCSA (Univ. of Illinois, Urbana/Champaign)
Now: Apache Software Foundation, with developers world-wide
Most widely used Web server today [Netcraft web survey, 2/2008]
Open source software
Geographically distributed developers
Modular, extensible design, needed so that third-party developers can override or extend basic characteristics
51
Web Server Processing Steps
1. Accept client connection
2. Read HTTP request header
3. Find file
4. Send HTTP response header
5. Read file
6. Send data
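The middle of this sequence can be sketched as one Python function that turns a raw request into a raw response. It is an illustration under stated assumptions, not how Apache is implemented: accepting the connection and writing to the socket are left out, only GET is supported, and the file is read before the header is built so that Content-Length is known. The name handle_request is hypothetical.

```python
import mimetypes
from pathlib import Path


def handle_request(raw_request: bytes, doc_root: Path) -> bytes:
    """Serve a static file for a GET request and return the raw response."""
    # Read the HTTP request header: only the request line is needed here.
    request_line = raw_request.split(b"\r\n", 1)[0].decode("iso-8859-1")
    try:
        method, target, _version = request_line.split(" ")
    except ValueError:
        return b"HTTP/1.0 400 Bad Request\r\n\r\n"
    if method != "GET":
        return b"HTTP/1.0 501 Not Implemented\r\n\r\n"
    # Find the file: the URL path is relative to the document root.
    relative = target.split("?", 1)[0].lstrip("/")
    path = (doc_root / relative).resolve()
    if doc_root.resolve() not in path.parents or not path.is_file():
        return b"HTTP/1.0 404 Not Found\r\n\r\n"
    # Read the file, then build the response header followed by the data.
    body = path.read_bytes()
    media_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    header = (
        "HTTP/1.0 200 OK\r\n"
        f"Content-Type: {media_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return header.encode("ascii") + body
```

A real server would wrap this in a loop that accepts connections, reads the request from the socket, and sends the returned bytes back to the client.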
52
Apache HTTP Server
Apache core:
  Receives the client request
  Typically allocates a new process for each incoming request
  Allocates a request record
  Invokes handlers on individual modules in sequence
  Modules register handlers during configuration
Handler:
  The request record is passed as a single parameter
  Each handler reads/modifies the request record
53
Web Server Phases
The Apache core invokes a handler for each phase:
1. Resolve the document reference (URI) to a local file name (or CGI program + parameters)
2. Client authentication (verify client identity)
3. Client access control (determine access rights)
4. Request access control (check if access is allowed)
5. MIME type determination of the response
6. General phase for handling leftovers (e.g., check syntax of returned response, build up user profile)
7. Transmission of the response to the client
8. Logging data on the processing of the request
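The idea of a core that calls registered handlers phase by phase, passing one request record along, can be sketched in Python. This is not Apache's module API: the phase names, the RequestRecord fields and the Core class are all invented here to illustrate the pattern.

```python
from dataclasses import dataclass, field

# Hypothetical phase names, one per phase, in processing order.
PHASES = [
    "translate", "authenticate", "client_access", "request_access",
    "mime_type", "fixup", "respond", "log",
]


@dataclass
class RequestRecord:
    uri: str
    filename: str = ""
    content_type: str = ""
    notes: list = field(default_factory=list)


class Core:
    def __init__(self):
        self._handlers = {phase: [] for phase in PHASES}

    def register(self, phase: str, handler) -> None:
        """Called by a module at configuration time."""
        self._handlers[phase].append(handler)

    def process(self, record: RequestRecord) -> RequestRecord:
        # Invoke the handlers phase by phase; each one receives the request
        # record as its single parameter and may read or modify it.
        for phase in PHASES:
            for handler in self._handlers[phase]:
                handler(record)
        return record
```

A module that only cares about one phase registers a single handler, and the core never needs to know what the handler does with the record.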
54
References
http://www.jmarshall.com/easy/http/
TCP/IP Tutorial and Technical Overview, Rodriguez, Gatrell, Karas, Peschke, IBM Redbooks, August 2001
Wikipedia, the free encyclopedia
Apache: The Definitive Guide, 2nd edition, Ben Laurie, Peter Laurie, O'Reilly, February 1999
Webmaster in a Nutshell, 1st edition, Stephen Spainhour, Valerie Quercia, O'Reilly, October 1996
Netcraft: February 2006 Web Server Survey