Published by Annabella Wilkins. Modified over 8 years ago.
Web (HTTP) Servers Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems January 24, 2008
2 Today Web (HTTP) servers Upcoming: Data on the Web HTML, XML, JPEG, MP3, … Please read chapter on XML, for review
3 Designing Servers (Review) Major issues: Concurrency Statefulness and sessions Are requests self-contained, or do they require the server to keep around state? Communication and consistency What state is shared across requests? Do all requests need the same view? … And, of course, security!!!
4 Recall Our Toy Example Suppose we want to build an “arithmetic” server Takes a request for a computation Parses the computation request Performs the computation Generates an HTML document with the result Returns the result to the requestor Suppose we can build on TCP…
5 Thread Pools Goal is to limit maximum concurrency and prevent thrashing among threads Main server thread (daemon) puts requests on an event queue Worker threads retrieve items from queue and process; they sleep when queue is empty
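A minimal sketch of the thread-pool idea above, assuming a fixed number of workers and a plain queue guarded by `synchronized` with `wait()`/`notify()`. The class and method names (`SimpleThreadPool`, `submit`, `shutdownAndWait`) are hypothetical, chosen only for illustration; this is not the course's reference design.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical names throughout: a sketch of a fixed-size pool, not a reference solution.
class SimpleThreadPool {
    private final Queue<Runnable> queue = new ArrayDeque<>();
    private final Thread[] workers;
    private boolean shutdown = false; // guarded by the queue's monitor

    SimpleThreadPool(int numWorkers) {
        workers = new Thread[numWorkers];
        for (int i = 0; i < numWorkers; i++) {
            workers[i] = new Thread(this::workerLoop, "worker-" + i);
            workers[i].start();
        }
    }

    // Called by the main server thread for each incoming request.
    void submit(Runnable task) {
        synchronized (queue) {
            if (shutdown) throw new IllegalStateException("pool is shut down");
            queue.add(task);
            queue.notify(); // wake one sleeping worker
        }
    }

    // Stop accepting work, let the workers drain the queue, then wait for them to exit.
    void shutdownAndWait() {
        synchronized (queue) {
            shutdown = true;
            queue.notifyAll();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void workerLoop() {
        while (true) {
            Runnable task;
            synchronized (queue) {
                while (queue.isEmpty() && !shutdown) {
                    try {
                        queue.wait(); // sleep while the queue is empty
                    } catch (InterruptedException e) {
                        return;
                    }
                }
                if (queue.isEmpty()) return; // shut down and nothing left to do
                task = queue.remove();
            }
            try {
                task.run(); // run outside the lock so other workers can proceed
            } catch (RuntimeException e) {
                System.err.println(Thread.currentThread().getName() + ": " + e);
            }
        }
    }
}
```

The number of workers caps concurrency, and the `while` loop around `wait()` guards against spurious wakeups. `java.util.concurrent` offers ready-made executors and blocking queues that do the same job.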
6 Concurrency and Debugging A critical issue: how do we debug concurrent apps? Consider: Printlns (be sure to tag with the thread/context info) Logs (see log4j) Selective breakpoints Remember to do “binary search” on your bugs!
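One possible way to tag printlns with thread and context info, as suggested above. `DebugLog` and its methods are hypothetical helper names; a logging library such as log4j provides the same information through its layout configuration.

```java
// Hypothetical helper: prefixes each debug line with the thread name and a context tag.
class DebugLog {
    static String format(String context, String message) {
        return "[" + Thread.currentThread().getName() + "] [" + context + "] " + message;
    }

    static void log(String context, String message) {
        // stderr is typically unbuffered, so lines appear close to when they were written
        System.err.println(format(context, message));
    }
}
```

Calling `DebugLog.log("request-7", "parsed headers")` from a worker thread would then show which thread and which request produced each line.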
7 Statefulness and Sessions Very early HTTP Essentially stateless Make a request; the response is a page that is named by the URL More recent HTTP, and other protocols: Some amount of state is maintained In HTTP, this requires cookies (more later) In many other protocols, the connection is kept open and all state is preserved on both ends Pros and cons of statefulness?
8 Communication and Consistency A key question in deciding on a server architecture: how much interaction is there among server processes / requests? Let’s consider: Amazon.com, eBay, Blogger.com, iTunes, Google
9 Shared, Persistent State Generally a database back-end: recovery and reliability features; transaction support; simple query interface. Often the database is on a different server from the executing code. This is what Enterprise JavaBeans are designed to support: distributed transactions. The “Model view controller” pattern is the most common. [Figure: Model / View / Controller diagram, with labels: client-side JavaScript, XML view, database, AJAX map]
10 Web (HTTP) Servers [Figure: an HTTP request arrives at port 80, is processed, and the response goes back to another port] A web server processes HTTP requests, generally over TCP port 80 (or often 808x). The response returns over the same TCP connection, to the port the client connected from. Handling a request may involve: returning a document, with its (MIME) type info (e.g., HTML document, TXT document); invoking a program or module and returning its output; submitting form data to a program or module and returning its output. Resources are described using URLs.
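To illustrate "returning a document, with its (MIME) type info", here is a small sketch that picks a MIME type from a file extension. The table is a tiny illustrative subset, and `MimeTypes.forPath` is a hypothetical name; real servers usually load a much larger mapping from configuration.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical lookup table: maps a few file extensions to MIME types.
class MimeTypes {
    private static final String DEFAULT = "application/octet-stream";

    private static final Map<String, String> BY_EXTENSION = Map.of(
        "html", "text/html",
        "htm", "text/html",
        "txt", "text/plain",
        "xml", "application/xml",
        "jpg", "image/jpeg",
        "jpeg", "image/jpeg",
        "mp3", "audio/mpeg");

    static String forPath(String path) {
        int dot = path.lastIndexOf('.');
        int slash = path.lastIndexOf('/');
        // No extension in the last path segment: fall back to a generic binary type.
        if (dot < 0 || dot < slash) return DEFAULT;
        String ext = path.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(ext, DEFAULT);
    }
}
```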
11 The URL URL: Uniform Resource Locator. A way of encoding protocol, login, DNS (or IP) address, and path info in one string. Special case of Uniform Resource Identifier (URI): a URL is a URI for a location from which something can be retrieved; a URN is a URI for a name. General syntax: {partition/protocol}://{userid}:{password}@{domain:port}/{path} Examples: http://me:too@my.com/index.html (note that many browsers no longer support password in URL), news://nntp.upenn.edu, imap://email:me@my.com/folder1
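The pieces of the general syntax above can be pulled apart with the standard `java.net.URI` class. The sketch below uses the slide's own example URL; `UrlParts.describe` is a hypothetical helper name.

```java
import java.net.URI;

// Hypothetical helper: prints the components that java.net.URI extracts from a URL.
class UrlParts {
    static String describe(String url) {
        URI uri = URI.create(url);
        return "scheme=" + uri.getScheme()
            + " userInfo=" + uri.getUserInfo()   // "{userid}:{password}", or null if absent
            + " host=" + uri.getHost()
            + " port=" + uri.getPort()           // -1 when the URL gives no explicit port
            + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        System.out.println(describe("http://me:too@my.com/index.html"));
    }
}
```

When the port is omitted, `getPort()` returns -1 and the caller has to supply the protocol's default (80 for HTTP).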
12 Handling a Web (HTTP) Request 1.Read and parse the request message Most commonly, GET the contents of a URL 2.Translate the URL Extract the “path” that is being requested Determine if this is: A “virtual directory” that’s an alias for something else A reference to a file (HTML or SSI) A reference to a script or servlet 3.Verify authorization / access rights 4.Generate the response (may be an error code)
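The four steps above can be sketched roughly as follows. All names and rules here are assumptions for illustration: the `/docs/` alias, the `.cgi` and `/servlet/` script rules, and the toy authorization check (rejecting paths containing `..`) are invented, and only GET is handled.

```java
// Hypothetical sketch of the four request-handling steps; the rules are illustrative only.
class RequestHandlerSketch {
    enum Target { VIRTUAL_DIRECTORY, SCRIPT, FILE }

    // Step 1: split "GET /path?query HTTP/1.1" into method, request target, and version.
    static String[] parseRequestLine(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("malformed request line: " + line);
        }
        return parts;
    }

    // Step 2a: extract the path, dropping any query string.
    static String extractPath(String requestTarget) {
        int q = requestTarget.indexOf('?');
        return q < 0 ? requestTarget : requestTarget.substring(0, q);
    }

    // Step 2b: decide what the path refers to (assumed rules, not from any real server).
    static Target classify(String path) {
        if (path.startsWith("/docs/")) return Target.VIRTUAL_DIRECTORY;
        if (path.endsWith(".cgi") || path.startsWith("/servlet/")) return Target.SCRIPT;
        return Target.FILE;
    }

    // Step 3: a toy access check that refuses paths trying to climb out of the document root.
    static boolean authorized(String path) {
        return !path.contains("..");
    }

    // Step 4: produce a status code (a real server would also build the response body).
    static int handle(String requestLine) {
        String[] parts;
        try {
            parts = parseRequestLine(requestLine);
        } catch (IllegalArgumentException e) {
            return 400; // Bad Request
        }
        if (!parts[0].equals("GET")) return 501; // Not Implemented in this sketch
        String path = extractPath(parts[1]);
        if (!authorized(path)) return 403; // Forbidden
        return 200;
    }
}
```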
13 HTTP: HyperText Transfer Protocol A very simple, stateless protocol for sessionless exchanges Browser creates a new connection each time it wants to make a new request (for a page, image, etc.) What are the benefits of this model? Drawbacks? Exceptions: HTTP 1.1 added support for persistent connections and pipelining Clients + servers might keep state information Cookies provide a way of recording state
14 HTTP Overview Requests: A small number of request types (GET, POST, PUT, DELETE) Request may contain additional information, e.g. client info, parameters for forms, etc. Responses: Response codes: 200 (OK), 404 (not found), etc. Metadata: content’s MIME type, length, etc. The “payload” or data
15 A Simple HTTP Request
Request (asks for the data at a path, using the HTTP 1.1 protocol):
    GET /~cse455/index.html HTTP/1.1
    If-Modified-Since: Sun, 11 Jan 2004 11:12:23 GMT
    Referer: http://www.cis.upenn.edu/index.html
Example response:
    HTTP/1.1 200 OK
    Date: Wed, 14 Jan 2004 09:56:00 GMT
    Last-Modified: Wed, 14 Jan 2004 08:30:00 GMT
    Content-Type: text/html
    Content-Length: 3931
    …
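A sketch of how a server might assemble a response like the one above: a status line, headers, a blank line, then the payload. `ResponseBuilder.ok` is a hypothetical helper; for brevity it omits the Date and Last-Modified headers shown on the slide.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the bytes of a minimal "200 OK" response.
class ResponseBuilder {
    static byte[] ok(String contentType, String body) {
        byte[] payload = body.getBytes(StandardCharsets.UTF_8);
        // Header lines end with CRLF; an empty line separates headers from the payload.
        String head = "HTTP/1.1 200 OK\r\n"
            + "Content-Type: " + contentType + "\r\n"
            + "Content-Length: " + payload.length + "\r\n"
            + "\r\n";
        byte[] headBytes = head.getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[headBytes.length + payload.length];
        System.arraycopy(headBytes, 0, out, 0, headBytes.length);
        System.arraycopy(payload, 0, out, headBytes.length, payload.length);
        return out;
    }
}
```

Content-Length counts bytes, not characters, which is why the length is taken from the encoded payload rather than from `body.length()`.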
16 Request Types GET Retrieve the resource at a URL PUT Publish the specified data at a URL DELETE (Self-explanatory) POST Submit form content
17 Forms: Returning Data to the Server
HTML forms allow assignments of values to variables. Two means of submitting forms to apps:
GET-style, within the URL:
    GET /home/my.cgi?param=val&param2=val2
POST-style, as the data:
    POST /home/second.cgi
    Content-Length: 34
    searchKey Penn where www.google.com
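A sketch of decoding GET-style parameters such as `param=val&param2=val2` into a map. `FormParams.parse` is a hypothetical helper name; it assumes URL-encoded `name=value` pairs joined by `&`, and it relies on `URLDecoder.decode(String, Charset)`, which needs Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: parses "name=value&name2=value2" into an ordered map.
class FormParams {
    static Map<String, String> parse(String encoded) {
        Map<String, String> params = new LinkedHashMap<>();
        if (encoded == null || encoded.isEmpty()) return params;
        for (String pair : encoded.split("&")) {
            if (pair.isEmpty()) continue;
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(name), decode(value)); // a later duplicate overwrites an earlier one
        }
        return params;
    }

    // Turns '+' into a space and %XX escapes into characters.
    private static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }
}
```

The same function can serve a POST-style submission when the body uses this encoding, since only the transport differs.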
18 Authentication and Authorization Authentication At minimum, user ID and password – authenticates requestor Client may wish to authenticate the server, too! SSL (we’ll discuss this more later) Part of SSL: certificate from trusted server, validating machine Also: public key for encrypting client’s transmissions Authorization Determine what user can access For files, applications: typically, access control list If data from database, may also have view-based security
19 Programming Support in Web Servers CGI – Common Gateway Interface – the oldest: A CGI is a separate program, often in Perl, invoked by the server. Certain info is passed from server to CGI via Unix-style environment variables: QUERY_STRING, REMOTE_HOST, CONTENT_TYPE, … HTTP POST data is read from stdin. Interface to persistent process: In essence, how communication with a database is done – Oracle or MySQL is running “on the side”. Communicate via pipes, APIs like ODBC/JDBC, etc. Server module running in the same process: Might be custom code (e.g., Apache extension) or an interpreter/runtime system…
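To make the CGI contract concrete, here is a sketch of a CGI program written in Java (the slide notes Perl was the more common choice). It reads QUERY_STRING and REMOTE_HOST from the environment and writes a header block, a blank line, and an HTML body to stdout. `HelloCgi` is a hypothetical name, and the environment is passed in as a map so the logic can be exercised without a web server.

```java
import java.util.Map;

// Hypothetical CGI program: the server sets environment variables, we print the response.
class HelloCgi {
    static String respond(Map<String, String> env) {
        String query = env.getOrDefault("QUERY_STRING", "");
        String host = env.getOrDefault("REMOTE_HOST", "unknown");
        // CGI output is a header block, then an empty line, then the document itself.
        return "Content-Type: text/html\r\n\r\n"
            + "<html><body><p>Query: " + escape(query) + "</p>"
            + "<p>From: " + escape(host) + "</p></body></html>\n";
    }

    // Escape characters that would otherwise be interpreted as HTML markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.print(respond(System.getenv()));
    }
}
```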
20 Server Modules Interpreters: JavaScript/JScript, PHP, ASP, … Often a full-fledged programming language Code is generally embedded within HTML, not stand-alone Custom runtimes/virtual machines: Most modern Perl runtimes; Java servlets; ASP.NET A virtual machine runs within the web server process Functions are invoked within that JVM to handle each request Code is generally written as usual, but may need to use HTML to create UI rather than standard GUI APIs Most of these provide (at least limited) protection mechanisms
21 Servlets An interesting model for programming applications in Java. A servlet is a subclass of HttpServlet. It overrides methods doGet() or doPost(). It’s given a number of objects: HttpServletRequest (includes info about parameters, browser, etc.), HttpServletResponse (a means for sending info back to the browser, including data, forwarding requests, etc.). There’s a notion of a session that can be used to share state across doGet()/doPost() invocations – it’s generally connected with a cookie. Those of you who took CSE 330/CIS 550 should be generally familiar with servlets. Those who didn’t should be able to catch up by looking at, e.g., http://www.apl.jhu.edu/~hall/java/Servlet-Tutorial/ and http://www.novocode.com/doc/servlet-essentials/ Your homework assignment will be to build a simple servlet engine a la Tomcat
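The dispatch pattern described above (a base class whose doGet()/doPost() methods are overridden) can be sketched without the servlet library. `MiniRequest`, `MiniResponse`, and `MiniServlet` below are hypothetical stand-ins and are NOT the real `javax.servlet` API; they only mirror its shape so the example compiles on its own.

```java
import java.util.Map;

// Stand-in for HttpServletRequest (hypothetical, much simpler than the real interface).
class MiniRequest {
    final String method;
    private final Map<String, String> params;

    MiniRequest(String method, Map<String, String> params) {
        this.method = method;
        this.params = params;
    }

    String getParameter(String name) { return params.get(name); }
}

// Stand-in for HttpServletResponse (hypothetical).
class MiniResponse {
    int status = 200;
    String contentType = "text/plain";
    final StringBuilder body = new StringBuilder();
}

// Stand-in for HttpServlet: routes by method; unsupported methods get 405.
abstract class MiniServlet {
    void doGet(MiniRequest req, MiniResponse resp) { resp.status = 405; }
    void doPost(MiniRequest req, MiniResponse resp) { resp.status = 405; }

    final void service(MiniRequest req, MiniResponse resp) {
        if ("GET".equals(req.method)) doGet(req, resp);
        else if ("POST".equals(req.method)) doPost(req, resp);
        else resp.status = 501;
    }
}

// An application servlet overrides only the methods it supports.
class GreetingServlet extends MiniServlet {
    @Override
    void doGet(MiniRequest req, MiniResponse resp) {
        String name = req.getParameter("name");
        if (name == null) name = "world";
        resp.contentType = "text/html";
        // A real servlet should HTML-escape user input before echoing it.
        resp.body.append("<p>Hello, ").append(name).append("</p>");
    }
}
```

A servlet engine would construct the request object from the parsed HTTP message, call `service`, and then serialize the response object back to the socket.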
22 (Cross-)Session State: Cookies Major problem with sessionless nature of HTTP: how do we keep info between connections? Cookie: an opaque string associated with a web site, stored at the browser. Created in the HTTP response with “Set-Cookie: xxx”. Passed back in the HTTP request header as “Cookie: xxx”. Interpretation is up to the application. Usually, name-value pairs; passed in HTTP header: Cookie: user=“Joe” pwd=“blob” … Often have an expiration. Very common: “session cookies”
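A sketch of reading and writing cookie headers. It assumes the common wire format in which pairs in a Cookie header are separated by semicolons; `CookieHeader` and its methods are hypothetical helper names, and `Max-Age` is used here as one way to express an expiration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for the Cookie / Set-Cookie headers.
class CookieHeader {
    // Parses the value of a "Cookie:" header, e.g. user="Joe"; sid=abc123
    static Map<String, String> parse(String headerValue) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String part : headerValue.split(";")) {
            String pair = part.trim();
            int eq = pair.indexOf('=');
            if (eq <= 0) continue; // skip empty or nameless entries
            String name = pair.substring(0, eq).trim();
            String value = pair.substring(eq + 1).trim();
            // Strip one pair of surrounding double quotes, if present.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            cookies.put(name, value);
        }
        return cookies;
    }

    // Builds a "Set-Cookie:" response header line with an expiration in seconds.
    static String setCookie(String name, String value, long maxAgeSeconds) {
        return "Set-Cookie: " + name + "=" + value + "; Max-Age=" + maxAgeSeconds;
    }
}
```

A session cookie would typically carry only an opaque session identifier, with the actual state kept on the server.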
23 Persistent State: Interfacing with a Database A very common operation: Read some data from a database, output in a web form e.g., postings on Slashdot, items for a product catalog, etc. Three problems, abstracted away by ODBC/ADO/JDBC: Impedance mismatch from relational DBs to objects in Java (etc.) Standard API for different databases Physical implementation for each DB
24 The New Generation: XML All the Way An alternative to using ODBC/JDBC: use XML as the data representation Increasing interest in transmitting data as XML: “Less of an impedance mismatch” Databases easily export to XML HTTP and other protocols make a natural way to request all forms of XML (including files) Web services use SOAP messages with XML serializations of data Some XML databases now exist Next time, we’ll talk about XML in more detail
25 For Next Week Homework 1 broken into two milestones: First part (due next week): get a basic HTTP server up and running, with thread pool support and the appropriate locking mechanisms on the queue Second part: extend this to support running servlets Readings on XML as a data representation: Kifer et al. Chapter 17 My PhD student TJ Green will cover Tuesday