Web (HTTP) Servers Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems January 24, 2008.


1 Web (HTTP) Servers Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems January 24, 2008

2 Today
• Web (HTTP) servers
• Upcoming:
  • Data on the Web
  • HTML, XML, JPEG, MP3, …
  • Please read chapter on XML, for review

3 Designing Servers (Review)
Major issues:
• Concurrency
• Statefulness and sessions
  • Are requests self-contained, or do they require the server to keep around state?
• Communication and consistency
  • What state is shared across requests?
  • Do all requests need the same view?
• … And, of course, security!!!

4 Recall Our Toy Example
Suppose we want to build an “arithmetic” server
• Takes a request for a computation
• Parses the computation request
• Performs the computation
• Generates an HTML document with the result
• Returns the result to the requestor
Suppose we can build on TCP…
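
The steps above, minus the TCP layer, might be sketched in Python as follows. The request format (`"3 + 4"`) and the function names are assumptions made for illustration; the slides do not specify either.

```python
import html
import operator

# Assumed request format: "<number> <operator> <number>", e.g. "3 + 4".
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def parse_request(text):
    """Parse the computation request into (left, op, right)."""
    left, op, right = text.split()
    if op not in OPERATORS:
        raise ValueError(f"unsupported operator: {op}")
    return float(left), op, float(right)


def handle_request(text):
    """Parse, compute, and return an HTML document with the result."""
    try:
        left, op, right = parse_request(text)
        body = f"{text} = {OPERATORS[op](left, right)}"
    except (ValueError, ZeroDivisionError) as exc:
        body = f"Error: {exc}"
    # Escape the text so that request content cannot inject markup.
    return f"<html><body><p>{html.escape(body)}</p></body></html>"
```

A real server would read `text` from a socket and write the returned document back on the same connection.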

5 Thread Pools
• Goal is to limit maximum concurrency and prevent thrashing among threads
• Main server thread (daemon) puts requests on an event queue
• Worker threads retrieve items from the queue and process them; they sleep when the queue is empty
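
A minimal sketch of this arrangement in Python, assuming a pool of four workers and using squaring as a stand-in for real request processing. Python's `queue.Queue` does its own locking, so the explicit queue locking a hand-built version needs is hidden here.

```python
import queue
import threading


def run_pool(items, num_workers=4):
    """Process items with a fixed-size pool; returns the sorted results."""
    tasks = queue.Queue()          # thread-safe event queue
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()     # blocks (sleeps) while the queue is empty
            if item is None:       # sentinel value: shut this worker down
                break
            with results_lock:     # protect the shared results list
                results.append(item * item)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in items:             # the main thread enqueues the requests
        tasks.put(item)
    for _ in threads:              # one sentinel per worker
        tasks.put(None)
    for t in threads:
        t.join()
    return sorted(results)
```

Because the number of workers is fixed, at most `num_workers` requests are processed at once no matter how many are queued.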

6 Concurrency and Debugging
• A critical issue: how do we debug concurrent apps?
• Consider:
  • Printlns (be sure to tag with the thread/context info)
  • Logs (see log4j)
  • Selective breakpoints
• Remember to do “binary search” on your bugs!
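
As one illustration of tagging output with thread info, here is a sketch using Python's standard `logging` module as a stand-in for log4j. The logger name, thread names, and message text are made up for the example.

```python
import io
import logging
import threading


def make_logger(stream):
    """Build a logger whose every line starts with the emitting thread's name."""
    logger = logging.getLogger("server-demo")
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("[%(threadName)s] %(levelname)s %(message)s"))
    logger.handlers = [handler]    # replace any handlers from earlier calls
    logger.propagate = False
    return logger


def demo():
    """Log one line from each of three named threads; return the log text."""
    stream = io.StringIO()
    log = make_logger(stream)
    workers = [
        threading.Thread(target=log.info,
                         args=("handling request %d", i),
                         name=f"worker-{i}")
        for i in range(3)
    ]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return stream.getvalue()
```

Each line of the resulting log identifies which worker produced it, which is what makes interleaved output readable.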

7 Statefulness and Sessions
Very early HTTP
• Essentially stateless
• Make a request; the response is a page that is named by the URL
More recent HTTP, and other protocols:
• Some amount of state is maintained
• In HTTP, this requires cookies (more later)
• In many other protocols, the connection is kept open and all state is preserved on both ends
Pros and cons of statefulness?

8 Communication and Consistency
A key question in deciding on a server architecture: how much interaction is there among server processes / requests?
Let’s consider:
• Amazon.com
• EBAY
• Blogger.com
• iTunes
• Google

9 Shared, Persistent State
Generally a database back-end
• Recovery and reliability features
• Transaction support
• Simple query interface
Often the database is on a different server from the executing code
• This is what Enterprise JavaBeans are designed to support: distributed transactions
• “Model view controller” pattern is the most common
[Figure: model-view-controller diagram; labels: Model, View, Controller, Client-side JavaScript, XML view, Database, AJAX map]

10 Web (HTTP) Servers
[Figure: HTTP request arrives on port 80, is processed, and the response goes to another port]
• Processes HTTP requests, generally over TCP port 80
  • Or often 808x
  • Response uses another port (the client's port, over the same TCP connection)
• May involve:
  • Returning a document, with its (MIME) type info
    • e.g., HTML document, TXT document
  • Invoking a program or module, returning its output
  • Submitting form data to a program or module, returning its output
• Resources are described using URLs

11 The URL
URL: Uniform Resource Locator
• A way of encoding protocol, login, DNS (or IP) address, path info in one string
• Special case of Uniform Resource Identifier (URI)
  • URL is a URI for a location from which something can be retrieved
  • URN is a URI for a name
General syntax:
• {partition/protocol}://{userid}:{password}@{domain:port}/{path}
• http://me:too@my.com/index.html
  • (Note that many browsers no longer support password in URL)
• news://nntp.upenn.edu
• imap://email:me@my.com/folder1
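
To make the syntax concrete, Python's standard `urllib.parse.urlsplit` can pull these pieces out of the slide's first example URL. The helper name `describe` is an assumption for the sketch.

```python
from urllib.parse import urlsplit


def describe(url):
    """Split a URL into the components named in the general syntax."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "userid": parts.username,
        "password": parts.password,
        "domain": parts.hostname,
        "port": parts.port,        # None when the URL gives no explicit port
        "path": parts.path,
    }


info = describe("http://me:too@my.com/index.html")
```

Here `info["port"]` is `None` because the URL does not name a port; a client would then fall back to the protocol's default.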

12 Handling a Web (HTTP) Request
1. Read and parse the request message
   • Most commonly, GET the contents of a URL
2. Translate the URL
   • Extract the “path” that is being requested
   • Determine if this is:
     • A “virtual directory” that’s an alias for something else
     • A reference to a file (HTML or SSI)
     • A reference to a script or servlet
3. Verify authorization / access rights
4. Generate the response (may be an error code)
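
The four steps can be sketched as a small Python function. The document root, alias table, servlet prefix, and protected prefix below are all hypothetical values chosen for the example, and the "response" is reduced to a status code plus the resolved resource.

```python
import posixpath
from urllib.parse import unquote, urlsplit

DOC_ROOT = "/var/www/html"                # hypothetical document root
ALIASES = {"/docs": "/var/www/manual"}    # hypothetical virtual directory
SERVLET_PREFIX = "/servlet/"              # hypothetical servlet mount point
PROTECTED_PREFIX = "/private/"            # hypothetical restricted area


def translate(target):
    """Step 2: extract the path and decide what kind of resource it names."""
    # normpath collapses "." and ".." so the path cannot climb above the root.
    path = posixpath.normpath(unquote(urlsplit(target).path))
    for alias, real_dir in ALIASES.items():
        if path == alias or path.startswith(alias + "/"):
            return "alias", real_dir + path[len(alias):]
    if path.startswith(SERVLET_PREFIX):
        return "servlet", path[len(SERVLET_PREFIX):]
    return "file", DOC_ROOT + path


def handle(request_line, user=None):
    """Return (status, kind, resource) for one request line."""
    try:
        method, target, _version = request_line.split()   # step 1: parse
    except ValueError:
        return (400, None, None)                          # malformed request
    if method != "GET":
        return (405, None, None)          # this sketch only handles GET
    kind, resource = translate(target)                    # step 2: translate
    if resource.startswith(DOC_ROOT + PROTECTED_PREFIX) and user is None:
        return (403, kind, resource)                      # step 3: authorize
    return (200, kind, resource)                          # step 4: respond
```

For instance, `handle("GET /docs/intro.html HTTP/1.1")` resolves the alias and reports the file under the hypothetical manual directory.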

13 HTTP: HyperText Transfer Protocol
A very simple, stateless protocol for sessionless exchanges
• Browser creates a new connection each time it wants to make a new request (for a page, image, etc.)
• What are the benefits of this model? Drawbacks?
Exceptions:
• HTTP 1.1 added support for persistent connections and pipelining
• Clients + servers might keep state information
• Cookies provide a way of recording state

14 HTTP Overview
Requests:
• A small number of request types (GET, POST, PUT, DELETE)
• Request may contain additional information, e.g. client info, parameters for forms, etc.
Responses:
• Response codes: 200 (OK), 404 (not found), etc.
• Metadata: content’s MIME type, length, etc.
• The “payload” or data

15 A Simple HTTP Request
    GET /~cse455/index.html HTTP/1.1
    If-Modified-Since: Sun, 11 Jan 2004 11:12:23 GMT
    Referer: http://www.cis.upenn.edu/index.html
• Requests data at a path using HTTP 1.1 protocol
• Example response:
    HTTP/1.1 200 OK
    Date: Wed, 14 Jan 2004 09:56:00 GMT
    Last-Modified: Wed, 14 Jan 2004 08:30:00 GMT
    Content-Type: text/html
    Content-Length: 3931
    …
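
The exchange above returns 200 because the page was modified after the date in If-Modified-Since. A sketch of that decision and of assembling a response head follows; the function names are assumptions, and the date parsing uses Python's standard `email.utils`.

```python
from email.utils import parsedate_to_datetime


def status_for(if_modified_since, last_modified):
    """Pick 200 or 304 for a GET that may carry If-Modified-Since."""
    if if_modified_since is None:
        return 200
    since = parsedate_to_datetime(if_modified_since)
    modified = parsedate_to_datetime(last_modified)
    # 304 (Not Modified) tells the client its cached copy is still current.
    return 200 if modified > since else 304


def build_response(status, reason, body, content_type="text/html"):
    """Assemble status line, metadata headers, and payload as bytes."""
    payload = body.encode("utf-8")
    head = (f"HTTP/1.1 {status} {reason}\r\n"
            f"Content-Type: {content_type}\r\n"
            f"Content-Length: {len(payload)}\r\n"
            "\r\n")
    return head.encode("ascii") + payload


status = status_for("Sun, 11 Jan 2004 11:12:23 GMT",
                    "Wed, 14 Jan 2004 08:30:00 GMT")
```

Note that Content-Length counts the bytes of the encoded payload, not the characters of the source string.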

16 Request Types
GET     Retrieve the resource at a URL
PUT     Publish the specified data at a URL
DELETE  (Self-explanatory)
POST    Submit form content

17 Forms: Returning Data to the Server
• HTML forms allow assignments of values to variables
• Two means of submitting forms to apps:
  • GET-style – within the URL:
      GET /home/my.cgi?param=val&param2=val2
  • POST-style – as the data:
      POST /home/second.cgi
      Content-Length: 34
      searchKey  Penn
      where      www.google.com
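
Both styles carry the same name/value pairs; only their location differs. A sketch of decoding each with Python's standard `urllib.parse`, assuming the POST body uses the usual URL-encoded form (`name=value&name2=value2`), which the slide does not show explicitly:

```python
from urllib.parse import parse_qs, urlsplit


def get_params(request_target):
    """GET-style: the parameters follow the '?' in the request target."""
    return parse_qs(urlsplit(request_target).query)


def post_params(body):
    """POST-style: the parameters are the message body (assumed URL-encoded)."""
    return parse_qs(body)


from_url = get_params("/home/my.cgi?param=val&param2=val2")
from_body = post_params("searchKey=Penn&where=www.google.com")
```

`parse_qs` maps each name to a list of values, since a form may submit the same name more than once.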

18 Authentication and Authorization
Authentication
• At minimum, user ID and password – authenticates requestor
• Client may wish to authenticate the server, too!
  • SSL (we’ll discuss this more later)
  • Part of SSL: certificate from trusted server, validating machine
  • Also: public key for encrypting client’s transmissions
Authorization
• Determine what user can access
• For files, applications: typically, access control list
• If data from database, may also have view-based security

19 Programming Support in Web Servers
CGI – Common Gateway Interface – the oldest:
• A CGI is a separate program, often in Perl, invoked by the server
• Certain info is passed from server to CGI via Unix-style environment variables
  • QUERY_STRING, REMOTE_HOST, CONTENT_TYPE, …
• HTTP post data is read from stdin
Interface to persistent process:
• In essence, how communication with a database is done – Oracle or MySQL is running “on the side”
• Communicate via pipes, APIs like ODBC/JDBC, etc.
Server module running in the same process
• Might be custom code (e.g., Apache extension) or an interpreter/runtime system…
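
The CGI contract can be sketched as a function that takes the environment and the input stream as arguments, so it can be exercised without a web server. The `name` parameter and the greeting are made up for the example; `REQUEST_METHOD` and `CONTENT_LENGTH` are standard CGI variables alongside those the slide lists.

```python
import html
import io
from urllib.parse import parse_qs


def run_cgi(environ, stdin):
    """Build CGI output from environment variables and the request body.

    A real CGI program would pass os.environ and its standard input here.
    """
    params = parse_qs(environ.get("QUERY_STRING", ""))
    if environ.get("REQUEST_METHOD") == "POST":
        # POST data arrives on stdin; CONTENT_LENGTH says how much to read.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        params.update(parse_qs(stdin.read(length)))
    name = params.get("name", ["world"])[0]
    # CGI output is a header block, a blank line, then the document.
    return ("Content-Type: text/html\r\n\r\n"
            f"<p>Hello, {html.escape(name)}</p>")


output = run_cgi({"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Ann"},
                 io.StringIO(""))
```

The server reads this output from the program, adds the status line and remaining headers, and relays it to the client.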

20 Server Modules
Interpreters:
• JavaScript/JScript, PHP, ASP, …
• Often a full-fledged programming language
• Code is generally embedded within HTML, not stand-alone
Custom runtimes/virtual machines:
• Most modern Perl runtimes; Java servlets; ASP.NET
• A virtual machine runs within the web server process
• Functions are invoked within that virtual machine (e.g., the JVM) to handle each request
• Code is generally written as usual, but may need to use HTML to create UI rather than standard GUI APIs
• Most of these provide (at least limited) protection mechanisms

21 Servlets
• An interesting model for programming applications in Java
• A servlet is a subclass of HttpServlet
• It overrides methods doGet() or doPost()
• It’s given a number of objects: HttpServletRequest (includes info about parameters, browser, etc.), HttpServletResponse (a means for sending info back to the browser, including data, forwarding requests, etc.)
• There’s a notion of a session that can be used to share state across doGet()/doPost() invocations – it’s generally connected with a cookie
• Those of you who took CSE 330/CIS 550 should be generally familiar with servlets
• Those who didn’t should be able to catch up by looking at, e.g.,
  • http://www.apl.jhu.edu/~hall/java/Servlet-Tutorial/
  • http://www.novocode.com/doc/servlet-essentials/
• Your homework assignment will be to build a simple servlet engine a la Tomcat
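
The dispatch idea behind a servlet engine can be sketched in Python. This is only an analogy to the Java API: the class and method names mirror HttpServlet, doGet() and doPost(), but the dictionaries standing in for the request, response, and session objects are assumptions, not the real servlet interfaces.

```python
class HttpServlet:
    """Python stand-in for the Java base class; subclasses override handlers."""

    def do_get(self, request, response):
        response["status"] = 405       # method not supported by default

    def do_post(self, request, response):
        response["status"] = 405

    def service(self, request, response):
        """Route the request to the handler matching its HTTP method."""
        handlers = {"GET": self.do_get, "POST": self.do_post}
        handler = handlers.get(request["method"])
        if handler is None:
            response["status"] = 501
        else:
            handler(request, response)


class CounterServlet(HttpServlet):
    """Example servlet: counts visits using per-session state."""

    def do_get(self, request, response):
        session = request["session"]
        session["hits"] = session.get("hits", 0) + 1
        response["status"] = 200
        response["body"] = f"<p>Visits this session: {session['hits']}</p>"


SESSIONS = {}   # engine-side session store, keyed by the cookie's session id


def dispatch(servlet, method, session_id):
    """What the engine does per request: build objects, call service()."""
    request = {"method": method,
               "session": SESSIONS.setdefault(session_id, {})}
    response = {"status": None, "body": ""}
    servlet.service(request, response)
    return response
```

Calling `dispatch(CounterServlet(), "GET", "abc")` twice with the same session id reports one visit and then two, which is the state sharing the session provides.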

22 (Cross-)Session State: Cookies
Major problem with sessionless nature of HTTP: how do we keep info between connections?
• Cookie: an opaque string associated with a web site, stored at the browser
  • Create in HTTP response with “Set-Cookie: xxx”
  • Passed in HTTP header as “Cookie: xxx”
  • Interpretation is up to the application
• Usually, name-value pairs; passed in HTTP header:
  • Cookie: user="Joe"; pwd="blob" …
• Often have an expiration
• Very common: “session cookies”
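
Both directions can be sketched with Python's standard `http.cookies` module. The cookie name `session`, the one-hour lifetime, and the helper names are assumptions for the example.

```python
from http.cookies import SimpleCookie


def make_set_cookie(session_id):
    """Server side: build the Set-Cookie header line for a response."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["path"] = "/"
    cookie["session"]["max-age"] = 3600    # expiration: one hour, in seconds
    return cookie.output()


def read_cookie(header_value):
    """Server side: decode the value of an incoming Cookie header."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {name: morsel.value for name, morsel in cookie.items()}


header_line = make_set_cookie("abc123")
pairs = read_cookie("user=Joe; pwd=blob")
```

A session cookie typically carries only an identifier like this one, with the actual session data kept on the server.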

23 Persistent State: Interfacing with a Database
A very common operation:
• Read some data from a database, output in a web form
  • e.g., postings on Slashdot, items for a product catalog, etc.
Three problems, abstracted away by ODBC/ADO/JDBC:
• Impedance mismatch from relational DBs to objects in Java (etc.)
• Standard API for different databases
• Physical implementation for each DB
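
As an illustration of the "read from a database, output to the web" operation, here is a sketch using Python's standard `sqlite3` module, which plays a role comparable to JDBC by exposing a standard API over a specific database. The `items` table and its contents are invented for the example.

```python
import html
import sqlite3


def render_catalog(conn):
    """Query the catalog and render the rows as an HTML table."""
    rows = conn.execute(
        "SELECT name, price FROM items ORDER BY name").fetchall()
    cells = "".join(
        f"<tr><td>{html.escape(name)}</td><td>{price:.2f}</td></tr>"
        for name, price in rows)
    return f"<table>{cells}</table>"


def demo():
    """Create a throwaway in-memory database and render it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (name TEXT, price REAL)")
    # Parameterized inserts: values are passed separately from the SQL text.
    conn.executemany("INSERT INTO items VALUES (?, ?)",
                     [("widget", 2.5), ("gadget", 10.0)])
    try:
        return render_catalog(conn)
    finally:
        conn.close()
```

The loop that unpacks each row into `name` and `price` is where the mismatch between relational rows and program objects shows up.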

24 The New Generation: XML All the Way
An alternative to using ODBC/JDBC: use XML as the data representation
Increasing interest in transmitting data as XML:
• “Less of an impedance mismatch”
• Databases easily export to XML
• HTTP and other protocols make a natural way to request all forms of XML (including files)
• Web services use SOAP messages with XML serializations of data
• Some XML databases now exist
Next time, we’ll talk about XML in more detail

25 For Next Week
Homework 1 broken into two milestones:
• First part (due next week): get a basic HTTP server up and running, with thread pool support and the appropriate locking mechanisms on the queue
• Second part: extend this to support running servlets
Readings on XML as a data representation:
• Kifer et al. Chapter 17
My PhD student TJ Green will cover Tuesday's lecture

