© 2013 A. Haeberlen, Z. Ives Distributed Computing: Servers Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems January.


1 © 2013 A. Haeberlen, Z. Ives Distributed Computing: Servers Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems January 23, 2015

2 © 2004-15 A. Haeberlen, Z. Ives Today Review and further discussion about distributed systems issues Scale Availability Consistency Interoperability Location / Discovery University of Pennsylvania 2

3 © 2004-15 A. Haeberlen, Z. Ives 3 Next Brief discussion of the Butler Lampson paper Server architecture (internal) If time: Web (HTTP) servers Read: “HTTP Made Really Easy” link on the Schedule page Krishnamurthy / Rexford Ch 4 (see Web page)

4 © 2004-15 A. Haeberlen, Z. Ives 4 Some Context To this point, you’ve probably had significant experience designing programs to solve specific, relatively small tasks It’s often a very difficult job to build a system (What is a computing system?) (Why is it harder to build?) We will consider in this course: Architectural aspects [Butler Lampson article] Algorithmic aspects [e.g., two-phase commit] Engineering aspects [e.g., build management]

5 © 2004-15 A. Haeberlen, Z. Ives 5 Butler Lampson (Abbreviated Biography from His Page) Butler Lampson is an Architect at Microsoft Corporation and an Adjunct Professor of Computer Science and Electrical Engineering at MIT. He was one of the designers of the SDS 940 time-sharing system, the Alto personal distributed computing system, the Xerox 9700 laser printer, two-phase commit protocols,... He received the ACM’s Software Systems Award in 1984 for his work on the Alto, the IEEE Computer Pioneer award in 1996, and the Turing Award in 1992.

6 © 2004-15 A. Haeberlen, Z. Ives 6 Historical Note: Xerox Alto 1972-78 Personal computer for research The first GUI-based computer (note the mouse!) 128KB RAM, 2.5MB hard disk Ethernet In many ways, the forerunner to the Xerox Star … Which begat the Apple Lisa, and the rest is history!

7 © 2004-15 A. Haeberlen, Z. Ives 7 Lampson’s Advice

8 © 2004-15 A. Haeberlen, Z. Ives 8 Designing Servers: Systems for Handling Many Client Requests Major issues: Concurrency How do we handle multiple simultaneous requests? Statefulness and sessions Are requests self-contained, or do they require the server to keep around state? Communication and consistency What state is shared across requests? Do all requests need the same view? … And, of course, security!!! (Note that servers today are typically replicated)

9 © 2004-15 A. Haeberlen, Z. Ives 9 Toy Example Suppose we want to build an “arithmetic” server Takes a request for a computation Parses the computation request Performs the computation Generates an HTML document with the result Returns the result to the requestor Suppose we can build on TCP…
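The steps of the toy server can be sketched without any networking. This is a minimal illustration only: the class name `ArithmeticRequest`, the `<number> <op> <number>` request format, and the HTML layout are all assumptions, not something the slides specify.

```java
// Hypothetical sketch of the "arithmetic" server's per-request work.
// The request format "<number> <op> <number>" is an assumption.
final class ArithmeticRequest {
    final long left;
    final char op;
    final long right;

    ArithmeticRequest(long left, char op, long right) {
        this.left = left;
        this.op = op;
        this.right = right;
    }

    // Step 2: parse the computation request, e.g. "3 + 4".
    static ArithmeticRequest parse(String text) {
        String[] parts = text.trim().split("\\s+");
        if (parts.length != 3 || parts[1].length() != 1) {
            throw new IllegalArgumentException("expected: <number> <op> <number>");
        }
        return new ArithmeticRequest(
                Long.parseLong(parts[0]), parts[1].charAt(0), Long.parseLong(parts[2]));
    }

    // Step 3: perform the computation.
    long compute() {
        switch (op) {
            case '+': return left + right;
            case '-': return left - right;
            case '*': return left * right;
            case '/': return left / right; // integer division; throws on division by zero
            default: throw new IllegalArgumentException("unsupported operator: " + op);
        }
    }

    // Step 4: generate an HTML document with the result.
    String toHtml() {
        return "<html><body><p>" + left + " " + op + " " + right
                + " = " + compute() + "</p></body></html>";
    }
}
```

A real server would read the request text from a TCP connection and write the HTML back to it; those two steps are left out here so that the parsing and computing logic stays easy to follow.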

10 © 2004-15 A. Haeberlen, Z. Ives 10 Concurrency One approach: a separate server for each request Obviously this doesn’t work Alternative: context-switching using shared resources One, or a few, CPUs/disks/etc., multiplexing across jobs Threads and processes Events Cooperative scheduling Thread pools

11 © 2004-15 A. Haeberlen, Z. Ives 11 Review: Threads and Processes Threads/processes are each written as if they are sequential programs But threads may also yield or wait on condition variables Preemptive switching, based on time slicing according to quanta (usually 10-100 msec) States of threads: ready, running, and blocked Different levels of sharing and overhead between the two

12 © 2004-15 A. Haeberlen, Z. Ives 12 Example with Threads “Arithmetic” server divided into several components Daemon thread: Takes a request for a computation Parses the computation request Handler thread invoked on the results: Performs the computation Generates an HTML document with the result Returns the result to the requestor

13 © 2004-15 A. Haeberlen, Z. Ives Necessary Java Constructs Each handler may subclass Thread Implement the run() method Invoke via Handler h = new Handler(); h.start() Or the handler may implement Runnable Implement the run() method Invoke via Thread t = new Thread(myHandler); t.start() 13
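Both ways of starting a handler can be seen side by side in a small sketch. The class names `CountingHandler`, `CountingTask`, and `HandlerDemo` are hypothetical, and the shared counter exists only so that the effect of each handler is observable.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Option 1: subclass Thread and override run().
class CountingHandler extends Thread {
    // Shared counter, used here only to observe that handlers ran.
    static final AtomicInteger handled = new AtomicInteger();

    @Override
    public void run() {
        handled.incrementAndGet();
    }
}

// Option 2: implement Runnable and hand the object to a Thread.
class CountingTask implements Runnable {
    @Override
    public void run() {
        CountingHandler.handled.incrementAndGet();
    }
}

class HandlerDemo {
    // Starts one handler of each kind and waits for both to finish.
    static void runBoth() {
        Thread h = new CountingHandler();
        h.start(); // start() creates a new thread; calling run() directly would not

        Thread t = new Thread(new CountingTask());
        t.start();

        try {
            h.join();
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Implementing `Runnable` is often the more flexible choice, since the handler class stays free to extend some other class and the same object can later be handed to a thread pool.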

14 © 2004-15 A. Haeberlen, Z. Ives Shared Resources Suppose we share a resource such as an output logfile across threads How do we ensure that each thread’s modifications to the file are compatible (e.g., log interleaves one status msg at a time)? For shared resources, use synchronized to gain a monitor on an object to be “locked” synchronized methods lock the entire object synchronized (obj) { … } blocks lock their argument 14
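A minimal sketch of the shared-logfile case, assuming an in-memory list stands in for the file. `SharedLog` and `SharedLogDemo` are hypothetical names. The sketch uses both forms mentioned above: a `synchronized` method and a `synchronized (obj) { … }` block.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a log file shared by many handler threads.
class SharedLog {
    private final List<String> lines = new ArrayList<>();

    // synchronized method: acquires the monitor of this SharedLog object,
    // so only one thread appends at a time.
    synchronized void append(String msg) {
        lines.add(msg);
    }

    // Equivalent locking written as a synchronized block on the same object.
    int size() {
        synchronized (this) {
            return lines.size();
        }
    }
}

class SharedLogDemo {
    // Several threads append concurrently; returns the final number of lines.
    static int writeConcurrently(int threads, int perThread) {
        SharedLog log = new SharedLog();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    log.append("thread " + id + " msg " + j);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return log.size();
    }
}
```

Without `synchronized` on `append`, concurrent calls to `ArrayList.add` could lose entries or throw, because `ArrayList` is not thread-safe.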

15 © 2004-15 A. Haeberlen, Z. Ives Issues with Threads and Shared Resources Deadlock: nothing happens because locks are held in a way that all threads are waiting on other threads Livelock: system grinds to a halt because each thread is responding to requests from the other threads, but not making progress Starvation: a thread never gets scheduled 15
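One common way to avoid the deadlock case is to acquire locks in a fixed global order. The sketch below is an illustration of that technique and is not taken from the slides; `Account`, `Transfers`, and the rule "lock the smaller id first" are assumptions, and it further assumes distinct accounts have distinct ids.

```java
// Hypothetical resource guarded by its own monitor.
class Account {
    final int id;
    long balance;

    Account(int id, long balance) {
        this.id = id;
        this.balance = balance;
    }
}

class Transfers {
    // Always lock the account with the smaller id first. Two opposite
    // transfers then compete for the same first lock instead of each
    // holding one lock while waiting for the other.
    static void transfer(Account from, Account to, long amount) {
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    // Runs transfers in opposite directions on two threads and
    // returns the final balances {a, b}.
    static long[] runOpposingTransfers(int rounds) {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(b, a, 1);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { a.balance, b.balance };
    }
}
```

If `transfer` instead locked `from` and then `to`, the two threads could each hold one account and wait forever for the other.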

16 © 2004-15 A. Haeberlen, Z. Ives 16 Event Handlers Basically, a programmer-specified way of breaking up tasks You’ve probably seen it if you’ve done any sort of GUI programming But it’s also used to multitask Based on an event queue and a notion of an event handler loop Each task is broken into a series of events Each event has a handler that does some work and potentially enqueues another event “Local state” is generally kept in the event

17 © 2004-15 A. Haeberlen, Z. Ives Shared Resources in Event Handlers Generally don’t need true synchronized blocks or the equivalent here We control when each event handler gives up a resource, hence we control interleaving of requests and their modification to shared resource But still may need to maintain flags or other information for situations when a resource is used across events 17

18 © 2004-15 A. Haeberlen, Z. Ives Example with Events 18
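The original slide's figure is not in this transcript. As a stand-in, here is a minimal single-threaded sketch of the arithmetic server in event style. `EventLoop`, `ArithmeticEvents`, and the `a+b` request format are assumptions. Each request is split into parse, compute, and respond events, and each handler enqueues the next event, carrying its "local state" in the captured variables.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// A minimal event queue plus handler loop; runs on a single thread.
class EventLoop {
    private final Queue<Runnable> queue = new ArrayDeque<>();

    void enqueue(Runnable event) {
        queue.add(event);
    }

    // Dispatches events in FIFO order until the queue is empty.
    void run() {
        Runnable event;
        while ((event = queue.poll()) != null) {
            event.run();
        }
    }
}

class ArithmeticEvents {
    // Handles requests of the assumed form "a+b"; returns a trace of the
    // order in which the event handlers ran.
    static List<String> handle(String... requests) {
        EventLoop loop = new EventLoop();
        List<String> trace = new ArrayList<>();
        for (String req : requests) {
            loop.enqueue(() -> {
                trace.add("parse " + req);
                String[] p = req.split("\\+");
                long a = Long.parseLong(p[0].trim());
                long b = Long.parseLong(p[1].trim());
                // State (a, b) travels with the next event.
                loop.enqueue(() -> {
                    trace.add("compute " + req);
                    long sum = a + b;
                    loop.enqueue(() -> trace.add("respond " + req + " = " + sum));
                });
            });
        }
        loop.run();
        return trace;
    }
}
```

With two requests, the trace shows both parse steps first, then both compute steps, then both responses. The requests interleave even though there is only one thread and no locking.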

19 © 2004-15 A. Haeberlen, Z. Ives 19 Thread Pools Very commonly used (e.g., in many Apache products including some versions of the Web server) Fixed number of threads – say 100 or 200 As requests come in, they’re put onto a queue Handler threads dequeue items and process them

20 © 2004-15 A. Haeberlen, Z. Ives A Key Aspect of the Thread Pool: the Queue The daemon thread doesn’t spawn threads: instead, it enqueues requests The handler threads dequeue and handle requests What to do when the system is not fully saturated, i.e., some threads in the pool are idle? object.wait(), notify(), notifyAll() MUST call these while holding the object’s monitor, i.e., inside a synchronized method or block on that object! 20

21 © 2004-15 A. Haeberlen, Z. Ives Example with Thread Pools 21
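The original slide's figure is not in this transcript. The sketch below shows one possible shape of a hand-written pool. `RequestQueue`, `TinyThreadPool`, and `PoolDemo` are hypothetical names, and the shutdown protocol (a `closed` flag plus `notifyAll()`) is an assumption added so the example terminates. Idle handler threads block in `wait()` and are woken by `notify()` when a request is enqueued; both calls happen inside `synchronized` methods on the queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicInteger;

// Blocking queue built from synchronized + wait/notify.
class RequestQueue {
    private final Queue<Runnable> items = new ArrayDeque<>();
    private boolean closed = false;

    synchronized void enqueue(Runnable request) {
        items.add(request);
        notify(); // wake one idle handler thread
    }

    synchronized void close() {
        closed = true;
        notifyAll(); // wake every idle handler so it can exit
    }

    // Blocks while the queue is empty; returns null once closed and drained.
    synchronized Runnable dequeue() throws InterruptedException {
        while (items.isEmpty() && !closed) {
            wait(); // releases the monitor while waiting
        }
        return items.poll();
    }
}

class TinyThreadPool {
    private final RequestQueue queue = new RequestQueue();
    private final Thread[] workers;

    TinyThreadPool(int size) {
        workers = new Thread[size];
        for (int i = 0; i < size; i++) {
            workers[i] = new Thread(() -> {
                try {
                    Runnable request;
                    while ((request = queue.dequeue()) != null) {
                        request.run();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
    }

    // The daemon thread calls this instead of spawning a new thread.
    void submit(Runnable request) {
        queue.enqueue(request);
    }

    void shutdownAndWait() {
        queue.close();
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }
}

class PoolDemo {
    // Submits `requests` trivial jobs to a pool and returns how many ran.
    static int run(int threads, int requests) {
        AtomicInteger handled = new AtomicInteger();
        TinyThreadPool pool = new TinyThreadPool(threads);
        for (int i = 0; i < requests; i++) {
            pool.submit(handled::incrementAndGet);
        }
        pool.shutdownAndWait();
        return handled.get();
    }
}
```

The `while` loop around `wait()` matters: a woken thread must re-check the condition, since another handler may already have taken the item. In production code, `java.util.concurrent.ExecutorService` provides this machinery ready-made.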

22 © 2004-15 A. Haeberlen, Z. Ives 22 Other Ideas Cooperative scheduling “Non-preemptive multitasking”: threads execute for a while, save state, and explicitly yield Examples of where used: old Mac OS, Windows 2.x Why is it bad?

23 © 2004-15 A. Haeberlen, Z. Ives 23 Concurrency and Debugging A critical issue: how do we debug concurrent apps? Consider: Threads – pros and cons Events – pros and cons There’s no free lunch! What are some tricks?

24 © 2004-15 A. Haeberlen, Z. Ives 24 Statefulness and Sessions Very early HTTP Essentially stateless Make a request; the response is a page that is named by the URL More recent HTTP, and other protocols: Some amount of state is maintained In HTTP, this requires cookies (more later) In many other protocols, the connection is kept open and all state is preserved on both ends Pros and cons of statefulness? (Does this look at all like the threads vs. events discussion?)

25 © 2004-15 A. Haeberlen, Z. Ives 25 Communication and Consistency A key question: how much interaction is there among server processes / requests? Let’s consider: Amazon.com EBAY Blogger.com iTunes Google

26 © 2004-15 A. Haeberlen, Z. Ives 26 Shared, Persistent State Generally a database back-end Recovery and reliability features Transaction support Simple query interface Often the database is on a different server from the executing code This is what Enterprise JavaBeans are designed to support: distributed transactions “Model view controller” pattern is the most common [Diagram labels: Model, View, Controller; client-side JScript, XML view, database, AJAX game]

27 © 2004-15 A. Haeberlen, Z. Ives 27 Web (HTTP) Servers [Diagram labels: HTTP request, port 80, processing, response, other port] Processes HTTP requests, generally over TCP port 80 Response travels back over the same TCP connection, to the client’s own (ephemeral) port May involve: Returning a document, with its (MIME) type info e.g., HTML document, TXT document Invoking a program or module, returning its output Submitting form data to a program or module, returning its output Resources are described using URLs

28 © 2004-15 A. Haeberlen, Z. Ives 28 The URL URL: Uniform Resource Locator A way of encoding protocol, login, DNS (or IP) address, path info in one string Special case of Uniform Resource Identifier (URI) URL is a URI for a location from which something can be retrieved URN is a URI for a name General syntax: {partition/protocol}://{userid}:{password}@{domain:port}/{path} http://me:too@my.com/index.html news://nntp.upenn.edu imap://email:me@my.com/folder1
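The parts of the general syntax can be pulled apart with the standard `java.net.URI` class. A small sketch follows; the helper class `UrlParts` and its output format are hypothetical.

```java
import java.net.URI;

class UrlParts {
    // Splits a URL into the components named in the general syntax.
    static String describe(String url) {
        URI u = URI.create(url);
        return "scheme=" + u.getScheme()
                + " user=" + u.getUserInfo()   // "userid:password", or null if absent
                + " host=" + u.getHost()
                + " port=" + u.getPort()       // -1 when the URL gives no explicit port
                + " path=" + u.getPath();
    }
}
```

For `http://me:too@my.com/index.html` this yields the scheme `http`, user info `me:too`, host `my.com`, port `-1` (none given, so the protocol default applies), and path `/index.html`.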

29 © 2004-15 A. Haeberlen, Z. Ives 29 Handling a Web (HTTP) Request 1. Read and parse the request message Most commonly, GET the contents of a URL 2. Translate the URL Extract the “path” that is being requested Determine if this is: A “virtual directory” that’s an alias for something else A reference to a file (HTML or SSI) A reference to a script or servlet 3. Verify authorization / access rights 4. Generate the response (may be an error code)
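The four steps can be lined up in code roughly as follows. This is a deliberately simplified sketch: `HttpRequestSketch`, the rule that paths under `/private/` need authentication, and the "`.cgi` means script" test are all assumptions made for illustration. It returns a status summary rather than reading any file.

```java
class HttpRequestSketch {
    // Returns a status summary for one request line such as
    // "GET /index.html HTTP/1.1".
    static String handle(String requestLine, boolean authenticated) {
        // 1. Read and parse the request message.
        String[] parts = requestLine.trim().split(" ");
        if (parts.length != 3 || !parts[2].startsWith("HTTP/")) {
            return "400 Bad Request";
        }
        String method = parts[0];
        String target = parts[1];
        if (!method.equals("GET")) {
            return "501 Not Implemented"; // this sketch handles only GET
        }

        // 2. Translate the URL: extract the path, dropping any query string.
        int q = target.indexOf('?');
        String path = (q >= 0) ? target.substring(0, q) : target;
        // Assumption: a ".cgi" suffix marks a script, anything else is a file.
        String kind = path.endsWith(".cgi") ? "script" : "file";

        // 3. Verify authorization (hypothetical rule for illustration).
        if (path.startsWith("/private/") && !authenticated) {
            return "401 Unauthorized";
        }

        // 4. Generate the response (here, just a status summary).
        return "200 OK (" + kind + " " + path + ")";
    }
}
```

A real server would also resolve virtual directories, guard against paths that escape the document root, and return 404 when the target does not exist.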

30 © 2004-15 A. Haeberlen, Z. Ives 30 HTTP: HyperText Transfer Protocol A very simple, stateless protocol for sessionless exchanges Browser creates a new connection each time it wants to make a new request (for a page, image, etc.) What are the benefits of this model? Drawbacks? Exceptions: HTTP 1.1 added optional support for persistent connections and pipelining Clients + servers might keep state information Cookies provide a way of recording state

31 © 2004-15 A. Haeberlen, Z. Ives 31 HTTP Overview Requests: A small number of request types (GET, POST, PUT, DELETE) Request may contain additional information, e.g. client info, parameters for forms, etc. Responses: Response codes: 200 (OK), 404 (not found), etc. Metadata: content’s MIME type, length, etc. The “payload” or data

32 © 2004-15 A. Haeberlen, Z. Ives 32 A Simple HTTP Request
    GET /~cis455/index.html HTTP/1.1
    If-Modified-Since: Sun, 07 Jan 2007 11:12:23 GMT
    Referer: http://www.cis.upenn.edu/index.html
Requests data at a path using the HTTP 1.1 protocol. Example response:
    HTTP/1.1 200 OK
    Date: Sun, 07 Jan 2007 11:12:26 GMT
    Last-Modified: Wed, 14 Jan 2004 08:30:00 GMT
    Content-Type: text/html
    Content-Length: 3931
    …
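A response like the one above is just a status line, header lines, a blank line, and the payload, with each line ending in CRLF. A minimal sketch of assembling one follows; the class `HttpResponseSketch` is hypothetical, and only the two metadata headers shown are emitted.

```java
import java.nio.charset.StandardCharsets;

class HttpResponseSketch {
    // Builds a complete HTTP/1.1 response message as a string.
    static String build(int code, String reason, String contentType, String body) {
        // Content-Length counts bytes, not characters.
        int length = body.getBytes(StandardCharsets.UTF_8).length;
        return "HTTP/1.1 " + code + " " + reason + "\r\n"
                + "Content-Type: " + contentType + "\r\n"
                + "Content-Length: " + length + "\r\n"
                + "\r\n" // blank line separates headers from payload
                + body;
    }
}
```

For example, `build(404, "Not Found", "text/html", "<p>missing</p>")` produces an error response in the same shape as a successful one.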

33 © 2004-15 A. Haeberlen, Z. Ives 33 Request Types GET Retrieve the resource at a URL PUT Publish the specified data at a URL DELETE (Self-explanatory; not always supported) POST Submit form content

34 © 2004-15 A. Haeberlen, Z. Ives 34 Forms: Returning Data to the Server HTML forms allow assignments of values to variables Two means of submitting forms to apps: GET-style – within the URL: GET /home/my.cgi?param=val&param2=val2 POST-style – as the data: POST /home/second.cgi Content-Length: 34 searchKey Penn where www.google.com
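On the server side, GET-style parameters arrive in the query string, and URL-encoded POST bodies use the same `name=value&name=value` layout. A minimal parsing sketch follows; the class name `FormData` is hypothetical, and repeated parameter names simply overwrite earlier ones here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

class FormData {
    // Parses "name=value&name2=value2" into an ordered map.
    static Map<String, String> parse(String encoded) {
        Map<String, String> params = new LinkedHashMap<>();
        if (encoded == null || encoded.isEmpty()) {
            return params;
        }
        for (String pair : encoded.split("&")) {
            int eq = pair.indexOf('=');
            String name = (eq >= 0) ? pair.substring(0, eq) : pair;
            String value = (eq >= 0) ? pair.substring(eq + 1) : "";
            params.put(decode(name), decode(value));
        }
        return params;
    }

    // Undoes form encoding: '+' becomes a space, %XX becomes the byte XX.
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```

For a GET request the input is everything after the `?` in the URL; for a URL-encoded POST it is the request body, whose size is given by `Content-Length`.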

35 © 2004-15 A. Haeberlen, Z. Ives 35 Authentication and Authorization Authentication At minimum, user ID and password – authenticates requestor Client may wish to authenticate the server, too! SSL (we’ll discuss this more later) Part of SSL: certificate from trusted server, validating machine Also: public key for encrypting client’s transmissions Authorization Determine what user can access For files, applications: typically, access control list If data from database, may also have view-based security We’ll talk about these in more detail later in the semester

36 © 2004-15 A. Haeberlen, Z. Ives 36 Programming Support in Web Servers Several means of supporting custom code: CGI – Common Gateway Interface – the oldest: A CGI is a separate program, often in Perl, invoked by the server Certain info is passed from server to CGI via Unix-style environment variables QUERY_STRING; REMOTE_HOST, CONTENT_TYPE, … HTTP post data is read from stdin Interface to persistent process: In essence, how communication with a database is done – Oracle or MySQL is running “on the side” Communicate via pipes, APIs like ODBC/JDBC, etc. Server module running in the same process

37 © 2004-15 A. Haeberlen, Z. Ives 37 Two Main Types of Server Modules Interpreters: Old JavaScript/JScript, PHP, ASP, … Often a full-fledged programming language Code is generally embedded within HTML, not stand-alone Custom runtimes/virtual machines/JIT compilers: Most modern Perl runtimes; Java servlets; ASP.NET; Node.js A virtual machine runs within the web server process Functions are invoked within that VM to handle each request Code is generally written as usual, but may need to use HTML to create UI rather than standard GUI APIs Most of these provide (at least limited) protection mechanisms

38 © 2004-15 A. Haeberlen, Z. Ives 38 Interfacing with a Database A very common operation: Read some data from a database, output in a web form e.g., postings on Slashdot, items for a product catalog, etc. Three problems, abstracted away by ODBC/ADO/JDBC: Impedance mismatch from relational DBs to objects in Java (etc.) Standard API for different databases Physical implementation for each DB

39 © 2004-15 A. Haeberlen, Z. Ives 39 (Cross-)Session State: Cookies Major problem with sessionless nature of HTTP: how do we keep info between connections? Cookie: an opaque string associated with a web site, stored at the browser Created in HTTP response with “Set-Cookie: xxx” Passed back in HTTP request header as “Cookie: xxx” Interpretation is up to the application Usually, name-value pairs separated by semicolons; passed in HTTP header: Cookie: user="Joe"; pwd="blob" … Often have an expiration Very common: “session cookies”
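Both directions of the cookie exchange can be sketched in a few lines. The class `Cookies` and the cookie names in the usage note are hypothetical. Attributes such as expiration are left out, and the values are treated as opaque strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class Cookies {
    // Server to browser: the header line that asks the browser to store a cookie.
    static String setCookieHeader(String name, String value) {
        return "Set-Cookie: " + name + "=" + value;
    }

    // Browser to server: parses the value of a "Cookie:" request header,
    // e.g. "user=Joe; session=abc123", into name-value pairs.
    static Map<String, String> parseCookieHeader(String headerValue) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String raw : headerValue.split(";")) {
            String pair = raw.trim();
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty or malformed pieces
            }
            cookies.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return cookies;
    }
}
```

A session cookie typically carries only an opaque identifier, for instance `session=abc123`, which the server uses as a key into session state it keeps itself.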

40 © 2004-15 A. Haeberlen, Z. Ives 40 Common Web Server Architectures How do we handle many concurrent requests? Approach 1 – use what the OS provides: Fork a separate process for each request Or spawn a separate thread Approach 2 – write your own task switcher Break every response into small steps Schedule with custom event-driven dispatcher Approach 3 – pool of handlers: Create a thread pool that switches among requests or steps

41 © 2004-15 A. Haeberlen, Z. Ives 41 Content Management Systems Generally, a “middleware” that runs under the web server (or provides its own) Provides content integration from multiple sources Perhaps SQL or XML databases Perhaps text files, RSS feeds, etc. Often provides content authoring & assembly tools Typically, provides templates or other similar features for describing how to assemble the site Common examples: MS Content Management Server; Slash; Apache Cocoon

42 © 2004-15 A. Haeberlen, Z. Ives 42 Ways of Handling Many Requests Web server “listens” on port 80 – “daemon” task Upon a request, it needs to invoke a response How should that response task get executed?

43 © 2004-15 A. Haeberlen, Z. Ives 43 Readings Please read for further depth: “HTTP Made Really Easy” Rexford/Krishnamurthy chapter on HTTP servers You will need to learn: Enough about HTTP to handle GET, POST, cookies, etc. Enough about Java threads to write your own thread pools for a Web server Enough about servlets to run them (including sessions)

