The Web Core Ideas and Technologies Resources and Objects (and Services tomorrow) HTTP MIME Types URIs ReST
Objects What is an Object? An object encapsulates state and operations on that state Remote Objects are the transferral of a local/single address space paradigm to a distributed environment. The fact that the object you are invoking resides on a different machine is often transparent - RPC Simple to use - no new concepts to digest But distributed environments are different from local ones. Example remote object technologies: CORBA, DCOM, Java RMI/Jini
Problems with RPC These stem from the fact that distributed environments are different from local ones: What if the remote node goes down? What if the network goes down? How long should I wait before deciding something has gone wrong if I hear nothing? How do I find a remote object? How do I make sure my concept of its lifetime/state is synchronized with the remote host’s concept of its lifetime/state? How do I ensure that the messages I send arrive in the same sequence at the other end? A single address space is tightly coupled through the operating system; a distributed environment is not.
Resources [Diagram labels: Peer, Client, Server, Node, Computer/Device, Service, Resource] Resource: any hardware or software resource shared on a distributed network, e.g. a file storage system, RAM, CPU, a file, a service or a communication channel
Resources and the Web Resources have a particular meaning in the context of the Web. Resource Orientation How is Resource Orientation different from Object Orientation? Everything that can be named is a resource. More commonly, resources are anything that can be located on the network. A resource has state, but no operations over that state. Operations on resource state are defined by the scheme of the address at which the resource is located. If a resource is located at http://example.com/abc, then the operations potentially allowed on that resource are defined by the HTTP protocol. Example of a resource-oriented system: The Web
Benefits of Resource Orientation No proliferation of access/manipulation interfaces The Web is based on a handful of operations defined by HTTP. Imagine if every website had its own programmatic interface In contrast, every Object has its own proprietary interface. State is explicit Resources are not shared Representations of resources are shared Because representations of state are shared, firewalling and caching are much simpler because servers are not receiving unknown application-specific payloads. The server is the application interface Provably scalable…
Objects and Resources [Diagram labels: object: data, operations; resource: data, access protocol, representation]
Uniform Access - HTTP Main access/manipulation protocol of the Web. A means of exchanging resource representations across the network. Request/response based interactions. Supports a limited number of request methods: GET – retrieve a representation of a resource – no payload. POST – send data to a resource on a server – payload. PUT – update or alter a resource on a server – payload; if the resource does not exist, the server may allow the resource to be created. DELETE – remove a resource from a server – no payload. HEAD – find out what metadata you would get back if you performed a GET – no payload. OPTIONS – find out what operations are permitted on a resource – no payload.
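As a rough illustration of how few operations the uniform interface needs, the sketch below applies most of the methods above to an in-memory store. The `handle` function, the `store` dictionary and the exact status codes chosen are assumptions made for illustration; this is not a real HTTP server, and POST is left out to keep it short.

```python
# Hypothetical in-memory sketch of HTTP's uniform interface (no networking).
ALLOWED = ("GET", "HEAD", "PUT", "DELETE", "OPTIONS")

def handle(store, method, path, body=None):
    """Apply one request to `store` (a dict of path -> bytes).

    Returns a (status_code, headers, payload) tuple.
    """
    if method == "OPTIONS":
        # Report which operations are permitted; no payload.
        return 200, {"Allow": ", ".join(ALLOWED)}, None
    if method == "PUT":
        # Create the resource if it is missing, otherwise replace it.
        status = 200 if path in store else 201
        store[path] = body
        return status, {}, None
    if method not in ALLOWED:
        return 405, {"Allow": ", ".join(ALLOWED)}, None
    if path not in store:
        return 404, {}, None
    if method == "DELETE":
        del store[path]
        return 204, {}, None
    headers = {"Content-Length": str(len(store[path]))}
    # HEAD returns the same metadata as GET but omits the payload.
    payload = store[path] if method == "GET" else None
    return 200, headers, payload
```

Every resource in `store` is reached through the same handful of methods; adding a new resource never adds a new interface.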
HTTP When a server receives a request, it returns a response containing a response code, headers and possibly a resource representation. Codes are divided into types: 100 – 199 – informational; 200 – 299 – success; 300 – 399 – redirect; 400 – 499 – client error; 500 – 599 – server error. Examples: 200 OK, 201 Created, 301 Moved Permanently (a redirect, with ‘Location’ header), 403 Forbidden, 500 Internal Server Error
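The grouping of codes by their first digit can be sketched in a few lines. The helper name `status_class` is an assumption for illustration; `http.HTTPStatus` is part of the Python standard library and supplies the standard reason phrases.

```python
from http import HTTPStatus

def status_class(code):
    """Map a status code to the class of response it belongs to."""
    classes = {1: "informational", 2: "success", 3: "redirect",
               4: "client error", 5: "server error"}
    return classes.get(code // 100, "unknown")

for code in (200, 201, 301, 403, 500):
    # HTTPStatus supplies the standard reason phrase for each code.
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```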
HTTP Headers An HTTP message is an envelope: it starts with a request line or status line, then the headers, then the body (payload), if any.
request:
GET /weather/ HTTP/1.1
Host: www.bbc.co.uk
User-Agent: curl/7.16.3
Accept: */*
response:
HTTP/1.1 200 OK
Server: Apache/2.0.59 (Unix)
Content-Length: 12345
Content-Type: text/html
…payload…
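To make the envelope structure concrete, here is a minimal sketch that splits a raw HTTP/1.1 message into its start line, headers and body. The function name `parse_message` is an assumption, and the parser is deliberately simplified: it keeps only the last value of a repeated header and does not handle chunked bodies.

```python
def parse_message(raw):
    """Split a raw HTTP/1.1 message into start line, headers and body."""
    # A blank line (CRLF CRLF) separates the headers from the payload.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line, header_lines = lines[0], lines[1:]
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body

request = (b"GET /weather/ HTTP/1.1\r\n"
           b"Host: www.bbc.co.uk\r\n"
           b"User-Agent: curl/7.16.3\r\n"
           b"Accept: */*\r\n"
           b"\r\n")

start_line, headers, body = parse_message(request)
print(start_line)
print(headers["host"])
```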
MIME Multipurpose Internet Mail Extensions: no longer used only in email. The Content-Type header field points to a MIME type. A (slowly growing) list of known media types: text/plain, text/xml, text/html, image/gif, image/jpeg. Developers/designers are encouraged to reuse MIME types: keeping the list manageable helps interoperability. A different approach to Object interfaces
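Because MIME started life in email, Python's standard `email` package can read the same Content-Type header that HTTP uses. The sketch below shows one way to pull the media type and charset out of a header value; the helper name `describe_content_type` is an assumption for illustration.

```python
from email.message import Message

def describe_content_type(header_value):
    """Return (media type, charset) for a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # Both values come back lower-cased; charset is None when absent.
    return msg.get_content_type(), msg.get_content_charset()

print(describe_content_type("text/html; charset=UTF-8"))
print(describe_content_type("image/gif"))
```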
Common HTTP Headers Host: the target host. Content-Length: length (in bytes) of the resource representation. Content-Type: MIME type of the representation. Accept: MIME types the client will accept. Transfer-Encoding: can be chunked, meaning the data is sent in chunks rather than all in one go; useful for data whose length is not known when transmission starts. Content-Encoding: e.g. gzip, deflate; allows content to be compressed. ETag: a server-defined ID for a particular version of a resource. If-Match, If-None-Match: allow conditional requests based on ETags. Date: when the message was generated. If-Modified-Since, If-Unmodified-Since: allow conditional requests based on time. Expect: e.g. 100-continue; the server responds with a 100 status code if the client should continue with the request by sending the payload. WWW-Authenticate: authentication challenge issued by a server. Many caching-related headers.
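A conditional request with ETag and If-None-Match might look roughly like the sketch below. The function names and the hash-based ETag scheme are assumptions (servers are free to generate ETags however they like), and weak validators are ignored.

```python
import hashlib

def make_etag(representation):
    """Derive a quoted ETag from the bytes of a representation (one possible scheme)."""
    return '"' + hashlib.sha256(representation).hexdigest()[:16] + '"'

def conditional_get(request_headers, representation):
    """Answer a GET, honouring If-None-Match.

    Returns (status_code, response_headers, payload).
    """
    etag = make_etag(representation)
    candidates = [t.strip() for t in request_headers.get("If-None-Match", "").split(",")]
    if etag in candidates or "*" in candidates:
        # The client's cached copy is still current: send no payload.
        return 304, {"ETag": etag}, b""
    headers = {"ETag": etag, "Content-Length": str(len(representation))}
    return 200, headers, representation
```

A client would store the ETag from the first 200 response and send it back in If-None-Match; as long as the representation is unchanged, the server can answer 304 Not Modified without resending the payload.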
HTTP Headers are extensible Typically start with ‘X-’ Allows application specific metadata Examples from Rackspace: X-Auth-User – user name in request X-Auth-Key – user API key in request X-Auth-Token – user token with lifetime in response Mobiles: x-roaming - is the user roaming? x-nokia-msisdn – user’s mobile number in plain text HTTP_X_BROWSER_HEIGHT HTTP_X_DEVICE_TYPE Even: HTTP_X_ME_CUSTOM_ITEM_1
Uniform Naming - URIs Uniform Resource Identifier: an identifier for a resource on the Web. A superset comprising URNs (Uniform Resource Names) and URLs (Uniform Resource Locators). URLs can be dereferenced (i.e. they represent an ‘address’). Examples: http://cs.cf.ac.uk/user urn:isbn:0-395-36341-1 urn:jxta:uuid:59616261646162614A78746150325033F3BC76 The first is a URL - it represents an address - you can GO there - PPT spotted that. The second is a URN - it identifies a book uniquely. The third is a URN, but using the Jxta protocol, one may be able to resolve this to a URL. PPT obviously does not support Jxta.
URIs URIs can be either hierarchical or non-hierarchical. A non-hierarchical URI is considered opaque. A hierarchical URI has a number of elements. All URIs start with a scheme element. This example is hierarchical: http://cs.cf.ac.uk:8080/harrison/resume.html#intro (scheme: http; host: cs.cf.ac.uk; port: 8080; path: /harrison/resume.html; fragment: intro) This example is not hierarchical: mailto:firstname.lastname@example.org (labels: scheme, opaque authority)
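The elements of a URI can be inspected with Python's standard `urllib.parse` module. This sketch splits the hierarchical example above and an opaque `mailto:` URI; the address `someone@example.org` is a made-up placeholder.

```python
from urllib.parse import urlsplit

# A hierarchical URI breaks down into named elements.
hierarchical = urlsplit("http://cs.cf.ac.uk:8080/harrison/resume.html#intro")
print(hierarchical.scheme)    # scheme
print(hierarchical.hostname)  # host
print(hierarchical.port)      # port
print(hierarchical.path)      # path
print(hierarchical.fragment)  # fragment

# An opaque URI has a scheme, but no host/port structure after it.
opaque = urlsplit("mailto:someone@example.org")
print(opaque.scheme, repr(opaque.netloc), opaque.path)
```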
Resources and URIs Hierarchical URIs can also have a query string: http://www.google.com/search?q=web (key: q; value: web) This is a simplified URI generated by searching Google for the word “web”. q is the key that tells Google that the value after the = is what I typed in. Query strings can contain multiple key/value pairs separated by ampersands: ?client=safari&rls=en-us&q=web&ie=UTF-8&oe=UTF-8 Experiment - change the value after the q=. Add &hl=ja to the end of the URI and see what happens
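The experiment above can also be done programmatically. The sketch below reads the key/value pairs from a query string and builds a new URI with one parameter set; `with_param` is a hypothetical helper name, and the `urllib.parse` functions it uses are standard library calls.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def with_param(uri, key, value):
    """Return `uri` with key=value set in its query string."""
    parts = urlsplit(uri)
    # Drop any existing value for the key, then append the new one.
    pairs = [(k, v) for k, v in parse_qsl(parts.query) if k != key]
    pairs.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(pairs)))

# Read the pairs out of the longer query string from the slide.
print(dict(parse_qsl("client=safari&rls=en-us&q=web&ie=UTF-8&oe=UTF-8")))

# Add hl=ja, or change the value of q.
print(with_param("http://www.google.com/search?q=web", "hl", "ja"))
print(with_param("http://www.google.com/search?q=web", "q", "rest"))
```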
Resources and URIs Resources have state but not operations over that state. This is supported through URIs The scheme of the URI determines the operations permitted on a resource. By exposing representations of resources at URIs with different schemes, you offer different operations on that resource.
Representational State Transfer (ReST) Term coined by Roy Fielding in his PhD dissertation in 2000 A PhD thesis that has actually had an impact!!! ReST is an architectural style derived from looking at the Web and what makes it scalable It is a client/server architectural style in which clients retrieve representations of resources. When a new resource is retrieved, the state of the client, e.g., the browser, is changed. The term is now heavily used and misused ReST != HTTP Not all the Web is ReSTful
ReST Primary constraints: Client/server: request/response. Statelessness: interactions are stateless. Caching: reduces network load; HTTP headers support lots of caching policies. Uniform Interface: URIs, HTTP, MIME types. Layering: allows servers to act as gateways to areas of the network for firewalling and caching; allows evolution of parts of the network to happen incrementally without affecting the whole network. Code-on-Demand: e.g. Applets, SWF.
Statelessness Interactions on the Web are usually stateless Resources have (represent) state, but the interactions do not When I query Google, each query happens in isolation. There is no state preserved at the server regarding my query, even if I move through a number of results pages Experiment: add &start=20 to the end of your query What happens? You move to a new URI which is independent of the previous URI. (Look at the Gooooooogle links at the bottom of the page) The Cookie is typically used to represent interaction state Cookie Header is an opaque pointer to state Stored at the client Understood by the server – the state of interactions is on the server – not ReSTful
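A stateless interaction can be sketched as a function whose answer depends only on the URI it is given. Everything here is hypothetical: the `example.com` URIs, the `RESULTS` list standing in for a search index, and the page size of 10.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical result set standing in for a search index.
RESULTS = [f"result {n}" for n in range(1, 51)]
PAGE_SIZE = 10

def results_page(uri):
    """Return the page of results named by `uri`, using nothing but the URI."""
    params = parse_qs(urlsplit(uri).query)
    # The position in the result list travels in the request itself.
    start = int(params.get("start", ["0"])[0])
    return RESULTS[start:start + PAGE_SIZE]

# Requests can arrive in any order; no earlier request is remembered.
print(results_page("http://example.com/search?q=web&start=20")[0])
print(results_page("http://example.com/search?q=web")[0])
```

Because the server keeps no per-client record of which page came before, any page can be requested first, bookmarked, or shared.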
Caching and Linkability Caching Allows resources to be stored and retrieved locally, or closer to the source of the request This saves on bandwidth Is a form of load balancing Is possible because of the resource abstraction. Caching also applies to storage outside of the Web via URIs - in desktop applications, on TV, newspapers, billboards, human memory Linkability Web Architecture (Vol 1): It is a strength of Web Architecture that links can be made and shared; a user who has found an interesting part of the Web can share this experience just by republishing a URI. Makes the Web a web
Caching and Linkability Linkability and cacheability are achievable through two primary qualities. First, URIs being ‘addressable’: I can share them and publish them knowing others will get the same result as me when they de-reference the URI. This needs a certain longevity to URIs - if I link to a resource today, will it still be there tomorrow? Second, safe and idempotent interactions. Safe interactions are those with no side-effects: the user is not to be held accountable for the result of the interaction. I am not entering into a contract or agreeing to terms and conditions when I follow a link. If this were the case, how could you be sure that you were not at a deeper level within the domain, having unknowingly bypassed the terms and conditions page? Servers use redirects for this. Idempotent interactions: no matter how many times the identical request is repeated, the side-effects are the same as for a single request.
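The difference between safe, idempotent and non-idempotent interactions can be shown with a small in-memory sketch. The `get`, `put` and `post` functions and the path-numbering scheme are assumptions for illustration, not HTTP itself.

```python
def put(store, path, representation):
    """PUT: set the resource at `path`; repeating it changes nothing further."""
    store[path] = representation

def post(store, collection, representation):
    """POST: add a new resource under `collection`; each repeat adds another."""
    path = f"{collection}/{len(store) + 1}"
    store[path] = representation
    return path

def get(store, path):
    """GET: read-only, so it is safe as well as idempotent."""
    return store.get(path)

store = {}
put(store, "/notes/a", "x")
put(store, "/notes/a", "x")    # idempotent: still exactly one resource
print(len(store))
post(store, "/notes", "y")
post(store, "/notes", "y")     # not idempotent: two more resources
print(len(store))
```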
Uniform Interface Separation of resource from representation: what you see is not the actual thing, but a representation. Things may have multiple representations. Manipulation of resources by representations: changing the representation may induce a change in the state of the resource itself. Self-descriptive messages: allows each message to be independent; no need to track state changes across multiple exchanges. Hypermedia as the engine of application state (?): hypermedia is a medium that allows non-linear progression through states. A page with links - it's up to you where you go. So state is only perceived and controlled by the client
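One resource with several representations can be sketched as follows. The `WEATHER` data, the `RENDERERS` table and the `represent` function are all hypothetical, and the Accept handling is deliberately simplified: quality values (q=) are ignored and the first acceptable media type wins.

```python
import json

# Hypothetical resource state; the keys are illustrative only.
WEATHER = {"city": "Cardiff", "temp_c": 14}

# One renderer per media type the server can produce.
RENDERERS = {
    "application/json": lambda state: json.dumps(state, sort_keys=True),
    "text/plain": lambda state: f"{state['city']}: {state['temp_c']} C",
    "text/html": lambda state: f"<p>{state['city']}: {state['temp_c']} &deg;C</p>",
}

def represent(state, accept="*/*"):
    """Pick a representation for `state` from a simplified Accept header."""
    for item in accept.split(","):
        media_type = item.split(";")[0].strip()
        if media_type == "*/*":
            media_type = "text/html"  # default when the client accepts anything
        if media_type in RENDERERS:
            return media_type, RENDERERS[media_type](state)
    return None, None  # a real server would answer 406 Not Acceptable

print(represent(WEATHER, "application/json"))
print(represent(WEATHER, "image/gif, text/plain;q=0.5"))
```

The state in `WEATHER` never leaves the server; only one of its representations does, chosen per request.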
URL Every GET to Google is independent. Google doesn’t perceive your state changes. Possible next states: the client chooses the next state. Only the client perceives the continuum of states. “Hypermedia as the engine of application state”
Conclusion The Web core concepts: Uniform Access: HTTP. Naming: URIs. Representation: MIME types, primarily HTML. ReST: stateless interactions, caching, state at the client side.