The Architecture of the World Wide Web
Min Song, IS, NJIT
Internet Architecture
- Today’s Internet
  - Thousands of networks
  - Connected by legal agreements and commercial contracts
  - Uses the TCP/IP protocol suite
- Internet service providers (ISPs)
  - Provide most individual users with access to the Internet
  - Dialup connections: modems and conventional phone lines
  - xDSL and cable modems provide broadband access
Packet Switching
- Packet switching
  - Used by most modern Wide Area Network (WAN) protocols, including TCP/IP, X.25, and Frame Relay
  - More efficient and robust for data that can withstand some delay in transmission, such as messages and Web pages
- Circuit switching
  - Normal telephone service is based on circuit-switching technology
  - A dedicated line is allocated for transmission between two parties
  - Suited to data that must be transmitted quickly and must arrive in the same order in which it is sent
  - Typical use: real-time data, such as live audio and video
Use of Packets
Internet Protocols: TCP/IP
- Communications protocol suite
- Packet-switched protocol
  - No dedicated end-to-end connection is required
  - Each message is broken down into small pieces called packets
  - Packets are possibly routed to the destination over different paths
- Transmission Control Protocol (TCP)
  - Breaks messages into packets
  - Numbers packets in order
  - Reorders packets at the destination
- Internet Protocol (IP)
  - Routes packets to the proper destination
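
The TCP steps listed on this slide (break into packets, number them, reorder at the destination) can be illustrated with a toy model. This is only a sketch: the function names and the 8-byte packet size are assumptions made for the example, and real TCP numbers bytes rather than whole packets and also handles loss and retransmission.

```python
import random

def to_packets(message: bytes, size: int = 8):
    """Split a message into (sequence_number, payload) pairs."""
    return [(seq, message[i:i + size])
            for seq, i in enumerate(range(0, len(message), size))]

def reassemble(packets):
    """Sort packets by sequence number and join their payloads."""
    return b"".join(payload for _, payload in sorted(packets))

message = b"Packets may travel over different paths."
packets = to_packets(message)
random.shuffle(packets)           # simulate out-of-order arrival
restored = reassemble(packets)    # the original message, whatever the arrival order
```

Because every packet carries its own sequence number, the receiver does not need the packets to follow the same path or arrive in order.
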
Domain Names
- Every computer connected to the Internet must have a unique IP address
  - IP address format is xxx.xxx.xxx.xxx, where each xxx is a number between 0 and 255
  - How do we know that a given address is Microsoft?
- Domain Name System (DNS)
  - A database of Internet names
  - DNS servers convert Internet names to IP addresses
  - Top-level domains
- Ping: tests whether a particular host is reachable across an IP network
- Tcpdump: captures (sniffs) network packets so that the dumps can be analyzed statistically
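
As a small illustration of the address format and of name lookup, the sketch below checks whether a string is a valid dotted-quad address and asks the system resolver for the address of a name. The helper names `is_dotted_quad` and `resolve` are made up for this example; `ipaddress.IPv4Address` and `socket.gethostbyname` are standard-library calls.

```python
import ipaddress
import socket

def is_dotted_quad(text: str) -> bool:
    """Return True if text is a valid IPv4 address such as 192.0.2.1."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ipaddress.AddressValueError:
        return False

def resolve(name: str) -> str:
    """Ask the system resolver (and so, normally, DNS) for an IPv4 address."""
    return socket.gethostbyname(name)
```

Calling `resolve` needs network access, and the address it returns for a given name can change over time, so no expected output is shown here.
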
The World Wide Web
- Collection of hyperlinked computer files on the Internet
- Client-server application
  - Web servers
  - Web browsers as clients
- WWW standards
  - Hypertext Markup Language (HTML)
    - Current standard for writing Web pages
    - An application of SGML designed specifically for Web pages
    - Tags in HTML instruct the client browser how to format and display the Web page content
  - Hypertext Transfer Protocol (HTTP)
    - Protocol that establishes a connection between Web server and client
  - Extensible Markup Language (XML)
    - A meta-markup language
    - Gives meaning to the data enclosed within XML tags
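
To make the last point concrete, here is a minimal sketch of how XML tags label data so a program can pick out fields by name. The element names (`course`, `title`, `instructor`) are invented for the example and do not come from any standard vocabulary.

```python
import xml.etree.ElementTree as ET

document = """
<course>
  <title>The Architecture of the World Wide Web</title>
  <instructor>Min Song</instructor>
</course>
""".strip()

root = ET.fromstring(document)
# Each child tag names the meaning of the text it encloses.
record = {child.tag: child.text for child in root}
```

An HTML tag would say how to display this text; here the tags say what the text is.
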
Searching the WWW
- Most data on the Internet is part of the WWW
- Search engines: large databases that index WWW content
- Building the search engine database
  - Submit a site to the search engine administrator for listing
  - Spiders
  - Metatags
- Google, Yahoo
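
One common way to build a database that indexes content is an inverted index, which maps each word to the pages containing it. The slide does not say how any particular search engine works, so the following is only a toy sketch; the page URLs and text are made up.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical pages, as a spider might have collected them.
pages = {
    "http://example.com/a": "packet switching on the Internet",
    "http://example.com/b": "the hypertext transfer protocol",
}
index = build_index(pages)
```

A spider would supply the `pages` mapping by following links; metatags would be one more source of words for each page.
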
Hypertext Transfer Protocol
- A protocol (syntax and semantics) for transferring representations of resources, usually across the Internet using TCP
- Design goals
  - speed (stateless, cacheable, few round-trips)
  - simplicity
  - extensibility
  - data (payload) independence
- A true network-based API
HTTP/0.9 (pre-1993)
- Absolute Simplicity
  - Request: GET /url-path
  - Response: Hello World
- No Extensibility
  - only one method (GET)
  - no request modifiers
  - no response metadata
HTTP/1.0 (1993-present)
- Simple and (mostly) Extensible
- Request:
    GET /Test/hello.html HTTP/1.0
    Accept: text/html
    User-Agent: GET/5 libwww-perl/0.40
- Response:
    HTTP/ OK
    Date: Fri, 12 Jan :02:49 GMT
    Server: Apache/1.0.5
    Content-type: text/html
    Content-length: 38
    Last-modified: Wed, 10 Jan :

    Hello Hello out there!
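
The response metadata that HTTP/1.0 adds (status line plus header fields) is simple to pull apart. The sketch below parses a small response; the sample bytes are invented for illustration and are not the exchange shown on the slide, and `parse_response` is a hypothetical helper, not a library function.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.x response into status-line parts, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split at the first colon only; values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(status), reason, headers, body

# Illustrative response, assumed for this example.
raw = (b"HTTP/1.0 200 OK\r\n"
       b"Content-Type: text/html\r\n"
       b"Content-Length: 5\r\n"
       b"\r\n"
       b"Hello")
version, status, reason, headers, body = parse_response(raw)
```

Header names are lowercased because HTTP field names are case-insensitive.
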
HTTP/1.0 Deficiencies
- No complete specification until the end of '94
- No minimum standard for compliance
- Poor network behavior
  - one request per connection
  - no reliable transfer of dynamic content
  - no control over response caching
  - failed to anticipate proxies and gateways
  - created huge demand for vanity addresses
  - misuse/misunderstanding of MIME
HTTP/1.1
- Culmination of two years' work, RFC 2068
  - with Henrik Frystyk, Jim Gettys, Jeff Mogul
  - designed at UCI and W3C; expanded in IETF
- Improved Reliability
  - chunked transfer of dynamic content
  - recognition of proxy and gateway requirements
  - explicit cachability of responses
- Improved Network Behavior
  - persistent connections
  - virtual hosts (many names, one address)
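
Chunked transfer lets a server send dynamic content without knowing its total length in advance: each chunk is preceded by its size in hexadecimal, and a zero-size chunk ends the body. A minimal decoder might look like the sketch below; `decode_chunked` is a hypothetical helper that ignores trailers and does no error handling.

```python
def decode_chunked(data: bytes) -> bytes:
    """Decode a chunked body: hex size line, chunk data, CRLF, ending with size 0."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Ignore any chunk extensions after ';' on the size line.
        size_field = data[pos:line_end].split(b";")[0]
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            return body           # trailers, if any, are ignored in this sketch
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2    # skip the CRLF that follows the chunk data

encoded = b"5\r\nHello\r\n6\r\n world\r\n0\r\n\r\n"
decoded = decode_chunked(encoded)
```

Because the end of the body is marked in-band, the connection can stay open for the next request, which is what makes persistent connections workable for dynamic content.
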
HTTP/1.1 (1997-????)
- Less Simple, More Extensible, but Compatible
- Request:
    GET /Test/hello.html HTTP/1.1
    Host: kiwi.ics.uci.edu:8080
    User-Agent: GET/7 libwww-perl/5.40
- Response:
    HTTP/ OK
    Date: Fri, 07 Jan :40:09 GMT
    Server: Apache/1.2b6
    Content-type: text/html
    Transfer-Encoding: chunked
    Etag: "a797cd-465af"
    Cache-control: max-age=3600
    Vary: Accept-Language
    …
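
The `Cache-control: max-age=3600` field in this response is an example of explicit cachability: it tells a cache how many seconds the response may be reused. The sketch below reads that directive and applies a simplified freshness check. The helper names are assumptions, and real caches also consider fields such as `Expires` and `Age`, which are left out here.

```python
def max_age(cache_control: str):
    """Return the max-age value in seconds, or None if the directive is absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None

def is_fresh(age_seconds: int, cache_control: str) -> bool:
    """Simplified rule: a cached response is fresh while its age is below max-age."""
    limit = max_age(cache_control)
    return limit is not None and age_seconds < limit
```

With the header from the slide, a copy that is two minutes old would still be served from cache, while one older than an hour would be revalidated or refetched.
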
HTTP/1.x Deficiencies
- MIME is too verbose (overhead per message)
- Control mixed with metadata
- Metadata restricted to header or trailer
- Fixed request/response ordering can block progress
- Incurs frequent round-trip delays due to connection establishment
HTTP/2.x
- Tokenized transfer of common fields
  - reducing bandwidth usage, latency
  - removal of MIME syntax limitations
  - self-descriptive for extensions
- Multiplexing control, data, metadata streams
  - reducing desire for multiple connections
  - enabling multi-protocol connections
  - per-stream priority or credit mechanism
- Layered streams for meta-metadata, encryption...
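
The idea of multiplexing several streams over one connection can be sketched with a toy round-robin scheduler. The slide does not specify a frame format, so the (stream id, chunk) tuples, the 4-byte frame size, and the function names below are all assumptions made for illustration; this is not the framing of any actual HTTP version.

```python
from itertools import zip_longest

def multiplex(streams, frame_size=4):
    """Interleave fixed-size frames from several streams, tagging each with its stream id."""
    per_stream = [
        [(sid, data[i:i + frame_size]) for i in range(0, len(data), frame_size)]
        for sid, data in streams.items()
    ]
    frames = []
    # Take one frame from each stream per round until all streams are drained.
    for round_frames in zip_longest(*per_stream):
        frames.extend(f for f in round_frames if f is not None)
    return frames

def demultiplex(frames):
    """Rebuild each stream by concatenating its frames in arrival order."""
    out = {}
    for sid, chunk in frames:
        out[sid] = out.get(sid, b"") + chunk
    return out

streams = {1: b"control-data", 2: b"a much longer payload stream"}
frames = multiplex(streams)
```

Since frames from the short stream are interleaved with the long one, the short stream finishes early instead of waiting behind it, which is the blocking problem that fixed request/response ordering causes. A priority or credit mechanism would replace the plain round-robin used here.
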
XML to the rescue?
- “X” for extensible:
  - self-descriptive syntax
  - semantics by reference (doctype, namespaces)
  - rendering by reference (style sheets)
- An XML representation is an object turned inside-out, with behavior-by-reference
- However, network application performance will demand standards for domain-specific doctypes and style sheets
Future Work
- Dynamic application architectures
- Architectural analysis and performance bounds
- Impact of future network architectures (ATM)
- Balancing secure transfer with firewall visibility
- Protocol for manipulating resource mappings
- HTTP-NG (W3C/Xerox PARC)
- rHTTP (UCI)