1 Web Proxies Dr. Rocky K. C. Chang 6 November 2005.
2 Motivation for Web proxies. Sharing access to the Web: clients requesting the same resource from an origin server may share a single connection. Caching responses: the same benefit applies when clients request the same resource at different times. Anonymizing clients: a proxy can hide client identities from the origin server, although some proxies are configured to forgo anonymity by adding an identifying header to the message.
3 Motivation for Web proxies. Transforming requests and responses: different HTTP versions, or different compression algorithms, on the proxy-server and client-proxy sides. Gateway to non-HTTP systems: e.g., between a Web client and an FTP server. Filtering requests and responses: filtering based on URLs or keywords.
4 A classification of Web proxies. Caching vs noncaching proxies. Transparent vs nontransparent proxies: a transparent proxy does not modify the request or response in anything more than a superficial manner, e.g., adding identification information about itself or the server from which the message was received.
5 A classification of Web proxies. Interception vs explicit proxies: interception proxies are often referred to as transparent proxies, because their presence is transparent to clients. Forward vs reverse proxies (or surrogates): forward proxies are placed close to clients; reverse proxies are placed close to origin servers and may not use HTTP to communicate with the origin servers behind them.
6 Explicit proxies. Clients are aware of the proxy's presence: they send requests to the proxy instead of the origin server, and they delegate DNS resolution to the proxy. Explicit proxy configuration: explicit client configuration, or browser autoconfiguration, in which the browser is configured to download a special URL that identifies the proxy to use.
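The difference between talking to an explicit proxy and talking directly to an origin server can be sketched in a few lines of Python. This is an illustrative sketch, not a real client (the proxy address used in the usage note is made up): with an explicit proxy the client connects to the proxy, puts the absolute URI on the request line, and leaves DNS resolution of the origin server's name to the proxy.

```python
from urllib.parse import urlsplit

def build_request(url, proxy=None):
    """Return (connect_host, connect_port, request_line) for a GET.

    With an explicit proxy the client connects to the proxy and places
    the absolute URI on the request line; the proxy then resolves the
    origin server's domain name itself.  `proxy` is a hypothetical
    (host, port) tuple.  A real HTTP/1.1 client would also send a Host
    header and the rest of the message; only the request line is shown.
    """
    parts = urlsplit(url)
    if proxy:
        # Explicit proxy: DNS resolution of the origin is delegated to it.
        return proxy[0], proxy[1], f"GET {url} HTTP/1.1"
    # Direct request: the client resolves the name and sends only the path.
    return parts.hostname, parts.port or 80, f"GET {parts.path or '/'} HTTP/1.1"
```

For example, `build_request("http://www.w3.org/pub/WWW/TheProject.html", proxy=("proxy.example", 3128))` connects to the proxy and keeps the full URL on the request line, while the same call without `proxy` connects to www.w3.org:80 and sends only the path.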
8 Interception proxies. Require some network element (interceptor) to intercept all traffic from Web clients and divert it to an interception proxy. [Figure: the interceptor diverts client traffic with destination port 80 to the proxy; other traffic passes through toward the origin server.]
9 Interception proxies and TCP. After the interceptor intercepts the first packet of a request (TCP SYN) and forwards it to a proxy, the proxy impersonates the origin server: it establishes a TCP connection with the client, using the original destination IP address and port number as its source, and sends a response to the client if it has the requested resource; otherwise, it forwards the request to the origin server.
10 Interception proxies and DNS. Unlike with explicit proxies, clients here resolve the origin servers' domain names themselves. The interception proxy needs the domain name of the origin server in order to resolve its IP address. An HTTP/1.0 request may lack a Host header indicating the server's domain name; HTTP/1.1 mandates the inclusion of a Host header in every request. GET /pub/WWW/TheProject.html HTTP/1.1 Host: www.w3.org
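The point above can be made concrete with a small sketch of how an interception proxy might recover the origin server's name from an intercepted request (the parsing helper and byte strings are illustrative, not any real proxy's code): scan the headers for Host, and for an HTTP/1.0 request that omits it, report that no name is available.

```python
def origin_host(raw_request: bytes):
    """Recover the origin server's domain name from an intercepted request.

    HTTP/1.1 guarantees a Host header; an HTTP/1.0 request may omit it,
    in which case the interception proxy has no domain name to work with
    (here signalled by returning None).
    """
    # Keep only the header section, before the blank line.
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("ascii", "replace")
    # Skip the request line; look for a Host header (case-insensitive).
    for line in head.split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip()
    return None
```

Applied to the HTTP/1.1 example above, this yields "www.w3.org"; applied to a bare HTTP/1.0 request such as `GET /index.html HTTP/1.0`, it yields nothing.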
11 Pros and cons of interception proxies. Pros: no explicit client configuration is needed; the proxy's domain name and IP address are not exposed to clients; better performance during proxy failures or overload. Cons: violation of the end-to-end principle; responses from the origin servers must return through the same interceptor, otherwise a multipath problem would occur.
12 Interception mechanisms. Once intercepted, the packet is delivered to the proxy using either a layer-2 or a layer-3 mechanism. Layer-2 solution: replace the destination MAC address with the proxy's MAC address. No modifications to the IP packets, but the interceptor and the proxy must be directly connected in a datalink network, and modifications to the proxy's protocol stack are also required.
13 Interception mechanisms. Layer-3 solution: the packet is tunneled in another IP packet destined for the proxy. Another solution: how about simply replacing the destination IP address with that of the proxy?
15 Layer-4 switches as interceptors. Each switch may hash the source and destination addresses of outgoing traffic in order to distribute the load over a set of proxies. Each switch must produce the same hash value for the same pair of source and destination addresses. Layer-4 switches use the layer-2 solution to forward intercepted packets to the proxy.
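The hashing requirement above can be sketched as follows. This is a toy model (the choice of MD5 and the proxy names are illustrative; real switches hash in hardware): because the mapping is a pure function of the source and destination addresses, every switch independently selects the same proxy for the same flow.

```python
import hashlib

def pick_proxy(src_ip, dst_ip, proxies):
    """Deterministically map a (source, destination) address pair to a proxy.

    Any switch computing this function picks the same proxy for the same
    flow, so all packets of a connection reach the same cache regardless
    of which switch they traverse.
    """
    key = f"{src_ip}-{dst_ip}".encode()
    # Take 4 bytes of the digest as an unsigned integer and reduce it
    # modulo the number of proxies.
    digest = int.from_bytes(hashlib.md5(key).digest()[:4], "big")
    return proxies[digest % len(proxies)]
```

A simple hash like this has a known drawback: adding or removing a proxy reshuffles almost all flows, which is why consistent-hashing schemes are often preferred in practice.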
16 Routers as interceptors. When multiple paths are available (through two different routers), only one of them is configured as the primary access router and the interceptor. [Figure: client traffic reaches the proxy through one of two redundant routers.]
17 Layer-7 switches as interceptors A layer-7 switch understands application protocols (HTTP in our case). Intercepts the TCP SYN segment for a request and performs handshaking. Intercepts and interprets requests and only then forwards client packets to a proxy or the Internet. The content-aware switch can be configured to send requests for different types of content to different proxies.
18 Forward proxy caches. Caches at browsers, ISP proxies, enterprise proxies, and local exchange proxies. Forward proxy caches are intended to reduce latency and bandwidth usage by "sharing hits." Increasing shared hits, however, can conflict with the objective of latency and bandwidth reduction, because it pushes the cache farther from the clients.
19 Latency reduction. Reduce the time to send the request and receive the response (the proxy is closer to the client); the reduction is higher when the access network speed is higher. Reduce the time to establish the TCP connection between client and origin server; the reduction is greater when TCP connections are reused. Other TCP-related factors: TCP connection splitting, network congestion, etc.
20 TCP connection caching. A proxy may maintain persistent TCP connections with its clients on one side and with origin servers on the other: reusing TCP connections instead of reusing cached objects. [Figure: clients A and B connect through the proxy to origin servers Q and S.]
21 Benefits of TCP connection caching. A requests a resource from Q, and later B requests another resource from Q: a persistent connection between the proxy and Q eliminates the second connection setup. A requests a resource from Q, and later A requests another resource from S: a persistent connection between the proxy and A eliminates the second connection setup. Moreover, origin servers keep persistent connections with proxies instead of with individual clients.
22 Benefits of TCP connection caching. To avoid the head-of-line problem, the proxy must be careful to reuse only idle cached connections to servers. Studies showed that connection caching generally provides much greater latency reduction than data caching. In a modem environment, connection caching alone reduced mean and median latencies by 21% and 40%, respectively; together with data caching, the total benefits were 24% and 48%, respectively.
23 Benefits of TCP connection caching. In an Ethernet LAN environment, connection caching alone reduced mean and median latencies by 2% and 20%, respectively; together with data caching, the total benefits were 47% and 40%, respectively. In the modem environment, sizable benefits can be achieved by maintaining connections with the clients only.
24 TCP connection splitting. A TCP connection is split by a proxy into two connections, each with roughly half of the original RTT; the proxy incurs some processing delay. Since ACKs come back in half the time, each side can send new packets and increase its congestion window faster. [Figure: the original round-trip time RTT is split into two legs of RTT/2 each.]
25 TCP connection splitting. Gets out of the slow-start phase quickly. Retransmission timeout values are smaller when there are packet losses, since they scale with the shorter RTT. Even with the additional delay at the proxy, previous studies on the case of two proxies reported that the throughput is almost doubled. Connection splitting only benefits large resources, and only when the throughput is limited by the TCP congestion window.
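The slow-start benefit can be quantified with a little arithmetic. The sketch below is a deliberately loss-free, pipelining-free simplification (it ignores the extra RTT/2 needed to fill the second leg and the proxy's processing delay): the congestion window doubles every round trip, so a transfer that needs a given number of rounds finishes in half the wall-clock time when each round trip is halved.

```python
def slow_start_rounds(segments, init_cwnd=1):
    """Round trips slow start needs to deliver `segments` segments,
    with the congestion window doubling each round (no losses)."""
    sent, cwnd, rounds = 0, init_cwnd, 0
    while sent < segments:
        sent += cwnd
        cwnd *= 2
        rounds += 1
    return rounds

def transfer_time(segments, rtt_ms, split=False):
    """Rough transfer time in ms.  A split connection sees rtt/2 per leg,
    so the window grows twice as fast in wall-clock time."""
    return slow_start_rounds(segments) * (rtt_ms // 2 if split else rtt_ms)
```

For example, delivering 15 segments takes 4 round trips (1 + 2 + 4 + 8); at a 200 ms RTT that is 800 ms unsplit versus 400 ms split, which also illustrates why the gain matters only for resources large enough to span several congestion-window rounds.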
26 Reduction in bandwidth. The complicating factor is aborted object transfers. When a proxy learns about an abort, it can do one of two things: it can continue transferring the object, so that it has the object in the cache for future use, or it can "forward" the abort. The first approach wastes bandwidth if the hit rate is not high or the object is not cacheable.
27 Reduction in bandwidth. The second approach is not very effective if the speed of the client-side link is lower than that of the server-side link: by the time the abort is received, the proxy has already received most, if not all, of the object. [Figure: the client's FIN reaches the proxy only after the server's response has largely been delivered to the proxy.]
28 Summary. Proxies, notably Web proxies, are part of today's Internet infrastructure. Although they violate the end-to-end principle, Web proxies enhance Web performance in a number of important ways. Interception proxies offer a number of advantages over traditional explicit proxies. Forward proxy caches generally reduce latency by data and connection caching.
29 Acknowledgments. The slides are prepared mainly based on: B. Krishnamurthy and J. Rexford, Web Protocols and Practice, Addison-Wesley, 2001; M. Rabinovich and O. Spatscheck, Web Caching and Replication, Addison-Wesley, 2002.