Implementation of a parallel web proxy server with caching. Presented by: Kaushik Choudhary
Outline: Introduction, Design Challenges, Design Challenges - Solutions, Implementation, Experimental Setup, Results
Introduction: What is an HTTP/1.0 web proxy? – Already covered by others. How does it work? – Already covered by others. What is a parallel web proxy with caching? – Already covered by others.
Design Challenges: Dozens of open-source implementations are available online to take “inspiration” from. The main challenges: system calls, parallelism, and an efficient cache.
Design Challenges – System calls
Thread-unsafe version    Thread-safe / re-entrant alternative
asctime()                asctime_r()
gethostbyaddr()          gethostbyaddr_r()
gethostbyname()          gethostbyname_r()
inet_ntoa()              inet_ntop()
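As a minimal sketch of why the re-entrant variants matter: asctime() and inet_ntoa() return pointers to shared static storage, so two worker threads can overwrite each other's result, while asctime_r() and inet_ntop() write into a buffer the caller owns. The helper names `format_time_utc` and `ipv4_to_string` are hypothetical and are not taken from the project code; the sketch assumes a POSIX system.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <ctime>
#include <string>

// Hypothetical helper: format a timestamp without using the static buffer
// behind asctime(). asctime_r() needs a caller-supplied buffer of >= 26 bytes.
std::string format_time_utc(std::time_t t) {
    std::tm parts{};
    gmtime_r(&t, &parts);                  // re-entrant counterpart of gmtime()
    char buf[26];
    if (asctime_r(&parts, buf) == nullptr) return "";
    std::string s(buf);
    if (!s.empty() && s.back() == '\n') s.pop_back();  // drop trailing newline
    return s;
}

// Hypothetical helper: convert an IPv4 address to dotted-decimal text using a
// caller-owned buffer instead of the shared storage used by inet_ntoa().
std::string ipv4_to_string(const in_addr& addr) {
    char buf[INET_ADDRSTRLEN];
    if (inet_ntop(AF_INET, &addr, buf, sizeof buf) == nullptr) return "";
    return buf;
}
```

Each call works only on its own stack buffer, so the helpers can be called from any number of threads without extra locking.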
Design Challenges – Parallelism: The pseudo-code in the project description says “Spawn a worker thread to handle the connection”. What if there were 100 connection requests? If there is a request queue, how do threads access it atomically?
Design Challenges – Caching: Size of the cache? Eviction policy? Atomic and efficient access to the cache.
Design Challenges – Parallelism solutions: If there were 100 requests, create a predetermined number of threads and assign each request to a thread as a task. Use locks to access the task queue. Atomic and efficient access to the cache.
Design Challenges – Caching solutions: Currently stores at most 20 responses (web pages) [TODO – limit size]. Eviction policy used: LRU. Store the pages and timestamps in a doubly linked list, move a node to the head when it is accessed, and use a hashmap to index the elements of this list.
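The list-plus-hashmap layout can be sketched as follows. This is a simplified illustration, not the project's cache: the class name `LruCache` and its `get`/`put` interface are hypothetical, the capacity is a constructor parameter (the slides use 20), timestamps are left out, and a single mutex is assumed as the way to make access atomic.

```cpp
#include <pthread.h>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical LRU cache: the list keeps pages in recency order (head = most
// recently used) and the hashmap maps a URL to its node for O(1) lookup.
class LruCache {
public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity) {
        pthread_mutex_init(&lock_, nullptr);
    }
    ~LruCache() { pthread_mutex_destroy(&lock_); }
    LruCache(const LruCache&) = delete;
    LruCache& operator=(const LruCache&) = delete;

    // On a hit, copies the response into *out and moves the node to the head.
    bool get(const std::string& url, std::string* out) {
        pthread_mutex_lock(&lock_);
        auto it = index_.find(url);
        const bool hit = (it != index_.end());
        if (hit) {
            pages_.splice(pages_.begin(), pages_, it->second);
            *out = it->second->second;
        }
        pthread_mutex_unlock(&lock_);
        return hit;
    }

    // Inserts or refreshes a page; evicts the tail node when the cache is full.
    void put(const std::string& url, std::string response) {
        pthread_mutex_lock(&lock_);
        auto it = index_.find(url);
        if (it != index_.end()) {
            it->second->second = std::move(response);
            pages_.splice(pages_.begin(), pages_, it->second);
        } else if (capacity_ > 0) {
            if (pages_.size() == capacity_) {
                index_.erase(pages_.back().first);  // least recently used
                pages_.pop_back();
            }
            pages_.emplace_front(url, std::move(response));
            index_[url] = pages_.begin();
        }
        pthread_mutex_unlock(&lock_);
    }

private:
    using Entry = std::pair<std::string, std::string>;  // (URL, response)
    std::size_t capacity_;
    pthread_mutex_t lock_;
    std::list<Entry> pages_;
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

Moving a node with `splice` does not invalidate list iterators, which is what allows the hashmap to keep pointing at the same nodes as they are reordered.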
Implementation: Used an open-source object-oriented design from “ Implemented pthreads and OpenMP versions for parallelism. Implemented the cache as described above.
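The slides do not show how the OpenMP version is structured, so the following is only one plausible shape: a batch of requests handed to a parallel loop. The function name `serve_all` and the `handle` callback are hypothetical stand-ins for whatever the proxy does per request.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical batch handler: applies `handle` to every request, with the
// iterations shared among the OpenMP team. When built without OpenMP support
// the pragma is ignored and the loop runs serially with the same result.
std::vector<std::string> serve_all(
        const std::vector<std::string>& requests,
        const std::function<std::string(const std::string&)>& handle) {
    std::vector<std::string> responses(requests.size());
    const long n = static_cast<long>(requests.size());
    #pragma omp parallel for schedule(dynamic)
    for (long i = 0; i < n; ++i)
        responses[i] = handle(requests[i]);   // each iteration owns one slot
    return responses;
}
```

Dynamic scheduling is chosen here on the assumption that page fetch times vary widely; `handle` must not throw, since an exception may not leave an OpenMP parallel region.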
Experimental Setup: Created command-line scripts to open 30 distinct non-HTTPS websites (google-chrome command line). Using distinct websites avoids interference from the browser cache. Measured the total time to serve all requests in the threadless version, different pthreads versions, and the OpenMP version of the code. Conducted experiments on two machines: a desktop (Core i5 2.4 GHz, 4GB RAM) and a laptop (Core i GHz, 8GB RAM).
Results – Access times (no-cache)
Results – Speedup (desktop)
Thank you!