IBM Research © 2009 IBM Corporation Highly Scalable Web Applications with Zero-Copy Data Transfer Toyotaro Suzumura, Michiaki Tatsubori, Scott Trent, Akihiko.


1 IBM Research © 2009 IBM Corporation Highly Scalable Web Applications with Zero-Copy Data Transfer Toyotaro Suzumura, Michiaki Tatsubori, Scott Trent, Akihiko Tozawa, and Tamiya Onodera Tokyo Research Laboratory IBM Research 2009/4/22 17:00-17:30 Web-Eng 3 - Web Architecture Aspect

2 Executive Summary
Scale-up approach still matters for ultimate scalability
–The performance of server-side applications is becoming increasingly important as more applications adopt the Web application model in the Web 2.0 era
Our Main Contribution
–A novel approach to boosting Web application performance, which we call Zero Copy Data Transfer, that reduces redundant memory copying and context-switch overhead between user space and kernel space
Achievements
–126% performance improvement with micro-benchmarks
–31% performance improvement for a standard Web benchmark, SPECweb2005

3 Outline of this talk
1. Motivation
2. Our Solution: Zero Copy Data Transfer
3. Performance Evaluation
4. Related Work, Future Direction and Conclusions

4 Background: Web Server Performance Does Matter
In recent years, software applications have increasingly been developed on the Web application model over HTTP, in the Web 2.0 era.
This rapidly growing use is dramatically increasing the performance requirements for Web application servers.
Giant internet companies such as Microsoft, Google, Amazon.com, and Yahoo! have deployed tens of thousands of physical hosts to maintain QoS and scalability
[Chart: Total Sites Across All Domains, August 1995 - March 2009 (Netcraft) http://news.netcraft.com/archives/web_server_survey.html]

5 Optimizing Web Server Performance
Static Web Server (only serving static files)
– In-kernel web servers [Almol, TOCS 04] [King, USENIX 01]
– Zero-copy approach [Lighttpd 1.5 beta, 08]
Dynamic Web Server (including business logic) = HTTP Server + Connector (SAPI) + Script Execution Runtime
– Scale-out approach with load balancing
– Optimizing the script execution runtime with a just-in-time compiler [Tozawa, PHP 08] [YARV]
– Dynamic Web server comparison: PHP vs. JSP [Trent, Middleware 2008]. Lighttpd / FastCGI / PHP was the best!
How can we make web sites more scalable to support users? We want more. Any other optimization?

6 Profiling Dynamic Web Server with SPECweb2005
The dynamic web server consists of the Lighttpd Web server, the PHP runtime (P9), and FastCGI as the SAPI
The profiling result shows that significant time is spent on memory copying in both the Web server and the PHP runtime

CPU Usage in PHP (P9) (SPECweb Banking)
Samples  CPU Usage (%)  Library              Process      Symbol
53999    9.4771         libc-2.6.so          lighttpd     memcpy (memory copy)
39009    6.8463         libc-2.6.so          phoebe-fcgi  memcpy (memory copy)
34101    5.9849         e1000                lighttpd     (no symbols)
23828    4.1819         libcrypto.so.0.9.8b  lighttpd     bn_mul_add_words
20247    3.5534         libp9rtsvc24.so      phoebe-fcgi  storeGenericAux
19645    3.4478         libcrypto.so.0.9.8b  lighttpd     bn_sqr_comba8
12944    2.2717         libcrypto.so.0.9.8b  lighttpd     BN_from_montgomery
11547    2.0266         libc-2.6.so          lighttpd     _int_malloc
10737    1.8844         libp9rtsvc24.so      phoebe-fcgi  loadIndex

7 Sniffing FastCGI Packets in SPECweb Banking
account_summary.php in SPECweb Banking, exchanged between the HTTP server and the PHP runtime over FastCGI
Request: 1381 bytes
Response: 19352 bytes (19 KB)

Request:
\0x01\0x01\0\0x01\0\0x08\0\0\0\0x01\0\0\0\0\0\0
\0x01\0x04\0\0x01\0x05T\0\0
\0x0F\0x0FSERVER_SOFTWARElighttpd/1.4.18
\0x0B\0x19SERVER_NAMEmich-is3.trl.ibm.com:8099
\0x11\0x07GATEWAY_INTERFACECGI/1.1
\0x0B\0x04SERVER_PORT8099
\0x0B\0x0CSERVER_ADDR9.116.14.105
\0x0B\0x04REMOTE_PORT3456
\0x0B\0x0BREMOTE_ADDR9.116.14.91
\0x0B\0x19SCRIPT_NAME/bank/account_summary.php
\0x09\0PATH_INFO
\0x0FFSCRIPT_FILENAME/home/suzumura/software/lighttpd/var/www/html/bank/account_summary.php
\r.DOCUMENT_ROOT/home/suzumura/software/lighttpd/var/www/html/
\0x0B\0x19REQUEST_URI/bank/account_summary.php
\0x0C\0QUERY_STRING
\0x0E\0x03REQUEST_METHODGET
\0x0F\0x03REDIRECT_STATUS200
\0x0F\0x08SERVER_PROTOCOLHTTP/1.1
\0x09\0x19HTTP_HOSTmich-is3.trl.ibm.com:8099
\0x0FYHTTP_USER_AGENTMozilla/5.0 (Windows; U; Windows NT 5.1; ja; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11
\0x0BcHTTP_ACCEPTtext/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
\0x14\0x17HTTP_ACCEPT_LANGUAGEja,en-us;q=0.7,en;q=0.3
\0x14\0x0CHTTP_ACCEPT_ENCODINGgzip,deflate
\0x13\0x1DHTTP_ACCEPT_CHARSETShift_JIS,utf-8;q=0.7,*;q=0.7
\0x0F\0x03HTTP_KEEP_ALIVE300
\0x0F\nHTTP_CONNECTIONkeep-alive
\0x0CFHTTP_REFERERhttp://mich-is3.trl.ibm.com:8099/bank/check_detail_html.php?check_no=1
\0x0B\0x80\0\0x01VHTTP_COOKIECoreID6=46881926841911901663357; w3ibmProfile=2005081020414106874663086|gASP|760|488|null; sauidp=U162807841190166314000; s_nr=1190095558906; ibmSurvey=1211076852290; CoreID6=46881926841911901663357; w3ibmProfile=2005081020414106874663086|gASP|760|488|null; sauidp=U162807841190166314000; s_nr=1190095558906; ibmSurvey=1211076852290;userid=1
\0x01\0x04\0\0x01\0\0\0\0
\0x01\0x05\0\0x01\0\0\0\0

Response:
\0x01\0x06\0\0x01\0x1F\0xF8\0\0
Set-Cookie: CoreID6=46881926841911901663357;path=/\r\n
Set-Cookie: w3ibmProfile=2005081020414106874663086|gASP|760|488|null;path=/\r\n
Set-Cookie: sauidp=U162807841190166314000;path=/\r\n
Set-Cookie: s_nr=1190095558906;path=/\r\n
Set-Cookie: ibmSurvey=1211076852290;path=/\r\n
Set-Cookie: userid=1;path=/\r\n
Content-Type: text/html\r\n
\r\n
[HTML body of the "SPECweb2005: Account Summary" page: a user ID table, an account table (Account, Type, Current Balance, Total Deposits, Average Deposit, Total Withdraws, Average Withdraws) with rows for accounts 0000000251 (Saving) and 0000000252 (Other), and navigation links (Account Summary, Check Detail, Change Profile, Transfer Money, Bill Pay, Add Payee, Quick Pay, Check Status, Order Check, Logout)]
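The byte runs at the start of the dump follow the FastCGI record layout: an 8-byte header (version, type, request ID, content length, padding length, one reserved byte) followed by the record body, with PARAMS bodies encoded as length-prefixed name/value pairs. The sketch below is a minimal Python decoder for those two pieces, included only to make the dump easier to read. The function names are illustrative and are not taken from the talk or from any FastCGI library.

```python
import struct

# FastCGI record header: version, type, request id, content length,
# padding length, and one reserved byte (8 bytes, big-endian).
HEADER = struct.Struct("!BBHHBx")
RECORD_TYPES = {1: "BEGIN_REQUEST", 4: "PARAMS", 5: "STDIN", 6: "STDOUT"}


def parse_header(data):
    """Decode the first 8 bytes of a FastCGI record."""
    version, rtype, request_id, length, padding = HEADER.unpack(data[:8])
    return {
        "version": version,
        "type": RECORD_TYPES.get(rtype, rtype),
        "request_id": request_id,
        "content_length": length,
        "padding_length": padding,
    }


def read_length(data, pos):
    """Read a 1-byte length, or a 4-byte length when the high bit is set."""
    if data[pos] < 0x80:
        return data[pos], pos + 1
    (value,) = struct.unpack_from("!I", data, pos)
    return value & 0x7FFFFFFF, pos + 4


def parse_params(body):
    """Decode the name/value pairs carried in a PARAMS record body."""
    params, pos = {}, 0
    while pos < len(body):
        name_len, pos = read_length(body, pos)
        value_len, pos = read_length(body, pos)
        name = body[pos:pos + name_len].decode()
        pos += name_len
        params[name] = body[pos:pos + value_len].decode()
        pos += value_len
    return params


# First two headers of the request shown in the dump.
print(parse_header(bytes([1, 1, 0, 1, 0, 8, 0, 0])))
print(parse_header(bytes([1, 4, 0, 1, 0x05, 0x54, 0, 0])))
# First and a later pair from the PARAMS body.
print(parse_params(b"\x0f\x0fSERVER_SOFTWARElighttpd/1.4.18\x09\x00PATH_INFO"))
```

Read this way, the dump opens with a BEGIN_REQUEST record for request 1, followed by a PARAMS record whose content length is 0x0554 (1364) bytes.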

8 Simplified Dynamic Web Page
[Diagram labels: Header Part; Relatively Static Part (Cached File); Semi-Static Part (Cached File, Long Characters, etc.); Dynamic Part; Footer Part; DB; File System]

9 Interaction between HTTP Server and PHP Runtime
PHP App:
header_processing();
echo file_get_contents(fileA);
footer_processing();
[Diagram labels: HTTP Server; PHP Runtime; User Space; Kernel Space; socket; Buffer; FastCGI; File; File System Buffer; File Content]

10 Outline of this talk
1. Background & Motivation
2. Our Solution: Zero Copy Data Transfer
3. Performance Evaluation
4. Related Work, Future Direction and Conclusions

11 Our Solution: Zero Copy Data Transfer
Reduces inter-process communication overhead between the web server and the PHP runtime
Normally, PHP reads the file contents, converts them to a string buffer, and forwards the buffer as a FastCGI packet. Instead, we do the following:
–The PHP runtime passes the file name within a FastCGI packet.
–The web server then uses this information to invoke a zero-copy system call such as sendfile, which is supported by major operating systems
Note that PHP developers need not modify their scripts to use our optimization
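To make the difference between the two paths concrete, here is a minimal Python sketch that sends the same file over a socket twice: once through a user-space buffer and once with `os.sendfile`. It illustrates the system call only and is not the Lighttpd or P9 implementation. The function names, the temporary file, and the local socket pair are assumptions made for the demonstration, and `os.sendfile` is available only on Unix-like systems.

```python
import os
import socket
import tempfile


def send_with_copy(sock, path):
    """Conventional path: read into a user-space buffer, then send it."""
    with open(path, "rb") as f:
        data = f.read()        # kernel -> user-space copy
    sock.sendall(data)         # user-space -> kernel copy
    return len(data)


def send_zero_copy(sock, path):
    """sendfile path: the kernel moves file data to the socket directly."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        offset = 0
        while offset < size:
            sent = os.sendfile(sock.fileno(), f.fileno(), offset, size - offset)
            if sent == 0:
                break
            offset += sent
    return offset


def demo():
    """Send a small temporary file both ways and check what arrives."""
    payload = b"static part of the page\n" * 64
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    server, client = socket.socketpair()
    try:
        results = []
        for send in (send_with_copy, send_zero_copy):
            sent = send(server, tmp.name)
            received = b""
            while len(received) < sent:
                received += client.recv(65536)
            results.append(received == payload)
        return results
    finally:
        server.close()
        client.close()
        os.unlink(tmp.name)


print(demo())
```

Both functions deliver identical bytes; the second simply never brings the file content into the sending process's memory, which is the copy the proposed approach aims to avoid.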

12 Proposed Approach
PHP App:
header_processing();
echo file_get_contents(fileA);
footer_processing();
[Diagram labels: HTTP Server; PHP Runtime; User Space; Kernel Space; socket; Buffer; FastCGI; File; File System Buffer; File URI; X-ZeroCopy; X-ZeroCopy Handling; sendfile]

13 File Processing: FTCS (File-Type Character String) Object
A new type of character string object that holds only the file name (URI) of a file, instead of reading the entire content of the file as an ordinary character string object does
[Diagram: for File(fileA) with content "abcdefghijklmnopqrstuvwxyz", an Ordinary Character String Object in the PHP runtime holds the full content, whereas an FTCS Object holds only "URI: fileA"]
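The idea of a string object that carries a file reference rather than the file's content can be sketched in a few lines of Python. The class below is a hypothetical stand-in for an FTCS object, not the P9 runtime's data structure: `FileBackedString`, `emit`, and `demo` are names invented for this illustration. The content is read only when an operation actually needs the characters, which mirrors the lazy loading described in the talk.

```python
import os
import tempfile


class FileBackedString:
    """Hypothetical stand-in for an FTCS object: holds a path, not the content."""

    def __init__(self, path):
        self.path = path
        self._content = None   # filled in only if the characters are needed

    @property
    def loaded(self):
        return self._content is not None

    def _load(self):
        if self._content is None:
            with open(self.path, "r", encoding="utf-8") as f:
                self._content = f.read()
        return self._content

    def __str__(self):
        return self._load()

    def upper(self):
        # An operation that needs the characters forces the lazy load
        # and yields an ordinary string.
        return self._load().upper()


def emit(value):
    """Output step: file-backed strings go out by reference, others by value."""
    if isinstance(value, FileBackedString):
        return ("file", value.path)
    return ("bytes", str(value))


def demo():
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("abcdefghijklmnopqrstuvwxyz")
    try:
        s = FileBackedString(tmp.name)
        by_reference = emit(s)          # content is never read here
        untouched = not s.loaded
        derived = emit(s.upper())       # forces the read, emitted by value
        return by_reference[0], untouched, derived, s.loaded
    finally:
        os.unlink(tmp.name)


print(demo())
```

Echoing the object unchanged produces only a file reference, while transforming it first falls back to an ordinary string and loses the zero-copy benefit.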

14 X-ZeroCopy: Enhancing FastCGI Protocol
X-ZeroCopy: New HTTP Header (HTTP Extension)
–File name and location are included in the body portion of a FastCGI message
–The header is recursively defined to allow the transmission of multiple files
Our PHP runtime automatically generates an X-ZeroCopy header and body content in a transparent manner from unmodified PHP scripts
X-ZeroCopy = "X-ZeroCopy" ":" #( offset "/" length )

15 Example
X-ZeroCopy: 5/10
hello/tmp/A.htmworld
[Diagram labels: PHP Runtime; FTCS (A.html)]
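In the example, offset 5 and length 10 pick out the ten characters `/tmp/A.htm` inside the body, leaving `hello` before it and `world` after it as literal output. A minimal Python sketch of how a web server might split such a body is shown below. It assumes that offsets index into the body as received, that multiple offset/length pairs are comma-separated (the usual reading of `#` in HTTP header grammars), and that ranges are sorted and non-overlapping. None of these details is confirmed by the slides, and the function names are illustrative.

```python
def parse_zero_copy(header_value):
    """Parse a header value such as "5/10, 40/12" into (offset, length) pairs."""
    ranges = []
    for item in header_value.split(","):
        offset, length = item.strip().split("/")
        ranges.append((int(offset), int(length)))
    return ranges


def split_body(body, ranges):
    """Split a body into ("data", bytes) and ("file", path) segments.

    Assumes offsets refer to positions in the body as received and that
    the ranges are sorted and do not overlap.
    """
    segments, pos = [], 0
    for offset, length in ranges:
        if offset > pos:
            segments.append(("data", body[pos:offset]))
        segments.append(("file", body[offset:offset + length].decode()))
        pos = offset + length
    if pos < len(body):
        segments.append(("data", body[pos:]))
    return segments


segments = split_body(b"hello/tmp/A.htmworld", parse_zero_copy("5/10"))
print(segments)
# [('data', b'hello'), ('file', '/tmp/A.htm'), ('data', b'world')]
```

A server following this scheme would write each "data" segment to the client socket as usual and hand each "file" segment to a zero-copy call such as sendfile.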

16 Transparency to Applications
PHP developers need not modify their applications to leverage our proposed approach
If an operation has a side effect on an FTCS object, our PHP runtime loads the content of the file when needed (Lazy I/O Processing)
–In this case, no performance improvement can be obtained

17 Outline of this talk
1. Background & Motivation
2. Our Solution: Zero Copy Data Transfer
3. Performance Evaluation
4. Related Work, Future Direction and Conclusions

18 Performance Evaluation
Micro-benchmark
–Theoretically, our approach should be effective when the FastCGI communication overhead between the PHP runtime and the Web server is a major performance bottleneck.
–To find the file size at which this overhead becomes a bottleneck, we prepared a micro-benchmark: a simple PHP script that outputs a file (ranging from 10 KB to 200 KB) via the file_get_contents PHP function.
SPECweb2005
–A standard web benchmark consisting of 3 representative web scenarios: Banking, Ecommerce, and Support

19 Comparative PHP Runtimes: P9 vs. P9ZC vs. Zend
P9
–Our research PHP runtime with a just-in-time compiler
–Single thread + 8 FastCGI processes
P9ZC
–P9 with our proposed zero-copy data transfer
–Single thread + 8 FastCGI processes
Zend (PHP 5.2.5) (only used for SPECweb)
–A major PHP runtime available from www.php.net
–APC (Alternative PHP Cache) is turned on to allow PHP intermediate code to be cached in shared memory
–Single thread + 8 FastCGI processes

20 Micro-benchmark: Throughput and Speedup (%) of P9ZC over P9 with varying file sizes
The speedup of P9ZC over P9 increases from 1.26 for a 10 KB file up to 2.26 for a 60 KB file. Beyond 60 KB, the speedup gradually decreases, but P9ZC remains roughly twice as fast as P9
Apache Bench (ab) with 1 process, 100 concurrent requests, and a 60-second run, measured after sufficient warm-up
SUT (3GB, 3.4GHz Xeon, Fedora Core7), Prime Client (3.4 GHz Xeon), P9 as of 2008/08/08, Zend (PHP 5.2.5) with APC enabled
[Chart: throughput of P9 and P9ZC, and speedup, plotted against file size]
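The speedup factors on this slide and the percentage quoted in the summary appear to be two views of the same measurement: a speedup of 2.26 is a 126% improvement over the baseline. The short Python sketch below shows the conversion, assuming speedup is defined as P9ZC throughput divided by P9 throughput. That definition and the helper name are assumptions, since the slide does not state them.

```python
def improvement_percent(speedup):
    """Convert a speedup factor (new / baseline) into a percentage improvement."""
    return round((speedup - 1.0) * 100.0, 1)


# Speedup factors reported for the 10 KB and 60 KB files.
for size_kb, speedup in [(10, 1.26), (60, 2.26)]:
    print(size_kb, "KB:", improvement_percent(speedup), "%")
```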

21 Micro-benchmark: CPU usage for the memcpy function
P9ZC significantly reduces CPU usage for the memcpy function, while P9 spends approximately 20% to 60% of CPU time on memory copying
[Chart: % of CPU time in memcpy plotted against file size; lower is better]

22 Performance Evaluation with SPECweb2005
[Diagram: Clients (emulators) connect over HTTP to the HTTP server on the Application Server; the HTTP server talks to the PHP processes over the FastCGI protocol (named pipe / TCP socket); the Backend (Business Logic / DB) Simulator is reached over a TCP/IP socket]
Component labels: 6 clients, Client (Emulator); Lighttpd 1.4.19, mod_fcgi; PHP Process, 8 processes; BESIM (database simulator), Apache HTTP server (FastCGI)
Machine labels: Linux (kernel 2.6.17); Linux 2.6.18, 2GB RAM, Xeon 2.4GHz, 2 CPU; Fedora Core 7, kernel 2.6.17, Pentium 4 3.4GHz, 2GB RAM; Linux 2.6.17, Xeon 2.4GHz, 1GB; 6 clients (Linux 2.6.2, Xeon 3.0GHz, 3GB)

23 SPECweb2005 Banking
No performance improvement is observed because sendfile is ineffective with SSL communication
Average Data Transfer Size: 34.8 KB
SUT (3GB, 3.4GHz Xeon, Fedora Core7), 3 Clients and Prime Client (3.4 GHz Xeon), BESIM: Apache 2.2.26, IBM J2RE 1.5.0 Linux build, P9 as of 2008/08/08, Zend (PHP 5.2.5) with APC enabled, SPECweb 3 minute run
[Chart: throughput; higher is better]

24 SPECweb2005 Ecommerce
Our approach outperforms original P9 by 22%, and Zend by 57%
Average Data Transfer Size: 143.9 KB
SUT (3GB, 3.4GHz Xeon, Fedora Core7), 3 Clients and Prime Client (3.4 GHz Xeon), BESIM: Apache 2.2.26, IBM J2RE 1.5.0 Linux build, P9 as of 2008/08/08, Zend (PHP 5.2.5) with APC enabled, SPECweb 3 minute run
[Chart: throughput against sessions; higher is better]

25 SPECweb2005 Support
Our approach outperforms original P9 by 31%, and Zend by 61%
Average Data Transfer Size: 78.5 KB
SUT (3GB, 3.4GHz Xeon, Fedora Core7), 3 Clients and Prime Client (3.4 GHz Xeon), BESIM: Apache 2.2.26, IBM J2RE 1.5.0 Linux build, P9 as of 2008/08/08, Zend (PHP 5.2.5) with APC enabled, SPECweb 3 minute run
[Chart: throughput against sessions; higher is better]

26 CPU Usage of memcpy in 3 scenarios
P9ZC dramatically decreases CPU time used for memory copying
Average data transfer sizes: Banking: 34.8 KB; Ecommerce: 143.9 KB; Support: 78.5 KB
SUT (3GB, 3.4GHz Xeon, Fedora Core7), 3 Clients and Prime Client (3.4 GHz Xeon), BESIM: Apache 2.2.26, IBM J2RE 1.5.0 Linux build, P9 as of 2008/08/08, Zend (PHP 5.2.5) with APC enabled, SPECweb 3 minute run
[Chart: % CPU used for memcpy by P9ZC, P9, and Zend in each scenario]

27 Outline of this talk
1. Background & Motivation
2. Our Solution: Zero Copy Data Transfer
3. Performance Evaluation
4. Related Work, Future Direction, and Conclusions

28 Related Work
Zero Copy Approach
– Evaluation of sendfile [Nahum, TON 02]
– Faster FastCGI in Lighttpd 1.5 beta [08] (implemented as P9 Level 1)
SSL-enabled sendfile [Keromytis, TOCS 06]
–Could allow our approach to be used for the SPECweb Banking scenario
In-kernel Web servers [Armol, TOCS 04] [King, USENIX 01]
–Their focus is on sending only static files

29 Future Directions
Apply our approach to other programming languages such as Java, Ruby, and Python
Performance evaluation with more applications such as SugarCRM, MediaWiki, and phpBB
Extend the proposed approach to general cases where file processing is not explicitly required
–Constant and long character sequences can be dynamically and/or statically stored in a flat file
–A web server sends the file to web clients via the sendfile system call
–Challenge: runtime overhead

30 Conclusions
Proposed a novel approach that improves Web application performance through zero-copy data transfer
Showed promising performance improvement over our original PHP runtime with SPECweb2005
–126% performance improvement with micro-benchmarks
–31% performance improvement for a standard Web benchmark, SPECweb2005

