540 -> 624 Members meetup.com/achieverstech achievers.com/tech
Profiling & Tuning a Web Application Kaelen Proctor
So what if my site isn’t the fastest? Response time directly relates to your business – In 2007, Amazon determined that a 100 ms increase in load time caused a 1% drop in sales – In 2009, Shopzilla cut page load time from 6 to 1.2 seconds, which netted a 7-12% increase in conversion rate The slower you serve up pages, the more frustrated your customers become
What exactly is profiling? Profiling is a dynamic analysis of the time complexity, frequency/duration of function calls, or memory allocation of a program A profiling tool runs this analysis by instrumenting either the source or executable, through a variety of techniques including event hooks or dynamic recompilation
Goals To show off some tools of the profiling trade To demonstrate how to use them effectively and identify the biggest “bang for your buck” bottlenecks To impress upon you the need to integrate profiling early and continuously into your development cycle
Agenda 1.Profiling your application I.What a single request looks like II.The database (MySQL) III.Application code (PHP) IV.In the Browser 2.Maintaining performance at scale I.Load testing II.Production monitoring
Before you embark What is your performance goal? Where is that relative to today? What processes are necessary to maintain the goal?
Profiling for a Web Application Web applications are all about speed: how quickly a response can be sent and made usable On the app server, that means understanding the queries run, 3rd-party libraries, web APIs, and application code Simple can sometimes be best – We use CodeIgniter (CI) here and its built-in request profiler is easy to use and extremely helpful
Enabling the CI profiler Drop this line anywhere before the controller ends: – $this->output->enable_profiler(true); The output class injects the profiling content at the end of the <body> tag Let’s see what it looks like Demo site: Special K’s Video Rentals
How do the profile details help? Great overview of what is occurring in the request Queries executed is the most important aspect of a profile – Identification of long-running or duplicate queries Adding timing benchmarks can give a lot of insight – Especially if you leverage a lot of 3rd-party libraries or web services
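The timing benchmarks above boil down to recording timestamps around the code you suspect. A minimal standalone sketch of the idea (in a real CodeIgniter controller you would call `$this->benchmark->mark('tag')` instead, and the profiler output lists the elapsed time between paired marks; the mark names and the 50 ms sleep here are invented for illustration):

```php
<?php
// Record a named timestamp, CI-benchmark style.
$marks = [];
function mark(array &$marks, string $name): void {
    $marks[$name] = microtime(true);
}

mark($marks, 'api_call_start');
usleep(50 * 1000); // stand-in for a slow 3rd-party / web-service call
mark($marks, 'api_call_end');

// The CI profiler would render this pair as one timed row.
printf("api_call: %.0f ms\n",
    ($marks['api_call_end'] - $marks['api_call_start']) * 1000);
```

Wrapping each 3rd-party library or web-service call in a pair of marks like this is what makes the profiler output tell you *where* the time went, not just the total.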
It’s more of a guideline Most likely, you’re profiling on your dev machine with test data No idea how the request will scale – No competition for resources (i.e. database) – Have you profiled all possible configurations? Is anyone profiling or even paying attention to the details?
… In my experience, they aren’t No matter how much documentation you write on how to profile and which tools to use, it will get dropped in crunch time Most developers didn’t even turn the profiler on
Achievers Performance Header ™ Leveraging CI’s profiler, we tie the profiling summary to our performance targets Text is colour-coded on a linear scale from green to red based on how far the request is from our targets Expanding the header shows the summarized performance details
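The green-to-red scale can be sketched as a simple linear interpolation. This is a hypothetical reconstruction of the idea, not our actual implementation; the function name, the 2x-target-equals-red cutoff, and the sample numbers are all invented:

```php
<?php
// Map a request time onto a green -> red hex colour relative to a target.
function header_colour(float $actual_ms, float $target_ms): string {
    // 0.0 = at/under target (green), 1.0 = double the target or worse (red)
    $ratio = min(max(($actual_ms - $target_ms) / $target_ms, 0.0), 1.0);
    $red   = (int) round(255 * $ratio);
    $green = (int) round(255 * (1 - $ratio));
    return sprintf('#%02x%02x00', $red, $green);
}

echo header_colour(200, 400), "\n"; // under target -> pure green (#00ff00)
echo header_colour(800, 400), "\n"; // 2x target    -> pure red   (#ff0000)
```

The point is that the colour is computed from the same profiler numbers CI already collects, so the header costs essentially nothing extra per request.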
Shoving it in their faces Now once any target’s threshold is passed, the header defaults to the expanded view
Knowing is half the battle Finding issues early => more time to fix Always profiling => instant detection of a performance-killing change But there is balance – “Premature optimization is the root of all evil” – Wait until a feature is working before making it work fast
Database performance is critical It is the biggest shared resource your application contains Really slow queries will affect the speed of the entire database Scaling out your DB is not a simple task, so ensuring it isn’t bogged down is critical
Finding the stragglers First you need to identify the slow queries; your options: 1.Manually review each query in your code 2.Profile every request and review each executed query 3.Let MySQL do the work with its slow query log Let’s go with option #3
Slow query log When on, MySQL logs any query that runs longer than a threshold # of seconds The log contains the total query time, lock time, and rows examined/sent To enable, add to the MySQL config file: log-slow-queries=/var/log/mysql/slow-query.log long_query_time=0.1
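For reference, a fuller config fragment; note the `log-slow-queries` directive shown above is the older syntax and was deprecated in favour of `slow_query_log` in MySQL 5.6. The path and thresholds here are examples, tune them to your environment:

```ini
# my.cnf
[mysqld]
slow_query_log      = 1
slow_query_log_file = /var/log/mysql/slow-query.log
long_query_time     = 0.1   # seconds; fractional values need MySQL 5.1+
# Optionally also catch queries that cannot use any index:
log_queries_not_using_indexes = 1
```

Restart MySQL (or set the variables at runtime with `SET GLOBAL`) and the log starts filling immediately.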
pt-query-digest http://www.percona.com/doc/percona-toolkit/2.2/pt-query-digest.html Reads the slow query log and groups queries by their structure Outputs aggregated statistics on the whole log as well as for each query
Digesting All the Percona tools are Perl scripts, so execution is fairly straightforward Usage (on Unix/Linux): – pt-query-digest /var/log/mysql/slow-query.log > digest.out Options exist for specifying a date range, filtering queries, writing the results to a DB, etc.
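To make the log format concrete, here is a tiny fabricated sample of what pt-query-digest consumes, plus a "poor man's digest" that just pulls out the per-query times (the table names, timings, and path are all invented; the real tool aggregates far more: lock time, rows examined/sent, percentiles per query fingerprint):

```shell
# Write a miniature fake slow-query log.
cat > /tmp/slow-sample.log <<'EOF'
# Time: 2013-05-01T10:00:00
# Query_time: 2.417  Lock_time: 0.001 Rows_sent: 10  Rows_examined: 500000
SELECT * FROM rentals WHERE customer_id = 42;
# Query_time: 0.310  Lock_time: 0.000 Rows_sent: 1  Rows_examined: 120000
SELECT COUNT(*) FROM rentals;
EOF

# Extract just the query times; pt-query-digest would group these by
# query structure and rank them by total time.
grep -o 'Query_time: [0-9.]*' /tmp/slow-sample.log
```

On a real log you would run `pt-query-digest /var/log/mysql/slow-query.log > digest.out` as above and read the ranked summary at the top of `digest.out` first.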
Well, now what? Now you have a great starting point for finding bottlenecks in your DB Slow queries – run MySQL EXPLAIN – Refer to the tech talk by Dr. Aris Zakinthinos – Most likely it is missing indexes – De-normalization may be necessary – Pro tip: use your biggest data sets when running EXPLAIN
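A typical before/after for the missing-index case (a hypothetical illustration; the table, column, and row counts are invented, and the annotated output is abbreviated):

```sql
-- A slow query from the digest, before indexing:
EXPLAIN SELECT * FROM rentals WHERE customer_id = 42\G
-- type: ALL, key: NULL, rows: 500000   -> full table scan

ALTER TABLE rentals ADD INDEX idx_customer (customer_id);

-- Same query after adding the index:
EXPLAIN SELECT * FROM rentals WHERE customer_id = 42\G
-- type: ref, key: idx_customer, rows: 10   -> index lookup
```

The `rows` estimate dropping from the full table to a handful is the signal you are looking for; run this against your biggest data set, since the optimizer's choices change with table size.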
Regurgitation Running pt-query-digest once won’t solve all your database issues Tuning your query performance is a never- ending process – Teach developers how to use EXPLAIN and optimize queries – Weekly reports using pt-query-digest to give visibility into DB performance
The Devil is in the details Callgrind is a language-agnostic command-line tool that profiles the function calls in your application (through emulation) It generates callgrind files which can contain the entire call stack, and can be read to summarize what your app code is doing
XDebug profiling to the rescue Awesomely, XDebug writes callgrind files when profiling is enabled – This makes generating the grind files trivial in PHP Just add to your php.ini: xdebug.profiler_enable=1 xdebug.profiler_output_dir = "/tmp/xdebug"
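Profiling every request gets heavy; XDebug 2 can instead profile on demand. A hedged config sketch (the output dir is an example; the trigger/output-name settings are from the XDebug 2.x docs):

```ini
; php.ini
xdebug.profiler_enable         = 0
xdebug.profiler_enable_trigger = 1   ; profile only when XDEBUG_PROFILE is set
xdebug.profiler_output_dir     = "/tmp/xdebug"
xdebug.profiler_output_name    = "cachegrind.out.%R.%t"  ; request URI + timestamp
```

With the trigger enabled, append `?XDEBUG_PROFILE=1` to a URL (or set it as a cookie) to profile just that request, so you can leave the extension loaded on a dev box without every page paying the cost.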
Other Grind Visualizers KCacheGrind was the original visualizer, which was ported to Windows as WinCacheGrind Regardless, all three aggregate the function calls into total # of calls and total cost(s) WebGrind is limited to a summary table, whereas the other two can display the full call tree
Installing WebGrind Installation: 1.Prerequisite: Install XDebug 2.Download zip from WebGrind’s Github 3.Extract zip to folder accessible by webserver 4.Setup virtual host for WebGrind 5.WebGrind will read from the XDebug profiler output directory automatically 6.Open in browser and voila!
Are we screwed? No! We can fix it, otherwise this wouldn’t be a good demo =) We ran into these performance issues with the OWASP library late in our security release
Output encoding Output encoding is a very expensive task – Simply put, the OWASP library encodes any non-alphanumeric character – It makes no assumptions on the incoming data, so ends up doing a lot of encoding detection and normalization before any real output encoding
Digging into OWASP How did we make it more efficient? – First, we installed WebGrind and started looking at exactly what OWASP was doing – We identified the functions that were taking too long or being called too often, and then dove into the code – A little elbow grease and trial/error later, we had it optimized and running smoothly
Opcode caching PHP is an interpreted language, so with every request, the code is read from disk, parsed, and compiled into opcode before executing An opcode cache stores the compiled opcode so the first three steps are skipped Speeds up your application by 2-5 times! Options: APC, XCache, Zend Optimizer+
Setting up APC http://pecl.php.net/package/APC Linux: – Install w/ PECL: pecl install apc-3.1.9 – Compile the extension yourself Windows: Download pre-compiled binary from http://downloads.php.net/pierre/ Enable by adding the extension in php.ini Sit back and enjoy the performance boost
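A typical starting configuration (the values are examples, not recommendations; size the cache to fit your whole codebase, per the APC documentation):

```ini
; php.ini
extension    = apc.so   ; php_apc.dll on Windows
apc.enabled  = 1
apc.shm_size = 64M      ; shared memory for the opcode cache
apc.stat     = 1        ; set to 0 in prod to skip per-request file mtime checks
```

If the cache is too small it will churn (constantly evicting and recompiling), which can be slower than no cache at all, so check APC's hit rate after deploying.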
Keep on Grindin’ Use WebGrind to summarize what your app code is doing; find the functions bottlenecking your application Make it second nature to profile your application code with WebGrind But for a quick boost, start using an opcode cache now and never look back!
Developer tools Firebug + WebKit developer tools The most important aspect of these tools for performance is the network/timeline tab Shows you all resource requests and their timings including blocking, waiting, receiving, and more Displays when the DOMContent and Load events are fired
Yahoo’s YSlow and Google PageSpeed Browser extensions for Chrome + Firefox (sorry, IE) Analyzes a page request/response and offers best practices about how to improve performance Yahoo and Google know what they are talking about; follow the tools’ advice for the biggest wins in the shortest timeframe
A quick summary HTTP – Reduce # of requests – Parallelize downloads – Smaller cookies HTML – Reduce DOM nodes – Asynchronously load minor content CSS – Minify + concatenate – Load in the <head>
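The checklist above in markup form (a sketch; the asset file names are invented):

```html
<head>
  <!-- CSS: minified + concatenated into one request, loaded in the <head>
       so the page renders styled on first paint -->
  <link rel="stylesheet" href="/assets/all.min.css">
</head>
<body>
  <!-- page content ... -->

  <!-- JS: at the bottom of <body>, and/or async, so script download and
       execution do not block rendering -->
  <script src="/assets/all.min.js" async></script>
</body>
```

Concatenation attacks the "reduce # of requests" item, minification shrinks the payload, and the placement rules keep the browser painting while scripts load.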
Google Speed Tracer Chrome extension that shows a timeline of the internals of the UI thread including HTML parsing, script callbacks, painting, garbage collection, and many more Resolving issues found by Speed Tracer should be saved for after implementing all of YSlow and PageSpeed’s recommendations
The reasons you load test A more accurate portrayal of your site’s average performance, compared to request profiling and grind files Helps locate issues of scale that don’t appear when testing a single request
Before we dive into the tools First, you need to define a basic flow that you want to measure as your benchmark Ex. Login -> Newsfeed -> Catalog -> User Search -> Recognition -> Logout Should contain the most commonly accessed URLs Also nice to have a mix of GET and POST requests
Choosing a load tester Options abound – JMeter: GUI – Siege: CMD – CURL-loader: CMD – WebLOAD: GUI – Loadimpact.com - SaaS We will focus on JMeter since we use it =)
JMeter A GUI written in Java for load testing and benchmarking servers (HTTP, SOAP, JMS, etc.) Supports variables in requests, assertions on responses, cookies, and many aggregate reports Not the most intuitive UI until you get used to it
When to load test? Depends on your dev cycle, but once a week/sprint is a good starting point It’s more important to be consistent! If possible, should be part of an automated build/test suite
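For the automated-build case, JMeter also runs headless. A hypothetical CI step (the plan and log file names are examples; `-n` is non-GUI mode, `-t` the test plan, `-l` the results log):

```shell
# Run the benchmark flow without the GUI and keep a dated results log
# so week-over-week runs can be compared for regressions.
jmeter -n -t benchmark-flow.jmx -l results-$(date +%Y%m%d).jtl
```

Keeping the dated `.jtl` files around is what makes the "be consistent" advice pay off: you get a trend line, not a single data point.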
Getting ahead Metrics on multi-user response times before going to production are important Otherwise you have no idea how your app will scale to a real user load It’s probably a good idea to load test with more users (threads) than your average to know how you can handle spikes
Application Performance Management Tools that focus on monitoring + managing the performance, availability, and scalability of an application Some options: – New Relic - PHP/.NET/Ruby/Java/Python – Scout - Ruby – AppDynamics - Java/.NET – dynaTrace - Java/.NET
The benefits Too many to list – Real-time dashboards – Application response times – End-user monitoring (browser times) – Error reporting – Alerts for server/performance issues – Server monitoring – And more!
New Relic SaaS APM platform with a slick web UI Free lite version has real-time monitoring, server monitoring, and error detection, but only retains data for 24 hours Pro version is $150/server/month, but has many additional features, including full response traces
Real metrics = real insight APM tools are the culmination of all the profiling tools + techniques we’ve seen There is no substitute for real, user-driven performance numbers Review the bottlenecks the APM identifies, dig deeper using all the other tools we learned about, then watch as your app gets faster and more responsive
Recap 1.Profiling your application I.Profiling as an overall health check II.Digesting the slow query logs to find bottlenecks III.Grinding your code to find the hidden details IV.YSlow/PageSpeed and doing what they say 2.Maintaining performance I.Use JMeter to load test for scalability II.Monitoring prod to accumulate real metrics
Putting it all together Teach developers the importance of profiling their code; integrate into culture Performance must be top of mind/visible Profile and load test critical sections before release; confidence in your code Run an APM in production; real, actionable data on bottlenecks Never let up: the war is never over