Presentation is loading. Please wait.

Presentation is loading. Please wait.

© 2013 IBM Corporation Practical Performance Understand and improve the performance of your application San Hong Li – Technical lead of JTC Shanghai 23.

Similar presentations


Presentation on theme: "© 2013 IBM Corporation Practical Performance Understand and improve the performance of your application San Hong Li – Technical lead of JTC Shanghai 23."— Presentation transcript:

1 © 2013 IBM Corporation Practical Performance Understand and improve the performance of your application San Hong Li – Technical lead of JTC Shanghai 23 st July 2013

2 © 2013 IBM Corporation Important Disclaimers THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILST EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. ALL PERFORMANCE DATA INCLUDED IN THIS PRESENTATION HAVE BEEN GATHERED IN A CONTROLLED ENVIRONMENT. YOUR OWN TEST RESULTS MAY VARY BASED ON HARDWARE, SOFTWARE OR INFRASTRUCTURE DIFFERENCES. ALL DATA INCLUDED IN THIS PRESENTATION ARE MEANT TO BE USED ONLY AS A GUIDE. IN ADDITION, THE INFORMATION CONTAINED IN THIS PRESENTATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM, WITHOUT NOTICE. IBM AND ITS AFFILIATED COMPANIES SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF: - CREATING ANY WARRANT OR REPRESENTATION FROM IBM, ITS AFFILIATED COMPANIES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS 2

3 © 2013 IBM Corporation Introduction to the speaker ■ 8 years working in Java. ■ Recent work focus: ■ Java Virtual Machine improvements for ‘cloud’ ■ Multi-tenancy technology ■ JVM(J9) development ■ Past lives ■ Java security development (Expeditor, kernel of Lotus notes) ■ Network programming ■ My contact information: –mail: lisanh@cn.ibm.com@cn.ibm.com –weibo: sanhong_li 3

4 © 2013 IBM Corporation Goals of the Talk ■ Introduce a simple general methodology for performance analysis ■ Discuss the common performance bottlenecks ■ Show how to analyze a simple application 4

5 © 2013 IBM Corporation Agenda 1. Approaches to performance 2. Layers of the application 3. Identifying and resolving performance issues 5

6 © 2013 IBM Corporation Approaches to performance Outside in approach – Start from where performance can be measured – Work along the activity path – Ideal for identified performance problems Layered approach – “Bottom up” or “Top down” – Analyze and eliminate layers of the application – Simplify the problem as you go – Ideal for application health check A hybrid of both approaches can often be useful 6

7 © 2013 IBM Corporation Performance baseline Important to have a repeatable and representative performance test Measure baseline performance – Internal measurements affect the performance of what your measuring – External measurements have less impact on system performance Where possible, measure multiple times – Variation will occur between test runs Where possible, ensure consistency – Not just the load test that's run – State of machine and network can have interesting effects 7

8 © 2013 IBM Corporation A Layered Approach ■ Three layers of a deployment: – Infrastructure: Machine hardware and Operating System – Java Runtime: Garbage collection – Java Application: Java application code ■ Each can suffer from resource constraints, typically: – Memory – CPU – Synchronization – I/O 8

9 © 2013 IBM Corporation Infrastructure Typical resource constraints: – Memory: insufficient physical memory results in paging/swapping – CPU: insufficient CPU time limits throughput of the application – I/O: insufficient I/O limits throughput of the application – Synchronization driven by Java runtime/Java application Easy to diagnose Easy to resolve (relatively) Note that each can also be caused by deficiencies higher up the stack! 9

10 © 2013 IBM Corporation Infrastructure: Memory usage ■ Infrastructure uses memory for: – Backing the process data: OS runtime, Java runtime, Java application – Caching of IO: filesystem and network buffers ■ Lack of physical memory causes: – Reduction and removal of IO caching – Paging/swapping of process memory to disk ■ Paging/swapping is costly for a Java process – Particularly affects Garbage Collection performance Paging usually occurs on Least Recently Used basis All of Java heap is traversed during mark and sweep phases Least Recently Used does not work well for the Java heap 10

11 © 2013 IBM Corporation Infrastructure: CPU usage Insufficient CPU time availability will reduce performance Can occur periodically: – Cron Jobs running batch applications – Database backups Or during periods of high load: – System becomes CPU bound, limiting performance 11

12 © 2013 IBM Corporation Detecting infrastructure issues Detect using Operating System level tools Memory on Windows: – Paging: using “perfmon” with “Process” counter for “Page Faults/sec” – File Cache: using “perfmon” with “Memory” counter for “System Cache Resident Bytes” CPU on Windows: – Per process: using “perfmon” with “Process” counter for “% Processor Time” – Per machine: using “perfmon” with “Processor” counter for “% Processor Time” IO on Windows: – Network: using “perfmon” with “Network Interface” counter for “Output Queue Length” – Disk: using “perfmon” with “Physical Disk” counter for “Current Disk Queue Length” 12

13 © 2013 IBM Corporation Swapping/Paging in perfmon 13

14 © 2013 IBM Corporation Resolving infrastructure issues Add more physical resources to the process – Assign more to the: Machine, Guest OS, LPAR, Zone, etc Reduce the physical resource requirements – Reduce the application footprint – Reduce the application CPU usage – Reduce the IO 14

15 © 2013 IBM Corporation Page response performance benchmark: Baseline Page Performance PlantsbyWebSphere 15

16 © 2013 IBM Corporation Page response performance benchmark: Paging removed Page Performance PlantsbyWebSphere 16

17 © 2013 IBM Corporation Effect on page URL performance –PlantsByWebSphere1.7% –servlet_ShoppingServlet{2}22.8% –servlet_ShoppingServlet0.3% –Shopping{1}0.4% –Shopping{4}21.4% –Shopping_1_117.9% –Shopping_2_22.8% –Shopping_2_316.7% –Shopping_2_424.1% –Shopping_2_513.8% ■ Improvement in page performance of 0.3% to 24.1% –Without outliers: 0.4% to 22.8% ■ Biggest gains are on most performant pages. Total gain is only ~4% ■ Small effect on overall page performance 17

18 © 2013 IBM Corporation Garbage Collection Pause Times 18

19 © 2013 IBM Corporation Garbage Collection Pause Times 19

20 © 2013 IBM Corporation Effect on Garbage Collection Pause Times ■ Reduction in: –Maximum pause time:38% –Average pause time:13% –Time spent in GC11% ■ Large effect on GC performance, particularly pause times 20

21 © 2013 IBM Corporation Swapping/Paging and Java Applications ■ Swapping/Paging usually occurs on Least Recently Used basis Live but non-recently used data is written out to disk to free RAM for used data ■ Java however “uses” non-recently used data during GC In order to determine if the data is “live” and still required by the application. Any Java heap memory that has been swapped out, must be swapped in during GC 21

22 © 2013 IBM Corporation Java runtime Typical resource constraints: – Memory: insufficient Java heap results in OutOfMemory or high GC overhead – CPU garbage collection overhead, or driven by Java application – Synchronization driven by Java application – IO driven by Java application Easy to diagnose Easy to resolve (relatively) 22

23 © 2013 IBM Corporation Java Process Memory Layout ■ An Operating System process like any other application: – Subject to OS and architecture restrictions – 32bit architecture has an addressable range of: 2^32 which is 0x00000000 – 0xFFFFFFFF which is 4GB ■ Not all addressable space is available to the application – The operating system needs memory for: The kernel and the runtime support libraries (eg C runtime) – Varies according to Operating System How much memory is needed and where memory is located 0 GB4 GB 0x00xFFFFFFFF 2 GB 0x80000000 1 GB3 GB 0x400000000xC0000000 23

24 © 2013 IBM Corporation Native Heap available to Application ■ On Windows ■ On AIX (1.4.2 with small heaps) 0 GB4 GB 0x0 2 GB 0x80000000 1 GB3 GB 0x400000000xC0000000 Operating System Space Libraries Java Heap 0xFFFFFFFF 0 GB4 GB 0x00xFFFFFFFF 2 GB 0x80000000 1 GB3 GB 0x400000000xC0000000 KernelLibraries Java Heap VM Resources Native Heap 24

25 © 2013 IBM Corporation What does Java use native memory for? ■ Java heap –Allocated as a contiguous chunk –Occupies the -Xmx value from startup ■ Just In Time (JIT) Compiler –Runtime data –Storing executable code ■ Virtual Machine Resources: –Debug Engine Trace Buffers Dump Buffers –GC Infrastructure ■ JNI Allocations –Native malloc/new in JNI functions –Used by the interface itself ■ Resources to underpin Java Objects –Classes and ClassLoaders –Threads –Direct java.nio.ByteBuffers –Sockets Kernel Space User Space Java Heap VM Native Allocations Java Libraries JIT data 25

26 © 2013 IBM Corporation Resources to underpin Application ■ Other memory usage can be indirectly driven by application usage and garbage collection e.g. An instance of java.lang.Thread has underlying native resources: –Native thread structure (eg. pthread_t) –Associated native stack (size is platform dependent) –Java stack structure Java HeapNative Heap 26

27 © 2013 IBM Corporation 32 bit versus 64 bit JVM ■ 32 bit Java processes have maximum possible heap size – Varies according to the OS and platform used – Determined by the process memory layout – Maximum possible Java heap size on AIX is 3.25 GB, advisable 2.5 GB ■ 64 bit processes do not have this limit – Limit exists, but is so large it can be effectively ignored – Addressability usually between 2^44 and 2^64 – Which is 16+ TeraBytes ■ Moving to 64bit removes the Java heap size limit ■ However, ability to use more memory is not “free” – 64bit applications perform slower More data has to be manipulated Cache performance is reduced – 64bit applications require more memory Java Object references are larger Internal pointers are larger ■ Major improvements to this in Java 6.0 due to compressed pointers ■ 64 bit JVM recommended only if a Java heap much greater than 2GB is required for higher performance or application uses computationally intensive algorithms for statistics, encryption etc which can benefit from high precision support 27

28 © 2013 IBM Corporation Java runtime problems Insufficient Java heap memory leads to: – OutOfMemoryError due to Java heap exhaustion – Garbage Collection running excessively, increasing CPU and affecting performance Insufficient non-Java (“native”) heap leads to: – OutOfMemoryError due to process address space exhaustion – Driver for Java heap garbage collection (DirectByteBuffer cleaners) 28

29 © 2013 IBM Corporation Detecting Java runtime problems Log and trace analysis: – “Native” heap: OS level logs (ps, svmon, perfmon) – Java heap: verbose:gc output Post processed using IBM Monitoring and Diagnostic Tools for Java - Garbage Collection and Memory Visualizer(GCMV) Live monitoring: – “Native” heap: IBM Monitoring and Diagnostic Tools for Java - Health Center – Java heap: IBM Monitoring and Diagnostic Tools for Java - Health Center Visual VM, Mission Control 29

30 © 2013 IBM Corporation Too Frequent Garbage Collection 30

31 © 2013 IBM Corporation Choosing right heap size : 43 ■ GC will resize heap to keep occupancy between 40% - 70% –Heap occupancy >70% causes frequent GCs  triggers grow == slow –Heap occupancy below 40% means infrequent GC cycles, but cycles longer than they need to be  triggers contraction == slow ■ The maximum heap size setting should therefore be 43% larger than the maximum occupancy of the application –Maximum occupancy + 43% means occupancy at 70% of total heap heapmax = occupancy + (0.3 x heapmax) (0.7 x heapmax ) = occupancy heapmax = occupancy / 0.7 heapmax = occupancy * 1.43 –e.g. For 70MB occupancy specify a 100 MB max heap size = 70 MB + (0.43 x 70 MB) = 70 MB + 30 MB = 100 MB 31

32 © 2013 IBM Corporation Resolving Java runtime problems Add more resources to the Java runtime – Java heap: Increase Java heap size – Native heap: Move to 64bit or reduce Java heap size Reduce the memory requirements – Reduce the Java application footprint 32

33 © 2013 IBM Corporation Increased Java heap size 33

34 © 2013 IBM Corporation Effect on Garbage Collection Pause Times ■ Reduction in: –Time spent in GC59% ■ However this is only 4.84% of total time 34

35 © 2013 IBM Corporation Page response performance benchmark: Baseline Page Performance PlantsbyWebSphere 35

36 © 2013 IBM Corporation Page response performance benchmark: Paging removed Page Performance PlantsbyWebSphere 36

37 © 2013 IBM Corporation Page response performance benchmark: Java Heap size increased Page Performance PlantsbyWebSphere 37

38 © 2013 IBM Corporation Page performance improvements –PlantsByWebSphere0.4% –servlet_ShoppingServlet0.0% –servlet_ShoppingServlet{2}33.2% –Shopping{1}-0.6% –Shopping{4}4.5% –Shopping_1_12.9% –Shopping_2_20.1% –Shopping_2_32.1% –Shopping_2_414.2% –Shopping_2_58.2% ■ Improvement in page performance of -0.6% to 33.2% Without outliers: 0.0% to 14.2% ■ Total gain is only ~4% Relatively small effect on overall page performance 38

39 © 2013 IBM Corporation Java application Typical resource constraints: – Memory: insufficient caching affects application throughput and responsiveness – CPU: insufficient threading causes limits on scalability – Synchronsation: synchronized resources limits scalability and throughput of the application – I/O: blocking on I/O limits throughput and responsiveness Hard to diagnose Can be expensive (or impossible!) to resolve 39

40 © 2013 IBM Corporation Java application CPU usage High CPU usage by Java methods highlight areas of potential optimization – Code is being invoked more than it needs to be Easily done with event driven models – An algorithm is not the most efficient Easily done if performance is not the focus at development time Fixing CPU bound applications requires knowledge of what code is being run – Identify methods which are suitable for optimization Optimizing methods which the application doesn’t spend time in is a waste of your time – Identify methods where more time is being spent that you expect “Why is so much of time being spent in this trivial method?” 40

41 © 2013 IBM Corporation Java application synchronization Throughput does not increase linearly with load At limit of throughput the CPU is still low – Inability to scale – Not all CPU can be utilized – Limit on throughput and responsiveness Bottleneck where threads need to synchronize with each other for application correctness – Caused by large numbers of threads requiring synchronized resource at the same time – Caused by long hold time by thread that owns resource – Or a mixture of both 41

42 © 2013 IBM Corporation Health Center: application method CPU usage http://mcsholding.com/DetailsPage.aspx?Page_Id=42 42

43 © 2013 IBM Corporation Health Center: application synchronization 43

44 © 2013 IBM Corporation ShoppingServlet.deliberateSlowMethod() 44

45 © 2013 IBM Corporation Page response performance benchmark: Baseline Page Performance PlantsbyWebSphere 45

46 © 2013 IBM Corporation Page response performance benchmark: Paging removed Page Performance PlantsbyWebSphere 46

47 © 2013 IBM Corporation Page response performance benchmark: Java Heap size increased Page Performance PlantsbyWebSphere 47

48 © 2013 IBM Corporation Page response performance benchmark: Java Heap size increased Page Performance PlantsbyWebSphere 48

49 © 2013 IBM Corporation Page URL performance improvements servlet_ShoppingServlet80.8%5x improvement Shopping{1}94.3%20x improvement ■ Improvement in page performance of 80 and 95% –5x and 20x improvement for affected pages ■ Total gain of 48%! 49

50 © 2013 IBM Corporation Summary Importance of: – Repeatable benchmark – Incremental measurements as changes are made – Repeated testing and verification Tools are available to help you see what's going on: – Garbage Collection and Memory Visualizer (all vendors) – HealthCenter (IBM only) – Other profilers (eg. YourKit) (all vendors) – Memory Analyzer (all vendors) 50

51 © 2013 IBM Corporation Page URL response performance benchmark: Summary 51

52 © 2013 IBM Corporation Summary ■ Infrastructure resources affect performance –Paging and Garbage Collection much less than you might expect –However, beware of CPU “starvation” from other processes! ■ Vast majority of performance gains are in the application! 52

53 © 2013 IBM Corporation References Get Products and Technologies: – IBM Monitoring and Diagnostic Tools for Java: https://www.ibm.com/developerworks/java/jdk/tools/ Learn: – Health Center InfoCenter: http://publib.boulder.ibm.com/infocenter/hctool/v1r0/index.jsp Discuss: – IBM on Troubleshooting Java Applications Blog: https://www.ibm.com/developerworks/mydeveloperworks/blogs/troubleshooti ngjava/ – Health Center Forum: http://www.ibm.com/developerworks/forums/forum.jspa?forumID=1461 – IBM Java Runtimes and SDKs Forum: http://www.ibm.com/developerworks/forums/forum.jspa?forumID=367&start=0 53

54 © 2013 IBM Corporation Copyright and Trademarks 54

55 © 2013 IBM Corporation …any final questions? 55


Download ppt "© 2013 IBM Corporation Practical Performance Understand and improve the performance of your application San Hong Li – Technical lead of JTC Shanghai 23."

Similar presentations


Ads by Google