UPortal Performance & Memory Issues Scott Battaglia Rutgers, the State University of New Jersey.


1 uPortal Performance & Memory Issues Scott Battaglia scott_battaglia@rutgers.edu Rutgers, the State University of New Jersey

2 Description of Problem Amount of memory consumed by uPortal grows consistently Continues to consume memory until there is no memory left Application stops working properly and hangs Consistent with definition of a memory leak

3 Background Launched myRutgers on uPortal 2.3 Issue was not seen in our QA Seeing issue in production since November 2004

4 Background Also seen in production by: Yale University University of Louisiana at Lafayette University of California at Irvine Cornell University

5 Temporary Workaround Monitor memory usage of uPortal When free memory drops below 5%, bounce (restart) the JVM.
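A minimal sketch of the check behind this workaround, assuming the 5% figure refers to free heap. The class `HeapWatch` and its method names are hypothetical and not part of uPortal; a real monitor would also have to trigger the restart, which is left out here.

```java
// Hypothetical helper, not part of uPortal: reports how much heap is still available.
final class HeapWatch {
    // Threshold from the slide: act when less than 5% of the heap is free.
    static final double RESTART_THRESHOLD = 0.05;

    /** Fraction of the maximum heap that is still available (0.0 to 1.0). */
    static double freeFraction(long maxBytes, long totalBytes, long freeBytes) {
        long used = totalBytes - freeBytes; // bytes currently occupied by objects
        return (double) (maxBytes - used) / maxBytes;
    }

    /** Same calculation, using the values of the running JVM. */
    static double currentFreeFraction() {
        Runtime rt = Runtime.getRuntime();
        return freeFraction(rt.maxMemory(), rt.totalMemory(), rt.freeMemory());
    }

    static boolean shouldRestart(double freeFraction) {
        return freeFraction < RESTART_THRESHOLD;
    }

    public static void main(String[] args) {
        double free = currentFreeFraction();
        System.out.printf("free heap: %.1f%%, restart: %b%n", free * 100, shouldRestart(free));
    }
}
```

A reading taken just before a garbage collection can look worse than it is, which is one reason a fixed threshold may be too aggressive.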

6 Issues with Workaround May be too aggressive In some cases, the JVM may be able to garbage collect Causes users on that JVM to lose their session If we miss the window of opportunity to restart, it can take down Apache as well

7 Issues with Workaround Ultimately, does nothing to resolve the memory issue. It just makes the problem barely livable

8 History of Fixes Removed caching of IPersons from PersonDirectory CError and CSecureInfo now pass events to wrapped channels. Restrict access to ChannelFactory’s channel cache, synchronized instantiateChannel method. Guest sessions created on time out AbstractMultithreadedChannels were not cleaning out their channel state maps (2 of them).

9 But…. 3 Months later, issue still exists. Previous steps solved memory leaks but still more exist. The search continues…

10 What’s Happening Today Renewed effort to search for memory leaks Initial Steps taken: Retooling of Load Tests Production Snapshots Incremental Updates Re-affirming that loadtest system matches production system

11 Retooling of Load Tests Attempt to mimic more closely what a user does in production. More custom layouts Fewer people logging out Hitting more popular channels more aggressively

12 Retooling of Load Tests Attempt to accomplish same throughput Determine average user session length Determine rate at which users access system

13 Retooling of Load Tests Bought test system with same specs/setup as production systems Ensure database optimizations are the same Ensure uPortal configuration is the same (i.e. StatsRecorder)

14 Production Snapshots Only seeing issue in production Need to capture production snapshots JVM Heap Size initially set at 2 GB

15 Production Snapshots Lowered JVM Heap Size to 128 MB on one machine Allows us to compare snapshots When free memory reaches 10%, take the machine out of load balancing rotation Garbage Collect

16 Production Snapshots Capture snapshot Wait past session timeout Currently set at 15 minutes Garbage Collect again Take new snapshot Analyze Snapshot

17 Production Snapshots What do they tell us? They help us determine what objects are still in memory They tell us how much memory those objects are using They tell us how much memory the items they reference are using

18 Understanding the Snapshots Use YourKit Java Profiler to capture memory snapshots YourKit consists of two parts: Component that runs on server Local application to open memory snapshots

19 Understanding the Snapshots YourKit: Reports incoming and outgoing references Gives totals for objects of each type and how much memory they consume Allows us to compare snapshots, showing the deltas of each object type. uPortal community has about 20 licenses for YourKit
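The snapshot comparison can be pictured as a per-class subtraction of object counts. The sketch below is not YourKit's API; `SnapshotDiff` is a hypothetical class that takes two maps of class name to object count and returns the non-zero deltas. The counts in `main` are invented for illustration only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of "deltas of each object type"; not YourKit code.
final class SnapshotDiff {
    /** Returns count(after) - count(before) per class, omitting classes whose count did not change. */
    static Map<String, Long> deltas(Map<String, Long> before, Map<String, Long> after) {
        Map<String, Long> result = new TreeMap<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            long d = e.getValue() - before.getOrDefault(e.getKey(), 0L);
            if (d != 0) {
                result.put(e.getKey(), d);
            }
        }
        // Classes that disappeared entirely between the two snapshots.
        for (Map.Entry<String, Long> e : before.entrySet()) {
            if (!after.containsKey(e.getKey()) && e.getValue() != 0) {
                result.put(e.getKey(), -e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Invented counts: first snapshot with users, second after session timeout and GC.
        Map<String, Long> withUsers = new HashMap<>();
        withUsers.put("UserInstance", 40L);
        withUsers.put("XRTreeFrag", 100L);
        Map<String, Long> noUsers = new HashMap<>();
        noUsers.put("XRTreeFrag", 100L);
        System.out.println(deltas(withUsers, noUsers));
    }
}
```

In the procedure described on the earlier slides, a class whose count does not fall after sessions have timed out and a garbage collection has run is the one worth tracing back to its GC root.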

20 Understanding the Snapshots [Screenshot: YourKit class list with columns Name, Objects, Shallow Size, Retained Size]

21 Understanding the Snapshots Trace the path to the root of the Garbage Collector Option of seeing first path or multiple paths In screenshot, we see first five

22 Understanding the Snapshots Example of object from “Retained Size” Only reason this object still exists is because XRTreeFrag has not been GCed.

23 Understanding the Snapshots Comparison of two snapshots (users vs. no users) See that XRTreeFrag retains number of objects

24 Understanding the Snapshots Also comparison of (users vs. no users) See that UserInstance gets garbage collected, as does ChannelStaticData, etc.

25 Incremental Updates In order to determine the impact of changes to the uPortal framework, we’ve adopted an incremental update approach. We apply one “fix” at a time, and monitor its impact.

26 Incremental Updates Currently in production… Threadpool switch from homegrown to Backport Concurrent Finalizer in UBC_Webmail In the queue… Update to AuthorizationImpl

27 What’s Happening Today Recently, a flurry of activity on the JASIG-DEV list about memory issues. Backport Concurrent Threadpool AuthorizationImpl Finalizers in UBC_Webmail

28 What’s Happening Today Backport Concurrent Thread Library Issues with current threadpool Potential for deadlock or infinite loop Potential for cleanup to fail in thread workers UnboundedThreadpool that extends BoundedThreadpool
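As an illustration of what a library-provided pool offers over a homegrown one, here is a minimal sketch using `java.util.concurrent`, whose API the Backport Concurrent library mirrors for older JVMs. This is not the patch mentioned in the talk; `RenderPoolSketch`, the pool size, and the queue capacity are assumptions chosen for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; not uPortal code.
final class RenderPoolSketch {
    /** Fixed number of workers and a bounded queue; when the queue is full the caller runs the task. */
    static ThreadPoolExecutor newBoundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    /** Submits the given number of trivial tasks and returns how many completed. */
    static int runAll(int tasks) {
        ThreadPoolExecutor pool = newBoundedPool(4, 16);
        AtomicInteger done = new AtomicInteger();
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(done::incrementAndGet);
            }
        } finally {
            pool.shutdown(); // workers exit once the queue has drained
        }
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for pool", e);
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println("completed tasks: " + runAll(100));
    }
}
```

The bounded queue with a caller-runs policy is one way to apply back-pressure instead of letting work pile up without limit; other rejection policies may suit a portal better.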

29 What’s Happening Today Backport Concurrent Thread Library (cont) Action Item Aaron wrote patch against HEAD to replace thread library Rutgers manually applied patch to 2.4.1 and placed into production. Result: Undetermined: Most students were on Spring Break Preliminary results indicate may offer performance benefit rather than memory leak fix

30 What’s Happening Today AuthorizationImpl Current Issues Retaining references to principals No explicit removal of principal from cache Copying of map on each newPrincipal call that results in a new principal

31 What’s Happening Today AuthorizationImpl Action Item Rutgers volunteered to provide fix for HEAD Fix consists of replacing current newPrincipal method and replacing HashMap with a cache Patch is scheduled to be loadtested and placed into production Patch is scheduled to be committed to uPortal HEAD on successful test and deployment
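To show why swapping a plain HashMap for a cache addresses the retained references, here is a minimal sketch of a size-bounded, least-recently-used map built on `LinkedHashMap`. It is not the Rutgers patch; `PrincipalCacheSketch` and the capacity are assumptions for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example class; not the actual AuthorizationImpl change.
final class PrincipalCacheSketch {
    /** A thread-safe map that evicts the least recently used entry once maxEntries is exceeded. */
    static <K, V> Map<K, V> newLruCache(final int maxEntries) {
        // accessOrder = true makes iteration order follow recency of access.
        return Collections.synchronizedMap(new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        });
    }

    public static void main(String[] args) {
        Map<String, String> cache = newLruCache(2);
        cache.put("alice", "principal-a");
        cache.put("bob", "principal-b");
        cache.get("alice");               // alice is now the most recently used
        cache.put("carol", "principal-c"); // evicts bob, the least recently used
        System.out.println(cache.keySet());
    }
}
```

Unlike an unbounded map, entries here are dropped without an explicit removal call, so principals for users who have left cannot accumulate past the limit.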

32 What’s Happening Today AuthorizationImpl Consequences of Changes Introduced a CacheFactory Not specific to any one part of uPortal CacheFactory is interface (plug your own in!) Default CacheFactory using WhirlyCache Allows for declaring cache settings and policy in XML Allows for fine-grained caching strategies for each part of uPortal
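The pluggable-factory idea can be sketched as follows. The slides do not show the real interface, so the name `CacheFactorySketch`, the `getCache` method, and the in-memory default below are all assumptions; the actual uPortal CacheFactory and its WhirlyCache-backed default may look quite different.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface: callers ask for a cache by name and do not care how it is implemented.
interface CacheFactorySketch {
    Map<Object, Object> getCache(String name);
}

// Hypothetical default: one unbounded concurrent map per cache name.
// A real implementation would apply per-cache size and expiry policy here.
final class InMemoryCacheFactory implements CacheFactorySketch {
    private final ConcurrentHashMap<String, Map<Object, Object>> caches = new ConcurrentHashMap<>();

    @Override
    public Map<Object, Object> getCache(String name) {
        // The same name always yields the same cache instance.
        return caches.computeIfAbsent(name, n -> new ConcurrentHashMap<>());
    }
}

final class CacheFactoryDemo {
    public static void main(String[] args) {
        CacheFactorySketch factory = new InMemoryCacheFactory();
        factory.getCache("principals").put("alice", "principal-a");
        System.out.println(factory.getCache("principals").get("alice"));
    }
}
```

Because callers depend only on the interface, a site could plug in a different implementation without touching the code that uses the caches.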

33 What’s Happening Today UBC_Webmail Issue Finalizers are not properly cleaning up Action Item Rutgers has volunteered to refactor Finalizers

34 Continuing the Search… Rutgers, and other members of the uPortal community continue to search for the answer to the memory leaks

35 What can we do to help? Finalizers should be a last resort If a viable open source project exists that fills the requirements, consider using that Be aware of proper caching (where it's needed vs. where it's not needed, weak & soft references, etc.) Avoid circular references wherever possible
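One common alternative to relying on a finalizer is an explicit `close()` that the caller invokes deterministically. The sketch below assumes a hypothetical `MailConnectionSketch` resource; it is not UBC_Webmail code, and the open-connection counter exists only to make the effect visible.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical resource; stands in for anything that must be released promptly.
final class MailConnectionSketch implements AutoCloseable {
    // Counts connections that have been opened but not yet closed.
    static final AtomicInteger OPEN = new AtomicInteger();
    private boolean closed;

    MailConnectionSketch() {
        OPEN.incrementAndGet();
    }

    /** Releases the resource; safe to call more than once. */
    @Override
    public synchronized void close() {
        if (!closed) {
            closed = true;
            OPEN.decrementAndGet();
        }
    }

    /** Uses a connection inside try-with-resources and reports how many remain open afterwards. */
    static int openAfterUse() {
        try (MailConnectionSketch connection = new MailConnectionSketch()) {
            // work with the connection would go here
        }
        return OPEN.get();
    }

    public static void main(String[] args) {
        System.out.println("open connections after use: " + openAfterUse());
    }
}
```

With this pattern the release happens when the block exits rather than whenever the garbage collector gets around to running a finalizer.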

36 The End (finally!) Any questions, comments, concerns?

