CACHING: tuning your golden hammer By Ian Simpson from Mashery: an Intel Company.


What’s being covered (or not)
- Caching within the JVM
- Not focusing on external caches like memcached
- Not focusing on sharding
- Both are important, but we only have an hour

Motivation for this talk
- Overall, caching is great!
- More performance!
- Greater scalability!
- When do we have too much of a good thing?
- How do we know if/when we’re doing it wrong?

When is more not better?
- Ultimately caching isn’t free (but it is usually cheap)
- Some caches are not effective (low cache hit ratio)
- Some caches hold on to too much data
- Some caches cause resource contention
- If used incorrectly, caches can destroy performance

Memory: abundant but not free
- When we cache, we store data in memory
- Generally a replacement for fetching data from (slow) persistent storage
- Servers have lots of memory!
- The JVM only handles so much memory gracefully*

* This varies based on the garbage collection strategy and JVM vendor, which is beyond the scope of this talk

JVM heap
- Split between young, tenured, and permanent generations
- Objects start in the young generation
- Most objects in young gen are short lived and GC’d
- Objects in young gen that survive GCs are promoted to tenured
- Objects in tenured are long lived (e.g. cached data)

Heap o’ trouble
- Tenured grows as more objects survive
- GCs cause “stop the world” events, or pauses
- GCs on tenured get more costly the larger it gets
- Long GC pauses can cause performance problems
- A full heap can cause thrashing (lots of GCs)

Cache structure and data retention
- Usually caches are maps of keys to values
- Under the covers, often just a Map
- Anything you put in a Map is strongly referenced
- Any reachable strong reference can’t be GC’d
- Caches need to be reachable to be useful
- Caches don’t automagically clean up stale data
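To make the strong-reference point concrete, here is a minimal sketch of the kind of Map-backed cache described above (NaiveCache is a hypothetical name, not from the talk). Everything put into it stays strongly reachable, so the GC can never reclaim a value until the cache itself removes it:

```java
import java.util.HashMap;
import java.util.Map;

// A naive cache: just a wrapped Map. Every value it holds is strongly
// referenced, so nothing inside it can be garbage collected until it
// is explicitly removed -- the cache never cleans itself up.
public class NaiveCache {
    private final Map<String, Object> store = new HashMap<>();

    public void put(String key, Object value) {
        store.put(key, value); // value is now strongly reachable indefinitely
    }

    public Object get(String key) {
        return store.get(key); // stale or not, it is still here
    }

    public int size() {
        return store.size();
    }
}
```

This is exactly the shape that causes the heap trouble above: it grows without bound and never invalidates anything.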

How do we solve heap issues?
- Don’t cache more than is necessary
- Cache hit ratio should be high (> 80%)
- Understand how and when data is used
- Estimate how large your data is or will be
- Use a background thread to clean stale data
- Guava has support for this
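Guava’s CacheBuilder provides bounds and expiry out of the box (maximumSize, expireAfterWrite). As a stdlib-only sketch of the same idea, a LinkedHashMap in access order gives a size-bounded LRU cache (BoundedCache is a hypothetical name):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A size-bounded LRU cache built on LinkedHashMap. Capping the entry
// count keeps cached objects from piling up in the tenured generation.
public class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true -> LRU ordering
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put; returning true evicts the
        // least-recently-used entry.
        return size() > maxEntries;
    }
}
```

A TTL on top of this (checking a stored timestamp on read, with a background sweeper thread) is what the slide’s “clean stale data” bullet refers to.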

Can I just not use the heap?
- Yes! You can do off-heap caching!
- No! It’s not particularly straightforward!
- On a 32-bit JVM: 4 GB (or 2 GB) minus max heap is available
- On a 64-bit JVM: all available memory minus max heap
- The space is part of the Java process
- Not managed by the garbage collector

What’s so difficult about off heap?
- You have to manage your own memory
- Allocated as a block via java.nio.ByteBuffer
- Can also be allocated via sun.misc.Unsafe
- You need to know where things are in the byte array
- Serialization into and deserialization from byte[]
- Cost to allocate a block of memory
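A minimal sketch of the ByteBuffer approach above, assuming a single fixed-size slot (OffHeapSlot is a hypothetical name): the direct buffer lives outside the Java heap, so the GC never scans its contents, and we length-prefix the serialized bytes so we know how much to read back:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// One off-heap storage slot. allocateDirect reserves native memory
// outside the heap; the cost is that we must serialize values to
// byte[] and track offsets ourselves.
public class OffHeapSlot {
    private final ByteBuffer block = ByteBuffer.allocateDirect(1024);

    // Serialize the value and store it with a length prefix.
    // Supports one write followed by one read (it is only a sketch).
    public void write(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        block.clear();                // position = 0, limit = capacity
        block.putInt(bytes.length);   // length prefix
        block.put(bytes);
    }

    public String read() {
        block.flip();                 // limit = bytes written, position = 0
        int len = block.getInt();
        byte[] bytes = new byte[len];
        block.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

A real off-heap cache multiplies this by thousands of entries, which is where the bookkeeping (free lists, offsets, fragmentation) gets hard.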

Blocking vs non-blocking cache

- A cache miss means data has to be loaded
- Loading data means blocking the threads requesting it
- Generally unavoidable unless returning nothing is OK
- A cache hit on stale data means reloading
- Blocking on reloading data can be optional
- Return stale data, reload asynchronously
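A blocking load-on-miss can be sketched with the stdlib alone: ConcurrentHashMap.computeIfAbsent runs the loader once per key and blocks other threads asking for the same key until the load finishes (BlockingCache and the loader are illustrative names):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Blocking cache sketch: on a miss, the requesting thread loads the
// value itself. computeIfAbsent guarantees the loader runs at most
// once per key; concurrent readers of that key wait for the result.
public class BlockingCache<K, V> {
    private final ConcurrentHashMap<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    public BlockingCache(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        return store.computeIfAbsent(key, loader);
    }
}
```

This per-key blocking is exactly the latency risk the next slide describes: one slow load on a popular key stalls every thread that wants it.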

Blocking cache: pros and cons
- Pros
  - Stale data is predictably stale
  - Less resource intensive than loading asynchronously
- Cons
  - Can have huge latency on popular keys across threads
  - Read mutexes can result in serial read performance
- Better with many keys and low key contention

Non-blocking cache: pros and cons
- Pros
  - Can return stale data and load in the background
  - No read mutex required!
- Cons
  - Requires a thread pool for asynchronous loading
  - Data is unpredictably stale
- Better with fewer keys and high key contention
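A sketch of the non-blocking pattern, assuming a TTL-based notion of staleness (RefreshingCache is a hypothetical name): a stale hit returns the old value immediately and hands the reload to a small thread pool, so readers never block on a refresh:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Non-blocking cache sketch: stale hits return the old value and
// schedule an asynchronous reload. Only the very first request for a
// key has to block, because there is nothing stale to fall back on.
public class RefreshingCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAt;
        Entry(V value, long loadedAt) { this.value = value; this.loadedAt = loadedAt; }
    }

    private final ConcurrentHashMap<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final long ttlMillis;
    private final ExecutorService pool = Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true); // don't keep the JVM alive for refreshes
        return t;
    });

    public RefreshingCache(Function<K, V> loader, long ttlMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
    }

    public V get(K key) {
        Entry<V> e = store.get(key);
        if (e == null) {
            // First request: no stale value exists, so load synchronously.
            V v = loader.apply(key);
            store.put(key, new Entry<>(v, System.currentTimeMillis()));
            return v;
        }
        if (System.currentTimeMillis() - e.loadedAt > ttlMillis) {
            // Stale: refresh in the background, return the old value now.
            pool.submit(() -> store.put(key,
                    new Entry<>(loader.apply(key), System.currentTimeMillis())));
        }
        return e.value;
    }
}
```

A production version would also deduplicate concurrent refreshes of the same key; this sketch may schedule several.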

Which is which? What do I use?
- EhCache is a blocking cache
- Guava’s cache can be either
- Neither is the best choice in all cases
- Data characteristics will help you decide
- Not sure what to use? Talk to your coworkers!

Example cases
- Product catalog with millions of items
  - Blocking
- Subset of products on the homepage
  - Non-blocking
- Subset of products on sale for a limited time
  - Questionable: contention and staleness are both important
  - Split data on TTL, static vs dynamic
  - Refresh price/inventory frequently and asynchronously

Local vs distributed caching

- Local means each server has its own cache
- Distributed means multiple servers share a cache
- Local is very simple
- Distributed is very complex
- Greater detail on distributed caching is outside our scope

Local: pros and cons
- Pros
  - Basically just a decorated Map
  - No serialization overhead
  - Good ones available for free
- Cons
  - Less efficient: more cache misses and memory usage
  - In large clusters, the cache hit ratio can suffer greatly
  - In clusters, consistency becomes a problem

Distributed: pros and cons
- Pros
  - Fewer cache misses in clusters; improves with size
  - Consistency across the cluster
  - Great for splitting up large data sets
- Cons
  - Complex: more data management and network chatter
  - Serialization overhead for remote retrieval
  - Enterprise solutions cost $$$, free ones are more barebones

Which one do I use?
- Again, no one perfect solution; it depends on the data
- Local is quick and dirty
- Distributed is better for large data sets
- Distributed is usually better for clusters
- Distributed comes at a cost of complexity and $$$
- Not sure which one? Talk to your coworkers!

Example cases
- CMS with large view templates
  - Distributed (size and consistency)
- Display data like greetings, canned messages, etc.
  - Local (small, changes infrequently)
- Session store for user behavior
  - Questionable: consistency and resource concerns
  - Not mission critical data, but good to have
  - Local with sticky sessions is an option

Getting closer with your data

Questions you should ask
- How expensive is it to load?
  - Size: IO/memory bound; electrons only move so fast
  - Cost: high DB or CPU load means you must cache
  - Frequency: death by 1000 cuts
- How often will I hit the cache?
  - Example: static content often means a high cache hit ratio
  - Example: highly diverse data means a low cache hit ratio

Preparation through estimation
- Estimate how the data will grow
  - Number of new records per month?
  - How much is considered active?
- Estimate how diverse the data will become
  - Thousands of unique entries? Millions?
  - Is there a subset of data that is more popular?
- Work with product owners on the expected use of the data

And now for a recap!

A recap: fundamentals
- Caching is great, but it’s not free
- Get to know your data
- Consider how much memory your app will use
- Consider how expensive it is to load data
- Take cache hit ratios seriously
- Not everything needs a cache

A recap: choices to make
- Non-blocking: great for bounded data
- Blocking: great for diverse data
- Non-blocking: uses more resources, returns more stale data
- Blocking: bad when keys have high contention
- Local: simple, cheap, but can impact memory more
- Distributed: great for big data sets and large clusters

Getting runtime info
- Track cache statistics via JMX
  - Cache hits and misses
  - Total objects in memory
- Tools like Nagios and AppDynamics can track JMX
- Guava doesn’t expose JMX out of the box, but it’s easy to add
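Guava’s CacheBuilder.recordStats() and cache.stats() give you hit/miss counts that you can publish through an MBean yourself. The bookkeeping itself is just two counters, sketched here with stdlib types (StatsCache is a hypothetical name):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of cache statistics: count hits and misses on every lookup
// so the hit ratio can be published, e.g. as a JMX MBean attribute.
public class StatsCache<K, V> {
    private final Map<K, V> store = new HashMap<>();
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong misses = new AtomicLong();

    public void put(K key, V value) {
        store.put(key, value);
    }

    public V get(K key) {
        V v = store.get(key);
        if (v != null) hits.incrementAndGet(); else misses.incrementAndGet();
        return v;
    }

    public double hitRatio() {
        long h = hits.get(), m = misses.get();
        return (h + m) == 0 ? 0.0 : (double) h / (h + m);
    }
}
```

Exposing hitRatio() via JMX lets Nagios or AppDynamics alert when it drops below your target (e.g. the 80% mentioned earlier).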

BUT WHICH ONE DO I USE?!
- Ultimately, you’ll have to figure it out
- … with the help of your coworkers
- Before you can decide what to use, you need to understand the characteristics of your data

Shameless self promotion
- Website:
  - Has several cache and memory related posts
- Github Gists:
  - Lots of code samples (referenced in blog posts)
- Interested in these problems?
  - Mashery is hiring!
  - API management for companies big and small
  - Scalability is core to our business

Obligatory “Questions” slide  Well?