A Recovery-Friendly, Self-Managing Session State Store Benjamin Ling and Armando Fox

1 A Recovery-Friendly, Self-Managing Session State Store
Benjamin Ling and Armando Fox
{bling,fox}@cs.stanford.edu

2 © 2003 Benjamin Ling Outline
- Motivation: What is Session State?
- Existing solutions
- SSM: Architecture and Algorithm
- SSM: Recovery-friendly
- SSM: Self-Managing
- Related and Future Work
- Conclusion

3 Example of Session State

4 Session State and Existing Solutions
- We focus on a subcategory of session state
  - Single-user, serial access, semi-persistent data
  - Examples: temporary application data, application workflow
  - Example of usage (e.g. J2EE): [Diagram: Browser and App Server exchanging numbered steps 1-6]

5 Existing solutions:
- File System and Databases
  - Poor failure behavior
    - Lose data (FS)
  - Slow recovery (both)
  - Difficult to administer (DB)
  - Difficult to tune (both)
- In-memory replication using primary/secondary:
  - Performance coupling
  - Poor failover (uneven load balancing)

6 Goal
- Build a session state store that is:
  - Failure-friendly
    - Does not lose data on crash
    - Degrades gracefully
  - Recovery-friendly
    - Recovers fast
  - Self-Managing
  - High performance
    - Avoids performance coupling

7 Session State Manager (SSM)
[Diagram: AppServers, each with a stub, connected to Bricks 1-5; brick labels: RAM, Network Interface]
- Redundant, in-memory hash table distributed across nodes
- Algorithm: redundancy similar to quorums
  - Write to many random nodes, wait for few (avoid performance coupling)
  - Read one

8 Write example: “Write to Many, Wait for Few”
[Diagram: Browser, AppServer with stub, Bricks 1-5]
- Try to write to W random bricks, W = 4
- Must wait for WQ bricks to reply, WQ = 2
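
The write path on this slide can be sketched roughly as follows. This is a minimal illustration, not the SSM implementation: the `Brick` class, the `write_to_many_wait_for_few` function, and the use of threads are assumptions made for the example. Only W = 4, WQ = 2, and the 60 ms timeout come from the slides.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FutureTimeout


class Brick:
    """Hypothetical stand-in for a brick: an in-memory dict behind put()."""

    def __init__(self, brick_id):
        self.brick_id = brick_id
        self.table = {}

    def put(self, key, value):
        self.table[key] = value
        return self.brick_id  # the return value serves as the acknowledgement


def write_to_many_wait_for_few(bricks, key, value, w=4, wq=2, timeout=0.06):
    """Issue the write to w random bricks; return ids of the first wq to acknowledge."""
    targets = random.sample(bricks, w)
    pool = ThreadPoolExecutor(max_workers=w)
    futures = [pool.submit(brick.put, key, value) for brick in targets]
    acked = []
    try:
        for future in as_completed(futures, timeout=timeout):
            if future.exception() is None:
                acked.append(future.result())
            if len(acked) == wq:
                break  # do not wait for the slower bricks
    except FutureTimeout:
        pass  # some bricks did not answer within the timeout
    finally:
        pool.shutdown(wait=False)  # stragglers finish (or fail) in the background
    if len(acked) < wq:
        raise RuntimeError("write failed: fewer than WQ bricks acknowledged")
    return acked
```

Because the caller returns after WQ acknowledgements, a single slow brick among the W targets does not delay the request, which is the sense in which the scheme avoids performance coupling. The returned ids are what a client would need to remember in order to read the state back later.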

12 Algorithm Properties
- Client remembers metadata
  - Fate sharing
- Stubs are stateless
- Negative feedback loop
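
One way to picture "client remembers metadata" together with "read one" is the sketch below. The cookie layout (`key` plus a list of brick ids), the `Brick` class, and `read_one` are all hypothetical; the slides do not specify the metadata format.

```python
class Brick:
    """Hypothetical brick: an in-memory dict that can be marked as crashed."""

    def __init__(self, brick_id):
        self.brick_id = brick_id
        self.table = {}
        self.alive = True

    def get(self, key):
        if not self.alive:
            raise ConnectionError(f"brick {self.brick_id} is down")
        return self.table[key]


def read_one(bricks_by_id, cookie):
    """Stateless stub read: try the bricks named in the client's cookie until one answers."""
    for brick_id in cookie["bricks"]:
        brick = bricks_by_id.get(brick_id)
        if brick is None:
            continue  # brick was removed from the cluster
        try:
            return brick.get(cookie["key"])
        except (ConnectionError, KeyError):
            continue  # brick crashed or restarted empty; try the next copy
    raise LookupError("no brick holding this session state replied")
```

The stub keeps no per-session table here: everything it needs to locate the data arrives with the request, so any stub can serve any client and a stub restart loses nothing.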

13 SSM: Recovery-Friendly
- Failure
  - No data is lost, WQ-1 copies of the data remain
  - State is available for R/W during failure
- Recovery
  - Start a new brick – don’t need to recover anything
  - No special case recovery code (restart=recovery)
  - State is available for R/W during brick restart
    - Repair phase does not reduce throughput/performance
  - Session state is self-recovering
    - User’s access pattern will cause data to be rewritten

14 SSM: Self-Managing
- Adaptive:
  - Stub maintains count of maximum allowable in-flight requests to each brick
    - Additive increase on successful request
    - Multiplicative decrease on timeout
  - Stubs discover load capacity of each brick -> Self-Tuning
- Admission control
  - Stubs say “no” if insufficient bricks
  - Propagate backpressure from bricks to clients
    - Turn users away under overload -> Self-Protecting
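
The adaptation and admission-control rules above might look like this in code. It is a sketch under assumptions: the class and function names, the step of +1, the halving factor, and the initial limit are illustrative choices, not values given in the slides. Bricks are also tried in dictionary order to keep the example deterministic, whereas SSM picks bricks at random.

```python
class BrickWindow:
    """Per-brick cap on in-flight requests, adjusted by AI/MD."""

    def __init__(self, initial=8.0, minimum=1.0, maximum=1024.0):
        self.limit = initial
        self.minimum = minimum
        self.maximum = maximum
        self.in_flight = 0

    def try_acquire(self):
        if self.in_flight >= int(self.limit):
            return False  # brick is considered at capacity
        self.in_flight += 1
        return True

    def release(self):
        self.in_flight -= 1

    def on_success(self):
        self.release()
        self.limit = min(self.maximum, self.limit + 1.0)  # additive increase

    def on_timeout(self):
        self.release()
        self.limit = max(self.minimum, self.limit / 2.0)  # multiplicative decrease


def admit_write(windows, w, wq):
    """Reserve a slot on up to w bricks; refuse the request if fewer than wq have room."""
    chosen = []
    for brick_id, window in windows.items():
        if len(chosen) == w:
            break
        if window.try_acquire():
            chosen.append(brick_id)
    if len(chosen) < wq:
        for brick_id in chosen:
            windows[brick_id].release()  # give back partial reservations
        return None  # the stub says "no"; backpressure reaches the client
    return chosen
```

A `None` result is the point at which the stub would turn the user away, so overload at the bricks surfaces as rejected requests rather than as growing queues.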

15 Self-Tuning and Self-Protecting
[Figures: overload without additive-increase/multiplicative-decrease (AI/MD) adaptation; overload with AI/MD adaptation]

16 Other implementation details
- Garbage collection
  - Generational hash table
    - Hash table of hash tables
    - Each hash table has an associated time range
    - When time has passed, GC that table
  - No reference counting, scanning, etc.
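
A generational hash table of this kind could be sketched as below. The class name, the fixed generation length, and the choice to file each entry under the generation that contains its expiry time are assumptions made for illustration; the slides only state that each inner table covers a time range and is dropped once that range has passed.

```python
import time


class GenerationalTable:
    """Hash table of hash tables; each inner table covers one time range."""

    def __init__(self, generation_seconds=60.0, clock=time.monotonic):
        self.generation_seconds = generation_seconds
        self.clock = clock
        self.generations = {}  # generation index -> {key: value}

    def _generation_of(self, t):
        return int(t // self.generation_seconds)

    def put(self, key, value, ttl):
        """Store value in the generation that contains its expiry time."""
        generation = self._generation_of(self.clock() + ttl)
        self.generations.setdefault(generation, {})[key] = value
        return generation  # caller keeps this to find the entry again

    def get(self, key, generation):
        return self.generations[generation][key]  # KeyError if already collected

    def collect(self):
        """Drop every generation whose time range has fully passed."""
        current = self._generation_of(self.clock())
        expired = [g for g in self.generations if g < current]
        for g in expired:
            del self.generations[g]  # one delete frees all entries in that table
        return len(expired)
```

Collection cost depends on the number of generations, not the number of entries, which matches the "no reference counting, scanning" point. The trade-off in this sketch is that an entry can outlive its expiry by up to one generation length.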

17 Is it cheap? Is it fast? Is it easy to use?
- How much does replication cost?
  - With 10 bricks, 1 GB memory each, state size 8 KB, replication factor of 3
  - Serve around 416,000 concurrent users
- Configurable request timeout – currently 60 ms
  - Dwarfed by computation time and client RT time
- Easy to add a brick, kill a brick
  - System continues running
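
The capacity figure can be checked with a few lines of arithmetic. The function below is only a back-of-the-envelope sketch: it assumes decimal units (1 GB = 10^9 bytes, 8 KB = 8,000 bytes), 1 GB per brick, and no per-entry overhead, none of which the slide states explicitly.

```python
def concurrent_sessions(bricks, memory_per_brick_bytes, state_bytes, replication_factor):
    """Sessions that fit when every session is stored replication_factor times."""
    total_memory = bricks * memory_per_brick_bytes
    return total_memory // (state_bytes * replication_factor)


print(concurrent_sessions(10, 10**9, 8 * 10**3, 3))  # 416666
```

With binary units (2^30 and 2^13 bytes) the same formula gives roughly 437,000, so the slide's "around 416,000" is consistent with the decimal reading.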

18 Publications
- The Case for a Session State Storage Layer. Ben Ling, Armando Fox. 9th Workshop on Hot Topics in Operating Systems (HotOS IX), Lihue, HI, May 2003.
- A Self-Managing Session State Layer. Ben Ling, Armando Fox. Accepted to the 5th Annual Workshop on Active Middleware Services (AMS 2003), Seattle, WA, June 2003.
http://swig.stanford.edu/public/publications

19 Related Work
- Palimpsest – Timothy Roscoe, Intel
  - Temporal storage
  - Erasure coding
  - No guarantees, just estimates
- DeStor – Andy Huang, Stanford
  - Persistent, multi-user, non-transactional data
- FAB – HP Labs
  - Enterprise disk storage
  - Redundancy at disk block level

20 Future Work
- Do fault analysis and model failure
  - Memory and network failure modes
  - Performance faults?
- How to choose replication factor?
  - 10 bricks, WQ of 3, inter-request interval of 5 minutes -> “5 nines” of availability if MTTF of bricks > 22 minutes
- Adaptively change replication factor?

21 SSM: Relaxing ACID
- A – we guarantee
- C – guaranteed by workload (full rewrite of state)
- I – guaranteed by workload (single user, serial-access)
- D – relaxed (ephemeral guarantee, RAM enough) ->
- Fast, simple, clean recovery
  - No data loss on failure
  - Data can be R/W during failure/recovery
- Self-Managing

22 Summary
- We have built a system for:
  - Semi-persistent storage for single-user, serial-access data
  - Recovery friendly:
    - Crash Only – Crash-safe, fast recovery
    - No special case recovery code
    - Reboot any individual node
    - Continuous data availability
  - Self-Managing:
    - Self-Tuning and Protecting
    - Simple management and fault enforcement model
Benjamin Ling bling@cs.stanford.edu http://swig.stanford.edu/

23 SSM: Recovery-Friendly, Self-Managing Store
Questions or Comments?
Benjamin Ling bling@cs.stanford.edu http://swig.stanford.edu/

