Dealing with the chaos monkey

1 Dealing with the chaos monkey
Mobile Computing. Bruce Scharlau, University of Aberdeen, 2012

2 Background
You have a large international service built on top of web services in ‘the cloud’, which you rely upon. What happens to your service if they disappear? How will your customers respond?

3 We can place data elsewhere on the network
Use a web service to store data elsewhere: save photos to Flickr, or files to some other app in the cloud. Files can be saved automatically, or at the user’s discretion, with time values, etc. (Twitter, apps, or photo capture).

4 Amazon Web Services died for several days a few years ago
Only one company that used them carried on while the others suffered the outage. Other cases where persistence matters: working on a form online and losing the connection; working disconnected, then syncing the device when ‘in contact’; saving state in a game. Persistence lets you add ‘memory’ to the application.
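
The "work disconnected, then sync" idea can be sketched as a local queue of pending changes. This is only a minimal illustration, not something from the slides: the `OfflineQueue` class and the `send` callable are hypothetical names, and the sketch assumes the upload function raises `ConnectionError` while offline.

```python
from collections import deque


class OfflineQueue:
    """Buffers changes locally and replays them once a connection is back."""

    def __init__(self, send):
        self._send = send          # hypothetical callable that uploads one change
        self._pending = deque()    # changes recorded while disconnected

    def record(self, change):
        """Store a change locally; nothing is sent yet."""
        self._pending.append(change)

    def sync(self):
        """Upload pending changes in order; stop at the first failure."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break              # still offline: keep the rest for later
            self._pending.popleft()
            sent += 1
        return sent

    def pending(self):
        """Return the changes that have not been uploaded yet."""
        return list(self._pending)
```

A change is only removed from the queue after its upload succeeds, so a dropped connection in the middle of a sync loses nothing.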

5 Netflix’s chaos monkey saved them
They had built a service to create random outages of the services they used. This forced them to provide a minimal service despite outages. When Amazon went down, they were prepared.

6 Feed & grow your chaos monkey
How often will remote data be accessed? How quickly does remote data need to appear? How often will the data be updated/edited? Where will minimal data be stored? These answers will suggest solutions for you.

7 Remote data may not always be needed
What you put on remote servers depends upon your own product and how it is deployed. These answers will suggest solutions for you.

8 Remote data may not be instant
If remote data is not expected to be instant, then slower servers of your own may suffice for interim periods. These answers will suggest solutions for you.

9 Remote data can be slowly edited
Remote data can be staged so that current versions are held locally and can be used when remote services fail. These answers will suggest solutions for you.
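
One way to picture staging is a store that keeps the last good remote value locally and serves it during an outage. The sketch below is an assumption-laden illustration: `StagedStore` and `fetch_remote` are hypothetical names, and the remote call is assumed to raise `ConnectionError` when the service is unavailable.

```python
class StagedStore:
    """Keeps the last good remote value locally and serves it during outages."""

    def __init__(self, fetch_remote):
        self._fetch_remote = fetch_remote  # hypothetical callable: key -> value
        self._local = {}                   # staged copies of current versions

    def get(self, key):
        """Return (value, source); source is 'remote' or 'local'."""
        try:
            value = self._fetch_remote(key)
        except ConnectionError:
            if key in self._local:
                return self._local[key], "local"
            raise                          # nothing staged: surface the outage
        self._local[key] = value           # refresh the staged copy
        return value, "remote"
```

Returning the source alongside the value lets the caller tell the user that the data may be stale.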

10 Storing your own minimal data may be necessary
Remote web services help, but are not the only route to success. These answers will suggest solutions for you.

11 All depends upon data storage needs
How often will the data be accessed? How quickly does the data need to appear? How often will the data be updated/edited? Will the data be added to over time? Will the data be deleted? How will the data need to be used? These answers will suggest solutions for you.

12 Use caches to manage data
Caches come in different shapes and sizes. Some can handle data before it’s written to the db; some can hold data while the db is changed, etc.
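
A cache that handles data before it is written to the db is often called a write-behind cache. The following is a minimal sketch under stated assumptions: `WriteBehindCache`, `db_write` and `db_read` are hypothetical names, and the database call is assumed to raise `ConnectionError` when the db is unreachable.

```python
class WriteBehindCache:
    """Holds writes in memory and flushes them to the database later."""

    def __init__(self, db_write, db_read):
        self._db_write = db_write   # hypothetical callable: (key, value) -> None
        self._db_read = db_read     # hypothetical callable: key -> value
        self._cache = {}            # most recent value per key
        self._dirty = set()         # keys not yet written to the database

    def put(self, key, value):
        """Accept a write immediately; the database is updated on flush()."""
        self._cache[key] = value
        self._dirty.add(key)

    def get(self, key):
        """Serve from memory, reading through to the database on a miss."""
        if key not in self._cache:
            self._cache[key] = self._db_read(key)
        return self._cache[key]

    def flush(self):
        """Write dirty keys to the database; keys that fail stay dirty."""
        for key in list(self._dirty):
            try:
                self._db_write(key, self._cache[key])
            except ConnectionError:
                continue
            self._dirty.discard(key)
        return len(self._dirty)     # number of keys still waiting
```

The trade-off is that unflushed writes live only in memory, so a real system would need to decide how much data it can afford to lose if the cache itself fails.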

13 Remove 3rd party dependencies
Don’t make your app wait for third party responses before it replies to the user. Find a way to use the 3rd party in an asynchronous manner so your speed isn’t determined by their response time.
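
As a rough illustration of the asynchronous approach, the sketch below replies within a fixed time budget and lets the third party finish in the background. It uses Python's standard `asyncio`; `slow_third_party`, `handle_request` and the timings are hypothetical stand-ins, not part of the original talk.

```python
import asyncio


async def slow_third_party(user_id):
    """Stand-in for a slow external call (hypothetical)."""
    await asyncio.sleep(0.2)
    return f"recommendations for {user_id}"


async def handle_request(user_id, timeout=0.05):
    """Reply within `timeout` seconds whether or not the third party answers."""
    task = asyncio.create_task(slow_third_party(user_id))
    try:
        # shield() keeps the call running even if we stop waiting for it.
        extras = await asyncio.wait_for(asyncio.shield(task), timeout)
    except asyncio.TimeoutError:
        extras = None  # reply now; the task can finish in the background
    return {"user": user_id, "extras": extras}, task


async def main():
    reply, task = await handle_request("u1")
    late_result = await task  # e.g. store it for the next request
    return reply, late_result
```

Calling `asyncio.run(main())` returns the quick reply without the extras, plus the late third-party result once it arrives.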

14 Separate out functions, etc
Keep functions in separate libraries to ease maintenance and development. When everything is put in one component it becomes entangled and causes problems with response rates.

15 Take this further and assume anything could fail
Servers die, power fails, things fail. Build your system to withstand this and you’ll do fine. You will end up with a resilient infrastructure.

16 When code is ready then test
https://github.com/Netflix/SimianArmy/wiki/Quick-Start-Guide will guide you. Run automatic tests on the code, but also test that the code works by randomly stopping services, etc.
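
The idea of testing by randomly stopping services can be tried in miniature before reaching for the real tool. The toy below does not use the SimianArmy API at all; the `Service` class, the service names and `run_chaos_test` are hypothetical, and everything runs in a single process.

```python
import random


class Service:
    """In-process stand-in for a remote service (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.up = True

    def call(self):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} ok"


def homepage(services):
    """Build a page from whichever services respond; never raise."""
    parts = {}
    for svc in services:
        try:
            parts[svc.name] = svc.call()
        except ConnectionError:
            parts[svc.name] = "unavailable"  # degrade instead of failing
    return parts


def chaos_round(services, rng):
    """Stop one randomly chosen service and return it."""
    victim = rng.choice(services)
    victim.up = False
    return victim


def run_chaos_test(rounds=20, seed=0):
    """Repeatedly kill a random service and check the page still renders."""
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    for _ in range(rounds):
        services = [Service(n) for n in ("search", "ratings", "billing")]
        victim = chaos_round(services, rng)
        page = homepage(services)
        assert page[victim.name] == "unavailable"
        assert all(v.endswith("ok") for k, v in page.items() if k != victim.name)
    return rounds
```

Seeding the random generator is a deliberate choice here: the outages are still unpredictable to the code under test, but a failing round can be reproduced.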

17 Run these tests when suitable staff are available
Run these tests when staff expect them so that they can respond accordingly and learn from them. Run them on the production side so that responses can be organised accordingly. Better now than at 3am at the weekend…

18 Must be run against production
Chaos monkey must be run against production, as this is where it counts and where nuances exist that can’t be replicated in test environments. All of this fits into the larger ‘devops’ approach to development.

19 Mobile ticketing site example
Tickets for purchase from many events. Huge demand when tickets are first released. Unpredictable demand when events go viral.

