Caching for Sustainability Alex Bunch. Agenda Intro Overview Background Analysis Implementation Future.


1 Caching for Sustainability Alex Bunch

2 Agenda Intro, Overview, Background, Analysis, Implementation, Future

3 Intro Caching is a systems technique for making good use of relatively expensive hardware with special features. On-chip SRAM is fast but costs more than memory. Memory is faster than disk but… Web caching services (like Akamai) have low network latency to end users but can’t scale like datacenters. How it works: caching relies on evidence that some pieces of data are more likely to be accessed than others.

4 Intro Methods for determining likelihood of access Spatial Locality: Data near data that has just been accessed is likely to be accessed. Temporal Locality: Data that has just been accessed is likely to be accessed again.

5 10000 ft. view The principal idea behind this research is that green hosts are a new type of hardware with special features. These hosts either run their service entirely on renewable sources or supplement it by purchasing enough renewable energy credits to offset any dirty energy used.

6 10000 ft. view The idea behind Greenmail is that it acts as a cache for emails that are likely to be accessed. Because it is a zero-carbon service, the overall carbon footprint of the user goes down.

7 10000 ft. view

8

9 Background On green trends; on green hosting; on Greenmail locality

10 Background One of the fundamental ideas that Greenmail is based on is that people want their services to be green. This idea is supported by the fact that the customer base for green hosts increased 60% a year from 2002 to 2008 [1].

11 Background Beyond simple customer interest, green products need to be competitively priced: 83% of consumers would rather use a green service if it did not cost more than the dirty alternative [2]. Green hosting is becoming significantly more widespread and, in turn, more competitive with dirty energy prices.

12 Background Green hosts are internet hosting companies that perform ‘green’ actions for their users to offset any carbon emitted by their datacenter, whether through the direct use of renewable energy, planting trees, or buying offsets.

13 Background Stating that email exhibits temporal and/or spatial locality is a lofty claim, but intuition argues that a user who accesses an important email will eventually reference it again. Our hope is that these claims are validated by the data.

14 Analysis One of the most classic equations for caches is the Average Memory Access Time (AMAT): AMAT = Ht + r*Mt, where Ht is the cache hit time, r is the miss rate, and Mt is the miss penalty.
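The AMAT formula can be checked with a few lines of Python. This is a minimal sketch; the function name `amat` and the sample numbers are illustrative assumptions, not figures from the talk.

```python
def amat(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Average memory access time: AMAT = Ht + r * Mt."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


# Hypothetical numbers: 1 ns hit time, 5% miss rate, 100 ns miss penalty.
print(amat(1.0, 0.05, 100.0))
```

Lowering the miss rate or the miss penalty lowers the average, while the hit time is paid on every access.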

15 Analysis Beyond serving as a great high-level analogy, caching gives Greenmail a similar equation for Average Carbon Footprint (ACFP): ACFP = Hc + r*Mc, where Hc is the carbon associated with a cache hit, r is the miss rate, and Mc is the carbon miss penalty.

16 Analysis Because Greenmail is carbon neutral, Hc is 0, and because Mc is determined by the original email provider, the miss rate (r) is the only element of this equation that we can attempt to minimize, subject to our constraints.
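With Hc = 0, the footprint equation collapses to r*Mc, so the footprint scales linearly with the miss rate. The sketch below illustrates this; the function names and the carbon units are assumptions made for the example.

```python
def acfp(hit_carbon: float, miss_rate: float, miss_carbon: float) -> float:
    """Average carbon footprint per access: ACFP = Hc + r * Mc."""
    return hit_carbon + miss_rate * miss_carbon


def greenmail_acfp(miss_rate: float, miss_carbon: float) -> float:
    """ACFP for a carbon-neutral cache, where Hc is taken to be 0."""
    return acfp(0.0, miss_rate, miss_carbon)


# Hypothetical provider cost of 4.0 carbon units per miss.
for r in (0.5, 0.25):
    print(r, greenmail_acfp(r, 4.0))
```

Halving the miss rate halves the footprint, which is why the rest of the analysis concentrates on r.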

17 Constraints As with classic caches, the miss rate depends partly on the size of the cache and the algorithm used to replace data. While the algorithm can be modified depending on experimental data, the cache size has a cap.
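To see how cache size and replacement policy interact, the sketch below replays an access trace through a least-recently-used (LRU) cache and reports the miss rate. LRU is chosen here only as a familiar example; the talk does not say which replacement algorithm Greenmail uses, and the trace is made up.

```python
from collections import OrderedDict
from typing import Hashable, Sequence


def lru_miss_rate(trace: Sequence[Hashable], capacity: int) -> float:
    """Replay `trace` through an LRU cache holding at most `capacity` items."""
    cache: "OrderedDict[Hashable, bool]" = OrderedDict()
    misses = 0
    for key in trace:
        if key in cache:
            cache.move_to_end(key)         # mark as most recently used
        else:
            misses += 1
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    return misses / len(trace) if trace else 0.0


# Hypothetical trace of email identifiers.
trace = ["a", "b", "a", "c", "a", "b"]
for size in (1, 2, 3):
    print(size, lru_miss_rate(trace, size))
```

On this trace the miss rate falls as the capacity grows, which is the trade-off that the size cap limits.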

18 Constraints Our cache size cap is self-imposed to keep Greenmail economically sound: our cost of maintaining the cache should not exceed what the original email provider spends storing all of a single user’s data.

19 Constraints This makes our cache smaller because email providers have two factors working to reduce their costs: Dirty energy – costs less than green energy. Economy of scale – more users translates into spending less per user.

20 Constraints example Email Host A uses dirty power that costs half as much as green power, and because of its number of users it can purchase hardware at 75% of the price Greenmail can. Greenmail must therefore hold at most 37.5% (0.5 × 0.75) of the emails that the host does.
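The 37.5% figure is the product of the two cost ratios. A minimal sketch of that arithmetic follows; it assumes, as the example appears to, that per-email cost is simply energy cost times hardware cost, and the function name is hypothetical.

```python
def max_cache_fraction(energy_cost_ratio: float, hardware_cost_ratio: float) -> float:
    """Largest share of a user's mail the cache can hold at equal cost.

    Each ratio is the provider's cost divided by Greenmail's cost
    (0.5 means the provider pays half as much).
    """
    return energy_cost_ratio * hardware_cost_ratio


# Host A: power at half the price, hardware at 75% of the price.
print(max_cache_fraction(0.5, 0.75))  # 0.375
```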

21 Implementation Our implementation of Greenmail is based on a modified version of SquirrelMail, a free, open-source, web-based email application that has access to an IMAP proxy server.

22 Implementation Cache functionality comes from modifying the SquirrelMail IMAP functions. A single IMAP session consists of many messages sent between the user running SquirrelMail and the original email provider, but only a few of them are worth caching.

23 Implementation Only two of these messages are ‘worth’ caching, because most of the others are just a few lines long: ‘Get Headers’ – returns a list of all the email subjects in the relevant mailbox/search. ‘Get Body’ – returns the body of the email requested.

24 Implementation ‘Get Body’ – an encrypted local copy is made whenever this is called, and any subsequent calls retrieve the local copy. ‘Get Headers’ – theoretically should be easy to cache, except that a timestamp baked into it is used for error checking.
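The ‘Get Body’ behaviour described above is a read-through cache. The sketch below shows the pattern in Python rather than in SquirrelMail's own code; `BodyCache`, `fetch_body`, and the in-memory dictionary are hypothetical stand-ins, and the encryption of the local copy is deliberately left out.

```python
from typing import Callable, Dict


class BodyCache:
    """Read-through cache for message bodies (illustrative only)."""

    def __init__(self, fetch_body: Callable[[str], bytes]) -> None:
        self._fetch_body = fetch_body      # hypothetical call to the provider
        self._store: Dict[str, bytes] = {}  # stand-in for the local copy
        self.hits = 0
        self.misses = 0

    def get_body(self, message_id: str) -> bytes:
        if message_id in self._store:
            self.hits += 1
            return self._store[message_id]
        # Miss: ask the original provider, then keep a local copy.
        self.misses += 1
        body = self._fetch_body(message_id)
        self._store[message_id] = body
        return body


def fake_provider(message_id: str) -> bytes:
    """Pretend provider used only for this demonstration."""
    return b"body of " + message_id.encode()


cache = BodyCache(fake_provider)
cache.get_body("42")
cache.get_body("42")
print(cache.hits, cache.misses)
```

Only the first request for a message reaches the provider; the counters give the miss rate r used in the carbon footprint equation.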

25 Implementation In addition to the modifications to SquirrelMail, additional scripts were needed to let users quickly and easily set up their own cache. Separate directories are made for each user because of how SquirrelMail stores IMAP configurations.

26 Results We are currently collecting data from real users, as there is no set test suite or benchmark that models users accessing emails. In the future, if a good user ‘profile’ is found, it may be possible to automate this (x% spam, y% accessed frequently, etc.).
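One way such a profile could drive an automated benchmark is to generate a synthetic access trace from it. The sketch below is purely illustrative: the parameter names, the three-way split into spam, hot, and cold mail, and every number are assumptions, not measured Greenmail data.

```python
import random
from typing import List


def synthetic_trace(num_messages: int, num_accesses: int,
                    spam_fraction: float, hot_fraction: float,
                    hot_access_prob: float, seed: int = 0) -> List[int]:
    """Generate message ids accessed by a hypothetical user profile.

    Spam is never read; "hot" messages receive roughly `hot_access_prob`
    of all accesses; the remaining "cold" messages share the rest.
    """
    rng = random.Random(seed)
    num_spam = int(num_messages * spam_fraction)
    num_hot = int(num_messages * hot_fraction)
    hot = list(range(num_spam, num_spam + num_hot))
    cold = list(range(num_spam + num_hot, num_messages))
    if not hot and not cold:
        raise ValueError("profile leaves no readable messages")
    trace = []
    for _ in range(num_accesses):
        use_hot = bool(hot) and (not cold or rng.random() < hot_access_prob)
        trace.append(rng.choice(hot if use_hot else cold))
    return trace


# Hypothetical profile: 50% spam, 10% hot mail that draws 80% of reads.
trace = synthetic_trace(100, 1000, 0.5, 0.1, 0.8)
print(len(trace), len(set(trace)))
```

A trace like this could then be replayed through a candidate cache to estimate its miss rate before real user data is available.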

27 Example Locality Analysis (not from Greenmail)

28 Future Work Heavy data analysis; cache algorithms; caching headers; caching searches; used to limit mailbox refresh rate; Zoolander backend

29 Questions/References [1] The AMD Opteron Processor Helps AISO. www.vmware.com. [2] N. Holdings. The Nielsen Global Online Environmental Survey, 2011.

