Caching for Sustainability Alex Bunch

Agenda Intro Overview Background Analysis Implementation Future

Intro Caching is a systems technique for making the most of relatively expensive hardware with special properties: On-chip SRAM is fast but costs more than main memory. Main memory is faster than disk but… Web caching services (like Akamai) have low network latency to end users but can’t scale like datacenters. How it works: caching relies on evidence that some pieces of data are more likely to be accessed than others.

Intro Methods for estimating likelihood of access: Spatial locality: data near data that has just been accessed is likely to be accessed. Temporal locality: data that has just been accessed is likely to be accessed again soon.
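
Replacement policies such as least-recently-used (LRU) exploit temporal locality directly. The sketch below is our own minimal Python illustration, not code from the talk: a small set of "hot" keys keeps hitting in a small cache.

```python
from collections import OrderedDict

class LRUCache:
    """Least-recently-used cache: evicts the entry accessed longest ago."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, fetch):
        if key in self.store:
            self.store.move_to_end(key)      # mark as most recently used
            self.hits += 1
            return self.store[key]
        self.misses += 1
        value = fetch(key)                   # fall through to the slow path
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)   # evict the least recently used
        return value

# A workload with temporal locality: keys 1 and 2 dominate the accesses.
cache = LRUCache(capacity=4)
workload = [1, 2, 1, 3, 1, 2, 4, 1, 2, 5, 1, 2]
for k in workload:
    cache.get(k, fetch=lambda k: f"data-{k}")
print(cache.hits, cache.misses)  # 7 5
```

Even this tiny cache serves 7 of 12 accesses locally because the workload keeps revisiting the same keys.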

10000 ft. view The central idea behind this research is that green hosts are a new type of hardware with special properties: these hosts either run their service entirely on renewable sources, or supplement it by purchasing enough renewable energy credits to offset any dirty energy used.

10000 ft. view The idea behind Greenmail is that it acts as a cache for e-mails that are likely to be accessed; because it is a zero-carbon service, the overall carbon footprint of the user goes down.

Background On green trends. On green hosting. On e-mail locality.

Background One of the fundamental ideas Greenmail is based on is that people want their services to be green. This idea is supported by the customer base for green hosts growing roughly 60% a year [1].

Background Beyond simple customer interest, green products need to be competitively priced: 83% of consumers would rather use a green service if it did not cost more than the dirty alternative [2]. Green hosting is becoming significantly more widespread and, in turn, competitive with dirty-energy prices.

Background Green hosts are Internet hosting companies that perform ‘green’ actions for their users to offset any carbon caused by their datacenter, either through the direct use of renewable energy, planting trees, or buying offsets.

Background Stating that e-mail exhibits temporal and/or spatial locality is a lofty claim, but intuition suggests that a user who accesses an important e-mail will eventually reference it again. Our hope is that these claims are validated by the data.

Analysis One of the most classic equations for caches gives the Average Memory Access Time (AMAT): AMAT = Ht + r*Mt, where Ht is the cache hit time, r is the miss rate, and Mt is the miss penalty.
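
The formula can be evaluated directly; the numbers below are illustrative placeholders, not measurements from the talk.

```python
def amat(hit_time, miss_rate, miss_penalty):
    # AMAT = Ht + r * Mt: every access pays the hit time,
    # and a fraction r of accesses additionally pays the miss penalty.
    return hit_time + miss_rate * miss_penalty

# Illustrative numbers: 1 ns hit time, 25% miss rate, 100 ns miss penalty.
print(amat(1.0, 0.25, 100.0))  # 26.0
```

Note how the miss penalty dominates: even a modest miss rate multiplies into most of the average cost, which is why minimizing r matters so much.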

Analysis Beyond serving as a great high-level analogy, Greenmail has a similar equation for Average Carbon Footprint: ACFP = Hc + r*Mc, where Hc is the carbon associated with a cache hit, r is the miss rate, and Mc is the carbon miss penalty.

Analysis Because Greenmail is carbon neutral, Hc is 0; and since Mc is determined by the original provider, the miss rate r is the only element of this equation we can attempt to minimize, subject to our constraints.
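
With Hc = 0 the equation collapses to ACFP = r*Mc, so every point of miss rate avoided is carbon avoided. A small Python check with made-up numbers:

```python
def acfp(hit_carbon, miss_rate, miss_carbon):
    # Mirrors AMAT: ACFP = Hc + r * Mc
    return hit_carbon + miss_rate * miss_carbon

mc = 10.0                          # hypothetical carbon cost of a miss
no_cache = acfp(0.0, 1.0, mc)      # without Greenmail, every access is a miss
with_cache = acfp(0.0, 0.25, mc)   # carbon-neutral cache: Hc = 0
print(no_cache, with_cache)        # 10.0 2.5
```

Under these (made-up) numbers a 25% miss rate cuts the per-access footprint by 75%, i.e. the savings are exactly the hit rate 1 - r.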

Constraints As with classic caches, the miss rate depends partly on the size of the cache and the algorithm used to replace data. While the algorithm can be tuned based on experimental data, the cache size has a hard cap.

Constraints Our cache size cap is self-imposed to keep Greenmail economically sound: our cost of maintaining the cache should not exceed the cost that the original provider spends storing all of a single user’s data.

Constraints The reason that this makes our cache smaller is that providers have two elements working to reduce their energy costs: Dirty energy – costs less than green energy. Economy of Scale – more users translates into spending less per user.

Constraints example Host A uses dirty power that costs half as much as green power, and because of the number of users it has, it can purchase hardware at 75% of the price Greenmail can. Greenmail can therefore hold at most 0.5 × 0.75 = 37.5% of the e-mails that the host does.
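
The cap is just the product of the two cost ratios. A sketch of the calculation (our own, using the slide's numbers):

```python
def cache_size_cap(energy_cost_ratio, hardware_cost_ratio):
    # Fraction of the provider's stored mail Greenmail can hold at equal
    # total cost: the provider's per-byte cost relative to Greenmail's.
    return energy_cost_ratio * hardware_cost_ratio

# Host A: dirty power at half our price, hardware at 75% of our price.
print(cache_size_cap(0.5, 0.75))  # 0.375 -> at most 37.5% of the host's e-mails
```

The larger and dirtier the original provider, the smaller the fraction Greenmail can afford to cache, which makes the replacement algorithm's job harder.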

Implementation Our implementation of Greenmail is based on a modified version of SquirrelMail, a free, open-source, web-based e-mail application that accesses mail through an IMAP proxy server.

Implementation Cache functionality comes from modifying the SquirrelMail IMAP functions. A single IMAP session consists of many messages sent between the user running SquirrelMail and the original provider, but only a few of them are worth caching.

Implementation Only two of these messages are ‘worth’ caching, since most of the others are just a few lines long: ‘Get Headers’ returns a list of all the subjects in the relevant mailbox/search. ‘Get Body’ returns the body of the requested e-mail.

Implementation ‘Get Body’: an encrypted local copy is made whenever this is called, and any subsequent calls retrieve the local copy. ‘Get Headers’: theoretically easy to cache, except a timestamp used for error checking is baked into it.
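
The ‘Get Body’ path behaves like a simple look-aside cache. The Python below is our own illustration with hypothetical names (the real code lives in SquirrelMail's PHP IMAP functions, and the stored copy is encrypted, which we omit here):

```python
class BodyCache:
    """Look-aside cache for 'Get Body': serve locally on a hit,
    fetch from the original provider and keep a copy on a miss."""
    def __init__(self, imap_fetch):
        self.imap_fetch = imap_fetch   # the slow path to the provider
        self.local = {}                # (mailbox, uid) -> cached body

    def get_body(self, mailbox, uid):
        key = (mailbox, uid)
        if key in self.local:          # hit: no request leaves the machine
            return self.local[key]
        body = self.imap_fetch(mailbox, uid)  # miss: go to the provider
        self.local[key] = body         # real Greenmail encrypts this copy
        return body

# Simulate the provider so we can count how often it is contacted.
calls = []
def fake_fetch(mailbox, uid):
    calls.append(uid)
    return f"body of message {uid}"

cache = BodyCache(fake_fetch)
cache.get_body("INBOX", 7)
cache.get_body("INBOX", 7)   # second call is served from the local copy
print(len(calls))            # 1
```

The second request never reaches the provider, which is exactly the carbon-saving cache hit in the ACFP equation above.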

Implementation In addition to the modifications made to SquirrelMail, additional scripts were written to let users quickly and easily set up their own cache. Separate directories are created for each user because of how SquirrelMail stores IMAP configurations.

Results We are currently collecting data from real users, as there is no established test suite or benchmark that models users accessing e-mail. In the future, if a good user ‘profile’ is found, it should be possible to automate this (x% spam, y% accessed frequently, etc.).
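
If such a profile emerges, a synthetic trace generator could drive automated experiments. A hedged sketch (every parameter here is a placeholder, not a measured value):

```python
import random

def synthetic_workload(n_accesses, n_messages,
                       hot_fraction=0.1, hot_prob=0.8, seed=0):
    """Toy user profile: hot_fraction of the mailbox receives
    hot_prob of all accesses; the rest go to the long tail."""
    rng = random.Random(seed)                 # seeded for reproducible runs
    hot = max(1, int(n_messages * hot_fraction))
    trace = []
    for _ in range(n_accesses):
        if rng.random() < hot_prob:
            trace.append(rng.randrange(hot))               # frequently read
        else:
            trace.append(rng.randrange(hot, n_messages))   # rarely read
    return trace

trace = synthetic_workload(1000, 500)
```

Feeding such traces to candidate replacement algorithms would let the miss rate r be estimated before deploying to real users.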

Example Locality Analysis (not from Greenmail)

Future Work Heavy data analysis. Cache algorithms. Caching headers. Caching searches (used to limit the mailbox refresh rate). Zoolander backend.

Questions/References [1] The AMD Opteron Processor Helps AISO. [2] N. Holdings. The Nielsen Global Online Environmental Survey, 2011.