1 Rules of Thumb in Data Engineering Jim Gray International Conference on Data Engineering San Diego, CA 4 March 2000

1 Rules of Thumb in Data Engineering. Jim Gray, International Conference on Data Engineering, San Diego, CA, 4 March 2000.

2 Credits & Thank You!! Prashant Shenoy, U. Mass Amherst: analysis of web caching rules. Terrance Kelly, U. Michigan: lots of advice on fixing the paper, and interesting work on caching. Dave Lomet, Paul Larson, Surajit Chaudhuri: how big should database pages be? Remzi Arpaci-Dusseau, Kim Keeton, Erik Riedel: discussions about balanced systems and IO. Windsor Hsu, Alan Smith, & Honesty Young also studied TPC-C and balanced systems (very nice work!). Anastassia Ailamaki, Kim Keeton: CPI measurements. Gordon Bell: discussions on balanced systems.

3 Whoops! and Apology... The printed/published paper has MANY bugs! Conclusions OK (sort of), but typos, flaws, errors,... A revised version will be in CoRR and the MS Research tech report archive by 15 March. Sorry!

4 Outline: Moore's Law and consequences; Storage rules of thumb; Balanced systems rules revisited; Networking rules of thumb; Caching rules of thumb.

5 Trends: Moore's Law. Performance/Price doubles every 18 months: 100x per decade. Progress in the next 18 months = ALL previous progress. New storage = sum of all old storage (ever). New processing = sum of all old processing. (E. coli doubles every 20 minutes!)

6 Trends: ops/s/$ had three growth phases. Mechanical, relay: 7-year doubling. Tube, transistor: 2.3-year doubling. Microprocessor: 1.0-year doubling.

7 Trends: Gilder's Law: 3x bandwidth/year for 25 more years. Today: 10 Gbps per channel; 4 channels per fiber = 40 Gbps; 32 fibers/bundle = 1.2 Tbps/bundle. In the lab: 3 Tbps/fiber (400x WDM). In theory: 25 Tbps per fiber. 1 Tbps = USA 1996 WAN bisection bandwidth. Aggregate bandwidth doubles every 8 months!

8 Trends: Magnetic Storage Densities. Amazing progress, but the ratios have changed: capacity grows 60%/year, while access speed grows 10x more slowly.

9 Trends: Density Limits. The end is near! Products: 11 Gbpsi. Lab: 35 Gbpsi. "Limit": 60 Gbpsi. But the limit keeps rising, and there are alternatives: NEMS, fluorescent, holographic, DNA? [Figure: bit density (Gb/in2 and b/µm2) vs time, with magnetic disk approaching the superparamagnetic limit and CD/DVD/ODD approaching the wavelength limit. Adapted from Franco Vitaliano, "The NEW new media: the growing attraction of nonmagnetic storage", Data Storage, Feb 2000, pp 21-32.]

10 Trends: the promise of NEMS (Nano Electro Mechanical Systems) (also Cornell, IBM, CMU, ...): 250 Gbpsi by using a tunneling electron microscope. A disk replacement. Capacity: 180 GB now, 1.4 TB in 2 years. Transfer rate: 100 MB/sec read & write. Latency: 0.5 msec. Power: 23 W active, 0.05 W standby. 10 k$/TB now, 2 k$/TB in 2002.

11 Consequence of Moore's law: need an address bit every 18 months, since Moore's law gives you 2x more in 18 months. RAM: today we have 10 MB to 100 GB machines (24-36 bits of addressing); in 9 years we will need 6 more bits: 30-42 bit addressing (4 TB RAM). Disks: today we have 10 GB to 100 TB file systems/DBs (33-47 bit file addresses); in 9 years we will need 6 more bits: 39-53 bit file addresses (100 PB files).
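A minimal sketch of the addressing arithmetic on this slide (Python is used for all the worked examples added to these notes; the 18-month doubling period is the slide's assumption, and the function name is mine):

```python
# One more address bit is needed per capacity doubling.
def extra_address_bits(years, doubling_months=18):
    return years * 12 / doubling_months

print(extra_address_bits(9))      # 6.0 -> 6 more address bits in 9 years
print(2**42 // 2**40, "TB")       # a 42-bit byte address space is 4 TB
```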

12 Architecture could change this. 1-level store: System 38 / AS400 has a 1-level store; it never re-uses an address, so it needs 96-bit addressing today. NUMAs and clusters: willing to buy a 100 M$ computer? Then add 6 more address bits. Only a 1-level store pushes us beyond 64 bits. Still, these are "logical" addresses; 64-bit physical will last many years.

13 Outline: Moore's Law and consequences; Storage rules of thumb; Balanced systems rules revisited; Networking rules of thumb; Caching rules of thumb.

14 Storage Latency: How Far Away is the Data? [Figure: the storage hierarchy as an analogy of travel time from the processor: registers and on-chip cache are "my head"/"this room" (about a minute), on-board cache is "this hotel" (~10 min), memory is Olympia (~1.5 hr), disk is Pluto (~2 years), and the tape/optical robot is Andromeda (~2,000 years).]

15 Storage Hierarchy: Speed & Capacity vs Cost Tradeoffs. [Two figures: typical system size (bytes) vs access time (seconds), and price ($/MB) vs access time, for cache, main memory, secondary storage (disk), online tape, nearline tape, and offline tape.]

16 Disks: Today. Disk is 8 GB to 80 GB, with transfer rates of tens of MBps; 5k-15k rpm (6 ms-2 ms rotational latency); 12 ms-7 ms seek; 7 k$/IDE-TB, 20 k$/SCSI-TB. For shared disks, most time is spent waiting in queue for access to the arm/controller. [Figure: a disk access = wait + seek + rotate + transfer.]

17 Standard Storage Metrics. Capacity: RAM in MB and $/MB, today at 512 MB and 3 $/MB; Disk in GB and $/GB, today at 40 GB and 20 $/GB; Tape in TB and $/TB, today at 40 GB and 10 k$/TB (nearline). Access time (latency): RAM 100 ns; Disk 15 ms; Tape 30 second pick, 30 second position. Transfer rate: RAM 1-10 GB/s; Disk tens of MB/s, arrays can go to 10 GB/s; Tape 5-15 MB/s, arrays can go to 1 GB/s.

18 New Storage Metrics: Kaps, Maps, SCAN. Kaps: how many kilobyte objects served per second; the file server / transaction processing metric; this is the OLD metric. Maps: how many megabyte objects served per second; the multi-media metric. SCAN: how long to scan all the data; the data mining and utility metric. And Kaps/$, Maps/$, TBscan/$.

19 For the Record (good 1999 devices packaged in a system). [Table: capacity, price, Kaps, Maps, and scan time for RAM, disk, and tape; the tape is 1 TB with 4 DLT readers at 5 MBps each.]

20 For the Record (good 1999 devices packaged in a system), continued. [Table: the same devices on a per-dollar basis; the tape is 1 TB with 4 DLT readers at 5 MBps each.]

21 Disk Changes. Disks got cheaper: 20 k$ -> 1 k$ (or even 200$). $/Kaps etc. improved 100x (Moore's law!) (or even 500x); a one-time event (went from mainframe prices to PC prices). Disk data got cooler (10x per decade): a 1990 disk was ~1 GB, 50 Kaps, 5 minute scan; a 2000 disk is ~70 GB, 120 Kaps, 45 minute scan. So 1990: 1 Kaps per 20 MB; 2000: 1 Kaps per 500 MB. Disk scans take longer (10x per decade), so backup/restore takes a long time (too long).
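A sketch of the "data cools 10x per decade" arithmetic using the slide's round numbers; the sequential transfer rates here are back-solved assumptions consistent with the quoted scan times:

```python
disks = {
    "1990": {"capacity_GB": 1,  "kaps": 50,  "seq_MBps": 3.5},
    "2000": {"capacity_GB": 70, "kaps": 120, "seq_MBps": 26.0},
}
for year, d in disks.items():
    mb_per_kaps = d["capacity_GB"] * 1000 / d["kaps"]             # MB of data per Kaps of access capability
    scan_minutes = d["capacity_GB"] * 1000 / d["seq_MBps"] / 60   # minutes to read the whole disk
    print(f"{year}: 1 Kaps per {mb_per_kaps:.0f} MB, full scan ~{scan_minutes:.0f} min")
# 1990: 1 Kaps per 20 MB, ~5 min scan;  2000: 1 Kaps per ~580 MB, ~45 min scan
```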

22 Storage Ratios Changed. 10x better access time; 10x more bandwidth; 100x more capacity; data 25x cooler (1 Kaps/20 MB vs 1 Kaps/500 MB); 4,000x lower media price; 20x to 100x lower disk price; scan takes 10x longer (3 min vs 45 min). The DRAM/disk media price ratio has shifted over the years; today it is about 100:1 (3 $/MB DRAM vs ~0.03 $/MB disk).

23 Data on Disk Can Move to RAM in 10 Years: the DRAM:disk price ratio is 100:1, and prices fall 100x per decade, so the gap closes in about 10 years.

24 More Kaps and Kaps/$, but... Disk accesses got much less expensive: better disks, cheaper disks! But disk arms are expensive: they are the scarce resource. A 45 minute scan today vs 5 minutes in 1990 (a ~70 GB disk at 30 MB/s).

25 Disk vs Tape. Disk: 40 GB, 20 MBps, 5 ms seek time, 3 ms rotate latency, 7 $/GB for the drive, 3 $/GB for controllers/cabinet, 4 TB/rack, 1 hour scan. Tape: 40 GB, 10 MBps, 10 sec pick time, many-second seek time, 2 $/GB for media, 8 $/GB for drive+library, 10 TB/rack, 1 week scan. The price advantage of tape is narrowing, and the performance advantage of disk is growing. At 10 k$/TB, disk is competitive with nearline tape. (Guesstimates. CERN: 200 TB of 3480 tapes; 2 columns = 50 GB; rack = 1 TB = 20 drives.)
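A rough scan-time comparison built from the per-device numbers above; the drive counts per rack are assumptions, and seeks, picks, and controller limits are ignored, so these are lower bounds on the slide's round figures:

```python
def scan_hours(capacity_TB, drives, MBps_per_drive):
    return capacity_TB * 1e6 / (drives * MBps_per_drive) / 3600

print(scan_hours(4, 100, 20))              # disk rack (100 x 40 GB drives): ~0.6 h raw, i.e. about an hour with overheads
print(scan_hours(10, 4, 10) / 24, "days")  # tape rack with 4 readers: ~3 days raw; fewer readers and pick time push it toward a week
```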

26 Caveat: tape vendors may innovate. Sony DTF-2 is 100 GB, 24 MBps, 30 second pick time. So, 2x better. Prices not clear.

27 It's Hard to Archive a Petabyte. It takes a LONG time to restore it: at 1 GBps it takes 12 days! Store it in two (or more) places online (on disk?): a geo-plex. Scrub it continuously (look for errors). On failure, use the other copy until the failure is repaired, then refresh the lost copy from the safe copy. The two copies can be organized differently (e.g., one by time, one by space).
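The 12-day figure is just the raw copy time for a petabyte at 1 GBps; a one-line check:

```python
petabyte_bytes = 1e15
rate_bytes_per_s = 1e9
print(petabyte_bytes / rate_bytes_per_s / 86_400, "days")   # ~11.6 days, before any overheads
```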

28 The "Absurd" 10x (= 5 year) Disk: 1 TB, 100 MB/s, 200 Kaps. 2.5 hr scan time (poor sequential access); 1 aps per 5 GB (VERY cold data). It's a tape!

29 How to cool disk data: Cache data in main memory (see the 5 minute rule later in the presentation). Fewer, larger transfers; larger pages (512 B -> 8 KB -> 256 KB). Sequential rather than random access: random 8 KB IO is 1.5 MBps, sequential IO is 30 MBps (a 20:1 ratio, and growing). RAID1 (mirroring) rather than RAID5 (parity).
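The random-vs-sequential gap quoted above, made explicit (the implied random IO rate is an inference from the slide's 1.5 MBps figure):

```python
random_MBps = 1.5                       # 8 KB random IOs per the slide
sequential_MBps = 30
print(sequential_MBps / random_MBps)    # 20.0 -> the 20:1 ratio, and it keeps growing
print(random_MBps * 1e6 / 8e3)          # ~187 random 8 KB IOs/s implied per drive
```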

30 Stripes, Mirrors, Parity (RAID 0, 1, 5). RAID 0: stripes, for bandwidth. RAID 1: mirrors, shadows, ...; fault tolerance; reads faster, writes 2x slower. RAID 5: parity; fault tolerance; reads faster; writes 4x or 6x slower. [Figure: block layouts for striped, mirrored, and parity-protected volumes.]

31 RAID 10 (stripes of mirrors) Wins: "wastes space, saves arms". RAID 5 (6 disks, 1 volume): performance 675 reads/sec, 210 writes/sec; a write is ~4 logical IOs, ~2 seeks + rotate; SAVES SPACE; performance degrades on failure. RAID 1 (6 disks, 3 pairs): performance 750 reads/sec, 300 writes/sec; a write is ~2 logical IOs, ~2 seeks, 0.7 rotate; SAVES ARMS; performance improves on failure.
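A sketch of the write-cost argument behind "wastes space, saves arms"; the per-drive random IO rate is an assumption, and the 4x vs 2x physical-write costs are the slide's, so the result only roughly tracks the measured 210 vs 300 writes/sec:

```python
drive_iops, drives = 125, 6
physical_iops = drive_iops * drives   # ~750 random IOs/s across all the arms

raid5_writes = physical_iops / 4      # small write = read data + read parity + write data + write parity
raid1_writes = physical_iops / 2      # write = write both mirrors
print(raid5_writes, raid1_writes)     # ~188 vs ~375 writes/s: mirroring spends disks to save arms
```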

32 [Figure: shows the best index page size is ~16 KB.]

33 Auto Manage Storage. 1980 rule of thumb: a DataAdmin per 10 GB, a SysAdmin per MIPS. 2000 rule of thumb: a DataAdmin per 5 TB, a SysAdmin per 100 clones (varies with the app). Problem: 5 TB is 60 k$ today, 10 k$ in a few years, so admin cost >> storage cost!!!! Challenge: automate ALL storage admin tasks.

34 Summarizing storage rules of thumb (1). Moore's law: 4x every 3 years, 100x more per decade; implies 2 bits of addressing every 3 years. Storage capacities increase 100x/decade. Storage costs drop 100x per decade. Storage throughput increases 10x/decade. Data cools 10x/decade. Disk page sizes increase 5x per decade.

35 Summarizing storage rules of thumb (2). RAM:Disk and Disk:Tape cost ratios are 100:1 and 3:1. So, in 10 years, disk data can move to RAM, since prices decline 100x per decade. A person can administer a million dollars of disk storage: that is 1 TB - 100 TB today. Disks are replacing tapes as backup devices. You can't backup/restore a petabyte quickly, so geoplex it. Mirroring rather than parity, to save disk arms.

36 Outline: Moore's Law and consequences; Storage rules of thumb; Balanced systems rules revisited; Networking rules of thumb; Caching rules of thumb.

37 Standard Architecture (today). [Diagram: the system bus, bridged to PCI Bus 1 and PCI Bus 2.]

38 Amdahl's Balance Laws. Parallelism law: if a computation has a serial part S and a parallel component P, then the maximum speedup is (S+P)/S. Balanced system law: a system needs a bit of IO per second per instruction per second: about 8 MIPS per MBps. Memory law: α = 1; the MB/MIPS ratio (called alpha) in a balanced system is 1. IO law: programs do one IO per 50,000 instructions.
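The balanced-system and IO laws as code, just to show how the ratios are applied (an interpretation sketch using the slide's constants, not a formal model):

```python
def io_MBps_needed(mips):
    """Balanced system law: about 8 MIPS per MBps of IO."""
    return mips / 8

def ios_per_second(mips, ins_per_io=50_000):
    """IO law: one IO per ~50k instructions."""
    return mips * 1e6 / ins_per_io

print(io_MBps_needed(2_000))    # a 2 Gips cpu wants ~250 MBps of IO
print(ios_per_second(2_000))    # ...and issues ~40,000 IOs per second
```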

39 Amdahl's Laws Valid 35 Years Later? The parallelism law is algebra: so SURE! The balanced system laws? Look at TPC results (TPC-C, TPC-H). Some imagination is needed: What's an instruction (CPI varies from 1-3)? RISC, CISC, VLIW, ... clocks per instruction, ... What's an I/O?

40 Disks per cpu in TPC systems. Normalize for CPI (clocks per instruction): TPC-C has about 7 ins/byte of IO, TPC-H has 3 ins/byte of IO. TPC-H needs half as many disks (sequential vs random). Both use 9 GB 10 krpm disks (they need arms, not bytes). [Table: MHz/cpu, CPI, mips, KB/IO, IO/s per disk, disks, MB/s per cpu, and ins/byte of IO, for the Amdahl baseline, TPC-C (random), and TPC-H (sequential).]

41 TPC systems: What's alpha (= MB/MIPS)? Hard to say: Intel has 32-bit addressing (= 4 GB limit) and known CPI; IBM, HP, Sun have a 64 GB limit and unknown CPI. Look at both, guessing CPI for IBM, HP, Sun: alpha is between 1 and 6. Amdahl baseline: 1 mips, 1 MB, alpha 1. tpcC Intel: 8 x 262 = 2 Gips, 4 GB, alpha 2. tpcH Intel: 8 x 458 = 4 Gips, 4 GB, alpha 1. tpcC IBM: 24 cpus ≈ 12 Gips, 64 GB, alpha 6. tpcH HP: 32 cpus ≈ 16 Gips, 32 GB, alpha 2.

42 Instructions per IO? We know 8 mips per MBps of IO. So an 8 KB page is 64 K instructions, and a 64 KB page is 512 K instructions. But sequential IO has fewer instructions per byte (3 vs 7 in TPC-H vs TPC-C). So a sequential 64 KB page is about 200 K instructions.
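The same arithmetic spelled out (using the slide's ~8 and ~3 instructions-per-byte figures):

```python
ins_per_byte_random, ins_per_byte_seq = 8, 3    # TPC-C-like random vs TPC-H-like sequential
KB = 1024
print(ins_per_byte_random * 8 * KB)     # 65,536  -> ~64 K instructions per 8 KB random IO
print(ins_per_byte_random * 64 * KB)    # 524,288 -> ~512 K instructions per 64 KB IO
print(ins_per_byte_seq * 64 * KB)       # 196,608 -> ~200 K instructions per 64 KB sequential IO
```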

43 Amdahl's Balance Laws Revised. The laws are right, they just need "interpretation" (imagination?). Balanced system law: a system needs 8 MIPS per MBps of IO, but the instruction rate must be measured on the workload; sequential workloads have low CPI (clocks per instruction), random workloads tend to have higher CPI. Alpha (the MB/MIPS ratio) is rising from 1 to 6; this trend will likely continue. One random IO per 50 k instructions; sequential IOs are larger, so one sequential IO per 200 k instructions.

44 PAP vs RAP: Peak Advertised Performance vs Real Application Performance. [Diagram: advertised vs delivered throughput down the IO path from application and file system to disks: CPU 4 x 550 MHz ≈ 2 Bips peak vs 1-3 cpi delivered; system bus 1600 MBps vs 500 MBps; PCI 133 MBps vs 90 MBps; SCSI 160 MBps vs 90 MBps; disks 66 MBps vs 25 MBps.]

45 Outline: Moore's Law and consequences; Storage rules of thumb; Balanced systems rules revisited; Networking rules of thumb; Caching rules of thumb.

46 Standard IO (Infiniband™) in 5 Years? Probably. Replace PCI with something better; we will still need a mezzanine bus standard. Multiple serial links directly from the processor. Fast (10 GBps/link) for a few meters. System Area Networks (SANs) ubiquitous (VIA morphs to SIO?).

47 1 GBps Ubiquitous, 10 GBps SANs in 5 years. 1 Gbps Ethernet is a reality now; also FiberChannel, MyriNet, GigaNet, ServerNet, ATM, ... 10 Gbps x4 WDM deployed now (OC192); 3 Tbps WDM working in the lab. In 5 years, expect 10x, wow!! [Figure: link speeds stepping up from 5, 20, 40, and 80 MBps to 120 MBps (1 Gbps).]

48 Networking. WANs are getting faster than LANs: G8 = OC192 = 8 Gbps is "standard". Link bandwidth improves 4x per 3 years. The speed of light is fixed (60 ms round trip in the US). Software stacks have always been the problem: Time = SenderCPU + ReceiverCPU + bytes/bandwidth, and the sender/receiver CPU terms have been the problem.

49 The Promise of SAN/VIA: 10x in 2 years. Yesterday: 10 MBps (100 Mbps Ethernet); ~20 MBps of tcp/ip saturates 2 cpus; round-trip latency ~250 µs. Now: wires are 10x faster (Myrinet, Gbps Ethernet, ServerNet, ...); fast user-level communication: tcp/ip ~100 MBps at 10% cpu; round-trip latency is 15 µs; 1.6 Gbps demoed on a WAN.

50 How much does wire-time cost ($/MByte)? Gbps Ethernet: 0.2 µ$, 10 ms. 100 Mbps Ethernet: 0.3 µ$, 100 ms. OC12 (650 Mbps): 0.003 $, 20 ms. DSL: 0.0006 $, 25 sec. POTS: 0.002 $, 200 sec. Wireless: 0.80 $, 500 sec.
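The time half of this table is just 1 MB divided by the link rate; a sketch that reproduces the order of magnitude (raw wire time only; the DSL and POTS rates are assumptions, and the dollar column depends on link rental costs not modeled here):

```python
links_Mbps = {
    "Gbps Ethernet": 1000,
    "100 Mbps Ethernet": 100,
    "OC12": 650,
    "DSL": 0.3,      # assumed ~300 kbps
    "POTS": 0.04,    # assumed ~40 kbps modem
}
for name, mbps in links_Mbps.items():
    print(f"{name}: {8 / mbps:.3g} s per MB")   # 1 MB = 8 Mbits; protocol overhead pushes real times a bit higher
```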

51 The Network Revolution. Networking folks are finally streamlining the LAN case (SAN): offloading protocol to the NIC. The half-power point is 8 KB. Minimum round-trip latency is ~50 µs. Cost ≈ 3 k instructions + 0.1 instructions/byte. (High-Performance Distributed Objects over a System Area Network; Li, L., Forin, A., Hunt, G., Wang, Y., MSR-TR-98-68.)

52 Outline: Moore's Law and consequences; Storage rules of thumb; Balanced systems rules revisited; Networking rules of thumb; Caching rules of thumb.

53 The Five Minute Rule: Trade DRAM for Disk Accesses. Cost of an access: Drive_Cost / Accesses_per_second. Cost of a DRAM page: RAM_$_per_MB / pages_per_MB. The break-even has two terms: a technology term and an economic term. Page size grew to compensate for the changing ratios. Now at 5 minutes for random, 10 seconds for sequential.

54 The 5 Minute Rule Derived. Cost of keeping a page in RAM: RAM_$_per_MB / PagesPerMB. Cost of a disk access, amortized over the interval T between references to the page: (DiskPrice / AccessesPerSecond) / T. Breakeven: RAM_$_per_MB / PagesPerMB = DiskPrice / (AccessesPerSecond x T), so T = (DiskPrice x PagesPerMB) / (RAM_$_per_MB x AccessesPerSecond).
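A small sketch that plugs the talk's year-2000 round numbers into the break-even formula above (prices and access rates are the figures used on the next slide):

```python
def break_even_seconds(disk_price, accesses_per_second, ram_price_per_MB, pages_per_MB):
    return disk_price * pages_per_MB / (ram_price_per_MB * accesses_per_second)

# Random 8 KB pages: 128 pages/MB, ~120 accesses/s, ~1000$ disk, 3 $/MB RAM
print(break_even_seconds(1000, 120, 3, 128) / 60)   # ~5.9 minutes -> the 5 minute rule
# Sequential 1 MB transfers: 1 page/MB, ~30 accesses/s
print(break_even_seconds(1000, 30, 3, 1))           # ~11 seconds -> the 10 second rule
```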

55 Plugging in the Numbers. Random: PagesPerMB/aps = 128/120 ≈ 1; disk$/RAM$ = 1000/3 ≈ 300; break-even ≈ 5 minutes. Sequential: PagesPerMB/aps = 1/30 ≈ 0.03; disk$/RAM$ ≈ 300; break-even ≈ 10 seconds. The trend is toward longer times, because disk$ is not changing much while RAM$ declines 100x/decade. Hence the 5 minute & 10 second rules.

56 When to Cache Web Pages. Caching saves user time. Caching saves wire time. Caching costs storage. Caching only works sometimes: new pages are a miss, stale pages are a miss.

57 The 10 Instruction Rule: spend up to 10 instructions per second to save 1 byte. Cost of an instruction per second: I = ProcessorCost / (MIPS x LifeTime). Cost of a byte: B = RAM_$_per_byte / LifeTime. Breakeven: N x I = B, so N = B/I = (RAM_$_per_byte x MIPS) / ProcessorCost ≈ (3E-6 x 5E8)/500 = 3 ins/B for Intel; ≈ (3E-6 x 3E8)/10 = 10 ins/B for ARM.
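The break-even instruction budget from the formula above, reproducing the slide's Intel example (3E-6 $/byte of RAM, 5E8 instructions/s, 500$ processor):

```python
def ins_per_byte(ram_price_per_byte, instructions_per_second, processor_cost):
    # From N x I = B with I = processor_cost / (ips x lifetime) and B = ram_$/byte / lifetime.
    return ram_price_per_byte * instructions_per_second / processor_cost

print(ins_per_byte(3e-6, 5e8, 500))   # 3.0 -> ~3 instructions per byte for the Intel case
```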

58 Web Page Caching Saves People Time. Assume people cost 20 $/hour (or 0.2 $/hr???). Assume a 20% hit rate in the browser and 40% in the proxy. Assume 3 seconds of server time. Caching then saves 28 $/year to 150 $/year of people time (or 28 cents to 1.5 $/year at the cheap-people rate).

59 Web Page Caching Saves Resources. Wire cost is a penny (wireless) to 100 µ$ (LAN). Storage rent is 8 µ$/month. Breakeven (wire cost = storage rent): 4 to 7 months. Add people cost: breakeven is ~4 years; with "cheap people" (0.2 $/hr), 6 to 8 months.

60 Caching. Disk caching: the 5 minute rule for random IO, the 11 second rule for sequential IO. Web page caching: if the page will be re-referenced within 18 months (with free users) or 15 years (with valuable users), then cache the page in the client/proxy. Challenge: guessing which pages will be re-referenced, and detecting stale pages (page velocity).

61 Outline: Moore's Law and consequences; Storage rules of thumb; Balanced systems rules revisited; Networking rules of thumb; Caching rules of thumb.