1 The 5 Minute Rule. Jim Gray, Microsoft Research. Kilo 10^3, Mega 10^6, Giga 10^9, Tera 10^12 (today, we are here), Peta 10^15, Exa 10^18
2 Storage Hierarchy (9 levels): Cache 1, 2. Main (1, 2, 3 if NUMA). Disk (1 (cached), 2). Tape (1 (mounted), 2).
3 Meta-Message: Technology Ratios Are Important. If everything gets faster & cheaper at the same rate THEN nothing really changes. Things getting MUCH BETTER: –communication speed & cost 1,000x –processor speed & cost 100x –storage size & cost 100x. Things staying about the same: –speed of light (more or less constant) –people (10x more expensive) –storage speed (only 10x better)
4 Today's Storage Hierarchy: Speed & Capacity vs Cost Tradeoffs [Figure: two panels. "Size vs Speed" plots typical system size (bytes) against access time (seconds); "Price vs Speed" plots $/MB against access time (seconds). Levels shown: Cache, Main, Secondary Disc, Online Tape, Nearline Tape, Offline Tape.]
5 Storage Ratios Changed: 10x better access time. 10x more bandwidth. 4,000x lower media price. DRAM/DISK 100:1 to 10:10 to 50:1
6 Thesis: Performance = Storage Accesses, not Instructions Executed. In the old days we counted instructions and IOs. Now we count memory references. Processors wait most of the time.
7 The Pico Processor: 1 M SPECmarks. 10^6 clocks/fault to bulk RAM. Event-horizon on chip. VM reincarnated. Multi-program cache. Terror Bytes!
8 Storage Latency: How Far Away is the Data? [Figure: each storage level paired with a travel analogy. Registers: My Head (1 min). On Chip Cache: This Room. On Board Cache: This Campus (10 min). Memory: Sacramento (1.5 hr). Disk: Pluto (2 Years). Tape/Optical Robot: Andromeda (2,000 Years).]
9 The Five Minute Rule. Trade DRAM for Disk Accesses. Cost of an access (DriveCost / Access_per_second). Cost of a DRAM page ($/MB / pages_per_MB). Break even has two terms: Technology term and an Economic term. Grew page size to compensate for changing ratios. Still at 5 minutes for random, 1 minute sequential.
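The two costs on this slide can be combined into a break-even interval. A page that is re-read every T seconds ties up 1/T of an access per second, so caching it pays off when (DriveCost / Access_per_second) / T exceeds the cost of a DRAM page. The sketch below solves for T and splits it into the technology and economic terms. The input numbers are illustrative assumptions chosen for the example, not values taken from the slides.

```python
def break_even_seconds(pages_per_mb, accesses_per_second, drive_price, dram_price_per_mb):
    """Reference interval (seconds) at which holding a page in DRAM
    costs the same as re-reading it from disk on every reference."""
    # Technology term: how many DRAM pages one disk arm's access rate is "worth".
    technology_term = pages_per_mb / accesses_per_second
    # Economic term: price of a drive relative to the price of 1 MB of DRAM.
    economic_term = drive_price / dram_price_per_mb
    return technology_term * economic_term


# Illustrative inputs (assumptions): 8 KB pages (128 per MB),
# 64 random accesses/s per drive, a 2,000$ drive, DRAM at 15$/MB.
interval = break_even_seconds(
    pages_per_mb=128,
    accesses_per_second=64,
    drive_price=2000.0,
    dram_price_per_mb=15.0,
)
print(f"break-even interval: {interval:.0f} s ({interval / 60:.1f} min)")
```

With these assumed inputs the interval comes out a little under five minutes. Doubling the page size halves pages_per_mb and therefore halves the interval, which is one way to read the remark about growing the page size to compensate for changing ratios.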
10 Shows Best Index Page Size ~16KB
11 Standard Storage Metrics. Capacity: –RAM: MB and $/MB: today at 10MB & 100$/MB –Disk: GB and $/GB: today at 10 GB and 200$/GB –Tape: TB and $/TB: today at 0.1 TB and 25k$/TB (nearline). Access time (latency): –RAM: 100 ns –Disk: 10 ms –Tape: 30 second pick, 30 second position. Transfer rate: –RAM: 1 GB/s –Disk: 5 MB/s (arrays can go to 1GB/s) –Tape: 5 MB/s (striping is problematic)
12 New Storage Metrics: Kaps, Maps, SCAN? Kaps: How many kilobyte objects served per second –The file server, transaction processing metric –This is the OLD metric. Maps: How many megabyte objects served per second –The Multi-Media metric. SCAN: How long to scan all the data –the data mining and utility metric. And –Kaps/$, Maps/$, TBscan/$
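One way to make the three metrics concrete is to estimate them for a single device from its latency, transfer rate, and capacity. The sketch below uses the disk figures from slide 11 (10 ms, 5 MB/s, 10 GB). The model itself is an assumption made for illustration: one request at a time, each costing one access latency plus the transfer time, with decimal units.

```python
KB = 1_000
MB = 1_000_000
GB = 1_000_000_000


def objects_per_second(object_bytes, latency_s, bytes_per_s):
    """Objects served per second when each request pays one access
    latency plus the time to transfer the object (no overlap)."""
    return 1.0 / (latency_s + object_bytes / bytes_per_s)


def scan_seconds(capacity_bytes, bytes_per_s):
    """Time to read the whole device sequentially."""
    return capacity_bytes / bytes_per_s


# Disk figures from slide 11: 10 ms access, 5 MB/s transfer, 10 GB capacity.
disk_latency_s = 0.010
disk_bytes_per_s = 5 * MB
disk_capacity = 10 * GB

kaps = objects_per_second(KB, disk_latency_s, disk_bytes_per_s)
maps = objects_per_second(MB, disk_latency_s, disk_bytes_per_s)
scan = scan_seconds(disk_capacity, disk_bytes_per_s)

print(f"Kaps: {kaps:.0f}/s  Maps: {maps:.1f}/s  SCAN: {scan:.0f} s")
```

Under this model Kaps is limited almost entirely by access latency, while Maps and SCAN are limited by transfer rate, which is why the three metrics reward different device properties.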
13 For the Record (good 1998 devices packaged in a system) [Table: values not preserved.]
14 How To Get Lots of Maps, SCANs. Parallelism: use many little devices in parallel. Beware of the media myth. Beware of the access time myth. At 10 MB/s: 1.2 days to scan. 1,000x parallel: 100 seconds SCAN. Parallelism: divide a big problem into many smaller ones to be solved in parallel.
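The scan arithmetic on this slide can be checked with a short sketch. It assumes a 1 TB data set, which is the size that the 1.2 days at 10 MB/s figure implies, and it assumes the data is spread evenly over the devices with no coordination overhead.

```python
TB = 10**12
MB = 10**6
SECONDS_PER_DAY = 86_400


def scan_seconds(data_bytes, bytes_per_s_per_device, devices=1):
    """Time to scan the data when it is split evenly across `devices`
    that all read at the same rate in parallel."""
    return data_bytes / (bytes_per_s_per_device * devices)


serial = scan_seconds(1 * TB, 10 * MB)                   # one 10 MB/s device
parallel = scan_seconds(1 * TB, 10 * MB, devices=1000)   # 1,000 such devices

print(f"serial:   {serial / SECONDS_PER_DAY:.1f} days")
print(f"parallel: {parallel:.0f} seconds")
```

The speedup is linear only because each device scans its own slice independently; any skew in how the data is partitioned would make the slowest device set the scan time.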
15 The Disk Farm On a Card. The 100GB disc card: an array of discs (figure label: 14"). Can be used as 100 discs, 1 striped disc, 10 Fault Tolerant discs, ....etc. LOTS of accesses/second, bandwidth. Life is cheap, it's the accessories that cost ya. Processors are cheap, it's the peripherals that cost ya (a 10k$ disc card).
16 Tape Farms for Tertiary Storage, Not Mainframe Silos. Many independent tape robots (like a disc farm). One 10K$ robot: 14 tapes, 500 GB, 5 MB/s, 20$/GB, 30 Maps; scan in 27 hours. 100 robots: 1M$, 50TB, 50$/GB, 3K Maps, 27 hr Scan.
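The reason the farm's scan time stays at 27 hours while its capacity grows 100x is that every robot scans only its own tapes. A minimal sketch of that scaling, using the per-robot figures from this slide and assuming 1 GB = 1,000 MB and fully independent robots:

```python
from dataclasses import dataclass


@dataclass
class TapeRobot:
    # Per-robot figures from the slide.
    price_dollars: float = 10_000
    capacity_gb: float = 500
    transfer_mb_per_s: float = 5

    def scan_hours(self):
        """Hours for this robot to read all of its own tapes."""
        return self.capacity_gb * 1000 / self.transfer_mb_per_s / 3600


def farm(robot, count):
    """Totals for `count` identical robots working independently."""
    return {
        "price_dollars": robot.price_dollars * count,
        "capacity_tb": robot.capacity_gb * count / 1000,
        # Robots scan in parallel, so the farm scan time equals one robot's.
        "scan_hours": robot.scan_hours(),
    }


robot = TapeRobot()
tape_farm = farm(robot, 100)
print(f"one robot scans in {robot.scan_hours():.1f} hours")
print(tape_farm)
```

A single large silo with a few readers shared by thousands of tapes would not scale this way, since scan time there grows with the number of tapes per reader.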
17 The Metrics: Disk and Tape Farms Win [Chart: GB/K$, Kaps, Maps, and SCANS/Day compared for a 1000x Disc Farm, an STC Tape Robot (6,000 tapes, 8 readers), and a 100x DLT Tape Farm; axis values not preserved.] Data Motel: Data checks in, but it never checks out
18 Tape & Optical: Beware of the Media Myth. Optical is cheap: 200 $/platter, 2 GB/platter => 100$/GB (2x cheaper than disc). Tape is cheap: 30 $/tape, 20 GB/tape => 1.5 $/GB (100x cheaper than disc).
19 Tape & Optical Reality: Media is 10% of System Cost. Tape needs a robot (10 k$ ... 3 M$) ... tapes (at 20GB each) => 20$/GB ... $/GB (1x…10x cheaper than disc). Optical needs a robot (100 k$), 100 platters = 200GB (TODAY) => 400 $/GB (more expensive than mag disc). Robots have poor access times. Not good for Library of Congress (25TB). Data motel: data checks in but it never checks out!
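The gap between media price and system price can be worked through for tape. The sketch below takes the media figures from slide 18 (30 $/tape, 20 GB/tape) and, as an assumption, treats the 10 k$, 500 GB robot from slide 16 as the complete system price.

```python
def dollars_per_gb(price_dollars, capacity_gb):
    """Price per gigabyte of storage."""
    return price_dollars / capacity_gb


# Media only (slide 18): one 30$ tape holding 20 GB.
media_per_gb = dollars_per_gb(30, 20)

# Whole system (assumption: the 10 k$ robot with 500 GB from slide 16
# is the full price of a working tape system).
system_per_gb = dollars_per_gb(10_000, 500)

media_share = media_per_gb / system_per_gb

print(f"media:  {media_per_gb:.2f} $/GB")
print(f"system: {system_per_gb:.2f} $/GB")
print(f"media is {media_share:.1%} of system cost")
```

Under these assumptions the media accounts for well under a tenth of the system price, the same order of magnitude as the 10% in the slide title.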
20 The Access Time Myth. The Myth: seek or pick time dominates. The reality: (1) Queuing dominates (2) Transfer dominates BLOBs (3) Disk seeks often short. Implication: many cheap servers better than one fast expensive server –shorter queues –parallel transfer –lower cost/access and cost/byte. This is now obvious for disk arrays. This will be obvious for tape arrays.
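Two of the "reality" points can be illustrated with rough numbers. The first function shows how much of a request is transfer time, using the disk figures from slide 11 (10 ms access, 5 MB/s). The second uses the textbook M/M/1 formula for mean response time; that queueing model is an assumption brought in for illustration and is not something the slides specify.

```python
KB = 1_000
MB = 1_000_000


def transfer_share(object_bytes, access_s, bytes_per_s):
    """Fraction of one request's time spent transferring data
    rather than seeking or picking."""
    transfer_s = object_bytes / bytes_per_s
    return transfer_s / (access_s + transfer_s)


def mm1_response_time(service_s, utilization):
    """Mean response time of an M/M/1 queue (assumed model):
    service time inflated by 1 / (1 - utilization)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_s / (1.0 - utilization)


small = transfer_share(1 * KB, 0.010, 5 * MB)   # kilobyte object
blob = transfer_share(1 * MB, 0.010, 5 * MB)    # megabyte BLOB
busy = mm1_response_time(0.010, 0.8)            # 10 ms service at 80% load

print(f"transfer share, 1 KB object: {small:.1%}")
print(f"transfer share, 1 MB BLOB:   {blob:.1%}")
print(f"response time at 80% load:   {busy * 1000:.0f} ms")
```

In this sketch a megabyte object spends most of its time in transfer, and at 80% utilization the time spent waiting in the queue is several times the raw access time, so a faster seek or pick would change little in either case.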