
1 EuroSpeech-2003: Where have we been and where are we going?
Kenneth.Church@jhu.edu
March 16, 2011
1. Consistent progress over decades
–Moore’s Law, Speech Coding, Error Rate
2. History repeats itself
–Empiricism: 1950s
–Rationalism: 1970s
–Empiricism: 1990s
–Rationalism: 2010s (?)
Discontinuities: Fundamental changes that invalidate fundamental assumptions
–Petabytes: $2,000,000 → $2,000 (in a decade)
–Can demand keep up with supply? If not → Tech meltdown
–New priorities: Supply-side economics → Demand-side economics
–Search >> Compression & Dictation

2 Old Priority: Speech Compression; New Priority: Speech Consumption
[Figure: Speech Quality (Bad, Poor, Fair, Good, Excellent) versus Bit Rate (kb/s) for ITU Recommendations, Cellular Standards, and Secure Telephony, with 1980, 1990, and 2000 profiles. Slide borrowed from Rich Cox.]

3–4 How much is a Petabyte? (10^15 bytes)
Question from execs:
–How do I explain to a lay audience
  How much is a petabyte
  And why everyone will buy lots of them
Wrong answer:
–10^6 is a million (a floppy disk/email msg)
–10^9 is a billion (a billion here, a billion there…)
–10^12 is a trillion (the US debt)
–10^15 is a zillion (=∞, an unimaginably large #)

5 Digital Immortality: Gordon Bell & Jim Gray (2000)
Estimated Lifetime Storage Requirements
Data type                            Per day   Per lifetime
email, papers, text                  0.5 MB    15 GB
photos                               2 MB      150 GB
speech                               40 MB     1.2 TB
music                                60 MB     5.0 TB
video-lite (200 Kb/s)                1 GB      100 TB
DVD video (4.3 Mb/s = 1.8 GB/hour)   20 GB     1 PB
Text won’t consume PB/person; Speech won’t either (but it’s closer)

6 Our Problem: Create Demand for DataScope (5PBs)
Text won’t do it
–Internet Archive: 2PBs + 20TBs/month
  www.archive.org/about/faqs.php
Speech won’t either (but it is closer)
–Worldwide Telephone Traffic: 1PB/day
  1PB/day = 3B hours of speech * 450 KBs/hour
  6B people use the phone for an hour/day (half of which is speech and half is silence)
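The telephone-traffic estimate on this slide can be checked with a few lines of arithmetic. The sketch below uses only the slide's own figures (6B people, one hour per day, half of it speech, 450 KB per hour of speech); the decimal units (1 PB = 10^15 bytes) and the function name are assumptions made for illustration.

```python
# Back-of-the-envelope check of the slide's estimate (decimal units assumed).
KB = 10**3
PB = 10**15

def daily_speech_bytes(people, phone_hours_per_person, speech_fraction, bytes_per_speech_hour):
    """Bytes of speech per day, given phone usage and a per-hour speech size."""
    speech_hours = people * phone_hours_per_person * speech_fraction
    return speech_hours * bytes_per_speech_hour

# 6B people * 1 hour/day * 50% speech = 3B hours of speech per day.
total = daily_speech_bytes(6e9, 1.0, 0.5, 450 * KB)
print(f"{total / PB:.2f} PB/day")  # prints "1.35 PB/day"
```

Under these assumptions the result is about 1.35 PB/day, which is the same order of magnitude as the slide's rounded 1PB/day.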

7 Take-Aways
Economics of Computing:
–New Bottlenecks: Supply → Demand
  Old Challenge: How do we get more computing? Disks? Cycles?
  New Challenge: How do we consume more?
New Challenge is Challenging
–Text won’t consume DataScope (5PBs)
–Speech won’t either (but it’s closer)
Text & Speech Sizes
–Internet Archive: 2PBs + 20TBs/month
–Worldwide Telephone Traffic: 1PB/day
New Bottlenecks: Storage → Transport
–Transporting Internet Archive is more challenging than storing it
–WANs → Sneakernet (Shipping Swappable Disks)

8 DataScope
Exciting (Expensive) Performance Servers
–With lots of oomph
Plus “Boring” (Cheap) Storage Servers
–With lots of slots for swappable disks
–(and not much else)
–(as little as we can get away with)

9 The Mobility Gap → Sneakernet
Everything is getting better with Moore’s Law, but some things are getting better faster
Resource            Growth per year   Growth per decade   Law
Bandwidth to Home   1.5x              57x                 Nielsen's Law
CPU                 1.6x              100x                Moore's Law
Internet Backbone   1.7x              256x
Disk Capacity       2.0x              1024x               Kryder's Law
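The per-year and per-decade columns in this table are related by compounding: a constant yearly factor g gives g^10 over a decade. The sketch below works backwards from the slide's per-decade figures to the implied yearly factor; the helper name is hypothetical, and constant year-over-year growth is an assumption.

```python
# Relate the table's per-decade growth factors to per-year factors by compounding.
# The decade figures are the slide's; constant yearly growth is assumed.
DECADE_GROWTH = {
    "Bandwidth to Home": 57,
    "CPU": 100,
    "Internet Backbone": 256,
    "Disk Capacity": 1024,
}

def annual_factor(decade_factor, years=10):
    """Constant yearly multiplier that compounds to decade_factor over `years`."""
    return decade_factor ** (1.0 / years)

for resource, decade in DECADE_GROWTH.items():
    print(f"{resource}: {annual_factor(decade):.1f}x per year, {decade}x per decade")
```

Rounded to one decimal place, the implied yearly factors come out as 1.5, 1.6, 1.7, and 2.0, in line with the table's per-year column.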

10 Sneakernet
Gray’s Question: What is the best way to move a terabyte from place to place?
Updated Version: What is the best way to move a petabyte from place to place?

11 DataScope Storage Servers
Moveable & non-moveable media
–Wires: better latency & convenience (low vol)
–Disks: better throughput & cost (high vol)
Hybrid Combination:
–Incremental Backup → Wires
–Large Restores → Sneakernet
–Import / Export → Sneakernet
  Users show up with 50 lbs of disks (carryon luggage)
  And use them productively within a day
Metrics: Space & Time → Power & Pounds

12 Amazon’s Import / Export (Scaled Up…)
Internet Speed   Time to Transfer 1PB at 80% Network Utilization   When to Consider Sneakernet
100 Mbps         1157 days                                         5TB or more
1 Gbps           116 days                                          60TB or more
10 Gbps          12 days                                           600TBs or more
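The transfer times in this table follow from dividing the payload by the effective link rate. The sketch below reproduces that arithmetic, assuming decimal units (1 PB = 10^15 bytes, 1 Gbps = 10^9 bits/s); the function name is hypothetical.

```python
# Time to push a payload through a link at a given utilization (decimal units assumed).
SECONDS_PER_DAY = 86_400
PB = 10**15

def transfer_days(payload_bytes, link_bps, utilization=0.8):
    """Days needed to send payload_bytes over a link_bps link at the given utilization."""
    effective_bps = link_bps * utilization
    return payload_bytes * 8 / effective_bps / SECONDS_PER_DAY

for label, bps in [("100 Mbps", 100e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)]:
    print(f"{label}: {transfer_days(PB, bps):.0f} days")
```

With these assumptions the three rows come out at 1157, 116, and 12 days, matching the table.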

13 Moore’s Law & Sneakernet
Moore’s Law:
–Sneakernet → more attractive over time
Moving companies compete with WANs
–Man with Shipping Container
–like Man with Van
Commute between Work & Home
–Car moves 10s of TBs per day
–Flash on phone moves 10s of GBs per day
  Comparable to bandwidth of broadband to home

14 Sneakernet: Better Throughput & Cost (at vol), but Worse Latency & Convenience
Wires               Throughput   Monthly Rent   Capital
Broadband to Home   1 Mbps       $50/month      Negligible
WANs                10 Gbps      $10k/month     Negligible
Moveable Media        Packet Size   Throughput @ 1 Packet/Day   Monthly Rent   Capital
Shipping Containers   3 PBs         276 Gbps                    Negligible     $4 M
Cars                  10 TBs        0.93 Gbps                   Negligible     $1,000
Laptops               1 TB          93 Mbps                     Negligible     $1,000
Flash on Phones       100 GBs       9.3 Mbps                    Negligible     $100
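The "Throughput @ 1 Packet/Day" column treats each shipment as one very large packet delivered once per day. The sketch below shows that conversion, assuming decimal units (1 TB = 10^12 bytes) and a 24-hour day; the function and dictionary names are hypothetical.

```python
# Effective throughput of moveable media shipped once per day (decimal units assumed).
SECONDS_PER_DAY = 86_400
GB, TB, PB = 10**9, 10**12, 10**15

def sneakernet_bps(packet_bytes, packets_per_day=1):
    """Average bits per second when packet_bytes are delivered packets_per_day times a day."""
    return packet_bytes * 8 * packets_per_day / SECONDS_PER_DAY

MEDIA = {
    "Shipping Containers": 3 * PB,
    "Cars": 10 * TB,
    "Laptops": 1 * TB,
    "Flash on Phones": 100 * GB,
}

for name, size in MEDIA.items():
    print(f"{name}: {sneakernet_bps(size) / 1e6:,.1f} Mbps")
```

Under these unit assumptions the cars, laptops, and phones rows agree with the table after rounding. The shipping-container row works out to roughly 278 Gbps, close to the slide's 276 Gbps; the small gap presumably reflects a different rounding or unit convention on the slide.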

15 Conclusion
Challenge:
–Increase Demand
–To Keep up with Supply
If demand fails to keep up with supply →
–Tech Meltdown
Sneakernet: Best hope
–Wires can’t keep up
–Wires cannot fill up today’s DataScope, let alone tomorrow’s DataScope
Roach Motels considered Harmful!
–Data can check in, but it can’t check out...

16 BACKUP

17 On Delivering Embarrassingly Distributed Cloud Services
Hotnets-2008
Ken Church, Albert Greenberg, James Hamilton
{church, albert, jamesrh}@microsoft.com
[Figure labels: $1B, $2M, Board, Affordable]

18 Containers: Disruptive Technology
Implications for Shipping
–New Ships, Ports, Unions
Implications for Hotnets
–New Data Center Designs
–Power/Networking Trade-offs
–Cost Models: Expense vs. Capital
–Apps: Embarrassingly Distributed
  Restriction on Embarrassingly Parallel
–Machine Models
  Distributed Parallel Cluster vs. Parallel Cluster

19 Mega vs. Micro Data Centers (DCs)
POPs        Cores/POP   Hardware/POP      Co-located With/Near
1           1,000,000   1000 containers   Mega Data Center
10          100,000     100 containers
100         10,000      10 containers     Fiber Hotel/Power Substation
1,000       1,000       1 container
10,000      100         1 rack            Central Office
100,000     10          1 mini-tower      P2P
1,000,000   1           embedded

20 Data Center (DC) Cost Models: Power vs. Networking
Big ticket items: Contents, Power, Networking
–Currently, cost of DC (excluding contents) ≈ cost of contents
  Moore’s Law: more appropriate for contents than DC
  Eventually, cost of DC >> cost of contents
Power: costs more than networking
–Opportunity: Power
Self-Serving Recommendation:
–Save big $$ on power
–By investing in what we do
  Networking
  Redesigning Apps (for geo-diversity)

21 Cost Models: Expense vs. Capital
Wide Agreement: Cost dominated by power
–Less Agreement: Expense or Capital?
Can we save $$ by putting clouds to sleep?
–Yes, but only a little expense
Big Opportunity: Capital
–Capital: Batteries, Generators, Power Distribution
–Worst case forecast (capital) >> Actual usage (expense)
Bottleneck: Sunk Costs (Independent of Usage)
–Rights to consume power: like the last seat on an airplane
–Putting clouds to sleep: like stripping weight on last seat
  Saves a little expense (fuel),
  But better to sell last seat (even at a deep discount)
Use it or lose it (more applicable for sunk capital than expense)

22 Risk Management → Micro
Risky and expensive
–To put all our eggs in one basket (Mark Twain)
Mega DC → Redundancy Mechanisms
–Power: Batteries, Generators, Diesel Fuel
  Over 20% of DC costs is in power redundancy
–Networking: Protection (SONET)
Micro Alternative: Geo-Redundancy
–N+1 Redundancy: More attractive for large N
–Geo-Redundancy: Not appropriate for all apps
  But many apps are Embarrassingly Distributed
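One way to read "N+1 Redundancy: More attractive for large N" is that a single spare costs less, relative to useful capacity, as the number of units grows. The sketch below is a simplified illustration that assumes N identical units protected by one spare; it is not a cost model from the talk, and the function name is hypothetical.

```python
def spare_overhead(n_units, n_spares=1):
    """Extra capacity bought for redundancy, as a fraction of useful capacity."""
    if n_units <= 0:
        raise ValueError("n_units must be positive")
    return n_spares / n_units

# One spare for N identical units: the overhead shrinks as 1/N.
for n in (1, 2, 10, 100):
    print(f"N={n}: {spare_overhead(n):.0%} overhead")
```

With one site the spare doubles the cost (100% overhead), while with 100 sites the same protection adds about 1%.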

23 Micro is Better: Both Capital & Expense

24 Clouds → Condos → Containers
Whenever we see a crazy idea
–Within 2x of current practice,
–Something is wrong
  Let’s go pick some low hanging fruit
Though maybe not as crazy as we thought…
–Changes are coming

25 Container Abstraction
Modular Data Center (≈2k Servers/Container)
Cheap Units: millions, not billions
–Affordable, even by universities
Just-in-Time:
–Easy to build, provision, move, buy/sell, operate
Sealed Boxcar: No Serviceable Parts
–No Humans Inside
–No room for people, too hot, noisy, unsafe (no fire exits)
–Not OSHA Compliant
Self-contained unit with everything but
–Power: 480V
–External Network: 1 Gbps
–Cooling: Chilled Water

26 Containers are a Disruptive Technology: Implication for CS Theory/Algorithms
Abstract Machine Model
–Distributed Parallel Cluster vs. Parallel Cluster
Distributed Container Farm
–Parallel Cluster (with Boundaries)
  Better communication within containers
  Than across containers (wide area network)
Challenge:
–Find appropriate apps (not for everything)
  Apps that fit within boundaries
  Or distribute nicely across them (Embarrassingly Distributed)
Embarrassingly Distributed:
–Stronger Condition than Embarrassingly Parallel
–Map Reduce, Sort, Scatter Gather

27 Embarrassingly Distributed Apps
Currently distributed:
–voice mail, telephony (Skype), P2P file sharing (Napster), multicast, eBay, online games (Xbox Live), grid computing
Obvious candidates:
–spam filtering & email (Hotmail), backup, grep (simple but common forms of searching through a large corpus)
Less obvious candidates:
–map reduce computations (in the most general case), sort (in the most general case), social networking (Facebook)
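To make the grep example concrete, the sketch below shows the shape of an embarrassingly distributed search: each site scans only its local shard, and only a small summary crosses the wide-area boundary. The site names, the toy corpus, and both function names are hypothetical; a real deployment would run `local_grep` at each site rather than in one process.

```python
import re
from collections import Counter

# Hypothetical corpus, already partitioned across sites (containers).
SITES = {
    "site_a": ["error: disk full", "ok", "error: timeout"],
    "site_b": ["ok", "ok", "error: disk full"],
}

def local_grep(lines, pattern):
    """Runs entirely within one site: returns only a small summary (match counts)."""
    rx = re.compile(pattern)
    return Counter(line for line in lines if rx.search(line))

def distributed_grep(sites, pattern):
    """Merges per-site summaries; only these summaries cross the wide-area boundary."""
    total = Counter()
    for lines in sites.values():
        total.update(local_grep(lines, pattern))
    return total

result = distributed_grep(SITES, r"^error")
print(dict(result))
```

The design point is that the raw corpus never moves: the work fits within container boundaries, and the cross-site traffic is proportional to the size of the answer rather than the size of the data.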

28 Conclusions
Current Status: Long on Mega
Industry is investing billions in mega
–Lots of new mega Data Centers
–Long-Term Assets/Liabilities
  Depreciating over 15 years
Bottom line:
–The industry would do well to consider
–The micro alternative
  Geo-Diversity >> Batteries & Generators
–Fragmentation Tax (sublinear) << Redundancy Tax (linear)
Just-in-Time Options:
–Easy to build, provision, move, buy/sell, operate
  Risk (hedge the long position on mega)
[Figure labels: $1B, $2M, Board, Affordable; caption fragment: “is aggressively adopting”]

