Published by Carol Riley. Modified over 8 years ago.
1
EuroSpeech-2003: Where have we been and where are we going?
Kenneth.Church@jhu.edu
1. Consistent progress over decades
– Moore’s Law, Speech Coding, Error Rate
2. History repeats itself
– Empiricism: 1950s
– Rationalism: 1970s
– Empiricism: 1990s
– Rationalism: 2010s (?)
Discontinuities: Fundamental changes that invalidate fundamental assumptions
– Petabytes: $2,000,000 → $2,000 (in a decade)
– Can demand keep up with supply?
– If not → Tech meltdown
New priorities:
– Supply-side economics → Demand-side economics
– Search >> Compression & Dictation
March 16, 2011
2
Old Priority: Speech Compression
New Priority: Speech Consumption
[Figure: Speech Quality (Bad, Poor, Fair, Good, Excellent) vs. Bit Rate (kb/s), with 1980, 1990, and 2000 profiles for ITU Recommendations, Cellular Standards, and Secure Telephony]
Borrowed slide: Rich Cox
3
How much is a Petabyte? (10^15 bytes)
Question from execs:
– How do I explain to a lay audience
    How much is a petabyte
    And why everyone will buy lots of them
Wrong answer:
– 10^6 is a million (a floppy disk/email msg)
– 10^9 is a billion (a billion here, a billion there…)
– 10^12 is a trillion (the US debt)
– 10^15 is a zillion (= ∞, an unimaginably large #)
5
Digital Immortality: Gordon Bell & Jim Gray (2000)
Estimated Lifetime Storage Requirements

| Data type | Per day | Per lifetime |
|---|---|---|
| email, papers, text | 0.5 MB | 15 GB |
| photos | 2 MB | 150 GB |
| speech | 40 MB | 1.2 TB |
| music | 60 MB | 5.0 TB |
| video-lite (200 Kb/s) | 1 GB | 100 TB |
| DVD video (4.3 Mb/s = 1.8 GB/hour) | 20 GB | 1 PB |

Text won’t consume PB/person; Speech won’t either (but it’s closer)
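The bit rates quoted in the table can be converted to storage per hour with simple arithmetic. The following is a minimal sketch, not part of the original slides; the function name `bytes_per_hour` is made up for illustration, and reading the slide's "GB" as 2^30 bytes is an assumption that happens to reproduce the 1.8 GB/hour figure.

```python
def bytes_per_hour(bits_per_second: float) -> float:
    """Convert a constant bit rate into bytes stored per hour."""
    return bits_per_second / 8 * 3600


GIB = 2 ** 30  # assumption: the slide's "GB" is the binary gigabyte

dvd = bytes_per_hour(4.3e6)         # DVD video at 4.3 Mb/s
video_lite = bytes_per_hour(200e3)  # video-lite at 200 Kb/s

print(f"DVD video:  {dvd / GIB:.2f} GiB/hour")
print(f"video-lite: {video_lite / 1e6:.0f} MB/hour")
```

At 90 MB/hour, the 1 GB/day budget for video-lite corresponds to roughly 11 hours of recording per day.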
6
Our Problem: Create Demand for DataScope (5PBs)
Text won’t do it
– Internet Archive: 2PBs + 20TBs/month (www.archive.org/about/faqs.php)
Speech won’t either (but it is closer)
– Worldwide Telephone Traffic: 1PB/day
    1PB/day = 3B hours of speech * 450 KBs/hour
    6B people use the phone for an hour/day (half of which is speech and half is silence)
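The telephone-traffic estimate can be reproduced in a few lines. This is a sketch of the slide's back-of-the-envelope arithmetic only; the constant names are illustrative, and the inputs (6B people, one hour per day, half speech, 450 KB per hour of speech) are taken directly from the slide.

```python
PEOPLE = 6e9                   # people using the phone
PHONE_HOURS_PER_DAY = 1.0      # hours on the phone per person per day
SPEECH_FRACTION = 0.5          # half speech, half silence
BYTES_PER_SPEECH_HOUR = 450e3  # 450 KB per hour of speech

speech_hours = PEOPLE * PHONE_HOURS_PER_DAY * SPEECH_FRACTION
bytes_per_day = speech_hours * BYTES_PER_SPEECH_HOUR

# Bit rate implied by 450 KB/hour, in kb/s
implied_kbps = BYTES_PER_SPEECH_HOUR * 8 / 3600 / 1e3

print(f"Speech hours per day: {speech_hours:.1e}")
print(f"Traffic per day:      {bytes_per_day / 1e15:.2f} PB")
print(f"Implied coding rate:  {implied_kbps:.1f} kb/s")
```

The product comes to about 1.35 PB/day, which the slide rounds to 1PB/day. At that rate, worldwide telephone speech would take a few days to fill a 5PB DataScope.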
7
Take-Aways
Economics of Computing:
– New Bottlenecks: Supply → Demand
    Old Challenge: How do we get more computing? Disks? Cycles?
    New Challenge: How do we consume more?
New Challenge is Challenging
– Text won’t consume DataScope (5PBs)
– Speech won’t either (but it’s closer)
Text & Speech Sizes
– Internet Archive: 2PBs + 20TBs/month
– Worldwide Telephone Traffic: 1PB/day
New Bottlenecks: Storage → Transport
– Transporting Internet Archive is more challenging than storing it
– WANs → Sneakernet (Shipping Swappable Disks)
8
DataScope
Exciting (Expensive) Performance Servers
– With lots of oomph
Plus “Boring” (Cheap) Storage Servers
– With lots of slots for swappable disks
– (and not much else)
– (as little as we can get away with)
9
The Mobility Gap Sneakernet
Everything is getting better with Moore’s Law, but some things are getting better faster

| Law | Resource | Growth per year | Growth per decade |
|---|---|---|---|
| Nielsen's Law | Bandwidth to Home | 1.5x | 57x |
| Moore's Law | CPU | 1.6x | 100x |
| | Internet Backbone | 1.7x | 256x |
| Kryder's Law | Disk Capacity | 2.0x | 1024x |
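The per-decade column is the annual growth factor compounded over ten years. The sketch below recomputes it; the function name `growth_per_decade` is illustrative and not from the slides.

```python
def growth_per_decade(factor_per_year: float, years: int = 10) -> float:
    """Compound an annual growth factor over a number of years."""
    return factor_per_year ** years


resources = [
    ("Bandwidth to Home", 1.5),
    ("CPU", 1.6),
    ("Internet Backbone", 1.7),
    ("Disk Capacity", 2.0),
]

for name, factor in resources:
    print(f"{name:18s} {factor:.1f}x/year -> {growth_per_decade(factor):7.0f}x/decade")
```

Compounding gives about 58x, 110x, 202x, and 1024x. The slide's 100x and 256x therefore look like rounded values or values derived from slightly different annual rates; either way, the ordering, and the widening gap between disk capacity and bandwidth to the home, is unchanged.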
10
Sneakernet
Gray’s Question: What is the best way to move a terabyte from place to place?
Updated Version: What is the best way to move a petabyte from place to place?
11
DataScope Storage Servers
Moveable & non-moveable media
– Wires: better latency & convenience (low vol)
– Disks: better throughput & cost (high vol)
Hybrid Combination:
– Incremental Backup → Wires
– Large Restores → Sneakernet
– Import / Export → Sneakernet
    Users show up with 50 lbs of disks (carryon luggage)
    And use them productively within a day
Metrics: Space & Time → Power & Pounds
12
Amazon’s Import / Export (Scaled Up…)

| Internet Speed | Time to Transfer 1PB at 80% Network Utilization | When to Consider Sneakernet |
|---|---|---|
| 100 Mbps | 1157 days | 5TB or more |
| 1 Gbps | 116 days | 60TB or more |
| 10 Gbps | 12 days | 600TBs or more |
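The transfer times in the table follow from dividing the payload by the usable link rate. A minimal sketch, assuming a decimal petabyte (10^15 bytes) and the 80% utilization stated in the table; `transfer_days` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(num_bytes: float, link_bps: float, utilization: float = 0.8) -> float:
    """Days needed to push num_bytes over a link at the given utilization."""
    seconds = num_bytes * 8 / (link_bps * utilization)
    return seconds / SECONDS_PER_DAY


PETABYTE = 1e15  # assumption: decimal petabyte

for label, bps in [("100 Mbps", 100e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)]:
    print(f"{label:>8s}: {transfer_days(PETABYTE, bps):7.1f} days")
```

Even at 10 Gbps, a petabyte occupies the link for well over a week, which is the case for shipping disks instead.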
13
Moore’s Law & Sneakernet
Moore’s Law:
– Sneakernet more attractive over time
Moving companies compete with WANs
– Man with Shipping Container
– like Man with Van
Commute between Work & Home
– Car moves 10s of TBs per day
– Flash on phone moves 10s of GBs per day
    Comparable to bandwidth of broadband to home
14
Sneakernet: Better Throughput & Cost (at vol), but Worse Latency & Convenience

| Wires | Throughput | Monthly Rent | Capital |
|---|---|---|---|
| Broadband to Home | 1 Mbps | $50/month | Negligible |
| WANs | 10 Gbps | $10k/month | Negligible |

| Moveable Media | Packet Size | Throughput @ 1 Packet/Day | Monthly Rent | Capital |
|---|---|---|---|---|
| Shipping Containers | 3 PBs | 276 Gbps | Negligible | $4 M |
| Cars | 10 TBs | 0.93 Gbps | Negligible | $1,000 |
| Laptops | 1 TB | 93 Mbps | Negligible | $1,000 |
| Flash on Phones | 100 GBs | 9.3 Mbps | Negligible | $100 |
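The "Throughput @ 1 Packet/Day" column treats each physical container as a packet delivered once per day. The sketch below recomputes it under the assumption of decimal units; `sneakernet_bps` is an illustrative name, not something from the slides.

```python
SECONDS_PER_DAY = 86_400


def sneakernet_bps(packet_bytes: float, packets_per_day: float = 1.0) -> float:
    """Average bits per second when moving physical media on a daily schedule."""
    return packet_bytes * 8 * packets_per_day / SECONDS_PER_DAY


media = [
    ("Shipping Containers", 3e15),
    ("Cars", 10e12),
    ("Laptops", 1e12),
    ("Flash on Phones", 100e9),
]

for name, size in media:
    print(f"{name:20s} {sneakernet_bps(size) / 1e9:10.4f} Gbps")
```

The car, laptop, and phone rows match the table after rounding. The shipping-container row comes out near 278 Gbps with decimal units, slightly above the 276 Gbps on the slide; the difference is presumably rounding.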
15
Conclusion
Challenge:
– Increase Demand
– To Keep up with Supply
If demand fails to keep up with supply
– Tech Meltdown
Sneakernet: Best hope
– Wires can’t keep up
– Wires cannot fill up today’s DataScope, let alone tomorrow’s DataScope
Roach Motels considered Harmful!
– Data can check in, but it can’t check out...
16
BACKUP
17
On Delivering Embarrassingly Distributed Cloud Services
Hotnets-2008
Ken Church, Albert Greenberg, James Hamilton
{church, albert, jamesrh}@microsoft.com
[Figure labels: $1B, $2M, Board, Affordable]
18
Containers: Disruptive Technology
Implications for Shipping
– New Ships, Ports, Unions
Implications for Hotnets
– New Data Center Designs
– Power/Networking Trade-offs
– Cost Models: Expense vs. Capital
– Apps: Embarrassingly Distributed
    Restriction on Embarrassingly Parallel
– Machine Models
    Distributed Parallel Cluster Parallel Cluster
19
Mega vs. Micro Data Centers (DCs)

| POPs | Cores/POP | Hardware/POP | Co-located With/Near |
|---|---|---|---|
| 1 | 1,000,000 | 1000 containers | Mega Data Center |
| 10 | 100,000 | 100 containers | |
| 100 | 10,000 | 10 containers | Fiber Hotel/Power Substation |
| 1,000 | 1,000 | 1 container | |
| 10,000 | 100 | 1 rack | Central Office |
| 100,000 | 10 | 1 mini-tower | P2P |
| 1,000,000 | 1 | embedded | |
20
Data Center (DC) Cost Models: Power vs. Networking
Big ticket items: Contents, Power, Networking
– Currently, cost of DC (excluding contents) ≈ cost of contents
    Moore’s Law: more appropriate for contents than DC
    Eventually, cost of DC >> cost of contents
Power: costs more than networking
– Opportunity: Power
Self-Serving Recommendation:
– Save big $$ on power
– By investing in what we do
    Networking
    Redesigning Apps (for geo-diversity)
21
Cost Models: Expense vs. Capital
Wide Agreement: Cost dominated by power
– Less Agreement: Expense or Capital?
Can we save $$ by putting clouds to sleep?
– Yes, but only a little expense
Big Opportunity: Capital
– Capital: Batteries, Generators, Power Distribution
– Worst case forecast (capital) >> Actual usage (expense)
Bottleneck: Sunk Costs (Independent of Usage)
– Rights to consume power: like the last seat on an airplane
– Putting clouds to sleep: like stripping weight on last seat
    Saves a little expense (fuel),
    But better to sell last seat (even at a deep discount)
Use it or lose it (more applicable for sunk capital than expense)
22
Risk Management Micro
Risky and expensive
– To put all our eggs in one basket (Mark Twain)
Mega DC Redundancy Mechanisms
– Power: Batteries, Generators, Diesel Fuel
    Over 20% of DC costs is in power redundancy
– Networking: Protection (SONET)
Micro Alternative: Geo-Redundancy
– N+1 Redundancy: More attractive for large N
– Geo-Redundancy: Not appropriate for all apps
    But many apps are Embarrassingly Distributed
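Why N+1 redundancy is more attractive for large N can be seen from the spare-capacity overhead. This is a sketch under the simplest reading of N+1 (N sites carry the load and one extra site stands by as a spare); the function name `spare_overhead` is hypothetical.

```python
def spare_overhead(n: int) -> float:
    """Extra capacity, as a fraction of needed capacity, for one spare among n units."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 / n


for n in (1, 2, 10, 100):
    print(f"N = {n:3d}: {spare_overhead(n):6.1%} overhead")
```

With a single mega data center (N = 1), full redundancy means buying everything twice. Spread over 100 micro sites, one spare adds only 1%.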
23
Micro is Better: Both Capital & Expense
24
Clouds Condos Containers
Whenever we see a crazy idea
– Within 2x of current practice,
– Something is wrong
    Let’s go pick some low hanging fruit
Though maybe not as crazy as we thought…
– Changes are coming
25
Container Abstraction
Modular Data Center (≈2k Servers/Container)
Cheap Units: millions, not billions
– Affordable, even by universities
Just-in-Time:
– Easy to build, provision, move, buy/sell, operate
Sealed Boxcar: No Serviceable Parts
– No Humans Inside
– No room for people, too hot, noisy, unsafe (no fire exits)
– Not OSHA Compliant
Self-contained unit with everything but
– Power: 480V
– External Network: 1 Gbps
– Cooling: Chilled Water
26
Containers are a Disruptive Technology: Implication for CS Theory/Algorithms
Abstract Machine Model
– Distributed Parallel Cluster Parallel Cluster
Distributed Container Farm
– Parallel Cluster (with Boundaries)
    Better communication within containers
    Than across containers (wide area network)
Challenge:
– Find appropriate apps (not for everything)
    Apps that fit within boundaries
    Or distribute nicely across them (Embarrassingly Distributed)
Embarrassingly Distributed:
– Stronger Condition than Embarrassingly Parallel
– Map Reduce, Sort, Scatter Gather
27
Embarrassingly Distributed Apps
Currently distributed:
– voice mail, telephony (Skype), P2P file sharing (Napster), multicast, eBay, online games (Xbox Live), grid computing
Obvious candidates:
– spam filtering & email (Hotmail), backup, grep (simple but common forms of searching through a large corpus)
Less obvious candidates:
– map reduce computations (in the most general case), sort (in the most general case), social networking (Facebook)
28
Conclusions
Current Status: Long on Mega
Industry is investing billions in mega
– Lots of new mega Data Centers
– Long-Term Assets/Liabilities
    Depreciating over 15 years
Bottom line:
– The industry would do well to consider
– The micro alternative
    Geo-Diversity >> Batteries & Generators
– Fragmentation Tax (sublinear) << Redundancy Tax (linear)
Just-in-Time Options:
– Easy to build, provision, move, buy/sell, operate
    Risk (hedge the long position on mega)
[Figure labels: $1B, $2M, Board, Affordable; caption fragment: "is aggressively adopting"]