Scaleable Computing
Jim Gray, Microsoft Corporation
Gray@Microsoft.com

Thesis: Scaleable Servers
- Commodity hardware allows new applications, and the new applications need huge servers.
- Clients and servers are built of the same "stuff": commodity software and commodity hardware.
- Servers should be able to:
  - Scale up: grow a node by adding CPUs, disks, networks.
  - Scale out: grow by adding nodes.
  - Scale down: can start small.
- Key software technologies: objects, transactions, clusters, parallelism.

1987: 256 tps Benchmark
- A $14M computer (Tandem): a 32-node processor array plus a 40 GB disk array (80 drives).
- False floor, two rooms of machines; simulated 25,600 clients.
- A dozen people to run it: manager, auditor, admin expert, hardware experts, network expert, performance expert, OS expert, DB expert.

1988: DB2 + CICS Mainframe, 65 tps
- A $2M IBM 4391: refrigerator-sized CPU, 2 x 3725 network controllers.
- 16 GB disk farm (4 x 8 x 0.5 GB).
- Simulated network of 800 clients; a staff of 6 to do the benchmark.

1997: 10 Years Later
- 1 person and 1 box = 1,250 tps: 4 x 200 MHz CPUs, 1/2 GB DRAM, 12 x 4 GB disks, 3 x 7 x 4 GB disk arrays.
- One breadbox delivers ~5x the 1987 machine room; 23 GB is hand-held.
- One person does all the work: hardware expert, OS expert, net expert, DB expert, and app expert in one.
- Cost/tps is 1,000x less: 25 microdollars per transaction.

What Happened?
- Moore's law: things get 4x better every 3 years (applies to computers, storage, and networks).
- New economics: commodity pricing.
    class           $/MIPS    software k$/year
    mainframe       10,000    100
    minicomputer    100       10
    microcomputer   10        1
- GUI: the human-computer tradeoff flipped to optimize for people, not computers; price fell as the industry moved from mainframe to mini to micro over time.

What Happens Next?
- Last 10 years: 1,000x improvement. Next 10 years: ????
- Today: text and image servers are free; at 25 m$/hit, advertising pays for them.
- Future: video, audio, ... servers are free too. "You ain't seen nothing yet!"

Kinds of Information Processing

                 Point-to-point        Broadcast
  Immediate      conversation, money   lecture, concert
  Time-shifted   mail                  book, newspaper

- It's ALL going electronic: the network carries the immediate, the database holds the time-shifted.
- Immediate information is being stored for analysis, so it ALL ends up in a database.
- Analysis and automatic processing are being added.

Why Put Everything in Cyberspace?
- Low rent: minimum $/byte.
- Shrinks time: now or later (immediate or time-delayed).
- Shrinks space: here or there (point-to-point or broadcast).
- Automates processing: knowbots that locate, process, analyze, and summarize.

Magnetic Storage Cheaper Than Paper
- File cabinet: a four-drawer cabinet ($250) + paper, 24,000 sheets ($250) + floor space, 2x3 @ $10/ft2 ($180) = $700 total, about 3 cents/sheet.
- Disk: a 4 GB disk is $800. As ASCII that holds 2 million pages: 0.04 cents/sheet (80x cheaper). As images, 200,000 pages: 0.4 cents/sheet (8x cheaper).
- Conclusion: store everything on disk. (A quick check of the arithmetic is sketched below.)
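
A quick sanity check of the slide's arithmetic in Python; the prices are the slide's 1990s figures, and the pages-per-disk assumptions are the slide's:

```python
# Back-of-the-envelope check of the storage-cost figures on the slide.

# File cabinet: cabinet + paper + floor space, amortized over the sheets.
cabinet, paper, space = 250, 250, 180            # dollars
sheets = 24_000
paper_cost = (cabinet + paper + space) / sheets  # ~$0.029 -> ~3 cents/sheet

# Disk: $800 for 4 GB.
disk_dollars = 800
ascii_pages = 2_000_000                          # ~2 KB/page as ASCII
image_pages = 200_000                            # ~20 KB/page as images
ascii_cost = disk_dollars / ascii_pages          # $0.0004 -> 0.04 cents/sheet
image_cost = disk_dollars / image_pages          # $0.004  -> 0.4 cents/sheet

print(f"paper: {100*paper_cost:.1f} c/sheet, "
      f"ASCII: {100*ascii_cost:.2f} c/sheet, image: {100*image_cost:.1f} c/sheet")
print(f"ASCII is ~{paper_cost/ascii_cost:.0f}x cheaper than paper")  # ~70-80x
```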

Databases
- Information at Your Fingertips™, the Information Network™, the Knowledge Navigator™: all information will be in an online database (somewhere).
- You might record everything you:
  - Read: 10 MB/day, 400 GB/lifetime (eight tapes today).
  - Hear: 400 MB/day, 16 TB/lifetime (three tapes/year today).
  - See: 1 MB/s, 40 GB/day, 1.6 PB/lifetime (maybe someday).

Databases Store ALL Data Types
- The old world: millions of 100-byte objects, e.g., People(Name, Address) rows such as (Mike, Berk), (Won, Austin), (David, NY).
- The new world: billions of objects, big objects (1 MB+), and objects with behavior (methods); rows like People(Name, Address, Papers, Picture, Voice).
- The drivers: the paperless office, the Library of Congress online, entertainment, publishing, business, the WWW and Internet: all information online.

Billions of Clients
- Every device will be "intelligent": doors, rooms, cars...
- Computing will be ubiquitous.

Billions of Clients Need Millions of Servers
- All clients, mobile and fixed, are networked to servers; they may be nomadic or on-demand.
- Fast clients want faster servers.
- Servers, from departmental servers up to superservers, provide shared data, control, coordination, and communication.

Thesis: Many Little Beat Few Big
- The spectrum runs from mainframe ($1 million) and mini ($100K) to micro ($10K), nano, and pico processors; memories from 1 MB and 10 MB to 10 GB, 1 TB, and 100 TB; latencies from 10 picosecond RAM through 10 nanosecond and 10 microsecond RAM to 10 millisecond disc and 10 second tape archive; disk platters shrank from 14" and 9" to 5.25", 3.5", 2.5", and 1.8".
- The endpoint is the "smoking, hairy golf ball": 1M SPECmarks, 1 TFLOP, 10^6 clocks to bulk RAM, the event horizon on chip, VM reincarnated, multiprogrammed cache, on-chip SMP.
- The open questions: How to connect the many little parts? How to program the many little parts? How to get fault tolerance?

Future Superserver: the 4T Machine
- An array of 1,000 "4B machines" (CyberBricks), each roughly a 1 Bips processor with 1 billion bytes of DRAM and 10 billion bytes of disk (e.g., CPU + 5 GB RAM + 50 GB disc), connected by 1 Bbps comm lines, plus a 1 TB tape robot: a few megabucks in all.
- The challenge: manageability, programmability, security, availability, scaleability, affordability: all as easy as a single system.
- Future servers are CLUSTERS of processors and discs; distributed database techniques make clusters work.

Performance = Storage Accesses, Not Instructions Executed
- In the "old days" we counted instructions and I/Os; now we count memory references. Processors wait most of the time.
- Where the time goes (clock ticks used by AlphaSort): disc wait, OS, memory wait, D-cache misses, I-cache misses, B-cache data misses, and only then the sort itself.
- The sort runs at 70 MIPS; "real" apps have worse I-cache behavior, so they run at 60 MIPS if well tuned, 20 MIPS if not.

Storage Latency: How Far Away Is the Data?
(clock ticks to reach it, with an analogy at roughly one minute per tick)
- Registers: 1 tick: my head (1 min).
- On-chip cache: 2 ticks: this room.
- On-board cache: 10 ticks: this campus (10 min).
- Memory: 100 ticks: Sacramento (1.5 hr).
- Disk: 10^6 ticks: Pluto (2 years).
- Tape/optical robot: 10^9 ticks: Andromeda (2,000 years).

The Hardware Is in Place... And Then a Miracle Occurs?
- SNAP: Scaleable Networks And Platforms.
- A commodity distributed OS built on commodity platforms and a commodity network interconnect enables parallel applications.

Thesis: Scaleable Servers
- Commodity hardware allows new applications, and the new applications need huge servers.
- Clients and servers are built of the same "stuff": commodity software and commodity hardware.
- Servers should be able to:
  - Scale up: grow a node by adding CPUs, disks, networks.
  - Scale out: grow by adding nodes.
  - Scale down: can start small.
- Key software technologies: objects, transactions, clusters, parallelism.

Scaleable Servers: BOTH SMP and Cluster
- Grow up with SMP: a 4 x P6 is now standard.
- Grow out with a cluster; the cluster is built from inexpensive parts.
- The ladder: personal system -> departmental server -> SMP superserver -> cluster of PCs.

SMPs Have Advantages
- Single system image: easier to manage, easier to program; threads share memory, disk, and net.
- 4x SMP is commodity; the software is capable of 16x.
- Problems: beyond 4x is not commodity; the scale-down problem (starter systems are expensive); and there is always a BIGGEST one.

Building the Largest Node
- There is a biggest node, and its size grows over time. Today, with NT, it is probably 1 TB, and we are building it (with help from DEC and SPIN-2): a 1 TB GeoSpatial SQL Server database (1.4 TB of disks = 320 drives; 30K BTU, 8 KVA, 1.5 metric tons).
- We will put it on the Web as a demo app (www.SQL.1TB.com): a 10-meter image of the ENTIRE PLANET, 2-meter imagery of the interesting parts (2% of the land), and better resolution in the US (courtesy of USGS).
- One pixel per meter would be 500 TB uncompressed; the 1-TB SQL Server DB holds the satellite and aerial photos plus the support files and the 1-TB home page.

What's a Terabyte?
1 terabyte is:
- 1,000,000,000 business letters (150 miles of bookshelf)
- 100,000,000 book pages (15 miles of bookshelf)
- 50,000,000 FAX images (7 miles of bookshelf)
- 10,000,000 TV pictures (MPEG) (10 days of video)
- 4,000 LandSat images (16 earth images at 100 m)
- 100,000,000 web pages (10 copies of the web's HTML)
The Library of Congress (in ASCII) is about 25 TB.
The price: in 1980, $200 million of disc (10,000 discs) or $5 million of tape silo (10,000 tapes); in 1997, $200K of magnetic disc (48 discs) or $30K of nearline tape (20 tapes).
Terror Byte!

TB DB User Interface (demo screenshot)

TPC-C Web-Based Benchmarks
- The client is a web browser (7,500 of them!). It submits order, invoice, and query transactions to the server through a web-page interface.
- The web server translates each request into DB calls, and SQL does the DB work. The path: browser --HTTP--> IIS --ODBC--> SQL.
- Net: easy to implement, and performance is GREAT! (A sketch of the path follows.)
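
A minimal sketch of that web-to-DB path using only the Python standard library; sqlite3 stands in for the ODBC/SQL Server pair, and the handler and schema are invented for illustration:

```python
# A browser form post arrives over HTTP; the web tier translates it into a
# parameterized SQL statement, as IIS + ODBC + SQL Server did in the benchmark.
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs

db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE orders (customer TEXT, item TEXT, qty INTEGER)")

class OrderHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Translate the web form into a SQL insert (the "web server -> DB" step).
        body = self.rfile.read(int(self.headers["Content-Length"])).decode()
        form = parse_qs(body)
        db.execute("INSERT INTO orders VALUES (?, ?, ?)",
                   (form["customer"][0], form["item"][0], int(form["qty"][0])))
        db.commit()
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"order accepted")

# HTTPServer(("", 8080), OrderHandler).serve_forever()  # run the web tier
```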

TPC-C Shows How Far SMPs Have Come
- Performance is amazing: 2,000 users is the minimum; 30,000 users run on a 4 x 12 Alpha cluster (Oracle).
- Peak performance: 30,390 tpmC @ $305/tpmC (Oracle/DEC).
- Best price/performance: 6,712 tpmC @ $65/tpmC (MS SQL/DEC/Intel).
- The graphs show UNIX's high price and the diseconomy of scale-up.

TPC-C SMP Performance
- SMPs do offer speedup, but a 4x P6 outperforms some 18x MIPSco systems.

The TPC-C Revolution Shows How Far NT and SQL Server Have Come
- Economy of scale on Windows NT; the recent Microsoft SQL Server benchmarks are web-based.
- The price-versus-performance scatter (price in $/tpmC against performance from 1,000 to 8,000 tpmC; lower is better) puts Microsoft SQL Server at the low-price end, showing both economy of scale and low price, while DB2, Informix, Oracle, and Sybase sit higher on the price axis.

What Happens to Prices?
- No expensive UNIX front end (saves ~$20/tpmC).
- No expensive TP monitor software (saves ~$10/tpmC).
- Result: $65/tpmC.

Grow UP and OUT
- Grow up: an SMP superserver with a 1-terabyte DB.
- Grow out: a cluster doing 1 billion transactions per day.
- A cluster is a collection of nodes that is as easy to program and manage as a single node.

Clusters Have Advantages
- Clients and servers are made from the same stuff.
- Inexpensive: built with commodity components.
- Fault tolerance: spare modules mask failures.
- Modular growth: grow by adding small modules.
- Unlimited growth: there is no biggest one.

Windows NT Clusters
- Key goals: easy (to install, manage, program), reliable (better than a single node), scaleable (added parts add power).
- Microsoft and 60 vendors are defining NT clusters; almost all big hardware and software vendors are involved. No special hardware is needed, but it may help.
- Enables commodity fault tolerance and commodity parallelism (data mining, virtual reality...); also great for workgroups!
- Initial release: two-node failover, in beta testing since December 1996; SAP, Microsoft, and Oracle are giving demos. It covers file, print, Internet, mail, DB, and other services; it is easy to manage; and each node can be a 4x (or more) SMP.
- Next (NT5): "Wolfpack" is a modest-sized cluster, about 16 nodes (so 64 to 128 CPUs). There is no hard limit; the algorithms are designed to go further.

SQL Server™ Failover Using "Wolfpack" Windows NT Clusters
- Two servers share SCSI disk strings (A and B); each also has private disks, and each server "owns" half the database.
- When one fails, the other server takes over the shared disks, recovers the failed half of the database, and serves it to the clients. (The takeover loop is sketched below.)
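
A sketch of the takeover logic, with hypothetical helpers (`reserve`, `recover_database`, a heartbeat probe) standing in for what the Wolfpack cluster service and SQL Server recovery actually do:

```python
# Each node serves its half of the database and watches its partner's
# heartbeat; on silence it grabs the shared SCSI string and recovers the
# partner's half. All names here are illustrative stand-ins.
import time

HEARTBEAT_TIMEOUT = 5.0                # seconds of silence before takeover

class SharedDiskString:
    """Stand-in for a shared SCSI string; only one node may own it."""
    def reserve(self):                 # fence the failed node off the disks
        print("reserved shared disks")
    def recover_database(self):        # redo/undo the partner's half of the DB
        print("database recovered")
        return "partner-half-of-db"

def failover_monitor(last_heartbeat, disks: SharedDiskString, serve):
    """Watch the partner; on failure, take over its disks and serve its data."""
    while time.time() - last_heartbeat() <= HEARTBEAT_TIMEOUT:
        time.sleep(1.0)                # partner is alive; keep watching
    disks.reserve()
    serve(disks.recover_database())    # clients reconnect to this node
```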

How Much Is 1 Billion Transactions per Day?
- 1 Btpd = 10^9 / 86,400 s ≈ 11,574 tps (transactions per second) ≈ 700,000 tpm (transactions per minute).
- For scale: AT&T handles 185 million calls on a peak day worldwide; Visa does ~20 M tpd (400 M customers, 250,000 ATMs worldwide, 7 billion card+cheque transactions in 1994).
- On a log chart of millions of transactions per day (0.1 to 1,000 Mtpd), 1 Btpd sits well above AT&T, Visa, BofA, and NYSE.

Billion Transactions per Day Project
- Building a 20-node Windows NT cluster (with help from Intel): more than 800 disks, all commodity parts.
- Uses SQL Server and DTC distributed transactions.
- Each node holds 1/20th of the DB and does 1/20th of the work; 15% of the transactions are "distributed". (A partitioning sketch follows.)
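
A sketch of the data placement, assuming simple modulo partitioning on a record key (the project's actual partitioning scheme isn't given on the slide):

```python
# Each record has one home node, so a transaction that stays inside a single
# partition commits locally; only cross-partition work (about 15% in this
# workload) needs DTC's distributed two-phase commit.
NODES = 20

def home_node(account_id: int) -> int:
    return account_id % NODES          # 1/20th of the data (and work) per node

def needs_distributed_commit(src: int, dst: int) -> bool:
    return home_node(src) != home_node(dst)

assert not needs_distributed_commit(40, 60)  # both on node 0: local commit
assert needs_distributed_commit(40, 61)      # spans nodes 0 and 1: DTC coordinates
```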

Parallelism: The OTHER Aspect of Clusters
- Clusters of machines allow two kinds of parallelism:
  - Many little jobs: online transaction processing (TPC-A, B, C...).
  - A few big jobs: data search and analysis (TPC-D, DSS, OLAP).
- Both give automatic parallelism.

Kinds of Parallel Execution
- Pipeline parallelism: any sequential program feeds its output to the next sequential program.
- Partition parallelism: inputs are split N ways, run through copies of the same sequential program, and the outputs are merged M ways.
(From Jim Gray & Gordon Bell, VLDB 95, Parallel Database Systems survey.)

Data Rivers: Split + Merge Streams
- A river connects N producers to M consumers: N x M data streams. Producers add records to the river; consumers take records out of it.
- Each program stays purely sequential; the river does the flow control and buffering, and handles the partitioning and merging of the data records.
- River = the split/merge in Gamma = the Exchange operator in Volcano.
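
A toy data river in Python: N sequential producers and M sequential consumers share one bounded queue, which supplies the buffering and flow control the slide describes:

```python
# The river is a bounded queue; producers block when consumers fall behind,
# and each producer/consumer remains purely sequential code.
import queue
import threading

river: queue.Queue = queue.Queue(maxsize=1000)   # buffering + flow control
SENTINEL = object()                              # end-of-stream marker

def producer(records):
    for r in records:
        river.put(r)                 # blocks if the river is full

def consumer(out):
    while (r := river.get()) is not SENTINEL:
        out.append(r)                # any sequential program goes here

outputs = [[] for _ in range(3)]     # M = 3 consumers
consumers = [threading.Thread(target=consumer, args=(o,)) for o in outputs]
producers = [threading.Thread(target=producer, args=(range(i, 100, 4),))
             for i in range(4)]      # N = 4 producers covering 0..99
for t in consumers + producers:
    t.start()
for t in producers:
    t.join()
for _ in consumers:
    river.put(SENTINEL)              # one end-of-stream marker per consumer
for t in consumers:
    t.join()
print(sum(len(o) for o in outputs))  # 100 records flowed through the river
```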

Partitioned Execution
- Spreads computation and I/O among the processors.
- Partitioned data gives NATURAL parallelism.

N x M-Way Parallelism
- N inputs, M outputs, no bottlenecks: partitioned data with partitioned and pipelined data flows.

The Parallel Law of Computing
- Grosch's law: 2x the money buys 4x the performance (1 MIPS costs $1; 1,000 MIPS cost $32, i.e., $.03/MIPS).
- The parallel law: 2x the money buys 2x the performance (1 MIPS costs $1; 1,000 MIPS cost $1,000).
- Needs: linear speedup and linear scale-up. Not always possible.
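
The two laws as formulas, a sketch using the slide's numbers:

```python
# Under Grosch's law performance grows as the square of price (big machines
# were a bargain); under the parallel law of commodity clusters performance
# grows linearly with price.
def grosch_mips(dollars: float) -> float:
    return dollars ** 2        # 2x the money buys 4x the performance

def parallel_mips(dollars: float) -> float:
    return dollars             # 2x the money buys 2x the performance

print(grosch_mips(32))         # ~1,000 MIPS for $32: the slide's $.03/MIPS
print(parallel_mips(1000))     # 1,000 MIPS for $1,000: $1/MIPS
```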

Thesis: Scaleable Servers
- Commodity hardware allows new applications, and the new applications need huge servers.
- Clients and servers are built of the same "stuff": commodity software and commodity hardware.
- Servers should be able to:
  - Scale up: grow a node by adding CPUs, disks, networks.
  - Scale out: grow by adding nodes.
  - Scale down: can start small.
- Key software technologies: objects, transactions, clusters, parallelism.

The BIG Picture: Components and Transactions
- Software modules are objects.
- An Object Request Broker (a.k.a. transaction processing monitor) connects the objects (clients to servers).
- Standard interfaces allow software plug-ins.
- A transaction ties the execution of a "job" into an atomic unit: all-or-nothing, durable, isolated.

ActiveX and COM
- COM is the Microsoft model, the engine inside OLE; ALL Microsoft software is based on COM (ActiveX).
- CORBA + OpenDoc is the equivalent; there is heated debate over which is best.
- Both share the same key goals: encapsulation (hide the implementation), polymorphism (generic operations, key to GUIs and reuse), versioning (allow upgrades), transparency (local/remote), security (invocation can be remote), shrink-wrap (minimal inheritance), automation (easy).
- COM is now managed by the Open Group.

Linking and Embedding
- Objects are data modules; transactions are execution modules.
- Link: a pointer to an object somewhere else (think URL on the Internet).
- Embed: the bytes are here.
- Objects may be active: they can call back to subscribers.

Commodity Software Components
- Inexpensive OS, DBMS... and plug-ins. Recent TPC-C prices:
  - Oracle on DEC UNIX: 30.4 ktpmC @ $305/tpmC
  - Informix on DEC UNIX: 13.6 ktpmC @ $277/tpmC
  - DB2 on Solaris: 6.4 ktpmC @ $200/tpmC
  - SQL Server on Compaq, Windows NT: 7.3 ktpmC @ $65/tpmC (using the Web, no TP monitor!)
  - Oracle on Windows NT: 3.1 ktpmC @ $198/tpmC
- Net: "open" solutions can do even the biggest jobs: thousands of online users per "node" of a cluster.
- ActiveX, VBX, and Java plug-ins: spreadsheets, GeoQuery, FAX, voice, image libraries: a commodity component market.

Objects Meet Databases
- The basis for universal data servers, access, and integration: an object-oriented (COM-oriented) programming interface to data.
- It breaks the DBMS into components: anything (spreadsheet, photos, mail, map, document) can be a data source, with optimization/navigation "on top of" the other data sources.
- A way to componentize a DBMS; it makes an RDBMS into an O-R DBMS (assuming the optimizer understands objects).

The Pattern: Three-Tier Computing
- Presentation tier: clients do presentation and gather input; clients do some workflow (script); clients send high-level requests to the ORB (Object Request Broker).
- Business-object tier: the ORB dispatches workflows and business objects: proxies for the client that orchestrate flows and queues; server-side workflow scripts call on distributed business objects to execute the task.
- Data tier: the database.

The Three Tiers
- Web client: HTML plus a VB or Java script engine and virtual machine (VBScript, JavaScript, VB and Java plug-ins), talking HTTP + DCOM over the Internet.
- Middleware: the ORB, TP monitor, and web server, with pools of object servers, reached via DCOM (OLE DB, ODBC, ...).
- Object & data servers, including IBM legacy gateways over LU6.2.

Why Did Everyone Go to Three-Tier?
- Manageability: business rules must live with the data; middleware supplies the operations tools.
- Performance (scaleability): server resources are precious; the ORB dispatches requests to server pools.
- Technology & physics: put UI processing near the user; put shared-data processing near the shared data.

Why Put Business Objects at the Server?
- MOM's Business Objects (the clerked store): the customer comes to the store with a list, gives the list to the clerk, the clerk gets the goods and makes the invoice, and the customer pays the clerk and gets the goods. Easy to manage: the clerk controls access; encapsulation.
- DAD's Raw Data (the self-service store): the customer comes to the store, takes what he wants, fills out the invoice, and leaves money for the goods. Easy to build: no clerks.

What Middleware Does
The ORB / TP monitor / workflow manager / web server:
- Registers transaction programs: workflow and business objects (DLLs).
- Pre-allocates server pools and provides the server execution environment.
- Dynamically checks authority (request-level security).
- Does parameter binding.
- Dispatches requests to servers, with load balancing.
- Provides queues and an operator interface.
(This dispatch loop is sketched below.)
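
A sketch of that dispatch loop; every name here is illustrative rather than any real TP monitor's API:

```python
# Registered programs, a pre-allocated server pool, a request-level security
# check, parameter binding, and load-balanced dispatch with a queue.
import queue

class Server:
    """One pre-allocated server process/thread in the pool."""
    def invoke(self, program, args):
        return program(*args)          # run inside this server's environment

registry = {}                          # transaction programs / business objects
pool: queue.Queue = queue.Queue()
for _ in range(4):                     # pre-allocate the server pool
    pool.put(Server())

def register(name, program):
    registry[name] = program           # a DLL entry point in the real thing

def dispatch(user, name, args, authorized=frozenset({"clerk"})):
    if user not in authorized:         # request-level security (stubbed)
        raise PermissionError(user)
    program = registry[name]           # look up the registered program
    server = pool.get()                # load balancing: next free server;
    try:                               # callers queue here when all are busy
        return server.invoke(program, args)   # bind parameters and dispatch
    finally:
        pool.put(server)               # the server returns to the pool

register("new_order", lambda cust, item: f"order({cust}, {item})")
print(dispatch("clerk", "new_order", ("alice", "widget")))
```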

Server-Side Objects: Easy Server-Side Execution
- The server gives a simple execution environment: the object gets start, invoke, and shutdown; everything else is automatic.
- "Everything else" includes the network receiver, queue management, connections, context, security, configuration, the thread pool, synchronization, and shared data.
- Drag & drop business objects.

A New Programming Paradigm
- Develop objects on the desktop; better yet, download them from the Net.
- Script workflows as method invocations, all on the desktop.
- Then move the workflows and objects to the server(s): desktop development, three-tier deployment.
- Software CyberBricks.

Transactions Coordinate Components (ACID)
- Transaction properties:
  - Atomic: all or nothing.
  - Consistent: old and new values.
  - Isolated: automatic locking or versioning.
  - Durable: once committed, the effects survive.
- Transactions are built into modern OSs: MVS/TM, Tandem TMF, VMS DEC-DTM, NT-DTC.

Transactions & Objects
- The application requests a transaction identifier (XID).
- The XID flows with the method invocations.
- Object managers join (enlist) in the transaction.
- A distributed transaction manager coordinates commit/abort.

Transactions Coordinate Components (ACID)
- The programmer's view: bracket a collection of actions, with a simple failure model and only two outcomes:
  - Begin() ... actions ... Commit(): success!
  - Begin() ... actions ... Rollback(): failure, whether explicit or forced by a fail.
(The bracket is sketched below.)
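
The bracket as a Python context manager: a sketch rather than a real transaction manager (a real coordinator would run two-phase commit across the enlisted resource managers):

```python
# Everything inside the "with" block either commits as a unit or rolls back on
# failure; object managers enlist in the XID as on the previous slide.
class Transaction:
    def __init__(self, xid):
        self.xid, self.enlisted = xid, []

    def enlist(self, resource_manager):
        self.enlisted.append(resource_manager)   # object manager joins the XID

    def __enter__(self):                         # Begin()
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            for rm in self.enlisted:
                rm.commit(self.xid)              # Commit(): all effects survive
        else:
            for rm in self.enlisted:
                rm.rollback(self.xid)            # Rollback(): nothing happened
        return False                             # propagate the failure, if any

# Usage (accounts is a hypothetical enlisted resource manager):
# with Transaction(xid=42) as t:
#     t.enlist(accounts)
#     accounts.debit(a, 100); accounts.credit(b, 100)
```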

Distributed Transactions Enable Huge Throughput
- Each node is capable of 7 ktpmC (7,000 active users!).
- Nodes can be added to the cluster to support 100,000 users.
- Transactions coordinate the nodes; the ORB / TP monitor spreads the work among them.

Distributed Transactions Enable Huge DBs
- Distributed database technology spreads the data among the nodes.
- Transaction processing technology manages the nodes.

Thesis: Scaleable Servers
- Scaleable servers are built from CyberBricks and allow new applications.
- Servers should be able to scale up, out, and down.
- Key software technologies: clusters (tie the hardware together), parallelism (uses the independent CPUs, stores, and wires), objects (software CyberBricks), and transactions (mask errors).

Computer Industry Laws (Rules of Thumb)
- Metcalfe's law
- Moore's first law
- Bell's computer classes (seven price tiers)
- Bell's platform evolution
- Bell's platform economics
- Bill's law
- Software economics
- Grove's law
- Moore's second law
- Is info-demand infinite?
- The death of Grosch's law

Metcalfe's Law
- Network utility grows as users²: how many connections can it make?
- 1 user: no utility. 100,000 users: a few contacts. 1 million users: many on the Net. 1 billion users: everyone on the Net.
- That is why the Internet is so "hot": exponential benefit.

Moore's First Law
- XXX doubles every 18 months, a 60% increase per year (2^(12/18) ≈ 1.59). It holds for microprocessor speeds, chip density, magnetic disk density, and communications bandwidth (WAN bandwidth is approaching LANs).
- Memory chips grew from 1 Kbit in 1970 through 4K, 16K, 64K, 256K, 1M, 4M, 16M, and 64M to 256 Mbit around 2000 (PC memory from 8 KB toward 1 GB; one-chip memory size: 2 MB to 32 MB).
- Exponential growth means the past does not matter: 10x here, 10x there, and soon you're talking REAL change.
- PC costs decline faster than any other platform's (volume and learning curves), so PCs will be the building bricks of all future systems.

Bumps in the Moore's Law Road
- DRAM $/MB: the 1988 United States anti-dumping rules; prices were roughly flat 1993-1995.
- Magnetic disk $/MB: 10x per decade from 1965-1989, then 4x every 3 years (100x per decade) from 1989-1996!

Gordon Bell's 1975 VAX Planning Model... He Didn't Believe It!
- System price (K$) = 5 x 3 x 0.04 x memory size / 1.26^(t - 1972), where the 5x reflects memory being 20% of system cost, the 3x is the DEC markup, and $.04 is the price per byte of memory.
- He didn't believe the projection: a $500 machine. He couldn't comprehend the implications.
- (The chart plots the resulting price curves, from $0.01K to $100,000K, for 16 KB, 64 KB, 256 KB, 1 MB, and 8 MB machines over 1960-2000.)
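
The model evaluated in Python; the formula and constants are the slide's, while the memory size and sample years are just illustrative points:

```python
# Price falls 26% per year in the model, so a machine of fixed memory size
# crosses any price point eventually. Treat this as the planning sketch it was.
def system_price_k(memory_bytes: float, year: int) -> float:
    # 5x (memory is 20% of cost) * 3x (DEC markup) * $.04/byte,
    # deflated by 1.26x per year since 1972; result in K$.
    return 5 * 3 * 0.04 * memory_bytes / 1.26 ** (year - 1972) / 1000

for year in (1975, 1985, 1995):
    print(year, f"{system_price_k(64 * 1024, year):8.2f} K$  (64 KB machine)")
# 1975 ~19.7 K$, 1985 ~1.9 K$, 1995 ~0.2 K$: the unbelievable cheap machine.
```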

Gordon Bell's Processing, Memories, and Comm
- A 100-year chart (1947-2047) tracks the growth of processing, primary memory, secondary memory, POTS bandwidth (bps), and backbone bandwidth.

Gordon Bell's Seven Price Tiers
- $10: wristwatch computers
- $100: pocket/palm computers
- $1,000: portable computers
- $10,000: personal computers (desktop)
- $100,000: departmental computers (closet)
- $1,000,000: site computers (glass house)
- $10,000,000: regional computers (glass castle)
A superserver costs more than $100,000; a "mainframe" costs more than $1 million and must be an array of processors, disks, tapes, and comm ports.

Bell's Evolution of Computer Classes
- Technology enables two evolutionary paths: (1) constant performance at decreasing cost, and (2) constant price with increasing performance.
- On a log-price timeline, each class rides the curve down: mainframes (central), minis (departmental), workstations, PCs (personal), and whatever comes next.
- The rates: 1.26x/year = 2x per 3 years = 10x per decade (1/1.26 = 0.8); 1.6x/year = 4x per 3 years = 100x per decade (1/1.6 = 0.62).

Gordon Bell's Platform Economics
- Traditional computers are custom or semi-custom: high-tech and high-touch. New computers are high-tech and no-touch.
- The chart compares price (K$), volume (K units), and application price across computer types, from mainframe through workstation to browser.

Software Economics
- An engineer costs about $150,000/year. R&D gets 5% to 15% of the budget, so a company needs $1 million to $3 million of revenue per engineer.
- Microsoft ($9 billion): profit 24%, R&D 16%, tax 13%, SG&A 34%, product and service 13%.
- The slide shows comparable breakdowns for Intel ($16 billion), IBM ($72 billion), and Oracle ($3 billion): their R&D shares are around 8-9%, and their product-and-service shares are far larger (26% to 59%).

Software Economics: Bill's Law
- Price = Fixed_Cost / Units + Marginal_Cost.
- Bill Joy's law (Sun): don't write software for fewer than 100,000 platforms; at a $10 million engineering expense, that is a $1,000 price.
- Bill Gates's law: don't write software for fewer than 1,000,000 platforms; at a $10 million engineering expense, that is a $100 price.
- Examples: UNIX versus Windows NT: $3,500 versus $500. Oracle versus SQL Server: $100,000 versus $6,000. No spreadsheet or presentation pack on UNIX/VMS/...
- The result: commoditization of base software and hardware.
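
Bill's law as a one-line function, with the two platform-volume examples (assuming the same $10 million engineering budget in both cases; the printed values are the per-unit cost floor, and the slide's prices include margin):

```python
# The platform volume, not the engineering budget, sets the price tag:
# the fixed engineering cost is spread over the units shipped.
def price(fixed_cost: float, units: float, marginal_cost: float = 0.0) -> float:
    return fixed_cost / units + marginal_cost

print(price(10e6, 100_000))    # Joy:   $10M over 100K platforms -> $100/unit floor
print(price(10e6, 1_000_000))  # Gates: $10M over 1M platforms   -> $10/unit floor
```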

Grove's Law: The New Computer Industry
- Horizontal integration is the new structure: each layer picks the best offering from the layer below.
- The desktop (client/server) share of the market: 50% in 1991, 75% in 1995.
- The layers, with an example of each: operation (AT&T), integration (EDS), applications (SAP), middleware (Oracle), baseware (Microsoft), systems (Compaq), silicon & oxide (Intel & Seagate).

Moore's Second Law
- The cost of a fab line doubles every generation (three years); the chart of $M per fab line climbs steadily from 1960 to 2000.
- The money limit: it is hard to imagine a $10-billion line, then a $20-billion line, then a $40-billion line.
- The physical limit: quantum effects appear at 0.25 micron now, and 0.05 micron seems hard; that is 12 years, three generations, away. Lithography will need X-rays below 0.13 micron.

Constant Dollars Versus Constant Work
- One superserver can do all the world's computations, but...
- In constant dollars, the world spends 10% on information processing; as computers move from 5% penetration to 50%, that spending grows from $300 billion to $3 trillion.
- "We have the patent on the byte and algorithm."

Crossing the Chasm
- Old technology, old market: boring, competitive, slow growth.
- New technology, old market: hard; customers find the product.
- Old technology, new market: hard; the product finds customers.
- New technology, new market: very hard; no product, no customers.