The Future of Distributed Systems

The Future of Distributed Systems. Jim Gray, Researcher, Microsoft Corp. Gray@Microsoft.com

Outline. Global forces: Moore’s, Metcalfe’s, Bell’s, Bill’s, Andy’s laws; micro-dollars per transaction; cyber-content is the key value because distribution costs go to zero. Distributed systems: concepts and terms. Key software technologies: objects, transactions.

Metcalfe’s Law. Network utility = users². How many connections can it make? 1 user: no utility; 100,000 users: a few contacts; 1 million users: many on the Net; 1 billion users: everyone on the Net. That is why the Internet is so “hot”: the benefit grows as the square of the number of users.
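
The rule on this slide can be tried out numerically. The sketch below is only an illustration: `network_utility` is the slide's users² rule, and `possible_connections` is an assumed companion measure (the number of distinct user pairs) that also grows as the square of the user count and gives zero for a single user.

```python
def network_utility(users: int) -> int:
    """Utility as stated on the slide: the square of the user count."""
    return users ** 2


def possible_connections(users: int) -> int:
    """Distinct user pairs, n(n-1)/2 (an assumed companion measure)."""
    return users * (users - 1) // 2


for n in (1, 100_000, 1_000_000, 1_000_000_000):
    print(n, network_utility(n), possible_connections(n))
```

Multiplying the user count by 10 multiplies either measure by roughly 100.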

Moore’s First Law. XXX doubles every 18 months (60% increase per year): microprocessor speeds, chip density, magnetic disk density, communications bandwidth (WAN bandwidth approaching LAN speeds). Exponential growth: the past does not matter; 10x here, 10x there, soon you’re talking REAL change. PC costs decline faster than any other platform: volume and learning curves. PCs will be the building bricks of all future systems. [Figure: 1-chip memory size (2 MB to 32 MB); y-axis 8 KB to 1 GB, x-axis 1970 to 2000; chip capacity 1K to 256M bits.]

Bumps in the Moore’s Law Road. DRAM: 1988: United States anti-dumping rules; 1993-1995: price flat (?). Magnetic disk: 1965-1989: 10x/decade; 1989-1996: 4x/3 years, i.e. 100x/decade. [Figures: $/MB of DRAM and $/MB of disk, 1970 to 2000, log scale.]

Gordon Bell’s Seven Price Tiers. $10: wrist watch computers; $100: pocket/palm computers; $1,000: portable computers; $10,000: personal computers (desktop); $100,000: departmental computers (closet); $1,000,000: site computers (glass house); $10,000,000: regional computers (glass castle). Super server: costs more than $100,000. “Mainframe”: costs more than $1 million. Must be an array of processors, disks, tapes, comm ports.

Bell’s Evolution of Computer Classes. Technology enables two evolutionary paths: 1. constant performance, decreasing cost; 2. constant price, increasing performance. [Figure: log price versus time for mainframes (central), minis (dep’t.), WSs, PCs (personals), and “??”.] 1.26 per year = 2x/3 yrs = 10x/decade; 1/1.26 = .8. 1.6 per year = 4x/3 yrs = 100x/decade; 1/1.6 = .62.
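
The growth figures quoted on the Moore's-law and Bell slides can be checked with a few lines of arithmetic. This is a small sketch, not part of the talk; the function names are made up for the illustration.

```python
def annual_factor(multiple: float, years: float) -> float:
    """Yearly growth factor that yields `multiple` after `years` years."""
    return multiple ** (1.0 / years)


def growth_over(factor_per_year: float, years: float) -> float:
    """Total growth after compounding a yearly factor for `years` years."""
    return factor_per_year ** years


print(round(annual_factor(2, 1.5), 2))    # doubling every 18 months
print(round(annual_factor(2, 3), 2))      # 2x per 3 years
print(round(annual_factor(4, 3), 2))      # 4x per 3 years
print(round(growth_over(1.26, 10), 1))    # 1.26 per year over a decade
print(round(growth_over(1.6, 10), 1))     # 1.6 per year over a decade
```

Doubling every 18 months works out to a little under 60% per year, and the slide's 1.26 and 1.6 yearly factors compound to roughly 10x and roughly 100x per decade.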

Software Economics. An engineer costs about $150,000/year. R&D gets [5%…15%] of budget. Need [$3 million… $1 million] revenue per engineer. Microsoft ($9 billion): profit 24%, R&D 16%, tax 13%, SG&A 34%, product and service 13%. [Pie charts: the same breakdown (profit, R&D, tax, SG&A, P&S) for Intel ($16 billion), IBM ($72 billion), and Oracle ($3 billion).]

Software Economics: Bill’s Law. Price = Fixed_Cost / Units + Marginal_Cost. Bill Joy’s law (Sun): don’t write software for less than 100,000 platforms @ $10 million engineering expense, $1,000 price. Bill Gates’s law: don’t write software for less than 1,000,000 platforms @ $10 engineering expense, $100 price. Examples: UNIX versus Windows NT: $3,500 versus $500; Oracle versus SQL Server: $100,000 versus $6,000. No spreadsheet or presentation pack on UNIX/VMS/... Commoditization of base software and hardware.
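
The pricing formula on this slide is easy to exercise. The sketch below uses the $10 million engineering expense and the two platform counts from the slide; the marginal cost of zero is an assumption made only to isolate the fixed-cost share per unit.

```python
def unit_price(fixed_cost: float, units: int, marginal_cost: float) -> float:
    """Price = Fixed_Cost / Units + Marginal_Cost."""
    if units <= 0:
        raise ValueError("units must be positive")
    return fixed_cost / units + marginal_cost


# $10 million of engineering spread over 100,000 platforms (Joy's volume).
# marginal_cost=0.0 is an assumption, not a figure from the slide.
print(unit_price(10_000_000, 100_000, 0.0))

# The same engineering expense spread over 1,000,000 platforms (Gates's volume).
print(unit_price(10_000_000, 1_000_000, 0.0))
```

Ten times the volume cuts the fixed-cost share of each unit by a factor of ten, which is the economic point behind both "laws".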

Gordon Bell’s Platform Economics. Traditional computers: custom or semi-custom, high-tech and high-touch. New computers: high-tech and no-touch. [Figure: price (K$), volume (K), and application price on a log scale from 0.01 to 100,000, by computer type: mainframe, WS, browser.]

Grove’s Law: The New Computer Industry. Horizontal integration is the new structure. Each layer picks the best from the lower layer. Desktop (C/S) market: 1991: 50%; 1995: 75%. Layers (function: example): operation: AT&T; integration: EDS; applications: SAP; middleware: Oracle; baseware: Microsoft; systems: Compaq; silicon & oxide: Intel & Seagate.

1987: 256 tps Benchmark. $14M computer (Tandem): a 32-node processor array and a 40 GB disk array (80 drives). False floor, 2 rooms of machines. Simulate 25,600 clients. A dozen people: admin expert, hardware experts, auditor, network expert, manager, performance expert, OS expert, DB expert.

1988: DB2 + CICS Mainframe, 65 tps. IBM 4391: a $2M computer with a refrigerator-sized CPU. Simulated network of 800 clients; 2 x 3725 network controllers. 16 GB disk farm (4 x 8 x .5 GB). Staff of 6 to do the benchmark.

1997: 10 Years Later. 1 person and 1 box = 1250 tps. 1 breadbox ~ 5x the 1987 machine room. 23 GB is hand-held. One person does all the work: hardware expert, OS expert, net expert, DB expert, app expert. Cost/tps is 1,000x less: 1 micro-dollar per transaction. Hardware: 4 x 200 MHz CPU, 1/2 GB DRAM, 12 x 4 GB disk, 3 x 7 x 4 GB disk arrays.

What Happened? Moore’s law: things get 4x better every 3 years (applies to computers, storage, and networks). New economics, by commodity class (price in $/mips, software in k$/year): mainframe: 10,000 $/mips, 100 k$/year; minicomputer: 100 $/mips, 10 k$/year; microcomputer: 10 $/mips, 1 k$/year. GUI: human-computer tradeoff; optimize for people, not computers. [Figure: price versus time for mainframe, mini, micro.]

What Happens Next? Last 10 years: 1000x improvement. Next 10 years: ???? Today: text and image servers are free: 1 m$/hit cost, 70,000 m$/hit advertising revenue. Advertising pays for them. Content is the only “real” expense. “You ain’t seen nothing yet!” [Figure: performance versus time, 1985, 1995, 2005, with “?” for the future.]

Kinds of Information Processing. Immediate (network): point-to-point: conversation, money; broadcast: lecture, concert. Time-shifted (database): point-to-point: mail; broadcast: book, newspaper. It’s ALL going electronic. Immediate is being stored for analysis (so ALL database). Analysis and automatic processing are being added.

Why Put Everything in Cyberspace? Low rent: min $/byte. Shrinks time: now or later. Shrinks space: here or there. Automate processing: knowbots. Network: point-to-point OR broadcast; immediate OR time-delayed. Database: locate, process, analyze, summarize.

Billions of Clients. Every device will be “intelligent”: doors, rooms, cars… Computing will be ubiquitous.

Billions of Clients Need Millions of Servers. All clients are networked to servers; they may be nomadic or on-demand. Fast clients want faster servers. Servers provide shared data, control, coordination, communication. [Figure: mobile clients and fixed clients connect to servers: server, super server.]

Thesis: Many little beat few big. [Figure: price tiers $1 million, $100 K, $10 K for mainframe, mini, micro, nano, and pico processors (1 MM³); storage hierarchy of 10 pico-second RAM, 10 nano-second RAM, 10 microsecond RAM, 10 millisecond disc, 10 second tape archive, with capacities 1 MB, 10 MB, 10 GB, 1 TB, 100 TB; disc form factors 14", 9", 5.25", 3.5", 2.5", 1.8".] Smoking, hairy golf ball: 1 M SPECmarks, 1 TFLOP; 10^6 clocks to bulk RAM; event-horizon on chip; VM reincarnated; multiprogram cache, on-chip SMP. How to connect the many little parts? How to program the many little parts? Fault tolerance?

Future Super Server: 4T Machine. Array of 1,000 4B machines: 1 bps processors, 1 BB DRAM, 10 BB disks, 1 Bbps comm lines; 1 TB tape robot. A few megabucks. Challenge: manageability, programmability, security, availability, scaleability, affordability; as easy as a single system. [Figure: Cyber Brick, a 4B machine: CPU, 5 GB RAM, 50 GB disc.] Future servers are CLUSTERS of processors and discs. Distributed database techniques make clusters work.

The Hardware Is in Place… and then a miracle occurs? SNAP: scaleable network and platforms. Commodity distributed OS built on commodity platforms and commodity network interconnect. Enables parallel applications.

Outline. Concepts and terminology. Why distributed. Distributed data & objects. Distributed execution. Three-tier architectures. Transaction concepts.

What’s a Distributed System? Centralized: everything in one place (stand-alone PC or mainframe). Distributed: some parts remote (distributed users, distributed execution, distributed data).

Why Distribute? There is no best organization: companies constantly swing between centralized (focus, control, economy) and decentralized (adaptive, responsive, competitive). Why distribute? Reflect organization or application structure; empower users/producers; improve service (response/availability); distribute load; use PC technology (economics).

What Should Be Distributed? Users and user interface: thin client. Processing: trim client. Data: fat client. Will discuss tradeoffs later. [Figure: layers: presentation, workflow, business objects, database.]

Transparency in Distributed Systems. Make a distributed system as easy to use and manage as a centralized system: give a single-system image. Location transparency: hide the fact that an object is remote, has moved, or is partitioned or replicated. The name doesn’t change if the object is replicated, partitioned, or moved.

Naming: The Basics. Objects have Globally Unique Identifiers (GUIDs), location(s) = address(es), and name(s). Addresses can change; objects can have many names. Names are context dependent (Jim @ KGB is not the same as Jim @ CIA). Many naming systems: UNC: \\node\device\dir\dir\dir\object; Internet: http://node.domain.root/dir/dir/dir/object; LDAP: ldap://ldap.domain.root/o=org,c=US,cn=dir. [Figure: the names Jim and James map to one GUID and address.]

Name Servers in Distributed Systems. Name servers translate names + context to address (+ GUID). Name servers are partitioned (subtrees of the name space), replicate the root of the name tree, and form a hierarchy. Distributed data from hell: high read traffic, high reliability & availability, autonomy. [Figure: North and South name servers, each holding the root plus the Northern or Southern names.]
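
The naming ideas on the last two slides (context-dependent names, several names per object, a stable GUID, an address that can change) can be sketched as a toy name server. Everything here is hypothetical illustration: the class names, the `node-a:1` style addresses, and the use of random UUIDs as GUIDs are assumptions, not an API from the talk.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Entry:
    guid: uuid.UUID
    addresses: list


@dataclass
class NameServer:
    """Maps (context, name) to an entry; several names may share one entry."""
    table: dict = field(default_factory=dict)

    def register(self, context, names, address):
        entry = Entry(uuid.uuid4(), [address])
        for name in names:
            self.table[(context, name)] = entry
        return entry

    def resolve(self, context, name):
        """Translate name + context to (GUID, address)."""
        entry = self.table[(context, name)]
        return entry.guid, entry.addresses[0]

    def move(self, context, name, new_address):
        # The address changes; the names and the GUID do not.
        self.table[(context, name)].addresses[0] = new_address


ns = NameServer()
ns.register("KGB", ["Jim", "James"], "node-a:1")   # hypothetical addresses
ns.register("CIA", ["Jim"], "node-b:7")
guid_before, _ = ns.resolve("KGB", "Jim")
ns.move("KGB", "Jim", "node-c:3")
guid_after, addr_after = ns.resolve("KGB", "James")
```

A real name service would also partition the table by subtree and replicate the root, as the slide describes; this sketch keeps a single table for brevity.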

Autonomy in Distributed Systems. The owner of a site (or node, or application, or database) wants to control it. If my part is working, I must be able to access and manage it (reorganize, upgrade, add user, …). Autonomy is essential but difficult to implement: it conflicts with global consistency. Examples: naming, authentication, admin, …

Security: The Basics. Authentication server: subject + authenticator => (Yes + token) | No. Security matrix: who can do what to whom (subject, object, permissions). An access control list is a column of the matrix; “who” is an authenticated ID. In a distributed system, “who” and “what” and “whom” are distributed objects.
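
A minimal sketch of the security matrix and its columns follows. The subjects, objects, passwords, and token format are all made up for the illustration, and the plain-text credential table is a simplification that a real authentication server would not use.

```python
# Security matrix: (subject, object) -> set of permissions.
matrix = {
    ("jim", "payroll.db"): {"read"},
    ("ann", "payroll.db"): {"read", "write"},
    ("ann", "printer"): {"print"},
}

# Hypothetical credentials, stored in the clear only to keep the sketch short.
CREDENTIALS = {"jim": "s3cret", "ann": "hunter2"}


def authenticate(subject, authenticator):
    """subject + authenticator => token, or None for 'No'."""
    if CREDENTIALS.get(subject) == authenticator:
        return ("token", subject)
    return None


def acl(obj):
    """Access control list: the column of the matrix for one object."""
    return {s: perms for (s, o), perms in matrix.items() if o == obj}


def allowed(token, obj, permission):
    """Check one cell of the matrix for an authenticated subject."""
    if token is None:
        return False
    _, subject = token
    return permission in matrix.get((subject, obj), set())
```

For example, `allowed(authenticate("ann", "hunter2"), "payroll.db", "write")` is true, while the same request from `jim` is refused.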

Security in Distributed Systems. Security domain: nodes with a shared security server. Security domains can have trust relationships: A trusts B means A “believes” B when it says this is Jim@B. Security domains form a hierarchy. Delegation: passing authority to a server; when A asks B to do something (e.g. print a file, read a database), B may need A’s authority. Autonomy requires that each node is an authenticator and does its own security checks. Internet today: no trust among domains (firewalls, many passwords); trust based on digital signatures.

Clusters: The Ideal Distributed System. A cluster is a distributed system BUT with a single location, manager, and security policy; it is relatively homogeneous; communication is high bandwidth, low latency, low error rate. Clusters use distributed system techniques for load distribution, storage, execution, growth, fault tolerance.

Cluster: Shared What? Shared memory multiprocessor: multiple processors, one memory; all devices are local; DEC or SGI or Sequent 16x nodes. Shared disk cluster: an array of nodes; all share common disks; VAXcluster + Oracle. Shared nothing cluster: each device is local to a node; ownership may change; Tandem, SP2, Wolfpack.

Outline. Concepts and terminology. Why distribute. Distributed data & objects: partitioned, replicated. Distributed execution. Three-tier architectures. Transaction concepts.

Partitioned Data. Break a file into disjoint groups (e.g. Orders split into N.A., S.A., Europe, Asia). Exploit data access locality: put data near the consumer for less network traffic, better response time, better availability. Owner controls data: autonomy. Spread load: data or traffic may exceed a single store.

How to Partition Data? How to partition: by attribute, or random, or by source, or by use. Problem: to find the data you must have a directory (replicated) or an algorithm. Encourages attribute-based partitioning. [Figure: partitions N.A., S.A., Europe, Asia.]
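
The two ways of finding a partition named on this slide, a directory or an algorithm, can be sketched side by side. The country codes in the directory and the choice of a CRC32 hash are assumptions for the illustration; the four partition names come from the slide's figure.

```python
import zlib

PARTITIONS = ["N.A.", "S.A.", "Europe", "Asia"]

# Directory approach: an explicit map from an attribute value to a partition.
# In a real system this directory would itself be replicated.
DIRECTORY = {"US": "N.A.", "BR": "S.A.", "FR": "Europe", "JP": "Asia"}


def partition_by_directory(country: str) -> str:
    """Attribute-based partitioning: look the attribute up in the directory."""
    return DIRECTORY[country]


def partition_by_algorithm(order_id: str) -> str:
    """Algorithmic partitioning: a deterministic hash, so no lookup table."""
    index = zlib.crc32(order_id.encode("utf-8")) % len(PARTITIONS)
    return PARTITIONS[index]
```

The directory keeps related data together (all French orders in Europe) at the price of maintaining the map; the hash spreads load evenly but ignores locality.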

Replicated Data. Place a fragment at many sites. Pros: improves availability; disconnected (mobile) operation; distributes load; reads are cheaper. Cons: N times more updates; N times more storage. Placement strategies: dynamic: cache on demand; static: place specific. [Figure label: Catalog.]

Updating Replicated Data. When a replica is updated, how do changes propagate? Master copy, many slave copies (SQL Server): always know the correct value (master); change propagation can be transactional, as soon as possible, periodic, or on demand. Symmetric, and anytime (Access): allows mobile (disconnected) updates; updates propagated ASAP, periodic, or on demand; non-serializable; colliding updates must be reconciled; hard to know the “real” value.
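
The master-copy scheme with deferred (periodic or on-demand) propagation might look like the sketch below. The class and method names are hypothetical; it is not how SQL Server implements replication, only the shape of the idea.

```python
class Replica:
    def __init__(self):
        self.data = {}


class Master:
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = []          # updates not yet propagated

    def update(self, key, value):
        self.data[key] = value     # the master always holds the correct value
        self.pending.append((key, value))

    def propagate(self):
        """Apply pending updates, in order, to every replica; return the count."""
        sent = 0
        for key, value in self.pending:
            for replica in self.replicas:
                replica.data[key] = value
                sent += 1
        self.pending.clear()
        return sent


replicas = [Replica(), Replica(), Replica()]
master = Master(replicas)
master.update("price", 10)
stale = replicas[0].data.get("price")   # read before propagation
sent = master.propagate()
```

One logical update turns into one message per replica (the "N times more updates" cost), and a replica read before propagation sees a stale value.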

Replication and Partitioning Compared. Base case: a 1 TPS system (1 TPS server, 100 users). Central scaleup, 2x more work: a 2 TPS centralized system (2 TPS server, 200 users). Partition scaleup, 2x more work: two 1 TPS systems (each a 1 TPS server with 100 users; 0 tps exchanged). Replication scaleup, 4x more work: two 2 TPS systems (each a 2 TPS server with 100 users; 1 tps exchanged).

Outline. Concepts and terminology. Why distribute. Distributed data & objects: partitioned, replicated. Distributed execution: remote procedure call, queues. Three-tier architectures. Transaction concepts.

Distributed Execution: Threads and Messages. A thread is an execution unit (software analog of CPU + memory). Threads execute at a node. Threads communicate via shared memory (local) or messages (local and remote).

Peer-to-Peer or Client-Server. Peer-to-peer is symmetric: either side can send. Client-server: client sends requests, server sends responses; a simple subset of peer-to-peer.

Connection-less or Connected. Connection-less: request contains client id, client context, and work request; client authenticated on each message; only a single response message; e.g. HTTP, NFS v1. Connected (sessions): open - request/reply - close; client authenticated once; messages arrive in order; can send many replies (e.g. FTP); server has client context (context sensitive); e.g. Winsock and ODBC. HTTP is adding connections.

Remote Procedure Call: The Key to Transparency. An object may be local or remote; methods on the object work wherever it is. Local invocation: y = pObj->f(x); the call passes x to f(), f() executes return val, and the caller gets y = val.

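Remote Procedure Call: The Key to Transparency. Remote invocation: y = pObj->f(x) goes to a proxy; the proxy asks “Obj local?”, marshals x, and sends it to the stub; the stub unmarshals x and calls pObj->f(x); f() executes return val; val is marshaled back, unmarshaled by the proxy, and the caller gets y = val.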
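
The proxy/stub path on this slide can be sketched in a few lines. This is an assumed, in-process illustration: JSON stands in for the marshaling format, and a direct call to `stub.handle` stands in for the network. None of it is the DCOM or CORBA wire protocol.

```python
import json


class Obj:
    def f(self, x):
        return x * 2


class Stub:
    """Server side: unmarshal the request, call the real object, marshal the reply."""

    def __init__(self, target):
        self.target = target

    def handle(self, request_bytes):
        request = json.loads(request_bytes.decode("utf-8"))
        val = getattr(self.target, request["method"])(*request["args"])
        return json.dumps({"val": val}).encode("utf-8")


class Proxy:
    """Client side: same method name as Obj, but each call crosses the 'wire'."""

    def __init__(self, send):
        self.send = send

    def f(self, x):
        request = json.dumps({"method": "f", "args": [x]}).encode("utf-8")
        reply = self.send(request)
        return json.loads(reply.decode("utf-8"))["val"]


stub = Stub(Obj())
local_obj = Obj()
remote_obj = Proxy(stub.handle)   # stub.handle stands in for the network
```

The caller writes `y = obj.f(x)` either way; only the object handed to it (`local_obj` or `remote_obj`) differs, which is the transparency the slide is after.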

Object Request Broker (ORB). Orchestrates RPC: registers servers; manages pools of servers; connects clients to servers; does naming and request-level authorization; provides transaction coordination (new feature). Old names: Transaction Processing Monitor, Web server, NetWare.

History and Alphabet Soup. Microsoft DCOM is based on OSF-DCE technology (DCE RPC, GUIDs, IDL, Kerberos, DNS); DCOM and ActiveX extend it. [Timeline figure, 1985 to 1995: UNIX International, Open Software Foundation (OSF), X/Open, Object Management Group (OMG), Open Group; technologies Solaris, OSF DCE, NT, COM, CORBA, ODBC, XA/TX.]

ActiveX and COM. COM is the Microsoft model, the engine inside OLE; ALL Microsoft software is based on COM (ActiveX). CORBA + OpenDoc is the equivalent. Heated debate over which is best. Both share the same key goals: encapsulation (hide implementation); polymorphism (generic operations, key to GUI and reuse); versioning (allow upgrades); transparency (local/remote); security (invocation can be remote); shrink-wrap (minimal inheritance); automation (easy). COM is now managed by the Open Group.

Linking and Embedding. Objects are data modules; transactions are execution modules. Link: pointer to an object somewhere else (think URL on the Internet). Embed: the bytes are here. Objects may be active and can call back to subscribers.

Bottom Line Re ORBs. Microsoft promises Cairo: distributed objects, secure, transparent, fast invocation. Netscape promises CORBA. Both will deliver. Customers can pick the best one.

Using RPC for Transparency: Partition Transparency. Send updates to the correct partition: the caller writes y = pfile->write(x); the proxy picks the partition (local or remote), marshals x, and sends it; the remote side unmarshals x, calls pObj->write(x), and returns val.

Using RPC for Transparency: Replication Transparency. Send updates to EACH node: the caller writes y = pfile->write(x); the proxy sends x to each replica and returns val.
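
Partition and replication transparency both come down to a proxy that offers the same `write(x)` call and decides where the write goes. The sketch below is an assumed illustration: the `Node` class, the CRC32 routing rule, and returning the first replica's result are choices made for the example, not details from the talk.

```python
import zlib


class Node:
    def __init__(self):
        self.records = []

    def write(self, x):
        self.records.append(x)
        return len(self.records)


class PartitionedFile:
    """write(x) goes to exactly one partition, chosen from x itself."""

    def __init__(self, nodes):
        self.nodes = nodes

    def write(self, x):
        index = zlib.crc32(x.encode("utf-8")) % len(self.nodes)
        return self.nodes[index].write(x)


class ReplicatedFile:
    """write(x) goes to EACH replica."""

    def __init__(self, nodes):
        self.nodes = nodes

    def write(self, x):
        results = [node.write(x) for node in self.nodes]
        return results[0]


part_nodes = [Node(), Node()]
repl_nodes = [Node(), Node()]
pfile = PartitionedFile(part_nodes)
rfile = ReplicatedFile(repl_nodes)
for record in ("a", "b", "c"):
    pfile.write(record)
    rfile.write(record)
```

The calling code is identical for both files; the partitioned file ends up with each record on one node, the replicated file with every record on every node.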

Outline. Concepts and terminology. Why distributed. Distributed data & objects. Distributed execution: remote procedure call, queues. Three-tier architectures: what, why. Transaction concepts.

Client/Server Interactions. All can be done with RPC. Request-response: response may be many messages. Conversational: server keeps client context. Dispatcher: three-tier, complex operation at server. Queued: de-couples client from server, allows disconnected operation.

Queued Request/Response. Time-decouples client and server. Three transactions: submit, perform, response. Almost real time, ASAP processing. Communicate at each other’s convenience. Allows mobile (disconnected) operation. Disk queues survive client & server failures.
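
The three transactions of queued request/response (submit, perform, response) can be sketched with two queue tables. This is an assumed illustration using Python's `sqlite3` module; an in-memory database is used so the example is self-contained, whereas a disk file would be needed for the queues to survive failures. The "work" done by the server (upper-casing the request) is a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("CREATE TABLE responses (id INTEGER PRIMARY KEY, body TEXT)")
conn.commit()


def submit(body):
    """Transaction 1: the client enqueues a request."""
    with conn:
        cur = conn.execute("INSERT INTO requests (body) VALUES (?)", (body,))
        return cur.lastrowid


def perform():
    """Transaction 2: the server dequeues, does the work, enqueues the response."""
    with conn:
        row = conn.execute(
            "SELECT id, body FROM requests ORDER BY id LIMIT 1").fetchone()
        if row is None:
            return False
        conn.execute("DELETE FROM requests WHERE id = ?", (row[0],))
        conn.execute("INSERT INTO responses (id, body) VALUES (?, ?)",
                     (row[0], row[1].upper()))
        return True


def receive(request_id):
    """Transaction 3: the client dequeues its response, if it is ready."""
    with conn:
        row = conn.execute(
            "SELECT body FROM responses WHERE id = ?", (request_id,)).fetchone()
        if row is None:
            return None
        conn.execute("DELETE FROM responses WHERE id = ?", (request_id,))
        return row[0]


rid = submit("ship order 7")
early = receive(rid)      # the server has not run yet
perform()
late = receive(rid)
```

The client and server never have to be running at the same moment: `early` finds nothing, and `late` picks up the response once the server has performed the request.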

Why Queued Processing? Prioritize requests: an ambulance dispatcher favors high-priority calls. Manage workflows: order, build, ship, invoice, pay. Deferred processing in mobile apps. Interface heterogeneous systems: EDI; MOM: Message-Oriented Middleware; DAD: Direct Access to Data.

Work Distribution Spectrum: from fat to thin clients. Layers: presentation and plug-ins; workflow (manages session & invokes objects); business objects; database.

Transaction Processing Evolution to Three Tier: intelligence migrated to clients. Stages: mainframe batch processing (centralized); dumb terminals & remote job entry; intelligent terminals and database backends; workflow systems, Object Request Brokers, application generators. [Figure labels: mainframe, cards, green screen 3270, server, TP monitor, ORB, active.]

Web Evolution to Three Tier: intelligence migrated to clients (like TP). Stages: character-mode clients, smart servers; GUI browsers and Web file servers; GUI plug-ins, Web dispatchers, CGI; smart clients and Web dispatcher (ORB), pools of app servers (ISAPI, Viper), workflow scripts at client & server. [Figure labels: WAIS, archie, gopher, green screen, Mosaic, NS & IE, active, server.]

PC Evolution to Three Tier: intelligence migrated to server. Stand-alone PC (centralized); PC + file & print server: message per I/O (I/O request, reply, disk I/O); PC + database server: message per SQL statement; PC + app server: message per transaction (ActiveX client, ORB, ActiveX server, Xscript).

The Pattern: Three Tier Computing. Clients do presentation and gather input. Clients do some workflow (Xscript). Clients send high-level requests to the ORB (Object Request Broker). The ORB dispatches workflows and business objects, which act as proxies for the client and orchestrate flows & queues. Server-side workflow scripts call on distributed business objects to execute the task. [Figure: presentation, workflow, business objects, database.]

The Three Tiers. Web client: HTML; VB or Java script engine (VBScript, JavaScript); VB or Java virtual machine; plug-ins. Internet: HTTP + DCOM. Middleware: ORB, TP monitor, Web server, ..., in front of object server pools. DCOM (OLE DB, ODBC, ...) to the object & data server. LU6.2 to IBM legacy gateways.

Why Did Everyone Go to Three-Tier? Manageability: business rules must be with data; middleware operations tools. Performance (scaleability): server resources are precious; ORB dispatches requests to server pools. Technology & physics: put UI processing near the user; put shared data processing near the shared data.

Why Put Business Objects at Server? MOM’s Business Objects: customer comes to store with list; gives list to clerk; clerk gets goods, makes invoice; customer pays clerk, gets goods. Easy to manage; clerk controls access; encapsulation. DAD’s Raw Data: customer comes to store; takes what he wants; fills out invoice; leaves money for goods. Easy to build; no clerks.

What Middleware Does (ORB, TP Monitor, Workflow Mgr, Web Server). Registers transaction programs: workflow and business objects (DLLs). Pre-allocates server pools. Provides server execution environment. Dynamically checks authority (request-level security). Does parameter binding. Dispatches requests to servers: parameter binding, load balancing. Provides queues. Operator interface.

Server Side Objects: Easy Server-Side Execution. A server ORB gives a simple execution environment. Object gets start, invoke, shutdown; everything else is automatic: network, receiver, queue management, connections, context, security, configuration, thread pool, service logic, synchronization, shared data. Drag & drop business objects.

A New Programming Paradigm. Develop objects on the desktop (better yet: download them from the Net). Script workflows as method invocations. All on the desktop. Then move workflows and objects to server(s). Gives desktop development, three-tier deployment. Software Cyberbricks.

Why Server Pools? Server resources are precious. Clients have 100x more power than server. Pre-allocate everything on server: preallocate memory; pre-open files; pre-allocate threads; pre-open and authenticate clients. Keep high duty-cycle on objects (re-use them). Pool threads, not one per client. Classic example: TPC-C benchmark: 2 processes, everything pre-allocated. N clients x N servers x F files = N x N x F file opens!!! [Figure: 7,000 clients (IE) via HTTP to IIS, then a pool of DBC links to SQL.]
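
"Pool threads, not one per client" can be illustrated with Python's standard `ThreadPoolExecutor`. The pool size of 4 and the 100 requests are assumed values; note that this executor creates its worker threads on demand up to the cap and then reuses them, so it shows pooling and reuse rather than strict pre-allocation.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

POOL_SIZE = 4      # upper bound on server threads (assumed value)
CLIENTS = 100      # number of client requests (assumed value)


def serve(request):
    """Handle one request and report which pooled thread ran it."""
    return request * 2, threading.get_ident()


with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    results = list(pool.map(serve, range(CLIENTS)))

answers = [value for value, _ in results]
threads_used = {thread_id for _, thread_id in results}
print(len(threads_used), "threads served", CLIENTS, "requests")
```

However many clients arrive, the server never runs more than `POOL_SIZE` threads, which keeps each thread's duty cycle high.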

Classic Three-Tier Example: TPC-C. Transaction Processing Performance Council (TPC): standard performance benchmarks. 5 transaction types: order entry, payment, status (OLTP); delivery (mini-batch); restock (mini-DSS). Metrics: throughput, price/performance. Shows best practices: everyone is three tier; 2 processes at server; everything pre-allocated. [Figure: 7,000 Web clients via HTTP to IIS (= Web), then a pool of DBC links (ODBC) to SQL.]

Classic Mistakes. Thread per terminal: fix: DB server thread pools; fix: server pools. Process per request (CGI): fix: ISAPI & NSAPI DLLs; fix: connection pools. Many messages per operation: fix: stored procedures; fix: server-side objects. File open per request: fix: cache hot files.

Outline. Laws & micro$/transaction. Distributed systems: why distributed; distributed data & objects; distributed execution; three-tier architectures (why: manageability & performance; what: server-side workflows & objects). Transaction concepts: why transactions? Using transactions.

Thesis. Transactions are key to structuring distributed applications. ACID properties ease exception handling. Atomic: all or nothing. Consistent: state transformation. Isolated: no concurrency anomalies. Durable: committed transaction effects persist.

What Is a Transaction? Programmer’s view: bracket a collection of actions. A simple failure model with only two outcomes: success (Begin(), actions, Commit()) or failure (Begin(), actions, Rollback(); or Begin(), actions, Fail!, which forces a Rollback()).
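
The bracket-and-two-outcomes model can be seen with Python's `sqlite3` module, where `with conn:` commits if the block succeeds and rolls back if it raises. The account table, the balance check, and the `transfer` function are assumptions made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))")
conn.execute("INSERT INTO account VALUES ('a', 100), ('b', 0)")
conn.commit()


def transfer(src, dst, amount):
    """Begin; do the actions; Commit on success, Rollback on any failure."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))


first = transfer("a", "b", 30)     # succeeds
second = transfer("a", "b", 500)   # debit violates the CHECK, so it rolls back
```

In the failed transfer the credit to `b` has already been applied when the debit fails, yet the rollback removes it: the caller sees only "all" or "nothing".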

Why Bother: Atomicity? RPC semantics: at most once: try one time; at least once: keep trying until acknowledged; exactly once: keep trying until acknowledged, and the server discards duplicate requests.

Why Bother: Atomicity? Example: insert a record in a file. At most once: time-out means “maybe”. At least once: retry may get a “duplicate” error, or retry may do a second insert. Exactly once: you do not have to worry. What if the operation involves inserting several records? Sending several messages? Want ALL or NOTHING for a group of actions.
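
Exactly-once behaviour, as described on the previous slide, combines client retries with server-side duplicate discard. The sketch below is an assumed illustration: the request-id scheme, the `LossyChannel` that drops acknowledgements, and the retry limit are all invented for the example.

```python
import uuid


class Server:
    def __init__(self):
        self.records = []
        self.replies = {}            # request id -> saved reply

    def insert(self, request_id, record):
        if request_id in self.replies:      # duplicate: discard, resend old reply
            return self.replies[request_id]
        self.records.append(record)
        reply = len(self.records)
        self.replies[request_id] = reply
        return reply


class LossyChannel:
    """Delivers every request but drops the first `drops` acknowledgements."""

    def __init__(self, server, drops):
        self.server = server
        self.drops = drops

    def call(self, request_id, record):
        reply = self.server.insert(request_id, record)
        if self.drops > 0:
            self.drops -= 1
            raise TimeoutError("acknowledgement lost")
        return reply


def insert_exactly_once(channel, record, max_tries=5):
    request_id = str(uuid.uuid4())   # the same id is reused on every retry
    for _ in range(max_tries):
        try:
            return channel.call(request_id, record)
        except TimeoutError:
            continue
    raise TimeoutError("no acknowledgement")


server = Server()
reply = insert_exactly_once(LossyChannel(server, drops=2), "row-1")
```

The request reaches the server three times, but the record is inserted once. Without the saved replies, the same retries would produce the "second insert" the slide warns about.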

Why Bother: Durability. Once a transaction commits, want effects to survive failures. Fault tolerance: old master-new master won’t work: can’t do daily dumps (would lose recent work); want “continuous” dumps. Redo “lost” transactions in case of failure. Resend unacknowledged messages.

Why ACID for Client/Server and Distributed. ACID is important for centralized systems, and failures in centralized systems are simpler. In distributed systems there are more failures, and more independent ones, so ACID is harder to implement. That makes it even MORE IMPORTANT: simple failure model, simple repair model.

ACID Generalizations. Taxonomy of actions: unprotected: not undone or redone (temp files); transactional: can be undone before commit (database and message operations); real: cannot be undone (drill a hole in a piece of metal, print a check). Nested transactions: subtransactions. Workflow: long-lived transactions.

Programming & Transactions: The Application View. You start (e.g. in Transact-SQL): Begin [Distributed] Transaction <name>; perform actions; optional Save Transaction <name>; Commit or Rollback. Or you inherit an XID: the caller passes you a transaction; you return or Rollback; you can Begin/Commit sub-transactions; you can use save points.

Nested Transactions: Going Beyond Flat Transactions. Need transactions within transactions. Sub-transactions commit only if the root does; only the root commit is durable. Subtransactions may roll back; if so, all their subtransactions roll back. Parallel version of nested transactions. [Figure: transaction tree rooted at T1 with children T11, T12, T13 and grandchildren T111-T114, T121-T123, T131-T133.]
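
Save points give a sequential approximation of subtransactions: a subtransaction's work can be undone on its own, and nothing is durable until the root commits. The sketch below uses SQLite's SAVEPOINT, RELEASE, and ROLLBACK TO statements through Python's `sqlite3`; it is an assumed illustration and does not cover the parallel nested transactions mentioned on the slide.

```python
import sqlite3

# isolation_level=None lets the code issue BEGIN/COMMIT itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (step TEXT)")

conn.execute("BEGIN")                        # root transaction T1
conn.execute("INSERT INTO log VALUES ('T1')")

conn.execute("SAVEPOINT t11")                # subtransaction T11
conn.execute("INSERT INTO log VALUES ('T11')")
conn.execute("RELEASE t11")                  # T11 done; durable only if T1 commits

conn.execute("SAVEPOINT t12")                # subtransaction T12
conn.execute("INSERT INTO log VALUES ('T12')")
conn.execute("ROLLBACK TO t12")              # undo T12's work only
conn.execute("RELEASE t12")

conn.execute("COMMIT")                       # root commit makes T1 and T11 durable
steps = [row[0] for row in conn.execute("SELECT step FROM log ORDER BY rowid")]
```

After the root commit the log holds the root's and T11's rows; T12's row was rolled back without disturbing the rest.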

Workflow: A Sequence of Transactions. Application transactions are multi-step: order, build, ship & invoice, reconcile. Each step is an ACID unit. A workflow is a script describing the steps. Workflow systems instantiate the scripts, drive the scripts, and allow queries against scripts. Examples: manufacturing work in process (WIP); queued processing; loan application & approval; hospital admissions; …

Workflow Scripts. Workflow scripts are programs (could use VBScript or JavaScript). If a step fails, a compensation action handles the error. Events, messages, time, and other steps cause a step. The workflow controller drives flows. [Figure: flow constructs: source, step, fork, join, branch, case, loop, compensation action.]
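
One common reading of "compensation action handles error" is that, when a step fails, the steps already completed are compensated in reverse order. The sketch below assumes that reading; the `run_workflow` function and the order/build/ship steps are hypothetical, and a real workflow controller would also persist the flow's state between steps.

```python
def run_workflow(steps):
    """Run (name, action, compensation) steps in order.

    If a step raises, run the compensations of the completed steps in
    reverse order and report failure.
    """
    done = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
            log.append("did " + name)
            done.append((name, compensate))
        except Exception:
            log.append("failed " + name)
            for prev_name, prev_compensate in reversed(done):
                prev_compensate()
                log.append("compensated " + prev_name)
            return False, log
    return True, log


def ok():
    pass


def fail():
    raise RuntimeError("out of stock")


ok_result, log = run_workflow([
    ("order", ok, ok),
    ("build", ok, ok),
    ("ship", fail, ok),
])
```

Each step would be its own ACID transaction in practice; compensation is needed precisely because earlier steps have already committed and cannot simply be rolled back.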

Workflow and ACID. Workflow is not Atomic or Isolated: results of a step are visible to all. Workflow is Consistent and Durable. Each flow may take hours, weeks, months. The workflow controller keeps flows moving, maintains context (state) for each flow, and provides a query and operator interface, e.g.: “what is the status of Job # 72149?”

ACID Objects Using ACID DBs: The Easy Way to Build Transactional Objects. Application uses transactional objects (objects have ACID properties). If an object is built on top of ACID objects, then the object is ACID. Example: New, EnQueue, DeQueue on top of SQL; SQL provides ACID. Business object: Customer; business object manager: CustomerMgr, on top of SQL: dim c as Customer : dim CM as CustomerMgr : ... : set C = CM.get(CustID) : C.credit_limit = 1000 : CM.update(C, CustID) : ... Persistent programming languages automate this.
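
The slide's example of New, EnQueue, DeQueue on top of SQL can be sketched as a small queue class that inherits its ACID behaviour from the database underneath. This is an assumed illustration using `sqlite3`; the class name and table layout are invented, and the table name is interpolated into the SQL, so it must come from trusted code.

```python
import sqlite3


class SqlQueue:
    """New / EnQueue / DeQueue built on SQL; the database supplies ACID."""

    def __init__(self, conn, name):          # "New"
        self.conn = conn
        self.table = name                    # trusted identifier, not user input
        conn.execute(
            f"CREATE TABLE {name} "
            "(id INTEGER PRIMARY KEY AUTOINCREMENT, item TEXT)")

    def enqueue(self, item):
        self.conn.execute(
            f"INSERT INTO {self.table} (item) VALUES (?)", (item,))

    def dequeue(self):
        row = self.conn.execute(
            f"SELECT id, item FROM {self.table} ORDER BY id LIMIT 1").fetchone()
        if row is None:
            return None
        self.conn.execute(f"DELETE FROM {self.table} WHERE id = ?", (row[0],))
        return row[1]


conn = sqlite3.connect(":memory:")
q = SqlQueue(conn, "work")
q.enqueue("job-1")
conn.commit()

taken = q.dequeue()      # part of an open transaction
conn.rollback()          # the caller aborts, so the dequeue is undone
retaken = q.dequeue()
conn.commit()
```

The queue class contains no undo or recovery logic of its own: when the caller rolls back, the dequeued item reappears because the DELETE underneath was rolled back.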

ACID Objects From Bare Metal: The Hard Way to Build Transactional Objects. The object class is a Resource Manager (RM): it provides ACID objects from persistent storage, Undo (on rollback), Redo (on restart or media failure), and isolation for concurrent ops. Microsoft SQL Server, IBM DB2, Oracle, … are resource managers. Many more coming. Any masochist can build one.

Outline. Why distributed. Distributed data & objects. Distributed execution. Three-tier architectures. Transaction concepts: why transactions? Using transactions: programming, workflow.

References. Orfali, Harkey & Edwards, Essential Client/Server Survival Guide, 2nd ed., J. Wiley, 1996. Orfali & Harkey, Client/Server Programming with Java and CORBA, J. Wiley, 1997. Bernstein & Newcomer, Principles of Transaction Processing, Morgan Kaufmann, 1997. Gray & Reuter, Transaction Processing Concepts and Techniques, Morgan Kaufmann, 1993.