The Whirlwind Tour Chapter 1a.


Transactions: Where It All Started
[Cuneiform] documents now number about half a million, three-quarters of them more or less directly related to the history of law - dealing, as they do, with contracts, acknowledgment of debts, receipts, inventories, and accounts, as well as containing records and minutes of judgments rendered in courts, business letters, administrative and diplomatic correspondence, laws, international treaties, and other official transactions. The total evidence enables the historian to reach back as far as the beginnings of writing, to the dawn of history. [...] Moreover, because of the inconvenience of writing in stone or clay, Mesopotamians wrote only when economic or political necessity demanded it.
(Encyclopaedia Britannica, 1974 edition)
WICS 1999 Transaction Processing: Gray & Reuter

From Transactions to Transaction Processing Systems - I
The Sumerian way of doing business involved two components:
Database: An abstract system state, represented as marks on clay tablets, was maintained. Today, we would call this the database.
Transactions: Scribes recorded state changes with new records (clay tablets) in the database. Today, we would call these state changes transactions.

From Transactions to Transaction Processing Systems - II
The real state is represented by an abstraction, called the database, and the transformation of the real state is mirrored by the execution of a program, called a transaction, that transforms the database.

Transactions Are In ... Communications:
Each time you make a phone call, there is a call setup transaction that allocates some resources to your conversation; the call teardown is a second transaction, freeing those resources. The call setup increasingly involves complex algorithms to find the callee (800 numbers could be anywhere in the world) and to decide who is to be billed (800 and 900 numbers have complex billing). The system must deal with features like call forwarding, call waiting, and voice mail. After the call teardown, billing may involve many phone companies.

Transactions Are In ... Finance:
Each time you purchase gas using a credit card, the point-of-sale terminal connects to the credit card company's computer. In case that fails, it may alternatively try to debit the amount to your account by connecting to your bank. This generalizes to all kinds of point-of-sale terminals such as cash registers, ATMs, etc. When banks balance their accounts with each other (electronic fund transfer), they use transactions for reliability and recoverability.

Transactions Are In ... Travel:
Making reservations for a trip requires many related bookings and ticket purchases from airlines, hotels, rental car companies, and so on. From the perspective of the customer, the whole trip package is one purchase. From the perspective of the multiple systems involved, many transactions are executed: at least one per airline reservation, one for each hotel reservation, one for each car rental, one for each ticket to be printed, one for setting up the bill, etc. Along the way, each inquiry that did not result in a reservation is a transaction, too.

Transactions Are In ... Manufacturing:
Order entry, job and inventory planning and scheduling, accounting, and so on are classical application areas of transaction processing. Computer integrated manufacturing (CIM) is a key technique for improving industrial productivity and efficiency. Just-in-time inventory control, automated warehouses, and robotic assembly lines each require a reliable data storage system to represent the factory state.

Transactions Are In ... Real-Time Systems:
This application area includes all kinds of physical machinery that needs to interact with the real world, either as a sensor, or as an actor. Traditionally, such systems were custom made for each individual plant, starting from the hardware. The usual reason for that was that 20 years ago off-the-shelf systems could not guarantee real-time behavior that is critical in these applications. This has changed, and so has the feasibility of building entire systems from scratch. Standard software is now used to ensure that the application will be portable.

A Transaction Processing System
A transaction processing system (TP-system) provides tools to ease or automate application programming, execution, and administration of complex, distributed applications. Transaction processing applications typically support a network of devices that submit queries and updates to the application. Based on these inputs, the application maintains a database representing some real-world state. Application responses and outputs typically drive real-world actuators and transducers that alter or control the state. The applications, database, and network tend to evolve over several decades. Increasingly, the systems are geographically distributed, heterogeneous (they involve equipment and software from many different vendors), continuously available (there is no scheduled downtime), and have stringent response time requirements.

ACID Properties: First Definition
Atomicity: A transaction’s changes to the state are atomic: either all happen or none happen. These changes include database changes, messages, and actions on transducers.
Consistency: A transaction is a correct transformation of the state. The actions taken as a group do not violate any of the integrity constraints associated with the state. This requires that the transaction be a correct program.
Isolation: Even though transactions execute concurrently, it appears to each transaction T that others executed either before T or after T, but not both.
Durability: Once a transaction completes successfully (commits), its changes to the state survive failures.

Structure of a Transaction Program
The application program declares the start of a new transaction by invoking BEGIN_WORK(). All subsequent operations will be covered by the transaction. Eventually, the application program will call COMMIT_WORK(), if a new consistent state has been reached. This makes sure the new state becomes durable. If the application program cannot complete properly (violation of consistency constraints), it will invoke ROLLBACK_WORK(), which appeals to the atomicity of the transaction, thus removing all effects the program might have had so far. If for some reason the application fails to call either commit or rollback (there could be an endless loop, a crash, a forced process termination), the transaction system will automatically invoke ROLLBACK_WORK() for that transaction.
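The bracketing described above can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in resource manager; the SQL statements BEGIN, COMMIT, and ROLLBACK play the roles of BEGIN_WORK(), COMMIT_WORK(), and ROLLBACK_WORK(). The table name `account` and the helpers `make_demo_db` and `transfer` are hypothetical names chosen for illustration, not part of the original slides.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with two hypothetical accounts."""
    # isolation_level=None puts the connection in autocommit mode,
    # so the transaction brackets below are issued explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst` as one transaction.

    Returns True if the transaction committed, False if it was rolled back.
    """
    cur = conn.cursor()
    cur.execute("BEGIN")  # corresponds to BEGIN_WORK()
    try:
        cur.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        cur.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        (balance,) = cur.execute(
            "SELECT balance FROM account WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            # Consistency constraint violated: give up on this transaction.
            raise ValueError("insufficient funds")
        cur.execute("COMMIT")  # corresponds to COMMIT_WORK()
        return True
    except Exception:
        cur.execute("ROLLBACK")  # corresponds to ROLLBACK_WORK()
        return False
```

A transfer that would overdraw the source account is rolled back, and both balances are left exactly as they were before BEGIN.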

The End User’s View of a Transaction Processing System

The Administrator's/Operator’s View of a TP System

Performance Measures of Interactive Transactions
Performance/Transaction   Small/Simple   Medium      Complex
Instr./transaction        100k           1M          100M
Disk I/O / TA
Local msgs. (B)           10 (5KB)       (50KB)      (1MB)
Remote msgs. (B)          2 (300B)       (4KB)       100 (1MB)
Cost/TA/second            10k$/tps       k$/tps      1M$/tps
Peak tps/site

Client-Server Computing: The Classical Idea

Client-Server Computing: The CORBA Idea
[Figure labels: Client on WS; Presentation Services etc.; IDL Stub; Request: Delete; Object Request Broker; IDL Skeleton; Object Implementation: Jim's Mailbox]

Client-Server Computing: The WWW Idea
[Figure labels: WWW browser; Java applet; HTTP; server; Java applet + Java Database Connection (JDBC) driver code; proprietary protocol; JDBC-ODBC bridge; ODBC driver; JDBC network driver; public protocol; JDBC driver (e.g. TCP/IP); database server]

Using Transactional Remote Procedure Calls (TRPCs)

Terms We Have Introduced So Far
Resource manager: The system comes with an array of transactional resource managers that provide ACID operations on the objects they implement. Database systems, persistent programming languages, and queue managers are typical examples.
Durable state: Application state represented as durable data stored by the resource managers.
TRPC: Transactional remote procedure calls allow the application to invoke local and remote resource managers as though they were local. They also allow the application designer to decompose the application into client and server processes on different computers.
Transaction program: Inquiries and state transformations are written as programs in conventional or specialized programming languages. The programmer brackets the successful execution of the program with a Begin-Commit pair and brackets a failed execution with a Begin-Rollback pair.

Terms We Have Introduced So Far
Atomicity: At any point before the commit, the application or the system may abort the transaction, invoking rollback. If the transaction is aborted, all of its changes to durable objects will be undone (reversed), and it will be as though the transaction never ran.
Consistency: The work within a Begin-Commit pair must be a correct transformation.
Isolation: While the transaction is executing, the resource managers ensure that all objects the transaction reads are isolated from the updates of concurrent transactions.
Durability: Once the commit has been successfully executed, all the state transformations of that transaction are made durable and public.

The World According to the Resource Manager

Where To Split Client/Server?
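[Figure: the layers Presentation, Flow Control, Application Logic (= business objects), and Data Access, with the client/server split ranging from a thin client with a fat server to a fat client with a thin server.]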

Client/Server Infrastructure
[Figure labels: Middleware; Server; Objects; Groupware; TP-Mon.; DBMS; OS; GUI; OOUI; System Mgmt.; Files; SQL; ORB; TRPC; Mail; Security; WWW; Transport; etc.]

Transactional Core Services

The X/Open TP-Model

The X/Open Distributed Transaction Processing Model

The OTS Model
[Figure labels: transaction originator; recoverable server; transaction service; TA context (transmitted with request); creation; termination; invocation; commit coordination]

Transaction Processing System Feature List
Application development features: Application generators; graphical programming interfaces; screen painters; compilers; CASE tools; test data generators; starter system with a complete set of administrative and operations functions, security, and accounting.
Repository features: Description of all components of the system, both hardware and software. Description of the dependencies among components (bill-of-material). Description of all changes to all components to keep track of different versions. The repository is a database. Its role in the system must be complete, extensible, active and allow for local autonomy.
TP-Monitor features: Process management; server classes; transactional remote procedure calls; request-based authentication and authorization; support for applications and resource managers in implementing ACID operations on durable objects.

Transaction Processing System Feature List
Data communications features: Uniform I/O interfaces; device independence; virtual terminal; screen painter support; support for RPC and TRPC; support for context-oriented communication (peer-to-peer).
Database features: Data independence; data definition; data manipulation; data control; data display; database operations.
Operations features: Archiving; reorganization; diagnosis; recovery; disaster recovery; change control; security; system extension.
Education and testing features: Embedded education; online documentation; training systems; national language features; test database generators; test drivers.

Data Communications Protocols

Presentation Management

SQL Data Definition

SQL Data Manipulation

Summary of Chapter 1
A transaction processing system is a large web of application generators, system design and operation tools, and the more mundane language, database, network, and operations software. The repository and the applications that maintain it are the mechanisms needed to manage the TP system. The repository is a transaction processing application. It represents the system configuration as a database and supplies change control by transactions that manipulate the configuration and the repository. The transaction concept, like contract law, is intended to resolve the situation when exceptions arise. The first order of business in designing a system is, therefore, to have a clear model of system failure modes. What breaks? How often do things break?

Basic Terminology Chapter 1b

A Word About Words (Chapter 2)
Humpty Dumpty: “When I use a word, it means exactly what I choose it to mean; nothing more nor less.”
Alice: “The question is, whether you can make words mean so many different things.”
Humpty Dumpty: “The question is, which is to be master, that’s all.”
Lewis Carroll

Basic Computer Terms
To avoid the confusion that might be caused by the many synonyms in our field, let us adopt the following conventions for the rest of this class:
domain = data type = ...
field = column = attribute = ...
record = tuple = object = entity = ...
block = page = frame = slot = ...
file = data set = table = ...
process = task = thread = actor = ...
function = request = method = ...
All the other terms and definitions we need will be briefly introduced and explained during the session.

Basic Hardware Architecture I
In Bell and Newell’s classic taxonomy, hardware consists of three types of modules: Processors, memory, and communications (switches or wires). Processors execute instructions from a program, read and write memory, and send data via communication lines. Computers are generally classified as supercomputers, mainframes, minicomputers, workstations, and personal computers. However, these distinctions are becoming fuzzy with current shifts in technology.

Basic Hardware Architecture II
Today’s workstation has the power of yesterday’s mainframe. Similarly, today’s WAN (wide area network) has the communications bandwidth of yesterday’s LAN (local area network). In addition, electronic memories are growing in size to include much of the data formerly stored on magnetic disk. These technology trends have deep implications for transaction processing.

Basic Hardware Architecture III
Distributed processing: Processing is moving closer to the producers and consumers of the data (workstations, intelligent sensors, robots, and so on).
Client-server: These computers interact with each other via request-reply protocols. One machine, called the client, makes requests to another, called the server. Of course, the server may in turn be a client to other machines.
Clusters: Powerful servers consist of clusters of many processors and memories, cooperating in parallel to perform common tasks.

Basic Hardware Architecture IV

Memories - The Economic Perspective I
The processor executes instructions from virtual memory, and it reads and alters bytes from the virtual memory. The mapping between virtual memory and real memory includes electronic memory, which is close to the processor, volatile, fast, and expensive, and magnetic memory, which is "far away" from the processor, non-volatile, slow, and cheap. The mapping process is handled by the operating system with some hardware assistance.
Memory performance is measured by its access time: Given an address, the memory presents the data at some later time. The delay is called the memory access time. Access time is a combination of latency (the time to deliver the first byte), and transfer time (the time to move the data). Transfer time, in turn, is determined by the transfer size and the transfer rate. This produces the following overall equation:
memory access time = latency + ( transfer size / transfer rate )
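As a minimal sketch, the access time equation can be written as a one-line function. The function and parameter names are hypothetical, and the sample values in the usage note (15 ms latency, 1 KB transferred at 10 MB/s) are illustrative assumptions only.

```python
def memory_access_time(latency_s, transfer_size_bytes, transfer_rate_bytes_per_s):
    """Return the access time in seconds: latency + (transfer size / transfer rate)."""
    return latency_s + transfer_size_bytes / transfer_rate_bytes_per_s


# Illustrative values only: 15 ms latency, 1 KB moved at 10 MB/s.
example_s = memory_access_time(0.015, 1_000, 10_000_000)
```

For small transfers the latency term dominates; for large transfers the transfer term does.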

Memories - The Economic Perspective II
Memory price-performance is measured in one of two ways:
Cost/byte: The cost of storing a byte of data in that medium.
Cost/access: The cost of reading a block of data from that medium. This is computed by dividing the device cost by the number of accesses per second that the device can perform. The actual units are cost/access/second, but the time unit is implicit in the metric’s name.
These two cost measures reflect the two different views of a memory’s purpose: it stores data, and it receives and retrieves data.

Memories - The Economic Perspective III
[Figure: Typical large system capacity]

Memories - The Economic Perspective IV
[Figure: $ / MB]

Magnetic Memory
There are two types of magnetic storage media: disk and tape. Disks rotate, passing the data in the cylinder by the electronic read-write heads every few milliseconds. This gives low access latency. The disk arm can move among cylinders in tens of milliseconds. Tapes have approximately the same storage density and transfer rate, but they must move long distances if random access is desired. Consequently, tapes have large random access latencies, on the order of seconds.
Disk Access Time = Seek_Time + Rotational_Latency + (Transfer_Size / Transfer_Rate)

Magnetic Memory
Compare the times required for two access patterns to 1 MB stored in 1000 blocks on disk:
Sequential access: Read or write sectors [x, x + 1, ..., x + 999] in ascending order. This requires one seek (10 ms) and half a rotation (5 ms) before the data in the cylinder begins transferring the megabyte at 10 MBps (the transfer takes 100 ms, ignoring one-cylinder seeks). The total access time is 115 ms.
Random access: Read the 1000 sectors [x, ..., x + 999] in random order. In this case, each read requires a seek (10 ms), half a rotation (5 ms), and then the 1 KB transfer (0.1 ms). Since there are 1000 of these events, the total access time is 15.1 seconds.
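The comparison can be reproduced with a short calculation. This is a sketch using the figures quoted on the slide (10 ms seek, 5 ms for half a rotation, 10 MBps transfer rate, with 1 MB taken as 1,000,000 bytes); the constant and function names are hypothetical.

```python
SEEK_MS = 10.0                # one seek
HALF_ROTATION_MS = 5.0        # average rotational latency
TRANSFER_BYTES_PER_MS = 10_000_000 / 1000  # 10 MBps expressed per millisecond


def disk_access_ms(transfer_bytes):
    """Disk access time = seek + rotational latency + transfer size / transfer rate."""
    return SEEK_MS + HALF_ROTATION_MS + transfer_bytes / TRANSFER_BYTES_PER_MS


# Sequential: one positioning step, then the whole megabyte in one transfer.
sequential_ms = disk_access_ms(1_000_000)

# Random: 1000 separate accesses of 1 KB each, every one paying seek and rotation.
random_ms = 1000 * disk_access_ms(1_000)
```

The sequential pattern comes to 115 ms and the random pattern to about 15.1 seconds, a ratio of more than 100, almost all of it due to repeated seeks and rotational delays.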

Memory Hierarchies

Memory Hierarchies
The hierarchy uses small, fast, expensive cache memories to cache some data present in larger, slower, cheaper memories. If hit ratios are good, the overall memory speed approximates the speed of the cache. At any level of the memory hierarchy, the hit ratio is defined as:
hit ratio = references satisfied by cache / all references to cache
Suppose a cache memory with access time C has hit rate H, and suppose that on a miss the secondary memory access time is S. Further, suppose that C = 0.01 · S. The effective access time of the cache will be as follows:
$$\text{Effective memory access time} = H \cdot C + (1 - H) \cdot S = H \cdot (0.01 \cdot S) + (1 - H) \cdot S = (1 - 0.99 \cdot H) \cdot S \approx (1 - H) \cdot S$$
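A small sketch makes the effect of the hit ratio concrete. The function name is hypothetical, and the chosen values (S = 1 time unit, C = 0.01 · S as on the slide) are only an illustration.

```python
def effective_access_time(hit_ratio, cache_time, secondary_time):
    """Effective access time = H * C + (1 - H) * S."""
    return hit_ratio * cache_time + (1.0 - hit_ratio) * secondary_time


S = 1.0          # secondary memory access time, in arbitrary units
C = 0.01 * S     # cache access time, as assumed on the slide

# Effective access times for a few hit ratios, in units of S.
times = {h: effective_access_time(h, C, S) for h in (0.5, 0.9, 0.99)}
```

Because C is so much smaller than S, the result stays close to (1 - H) · S until the hit ratio gets very high: misses dominate the cost.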

The Five Minute Rule
Assume there are no special response time (real-time) requirements; the decision to keep something in cache is, therefore, purely economic. To make things simple, suppose that data blocks are 10 KB. At 1995 prices, 10 KB of main memory cost about $1. Thus, we could keep the data in main memory forever if we were willing to spend a dollar. With 10 KB of disk costing only $.10, we could save $.90 if we kept the 10 KB on disk. In reality, the savings are not so great; if the disk data is accessed, it must be moved to main memory, and that costs something. How much, then, does a disk access cost? A disk, along with all its supporting hardware, costs about $3,000 (in 1995) and delivers about 30 accesses per second; the cost, therefore, is about $100 per access per second. At this rate, if the data is accessed once a second, it costs $100.10 to store it on disk (disk storage and disk access costs). That is considerably more than the $1 to store it in main memory. The break-even point is about one access per 100 seconds. At that rate, the main memory cost is about the same as the disk storage cost plus the disk access costs. At a more frequent access rate, disk storage is more expensive. At a less frequent rate, disk storage is cheaper. Anticipating the cheaper main memory that will result from technology changes, this observation is called the five-minute rule rather than the two-minute rule.

The Five Minute Rule
Keep a data item in electronic memory if it is accessed at least once every five minutes; otherwise keep it in magnetic memory. Similar arguments apply to objects stored on tape and cached on disk. Given the object size, the cost of cache, the cost of secondary memory, and the cost of accessing the object in secondary memory once per second, the frequency at the break-even point in units of accesses per second (a/s) is given by the following formula:
$$\text{Frequency} \approx \frac{(\text{Cache\_Cost/Byte} - \text{Secondary\_Cost/Byte}) \cdot \text{Object\_Bytes}}{\text{Object\_Access\_Per\_Second\_Cost}} \ \text{a/s}$$
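The break-even formula can be checked against the 1995 figures quoted in the derivation of the rule ($1 per 10 KB of main memory, $0.10 per 10 KB of disk, a $3,000 disk delivering about 30 accesses per second). This is a sketch only; the function and variable names are hypothetical.

```python
def break_even_accesses_per_second(
    cache_cost_per_byte, secondary_cost_per_byte, object_bytes, access_per_second_cost
):
    """Access frequency at which caching and secondary storage cost the same."""
    saving = (cache_cost_per_byte - secondary_cost_per_byte) * object_bytes
    return saving / access_per_second_cost


OBJECT_BYTES = 10_000                      # the 10 KB block, taken as 10,000 bytes
memory_cost_per_byte = 1.00 / OBJECT_BYTES
disk_cost_per_byte = 0.10 / OBJECT_BYTES
disk_access_cost = 3000 / 30               # dollars per access per second

frequency = break_even_accesses_per_second(
    memory_cost_per_byte, disk_cost_per_byte, OBJECT_BYTES, disk_access_cost
)
interval_s = 1.0 / frequency               # seconds between accesses at break-even
```

With these inputs the break-even interval comes out a little above 100 seconds, which matches the "about one access per 100 seconds" stated in the text.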

The Rules of Exponential Growth
Electronic memory (Moore’s Law):
$$\text{MemoryChipCapacity}(year) = 4^{(year-1970)/3} \ \text{Kb/chip}$$ for year in [ ]
Magnetic memory (Hoagland’s Law):
$$\text{MagneticAreaDensity}(year) = 10^{(year-1970)/10} \ \text{Mb/inch}^2$$ for year in [ ]
Processors (Joy’s Law):
$$\text{SunMips}(year) = 2^{(year-1984)} \ \text{MIPS}$$ for year in [ ]

Communication Hardware
The early 90s: The four kinds of networks are defined by their diameters. These diameters imply certain latencies (based on the speed of light). In 1990, Ethernet (at 10 Mbps) was the dominant LAN. Metropolitan networks typically are based on 1 Mbps public lines. Such lines are too expensive for transcontinental links at present; most long-distance lines are therefore 50 Kbps or less. As you will gather from the news, these things are changing fast.

Communication Hardware
Scenario 2000: Point-to-point bandwidth likely to be common among computers by the year 2000.

Processor Architectures

Processor Architectures
Shared nothing: In a shared-nothing design, each memory is dedicated to a single processor. All accesses to that data must pass through that processor. Processors communicate by sending messages to each other via the communications network.
Shared global: In a shared-global design, each processor has some private memory not accessible to other processors. There is, however, a pool of global memory, shared by the collection of processors. This global memory is usually addressed in blocks (units of a few kilobytes or more) and is RAM disk or disk.
Shared memory: In a shared-memory design, each processor has transparent access to all memory. If multiple processors access the data concurrently, the underlying hardware regulates the access to the shared data and provides each processor a current view of the data.

Address Spaces

Address Spaces
Memory segmentation and sharing: A process executes in an address space, which is a paged, segmented array of bytes. Some segments may be shared with other address spaces. The sharing may be execute-only, read-only, or read-write. Most of the segment slots are empty (lightly shaded boxes), and most of the occupied segments are only partially full of programs or data.
To simplify memory addressing, the virtual address space is divided into fixed-size segment slots, and each segment partially fills a slot. Typical slot sizes range from 2**24 to 2**32 bytes. This gives a two-dimensional address space, where addresses are {segment_number, byte}. Again, segments are often partitioned into virtual memory pages, which are the unit of transfer between main and secondary memory. If an object is bigger than a segment, it can be mapped into consecutive segments of the address space.
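The two-dimensional {segment_number, byte} addressing can be illustrated with a few lines of bit arithmetic. The sketch assumes a slot size of 2**24 bytes (the lower end of the range given above); the names `SLOT_BITS`, `split_address`, and `join_address` are hypothetical.

```python
SLOT_BITS = 24                    # assumed slot size: 2**24 bytes per segment slot
SLOT_SIZE = 1 << SLOT_BITS


def split_address(virtual_address):
    """Split a flat virtual address into (segment_number, byte offset)."""
    return virtual_address >> SLOT_BITS, virtual_address & (SLOT_SIZE - 1)


def join_address(segment_number, byte):
    """Combine a segment number and a byte offset into a flat virtual address."""
    if not 0 <= byte < SLOT_SIZE:
        raise ValueError("byte offset does not fit in one segment slot")
    return (segment_number << SLOT_BITS) | byte
```

Because slots have a fixed power-of-two size, the segment number is simply the high-order bits of the address, and an object larger than one slot continues at offset 0 of the next segment number.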

Processes
A process is a virtual processor. It has an address space that contains the program the process is executing and the memory the process reads and writes. One can imagine a process executing Java programs statement by statement, with each statement reading and writing bytes in the address space or sending messages to other processes. Processes provide an ability to execute programs in parallel; they provide a protection entity; and they provide a way of structuring computations into independent execution streams. They thus provide a form of fault containment in case a program fails.
Processes are building blocks for transactions, but the two concepts are orthogonal. A process can execute many different transactions over time, and parts of a single transaction may be executed by many processes.
Each process executes on behalf of some user, or authority, and with some priority. The authority determines what the process can do: which other processes, devices, and files the process can address and communicate with. The process priority determines how quickly the process’s demand for resources will be serviced if other processes make competing demands. Short tasks typically run with high priority, while large tasks are given lower priority.

Protection Domains
There are two ways to provide protection:
Process = protection domain: Each subsystem executes as a separate process with its own private address space. Applications execute subsystem requests by switching processes, that is, by sending a message to a process.
Address space = protection domain: A process has many address spaces: one for each protected subsystem and one for the application. Applications execute subsystem requests by switching address spaces. The address space protection domain of a subsystem is just an address space that contains some of the caller’s segments; in addition, it contains program and data segments belonging to the called subsystem. A process connects to the domain by asking the subsystem or OS kernel to add the segment to the address space. Once connected, the domain is callable from other domains in the process by using a special instruction or kernel call.

Protection Domains
A process may have many protection domains.

Threads
There is a need for multiple processes per address space. For example, to scan through a data stream, one process is appointed the producer, which reads the data from an external source, while the second process processes the data. Further examples of cooperating processes are file read-ahead, asynchronous buffer flushing, and other housekeeping chores in the system. Processes can share the same address space simply by having all their address spaces point to the same segments. Most operating systems do not make a clean distinction between address spaces and processes. Thus a new concept, called a thread or a task, is introduced.
But note: Several operating systems do not use the term process at all. For example, in the Mach operating system, thread means process, and task means address space; in MVS, task means process, and so on.

Threads
The term thread often implies a second property: inexpensive to create and dispatch. Threads are commonly provided by some software that found the operating system processes to be too expensive to create or dispatch. The thread software multiplexes one big operating system process among many threads, which can be created and dispatched hundreds of times faster than a process. The term thread is used in the following to connote these light-weight processes. Unless this light-weight property is intended, “process” is used. Several threads usually share a common address space. Typically, all the threads have the same authorization identifier, since they are part of the same address space domain, but they may have different scheduling priorities.
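The producer/consumer arrangement used to motivate threads can be sketched with two threads sharing one address space. The example assumes Python's standard threading and queue modules; the function name `run_pipeline`, the buffer size, and the "double each item" processing step are hypothetical choices made for illustration.

```python
import queue
import threading


def run_pipeline(source):
    """Read items with a producer thread and process them with a consumer thread."""
    buffer = queue.Queue(maxsize=4)   # bounded buffer shared by both threads
    results = []                      # lives in the common address space
    done = object()                   # sentinel marking the end of the stream

    def producer():
        for item in source:           # stands in for reading an external source
            buffer.put(item)
        buffer.put(done)

    def consumer():
        while True:
            item = buffer.get()
            if item is done:
                break
            results.append(item * 2)  # stands in for the real processing step

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Both threads see the same `results` list and the same queue without any copying or message passing, which is exactly what sharing an address space buys.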

Messages and Sessions
There are two styles of communication among processes:
Datagrams: The sender of a message determines the recipient's address (e.g. the process name) and constructs an envelope consisting of the sender's name and address, the recipient's name and address, and the message text. This envelope is delivered to the capable hands of the communication system. It is analogous to sending letters by mail.
Sessions: Before any messages are sent, a fixed connection is established between sender and receiver, a so-called session. Once it has been established, both parties can send and receive messages via this session. This symmetry is often referred to as "peer-to-peer". Establishing a session requires a datagram. A session must at some point be closed down explicitly. It is analogous to a phone conversation.

Advantages of Sessions
Shared state: A session represents shared state between the client and the server. A datagram might go to any process with the designated name, but a session goes to a particular instance of that name.
Authorization: Processes do not always trust each other. The server often checks the client’s credentials to see that the client is authorized to perform the requested function. The authentication protocols require multi-message exchanges. Once the session key is established, it is shared state.
Error correction: Messages flowing in each session direction are numbered sequentially. These sequence numbers can detect lost messages and duplicate messages.
Performance: The operations described are fairly costly. Each of the steps often involves several messages. By establishing a session, this information is cached.
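How sequence numbers expose lost and duplicate messages can be shown with a tiny receiver for one direction of a session. This is a sketch under simple assumptions (numbering starts at 0, out-of-order arrivals are not buffered); the class name and the result labels are hypothetical.

```python
class SessionReceiver:
    """Tracks the next expected sequence number for one session direction."""

    def __init__(self):
        self.next_expected = 0

    def receive(self, sequence_number):
        """Classify an incoming message by its sequence number."""
        if sequence_number == self.next_expected:
            self.next_expected += 1
            return "accepted"
        if sequence_number < self.next_expected:
            return "duplicate"   # already seen; safe to discard
        return "gap"             # at least one earlier message is missing
```

The counter is part of the shared state that a session keeps and a datagram exchange lacks, which is why these checks come for free once a session exists.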

Clients and Servers
The question of how computations consisting of many interacting processes should be structured has no simple answer. Currently, two styles are particularly popular: peer-to-peer and client-server. The debate about which style is "better" often creates the impression that they are radically different. But in reality, peer-to-peer is more general and more complex, and it subsumes client-server. Here is a brief characterization:
Peer-to-peer: The two processes are independent peers, each executing its computation and occasionally exchanging data with the other.
Client-server: The two processes interact via request-reply exchanges in which one process, the client, makes a request to a second process, the server, which performs this request and replies to the client.

Clients and Servers
The limitation of the client-server model lies in the fact that it implies a synchronous pattern of one request/one response. There are, however, cases in which one request generates thousands of replies, or where thousands of requests generate one reply. Operations that have this property include transferring a file between the client and server or bulk reading and writing of databases. In other situations, a client request generates a request to a second server, which, in turn, replies to the client. Parallelism is a third area where simple RPC is inappropriate. Because the client-server model postulates synchronous remote procedure calls, the computation uses one processor at a time. However, there is growing interest in schemes that allow many processes to work on problems in parallel. The RPC model in its simplest form does not allow any parallelism.

Remote Procedure Calls (RPCs)

Naming Naming has to do with the problem of how a client denotes a server it wants to invoke. Typical naming schemes distinguish between an object's name, its address, and its location. The name is an abstract identifier for the object, the address is the path to the object, and the location is where the object is. An object can have several names. Some of these names may be synonyms, called aliases. Let us say that Bruce and Lindsay are two aliases for Bruce Lindsay. For this to be explicit, all names, addresses, and locations must be interpreted in some context, called a directory. For example, in our RPC context, Bruce means Bruce Nelson, and in our publishing context, Bruce means Bruce Spatz. Within the 408 telephone area, Bruce Lindsay’s address is [ ... ], and outside the United States it is [ ... ].
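The role of a directory as the context in which a name is interpreted can be sketched with nested dictionaries. The context labels `"rpc"` and `"publishing"` follow the slide's example; the `"database"` context and the function `resolve` are hypothetical names introduced only for this illustration.

```python
# Each directory (context) maps a name or alias to the object it denotes.
DIRECTORIES = {
    "rpc": {"Bruce": "Bruce Nelson"},
    "publishing": {"Bruce": "Bruce Spatz"},
    # Hypothetical context in which two aliases denote the same object.
    "database": {"Bruce": "Bruce Lindsay", "Lindsay": "Bruce Lindsay"},
}


def resolve(context, name):
    """Interpret a name relative to a directory; raise KeyError if unknown."""
    return DIRECTORIES[context][name]
```

The same name, `Bruce`, resolves to different objects depending on the directory, which is why a name without a context is ambiguous.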

Name Servers Names are grouped into a hierarchy called the name space. An international commission has defined a universal name space standard, X.500, for computer systems. The commission administers the root of that name space. Each interior node of the hierarchy is a directory. A sequence of names delimited by a period (.) gives a path name from the directory to the object. No one stores the entire name space: it is too big, and it is changing too rapidly. Certain processes, called name servers, store parts of the name space local to their neighborhood; in addition, they store a directory of more global name servers.
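The division of labor among name servers can be sketched as a lookup that first consults the local part of the name space and then refers the request to a more global server. This is a simplified illustration: the class `NameServer`, the single `parent` referral, and the sample path names and addresses are assumptions, and the sketch does not model X.500 itself.

```python
class NameServer:
    """Stores part of the name space plus a referral to a more global server."""

    def __init__(self, local_entries, parent=None):
        self.local = dict(local_entries)  # path name -> address (local neighborhood)
        self.parent = parent              # more global name server, or None at the top

    def lookup(self, path):
        """Return (address, number of referrals followed) for a path name."""
        server, referrals = self, 0
        while server is not None:
            if path in server.local:
                return server.local[path], referrals
            server = server.parent
            referrals += 1
        raise KeyError(path)


# Hypothetical two-level configuration.
global_server = NameServer({"org.printer": "addr-global"})
local_server = NameServer({"dept.db": "addr-local"}, parent=global_server)
```

Names in the local neighborhood resolve without any referral, while other names cost one extra hop per level, which is the reason for keeping frequently used names local.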

Authentication Techniques
Passwords are the simplest technique. The client has a secret password, a string of bytes known only to it and the server. The client sends its password to the server to prove the client’s identity. A second password is then needed to authenticate the server to the client. Thus, two passwords are required, and they must be sent across the wire. Challenge-response uses only one password or key. In this scheme, the client and the server share a secret encryption key. The server picks a random number, N, and encrypts it with the key as EN. The server sends EN to the client and challenges the client to decrypt it using the secret key. If the client responds with N, the server believes the client knows the secret encryption key. The client can also authenticate the server by challenging it to decrypt a second random number. The shared secret is stored at both ends, but only random numbers are sent across the wire.
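A challenge-response exchange can be sketched with the Python standard library. Note the substitution: the slide describes the server encrypting N and the client decrypting it, whereas this sketch uses a keyed hash (HMAC-SHA256) over a random challenge, a related variant chosen because the standard library has no symmetric cipher. The function names and the 16-byte challenge size are assumptions.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Server side: pick a fresh random challenge."""
    return secrets.token_bytes(16)


def respond(shared_key, challenge):
    """Client side: prove knowledge of the key without sending the key."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()


def verify(shared_key, challenge, response):
    """Server side: recompute the expected response and compare in constant time."""
    expected = hmac.new(shared_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

As in the slide's scheme, the shared secret stays at both ends and only the random challenge and a value derived from it cross the wire. The client can authenticate the server by running the same exchange in the other direction.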

Authentication Techniques
Public key system: Each authid has a pair of keys—a public encryption key, EK, and a private decryption key, DK. The keys are chosen so that DK(EK(X)) = X, but knowing only EK and EK(X) it is hard to compute X. Thus, a process’s ability to compute X from EK(X) is proof that the process knows the secret DK. Each authid publishes its public key to the world. Anyone wanting to authenticate the process as that authid goes through the challenge protocol: The challenger picks a random number X, encrypts it with the authid’s public key EK, and challenges the process to compute X from EK(X). Secrets are stored in one place only, and they do not go across the wire.
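The public-key challenge protocol can be illustrated with toy numbers. The slide does not name a cryptosystem; this sketch assumes textbook RSA with deliberately tiny primes, which is insecure and serves only to show that DK(EK(X)) = X. The constants and the function names `ek`, `dk`, and `authenticate` are illustrative.

```python
import secrets

# Toy RSA parameters (assumption: far too small for real use).
P, Q = 61, 53
N = P * Q                          # public modulus
E = 17                             # public exponent, part of EK
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, part of DK (Python 3.8+)


def ek(x):
    """Public encryption key: anyone can compute EK(X)."""
    return pow(x, E, N)


def dk(y):
    """Private decryption key: only the owner of the authid can compute this."""
    return pow(y, D, N)


def authenticate(prover_decrypt):
    """Challenger: pick random X, send EK(X), accept if the prover returns X."""
    x = secrets.randbelow(N - 2) + 2  # random X in [2, N - 1]
    return prover_decrypt(ek(x)) == x
```

Calling `authenticate(dk)` succeeds because the prover holds the private key. A prover without DK has to guess X, which with realistic key sizes is infeasible.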

Scheduling The purpose of scheduling is to make sure all requests get processed, i.e., are assigned to a specific server process. There are two additional constraints: Short response times: The requests should not wait longer than necessary before they get serviced. Economic usage of resources: The required throughput should be achieved with the minimum number of resources (processors, nodes, links, etc.). Throughput and response time at resource utilization r are related by the following formula: Average_Response_Time(r) = (1 / (1 - r)) × Service_Time
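The formula can be evaluated directly to see how quickly response time grows as utilization approaches 1. The function name and the rejection of utilizations outside [0, 1) are choices made for this sketch.

```python
def average_response_time(utilization, service_time):
    """Average_Response_Time(r) = (1 / (1 - r)) * Service_Time."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in the range [0, 1)")
    return service_time / (1 - utilization)
```

At 50% utilization the average response time is twice the service time; at 90% it is about ten times the service time. This is the tension between the two constraints: driving utilization up to save resources lengthens response times.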

The Scheduling Problem

File Organizations

SQL in a Distributed Environment

Software Performance

Protocol Standards
[Figure: porting and installation steps versus operation and inter-operation. Labels: portable program, API, compiler, linker/loader, "local" compiled program; client process, FAP (message formats, protocol machine), server; client machine, server machine; operating systems Unix and VMS.]

Relevant FAP-Standards
CSMA/CD, Token Ring, etc.: Low-level protocols that specify how bits are physically transmitted across a shared medium.
IP/TCP, NetBIOS, HTTP: Transport level protocols.
LU6.2: SNA's peer-to-peer protocol that allows both session-oriented and client-server-style communication under transaction protection.
OSI-TP: ISO's rendering of a protocol that provides functionality very similar to LU6.2.
ASN.1: Protocol for exchanging data formatting and structuring information. Required for RPCs in a heterogeneous environment.
DRDA: Interoperability standard for IBM SQL-systems.
ODBC, JDBC: Interoperability standards for general SQL-systems.

Relevant API-Standards
SQL: Portability standard for accessing relational databases (lots of proprietary extensions).
APPC, CPI-C: Two of IBM's APIs for the LU6.2 protocol.
X/Open-XA, X/Open-XA+, etc.: APIs by the X/Open consortium on ISO's OSI-TP protocols.
IDL: OMG's interface definition language to let objects be integrated through an object request broker.
STDL: Language for programming TP-applications; based on the ACMS TP-monitor.
Java: The web's favorite programming language; comes with its own FAP-component.

OSI Standards and X/Open APIs

A Last Glance at TP-Standards
Each resource manager (RM) registers with its local transaction manager (TM). Applications start and commit transactions by calling their local TM. At commit, the TM invokes every participating RM. If the transaction is distributed, the communications manager informs the local and remote TM about the incoming or outgoing transaction, so that the two TMs can use the OSI-TP protocol to commit the transaction.
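The local part of this interaction (registration, begin, and commit) can be sketched as two small classes. This is an illustration under assumptions, not the X/Open interface: the method names, the explicit `join` step, and the vote-then-decide commit are all hypothetical, and the distributed case with a communications manager and OSI-TP is left out.

```python
class ResourceManager:
    """Hypothetical RM that votes on commit and records the outcome."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.outcome = None

    def prepare(self):
        return self.can_commit  # vote: True means "ready to commit"

    def finish(self, outcome):
        self.outcome = outcome  # "committed" or "aborted"


class TransactionManager:
    """Hypothetical local TM: RMs register once, then join transactions."""

    def __init__(self):
        self.registered = []
        self.participants = {}  # transaction id -> participating RMs
        self.next_id = 1

    def register(self, rm):
        self.registered.append(rm)

    def begin(self):
        txid, self.next_id = self.next_id, self.next_id + 1
        self.participants[txid] = []
        return txid

    def join(self, txid, rm):
        if rm not in self.registered:
            raise ValueError("RM must register with the TM first")
        if rm not in self.participants[txid]:
            self.participants[txid].append(rm)

    def commit(self, txid):
        # At commit, the TM invokes every participating RM.
        rms = self.participants.pop(txid)
        votes = [rm.prepare() for rm in rms]
        outcome = "committed" if all(votes) else "aborted"
        for rm in rms:
            rm.finish(outcome)
        return outcome
```

An application would call `begin`, do its work through the RMs (each of which joins the transaction), and then call `commit`; a single negative vote aborts the transaction at every participant.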

Summary Transaction processing systems comprise all parts of a system, software and hardware. Building such a system requires considering end-to-end arguments at all levels of abstraction. The performance of distributed TP systems is influenced by the hardware architecture (what is shared), by software issues (which protocols are used), and by configuration aspects (what limits scalability). The multitude of those influences gives rise to a constant dilemma: Should one restrict the variety to a few (proprietary) components for better tuning and performance, or should one embrace all the standards for openness, at the risk of poor scalability and performance?
