1DT057 Distributed Information System

Middleware: RPC and CORBA

The questions that we try to answer in this lecture are: What is a distributed system, why should we bother to construct systems in a distributed fashion, and what are the key properties of a distributed system?

Outline
1. Conceptual Framework
2. RPCs
3. CORBA Object Model
4. CORBA Remote Invocation
5. Comparison
6. Summary

In order to compare RPCs and CORBA in a systematic way, we first need a conceptual framework for the comparison. This framework will highlight the different features that builders of distributed systems expect from such an infrastructure. We are then able to assess RPCs against these requirements. This assessment will explain how RPCs work and how they can actively be used by students. In the later part of this topic, we shall review those features of CORBA that have not been discussed so far. The comparison will then reveal the strengths and weaknesses of RPCs and CORBA. The main result is that CORBA provides more sophisticated support for an application builder who wishes to construct a distributed system. The builder, however, has to be prepared to pay a price: in general, RPCs will outperform CORBA implementations, and RPCs are also much cheaper (at least on UNIX workstations) than CORBA implementations.

1 Conceptual Framework
Architecture.
Accessing components from programming languages.
Interfaces to lower layers.
Component identification (to achieve location transparency).
Service invocation styles.
Handling of failures.

In the conceptual framework that we use for the comparison of the two infrastructures, we first take an architectural perspective. This will identify the different modules that are involved in a distributed system based on the respective infrastructure. This architectural perspective then enables us to consider how the different modules of the architecture are derived and how they interface with modules that are coded by the application developer. We then review how the infrastructures interface to implementations of lower layers of the ISO/OSI reference model. Next, we review how the different infrastructures support the identification of distributed components. A major criterion here is how the infrastructures achieve location transparency. After that, we discuss the primitives that are available in the infrastructure for the invocation of services that are offered by remote components. Finally, we consider what can go wrong during the invocation of a service and review how the infrastructures cope with these failures.

2 Remote Procedure Calls
Overview of RPC architecture.
Generation of client/server stubs.
RPC interface.
Binding.
Handling of remote procedures.
Failures of RPCs.

The instantiation of this conceptual framework for Remote Procedure Calls is sketched by the outline of this second part of the topic. We are going to give an overview of the RPC architecture that displays the various modules that are involved in handling a remote procedure invocation. We will see that two important modules are the client stub and the server stub. These stubs are generated from a higher-level definition that is written in the Remote Procedure Call Language (RPCL). At a lower level, the stubs use an interface to TCP or UDP, which is called the RPC interface. We are then going to review how an RPC client identifies a server in a network. A technique called binding is used for that. We are then going to review how remote procedure calls are handled. This will reveal that the synchronization is identical to that of local procedure calls. In addition, we discuss strategies for handling multiple concurrent invocations and availability of servers. A major difference between local and remote procedure calls is the number of failures that may occur. We shall see how clients are informed about failures if they are detected during an invocation of an RPC.

2.1 RPC Architecture
[Figure: a client and a server connected by the network. On the client side a local call enters the client stub; on the server side the server stub calls the remote procedure. Both stubs sit on top of the RPC interface, which exchanges messages via send and receive.]

There are usually three processes involved in an RPC. The client process invokes the procedure from the server process. In order to contact a server process, the client first talks to a network process (the portmap daemon on the server side), which looks for the server process and establishes the connection. The inet daemon (inetd, the Internet super-server) may also start the server process if necessary.

The RPC call at the client side is implemented by invoking a client stub, which is a proxy for the remote procedure in the client process, through a local procedure call. The purpose of the client stub is to implement the transformation of parameters into a common external data format, the marshalling and unmarshalling of parameters, and the synchronization of RPCs. The stub invokes operations from the RPC interface, a lower-level interface to send and receive messages on the basis of TCP or UDP.

On the server side, the RPC interface receives a message from inetd and identifies a server stub. The server stub unmarshals the parameters and invokes the RPC implementation, which has to be provided by the application developer.

The client and server stubs are generated by a compiler (rpcgen) from the RPC interface definition provided in the RPC language. There is only one RPC interface. It is provided in a library which has to be linked together with the client stubs and the code containing the local call on the client side, and with the server stub and the RPC implementation on the server side.
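The parameter transformation performed by a stub can be illustrated with a small C sketch. It is not the code that rpcgen generates; it only shows the idea of writing values into a machine-independent byte layout, in the spirit of an external data representation: integers as four big-endian bytes, strings as a length word followed by the characters, padded to a multiple of four bytes. The function names put_u32, get_u32 and put_string are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append a 32-bit unsigned integer in big-endian byte order.
   Returns the new write offset. (Hypothetical helper.) */
size_t put_u32(unsigned char *buf, size_t off, uint32_t v)
{
    buf[off]     = (unsigned char)(v >> 24);
    buf[off + 1] = (unsigned char)(v >> 16);
    buf[off + 2] = (unsigned char)(v >> 8);
    buf[off + 3] = (unsigned char)v;
    return off + 4;
}

/* Read back a 32-bit unsigned integer written by put_u32. */
uint32_t get_u32(const unsigned char *buf, size_t off)
{
    return ((uint32_t)buf[off] << 24) | ((uint32_t)buf[off + 1] << 16) |
           ((uint32_t)buf[off + 2] << 8) | (uint32_t)buf[off + 3];
}

/* Append a string as a length word followed by its bytes,
   zero-padded to a multiple of four bytes.
   The caller must provide a buffer that is large enough. */
size_t put_string(unsigned char *buf, size_t off, const char *s)
{
    size_t len, padded;

    assert(s != NULL);
    len = strlen(s);
    padded = (len + 3) & ~(size_t)3;

    off = put_u32(buf, off, (uint32_t)len);
    memset(buf + off, 0, padded);   /* padding bytes are zero */
    memcpy(buf + off, s, len);
    return off + padded;
}
```

A client stub would call helpers of this kind for every parameter before handing the buffer to the RPC interface; the server stub would apply the inverse operations before calling the procedure implementation.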

2.2 The RPC Language
Definition of types (similar to C).
Component is described as a PROGRAM.
PROGRAM has an identification and a version number.
PROGRAM exports procedures.
Procedures have a result type and a parameter list.
Procedures can be called from remote components.
Calls can be defined statically or dynamically.

Client and server stubs are generated from an interface definition written in the Remote Procedure Call Language (RPCL). RPCL not only includes primitives for defining remote procedure calls, but also for types that may then be used as parameter and result types. The definition of types is needed for the generation of data structure conversions into an external data representation as well as for parameter marshalling.

Components that can execute remote procedure calls are called programs. A program is identified by a name. There can be several programs in an RPC server process. However, the program names have to be unique. RPCs have a mechanism for versioning. This facilitates the controlled evolution of RPC servers over time. Clients therefore have to identify not only a program name, but also the version number that they want to access. This version number is defined as part of the program declaration.

The most important part of a program is the declaration of remote procedure signatures. The signature definition includes a procedure name that must be unique within the program, a list of parameter types and a result type of the procedure. The procedure can be called statically by using client stubs, or dynamically, i.e. the call is defined at run-time and issued using the RPC interface. For a dynamic definition, clients use unique procedure identification numbers.
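How a server might select a procedure from a program identification, a version number and a procedure number can be sketched in C as a table lookup. This is an illustration of the idea only, not the dispatch code of a real server stub; the program number, the two example procedures and the status names are assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*handler_fn)(int arg);

/* One exported procedure: identified by program, version and
   procedure number. */
struct proc_entry {
    unsigned long prog;
    unsigned long vers;
    unsigned long proc;
    handler_fn handler;
};

/* Hypothetical status codes for this sketch. */
enum dispatch_status {
    DISPATCH_OK,
    STATUS_NO_PROGRAM,
    STATUS_WRONG_VERSION,
    STATUS_NO_PROCEDURE
};

static int double_it(int x) { return 2 * x; }
static int negate(int x)    { return -x; }

#define EXAMPLE_PROG 0x20000001UL  /* hypothetical program number */

static const struct proc_entry table[] = {
    { EXAMPLE_PROG, 1, 1, double_it },
    { EXAMPLE_PROG, 1, 2, negate },
};

/* Look up (prog, vers, proc) and call the matching handler.
   The status tells the caller which part of the identification failed. */
enum dispatch_status dispatch(unsigned long prog, unsigned long vers,
                              unsigned long proc, int arg, int *result)
{
    int prog_seen = 0, vers_seen = 0;
    size_t i;

    assert(result != NULL);
    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (table[i].prog != prog)
            continue;
        prog_seen = 1;
        if (table[i].vers != vers)
            continue;
        vers_seen = 1;
        if (table[i].proc == proc) {
            *result = table[i].handler(arg);
            return DISPATCH_OK;
        }
    }
    if (!prog_seen)
        return STATUS_NO_PROGRAM;
    if (!vers_seen)
        return STATUS_WRONG_VERSION;
    return STATUS_NO_PROCEDURE;
}
```

Distinguishing an unknown program from a wrong version is what makes versioning useful to clients: a client that asks for a version the server no longer offers can tell that the program itself is still there.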

2.2 The RPC Language (Example)
    /* person.x */
    const NL=64;
    enum sex_type {
      FEMALE = 1,
      MALE = 2
    };
    struct Person {
      string first_name<NL>;
      string last_name<NL>;
      sex_type sex;
      string city<NL>;
    };
    program PERSONPROG {
      version PERSONVERS {
        void PRINT(Person)=0;
        int STORE(Person)=1;
        Person LOAD(int)=2;
      } = 0;
    } = ;

This example includes the definition of a set of remote procedures that deal with person information. The program declares three operations that enable clients to print, store and load person data on a server. By convention, remote procedure call declarations are stored in files with the suffix .x.

The constant NL is used to determine the length of names stored as character strings. The enumeration type sex_type represents the sex of persons. The struct Person defines a record with person data. The program PERSONPROG contains three remote procedures. Identifiers, such as program names, version names and procedure names, are by convention declared in upper case letters.

PRINT takes a person record and prints it into a file on the server. STORE takes a person record as an argument and stores it in a person database on the server. It returns a unique identifier for the person record. The identifier can be used as an argument to LOAD, which returns the person record that has been stored for that identifier.

2.2 Generation of Stubs
[Figure: rpcgen reads person.x and generates person.h, person_clnt.c, person_svc.c and person_xdr.c. client.c and server.c include person.h. The C compiler and linker read these files together with librpc.a and produce the client and the server.]

This picture shows the dependencies between the different modules that are used to build clients and servers that communicate via remote procedure calls. The RPCL file person.x is compiled by an RPCL compiler that is usually called rpcgen in UNIX. rpcgen creates four files, which do not have to be modified by the programmer:
person.h: C type definitions for the types and function prototypes for the procedures declared in the RPCL file,
person_clnt.c: the client stub implementation,
person_svc.c: the server stub implementation, and
person_xdr.c: the conversion operations into the external data representation.

The client program (client.c) includes the prototype definitions from person.h. In this way, the C compiler checks that the actual parameter types supplied to RPC calls conform to their declaration. The server program includes the implementation of the various remote procedures. It also includes person.h to enable the compiler to check type conformance of procedure signatures in the implementation with those in the RPCL declaration. Finally, the C compiler and the linker are used to compile and bind the executable client and server programs. The linker therefore also reads the RPC interface implementations that are provided in an operating system library. Typically this library is called librpc.a in UNIX.

2.2 Implementation of Server
    /* server.c */
    #include <stdio.h>
    #include "person.h"

    void * print_0(Person *argp, struct svc_req * rqstp)
    {
      static char * result;
      printf("%s %s\n%s\n\n",
             argp->first_name,
             argp->last_name,
             argp->city);
      return((void *) &result);
    }

The file server.c contains C functions for the three remote procedures defined in person.x. This slide shows the first of them. The function names are composed from the remote procedure name in lower case letters and the version number. Parameter and result types are translated from the types defined in the RPCL definition as follows: atomic types are translated into the respective types in C, strings are translated into char *, and record types are passed as pointers to the respective records in C.

An additional parameter is passed to provide information about the requester. The procedure implementation can use it, for instance, to validate whether the requester is actually allowed to execute a procedure.

The return type is always a pointer type (even if the result is a simple type such as char or int). A null pointer is returned to indicate a failure of a remote procedure call. Hence a pointer to a value has to be returned to indicate that a procedure has been completed successfully.

2.2 Use of Stubs

    /* client.c */
    print_person(char * host, Person * pers)
    {
      ...
      if (print_0(pers, clnt) == NULL) {
        /* call failed */
      }
    }

The invocation of a remote procedure is done via a client stub. The name of the stub is derived from the RPC name as defined in the RPCL declaration, concatenated with the version number. Note that the invocation of a remote procedure looks very similar to a local procedure call. However, there are considerable differences. Firstly, a local call will be performed about a thousand times faster than a remote procedure call. Secondly, remote procedure calls are more likely to fail, and the client has to take this into account. The success or failure of the call is checked by comparing the result of the client stub with the null pointer. The client stub expects an additional parameter that identifies the client. This parameter is initialized through the RPC interface, as we will see now.

2.3 RPC Interface
Used by client or server directly:
Locating servers.
Choosing a transport protocol.
Authentication and security.
Invoking RPCs dynamically.
Used by stubs for:
Generating unique message IDs.
Sending messages.
Maintaining message history.

The RPC interface is used by client programs, procedure implementations in servers, as well as client and server stubs. The client uses the RPC interface during startup to locate a server and to choose a transport protocol. The server may use the RPC interface for security and authentication purposes. A client can also use the RPC interface to dynamically invoke a remote procedure. The client and server stubs use the RPC interface to send and receive messages through the transport protocol. In order to do so, the RPC interface provides facilities for generating unique message identifiers. These are required to maintain a message history, i.e. to keep track of which messages have been sent successfully and for which messages an answer is still to be expected.

2.3 RPC Interface

    void print_person(char * host, Person * pers)
    {
      CLIENT *clnt;

      clnt = clnt_create(host, PERSONPROG, PERSONVERS, "udp");
      if (clnt == (CLIENT *) NULL) {
        exit(1);
      }
      if (print_0(pers, clnt) == NULL)
        clnt_perror(clnt, "call failed");
      clnt_destroy(clnt);
    }

This is the completion of the example above that showed the use of the client stub. The calls to clnt_create, clnt_perror and clnt_destroy are invocations of the RPC interface. They are used in this example to establish a connection between client and server and to print error messages.

The call to clnt_create creates a connection between the client and the server. The call returns a client handle that is then used during successive RPCs on that server to identify the client. The parameters identify the name of the host, the name of the RPC program, the version and the transport protocol that is to be used for the connection between client and server. Note that the host name is provided as a string denoting a logical address. This means that some sort of naming (usually provided by the network information system) is used to achieve location transparency.

The call to clnt_perror prints an RPC error message on the stream that is associated with stderr of the invoking process. The call to clnt_destroy destroys the connection between client and server.

2.4 Binding
How to locate an RPC server that can execute a given procedure in a network?
Can be done statically (i.e. at compile-time) or dynamically (i.e. at run-time).
Dynamic binding is supported by portmap daemons.

A problem that arises is how to locate the server in a network that supports the program with the desired remote procedures. This problem is referred to as binding. Binding can be done statically or dynamically. The binding we have seen in the last slide was most likely dynamic because the hostname was determined at run-time by parameter passing. Static binding is fairly simple, but seriously limits migration and replication transparency. With dynamic binding, the selection of the server is performed at run-time. This can be done in a way that retains migration and replication transparency.

Each host of a UNIX network that supports RPCs runs a particular daemon called portmap. Any RPC server registers its programs with the portmap daemon. Clients can contact a single portmap daemon to check whether the server supports a particular program or to get a list of all programs that reside on the server. Clients can also broadcast a search for a program to all portmaps of a network, and the portmap daemons with which the program is registered will respond. In that way the client can locate the server program without knowing the topology of the network. Programs can then easily be migrated from one server to another and be replicated over multiple hosts with full transparency for clients.
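The registration and lookup steps of dynamic binding can be sketched as a small in-memory table in C. This is a simulation of the idea only; it does not talk to a real portmap daemon, and the names registry_set and registry_getport, the table size and the convention of returning 0 for "not registered" are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MAPPINGS 16

/* One registration: a program version and the port it listens on. */
struct mapping {
    unsigned long prog;
    unsigned long vers;
    unsigned short port;
};

static struct mapping registry[MAX_MAPPINGS];
static size_t registry_len = 0;

/* Server side: register a program version under a port.
   Returns 1 on success, 0 if the pair is already registered
   or the table is full. */
int registry_set(unsigned long prog, unsigned long vers, unsigned short port)
{
    size_t i;

    assert(port != 0);  /* 0 is reserved here to mean "not registered" */
    for (i = 0; i < registry_len; i++) {
        if (registry[i].prog == prog && registry[i].vers == vers)
            return 0;
    }
    if (registry_len == MAX_MAPPINGS)
        return 0;
    registry[registry_len].prog = prog;
    registry[registry_len].vers = vers;
    registry[registry_len].port = port;
    registry_len++;
    return 1;
}

/* Client side: ask where a program version can be reached.
   Returns the port, or 0 if the program version is not registered. */
unsigned short registry_getport(unsigned long prog, unsigned long vers)
{
    size_t i;

    for (i = 0; i < registry_len; i++) {
        if (registry[i].prog == prog && registry[i].vers == vers)
            return registry[i].port;
    }
    return 0;
}
```

Because the client only names the program and version, the server is free to pick a different port after a restart or a migration; the client learns the current one at run-time.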

2.5 Handling of Remote Procedures
Call handled synchronously by server.
Concurrent RPCs: handled serially or concurrently.
Server availability: continuous or on-demand.

Remote procedure calls are handled synchronously, i.e. the client stub does not return until it has received the result from the server stub or an indication of a failure. Hence, from a client's perspective, remote procedure calls are synchronized in the same way as local procedure calls.

Different clients may invoke a remote procedure at the same time. The question is how the server reacts to this. A server may queue requests and execute them one after another in a serialized manner, or it may spawn a process or thread for each separate request. Which of these two approaches is more appropriate depends on the application and therefore cannot be decided in general. RPC frameworks usually support both approaches.

Another design choice is whether a remote procedure program is always available or has to be started on demand. For startup on demand, the RPC server is started by the inet daemon as soon as a request arrives. An additional configuration table has to be administered that provides a mapping between remote procedure program names and the location of programs in the file system. If an RPC program is always available, requests will be handled more efficiently because startup overheads do not arise. On the other hand, it uses resources continuously.

2.6 Failures of RPCs
Machines or networks can fail at any time.
At most once semantics.
RPC return value indicates success.
Up to the client to avoid maybe semantics!

Quite a number of things can go wrong during the invocation of a remote procedure. For example, request messages can be lost, reply messages can be lost, the server may fail during the execution, or the client may fail while it is waiting for the remote procedure call to return. The semantics that RPCs usually have is at most once. This means that the remote procedure is executed no more than once per call, and in case of a failure the requester is notified of the failure.

Note that with respect to failures, remote procedure calls are totally different from local procedure calls. As we have seen, clients are informed by means of return values. A null pointer returned from a remote procedure call indicates that the procedure has not been executed properly. However, the clients do not get to know why the procedure call has failed. This makes it difficult for a client to decide what to do in the presence of a failure. If the client does not evaluate the return value of a procedure call, there is only a maybe semantics. Hence failure transparency is not achieved by remote procedure calls.
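One common way for a server to avoid executing a request twice is to remember the identifiers of requests it has already handled, together with their replies. The following C sketch simulates this under stated assumptions: the notes do not say how a particular RPC library achieves at-most-once behaviour, and the names handle_request and increment, the fixed history size and the integer-only procedure are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define HISTORY_SIZE 8

/* A remembered request: its message identifier and the reply sent. */
struct history_entry {
    unsigned long xid;
    int reply;
    int used;
};

static struct history_entry history[HISTORY_SIZE];
static size_t next_slot = 0;

static int executions = 0;  /* how often the procedure really ran */
static int counter = 0;     /* server state changed by the procedure */

/* A non-idempotent procedure: executing it twice gives a wrong state. */
static int increment(int by)
{
    executions++;
    counter += by;
    return counter;
}

/* Handle a request message. A retransmitted request (same identifier)
   is answered from the history instead of being executed again. */
int handle_request(unsigned long xid, int arg)
{
    size_t i;
    int reply;

    for (i = 0; i < HISTORY_SIZE; i++) {
        if (history[i].used && history[i].xid == xid)
            return history[i].reply;  /* duplicate: do not execute again */
    }

    reply = increment(arg);

    assert(next_slot < HISTORY_SIZE);
    history[next_slot].xid = xid;
    history[next_slot].reply = reply;
    history[next_slot].used = 1;
    next_slot = (next_slot + 1) % HISTORY_SIZE;  /* oldest entry is overwritten */
    return reply;
}
```

The history in this sketch is bounded, so a very late retransmission whose entry has already been overwritten would be executed again. A real implementation has to decide how long identifiers are kept, and the client still has to check the return value to learn whether the call succeeded at all.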

3 The CORBA Object Model
Components → objects.
Visible component state → object attributes.
Usable component services → object operations.
Component interactions → operation execution requests.
Component service failures → exceptions.

The component model of CORBA is based on the object-oriented paradigm. Hence components are seen as objects. Object-orientation seems appropriate due to its support of data abstraction and reuse.

Attributes of objects are used to model the externally accessible state of components; this component state can thus be considered as the set of its current attribute values. The state of a teller machine controller, for instance, is the list of teller machines it currently controls, and the list of banks' account databases it is currently connected to.

A component offers a set of services to other components. The teller machine controller, for instance, offers a service that an automatic teller machine can use to validate a customer's cash withdrawal request. These services define the component's behavior. They are modeled in terms of operations exported by the object.

The interaction of a component with a remote component is modeled in CORBA in terms of operation execution requests and responses to these requests.

Service executions may fail. These failures may be due to some problem common to any distributed application, such as the exhaustion of memory available to a component. In these cases failures may be described in a generic fashion. Failures may also be due to some application-specific problem. In the teller machine network example, the cash withdrawal may fail because there is not enough money in the customer's bank account. These application-specific failures have to be dealt with by the application. The component model, therefore, has to provide appropriate concepts. The CORBA object model provides exceptions for this purpose.

3 The OMG Interface Definition Language
OMG/IDL is a language for expressing all concepts of the CORBA object model.
OMG/IDL is programming-language independent, oriented towards C++, and not computationally complete.
Different programming language bindings are available.

So far, we have only discussed abstract concepts of the CORBA object model. A language with primitives for expressing the different concepts is needed in order to use these concepts during the construction of a distributed system. The OMG has standardized an interface definition language (IDL) for that purpose. IDL is not bound to a particular programming language, though its definition has been influenced by C++ (and Java). The interface definition language can be used to declare the exported properties of object types. IDL is not computationally complete. It does not have concepts for defining variables and algorithms, as these should not be exposed at the interface level.

The OMG has standardized IDL programming language bindings for C, C++, Smalltalk and Ada-95. Bindings for OO-Cobol and Java are being standardized at the moment. Objects whose interfaces have been declared in IDL can be implemented using these programming language bindings. A language binding is also used to request the execution of operations that are specified in IDL from a particular programming language. The advantage of this approach is that programming language interoperability is achieved: an interface defined in IDL can, for instance, be implemented in C++. It can be used by Smalltalk objects, C functions and Java applets at the same time.

3 Automatic Teller Machine Example
We use the Automatic Teller Machine Network from Lecture 1 as a running example here to explain the CORBA object model.

3.1 Types of Distributed Objects
Attributes, operations and exceptions are properties objects may export to other objects.
Multiple objects may export the same properties.
Only define the properties once!

Attributes, operations and exceptions are defined in object types. Many different objects share the same static properties (attributes, operations and exceptions). In the teller machine network, for instance, there are almost certainly multiple tellers of the same brand, each of which has to have an attribute for storing the amount of cash that is still available. It is, therefore, unreasonable to have a designer of a distributed system describe the properties of each of these objects separately. They should rather be defined only once for all 'similar' objects.

The CORBA object model, therefore, introduces the concept of object types. They are a vehicle to define properties shared by all objects that are of that type. Types are also used for object creation: objects are instantiated from a type. The type they are instantiated from is referred to as their type. Objects keep their type during their whole lifetime.

The CORBA object model incorporates a static type system. This type system is used to verify at compile-time that an object has a certain property at run-time.

3.1 Types
A type is one of the following:
Atomic types (void, boolean, short, long, float, char, string),
Object types (interface),
Constructed types: Records (struct), Variants (union), and Lists (sequence), or
Named types (typedef).

IDL supports different kinds of types. Types are used to determine the domain of attributes, parameters and operations. IDL defines a number of types that are atomic in the sense that they cannot be decomposed further. Types such as boolean, short, char and string are examples of those.

The most important kind of types are object types. They are defined as interfaces, which declare the attributes, operations and exceptions supported by that type. The values of these types are pointers that refer to objects. Values of object types are sometimes referred to as object references.

Type constructors can be used to construct more complex types from atomic types. Record types consist of named components that have a type. The name is used to select the component. Variants can be used to define different incarnations of a record type that depend on the value of a selector. Lists are potentially unbounded sequences of elements of a base type.

Types can be given a name using the typedef construct so as to use them more easily when defining parameter types, attribute types or return types.

3.1 Types (Examples)

    struct Requester {
      long PIN;
      string AccountNo;
      string Bank;
    };
    typedef sequence<ATM> ATMList;

To represent a customer in our automatic teller machine network, a record consisting of the customer's personal identification number, the account number and the code of the bank may have to be defined. Hence the atomic types long, string and string are combined with the struct type constructor. To maintain the list of automatic teller machines that a teller machine controller is using, the type ATMList is defined as a sequence of teller machine objects, which we assume are of the object type ATM. Here the sequence type constructor is applied to define a list type, and then the type is given a name with the typedef operator.

3.2 Attributes
Attributes are declared uniquely within an interface.
Attributes have a name and a type.
Type can be an object type or a non-object type.
Attributes are readable by other components.
Attributes may or may not be modifiable by other components.
Attributes correspond to one or two operations (set/get).
Attributes can be declared as read-only.

An object type may declare an attribute by specifying the attribute name and its type. The attribute name will be used by remote components to access or even update the attribute's value. The domain of the attribute value is defined by the attribute type. The attribute type can denote an object or a non-object type. If it denotes an object type, the attribute value refers at run-time to an instance of that object type. If it denotes a non-object type, the attribute contains at run-time a value of that type.

An attribute declaration determines whether or not other components may modify the attribute value at run-time. Attributes will be implemented in terms of operations. For modifiable attributes, two operations will have to be provided. The first operation will be used to set the attribute value. The second operation will be used to read the value. For attributes that can only be read, the set operation is not available.

Attributes should be used very carefully! You should always ask yourself whether or not the particular portion of the component's state modeled by an attribute really has to be accessible or even updatable from remote components. The component implementation may always have additional attributes that are not exposed for use by other components. So there is no need to introduce attributes just to store state information of an object.

3.2 Attributes (Examples)
    readonly attribute ATMList ATMs;
    readonly attribute BankList banks;

Within an interface that defines the automatic teller machine controller, we may want to maintain the collection of object references by means of which the controller is connected to automatic teller machines and banks' account databases. For that purpose, we define two attributes whose names are ATMs and banks. The type of ATMs is ATMList, which has been defined on a previous slide. The attribute ATMs will contain object references of all teller machine objects that are handled by the controller. By means of these references the controller can invoke operations of the teller machines, for instance to find out how much cash is left in the dispenser. The attributes are declared as readonly. This prevents other distributed objects from modifying the contents of the lists.

3.3 Exceptions
Service requests in a distributed system may not be properly executed.
Exceptions are used to explain the reason of failure to the requester of an operation execution.
Operation execution failures may be generic or specific.
Specific failures may be explained in specific exceptions.
Exceptions have a unique name.
Exceptions may declare additional data structures.

There are many reasons why a service request in a distributed system may fail. It may be due to a failure in the component providing a service, it may be due to a network error, it may be due to a time-out originating in an overload of the service provider, or it may be due to the fact that the request is not acceptable in the first place. There are a number of design choices for dealing with failures. It has to be decided whether or not a service requester should be aware that a failure has occurred, and it must be decided how often service execution is retried. The CORBA object model takes a rather simple approach. Exceptions are a mechanism to inform a requester object about a failure.

Operation execution may fail due to a system error (e.g. the network being down or a component not being reachable) or due to an error specific to the object (no money in the customer's bank account). Failures that are specific to an object could be handled differently, by additional operation executions. The teller machine, for instance, could ask first whether or not there is enough money in the bank account. This, however, would increase the number of operation requests and increase network traffic and load on the server component.

Exceptions may be raised explicitly by the service-providing object or implicitly by the distributed operating system. If no exception has been raised, the client knows that the server has performed the request properly. An exception may have additional data that informs the client object about the nature of a failure.

Exceptions have a unique name. The name is used in operation signatures to declare the list of exceptions that the operation may raise. It is used in operation implementations to actually raise the exception, and it is used in clients of the object to check for an occurrence of the exception. Exceptions may or may not declare additional data structures that are used for informing clients in a more detailed fashion about the reason for a failure occurrence. These data structures are similar to records and declare named components that also have a type.

3.3 Exceptions (Example)

    exception InvalidPIN {};
    exception InvalidATM {};
    exception NotEnoughMoneyInAccount {
      short available;
    };

In the automatic teller machine controller, we may have an exception InvalidPIN that informs a teller machine that the PIN entered by a customer does not match the one that is encoded on the card. Likewise, the controller may have operations that take ATM references as arguments and raise an InvalidATM exception if the reference does not correspond to an ATM object that is being managed by the controller. The bank account database may want to raise an exception to signal that a customer does not have enough money in the bank account for a cash withdrawal request to be granted. The bank account database is kind enough to inform the requester how much money would be available, in the record associated with the exception.

3.4 Operations
Operations have a unique identifier.
Operations have a parameter list:
parameters are in, out or inout,
parameter identifiers are unique within the list, and
parameter types have been declared.
Operations have a result type.
Operations declare exceptions they may raise.

Client objects use operation identifiers as message names to request an operation execution from a server object. As only the operation identifier is considered (as opposed to C++, where operations may be overloaded and therefore parameter types are also considered), the operation identifier has to be unique.

The parameter list of an operation may have arbitrarily many parameters, though it has to be considered bad style if an operation has more than five parameters. In that case the operation almost certainly performs different tasks for these different parameters and should be split into different operations. Parameters can be used to pass values from the client to the server executing the operation and back. The direction is indicated by the mandatory keywords in, out and inout. As opposed to C++, parameter names are mandatory. They are a means to specify (informally) the meaning of the parameter. Parameters are typed. The type must have been declared before it can be used as a parameter type.

Operations have a result type, which determines the type of the expression that is associated with the operation invocation.

Operations declare the list of specific exceptions that they might raise, to signal to client objects that they have to catch these exceptions.

3.4 Operations (Examples)

    void accept_request(in Requester req, in short amount)
      raises(InvalidPIN, NotEnoughMoneyInAccount);
    short money_in_dispenser(in ATM dispenser)
      raises(InvalidATM);

In response to a customer's request, the teller machine has to validate the request and forward it to a bank. As the machine does not know the bank itself, it has to rely on a service provided by operation accept_request of the teller machine controller. It passes the details of the requester and the amount of money he or she has asked for as parameters, and the controller deals with the request. The controller may raise exceptions if the customer has entered a wrong personal identification number, or if the controller has been informed by the account database of the customer's bank that the customer does not have sufficient money left in the bank account.

A teller machine maintenance application that runs in a bank does not want to know all the details of all the teller machines. It wants to be able to ask the controller, for instance, how much cash is left in a particular machine in order to find out whether the machine has to be refilled. It can use operation money_in_dispenser for that purpose. The machine that the controller is to query is identified by an object reference passed as an in parameter. The operation returns the amount of money as a short integer if the dispenser object passed as a parameter is managed by the controller, and raises an exception otherwise.

Note how exceptions are employed in this example to reduce the number of operation invocations that are needed to validate and perform a service. Without exceptions there would have to be an additional operation validate_pin to check the correctness of the entered PIN and another operation validate_account for the teller to check whether there is enough money left in the customer's account. Hence, three times as many operation invocations would be required to perform the service as in the case where exceptions are used.
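
The point about exceptions saving invocations can be sketched in C++ as follows. This is an illustration under assumptions, not the code an IDL compiler would generate: the fields of Requester and the helper request_cash are hypothetical, and the exceptions are modelled as plain structs.

```cpp
#include <cassert>
#include <string>

// C++ stand-ins for the IDL exceptions of the example.
struct InvalidPIN {};
struct NotEnoughMoneyInAccount { short available; };

// Hypothetical stand-in for the IDL type Requester; its fields are assumptions.
struct Requester {
    int entered_pin;
    int card_pin;
    short balance;
};

// One invocation both validates and accepts the request.
// Failures are reported through exceptions instead of separate validate_* calls.
void accept_request(const Requester& req, short amount) {
    assert(amount > 0);  // precondition of this sketch
    if (req.entered_pin != req.card_pin) throw InvalidPIN{};
    if (amount > req.balance) throw NotEnoughMoneyInAccount{req.balance};
    // ... the request would be forwarded to the bank here ...
}

// Hypothetical client-side helper: a single call, with the failure cases caught.
std::string request_cash(const Requester& req, short amount) {
    try {
        accept_request(req, amount);
        return "dispensed " + std::to_string(amount);
    } catch (const InvalidPIN&) {
        return "invalid PIN";
    } catch (const NotEnoughMoneyInAccount& e) {
        // The exception carries the available amount back to the client.
        return "only " + std::to_string(e.available) + " available";
    }
}
```

The client issues one call per withdrawal; the two failure cases travel back on the same invocation rather than requiring validate_pin and validate_account round trips beforehand.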

3.5 Interfaces

Attributes, exceptions and operations are defined in interfaces.
Interfaces have an identifier, which denotes the object type associated with the interface.
Interfaces must be declared before they can be used.
Interfaces can be forward declared.

Object types are defined within interfaces. An object type declares all the attributes, exceptions and operations that are exported by the type. An interface has an identifier which denotes the name of the object type. It can be used to declare types in attributes, operations, parameters and exceptions. Interfaces must have been declared before they can be used, but there is a way to declare the existence of an interface without providing the details (a forward declaration). This is similar to imports in Modula-2 definition modules, though the forward declarations can appear anywhere and do not have to be grouped into an import section.

3.5 Interfaces (Example)

    interface ATM;
    interface TellerCtrl {
      typedef sequence<ATM> ATMList;
      exception InvalidPIN;
      exception NotEnoughMoneyInAccount {...};
      readonly attribute ATMList ATMs;
      readonly attribute BankList banks;
      void accept_request(in Requester req, in short amount)
        raises(InvalidPIN, NotEnoughMoneyInAccount);
    };

The type for teller controller objects is defined in interface TellerCtrl. We have seen parts of these declarations before; we now discuss how they fit together into an interface. The TellerCtrl interface defines ATMList as a list of teller machine objects. It uses type ATM as the argument for the sequence type constructor. To make this declaration valid, the object type ATM has to be known. It is therefore forward declared in the first line. The two exceptions that may be raised by the operation have to be declared before the operation can declare that it may raise them. Including the attributes as readonly attributes enables client objects to send a message with the attribute name to any server object of type TellerCtrl, and the server object will return the respective attribute value. The operation accept_request can now be invoked by sending a message with that name to any object of type TellerCtrl.

3.6 Modules

A global name space for identifiers is unreasonable.
IDL includes modules to restrict the visibility of identifiers.
Identifiers from other modules are accessed by qualification with the module identifier.

A large distributed system may be composed of many different objects and may have many different object types. These object types may not have been defined all at once, but may have evolved over time. Moreover, a team or even several geographically dispersed teams of engineers may have contributed object types for the distributed system. It would be unreasonable to assume a global name space for identifiers such as object type names, types and exceptions. If there were only one name space for identifiers, name clashes could happen and identifiers would have to be renamed. This can be very costly if components that have already been constructed have to be integrated a posteriori into a distributed system.

IDL, therefore, provides modules, a concept which restricts the visibility of identifiers such as interface names, type names and exception names. A module has an identifier itself, which gives a name to the name space induced by the module. Declarations in other modules can use the declarations provided by a module by qualifying their names with the module name. To do so, the module name precedes the identifier to be used and the two are separated by a double colon (::).

3.6 Modules (Example)

    module Bank {
      interface AccountDB {};
    };
    module ATMNetwork {
      typedef sequence<Bank::AccountDB> BankList;
      exception InvalidPIN;
      interface ATM;
      interface TellerCtrl {...};
    };

Let us continue our example of a network of teller machines. Within the network, there are fundamentally different components, such as the accounts database running at a bank and the automatic teller machines together with their controllers. Most likely these different components have come from different vendors, and most likely it is a different contractor or the in-house IT department who is to integrate them. To make this integration easier, the object types that comprise the account database should be defined in a different module from those types that belong to the teller machines and their controllers. Note how one module can make use of definitions declared in the other module. BankList is defined in module ATMNetwork as a list consisting of AccountDB objects. The definition of AccountDB is imported from module Bank by preceding the type name with the module name.
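
C++ namespaces offer the closest analogue to IDL modules, so the scoping rule can be tried out in a few lines. The sketch below is an illustration only: the field bank_name and the functions describe and count_banks are hypothetical and were added to show that the same identifier may live in two name spaces without clashing.

```cpp
#include <cassert>   // only needed for the checks in the usage note
#include <cstddef>
#include <string>
#include <vector>

namespace Bank {
    // Stand-in for interface AccountDB; the field is an assumption.
    struct AccountDB { std::string bank_name; };
    inline std::string describe() { return "Bank"; }
}

namespace ATMNetwork {
    // Corresponds to: typedef sequence<Bank::AccountDB> BankList;
    using BankList = std::vector<Bank::AccountDB>;

    // Same identifier as Bank::describe, but no clash: each lives in its own name space.
    inline std::string describe() { return "ATMNetwork"; }

    inline std::size_t count_banks(const BankList& banks) { return banks.size(); }
}
```

A client would write ATMNetwork::BankList and Bank::AccountDB, using the same double-colon qualification as in IDL, and could check for example assert(Bank::describe() != ATMNetwork::describe()).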

3.7 Inheritance

Properties shared by several types should be defined only once.
Object types are organized in a type hierarchy.
Subtypes inherit attributes, operations and exceptions from their supertypes.
Subtypes can add more specific properties.
Subtypes can redefine inherited properties.

Often similar object types share the same attributes, operations and exceptions. A controller for an automatic teller machine may share some properties with a controller for a street toll machine. It is desirable to describe these properties shared by multiple types only once and then specialize the types. Different types of objects in the CORBA object model are, therefore, arranged in a type hierarchy. Common attributes, operations and exceptions of different types can be defined in a common supertype. Subtypes then need not define these properties again but inherit them from their supertypes.

An object is a direct instance of a type if and only if it has been instantiated from that type. An object is also an instance of all the supertypes of its type. A subtype may add specific properties to those inherited from a supertype. The subtype relationship is transitive. This means that if a type B is a subtype of A and another type C is a subtype of B, then C is also a subtype of A.

A subtype may redefine a particular property. It may give an operation, for instance, a slightly different behavior. Sometimes operations are abstract (or deferred) and are not implemented by a type T, but have to be implemented in all subtypes of T. Other types can then be sure that the operations exist in all instances of T. These abstract operations are often used to implement call-backs.

3.7 Inheritance (Examples)

    interface Controllee;
    interface Ctrl {
      typedef sequence<Controllee> CtrleeList;
      readonly attribute CtrleeList controls;
      void add(in Controllee new_controllee);
      void discard(in Controllee old_controllee);
    };
    interface ATM : Controllee {...};
    interface TellerCtrl : Ctrl {...};

In the automatic teller machine example we can imagine, for instance, that there are different controllers for different kinds of devices that handle cash or electronic money. The controllers have in common that they control a number of devices. This means that they all have to maintain a list of objects that represent the controlled devices. This is implemented in the example above by defining an interface for type Ctrl. The attribute controls is defined to be a sequence of Controllee objects. Two operations are provided to modify the attribute. The first operation adds a new Controllee to the list and the second operation deletes a Controllee from the list. In the last two lines we declare that object type ATM inherits all properties from Controllee and that TellerCtrl inherits all properties from Ctrl. These declarations then enable clients of TellerCtrl to enter ATM objects into the controls attribute of a TellerCtrl object.
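
The same hierarchy can be written down in plain C++ to see why a TellerCtrl accepts ATM objects. This is a local, non-distributed sketch under assumptions: object references are modelled with std::shared_ptr, and the classes are hand-written stand-ins rather than output of an IDL compiler.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Supertype of everything a controller can control.
struct Controllee {
    virtual ~Controllee() = default;
};

class Ctrl {
public:
    using CtrleeList = std::vector<std::shared_ptr<Controllee>>;
    virtual ~Ctrl() = default;

    // readonly attribute: only an accessor is exported.
    const CtrleeList& controls() const { return controls_; }

    void add(std::shared_ptr<Controllee> new_controllee) {
        assert(new_controllee != nullptr);  // this sketch rejects nil references
        controls_.push_back(std::move(new_controllee));
    }

    void discard(std::shared_ptr<Controllee> old_controllee) {
        controls_.erase(
            std::remove(controls_.begin(), controls_.end(), old_controllee),
            controls_.end());
    }

private:
    CtrleeList controls_;
};

// Subtypes inherit everything and could add more specific properties.
struct ATM : Controllee {};
class TellerCtrl : public Ctrl {};
```

Because ATM is a subtype of Controllee, a std::shared_ptr<ATM> can be passed to TellerCtrl::add, which TellerCtrl itself never declared but inherited from Ctrl.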

3.7 Inheritance (Multiple)

Multiple inheritance may cause name clashes if different supertypes export the same identifier. Example:

    interface Set {
      void add(in Element new_elem);
    };
    interface TellerCtrl : Set, Ctrl { ... };

Name clashes are not allowed!

If two components have to be combined in such a way that the new component gets all the properties of the existing components, this can be achieved in the OMG object model by defining the new component as a subtype of both components. The OMG object model, therefore, supports multiple inheritance. Consequently IDL allows an interface to inherit from multiple other interfaces. The inheritance list can therefore include several interface names that are separated by commas (,).

The introduction of multiple inheritance is not without problems. This is also the main reason why multiple inheritance is not available in all object-oriented programming languages. Smalltalk, for instance, does not include multiple inheritance for that reason. The problem is that, due to multiple inheritance, name clashes may occur if different supertypes export different properties that by chance have the same name. As an example consider another object type Set which exports an operation add. If we now define TellerCtrl to inherit from Set and from Ctrl, a name clash occurs since add is defined in both supertypes. This is disastrous because it is then not clear how a TellerCtrl object should react to the message add. It could choose to execute add from Set or from Ctrl.

There are different ways to cope with this problem. IDL has taken a rather crude approach and does not allow these name clashes to occur. To resolve them, one of the identifiers involved in the clash has to be renamed. There are more advanced approaches than this in the literature.
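
The ambiguity can be reproduced in C++, which, unlike IDL, accepts the declaration and only complains at the point of use. The sketch below illustrates the clash and one alternative resolution (explicit qualification); it does not show IDL behaviour, where the interface definition itself is rejected. The data members are hypothetical.

```cpp
#include <cassert>   // only needed for the checks in the usage note
#include <string>
#include <vector>

struct Set {
    std::vector<std::string> elements;
    void add(const std::string& new_elem) { elements.push_back(new_elem); }
};

struct Ctrl {
    std::vector<std::string> controls;
    void add(const std::string& new_controllee) { controls.push_back(new_controllee); }
};

// C++ allows this declaration even though both supertypes export add.
struct TellerCtrl : Set, Ctrl {};

// TellerCtrl t;
// t.add("x");        // would not compile: the name add is ambiguous
// t.Set::add("x");   // compiles: the caller states which add is meant
```

With a TellerCtrl t, the calls t.Set::add("element") and t.Ctrl::add("atm-1") end up in different lists, which is exactly the decision IDL refuses to leave open and instead forces the designer to settle by renaming.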

4.1 Object Management Architecture

[Figure: Application Objects, CORBAfacilities, CORBAservices, Object Request Broker]

The core of the object management architecture is an object request broker. It enables an object to request an operation execution from another distributed object. The objects can be very heterogeneous in the sense that they can be running on different hardware platforms with different operating systems. Objects that use an ORB are classified into application objects, CORBAservices objects and CORBAfacilities objects.

A number of problems occur in almost any distributed system. Examples of these problems are naming, trading, migration of components, concurrency control, transactions and the like. Object-based solutions to these problems have been standardized by the OMG within the framework of the CORBAservices. It is expected that every vendor of an object request broker provides implementations of these services. This will accomplish portability and interoperability of application objects.

The OMG has started to define a number of component interfaces that may be useful but will not necessarily be needed in every distributed system. These component interfaces are defined as CORBAfacilities. It is not mandatory for ORB vendors to provide these components.

Objects that are specific to a particular application are considered application objects. They can use or customize CORBAservices or CORBAfacilities objects and so leverage and reuse the solutions the OMG has defined and ORB vendors have implemented.

4.2 Accessing Remote Objects

[Figure: Client, Object Implementation, ORB Core, Dynamic Invocation, Client Stubs, ORB Interface, Implementation Skeletons, Object Adapter. Legend: one standardized interface; one interface per object operation; ORB-dependent interface; one interface per object adapter]

This is the CORBA architecture showing the interfaces among ORB, client and object implementation when they are involved in requests for remote object access at run-time. The central component of the CORBA architecture is an object request broker (ORB). An ORB receives a request to invoke an operation from a client object and forwards that request transparently to the server object. In particular, the ORB locates the server object based on the object reference provided by the client, transmits the request parameters to the server and returns the request results to the client object.

Client stubs and server skeletons in CORBA fulfill the same role that client and server stubs play in RPCs: they perform marshalling of operation parameters, transform them into a common format and manage the synchronization between client and server. Marshalling and unmarshalling of request parameters is performed by the dynamic invocation interface, client stubs and implementation skeletons, which are CORBA's server stubs. Client stubs and server skeletons are generated by an IDL compiler, which is an integral component of every CORBA product. The dynamic invocation interface supports the definition of requests at run-time. While doing so, the dynamic invocation interface interprets run-time type information in order to determine the marshalling algorithm.

The CORBA specification supports several object adapters. The Basic Object Adapter (BOA) was adopted as part of the first version of the CORBA spec. The BOA defines how server objects register with an ORB, how the ORB generates references for a server object and how server objects are activated and deactivated. The BOA, however, was under-specified in the first CORBA specs, which led to incompatible BOA implementations in the first CORBA products. The OMG has rectified this problem with the adoption of the Portable Object Adapter (POA) as part of the CORBA 2.2 spec. The POA defines registration, activation and deactivation more precisely and also supports persistent objects.

The ORB Interface includes a set of types that both client and server objects use for initialization purposes. Clients can, for example, use the ORB Interface to obtain a set of initial object references; servers use the interface to select an object adapter. The ORB Interface also specifies the common root type Object, whose operations and attributes every CORBA object inherits. Moreover, the ORB Interface specifies the operation of the Interface Repository.

CORBA achieves access and location transparency. Access transparency is achieved because the client stubs have exactly the same interface as the server objects. Hence client programs do not have to be changed when they switch from invoking a client stub to invoking the server object directly. A big advantage of CORBA is that it achieves location transparency, because all the client needs in order to make an object request is a reference to the server object. Clients do not have any knowledge about the host on which the server object resides.
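
The division of labour between client stub, ORB and skeleton can be imitated inside a single process. The following toy is a sketch under heavy assumptions and not a CORBA implementation: ToyOrb, the text-based wire format, the operation cash_left and the string object references are all hypothetical; only the roles of the parts follow the description above.

```cpp
#include <cassert>   // only needed for the checks in the usage note
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Interface of the object implementation (hypothetical operation).
struct AtmServant {
    virtual ~AtmServant() = default;
    virtual short cash_left() = 0;
};

// Skeleton: unmarshals the request, calls the servant, marshals the reply.
struct Skeleton {
    AtmServant& servant;
    std::string dispatch(const std::string& request) {
        std::istringstream in(request);
        std::string operation;
        in >> operation;
        std::ostringstream out;
        if (operation == "cash_left") out << "OK " << servant.cash_left();
        else out << "EX unknown operation";
        return out.str();
    }
};

// Toy ORB: locates the skeleton from an opaque object reference.
class ToyOrb {
public:
    void register_object(const std::string& ref, Skeleton* skeleton) {
        objects_[ref] = skeleton;
    }
    std::string invoke(const std::string& ref, const std::string& request) {
        auto it = objects_.find(ref);
        if (it == objects_.end()) return "EX no such object";
        return it->second->dispatch(request);
    }
private:
    std::map<std::string, Skeleton*> objects_;
};

// Client stub: offers the same operation as the servant and hides the ORB.
class AtmStub {
public:
    AtmStub(ToyOrb& orb, std::string ref) : orb_(orb), ref_(std::move(ref)) {}
    short cash_left() {
        std::istringstream reply(orb_.invoke(ref_, "cash_left"));  // marshal and send
        std::string status;
        reply >> status;
        if (status != "OK") throw std::runtime_error("remote invocation failed");
        short value = 0;
        reply >> value;                                            // unmarshal result
        return value;
    }
private:
    ToyOrb& orb_;
    std::string ref_;
};

// A trivial object implementation for trying the pieces out.
struct FixedAtm : AtmServant {
    short cash_left() override { return 250; }
};
```

The client only holds an AtmStub and a reference string such as "atm-1"; it never sees where the FixedAtm lives, which is the location transparency argued for above. In a real ORB the invoke step would cross process and machine boundaries.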

4.2 Stub/Skeleton Generation (for C++)

[Figure: IDL-Compiler, Person.idl, Client.cc, Server.cc, C++ Compiler and Linker, Personcl.hh, Personcl.cc, Personsv.hh, Personsv.cc, Client, Server; arrows labelled "includes", "generates" and "reads"]

This slide displays how client stubs and server skeletons are generated from the interface definition in IDL and how they are used in client and server objects. The slide shows the stub/skeleton generation for the C++ programming language binding as an example. The generation process for other language bindings is similar.

The stubs and skeletons generated for C++ are themselves classes. Following the C++ conventions they are generated in different files. The class interface definitions are generated in files ending in .hh, whereas the class implementations are generated in files ending in .cc. Stubs and skeletons are distinguished by different suffixes (cl and sv).

The client stub is included by the client program (Client.cc). As the client stub only exports operations that directly correspond to the operations identified in IDL, the C++ compiler indirectly checks that clients use the operations the server has exported in a proper way. Hence static invocation through stubs is type safe.

The server class (Server.cc) has to be implemented as a subclass of the server skeleton class (Personsv.*), in which a pure virtual member function (the same as a deferred method in Eiffel) is defined for each operation in the IDL interface. These pure virtual member functions have to be redefined in the implementation of the server class, and they will be invoked from methods in the server skeleton as soon as an exported operation is invoked from a remote object. Finally, server, client, stubs and skeletons have to be compiled and linked as sketched on the slide.
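
The relationship between the generated skeleton and the hand-written server class can be sketched as follows. The class names, the operation get_name and the string-based dispatch are assumptions made for illustration; the contents of Person.idl and the code a real IDL compiler emits into Personsv.hh are not shown in this lecture.

```cpp
#include <cassert>   // only needed for the checks in the usage note
#include <string>
#include <utility>

// Stand-in for the skeleton class an IDL compiler might generate.
class PersonSkeleton {
public:
    virtual ~PersonSkeleton() = default;

    // Called when a request arrives from a remote client
    // (unmarshalling is reduced to an operation name here).
    std::string dispatch(const std::string& operation) {
        if (operation == "get_name") return get_name();
        return "unknown operation";
    }

protected:
    // One pure virtual member function per IDL operation.
    virtual std::string get_name() = 0;
};

// Hand-written server class: a subclass that redefines the pure virtuals.
class PersonImpl : public PersonSkeleton {
public:
    explicit PersonImpl(std::string name) : name_(std::move(name)) {}

protected:
    std::string get_name() override { return name_; }

private:
    std::string name_;
};
```

The skeleton never knows PersonImpl by name. It calls get_name through the virtual function, so PersonImpl("Ada") answers dispatch("get_name") with "Ada", and forgetting to redefine an operation is caught by the C++ compiler because the class would remain abstract.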

5 Comparison

The RPC architecture lacks an interface repository.
IDL is more expressive than RPCL: inheritance, attributes and exceptions.
IDL has multiple standardized language bindings.
CORBA handles failures with exceptions, which are more expressive than returning a NULL pointer.

The comparison of OMG/CORBA and RPCs is now done on the basis of the conceptual framework that we established in the first part of this lecture. We note that the RPC architecture does not have an equivalent to the CORBA interface repository. Therefore, clients cannot check the type safety of RPCs that are dynamically defined at run-time. The definition of remote components in IDL is generally easier than a definition in RPCL. This is because IDL is generally more expressive. RPCL does not have counterparts for the IDL concepts of inheritance, attributes and application-specific exceptions. Remote components specified in IDL can be accessed through standardized programming language bindings for C, C++, Smalltalk, Ada-95, Java, and OO-Cobol. RPCs, however, can only be accessed using C or C++.

5 Comparison (cont'd)

RPCs may be more efficient than CORBA operation invocations.
RPCs come with the UNIX OS whilst you would have to buy a CORBA product.
CORBA is more widely available on non-UNIX platforms.
RPCs are lightweight.

Some CORBA products implement operation invocation requests on top of RPCs. Hence it can be expected that these products perform less efficiently than RPCs. However, there are also UDP-based CORBA implementations which outperform TCP-based RPCs.

RPCs have the advantage that they are simply there once you have installed a UNIX operating system. If you run Linux on your PC at home, it will have RPCs on it (even though you may not have noticed yet) and you are encouraged to toy around with them. To be able to use CORBA, however, you will have to buy a CORBA implementation, which can be rather expensive.

RPCs are only available on UNIX, but CORBA implementations are also available for other platforms. There are implementations for PCs running all forms of Windows, Macintosh, all Unix platforms, IBM VM/MVS, OpenVMS, Tandem and many others.

RPCs have the advantage that they are very lightweight. The RPC library that implements the RPC interface consists of only a small number of operations, and the stubs that are generated are fairly small. The person management application has a size of 25 Kbytes. CORBA applications tend to be much bigger.

6 Summary

The basic conceptual framework for remote object invocation in distributed systems.
Definition of RPC and how it works.
CORBA object model and IDL.
CORBA remote procedure definition and remote object invocation.
The similarities and differences between RPC and CORBA.
Read Textbook Chapters 5 and 20.

Read Textbook Chapter 5 about communication between distributed objects. Read Chapter 20 for a case study on CORBA, which will also be referenced in subsequent topics in the class.
