Informix Replication and Availability Offerings


1 Informix Replication and Availability Offerings

2 Agenda
Why Replicate?
Enterprise Replication
Flexible Grid
Continuous Availability
Updates on Secondary
Connection Manager
Continuous Log Restore

3 Terminology
Replication: the process of sharing information so as to ensure consistency between redundant resources.
Enterprise Replication (ER): an asynchronous, log-based tool for replicating data between IBM Informix Dynamic Server database servers.
HDR: a mechanism for maintaining a live copy of a database on another instance by applying the logical logs.
Hot backup: a backup performed on data while it is being actively accessed and modified.
Log snooping: the process of extracting records from the log stream.
Concurrency: a mechanism that ensures transactions can run concurrently without violating data integrity.

Informix clustering capabilities provide high availability. With the ability to do business around the world through the web, more and more businesses must stay up 24 hours a day, 7 days a week, 365 days a year. This leaves no room for downtime, planned or not. IDS availability features cover situations such as online backups, machine maintenance (by moving the load to another machine), and disaster recovery. Because any downtime costs a business, IDS is well suited to minimizing it.

4 Replication
A centralized IT architecture is no longer the sole corporate information resource. The migration towards distributed architectures continues to be driven by competitive, cost-cutting, and organizational forces that require companies to maximize their IT resources. One result of this decentralization trend is that organizations find themselves managing distributed databases that spread corporate information to all corners of the enterprise. Data replication is key to effectively distributing and sharing this information.

But replication is more than merely moving data from site to site; it is an extremely demanding technology, and the success of a replication system depends on addressing the full range of business and application needs. Replication systems must provide high performance without burdening the source database and without requiring application changes. They have to maximize system availability and ensure consistent delivery of the data. Finally, database administrators need to be able to configure and manage all the components of the replication system in a way that uses enterprise computing resources most effectively.

Replication technology is one of the primary strengths of Informix: scale out, scale in, concurrency, high availability, synchronous or asynchronous replication, no special hardware required, shared disk between instances, failover, load balancing via the Connection Manager, redundancy at every step of the way, and support for full disk mirroring at the hardware and software levels. Replicate all of the data with high-availability clustering, or just some of it with Enterprise Replication. Simple to implement, easy to use. Informix's high availability, Enterprise Replication, and Flexible Grid provide unmatched protection and scalability on low-cost hardware and software.

5 Why Replicate?
High Availability: provide a hot backup (secondary) to avoid downtime due to system failure.
Capacity Relief: offload some work onto secondary systems (reporting/analysis, a reporting database).
Workload Partitioning: different data 'owned' in different locations (warehouses).

Companies replicate data for a variety of reasons. The following are some, but not all, of them.

High availability is considered a primary reason for replication. Companies must be able to recover from the loss of a machine, a power outage, a disk head crash, the loss of a facility, and so on. If they cannot, it is very possible that the company will suffer significant financial loss. Ideally, a replication system used for high availability should lose no committed transactions and have very short latency.

Capacity relief: many of our customers use the secondary for reporting, especially web access. This relieves capacity on the primary, which can be dedicated to handling online traffic. The secondary is not just an idle hot standby, which makes it more cost effective.

Workload partitioning lets data be controlled locally yet viewed globally. It gives database administrators (DBAs) the flexibility to assign ownership of data at the table-partition level. For example, the replication schema can be mapped to the partitioning schema for the employee tables. The Asia/Pacific site has ownership of its partition and can update, insert, and delete employee records for personnel in its region. Any changes to the data are then propagated to the U.S. and European sites. While the Asia/Pacific site can query or read the other partitions, it cannot update them. Similarly, the U.S. and European sites can change data only within their own respective partitions but can query and read data in all partitions. Companies may want to decentralize operations, having each site "own" a subset of the data. This allows sites to maintain "their" data but see an entire database of activity; for example, a warehouse may want to control its own inventory but see availability at other locations.

6 Enterprise Replication
Uses: workload partitioning, capacity relief.
Flexible and scalable: replicates a subset of data, supports update anywhere, very low latency, synchronizes local with global data.
Integrated: compatible with all other Informix availability solutions; secure data communications.

Enterprise Replication (ER) provides asynchronous replication between machines. It supports uni-directional and/or bi-directional replication down to the rows and columns if needed. ER can scale to hundreds of machines. For this purpose, multiple topologies are supported:
Fully meshed: all nodes replicate to each other.
Hierarchical: a node replicates to one or more nodes that then replicate to others until the leaf nodes are reached.
Snowflake/forest of trees: a mix of the two other topologies that gives the flexibility needed for any business environment.

ER is ideal for a distributed enterprise where local processing provides the best performance while changes are replicated to other nodes as the business sees fit. This could be for consolidation into a headquarters/centralized site, or because a subset of the local data is shared with other regions. ER takes care of the distribution of the data, leaving the application to deal with the business processing rather than an infrastructure issue.

Enterprise Replication: designed for enterprise data distribution, supports active/active updates, very low latency.
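To make the topology discussion concrete, the following is a minimal sketch (not a definitive procedure) of registering two ER nodes with the cdr command-line utility. It assumes that matching server groups (g_serv1, g_serv2) already exist in sqlhosts and that the required CDR onconfig parameters and smart blob space are in place; the group names are invented for the example and exact option spellings can vary by Informix version.

# On the first node: declare it as the root ER server
cdr define server --init g_serv1

# On the second node: add it, synchronizing its global catalog from the root
cdr define server --connect=serv2 --init --sync=g_serv1 g_serv2

# Verify that both nodes can see each other
cdr list server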

7 ER Strengths
Flexible: choose which columns to replicate, choose where to replicate.
Supports update anywhere: conflicting updates resolved by timestamp, stored procedure, or always-apply.
Completely implemented in the server: no additional products to buy.
Based on log snooping rather than trigger-based capture.
Support for heterogeneous operating systems, Informix versions, and hardware.

Enterprise Replication supports updates from any server, and the ability to selectively replicate those changes at the table level is fully supported. Not all of the data has to be replicated from server to server in the cluster, so it is highly flexible in this regard. Data that does not change, such as master table data, does not have to be replicated because it does not change once loaded. Each server can have different replicates relative to the other servers in an ER cluster.

ER is part of the core DSA architecture. It employs a series of separate threads to capture, fan out, and apply replicated data. SQL evaluation statements are parallelized, as are network I/O calls. This minimizes the performance impact, allows great scalability, and makes Enterprise Replication suitable for low, medium, or very high replication rates. Enterprise Replication has a sophisticated configuration suite that allows definitions to be made in one place before being propagated to other DBMS locations. Enterprise Replication can support replication at the single-table level or by grouped tables; in the latter case, updates are grouped into transactions and applied in an identical order to retain transaction integrity.

Strengths and their benefits:
Log-based capture: performance.
DSA architecture: performance.
Update anywhere: flexibility and performance.
Fine granularity: flexibility.
Conflict resolution algorithms: data integrity.
Supports all data types, including BLOBs: multimedia support.
GUI administration tools: ease of use.
Global administration: resilience.
Immediate replication: data integrity.
Multiple model support: flexibility.
Transaction based (grouped tables): data integrity.
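As an illustration of the conflict-resolution choices listed above, here is a hedged sketch of defining and starting a replicate that resolves conflicts by timestamp. The database (stores), groups (g_serv1, g_serv2), and table (customer) are placeholders, and flag spellings may differ slightly between versions.

# Replicate the customer table between two nodes, resolving conflicts by timestamp
cdr define replicate --conflict=timestamp --scope=row repl_customer \
    "stores@g_serv1:informix.customer" "select * from customer" \
    "stores@g_serv2:informix.customer" "select * from customer"

# Begin capturing and applying changes for this replicate
cdr start replicate repl_customer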

8 Informix Flexible Grid
What does Informix Flexible Grid (Grid) provide?
The ability to mix hardware, software, and versions of Informix in the Grid.
Centralized, simultaneous administration of servers and databases in the Grid.
Workload balancing across nodes in the Grid.
Rolling upgrades.
Instance cloning.
Selective data replication if desired.
Virtually eliminate downtime while providing uninterrupted data services.
The ability to create small to massive grids easily.

As your data processing requirements grow, you can end up collecting a disparate group of data servers to address various needs. Often data and functionality are duplicated among these servers, and administration can become very difficult. Some servers may be running in the red while others remain mostly idle. Informix Flexible Grid offers a solution to these problems. With the flexible grid you are able to administer your data servers as if they were a single entity. In addition, the flexible grid lets you balance your workload across all your servers regardless of hardware or operating system. Flexible Grid leverages Informix Enterprise Replication (ER); a significant amount of work has gone into ER to remove previous limitations, enhance its functionality, and greatly simplify its usage.

9 Informix Flexible Grid
Diagram: a Flexible Grid spanning AIX, Linux, Solaris, and Windows clusters, each with its own Connection Manager and secondary servers, plus a Connection Manager for the grid as a whole.
Flexible Grid: balance workloads, fewer DBAs, re-use current hardware, avoid platform lock-in, scales globally, easily managed.

As business needs grow, the demand for a second cluster in a remote city is established. This cluster is built on a different hardware platform, and local requirements determine access to a different set of database tables. As business needs continue to evolve and access is needed to content from both clusters, a Flexible Grid is created, which uses ER to connect the primary servers in the clusters, as well as other ER servers. Users access the grid Connection Manager to reach the entire grid. Users needing access only to information on individual clusters can still use that cluster's Connection Manager. The new solution offers the "global" application users quick and easy access to all information available on the Grid.

10 Informix Flexible Grid
Informix Flexible Grid (Grid) replication is an enhancement to ER replication, with specific syntax for Grid versus "regular" ER configuration and administration. While ER is heterogeneous in terms of hardware and Informix version, Grid requires all instances to be on Informix 11.70 or higher; Grid is still heterogeneous from a hardware perspective, though. An Informix instance can be in an ER cluster, a Grid cluster, or both at the same time. This flexibility provides backwards compatibility while allowing for ongoing, forward-looking environment changes and adjustments.

Grid-based replication provides a means of propagating DDL and server administration commands across multiple nodes. It replicates the execution of a statement rather than just the results of the execution, and provides a means of supporting the Connection Manager on top of Enterprise Replication. ER is now even easier to configure and manage with the automatic setup of replication on tables when they are created. You can also replicate data using ER without a primary key, and grid-based replication provides the ability to turn ER replication on or off within a transaction, not just at the start of the transaction.

11 Informix Flexible Grid
Technically, Informix Flexible Grid provides the ability to:
Replicate data using ER without a primary key.****
Create ER replication as part of a CREATE TABLE DDL statement.
Replicate DDL statements across multiple nodes (create table, create index, create procedure, and so on).
Make instance changes across all members of the Grid: add/drop logical logs, chunks, dbspaces, update $ONCONFIG, etc.
Support the oncmsm connection manager against Grid clusters.
Replicate the execution of a statement rather than just the results of the statement executed somewhere else**** (helpful if you have triggers that execute locally on each node).
Turn ER replication on or off within the transaction, not just at the start of the transaction.****

Some of the new features of Grid computing, starting in IDS 11.7: tables no longer need a primary key to participate in ER; almost all DDL statements can be replicated across the nodes in an ER cluster; instance changes can be made across all members of the Grid; the Connection Manager can be configured, at a minimum, to load-balance connections to the members of the Grid; statements themselves can be replicated rather than their results; and ER can be turned off within a transaction rather than only at its start. This vastly extends the reach of ER into the corporate enterprise. Customers who needed ER in the past but were unable to implement it due to previous setup requirements should look at it now in a different light.

**** Has data consistency implications, reviewed later.
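One possible sequence for the grid capabilities above is sketched below: define a grid over existing ER nodes, authorize a node and user to run grid routines, and then propagate a DDL statement by wrapping it in the ifx_grid_connect()/ifx_grid_disconnect() procedures. Grid, node, database, and table names are placeholders, and option spellings should be checked against your version's documentation.

# Define a grid over existing ER nodes and authorize a source node (run once)
cdr define grid corp_grid g_serv1 g_serv2 g_serv3
cdr enable grid --grid=corp_grid --user=informix --node=g_serv1

# From the authorized node, replicate the execution of a DDL statement
dbaccess stores - <<'EOF'
EXECUTE PROCEDURE ifx_grid_connect('corp_grid');
CREATE TABLE orders (order_num SERIAL, amount DECIMAL(10,2));
EXECUTE PROCEDURE ifx_grid_disconnect();
EOF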

12 HDR Replication
Uses:
High availability: takeover from the primary.
Capacity relief: distribute workload.
Secondary available for read-only queries.
Simple to administer.
Integrated: compatible with all other Informix availability solutions.

As early as IDS Version 7, Informix adopted HDR technology, which is fully integrated within the data server. HDR is very easy to set up and administer and does not require any additional hardware or software to automatically handle server or disk failures. HDR maintains two identical IDS server instances on machines with similar configurations and operating systems. It employs a log-record-shipping technique to transfer the logical log records from the primary server to the secondary server. The secondary server is in perpetual roll-forward mode so that data on the secondary remains current with data on the primary. The secondary server supports read access to data, allowing database administrators to spread workload among servers.
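A minimal HDR setup sketch follows, assuming two instances with compatible configurations and a tape or file device already configured; the server names (serv_pri, serv_sec) are placeholders, and the exact ordering of steps and DRAUTO/DRINTERVAL settings should be checked against the documentation for your version.

# On the primary: take a level-0 archive and register the secondary
ontape -s -L 0
onmode -d primary serv_sec

# On the secondary: restore the archive, then register the primary
ontape -p
onmode -d secondary serv_pri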

13 Strengths of HDR
Easy setup: just back up the primary and restore on the secondary; no significant configuration required.
Secondary can be used for dirty reads.
Provides failover to the secondary, with automatic failover when DRAUTO is set.
Stable code: has been part of the product since version 6.
Integrates easily with ER.

The secondary server can be configured to operate in synchronous (SYNC) or asynchronous (ASYNC) mode. In SYNC mode, HDR guarantees that when a transaction is committed on the primary server its logs have been transmitted to the HDR secondary server. In ASYNC mode, transaction commitment on the primary and transmission of updates to the secondary are independent, providing better performance but with a possible risk of lost transactions.

HDR provides automatic failover to redirect client applications to the new primary server without missing a beat. With the DRAUTO parameter set, if the primary server fails, the HDR secondary server automatically takes over and switches to a standard or primary server (based on the DRAUTO value). When the original primary server becomes available, it is synchronized when HDR is restarted.

HDR also supports automatic client redirection, which makes failover transparent to the application. To activate automatic client redirection, the primary and secondary servers must be defined as a group in the SQLHOSTS file. Clients use the group name to connect to the IDS server. The network layer and the client-server protocol ensure that the client is always connected to the primary server in the group. If the primary server fails and the secondary server becomes the new primary, clients connected to the group are automatically connected to the new primary. This means that end-user applications will not experience an outage, even though the application is now pointing to a different database server.
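The automatic client redirection described above relies on grouping the HDR pair in the SQLHOSTS file. A sketch of such a group follows; the host names, service names, and group index are illustrative.

# sqlhosts on the client: define the pair as a group and connect through the group name
g_hdr     group     -        -         i=10
serv_pri  onsoctcp  host_a   svc_pri   g=g_hdr
serv_sec  onsoctcp  host_b   svc_sec   g=g_hdr

# Clients then set INFORMIXSERVER=g_hdr and are always routed to the current primary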

14 Remote Standalone Secondary
New type of secondary: RSS nodes.
Can have 0 to N RSS nodes; can coexist with an HDR secondary.
Uses: reporting, web applications, additional backup in case the primary fails.
Can write to these as well with one configuration parameter change.
Similarities with an HDR secondary node: receives logs from the primary; has its own set of disks to manage; primary performance does not affect RSS, and RSS performance does not affect the primary.
Differences from an HDR secondary node: can only be promoted to HDR secondary, not primary; is updated asynchronously only; only manual failover is supported.

Replication to multiple remote secondary nodes: Remote Standalone Secondary servers extend HDR by allowing multiple copies of the database in both local and geographically remote locations. These secondary servers, like the HDR secondary, can be accessed by client applications for query activity. Logical logs are continuously transmitted from the primary server and applied to the database on the RSS server. A remote secondary server is a complete copy of the primary server, similar to HDR. The major difference from HDR is that RSS uses only asynchronous replication, so it cannot by itself guarantee zero transaction loss on failover.

One benefit of RSS is that it provides support for multiple copies of the primary machine. These copies can be used as multi-level disaster recovery in conjunction with HDR, since an RSS server can be promoted to an HDR secondary. If the primary server fails, the HDR secondary is promoted to primary and one RSS server can be promoted to HDR secondary; at this point the production system still has protection from site failure. Another benefit is that it adds very little overhead, through the use of full-duplex communication via the server multiplexer (SMX), which also supports multiple logical connections between servers over TCP/IP. The asynchronous replication also makes it easier to tolerate network delays that would be unacceptable for HDR. Finally, an RSS server can be used to offload tasks such as reporting and data analysis, leaving more cycles on the primary for production tasks.

15 Usage of RSS: Additional Capacity
A customer needs to add capacity for its web applications; adding additional RSS nodes may be the answer.

There are several business cases for using RSS instances. The first is to expand the real-time failover capability of an HDR environment. You can create an HDR secondary and host its physical server in close proximity to the primary to cover hardware and other transient failures, while maintaining one or more real-time RSS copies well away from the site for disaster tolerance and eventual failover. This facility also provides additional workload capacity when the primary and HDR servers are being stressed. It works best for workloads that are primarily reads, such as reporting and analysis.

16 Usage of RSS – Availability with Poor Network Latency
RSS uses a fully duplexed communication protocol, which allows it to be used where network communication is slow or not always reliable. Example scenario: a customer in Dallas wants to provide copies of the database in remote locations (Memphis, New Orleans) but knows there is high latency between the sites.

RSS servers use a fully duplexed communication protocol, allowing the primary server to send data to the RSS servers without waiting for an acknowledgement that the data was received. This means that RSS servers have very little impact on the primary server's performance. Many RSS servers can be established, providing backup systems in remote locations around the world and delivering data close to where it is needed. The HDR secondary is still the server that is failover-ready from the primary. There are three ways that an RSS server can change roles:
1. The failover-ready HDR server becomes unavailable. In this case, one of the RSS servers can be assigned the failover-ready HDR server role.
2. The primary server becomes unavailable, and the current HDR secondary assumes the role of the primary. One of the RSS servers can then be assigned the role of the HDR secondary server.
3. Both the primary and the HDR servers become unavailable. One of the RSS servers can become the primary server, and another RSS server can then be assigned the HDR secondary role.

Multiple RSS servers in geographically diverse locations can be used to provide continuous availability and faster query access than if all users were directed to the primary server. Read-only application traffic can be sent to local RSS servers; for example, RSS servers can feed data to web applications that do not require up-to-the-minute data currency. If the applications need to update the data, they connect to the primary; otherwise they read the data from the local RSS server. This configuration reduces network traffic and the time required by the application to access the data.

17 Usage of RSS – Bunker Backup
A customer currently has their primary and secondary in the same location and is worried about losing both in a disaster. They would like an additional backup of their system in a remote location for disaster recovery. Using HDR to provide high availability is a proven choice; additional disaster protection is provided by using RSS to replicate to a secure 'bunker'.

Because RSS uses fully duplexed communication, RSS servers have very little impact on the primary server's performance. Many RSS servers can be established, providing backup systems in remote locations around the world and delivering data close to where it is needed. RSS servers in geographically diverse locations provide a defence against a disaster in the local environment of the core database servers, such as the primary, the HDR secondary, or other RSS servers that are located closer to each other.

18 HDR with Multiple Shared Disk Secondary Nodes
SDS nodes share disks with the primary; you can have 0 to N SDS nodes.
Uses: adjust capacity online as demand changes; does not duplicate disk space.
Features: doesn't require any specialized hardware; simple to set up; can coexist with ER; can coexist with HDR and RSS secondary nodes.
Similarities with an HDR secondary node: dirty reads are allowed on SDS nodes; the primary can fail over to any SDS node.

SDS servers access the same physical disk as the primary server. They provide increased availability and scalability without the need to maintain multiple copies of the database. An SDS server can be made available very quickly: once configured, it joins an existing system and is ready for immediate use. Because SDS servers also use fully duplexed communications with the primary, having multiple SDS servers has little effect on the performance of the primary server. SDS servers are completely compatible with both hardware- and software-based disk mirroring.

If the primary server becomes unavailable, failover to an SDS server is easily accomplished: the specified SDS server becomes the new primary, and all other SDS servers automatically recognize it. Multiple SDS servers also provide the opportunity to offload reporting and other work from the primary server. For example, a system with four SDS servers can have two allocated for analytics and two for read-only web site data; during the holiday season, all four could be allocated to web site data to support the extra traffic.
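A hedged sketch of bringing up an SDS node: the primary first declares itself the owner of the shared disk, and each SDS instance points its paging files and temporary dbspace at local storage through its onconfig. The server name, paths, and values shown are illustrative only.

# On the primary: mark this instance as the SDS owner of the shared disk
onmode -d set SDS primary serv_pri

# onconfig of each SDS instance (illustrative values):
#   SDS_ENABLE    1
#   SDS_PAGING    /sdstmp/page_1,/sdstmp/page_2
#   SDS_TEMPDBS   sdstmpdbs1,/sdstmp/tempdbs1,2,0,16000

# Then start the SDS instance normally with oninit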

19 SDS Usage: Capacity as Needed
Diagram: web applications and analytic applications connecting to a primary plus SDS nodes on two blade servers (Blade Server A and Blade Server B), all over a shared disk.

If your business environment experiences peak periods, you might be required to periodically increase capacity. If the amount of data is very large and making multiple copies of it is difficult, use shared-disk secondary (SDS) servers instead of remote standalone secondary servers. You can use SDS servers if you want to increase capacity for read/write workloads.

Advantages: very high availability; the secondary server shares disks with the primary server.
Potential disadvantages: if there is no additional failover server, the secondary server might be configured to run on the same computer hardware as the primary server; there is no data redundancy, because the secondary does not maintain its own copy of the data (use SAN devices for disk storage); and the primary and secondary servers require the same hardware, operating system, and version of the database server product.

20 Updates on Secondary
Allows updating activity to be performed from the secondary node, letting customers take better advantage of their investment.

Informix allows client applications to update data on secondary servers by using distributed writes. This enables the creation of highly effective database server architectures that provide disaster recovery and high availability using standard hardware, and lets customers make more effective use of, and gain better value from, their hardware and software investments. Distributed writes give the appearance that updates are occurring directly on the secondary server; however, the write elements of the transactions are transferred to the primary server and the changes are then propagated back to the secondary server.

21 Updates on Secondary
Supports DML operations (insert, update, and delete) on the secondary node.
Uses optimistic concurrency to avoid updating a stale copy of the row.
Works on the HDR secondary, RSS nodes, and SDS nodes.
Works with the basic data types, UDTs (those that store data in the server), logged smart BLOBs, and partition BLOBs.
Supports temp tables, both explicit and implicit.
Works with ER.

Distributed writes are not enabled by default. To enable them, set the REDIRECTED_WRITES configuration parameter to the number of SMX pipes between the primary and secondary servers; the recommended value is twice the number of CPU virtual processors. Set REDIRECTED_WRITES to zero to disable distributed writes.

Note: when a client connects to an updatable secondary server, the sqlca.sqlwarn.sqlwarn6 flag is set to 'W' to indicate that the instance is not a primary server. Applications that used this indicator to detect whether an instance can accept writes need to be updated to take this into consideration.
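For example, on a secondary with two CPU virtual processors, the onconfig entry might look like the line below, following the "twice the number of CPU VPs" guideline from the notes above; the value is illustrative.

# onconfig on the secondary: enable distributed writes over 4 SMX pipes (0 disables)
REDIRECTED_WRITES  4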

22 Optimistic Concurrency and Writes on the Secondary
We did not implement a distributed lock manager; instead, support for row versioning was added:
CREATE TABLE ... WITH VERCOLS
ALTER TABLE ... ADD/DROP VERCOLS
This creates shadow columns consisting of an insert checksum value (ifx_insert_checksum) and an update version column (ifx_row_version). If it is determined that the before-image on the secondary is different from the current image on the primary, the write operation is not allowed and an EVERCONFLICT (-7350) error is returned. If a table does not use VERCOLS, a before-image of the row is used to verify that the update is not being attempted on a stale row.

To avoid data processing bottlenecks caused by distributed lock managers, secondary servers process insert/update/delete queries just like any other query. However, when the secondary server reaches a point where a physical page must be written, it instead sends the write request to the primary node. By processing as much of the query as possible on the secondary server, the impact on the primary is greatly reduced. This makes administering a cluster of SD secondary servers easy, because the primary need not be treated differently from the other secondary servers. Additionally, the optional row versioning feature allows the server to quickly determine differences in data rows between primary and secondary sources without having to compare whole physical pages.
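A small SQL sketch (run here through dbaccess) of adding the version columns described above to an existing table; the database and table names are placeholders.

dbaccess stores - <<'EOF'
-- Add the ifx_insert_checksum / ifx_row_version shadow columns to an existing table
ALTER TABLE customer ADD VERCOLS;
-- The equivalent option at creation time would be CREATE TABLE ... WITH VERCOLS
EOF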

23 Committed Reads on the Secondary
The secondary node supports the 'Committed Read' and 'Last Committed Read' isolation levels. This is implemented as a locally committed read, not a globally committed read: a read on the secondary node will not return uncommitted data, but the row could be in the process of being updated.

The problem with plain Committed Read on a secondary is locked rows: updated rows cannot be read until the change is committed, unless dirty reads are used. Applications may perform poorly if they wait for updated rows to commit; they can use dirty reads but may get unexpected results; and deadlocks may occur, which waste a significant amount of time. Last Committed Read (optimistic locking) is a solution. In the Committed Read isolation level, exclusive row-level locks held by other sessions can cause SQL operations to fail when attempting to read data in the locked rows. The LAST COMMITTED keyword option to the SET ISOLATION COMMITTED READ statement reduces the risk of locking conflicts when attempting to read a table: it returns the most recently committed version of the rows, even if another concurrent session holds an exclusive lock.
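A short example of requesting Last Committed Read behavior from a session; the database, table, and column names are invented for illustration.

dbaccess stores - <<'EOF'
SET ISOLATION TO COMMITTED READ LAST COMMITTED;
SELECT order_num, amount FROM orders WHERE customer_num = 101;
EOF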

24 Availability – The Complete Picture
Diagram: the primary with SDS nodes on Blade Server A in New Orleans Building-A; a shared-disk mirror and additional SDS nodes on Blade Server D in New Orleans Building-B; an HDR secondary on Blade Server B in Memphis; an RSS node on Blade Server C in Denver; client applications connect at each site, with HDR and RSS traffic flowing from the primary.

Suppose that a local outage occurs in Building-A on the New Orleans campus; perhaps a pipe bursts in the machine room, causing water damage to the blade server and the primary copy of the shared disk subsystem. The role of the primary can easily be switched to one of the SDS servers running on the blade server in Building-B, which causes all other secondary servers to automatically connect to the new primary. Should there be a regional outage in New Orleans such that both Building-A and Building-B were lost, then Memphis becomes the primary server; in addition, you may also want to make Denver an HDR secondary and possibly add additional SDS servers to the machine in Memphis. Finally, if the Memphis server is lost in addition to New Orleans, Denver can be converted to the primary server.

25 Replication – The Complete Picture
Any node within Enterprise Replication can also be a cluster. Not only do the nodes within the cluster automatically realign, but so do the ER connections. This provides not only multiple levels of availability, but also the integration of multiple systems (for example, web sales, inventory, and accounting systems joined by Enterprise Replication).

ER is an asynchronous, log-based tool for replicating data between IBM Informix database servers. ER on the source server captures transactions to be replicated by reading the logical log, storing the transactions, and reliably transmitting each transaction as replication data to the target servers. At each target server, ER receives and applies each transaction contained in the replication data to the appropriate databases and tables as a normal, logged transaction.

Asynchronous replication allows the following replication models:
Primary-target: all database changes originate at the primary database and are replicated to the target databases. Changes at the target databases are not replicated to the primary.
Update-anywhere: all databases have read and write capabilities, and updates are applied at all databases. The update-anywhere model provides the greater challenge in asynchronous replication. For example, if a replication system contains three replication sites that all have read and write capabilities, conflicts occur when the sites try to update the same data at the same time. Conflicts must be detected and resolved so that the data elements eventually have the same value at every site.

ER reads the logical log to obtain the row images for tables that participate in replication and then evaluates the row images. Log-based data capture takes changes from the logical log and does not compete with transactions for access to production tables. Log-based data-capture systems operate as part of the normal database-logging process and thus add minimal overhead to the system. Two other methods of data capture, which ER does not support, are:
Trigger-based data capture: a trigger is code in the database that is associated with a piece of data; when the data changes, the trigger activates the replication process.
Trigger-based transaction capture: a trigger is associated with a table. Data changes are grouped into transactions, and a single transaction might trigger several replications if it modifies several tables. The trigger receives the whole transaction, but the procedure that captures the data runs as part of the original transaction, thus slowing down the original transaction.

ER provides high performance by not overly burdening the data source and by using networks and all other resources efficiently. Because ER captures changes from the logical log instead of competing with transactions that access production tables, ER minimizes the effect on transaction performance. Because the capture mechanism is internal to the database, the database server implements it efficiently. All ER operations are performed in parallel, which further extends ER's performance. Because ER implements asynchronous data replication, network and target database server outages are tolerated: in the event of a database server or network failure, the local database server continues to service local users and stores replicated transactions in persistent storage until the remote server becomes available.

26 Connection Manager
Maintains knowledge of all nodes within the cluster: records addition/removal of nodes, monitors the type of node, monitors the workload of nodes, and routes the client application to the target node.

The Connection Manager (part of ClientSDK) dynamically routes client application connection requests to the most appropriate server in a high-availability cluster. It connects to each of the servers in the cluster and gathers statistics about the type of server, unused workload capacity, and the current state of the server. From this information, the Connection Manager redirects the connection to the appropriate server. In addition, the Connection Manager Arbitrator provides automatic failover logic for high-availability clusters.

The Connection Manager provides a way of logically grouping IDS instances. Each logical group is referred to as a named SLA (Service Level Agreement) and contains the details of the server instances, either by their INFORMIXSERVER name or by server type (SDS/RSS/HDR/etc.). Each client could connect directly to the server that gives it the type of service it requires; however, this "hard codes" the named instance of the server. Should that server become unavailable, or should another server come online in the cluster that could provide the same service, the clients will not know unless manual intervention alters the client's configuration. It would also be advantageous to perform load balancing and connect to the server that can provide the best level of service based on the load across all servers. The Connection Manager removes this problem by providing defined services with the above characteristics and mapping the services to groups of server instances.

For the three service types above:
The simple SLA specification "name=server_a+server_b+server_c" indicates an ordering of the servers to try: first server_a, then server_b, and finally server_c.
The alternative, using parentheses as in "name=(server_a+server_b+server_c)", indicates that all of the servers have equal precedence, with priority given to the server with the least load.
It is also possible to mix these specification types, as in "name=(server_a+server_b)+server_c": here the first two servers are considered equally based on load, and only when they are both unavailable is server_c considered.

The load is calculated by monitoring the servers and collecting information on the degree of latency in the connection between servers, the number of ready threads, and the rate of session creation on the server. In addition to specifying the actual server name, it is possible to use only server types when setting up an SLA; the Connection Manager then maps the server type to the actual dbserver without the administrator having to name each one. The server types are:
PRI or primary: whichever server is the designated primary for the cluster.
SDS: any shared disk secondary.
HDR: a traditional High-Availability Data Replication secondary.
RSS: any remote standalone secondary server.

27 Connection Manager
Works on the class-of-service concept by resolving requirements such as: connect to the best possible secondary; connect to the current primary; connect to the SDS node or primary with the most free CPU cycles; connect to either the HDR primary or the HDR secondary; connect to an SDS node or HDR secondary if any are currently active, otherwise connect to the primary.
Multiple connection managers can exist, so failover of the Connection Manager itself is possible.
Supports Informix 11.7 replicate-set-level Grid/ER operations as well.

The Connection Manager acts as a connection point for regular SQL clients, and so must itself have entries in the sqlhosts configuration file:

# Primary server
db_server   onsoctcp  koetsu.lan  9088
db_drda     drsoctcp  koetsu.lan  9089
# SDS servers
db_sds_ro   onsoctcp  koetsu.lan  9098
db_sds_rw   onsoctcp  koetsu.lan  9108
# Connection Manager services
oltp        onsoctcp  koetsu.lan  19188
reports     onsoctcp  koetsu.lan  19189
other       onsoctcp  koetsu.lan  19190

Clients set the INFORMIXSERVER part of their connection specification to one of reports, oltp, or other. The Connection Manager is only consulted at the time the initial connection request is made: it hands the connection over to the chosen server and does not participate further in the data flow between client and server. The actual server that the client is connected to can be found in the traditional way, by querying the value of DBSERVERNAME.
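Pulling the pieces above together, a Connection Manager configuration file might look roughly like the sketch below, using the older keyword syntax implied by the SLA and FOC strings on these slides; the name, log path, and SLA definitions are placeholders, and the file format differs in later Informix releases.

# cm_cluster1.cfg (illustrative)
NAME     cm_cluster1
LOGFILE  /tmp/cm_cluster1.log
SLA      oltp=primary
SLA      reports=(SDS+HDR+RSS)
SLA      other=SDS+HDR
FOC      SDS+HDR+RSS,0

# Start it with:  oncmsm -c cm_cluster1.cfg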

28 Automatic Failover of Primary Node
Quickly detects the failure of the current primary.
Part of the Connection Manager binary.
Gets confirmation through alternate paths before performing a failover.
Performs the failover using the admin API interface.
There is failover for the arbitrator just as there is failover for the Connection Manager.
Supports proxy connections through a firewall.

The other role of the Connection Manager is to handle failover of servers in a cluster environment. This is accomplished with the -f flag on the command line or with the Fail Over Configuration (FOC) parameter in the configuration file. The specification is an ordered list of servers and/or server types (similar to an SLA, but omitting PRI/primary as a server type) and a timeout value. For items enclosed in parentheses, specific server alias names have higher precedence, followed by generic server type names such as SDS, HDR, and finally RSS.

If the Connection Manager detects that the primary has failed, and the primary takes no action to re-connect during the ensuing timeout, the next server (or server type) in the FOC list is chosen as the candidate to become the primary. However, to prevent a network communication failure from causing the Connection Manager to misdiagnose a primary failure, it first confirms with the chosen secondary that it, too, has seen the primary fail. The default FOC is "SDS+HDR+RSS,0", which means that if the primary fails, an SD secondary server is the first to be considered as the replacement primary. The Connection Manager Arbitrator waits for the timeout period (zero seconds in this case) and then issues the commands necessary to turn the SD secondary server into the primary. If no SD secondary server is available, an available HDR secondary is the next choice, followed by an RS secondary server.

29 Continuous Log Restore
Continuous Log Restore is useful when the backup database server must be fairly current but the two systems need to be completely independent of each other, for reasons such as security or network availability. With Continuous Log Restore, log files are manually transferred to a backup database server where they are restored.

A normal log restore restores all of the available log file backups and applies the log records. After the last available log is restored and applied, the log restore finishes: transactions that are still open are rolled back in the transaction cleanup phase, and the server is brought into quiescent mode. After the server is quiesced, no more logical logs can be restored. With continuous log restore, instead of transaction cleanup the server is put into a log-restore-suspended state after the last available log is restored; the restore client exits and returns control to you. With the server in this state, you can start another logical restore after additional logical logs become available. As long as you start each log restore as a continuous log restore, you can continue this cycle indefinitely.

One use of continuous log restore is to keep a second system available in case the primary system fails. You can restore logical logs backed up on the primary system on the secondary system as they become available. If the primary system fails, you can restore the remaining available logical logs on the secondary system and bring that secondary system online as the new primary. Continuous log restore requires much less network bandwidth than High-Availability Data Replication (HDR) and Enterprise Replication (ER). It is also more flexible than HDR and ER because you can start it at any time, and as a result it is more robust in unpredictable circumstances, such as intermittent network availability.

30 Preparing Remote Standby System
Diagram: the primary takes a physical backup and logical log backups (onbar -b -l, ontape -a) to a log backup device; the remote standby performs a physical restore and then restores the logs.

Continuous log restore is a robust way to set up a hot backup of a database server. The hot backup of the primary IDS server is maintained on the backup server, which contains similar hardware and an identical version of IDS. To configure a backup server using Continuous Log Restore, a physical backup of the primary server is created and the backup copy is transported to the backup server, where it is restored. After the restore is complete, the backup server is ready for logical recovery. Whenever a logical log on the primary server fills, it is backed up and then transported to the backup server, where logical recovery (log roll-forward) is performed. The secondary server remains in the log-restore-suspended state after the last available log is restored; with the server in this state, another logical restore can be started as soon as additional logical logs become available. Should the primary server become unavailable, a final log recovery is performed on the backup server, which is then brought up in online mode as the new primary server. Continuous Log Restore can be combined easily with the other high-availability solutions, such as shared disk and remote secondary servers, or with hardware solutions such as cluster failover.
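As a condensed sketch of the ontape flow on each system (the full prompts and the ON-Bar equivalent appear on the next slide), with log shipping between the machines left to scp, tape, or similar:

# On the primary: level-0 archive once, then back up logical logs as they fill
ontape -s -L 0
ontape -a

# On the standby: physical restore once, then apply each shipped batch of logs
ontape -p          # answer N to backing up logs and to restoring a level-1 archive
ontape -l -C       # apply the logs and remain suspended, waiting for the next batch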

31 Continuous log restore option with ontape/onbar
With the continuous log restore option, the server suspends the log restore after the last available log:

ontape -l -C
Roll forward should start with log number 7
Log restore suspended at log number 7

The log restore can be restarted again with another ontape command:

ontape -l -C
Roll forward should start with log number 8
Log restore suspended at log number 10

Recovery mode is terminated by the ontape -l command (without -C).

To configure continuous log restore by using ON-Bar:
1. On the primary system, perform a level-0 backup with the onbar -b -L 0 command.
2. Import the backup objects that were created to the storage manager of the secondary server.
3. On the secondary system, perform a physical restore with the onbar -r -p command. After the physical restore completes, the database server waits in fast recovery mode to restore logical logs.
4. On the primary system, back up logical logs with the onbar -b -l command.
5. Transfer the backed-up logical logs to the secondary system and restore them with the onbar -r -l -C command.
6. Repeat steps 4 and 5 for all logical logs that are available to back up and restore.

To configure continuous log restore with ontape:
1. On the primary system, perform a level-0 archive with the ontape -s -L 0 command.
2. On the secondary system, copy the files or mount the tape (as assigned by LTAPEDEV) and perform a physical restore with the ontape -p command.
3. Respond to the following prompts:
   Continue restore? Y
   Do you want to back up the logs? N
   Restore a level 1 archive? N
   After the physical restore completes, the database instance waits in fast recovery mode to restore logical logs.
4. On the primary system, back up logical logs with the ontape -a command.
5. On the secondary system, copy the files or mount the tape that contains the backed-up logical logs from the primary system, and perform a logical log restore with the ontape -l -C command.

32 Resources
The online Informix Information Center
IBM Informix developerWorks technical articles
IBM developerWorks Informix blogs (Informix Replication, Informix Application Development, Informix Experts Blog)

Product documentation is available in many formats. The online Information Center (IC) is free on the web and contains the same information as the PDF library, which is shipped on CDs and can be ordered from the IBM Publication Center. We are investigating how to provide a downloadable IC in the future. The examples exchange web site will contain examples that are provided on an "as-is" basis; use them as models for your own situations. You will be able to rate each example and to sort examples by date, topic, or rating. The migration portal can help you navigate the available information and resources related to migrating Informix database products to a new release.

