Chap 7: Consistency and Replication


1 Chap 7: Consistency and Replication
I- Introduction. Advanced Operating Systems class: CSc 8320. By Sidoine Lafleur Manono Fotso epse Kamgang, September 30, 2015. Instructor: Prof. Yanqing Zhang. This lecture presentation covers chapter 7 of the Advanced Operating Systems class, titled Consistency and Replication; I will present the introduction.

2 OUTLINE
- DEFINITIONS
- DON’T BE CONFUSED WITH THIS
- WHY DO WE REPLICATE?
- WHAT MAY BE REPLICATED?
- REPLICATION PROTOCOLS
- DRAWBACKS OF REPLICATION
- ONGOING RESEARCH
- REFERENCES
My outline is as follows: the definition of consistency, then the definition of replication; clearing up the confusion people have between replication and other technical terms; why we may need to replicate; what may be replicated; a brief talk on replication protocols; some drawbacks I found in replication; and ongoing research.

3 DEFINITION CONSISTENCY
Per the dictionary, consistency is about adhering to or agreeing with the same principles, characteristics, form, and so on. In the same vein, in advanced computing systems, consistency refers to the requirement that any change a transaction makes to the system must keep the system valid according to all of the system’s defined rules. One such rule may be cascade deletion: every delete in the system must be carried out in a cascading way, no matter what. A system that respects this is said to be consistent according to the cascade deletion rule.

4 DEFINITION CONSISTENCY
For example, suppose the consistency rule is a cascade delete and the system it applies to is a database of customers and their orders. This system is consistent according to the cascade delete rule only if, whenever a customer is deleted from the customers table, all of that customer’s orders are automatically deleted as well.
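The cascade delete rule can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table names, column names, and sample rows are hypothetical and do not come from the slide.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database of customers and their orders (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        """CREATE TABLE orders (
               id INTEGER PRIMARY KEY,
               customer_id INTEGER NOT NULL
                   REFERENCES customers(id) ON DELETE CASCADE,
               item TEXT)"""
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?)",
                     [(1, "Ada"), (2, "Bob")])
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                     [(10, 1, "book"), (11, 1, "pen"), (12, 2, "lamp")])
    conn.commit()
    return conn


def delete_customer(conn, customer_id):
    """Delete one customer; the database removes that customer's orders itself."""
    conn.execute("DELETE FROM customers WHERE id = ?", (customer_id,))
    conn.commit()


def order_count(conn, customer_id):
    """Count the orders that still reference the given customer."""
    row = conn.execute("SELECT COUNT(*) FROM orders WHERE customer_id = ?",
                       (customer_id,)).fetchone()
    return row[0]
```

After `delete_customer(conn, 1)`, no orders that reference customer 1 remain, so the database is still valid under the cascade delete rule. Without the `ON DELETE CASCADE` clause, the delete would be rejected or would leave orphaned orders, depending on whether foreign-key enforcement is on.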

5 DEFINITION REPLICATION
Broadly, in advanced computing systems, replication is about keeping resources redundant by copying them, and maintaining consistency among the copies, in order to improve reliability, fault tolerance, and accessibility. It should be transparent to an external user. Replication uses distributed technology to share data between multiple sites. Because the copies must be kept consistent, consistency is part of replication.

6 DON’T BE CONFUSED WITH THIS
Some terms in distributed and advanced computing are easily confused with replication. BACKUP differs from replication in that it saves a copy of the data that stays unchanged for a long period of time; because backups are not taken frequently, a change in the live system becomes visible in the backup only at the next backup. Replicas, on the other hand, undergo frequent updates and quickly lose any historical state: every change in the main system should be propagated to the copies immediately, and transparently to the user, to keep them consistent. LOAD BALANCING differs from task replication in that it distributes a load of different (not the same) computations across machines, while replication keeps the same information on different computers. Load balancing sometimes uses data replication internally to distribute its data among machines.

7 WHY DO WE REPLICATE? As said in the definition of replication, there are many reasons why we replicate in distributed systems. The main ones are:
- FOR MORE RELIABILITY: if one copy is unavailable or crashes, another copy is there, ready to be used at any time.
- TO INCREASE AVAILABILITY OR ACCESSIBILITY: replication provides fast, local access to redundant data, and the system can be brought back quickly because a copy is ready to be used.
- FOR PERFORMANCE ENHANCEMENT: duplicating the system reduces server and network load. Some users access one server while other users access different servers, or each user accesses the site with the lowest access cost (most of the time, the site closest to that user). This reduces the load on every server and therefore increases performance.
- MORE FAULT TOLERANCE: when some part of a replicated system fails, the system does not fail completely; it keeps operating, even if at a reduced level. This fault tolerance is a consequence of the reliability and accessibility that replication offers.
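The availability and performance points can be illustrated with a small replica-selection sketch. The replica records and their `cost` and `up` fields are hypothetical, introduced only for this example; a real system would measure access cost and liveness itself.

```python
def pick_replica(replicas):
    """Return the name of the reachable replica with the lowest access cost.

    `replicas` is a list of dicts with the hypothetical keys
    'name', 'cost' (for example, network distance) and 'up' (reachable or not).
    """
    available = [r for r in replicas if r["up"]]
    if not available:
        raise RuntimeError("no replica available")
    return min(available, key=lambda r: r["cost"])["name"]


replicas = [
    {"name": "atlanta", "cost": 5, "up": True},
    {"name": "paris", "cost": 90, "up": True},
    {"name": "tokyo", "cost": 150, "up": True},
]
```

With every site up, the user is served by the cheapest one; if that site crashes, the same call falls back to the next cheapest copy instead of failing, which is the reduced-level operation described above.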

8 WHAT MAY BE REPLICATED? COMPUTING TASKS may be replicated; this is called computation replication. The same computing task is executed either repeatedly on a single device (replication in time) or on separate devices (replication in space). Replication in space is useful in a distributed environment when we want a task to run as fast as possible, for example because its result feeds another task, but we do not know which computer in the system will run it fastest. In that situation we can replicate the task on all computers of the distributed system and run the copies in parallel; the result from the first computer to finish is used, since it gives the fastest response time for that task.
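Replication in space can be sketched on one machine by letting threads stand in for the computers of the distributed system. The function names and the simulated per-node delays are assumptions made for this illustration.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_on_node(delay, value):
    """Simulate one node: every node computes the same task at its own speed."""
    time.sleep(delay)          # hypothetical per-node slowness
    return value * value       # the replicated task itself


def fastest_result(task, node_delays, value):
    """Run the same task on every simulated node and use the first result."""
    pool = ThreadPoolExecutor(max_workers=len(node_delays))
    futures = [pool.submit(task, delay, value) for delay in node_delays]
    # Block only until the first copy of the task has finished.
    done, _pending = wait(futures, return_when=FIRST_COMPLETED)
    pool.shutdown(wait=False)  # do not wait for the slower copies
    return next(iter(done)).result()
```

Because every copy computes the same task, whichever node finishes first yields the same answer; the slower copies are simply ignored. The cost of this design is that all nodes do the full work even though only one result is used.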

9 WHAT MAY BE REPLICATED? DATA may also be replicated; this is called data replication, where the same data is stored on multiple storage devices. Whether one replicates data or tasks, the problem is how the system handles incoming requests and how it keeps the copies consistent. This is defined by the replication protocol, and there are 2 main protocols:

10 PASSIVE REPLICATION - Here only one server (the primary server) processes client requests. After processing a request, it updates the state on the other servers (the copies, or replicas) and sends the response back to the client. If the primary server fails, one of the replica servers takes its place. An example of passive replication is data replication, which operates only to maintain the stored data, reply to read requests, and apply updates.
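A minimal in-memory sketch of passive (primary-backup) replication follows. The class names and the rule "the first live replica in the list is the primary" are assumptions for illustration; a real system would need failure detection and an election step, which are not modeled here.

```python
class Replica:
    """One server holding a copy of the state."""

    def __init__(self, name):
        self.name = name
        self.state = {}
        self.alive = True


class PassiveGroup:
    """Primary-backup group: only the primary processes client requests."""

    def __init__(self, names):
        self.replicas = [Replica(n) for n in names]

    @property
    def primary(self):
        # Assumption: the first live replica acts as primary, so a backup
        # takes over automatically once the current primary is marked dead.
        for replica in self.replicas:
            if replica.alive:
                return replica
        raise RuntimeError("all replicas failed")

    def write(self, key, value):
        primary = self.primary
        primary.state[key] = value            # 1. primary processes the request
        for replica in self.replicas:         # 2. primary updates the backups
            if replica is not primary and replica.alive:
                replica.state[key] = value
        return "ok"                           # 3. response goes back to the client

    def read(self, key):
        return self.primary.state.get(key)
```

If the primary is marked as failed after a write, the next replica in the list already holds the updated state and can answer reads immediately.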

11 PASSIVE REPLICATION

12 ACTIVE REPLICATION - Here each client request is sent to and processed by all the servers. For all the servers to receive the same sequence of operations, an atomic broadcast protocol must be used; it guarantees that either all the servers receive a message or none do, and that they all receive messages in the same order.
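The effect of atomic broadcast on active replication can be sketched with a single sequencer that numbers each request and delivers it to every server in that order. The sequencer approach, the class names, and the `add`/`mul` operations are assumptions chosen for illustration; this in-memory loop does not model message loss or crashes, so the all-or-none guarantee is simply assumed rather than implemented.

```python
class Server:
    """A replica that applies every delivered operation to its own state."""

    def __init__(self):
        self.value = 0
        self.log = []                      # sequence numbers, in delivery order

    def deliver(self, seq, op):
        self.log.append(seq)
        kind, amount = op
        if kind == "add":
            self.value += amount
        elif kind == "mul":
            self.value *= amount


class Sequencer:
    """Stand-in for atomic broadcast: one total order, delivered to all servers."""

    def __init__(self, servers):
        self.servers = servers
        self.next_seq = 0

    def broadcast(self, op):
        seq = self.next_seq
        self.next_seq += 1
        for server in self.servers:        # every server processes every request
            server.deliver(seq, op)
```

Order matters here: starting from 0, `add 2` followed by `mul 5` gives 10, while the reverse order gives 2. Since all servers see the same sequence, they all end in the same state.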

13 ACTIVE REPLICATION Others describe a third replication protocol, called lazy replication.

14 DRAWBACKS OF REPLICATION
Although replication has many advantages, it also has some drawbacks:
- Increased overhead on update: when an update is required, a database system must ensure that all replicas are updated.
- Higher resource usage: storing replicas of the same data at different sites consumes more disk space.
- Expense: concurrency control and recovery techniques are more advanced and hence more expensive.

15 ONGOING RESEARCH Enhancing replication performance, mainly when update operations are required in the system, by:
- Minimizing transaction size in application design. If transactions are smaller, it is less likely that the Distribution server will have to resend a transaction due to network issues. If the agent is required to resend a transaction, the amount of data sent is smaller.
- Configuring the Distributor agent on a dedicated server. Using a remote Distributor reduces processing overhead on the Publisher.
- etc.

16 REFERENCES
- Replication, Distributed Systems: Principles and Paradigms, Second Edition, Andrew S. Tanenbaum and Maarten Van Steen.
- Consistency Algorithms for Optimistic Replication, Department of Computer Science, University of California, Los Angeles, CA.
- Consistency and Replication: Distributed OS, University of Massachusetts Amherst, Computer Science.

17 THANKS.

