Presentation on theme: "Fault tolerance and disaster recovery"— Presentation transcript:
1 Fault tolerance and disaster recovery
Unit objectives:
- Discuss disk configuration
- Discuss Windows-based replication and NDS/eDirectory partitions and replicas
- Discuss backup and UPS
2 Topic A
- Topic A: System fault tolerance
- Topic B: Replication
- Topic C: Backup and UPS
3 Disaster planning
When creating a disaster plan, some key points to consider are:
- Plan for the worst
- Implement physical data security
- Protect your critical systems
4 RAID
A set of specifications describing hard disk fault tolerance configurations.
The specification defines:
- RAID Level 0
- RAID Level 1
- RAID Level 2
- RAID Level 3
- RAID Level 4
- RAID Level 5
There is also a RAID 10 (as in RAID “one plus zero”), a RAID 01, and several other “composite” levels.
5 RAID
Both the book and the PowerPoint are weak on RAID, and the PowerPoint also presents the material out of order, so I’ve included 10 slides on RAID from other PowerPoints, with editing:
- The most popular levels are RAID 0, 1, 5 and 10.
- RAID = “Redundant Array of Inexpensive (or Independent) Disks.”
- More disks give more heads, which give faster transfer rates.
- Sometimes reads are faster, sometimes writes are faster, and sometimes both are.
6 RAID 0
Disk striping (no parity): data is written across the disks in stripes.
- Stripe size is a power of 2 and depends on the RAID level in use: RAID 0 and RAID 1+0 usually have a large stripe size (often 128K), whereas RAID 5 usually has a small one (often 16K).
- The two 128K stripes of data are written in parallel.
- Because the same amount of data is spread across 2 or more disks, there are more locations from which to read it. This results in faster disk reads: 2 heads are reading at once.
- Disk writes are also faster: there are 2 places to which to write.
- Disadvantage: RAID 0 offers no redundancy (no fault tolerance), because there is no parity or other redundant copy of the data.
[Diagram: the controller splits 256K of data into two 128K stripes, one written to Disk 0 and one to Disk 1.]
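The striping described on this slide can be sketched in a few lines of Python. This is only an illustration, not how a real controller works: the function names `stripe` and `unstripe` are hypothetical, the "disks" are plain byte strings, and the 256K payload and 128K stripe size are taken from the slide's diagram.

```python
def stripe(data, num_disks, stripe_size):
    """Split data into stripe_size chunks and deal them round-robin to the disks."""
    disks = [bytearray() for _ in range(num_disks)]
    for offset in range(0, len(data), stripe_size):
        disk_index = (offset // stripe_size) % num_disks
        disks[disk_index] += data[offset:offset + stripe_size]
    return [bytes(d) for d in disks]


def unstripe(disks, stripe_size):
    """Reassemble the data by reading one stripe from each disk in turn."""
    out = bytearray()
    longest = max(len(d) for d in disks)
    for offset in range(0, longest, stripe_size):
        for d in disks:
            out += d[offset:offset + stripe_size]
    return bytes(out)


KB = 1024
data = bytes(i % 256 for i in range(256 * KB))  # 256K of sample data
disks = stripe(data, num_disks=2, stripe_size=128 * KB)

# Each disk holds one 128K stripe; both are needed to rebuild the data.
restored = unstripe(disks, stripe_size=128 * KB)
```

Because each disk holds only its own half of the data and nothing else, there is nothing to rebuild from if one of them is lost, which is the "no fault tolerance" point made above.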
7 RAID 1 (Mirroring)
- Minimum of 2 disks; one disk is literally a complete mirror of the other.
- If one disk fails, the other takes over.
- Original slide: “When reading, can read both disks (two copies of the data): very fast read and write access.”
- Correction: only reads are faster. You have 2 identical disks from which to read, but you have to write everything twice. This isn’t as slow as it seems, especially when using 2 controllers (duplexing), but it isn’t fast either.
[Diagram: the controller writes the same 256K of data to both Disk 0 and Disk 1.]
8 RAID 1 (Duplexing)
- Controller mirroring: 2 controllers, each with a disk.
- Just like mirroring but with two controllers instead of one, so if one controller goes down, you still have one disk that is fine.
[Diagram: controllers with Disk 0 and Disk 1, labeled “Software mirroring (RAID1)”.]
9 RAID 10 or RAID 01
- RAID 10 (RAID 1+0) is mirroring (1) then striping (0).
- RAID 01 (RAID 0+1) is striping (0) then mirroring (1).
[Diagram: RAID 0+1 and RAID 1+0 side by side on Disks 0 to 3; in each, 256K of data is written as 128K stripes and mirrored.]
10 RAID 10 or RAID 01
- Minimum of 4 disks, because the data needs both mirroring and striping.
- There is a massive difference when it comes to fault tolerance, so be careful!
- RAID 10 allows for more fault tolerance: any disk can fail so long as its mirror survives.
- RAID 01 has poor fault tolerance: lose 1 disk on each side of the mirror and the array fails.
- Make absolutely sure you are getting what you think you are getting. There is a difference!
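To make the fault-tolerance difference concrete, the sketch below counts which two-disk failures a 4-disk array survives under each layout. It rests on stated assumptions: disks are numbered 0 to 3 and grouped as (0, 1) and (2, 3), in RAID 10 each group is a mirrored pair, and in RAID 01 each group is a stripe set that is treated as lost as soon as one of its disks fails. The function names are hypothetical.

```python
from itertools import combinations

# Assumed grouping of the 4 disks (illustrative, not taken from the slides).
GROUPS = [(0, 1), (2, 3)]


def raid10_survives(failed):
    """RAID 10: groups are mirrored pairs; each pair must keep one working disk."""
    return all(any(d not in failed for d in pair) for pair in GROUPS)


def raid01_survives(failed):
    """RAID 01: groups are stripe sets; at least one set must be fully intact."""
    return any(all(d not in failed for d in stripe_set) for stripe_set in GROUPS)


two_disk_failures = [set(c) for c in combinations(range(4), 2)]
raid10_ok = sum(raid10_survives(f) for f in two_disk_failures)
raid01_ok = sum(raid01_survives(f) for f in two_disk_failures)

print(f"RAID 10 survives {raid10_ok} of {len(two_disk_failures)} two-disk failures")
print(f"RAID 01 survives {raid01_ok} of {len(two_disk_failures)} two-disk failures")
```

Under these assumptions RAID 10 survives 4 of the 6 possible two-disk failures and RAID 01 survives only 2; both survive any single-disk failure.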
11 RAID 5
- Minimum of 3 disks required.
- Uses parity to recalculate data in case of disk failure.
- Parity is an EOR (“exclusive or”, also written XOR) formula: 1 EOR 0 = 1 and 0 EOR 1 = 1; 0 EOR 0 = 0 and 1 EOR 1 = 0.
- Critical failure occurs on failure of 2 disks.
- Performance degrades after a single disk failure.
- Uses a smaller stripe size to aid parity calculation.
[Diagram: the controller splits 256K of data into 16K data stripes on two disks and a 16K parity stripe on a third disk.]
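12 RAID 5 Data Calculation
EOR is used both to calculate parity and to recalculate lost data.
- Stripe 1 goes to Disk 1 -> 10101010 (170)
- Stripe 2 goes to Disk 2 -> 10111101 (189)
- The parity stripe is the EOR of the two -> 00010111 (23), which is written to Disk 3.
Recovery (Disk 2 has failed):
- Take the data from Disk 1: 10101010 (170)
- EOR it with the parity from Disk 3: 00010111 (23)
- Data on Disk 2 is: 10111101 (189)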
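The parity arithmetic on this slide can be checked with Python's `^` (exclusive OR) operator. This is a minimal sketch using the slide's values 170 and 189 as single-byte stripes; the helper name `xor_all` is hypothetical, and a real array applies the same operation to every byte of much larger stripes.

```python
def xor_all(*stripes):
    """Exclusive-OR any number of integer stripes together."""
    result = 0
    for s in stripes:
        result ^= s
    return result


disk1 = 170                     # 0b10101010
disk2 = 189                     # 0b10111101
disk3 = xor_all(disk1, disk2)   # parity stripe written to Disk 3

# Disk 2 fails: rebuild its stripe from the surviving data and the parity.
rebuilt_disk2 = xor_all(disk1, disk3)

print(f"parity  = {disk3:08b} ({disk3})")
print(f"rebuilt = {rebuilt_disk2:08b} ({rebuilt_disk2})")
```

The same function rebuilds any single missing stripe, because XOR-ing all the surviving stripes (data and parity) cancels everything except the lost one. With two stripes missing there is not enough information left, which is why a second disk failure is critical.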
16 Discussing disk duplexing: Activity A, page 20-7
17 Disk striping with parity
- An implementation of RAID Level 5
- Normally used on larger networks where data integrity is a critical concern
18 Discussing disk striping with parity: Activity A, page 20-9
19 Volume sets
- Combine space from up to 32 drives
- Cannot contain the system or the boot partition
- If one disk area is destroyed, the entire set fails
- This is the simpler version of a “striped set”
20 Disk striping
- Also combines space from up to 32 drives
- Each segment must be the same size
21 Managing disk configuration: Activity A, page 20-11
22 Topic B
- Topic A: System fault tolerance
- Topic B: Replication
- Topic C: Backup and UPS
23 Replication
- Offers additional data redundancy on Windows-based networks
- Can specify certain data to be copied from one system to another
- Common uses include:
  - replication of login scripts to all domain servers
  - replication of mandatory user profiles
  - replication of frequently used files across multiple servers to balance the server load
24 Replication
- Available in Windows NT networks
- Copies data automatically from a source system (exporter) to a destination system (importer)
25 Key points about replication
- Runs as a background service
- After any changes, files must be closed before they can be replicated
- Can specify to replicate files immediately after a change in the subdirectory tree
- Individual subdirectories might be locked
- An exporter can send files to one or more importers
- An importer can receive files from one or more exporters
- The import directory might be locked
- A Windows NT Server might act as both an exporter and an importer
26 Active Directory
- Fault tolerance of directory services information is built into the directory model
- Every domain controller holds a copy of Active Directory
- So, by this syllogism, fault tolerance is “assured”:
  - All domain controllers contain Active Directory
  - All Active Directory provides fault tolerance
  - Therefore, all domain controllers provide fault tolerance
27 File Replication Service
- In Windows 2000/Server 2003, the File Replication Service (FRS) replaces the LAN Manager Replication system used in Windows NT
- Used to replicate system policies as well as login scripts
- Allows file replication for domain-based Distributed File System (DFS)
28 Discussing replication: Activity B, page 20-13
29 NDS/eDirectory partitions
- Involves division of the NDS/eDirectory database
- Provides two primary benefits:
  - Fault tolerance
  - Performance increase
30 NDS/eDirectory Partitions & Replicas
- NDS/eDirectory is used to store information about all of the objects known to the network.
- A partition is a logical division of the eDirectory database. A directory partition forms a distinct unit of data in the tree that stores directory information.
- Partitions can be created at container-level objects, like Organization, Organizational Unit, or any object marked as a container.
- An eDirectory tree has one [ROOT] partition, which contains all the objects by default.
- Partitions are set up in parent-child relationships.
33 NDS/eDirectory replicas
- A replica is a copy or an instance of a user-defined partition that is distributed to a server
- Each partition has at least one replica
- Examples of types:
  - Master replica
  - Read/write replica
  - Read-only replica
  - Subordinate reference
34 NDS/eDirectory Replica Types
There are six types of replicas:
1. Master replica: There can be only one Master replica for a partition. The Master is a writable replica that, most importantly, controls the partition operations and the obituary process.
This type of replica also performs the following operations:
- Managing objects (add, remove, move)
- Authenticating objects
- Managing attributes (add, remove)
By default, the first server in the tree holds the Master replica of the [ROOT] partition.
35 NDS/eDirectory Replica Types
2. Read-Write replica: This replica type allows modifications to objects and automatically propagates them to the other replicas based on the timestamps.
- You can designate a Read-Write replica as a Master replica.
3. Read-Only replica: This replica type is only readable.
- It does not perform any write operations.
- It forwards all write requests to a Read-Write replica.
- The replica can be designated as a Master replica.
36 NDS/eDirectory Replica Types
4. Filtered Read-Write replica: This replica contains only a special set of classes and attributes specified by the filter.
- The replica can be written to, and the changes are synchronized to the other replicas.
5. Filtered Read-Only replica: The same rules apply to this replica type as to the Filtered Read-Write replica, but the replica is only readable.
- Therefore, all write requests are forwarded to a writable replica.
37 NDS/eDirectory Replica Types
6. Subordinate reference replica: A system-generated replica that doesn't contain all the objects, attributes and values that a Master or a Read-Write replica does.
- Therefore, subordinate references don't provide fault tolerance.
- They are internal pointers generated to contain enough information for eDirectory to resolve object names across partition boundaries.
- You cannot create a Subordinate reference replica; eDirectory creates it when the server holds a replica of the parent partition, but not one of the child partition.
- It holds no partition data, only information about the "real" replica-holder servers.
- So it cannot be designated as a Master without adding a Read-Write or Read-Only replica.
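The rule for when a subordinate reference appears (the server holds a replica of the parent partition but not of the child) can be written as a small check. This is only a sketch of the rule as stated on the slide, not eDirectory code: the function name, the partition names, and the `children` mapping are all hypothetical.

```python
def subordinate_refs_needed(held, children):
    """Return the child partitions for which a server would get a subordinate reference.

    held:     set of partitions the server holds a real replica of
    children: dict mapping each parent partition to its child partitions
    """
    return sorted(
        child
        for parent in held
        for child in children.get(parent, [])
        if child not in held
    )


# Hypothetical tree: [ROOT] has two child partitions.
children = {"[ROOT]": ["OU=Sales", "OU=Engineering"]}

# This server holds [ROOT] and OU=Sales, but not OU=Engineering.
refs = subordinate_refs_needed({"[ROOT]", "OU=Sales"}, children)
print(refs)
```

In this example the server would receive a subordinate reference only for OU=Engineering, since it already holds a real replica of OU=Sales.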
38 Discussing NDS/eDirectory replicas: Activity B, page 20-17
39 Replica ring
- Made up of the servers that hold replicas of a given partition
- Documentation of the replica ring might consist of a replica table containing:
  - A list of servers
  - A list of partitions
  - The type of replica stored on each server
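A replica table of the kind described here could be kept as simple rows of (server, partition, replica type). The sketch below is illustrative only: the server and partition names are invented, and the helpers `replica_ring` and `partitions_without_single_master` are hypothetical. It lists every server holding any replica of a partition, including subordinate references, and checks the earlier rule that a partition has exactly one Master replica.

```python
from collections import defaultdict

# Hypothetical replica table: (server, partition, replica type).
REPLICA_TABLE = [
    ("SERVER1", "[ROOT]", "Master"),
    ("SERVER2", "[ROOT]", "Read/Write"),
    ("SERVER2", "OU=Sales", "Master"),
    ("SERVER3", "OU=Sales", "Read/Write"),
    ("SERVER1", "OU=Sales", "Subordinate reference"),
]


def replica_ring(table, partition):
    """Servers that hold a replica of the given partition."""
    return sorted({server for server, part, _ in table if part == partition})


def partitions_without_single_master(table):
    """Map each partition that does not have exactly one Master to its Master holders."""
    masters = defaultdict(list)
    for server, part, replica_type in table:
        if replica_type == "Master":
            masters[part].append(server)
    partitions = {part for _, part, _ in table}
    return {p: masters[p] for p in partitions if len(masters[p]) != 1}


print(replica_ring(REPLICA_TABLE, "OU=Sales"))
print(partitions_without_single_master(REPLICA_TABLE))
```

An empty result from the second helper means every partition in the table has exactly one Master replica.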
44 Removable media
- An alternative to tapes
- Includes:
  - Removable hard drives
  - Floptical media
  - Rewriteable CD-ROMs and DVDs
- Provides a convenient way to archive data
45 Discussing removable media: Activity C-2
46 Backup storage
- Storing backups in your office isn’t a good idea
- Always keep backups in a secure, access-controlled location
- Also have backups stored at offsite locations
47 Disaster recovery site options
- Cold sites
- Warm sites
- Hot sites
48 Cold sites
A cold site:
- Is usually a single room in which your data center can be recreated in case of a disaster
- Can be on site or off site
- Doesn’t actually hold any equipment
- Can take quite a bit of time to bring back on line after a disaster
- Is the least expensive backup site solution
49 Warm sites
A warm site:
- Can be either on site or off site
- Contains a fair amount of equipment to create a semi-duplicate of your current data center
- Can be live in much less time than a cold site
- Is more expensive to create and maintain than a cold site
50 Hot sites
A hot site:
- Is a complete duplication of your current data center
- Is typically off site
- Can be up and running in a matter of hours
- Is very expensive to create and maintain
52 Uninterruptible power supply
- Makes sure that the server is powered down in an orderly way, thereby protecting network data
- Keeps your system from going down unexpectedly due to line power loss
- Uses a battery to supply power to the system