RAID-x: A New Distributed Disk Array for I/O-Centric Cluster Computing Kai Hwang, Hai Jin, and Roy Ho.

1 RAID-x: A New Distributed Disk Array for I/O-Centric Cluster Computing Kai Hwang, Hai Jin, and Roy Ho

2 2 Outline Introduction RAID Orthogonal Striping and Mirroring Trojans Cluster Experiments Cooperative disk drivers Benchmark Experiments Striped Checkpointing on RAID-x

3 3 Introduction RAID-x: redundant array of inexpensive disks at level x. Provides high-bandwidth distributed I/O processing on a serverless cluster, where server functions are distributed among the client hosts. Based on orthogonal striping and mirroring (OSM). Cooperative disk drivers (CDDs) implement the OSM at the kernel level. Maintains data consistency without using NFS or Unix system calls.

4 4 Distributed RAID-x must have these capabilities: 1. A single I/O space (SIOS) for all disks in the cluster. 2. High scalability, availability, and compatibility with current cluster architectures and applications. 3. Local and remote disk I/O operations performed with comparable latency. Together these imply total transparency to users, who can utilize all disks without knowing the physical locations of the data blocks.

5 5 Orthogonal Striping and Mirroring OSM provides: improved parallel I/O bandwidth. Hides disk mirroring overhead. Enhances the scalability and reliability of cluster computing applications. Eliminates the small-write problem that affects RAID-5. Combines the advantages of RAID-1 and chained declustering.
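
The small-write problem mentioned on this slide can be illustrated with a minimal Python sketch. It is not taken from the presentation: the function names and the I/O counts are illustrative, and it assumes the standard read-modify-write parity update of RAID-5 compared against a plain mirrored write.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def raid5_small_write(old_data: bytes, old_parity: bytes, new_data: bytes):
    """Read-modify-write parity update for a single-block write.

    Costs 4 disk I/Os: read old data, read old parity,
    write new data, write new parity.
    """
    new_parity = xor_blocks(xor_blocks(old_parity, old_data), new_data)
    return new_data, new_parity, 4


def mirrored_small_write(new_data: bytes):
    """Mirrored write: 2 disk writes (data and image), no reads, no parity."""
    return new_data, new_data, 2


# Hypothetical stripe: three data blocks plus one parity block.
stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = xor_blocks(xor_blocks(stripe[0], stripe[1]), stripe[2])

# Overwrite block 1 only; the parity must still cover the whole stripe.
stripe[1], parity, raid5_ios = raid5_small_write(stripe[1], parity, b"\xaa\xbb")
_, _, mirror_ios = mirrored_small_write(b"\xaa\xbb")
```

A mirrored scheme needs no reads and no parity computation for a one-block update, which is the cost OSM avoids.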

6 6 Advantages of RAID-1 & Chained Declustering RAID-1 (mirroring and duplexing): 100% redundancy of data, so recovery needs no rebuild, just a copy. Twice the read transaction rate of a single disk. Chained declustering: load balancing.

7 7 Architecture of RAID-x vs. chained declustering RAID Bj: original data blocks. Mj: mirrored blocks. 4 disks, with mirroring groups involving 3 consecutive disk blocks; e.g., M0, M1, M2 are the mirrored blocks for data blocks B0, B1, B2. Different mirroring groups are shown in different shadings.

8 8 Architecture of RAID-x Orthogonal mapping: no data block and its image are mapped to the same disk. Data blocks are striped across all disks in the top half of the disk array, as in RAID-0, so for large writes the blocks of a stripe can be written to all disks in parallel. Image blocks are “clustered” vertically on the same disk. The clustered images in a mirroring group are updated together in the background, resulting in lower latency and higher bandwidth in RAID-x.
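
The orthogonal mapping can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: data blocks are striped round-robin over n disks, every n-1 consecutive blocks form a mirroring group, and the disk chosen for each group's images (`image_disk`) is an assumed rule picked so that it is always the one disk holding none of the group's data blocks. All function names are hypothetical.

```python
def data_disk(i: int, n: int) -> int:
    """Disk holding data block B_i (RAID-0 style striping)."""
    return i % n


def mirror_group(i: int, n: int) -> int:
    """Index of the mirroring group (n-1 consecutive blocks) containing B_i."""
    return i // (n - 1)


def image_disk(i: int, n: int) -> int:
    """Disk holding image M_i. Assumed rule: all images of one mirroring
    group are clustered on the single disk that stores none of the
    group's data blocks."""
    return (n - 1 - mirror_group(i, n)) % n


def layout(n: int, num_blocks: int):
    """Return per-disk lists of data blocks and image blocks."""
    data = {d: [] for d in range(n)}
    images = {d: [] for d in range(n)}
    for i in range(num_blocks):
        data[data_disk(i, n)].append(f"B{i}")
        images[image_disk(i, n)].append(f"M{i}")
    return data, images


data, images = layout(4, 12)
for d in range(4):
    print(f"disk {d}: data={data[d]} images={images[d]}")
```

With 4 disks, the images M0, M1, M2 of B0, B1, B2 land together on the one disk that holds none of those three data blocks, which matches the example given on slide 7.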

9 9 Abbreviations used

10 10 Performance of 4 RAID Architectures Write operations are improved. The same bandwidth potential as RAID-0 and chained declustering. Improvements from declustering are mainly in parallel writes; for large array sizes the improvement approaches a factor of 2. Tolerates single disk failures, as RAID-5 does.

11 11 Trojans Cluster 16 Pentium II/400 MHz processors running Red Hat Linux v. 6.0. The PC engines (nodes) are connected by 100 Mbps Fast Ethernet. Each node has a 10 GB disk attached: 16 disks = 160 GB single I/O space.

12 12 Distributed RAID-x Architecture 4 nodes with 3 disks each. The blocks of a stripe group (B0, B1, B2, B3) are accessed in parallel. Consecutive stripe groups (B0, B1, B2, B3), (B4, B5, B6, B7), (B8, B9, B10, B11) are accessed in pipeline fashion because they are retrieved from disk groups attached to the same SCSI buses. Figure: 4x3 RAID-x architecture with orthogonal striping and mirroring. P: processor, M: memory, CDD: cooperative disk driver, Dj: the j-th disk, Bi: the i-th data block, Bi': the i-th mirrored image in a shaded box.

13 13 Distributed RAID-x Arch. cont. n-by-k RAID-x. Stripe group: disk blocks within n disks. Mirroring group: n-1 blocks on 1 disk. The images of all data blocks in a stripe group are saved to two disks. The block addressing scheme stripes across all nk disks sequentially and repeatedly. n = degree of parallelism; k = depth of pipelining. Figure: 4x3 RAID-x architecture with orthogonal striping and mirroring. P: processor, M: memory, CDD: cooperative disk driver, Dj: the j-th disk, Bi: the i-th data block, Bi': the i-th mirrored image in a shaded box.
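
The block addressing scheme of an n-by-k array can be sketched as below. This is a hedged illustration, not the authors' implementation: it assumes block Bi goes to disk D(i mod nk), that disk Dj is attached to node j mod n, and that it is the (j div n)-th disk on that node's bus. `Location` and `locate` are hypothetical names, and image blocks are ignored.

```python
from typing import NamedTuple


class Location(NamedTuple):
    disk: int    # global disk index j of D_j
    node: int    # node (SCSI bus) the disk is attached to
    slot: int    # position of the disk on that node, 0..k-1
    offset: int  # data-block row on the disk


def locate(block: int, n: int, k: int) -> Location:
    """Map data block B_i to a disk in an n-by-k array (assumed layout)."""
    total = n * k
    disk = block % total           # stripe sequentially over all nk disks
    return Location(disk=disk,
                    node=disk % n,     # n nodes give the degree of parallelism
                    slot=disk // n,    # k disks per node give the pipeline depth
                    offset=block // total)


# 4x3 array: the first three stripe groups.
for group_start in (0, 4, 8):
    print([locate(b, 4, 3) for b in range(group_start, group_start + 4)])
```

Under these assumptions, the four blocks of one stripe group fall on four different nodes and can be fetched in parallel, while B0, B4, and B8 share node 0 and are therefore pipelined on the same bus.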

14 14 Single I/O space in a Distributed RAID A global virtual disk with a SIOS, formed by cooperative disks, is crucial to building a scalable cluster of computers. If the disks are not shared, I/O must be handled by time-consuming system calls through a centralized file server (e.g., NFS). The SIOS is enabled by CDDs at the Linux kernel level.

15 15 Cooperative Disk Driver (CDD) Architecture Internal design of the CDD architecture, which establishes the SIOS as a single global virtual disk. Storage manager: receives and processes the I/O requests. Client module: redirects local I/O requests to remote disk managers.

16 16 CDD Architecture cont. Data consistency module: maintains data consistency at the driver level, resolving the inconsistencies that result when distributed disks update cached copies of the same data block. A CDD can run in 3 different states. Storage manager: coordinates use of local disk storage by remote nodes. Client: accesses remote disks through remote disk managers. Both.
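
The division of labor between the storage manager and the client module can be pictured with a toy, in-memory Python sketch. Everything in it is an assumption made for illustration: the class and method names are hypothetical, block ownership is a simple round-robin rule, a method call stands in for the network message a real kernel-level driver would send, and the data consistency module is omitted.

```python
class StorageManager:
    """Serves I/O requests for the disk attached to one node."""

    def __init__(self, node_id: int):
        self.node_id = node_id
        self.blocks = {}  # stands in for the local disk

    def write(self, block: int, data: bytes) -> None:
        self.blocks[block] = data

    def read(self, block: int) -> bytes:
        return self.blocks[block]


class ClientModule:
    """Redirects each request to the storage manager that owns the block."""

    def __init__(self, local_node: int, managers: list):
        self.local_node = local_node
        self.managers = managers
        self.remote_requests = 0  # how many requests left this node

    def _owner(self, block: int) -> StorageManager:
        # Assumed ownership rule: blocks are striped round-robin over nodes.
        mgr = self.managers[block % len(self.managers)]
        if mgr.node_id != self.local_node:
            self.remote_requests += 1
        return mgr

    def write(self, block: int, data: bytes) -> None:
        self._owner(block).write(block, data)

    def read(self, block: int) -> bytes:
        return self._owner(block).read(block)


# Four nodes; the client on node 0 sees one global block address space.
managers = [StorageManager(i) for i in range(4)]
client = ClientModule(local_node=0, managers=managers)
for b in range(8):
    client.write(b, f"block{b}".encode())
value = client.read(5)
```

The caller addresses blocks by global number only; whether a request is served locally or redirected is decided inside the client module, which is the transparency the SIOS is meant to provide.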

17 17 CDD Allows serverless clusters. Offers remote disk access directly at the kernel level.

18 18 I/O Bandwidth vs. request number Performance of 4 I/O subsystem architectures for large reads/writes: a 20 MB file striped across all disks in the array. Focuses on the parallel I/O capacity of the disk array. Files are uncached. Each client reads only its private file. All reads are performed simultaneously.

19 19 I/O Bandwidth vs. request number Performance of 4 I/O subsystem architectures for small reads/writes: 32 KB of data, one block of a stripe group. Results for small accesses are very close to the results for large ones.

20 20 I/O Bandwidth vs. request number Performance of 4 I/O subsystem architectures for parallel writes. RAID-x shows the best scalability, with 15.3 MB/s for 16 clients. RAID-5 scales slowly due to the overhead of parity calculations. RAID-1 scales better than RAID-5.

21 21 Achievable I/O Bandwidth and Improvement Factor Improvement factor of 16 clients over 1 client on the USC Trojans cluster. RAID-x demonstrates the highest improvement factor among the three RAID architectures: almost a 3x increase on RAID-x from 1 to 16 clients.

22 22 Elapsed Time in Executing the Andrew Benchmark On 4 I/O subsystems, with respect to an increasing number of client requests, up to 32. Panels: NFS results; RAID-x results.

23 23 Striped Checkpointing on the RAID-x Distribute data blocks and their mirrored images orthogonally. Striped staggering in coordinated checkpointing on the RAID-x disk array: successive stripes are accessed in a staggered manner from different stripes on successive 4-disk groups. Staggering implies pipelined access of the disk array.

24 24 Striped Checkpointing on the RAID-x Using OSM, each striped checkpointing file has its mirrored image on its local disk. For each node, transient failures can be recovered from the mirrored image on its local disk. Permanent failures can be recovered from the striped checkpoint.

25 25 Conclusions RAID-x shows strength in building distributed, high-bandwidth I/O storage for serverless PC or workstation clusters. The OSM architecture exploits full stripe bandwidth. Reliability: clustered mirroring on local disks and orthogonal striping across distributed disks; matches RAID-5 (recovery from single disk failures). I/O performance is better than RAID-1 and RAID-5. Highly scalable with distributed control.

26 26 Questions What does Orthogonal striping and mirroring (OSM) provide? What is RAID-x? Explain the Cooperative Disk Driver (CDD). Compare RAID-x to Petal.

27 27 Resources http://ceng.usc.edu/~kaihwang/

