Initial Data Access Module & Lustre Deployment Tan Li.


1 Initial Data Access Module & Lustre Deployment Tan Li

2 Outline
- Disk I/O test for netqos03 and netqos04
- Initial design for file I/O module
  - Data read with different functions and buffer sizes
  - Data read with fread() with different waiting times and buffer sizes
  - Some conclusions
- Intro to Lustre setup
- Lustre deployment for the new servers

3 Initial Design for Data Access
- Current data access module (block sizes: 100K, 1M, 10M, 100M, 500M for a 100G file)

4 Initial design for file I/O module
1. Header file: ftp_io.h
2. Data access functions:
int ftp_open(char *path, int block_size, int mode);
int ftp_read(int infile_fd, char *out_buf, int block_size);
int ftp_write(int outfile_fd, char *in_buf, int block_size);
int ftp_close(int close_fd, int block);
Usage of ftp_open(): the block size is passed in so that the function can choose the open method (open, fopen, or open with O_DIRECT). The close method used by ftp_close() must match the one chosen by ftp_open(). mode=0 opens the file for reading; mode=1 opens it for writing.

5 Initial design for file I/O module


7 Flowchart for ftp_open()
Inputs: mode (0 or 1) and block size.
Block size > 400K?
- No: open/fopen (read only for mode=0, write only for mode=1)
- Yes: open with O_DIRECT (read only for mode=0, write only for mode=1)
Return the file descriptor.
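The open-method decision on this slide can be sketched in C as follows. This is a minimal illustration, not the module's actual code: choose_open_flags() and ftp_open_sketch() are hypothetical names, "400K" is assumed to mean 400 * 1024 bytes, the O_CREAT | O_TRUNC flags for write mode are an assumption, and the fopen() branch is left out so that a plain file descriptor is always returned.

```c
#define _GNU_SOURCE              /* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <fcntl.h>

#ifndef O_DIRECT
#define O_DIRECT 0               /* assumption: no direct I/O available, stay buffered */
#endif

/* Assumption: the slide's "400K" threshold means 400 * 1024 bytes. */
#define DIRECT_THRESHOLD (400 * 1024)

/* Hypothetical helper: map (block_size, mode) to open(2) flags.
 * mode 0 = read, mode 1 = write, as described for ftp_open(). */
int choose_open_flags(int block_size, int mode)
{
    assert(mode == 0 || mode == 1);
    int flags = (mode == 0) ? O_RDONLY : (O_WRONLY | O_CREAT | O_TRUNC);
    if (block_size > DIRECT_THRESHOLD)
        flags |= O_DIRECT;       /* large blocks bypass the page cache */
    return flags;
}

/* Hypothetical stand-in for ftp_open(): returns a descriptor or -1. */
int ftp_open_sketch(const char *path, int block_size, int mode)
{
    return open(path, choose_open_flags(block_size, mode), 0644);
}
```

Keeping the flag choice in its own function means a matching close routine could reuse the same threshold test to decide how the file was opened.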

8 Initial design for file I/O module
- Problem with O_DIRECT when writing data
When writing data with O_DIRECT, the block size must be a multiple of 512 bytes on our platform, so the last few bytes of a file whose size is not a multiple of 512 cannot be written this way.
Possible solutions:
1. Use the regular write() to output the remaining data.
2. Integrate the open function into the read and write functions.
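A minimal sketch of possible solution 1, assuming Linux: write the aligned part of the data through O_DIRECT, then clear O_DIRECT with fcntl() and finish with a regular write(). The function names and the align parameter are hypothetical (the slide's platform would pass 512), and the fallback to buffered I/O when direct I/O is rejected is an added assumption, not something the slides describe.

```c
#define _GNU_SOURCE              /* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0               /* assumption: no direct I/O available, stay buffered */
#endif

/* Write exactly len bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Clear O_DIRECT on an open descriptor so later writes are buffered. */
static int drop_direct(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    return (flags < 0) ? -1 : fcntl(fd, F_SETFL, flags & ~O_DIRECT);
}

/* Hypothetical helper: aligned body via O_DIRECT, tail via regular write(). */
int write_direct_with_tail(const char *path, const char *data,
                           size_t len, size_t align)
{
    assert(data != NULL);
    if (align == 0)
        return -1;
    size_t body_len = len - (len % align);   /* largest multiple of align */
    size_t done = 0;
    int direct = 1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
    if (fd < 0 && errno == EINVAL) {         /* filesystem rejects O_DIRECT */
        direct = 0;
        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    }
    if (fd < 0)
        return -1;

    if (direct && body_len > 0) {
        void *bounce = NULL;                 /* O_DIRECT needs an aligned buffer */
        if (posix_memalign(&bounce, align, body_len) == 0) {
            memcpy(bounce, data, body_len);
            if (write_all(fd, bounce, body_len) == 0)
                done = body_len;
            free(bounce);
        }
    }

    /* Buffered write for the tail (or for everything if direct I/O failed). */
    int rc = -1;
    if (drop_direct(fd) == 0 &&
        lseek(fd, (off_t)done, SEEK_SET) >= 0 &&
        write_all(fd, data + done, len - done) == 0)
        rc = 0;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Reusing one descriptor avoids reopening the file for the tail; closing and reopening without O_DIRECT would be an equally valid variant.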

9 Data reading test on fread()
1. Test results from the Linux time tool
2. Test results from nmon (recording data every two seconds)

10 Data reading test on fread(): some conclusions
- The bandwidth grows as the buffer size increases, especially when the buffer size changes from 100K to 1000K (about 3 times).
- The bandwidth is not sensitive to the wait time until the wait time reaches some threshold. The larger the buffer size, the less sensitive the bandwidth is to the delay.
- The CPU utilization is 0% when the buffer size is below 100K, and it grows as the buffer size increases.
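A read test of this kind could be reproduced with a small fread() loop such as the sketch below. It is an illustration only, not the test program behind the slides: fread_scan() is a hypothetical name, the wait is assumed to be applied after every buffer read, and timing uses clock_gettime() where the slides used the time tool and nmon.

```c
#define _POSIX_C_SOURCE 200809L  /* clock_gettime(), nanosleep() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: read a whole file with fread() using buf_size-byte
 * buffers, sleeping wait_us microseconds after each read.
 * Returns the number of bytes read, or -1 on error; if seconds is not NULL
 * it receives the elapsed wall-clock time. */
long long fread_scan(const char *path, size_t buf_size, long wait_us,
                     double *seconds)
{
    assert(buf_size > 0);
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    char *buf = malloc(buf_size);
    if (buf == NULL) {
        fclose(fp);
        return -1;
    }

    struct timespec pause;
    pause.tv_sec = wait_us / 1000000;
    pause.tv_nsec = (wait_us % 1000000) * 1000;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    long long total = 0;
    size_t n;
    while ((n = fread(buf, 1, buf_size, fp)) > 0) {
        total += (long long)n;
        if (wait_us > 0)
            nanosleep(&pause, NULL);   /* simulated per-block delay */
    }

    clock_gettime(CLOCK_MONOTONIC, &t1);
    int failed = ferror(fp);
    free(buf);
    fclose(fp);
    if (failed)
        return -1;
    if (seconds != NULL)
        *seconds = (double)(t1.tv_sec - t0.tv_sec)
                 + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return total;
}
```

Dividing the returned byte count by the elapsed seconds gives the bandwidth for one (buffer size, wait time) pair; repeating the call over the block sizes listed earlier would give a comparable sweep.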

11 iWARP and InfiniBand
Hardware
- InfiniBand: specialized I/O structure
- iWARP: a set of mechanisms over Ethernet that move data management and network protocol processing to the RNIC card
Transport method
- InfiniBand: point-to-point
- iWARP: end-to-end
Compatibility
- InfiniBand: specialized infrastructure
- iWARP: fully compatible with existing Ethernet switching
Vendors
- InfiniBand: only two, Mellanox and QLogic
- iWARP: a broad range of vendors

12 RoCEE
RoCEE = InfiniBand over Ethernet (IBoE)
The RDMA over Converged Enhanced Ethernet (RoCEE) protocol proposal is designed to allow RDMA semantics to be deployed on a Converged Enhanced Ethernet fabric by running the IB transport protocol over Ethernet frames. In other words, it takes the InfiniBand transport layer and packages it into Ethernet frames, instead of using the iWARP protocol, for Ethernet-based high-performance cluster networking.

13 RoCEE
Problem 1: iWARP already delivers the performance benefit that RoCEE offers.
Problem 2: It is hard to implement.
Problem 3: RoCEE depends on the deployment of 10GbE CEE infrastructure; currently only one vendor (Cisco) offers CEE switches, and they are relatively expensive.

14 Thanks & Questions

