Locality-driven High-level I/O Aggregation


1 Locality-driven High-level I/O Aggregation for Processing Scientific Datasets
Jialin Liu, Bradly Crysler, Yin Lu, Yong Chen
Oct
Data-Intensive Scalable Computing Laboratory (DISCL)

2 Outline
Introduction
Motivation
Hila: High Level I/O Aggregation
Evaluation
Conclusion and Future Work

3 Introduction
Scientific simulations now generate a few terabytes (TB) of data in a single run, and data sizes are expected to reach petabytes (PB) in the near future.
Example: GCRM, 100 million columns, 128 levels per column, 50 km.
Accessing and analyzing the data reveals poor I/O performance caused by the mismatch between logical access and physical layout.

4 Introduction: Scientific Datasets and Scientific I/O Libraries
Scientific I/O libraries: PnetCDF, HDF5, ADIOS
[Figure: I/O stack: PnetCDF, MPI-IO, parallel file systems]
Scientific I/O libraries allow users to specify array-based logical input.
Logical-physical mismatching

5 Motivation
I/O methods in scientific I/O libraries (PnetCDF, ADIOS, HDF5):
Independent I/O: collaboration among processes: no; collaboration among calls: no
Collective I/O: collaboration among processes: yes; collaboration among calls: no
Nonblocking I/O: collaboration among processes: yes; collaboration among calls: yes

6 Motivation: Contention on Storage Servers without Locality Awareness
[Figure: two-phase collective I/O for calls Call0, Call1, ..., Calli, with aggregators ag00 to ag03, ag10 to ag13, ..., agi0 to agi3]

7 Performance with Overlapping Calls
Conclusion: overlapping should be removed.

8 Idea: High level I/O Aggregation
Logical input:
  Call0: start{0,0,0} length{100,200,200}
  Call1: start{10,20,100} length{10,300,400}
Decomposition:
  sub0: start{0,0,0} length{100,200,100}
  sub1: start{0,0,100} length{100,200,100}
  sub2: start{10,20,100} length{10,150,400}
  sub3: start{10,170,100} length{10,150,400}
[Figure: physical layout of the requests before and after decomposition]
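
The start/length values on this slide are consistent with each call being cut in half along one dimension. The sketch below reproduces those numbers under that assumption. The function name `split_request` and the choice to pass the split dimension and part count explicitly are hypothetical; the slides do not show how Hila picks the cut points from the physical layout.

```python
from typing import List, Tuple

# A request is a (start, length) pair of equal-rank integer tuples.
Request = Tuple[Tuple[int, ...], Tuple[int, ...]]


def split_request(start: Tuple[int, ...], length: Tuple[int, ...],
                  dim: int, parts: int) -> List[Request]:
    """Cut one sub-array request into `parts` equal slabs along `dim`."""
    if parts <= 0 or length[dim] % parts != 0:
        raise ValueError("length along dim must be divisible by parts")
    step = length[dim] // parts
    subs: List[Request] = []
    for i in range(parts):
        sub_start = list(start)
        sub_length = list(length)
        sub_start[dim] = start[dim] + i * step  # shift only the split dimension
        sub_length[dim] = step                  # shrink only the split dimension
        subs.append((tuple(sub_start), tuple(sub_length)))
    return subs


# Call0 from the slide, halved along the last dimension.
call0_subs = split_request((0, 0, 0), (100, 200, 200), dim=2, parts=2)
# Call1 from the slide, halved along the middle dimension.
call1_subs = split_request((10, 20, 100), (10, 300, 400), dim=1, parts=2)

for s, n in call0_subs + call1_subs:
    print("start", s, "length", n)
```

The four printed sub-requests match the four start/length pairs listed in the decomposition on the slide.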

9 Idea: High level I/O Aggregation
Basic idea:
  Figure out the overlapping among requests
  Eliminate the overlapping before doing I/O
Challenges:
  How to decompose the requests
  How to aggregate the sub-arrays at a high level
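
To make "figure out the overlapping among requests" concrete, here is a minimal sketch that intersects two start/length requests in logical array coordinates. It is a generic hyper-rectangle intersection, not the method from the talk (which works from the physical layout); the name `overlap` is hypothetical. The inputs are Call0 and Call1 from slide 8.

```python
from typing import Optional, Tuple

Request = Tuple[Tuple[int, ...], Tuple[int, ...]]  # (start, length)


def overlap(a_start: Tuple[int, ...], a_len: Tuple[int, ...],
            b_start: Tuple[int, ...], b_len: Tuple[int, ...]) -> Optional[Request]:
    """Return the (start, length) region shared by two requests, or None."""
    start, length = [], []
    for s1, n1, s2, n2 in zip(a_start, a_len, b_start, b_len):
        lo = max(s1, s2)            # first index covered by both requests
        hi = min(s1 + n1, s2 + n2)  # one past the last index covered by both
        if hi <= lo:                # disjoint in this dimension: no overlap at all
            return None
        start.append(lo)
        length.append(hi - lo)
    return tuple(start), tuple(length)


# Call0 and Call1 from slide 8.
region = overlap((0, 0, 0), (100, 200, 200), (10, 20, 100), (10, 300, 400))
print(region)  # ((10, 20, 100), (10, 180, 100))
```

For these two calls the shared region holds 10 x 180 x 100 = 180,000 elements, which would be read twice if the calls were serviced independently.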

10 Hila: High Level I/O Aggregation
Way to figure out the physical layout:
  Sub-correlation function
  Sub-correlation set
Lustre striping: stripe size s; stripe count l
Dataset: dimension d; subset size m
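
The transcript does not preserve the sub-correlation function itself, only the parameters s (stripe size) and l (stripe count). As a rough illustration of how a byte range relates to storage targets, the sketch below assumes plain round-robin striping that starts at the file's first target. The function names are hypothetical, and this is not the sub-correlation function from the talk.

```python
from typing import Set


def ost_of_offset(offset: int, s: int, l: int) -> int:
    """Index of the storage target holding `offset`, assuming round-robin
    striping with stripe size `s` bytes over `l` targets."""
    return (offset // s) % l


def osts_of_range(offset: int, nbytes: int, s: int, l: int) -> Set[int]:
    """Set of target indices touched by the byte range [offset, offset + nbytes)."""
    if nbytes <= 0:
        raise ValueError("nbytes must be positive")
    first = offset // s                  # first stripe unit touched
    last = (offset + nbytes - 1) // s    # last stripe unit touched
    count = min(last - first + 1, l)     # after l units every target has been hit
    return {(first + k) % l for k in range(count)}


MIB = 1 << 20
s, l = MIB, 4  # assumed example values: 1 MiB stripes over 4 targets

print(ost_of_offset(5 * MIB, s, l))           # 1
print(osts_of_range(3 * MIB, 2 * MIB, s, l))  # targets 3 and 0
```

Two requests whose byte ranges map to the same target set are candidates for being served together, which is the kind of locality the aggregation aims to exploit.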

11 Hila Algorithm: Prior Step
Prior Step: calculate sub-correlation set, one time analysis

12 Hila Algorithm: Decomposition
Main Steps: Request Decomposition and Aggregation

13 Performance Improved with Hila
[Figure: improvement with Hila]

14 FASM Improved with Hila
[Figure: improvement with Hila]

15 Conclusion and Future Work
The mismatching between logical access and physical layout can lead to poor performance.
We propose the locality-driven high-level aggregation approach (HiLa) to facilitate the existing I/O methods by eliminating the overlapping among sub-array requests.
Future work:
  Apply to write operations
  Integrate with file systems

16 Thanks
Locality-driven High-level I/O Aggregation for Processing Scientific Datasets
Q&A

