Scaling Spark on HPC Systems — Presentation transcript

1 Scaling Spark on HPC Systems
Presented by: Jerrod Dixon

2 Outline
HDFS vs Lustre
MapReduce vs Spark
Spark on HPC
Experimental Setup
Results
Conclusions

3 HDFS vs Lustre

4 Hadoop HDFS
Distributed filesystem
Multi-node replication
Direct communication with the NameNode

5 Lustre
Very popular filesystem for HPC systems
Leverages:
Management Server (MGS)
Metadata Server (MDS)
Object Storage Servers (OSS)

6 Lustre
Full POSIX support
Metadata Server informs clients where the objects making up a file are located; clients then connect directly to the Object Storage Servers

7 MapReduce vs Spark

8 MapReduce
Typical method of interacting with data on HDFS
Maps data in files to key-value pairs
Reduces each unique key to a single value (sketched below)
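A minimal sketch of this contract in plain Python (illustrative only; a real Hadoop job would run the mapper and reducer across HDFS blocks, and the sort here stands in for the shuffle phase):

    from itertools import groupby
    from operator import itemgetter

    def mapper(line):
        # Map: emit a (key, value) pair for every word in the line.
        for word in line.split():
            yield (word, 1)

    def reducer(key, values):
        # Reduce: collapse all values for one unique key into one value.
        return (key, sum(values))

    lines = ["spark on hpc", "spark on lustre"]
    pairs = [kv for line in lines for kv in mapper(line)]
    pairs.sort(key=itemgetter(0))  # stands in for the shuffle/sort phase
    counts = [reducer(k, (v for _, v in grp))
              for k, grp in groupby(pairs, key=itemgetter(0))]
    print(counts)  # [('hpc', 1), ('lustre', 1), ('on', 2), ('spark', 2)]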

9 Spark
Similar overall methodology to MapReduce
Maintains intermediate results in memory between stages
Distributes data across global and local scopes

10 Spark – Vertical Data Movement
Processes from disk only when final results are requested
Pulls from the filesystem and works against data in a batch methodology

11 Spark – Horizontal Data Movement
Distributes work across nodes as data is processed
Similar distribution to HDFS replication, but the data is force-kept in memory

12 Spark
Operates primarily on Resilient Distributed Datasets (RDDs)
Map transformations can be chained but are lazy
A reduce operation (an action) forces processing
A caching method can force mapped data into memory
Here, 'lazy' means that Spark does not execute transformations until the data is actually needed (see the sketch below)
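A minimal PySpark sketch of these points (the input path is a hypothetical example): nothing executes until the final action, and cache() marks the mapped RDD to be kept in memory.

    from pyspark import SparkContext

    sc = SparkContext(appName="LazyDemo")
    lines = sc.textFile("input.txt")                # lazy: nothing is read yet
    pairs = lines.flatMap(lambda l: l.split()) \
                 .map(lambda w: (w, 1))             # still lazy
    pairs.cache()                                   # mark for in-memory caching
    counts = pairs.reduceByKey(lambda a, b: a + b)  # still lazy
    result = counts.collect()                       # action: whole chain runs now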

13 Spark on HPC

14 Spark on HPC
Spark is designed for HDFS:
Expects partial data on local disk
Executes jobs as results are requested
Works on data in batches
Vertical data movement
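One practical consequence (a hedged sketch, not from the slides): many HPC compute nodes are diskless, so the directory Spark uses for shuffle and spill files has to be redirected somewhere that exists. spark.local.dir is the standard Spark setting for this; the path below is a site-specific assumption, not a universal default.

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setAppName("SparkOnLustre")
            # Redirect shuffle/spill files; node-local shared memory is one
            # option, a Lustre scratch path is another. Example path only.
            .set("spark.local.dir", "/dev/shm/spark-scratch"))
    sc = SparkContext(conf=conf)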

15 Experimental Setup

16 Hardware
Edison and Cori: Cray XC supercomputers at NERSC
Edison has 5,576 compute nodes, each with two 2.4 GHz 12-core Intel "Ivy Bridge" processors
Cori has 1,630 compute nodes, each with two 2.3 GHz 16-core Intel "Haswell" processors

17 Edison Cluster
Leverages Lustre, standard implementation
Single MDS, single MDT

18 Cori Cluster
Leverages Lustre
Leverages BurstBuffer, which accelerates I/O performance

19 BurstBuffer
Sits between memory and Lustre
Stores frequently accessed files to improve I/O

20 Results

21 Single Node
Clear bottleneck in communicating with disk

22 Multi-node file I/O

23 BurstBuffer

24 GroupBy Benchmark
16 nodes (384 cores), Edison, weak scaling
Each partition must exchange data with every other partition (a sketch of this style of benchmark follows)
shm – memory-mapped storage
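A rough sketch of a GroupBy-style benchmark (not the authors' exact code; the sizes, key count, and partition count are arbitrary choices): random key/value pairs are grouped by key, which forces the all-to-all shuffle being measured.

    import random
    import time
    from pyspark import SparkContext

    sc = SparkContext(appName="GroupByBench")
    num_pairs, num_keys, parts = 10_000_000, 1_000, 384

    pairs = (sc.parallelize(range(num_pairs), parts)
               .map(lambda i: (random.randrange(num_keys), i)))
    start = time.time()
    pairs.groupByKey().count()  # action: triggers the all-to-all shuffle
    print(f"groupBy took {time.time() - start:.1f}s")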

25 GroupBy Benchmark
Cori-specific results

26 Impact of BurstBuffer
Increase in mean time per operation
Lower variability in access time

27 Conclusions

28 No mention of .persist() / .cache()
.persist(): Spark memory management to preserve processed partitions against eviction
.cache(): shorthand for .persist() with the bare default parameters (MEMORY_ONLY mode); see the sketch below
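A short sketch of the distinction, using the standard PySpark RDD API (the data here is a throwaway example):

    from pyspark import SparkContext, StorageLevel

    sc = SparkContext(appName="PersistDemo")
    rdd = sc.parallelize(range(1_000_000)).map(lambda x: x * x)

    rdd.cache()      # equivalent to rdd.persist(StorageLevel.MEMORY_ONLY)
    rdd.unpersist()  # drop it again so we can re-persist differently
    # persist() exposes other storage levels, e.g. spilling to disk
    # instead of recomputing partitions evicted from memory:
    rdd.persist(StorageLevel.MEMORY_AND_DISK)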

29 Conclusions
Clear limitations to using Lustre as the filesystem:
Increases in access time, decreases in processing throughput
BurstBuffer helps, but only at certain node counts
No discussion of Spark methods to overcome these issues

30 Issues
Weak scaling covered extensively; strong scaling covered almost not at all
No comparisons to equivalent work on an HDFS system
Spark is designed for HDFS, so comparing the HPC results against a standard HDFS implementation seems the intuitive baseline

31 Questions?

