Presentation is loading. Please wait.

Presentation is loading. Please wait.

Storage Systems in HPC John A. Chandy Department of Electrical and Computer Engineering University of Connecticut.

Similar presentations


Presentation on theme: "Storage Systems in HPC John A. Chandy Department of Electrical and Computer Engineering University of Connecticut."— Presentation transcript:

1 Storage Systems in HPC John A. Chandy Department of Electrical and Computer Engineering University of Connecticut

2 Research Summary Storage SystemsStorage Systems –Active Storage –Parallel File Systems –Reliable Data Storage –Active Storage Networks

3 Storage Systems Parallel ComputingParallel Computing –Building parallel file systems to support HPC –Computation at the storage node –Data organization methods to improve performance Reliable Data StorageReliable Data Storage –Customizable and extensible storage for reliability –Backup strategies using personal storage devices –Data security, trust, and reliability in the cloud

4 Parallel File Systems Network Attached StorageNetwork Attached Storage –Put the storage on the network with a computer (server) acting as the go-between Network

5 Parallel File Systems Separate the metadata from the storageSeparate the metadata from the storage Network Metadata

6 Parallel File Systems How do you improve metadata performance?How do you improve metadata performance? –Distribute metadata services on data nodes –Use active storage and object services

7 Active Storage Allows us to run applications on storage nodesAllows us to run applications on storage nodes Can dramatically reduce data trafficCan dramatically reduce data traffic –Eliminate large network latencies Take advantage of fast RAID arrays and SSDsTake advantage of fast RAID arrays and SSDs –Drives bottle-necked by slow networks Run applications in parallel across multiple nodesRun applications in parallel across multiple nodes Make use of unused processor timeMake use of unused processor time

8 Programming Model Based on object storageBased on object storage RPC basedRPC based –Executable objects –RPC calls have full access to all object functions – read, write, create, set attribute, etc. Functions can be synchronous or asyncFunctions can be synchronous or async Supports multiple languages (C, Java, Python)Supports multiple languages (C, Java, Python)

9 Programming Model Based on work by Acharya, Riedel - Stream basedBased on work by Acharya, Riedel - Stream based Our model is Remote Procedure Call (RPC) basedOur model is Remote Procedure Call (RPC) based o Use executable objects o Added command to begin execution o Allow full access to all OSD functions Functions can be run sync or asyncFunctions can be run sync or async o Due to iSCSI 30sec timeout o Working to allow queries for async Allow parallel execution using asyncAllow parallel execution using async Support multiple languages (c, java, python)Support multiple languages (c, java, python)

10 Security Multiprocess implementationMultiprocess implementation –Limits AS functions from directly accessing objects –Limits access to the object services library –Enforces use of object security mechanisms chroot sandboxingchroot sandboxing –C/Java engines run in a chroot directory –Allows limited system libraries – e.g. libc

11 Security Multiprocess ImplementationMultiprocess Implementation o Limits AS functions from directly accessing objects o Limits access to the OSD services library  Forces the use of RPC o Enforces the use of OSD security mechanisms Chroot SandboxingChroot Sandboxing o Applied to engines o Limits engines inside a single directory o Allows limiting of libraries  AS versions of libraries possible

12 Active Storage Code Example

13 Results: AES Local vs. Active Storage

14 Results: Scaling with Multiple OSDs

15 Results: C vs. Java

16 High Performance Computing Active storage networkActive storage network –Computing in the network –SIMD-like processing of data in motion –Adaptive computing network elements –Application optimizations for database queries, scientific applications, data mining, sort, etc.

17 Active Storage Networks Data Sort

18 BECAT Collaboration Large Data ProblemsLarge Data Problems Parallel File Systems ImplementationParallel File Systems Implementation


Download ppt "Storage Systems in HPC John A. Chandy Department of Electrical and Computer Engineering University of Connecticut."

Similar presentations


Ads by Google