State-of-the-art Storage Solutions ...and more than that.
European AFS Workshop 2009, September 28th–30th 2009
Department of Computer Science and Automation, University Roma Tre
Fabrizio Magugliani, EMEA HPC Business Development and Sales
What does E4 Computer Engineering stand for?
E4 = Engineering 4 (for) Computing. E4 builds the solutions that meet its users' requirements.
Products and Services: Wide – Reliable – Advanced
- Workstations (fluid dynamics, video editing, ...)
- Servers (firewalls, computing nodes, scientific applications, ...)
- Storage (from small databases up to big-data requirements)
- SAN – Storage Area Network
- HPC clusters, GPU clusters, interconnects
- System configuration and optimization
Technology Partners
www.e4company.com luca.oliva@e4company.com
Customer References
Choosing the right computing node
Architecture: Non-Uniform Memory Access, NUMA (AMD) — form factor 1U–7U; sockets: 1, 2, 4, 8; cores: 4, 6; memory size; accelerators (GPUs).
Architecture: Uniform Memory Access, UMA (Intel) — form factor 1U–7U; sockets: 1, 2, 4; cores: 4, 6; memory size; accelerators (GPUs).
Form factors: workstation (graphics), rack-mount server, blade.
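A quick way to compare candidate node configurations is to estimate theoretical peak throughput as sockets × cores × clock × FLOPs/cycle. A minimal sketch; the sample configuration and the 4 FLOPs/cycle figure are illustrative assumptions, not E4 part numbers:

```python
def peak_gflops(sockets, cores_per_socket, clock_ghz, flops_per_cycle):
    """Theoretical peak throughput of one node, in GFLOPS."""
    return sockets * cores_per_socket * clock_ghz * flops_per_cycle

# Hypothetical 2-socket, quad-core node at 2.66 GHz, assuming
# 4 double-precision FLOPs per cycle (SSE-era x86):
print(peak_gflops(2, 4, 2.66, 4))  # ~85 GFLOPS per node
```

Real applications sustain only a fraction of this peak, so it serves for ranking configurations rather than predicting delivered performance.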
Choosing the right accelerator
Choosing and connecting the right accelerator
Choosing the right accelerator: Tesla S1070 Architecture
[Block diagram] Tesla S1070: four Tesla GPUs, each with 4 GB GDDR3 DRAM; two PCIe x16 Gen2 switches, each multiplexing the PCIe bus between two GPUs; each 2-GPU subsystem can be connected to a different host via PCI Express cables; integrated power supply, thermal management, and system monitoring.
Choosing the right accelerator: performance
800 GFLOPS on 16 GPUs ~ 99% Scaling © NVIDIA Corporation 2008
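The ~99% scaling claim can be sanity-checked by dividing the aggregate figure by perfect linear scaling. A small sketch; the ~50.5 GFLOPS sustained per-GPU baseline is an assumption chosen to match the slide, not a figure from it:

```python
def scaling_efficiency(aggregate_gflops, n_units, per_unit_gflops):
    """Measured aggregate throughput vs. perfect linear scaling."""
    return aggregate_gflops / (n_units * per_unit_gflops)

# 800 GFLOPS on 16 GPUs, assuming ~50.5 GFLOPS sustained per GPU:
eff = scaling_efficiency(800, 16, 50.5)
print(f"{eff:.1%}")  # prints "99.0%"
```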
Choosing the right interconnection technologies
Gigabit Ethernet: entry level on every solution; the ideal solution for codes with low interprocess-communication requirements.
InfiniBand DDR 20 Gb/s, integrable on the motherboard (first InfiniBand cluster in 2005, at CASPUR).
10 Gb/s Ethernet, Quadrics, Myrinet.
Latency and bandwidth by technology:
- 1 GbE: latency 50 µs, bandwidth 112 MB/s, bisectional bandwidth 175 MB/s
- 10 GbE: latency 10 µs, bandwidth 350 MB/s, bisectional bandwidth 500 MB/s
- 10 GbE RDMA (Chelsio): bandwidth 875 MB/s
- IB DDR (InfiniHost): latency 2.5 µs, bandwidth 1500 MB/s, bisectional bandwidth 2900 MB/s
- IB QDR (ConnectX): latency 1.2 µs, bandwidth 3000 MB/s, bisectional bandwidth 5900 MB/s
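The practical impact of these latency and bandwidth figures can be sketched with a first-order cost model, time = latency + size / bandwidth, which shows why low-latency fabrics matter most for small messages:

```python
def transfer_time_us(msg_bytes, latency_us, bandwidth_mb_s):
    """First-order message cost model: time = latency + size / bandwidth.
    1 MB/s equals 1 byte/us, so the division yields microseconds."""
    return latency_us + msg_bytes / bandwidth_mb_s

# Figures from the slide: 1 GbE (50 us, 112 MB/s) vs IB QDR (1.2 us, 3000 MB/s)
for name, lat, bw in [("1 GbE", 50.0, 112.0), ("IB QDR", 1.2, 3000.0)]:
    t = transfer_time_us(65536, lat, bw)
    print(f"{name}: {t:.1f} us for a 64 KiB message")
```

For a 64 KiB message the model gives roughly 635 µs on 1 GbE versus about 23 µs on IB QDR; for tiny messages the gap is dominated by the latency term alone.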
Interconnect: Gigabit Ethernet — the ideal solution for applications with moderate bandwidth requirements among processes. InfiniBand DDR 20 Gb/s, motherboard-based; InfiniPath on an HTX slot, tested with latencies below 2 microseconds. Myrinet, Quadrics.
Choosing the right Storage
Storage types by performance:
- HPC storage (DataDirect): 6 GB/s, FC/IB interface, storage space up to PBs
- HPC storage, Ethernet interface: 500 MB/s per chassis
- SAN – FC: up to 1 GB/s
- Disk server, Ethernet: 300–800 MB/s
- HD section: 200 MB/s
Storage — interface and performance, from disk subassembly to disk server:
- SATA/SAS disk subassembly, RAID controller on PCI-Express: 200 MB/s, Ethernet
- Disk server: 300–800 MB/s, FC/Ethernet
- Ideal for HPC applications, Ethernet i/f: up to 1 GB/s (Ethernet), 500 MB/s per chassis
- Ideal for HPC applications, FC/IB i/f: up to 3 GB/s (InfiniBand/FC)
Storage Server: a high-flexibility, low-power-consumption solution engineered by E4 for high-bandwidth requirements. COTS-based (2 Intel Nehalem CPUs). RAM configurable to the user's requirements (up to 144 GB DDR3). Multi-lane SAS/SATA controller. 48 TB in 4U. 1 GbE (n ports via trunking), 10 GbE, InfiniBand DDR/QDR. 374 units installed at CERN (Geneva), 70 more at several other customers.
HPC Storage Systems: Panasas Cluster Storage and DataDirect Networks
Panasas Cluster Storage — clustered storage system based on the Panasas file system (Parallel, Asynchronous, Object-based, Snapshots). Interface: 4 × 1 GbE, 1 × 10 GbE, IB (via router). Performance (per shelf): 500–600 MB/s, up to hundreds of GB/s aggregate (sequential). Capacity: 20 TB per shelf, 200 TB per rack, up to PBs. SSD (optimal for random I/O).
DataDirect Networks — interface: FC/IB. Performance: up to 6 GB/s. 560 TB per storage system.
Ideal areas: real-time data acquisition, simulation, biomedicine, genomics, oil & gas, rich media, finance.
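Clustered storage of this kind scales roughly linearly with shelf count, so sizing can be sketched from the per-shelf figures. A rough estimator; 550 MB/s is the midpoint of the quoted 500–600 MB/s range, and 10 shelves per rack follows from 200 TB/rack at 20 TB/shelf:

```python
def panasas_aggregate(shelves, mb_s_per_shelf=550, tb_per_shelf=20):
    """Linear scale-out estimate for a clustered-storage install.
    Returns (aggregate GB/s, raw capacity in TB)."""
    return shelves * mb_s_per_shelf / 1000.0, shelves * tb_per_shelf

# A hypothetical 10-rack install at 10 shelves per rack:
gb_s, tb = panasas_aggregate(100)
print(f"{gb_s:.0f} GB/s aggregate, {tb} TB raw")  # 55 GB/s, 2000 TB
```

Sequential aggregate bandwidth is the best case; random I/O scales less cleanly, which is why the slide calls out SSD for random workloads.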
HPC Storage Systems — file systems: NFS, Lustre, GPFS, Panasas, AFS
Storage Area Network: E4 is a QLogic Signature Partner; latest technology.
Based on high-performance Fibre Channel interfaces, 4+4 Gb multipath. HA failover for mission-critical applications (finance, biomedicine, ...). Oracle RAC QUAD.
System validation — a rigid quality procedure
Reliability: a basic requirement, guaranteed by E4's production cycle. Selection of quality components. A production process cared for in every detail. Burn-in to prevent infant mortality of components: at least 72 h of accelerated stress testing in a hot room (35 °C); 24 h individual test of each subsystem; 48 h simultaneous test of all subsystems; OS installation to prevent HW/SW incompatibilities.
Case Histories
Case History – Oracle RAC
Case History – INTEL cluster @ Enginsoft
May 2007, Intel Infinicluster: 96 computing nodes, Intel quad-core 2.66 GHz, 4 TFLOPS, 1.5 TB RAM. Interconnect: InfiniBand 4x DDR, 20 Gb/s. 30 TB FC storage. Application field: Computer Aided Engineering. A solution with an FC-to-IB router to give the InfiniBand nodes access to Fibre Channel storage.
Case History – CERN computing servers 1U
1U servers with high computing capacity. Application fields: education, academic research. Customer: CERN (Geneva) and major national computing and research centres. 2005: 415 dual-Xeon 2.8 GHz nodes, 4.6 TFLOPS. 2006: 250 Xeon Woodcrest 3 GHz nodes, 6 TFLOPS, 2 TB RAM. Systems installed up to July '08: over 3000 units. Note: CERN requires extremely high system reliability, with two months of acceptance testing.
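The TFLOPS figures above can be reproduced from the node counts and clock speeds; the FLOPs-per-cycle counting convention is an assumption inferred to match the slide's numbers:

```python
def cluster_peak_tflops(nodes, cpus, cores, clock_ghz, flops_per_cycle):
    """Peak = nodes x CPUs x cores x clock x FLOPs/cycle, in TFLOPS."""
    return nodes * cpus * cores * clock_ghz * flops_per_cycle / 1000.0

# 2005 install: 415 nodes, 2 single-core Xeons each, 2.8 GHz,
# assuming 2 DP FLOPs/cycle:
print(round(cluster_peak_tflops(415, 2, 1, 2.8, 2), 1))  # 4.6
# 2006 install: 250 nodes, 2 dual-core Woodcrest CPUs each, 3 GHz:
print(round(cluster_peak_tflops(250, 2, 2, 3.0, 2), 1))  # 6.0
```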
Case History – AMD cluster @ CASPUR
2004, Cluster SLACS: 24 Opteron computing nodes, 200 GFLOPS, 128 GB RAM. Managed by CASPUR on behalf of SLACS (Sardinian LAboratory for Computational materials Science) and INFM (Istituto Nazionale Fisica della Materia). June 2005, AMD Infinicluster: 24 dual-core Opteron 2.4 GHz computing nodes, 460 GFLOPS, 192 GB RAM, InfiniBand interconnect. Expanded to 64 nodes: 1.2 TFLOPS, 512 GB RAM. Infinicluster: an example of a COTS cluster with InfiniBand interconnect in 2005, one of the first (if not the first) in Italy; only now has InfiniBand become mainstream.
Case History – CRS4 Cluster 96 core
February 2005: 96 computing nodes, dual-core Opteron, 384 GFLOPS, 192 GB RAM in total. Application fields: environmental sciences, renewable energy, fuel cells, bioinformatics. In three years of intensive use we received only one service request, for a single faulty RAM module — thanks to the rigid system-validation procedure.
Case History – Cluster HPC Myrinet 2005
HPC cluster with Myrinet interconnect: 16 dual Intel Xeon 3.2 GHz computing nodes, high-speed Myrinet interconnect, 5 TB SCSI-to-SATA storage, KVM monitor, 2 × 24-port layer-3 Ethernet switches. Application fields: education, research. Customer: ICAR CNR of Palermo. Like the CASPUR Infinicluster, an early example of a COTS cluster with a low-latency interconnect.
Case History – CNR/ICAR
Hybrid Cluster (CPU + GPU). 12 compute nodes: 96 cores — 24 Intel "Nehalem" 5520 CPUs, GFLOPS (peak): 920, RAM: 288 GB. 6 nVIDIA S1070 GPU servers: 24 Tesla GPUs, 5760 single-precision cores, 720 double-precision cores, peak performance: 24 TFLOPS.
Case History – CNR/ICAR
Hybrid Cluster (CPU + GPU): 1 front-end node, 48-port Gigabit Ethernet switch, 24-port InfiniBand 20 Gb/s switch.
Hybrid cluster CPU/GPU – ICAR CNR Cosenza - ALEPH
Case History – CNR/ICAR
Case History – EPFL
E4: The right partner for HPC
Questions?
Feel free to contact me:
Fabrizio Magugliani
Thank you!
E4 Computer Engineering SpA
Via Martiri della Liberta' 66
Scandiano (RE), Italy
Switchboard:
E4 Computer Engineering:
The perfect partner for HPC