Enterprise at a Global Scale Paul Grun Chief Scientist System Fabric Works (503) 620-8757

Presentation transcript:

Abstract
There are many classes of enterprise which are geographically dispersed and yet must behave as a single, monolithic enterprise. In one such application, a globally distributed enterprise must collect real time information which must be made available to a globally distributed network of analysts. The results of the analysis, in turn, must be presented in near-real time to field agents. Large scale ‘data shipping’ using conventional networks is not a viable option since time is of the essence in this environment. One method for presenting such a single, worldwide face is through the use of Remote Direct Memory Access (RDMA) at gigabits per second to interconnect a set of globally distributed enterprise data centers, in effect virtualizing the globally distributed storage and compute facilities as presented to its users. This talk discusses the use of RDMA over the wide area to virtualize a set of widely dispersed data centers.

Key messages
1. Describe ‘storage at a distance’ as practiced in LD
2. Extend the concept to the enterprise

Truth in Advertising I am not a network guy…. But I am pretty interested in storage, and storage at a distance. (From Joint Techs Workshop, Salt Lake City – February 2010)

One doesn’t usually think about networks when discussing storage… …unless there is a need for ‘storage at a distance’ Suddenly, networks become very interesting. Consider the case of a globally distributed enterprise… This was the key message…

A globally distributed enterprise
[Diagram: three data centers]
- Remote backup/recovery
- Data collected in one place, but analyzed in another
- Dissemination of information throughout the enterprise
- Application mobility
“Scientific Productivity follows Data Locality” – Eli Dart, et al.

In time-sensitive environments, data is only useful if it can be analyzed quickly, results delivered quickly, and action taken quickly. The notion of ‘Storage at a Distance’ is predicated on delivering an unprecedented level of immediacy in data access. This required a re-think of the way data is ingested, stored and accessed.

Logical view – global datacenter
[Diagram: three data centers, each with workstations/servers and a storage server, joined by a logical switch]

Data center - notional
[Diagram labels: user, LAN, switch, servers/workstations, IB switch, storage server, link to remote site]
- Users connect via a web browser
- IB chosen for:
  - Latency, b/w
  - Support for parallel file I/O
  - Reduced resource utilization (CPU/memory b/w)
  - Cost efficiency
- Compute and storage are provided at each node
- Access to all data, enterprise-wide

Storage at a distance
[Diagram labels: workstations/servers, storage server and IB switch at each site; WAN ‘gateway’; OC192 ATM/SONET]
- IB subnet segments: 40Gb/s
- WAN links: 10Gb/s
- WAN ‘gateway’: async/sync interface, a two-port switch
- Also tested on a ‘shared wavelength’ service, with excellent results

Enterprise storage architecture
[Diagram labels: user, user app, storage client, buffer, server, local storage, remote storage (two instances)]
An enterprise application reads data through a storage client. The storage client connects to each storage server via RDMA. Thus, the user has direct access to all data stored anywhere on the system.
Basic idea: effectively utilize rare high-bandwidth links.

Lustre Parallel File System – (1/2)
[Diagram labels: user, user app, storage client, buffer, server, MDS, OSS, local storage, remote storage]
- Persistent connection to the Metadata Server (MDS) and Object Storage Servers (OSS).
- All file systems are mounted by the storage client.
- Data appears as if local; no need to move files by FTP.

Parallel file system – (2/2)
[Diagram labels: three f/s client/server pairs, mds, oss]
Lustre, pNFS…
- file, object, block level I/O
- store/retrieve data using parallel disk storage
- source/sink data using multiple initiators and parallel file systems
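The slide does not say how a parallel file system spreads a file across its object servers. As a minimal sketch, assuming plain round-robin striping (a common layout, not necessarily the one used in this deployment), the mapping from a file offset to a storage target can be written as follows. The names `locate`, `stripe_size` and `stripe_count` are hypothetical and are not part of any Lustre or pNFS API.

```python
from typing import NamedTuple


class StripeLocation(NamedTuple):
    target: int         # index of the object storage target holding the byte
    object_offset: int  # byte offset inside that target's object


def locate(offset: int, stripe_size: int, stripe_count: int) -> StripeLocation:
    """Map a file offset to (target, offset) under round-robin striping.

    Assumption: stripe k of the file lives on target k % stripe_count.
    """
    if offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; stripe_size and stripe_count must be > 0")
    stripe_index = offset // stripe_size        # which stripe of the file
    target = stripe_index % stripe_count        # round-robin across targets
    stripes_before = stripe_index // stripe_count  # earlier stripes on the same target
    return StripeLocation(target, stripes_before * stripe_size + offset % stripe_size)


if __name__ == "__main__":
    MIB = 2**20
    # Illustrative values only: 1 MiB stripes across 4 targets.
    for off in (0, MIB, 5 * MIB + 10):
        print(off, locate(off, MIB, 4))
```

Because consecutive stripes land on different targets, a large sequential read can be served by several servers at once, which is the property the slide refers to as parallel disk storage.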

RDMA WAN
[Diagram: protocol stacks. End nodes: app buffer, RDMA Transport, IB Network, IB Link*, IB Phy. WAN gateway devices: IB Link*, IB Phy, gateway function, WAN Link, WAN Phy]
- WAN gateway is a two-port switch: buffer-to-buffer transfers over the WAN
- ‘Losslessness’ is stretched across the WAN
- Highly efficient use of available bandwidth
- Scales well with multiple, concurrent data flows
- RDMA/IB b/w performance: ≥ 80%
- TCP/IP b/w performance: ≤ 40%
- RDMA CPU usage estimated at 4x less
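To put the bandwidth figures in concrete terms, the sketch below estimates how long a bulk transfer takes at a given fraction of wire bandwidth. The 10 Gb/s link rate and the 80% and 40% bounds come from the slides; the 1 TB data set size and the function name `transfer_time_s` are assumptions chosen for illustration.

```python
def transfer_time_s(data_bytes: float, link_bps: float, utilization: float) -> float:
    """Seconds needed to move data_bytes over a link used at the given fraction of wire rate."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return data_bytes * 8 / (link_bps * utilization)


if __name__ == "__main__":
    link_bps = 10e9    # 10 Gb/s WAN link, per the slides
    data_bytes = 1e12  # hypothetical 1 TB data set
    for label, util in (("RDMA/IB at 80%", 0.80), ("TCP/IP at 40%", 0.40)):
        minutes = transfer_time_s(data_bytes, link_bps, util) / 60
        print(f"{label}: {minutes:.1f} minutes")
```

Since the slide gives 80% as a lower bound for RDMA and 40% as an upper bound for TCP/IP, the gap in practice is at least the factor of two this sketch shows.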

which results in…
[Diagram labels: WAN, compute, storage, logical switch]
A practical enterprise network distributed over 1000s of km
- ‘Pools’ of compute resource
- ‘Pools’ of storage

A commercial global enterprise
[Diagram: three data centers: Manhattan, New Jersey, London]
- Remote backup/recovery
- Data collected in one place, but analyzed in another
- Dissemination of information throughout the enterprise
- Application mobility

Distributed storage, and what else?
[Diagram labels: compute and storage at Manhattan, New Jersey, London; logical switch; globally virtualized storage]
- Global access to enterprise data – worldwide
- Flexible, agile allocation of server resources
- Data protection
- Reliability, resiliency

Flexible, agile allocation of resources??
[Diagram labels: compute, storage, logical switch, VM; Manhattan, New Jersey, London]
Put the application container where compute resource is available, or where it is needed (temporally).

RDMA Concept
[Diagram labels: Application, RDMA Service, network, switch, phy on each side]
Based on “channel I/O”, RDMA creates memory-to-memory pipes.
RDMA delivers:
- low latency
- scalability
- high network bandwidths
- low CPU utilization
- conserved server memory bandwidth
- reduced or eliminated context switches
- reduced or eliminated buffer copies

RDMA connects virtual buffers which may be located in different physical address spaces… even across a network.
[Diagram labels: App, buf, NIC, OS on each side]
- No kernel buffer copies
- No OS context switch for data transfers
- Virtual-to-physical address translation in the NIC
- Application accesses the NIC directly
- RDMA: the initiating app targets a virtual buffer at the receiving end. Virtual addresses are carried over the network by the transport.
- SEND/RECEIVE: the sender targets a destination ‘queue pair’; the destination buffer address is opaque to the sender.
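The difference between the two operations can be shown with a toy in-memory model. This is only a sketch of the addressing semantics described on the slide and is not a real verbs API: `Endpoint`, `rdma_write` and `send` are hypothetical names, and details of real RDMA such as memory keys, completions and the NIC's address translation are left out.

```python
from collections import deque


class Endpoint:
    """Toy stand-in for a receiving node: registered buffers plus a receive queue."""

    def __init__(self) -> None:
        self.regions = {}          # base virtual address -> registered bytearray
        self.recv_queue = deque()  # buffers the receiver has posted, in order

    def register(self, base_addr: int, size: int) -> bytearray:
        """Expose a buffer at a virtual address that remote peers may target."""
        buf = bytearray(size)
        self.regions[base_addr] = buf
        return buf

    def post_receive(self, size: int) -> bytearray:
        """Queue a buffer for the next incoming SEND; its address is not advertised."""
        buf = bytearray(size)
        self.recv_queue.append(buf)
        return buf


def rdma_write(target: Endpoint, virt_addr: int, data: bytes) -> None:
    """RDMA-style: the initiator names the destination virtual address itself."""
    for base, buf in target.regions.items():
        off = virt_addr - base
        if 0 <= off and off + len(data) <= len(buf):
            buf[off:off + len(data)] = data
            return
    raise ValueError("address range is not inside a registered region")


def send(target: Endpoint, data: bytes) -> None:
    """SEND/RECEIVE-style: the sender names only the queue; the receiver chose the buffer."""
    if not target.recv_queue:
        raise RuntimeError("receiver has no posted buffer")
    buf = target.recv_queue.popleft()
    if len(data) > len(buf):
        raise ValueError("message larger than the posted buffer")
    buf[:len(data)] = data


if __name__ == "__main__":
    node = Endpoint()
    region = node.register(0x1000, 16)
    rdma_write(node, 0x1004, b"abcd")  # initiator picks the exact address
    posted = node.post_receive(8)
    send(node, b"hi")                  # sender never sees where the data lands
    print(bytes(region), bytes(posted))
```

In the RDMA case the receiving application takes no part in placing the data, which is what allows the transfer to bypass the kernel on the receiving side.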

- Extending RDMA over the WAN has been repeatedly demonstrated
- NRL’s work demonstrates the value of combining structured data, RDMA over the WAN and a parallel file system
- Apply the same concepts to the globally distributed enterprise

To do list
- Finish routing
  – SM scalability?
- Improved injection rate control
  – better QoS for ‘shared wavelength’ environments
- Increase LID space?
- Steve Poole’s list from last night
- The list from the OEM panel
- …

Backup

System Fabric Works
System Fabric Works, Inc. delivers engineering, system integration and strategic consulting services to organizations seeking to deploy high productivity computing and storage systems, low latency high performance networks and the optimal software to meet their application requirements. SFW also offers custom integration and deployment of commodity servers and storage systems at levels of performance, scale and cost effectiveness that are not available from other suppliers. SFW personnel are widely recognized experts in the fields of high performance computing, networking and storage systems, particularly in OpenFabrics Software, InfiniBand, Ethernet and energy saving, efficient computing technologies such as RDMA.

An efficient WAN?
[Diagram labels: FTP packets; FTP/TCP/IP; RDMA; client]
- FTP/TCP: windowing protocol. Windowing effects are exaggerated over long distance. Measured utilizations ~20% of wire bandwidth.
- RDMA protocol keeps the pipe continuously full. Measured utilizations approach 98% of wire bandwidth.
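The windowing effect can be approximated with simple arithmetic: a sender that may have at most one window of unacknowledged data in flight cannot exceed window / round-trip time, whatever the link rate. The sketch below works through that bound. The 10 Gb/s link rate is from the slides; the 100 ms round-trip time and the 16 MiB window are assumptions chosen for illustration, not measurements from the talk.

```python
def window_limited_throughput_bps(window_bytes: int, rtt_s: float) -> float:
    """Upper bound on throughput when one window is in flight per round trip."""
    return window_bytes * 8 / rtt_s


def utilization(window_bytes: int, rtt_s: float, link_bps: float) -> float:
    """Fraction of wire rate a single window-limited flow can reach (capped at 1.0)."""
    return min(1.0, window_limited_throughput_bps(window_bytes, rtt_s) / link_bps)


def bandwidth_delay_product_bytes(link_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep the link continuously full."""
    return link_bps * rtt_s / 8


if __name__ == "__main__":
    link_bps = 10e9      # 10 Gb/s WAN link, per the slides
    rtt_s = 0.100        # assumed 100 ms round trip
    window = 16 * 2**20  # assumed 16 MiB window
    print(f"utilization: {utilization(window, rtt_s, link_bps):.1%}")
    print(f"bytes in flight to fill the pipe: {bandwidth_delay_product_bytes(link_bps, rtt_s):,.0f}")
```

The longer the round trip, the more data must be in flight to fill the link, which is why the same window that is adequate on a LAN leaves a long-haul link mostly idle.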