Implementing Convergent Networking: Partner Concepts


Implementing Convergent Networking: Partner Concepts
Uri Elzur, Director, Advanced Technology, Broadcom Corporation
Brian Hausauer, Chief Architect, NetEffect, Inc.

Convergence In The Data Center: Convergence Over IP
Uri Elzur, Director, Advanced Technology, Broadcom Corporation

Agenda
- Application requirements in the data center
- Data flows and server architecture
- Convergence demo
- Hardware and software challenges and advantages
- Summary

Enterprise Network Today
- IT: get ready for tomorrow's data center, today
- Multiple networks drive Total Cost of Ownership (TCO) up
- Consolidation, convergence, and virtualization require flexible I/O
- Higher speeds (2.5G, 10G) require more efficient I/O
- Issue: best use of memory and CPU resources
- Additional constraints: limited power and cooling, smaller form factors

Convergence Over Ethernet
- Multiple networks and multiple stacks in the OS are used to provide these services
- Wire protocols such as Internet Small Computer System Interface (iSCSI) and iWARP (Remote Direct Memory Access, RDMA) enable the use of Ethernet as the converged network
- Direct-attached storage migrates to networked storage
- Proprietary clustering can now use RDMA over Ethernet
- The OS supports one device servicing multiple stacks with the Virtual Bus Driver
- To accommodate these new traffic types, Ethernet's efficiency must be optimal: CPU utilization, memory bandwidth utilization, latency
[Diagram: the Windows stack from sockets and storage applications down through the WinSock switch, TCP/IP, NDIS, the iSCSI port driver (iscsiprt.sys), and the RDMA provider to the NIC, iSCSI HBA, and RNIC]

Data Center Application Characteristics
[Diagram: data center topology showing load balancers, web servers, application servers, a database cluster, DAS, and IP/Ethernet networks for storage, cluster, and management traffic]

The Server In The Data Center
- Server network requirements: data, storage, clustering, and management
- Acceleration required: data = TCP, storage = iSCSI, clustering = RDMA
- Application requirements: more transactions per server, higher rates, larger messages (e.g., e-mail)
[Diagram: the same data center topology, annotating long-lived and short-lived connections across the web, application, and database tiers]

Traditional L2 NIC Rx Flow And Buffer Management
1. Application pre-posts a buffer
2. Data arrives at the network interface card (NIC)
3. NIC DMAs (Direct Memory Access) the data to driver buffers (kernel)
4. NIC notifies the driver after a frame is DMA'd (interrupt moderation per frame)
5. Driver notifies the stack
6. Stack fetches headers, processes TCP/IP, strips headers
7. Stack copies data from driver buffers to application buffers
8. Stack notifies the application
Result: a minimum of one copy on every receive (sketched below).
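
A minimal sketch of steps 5 through 8 above in C, with illustrative names (this is not a real NDIS interface); it exists only to show where the unavoidable kernel-to-application copy happens on a plain L2 receive path.

    /* Minimal sketch of the traditional L2 receive path described above.
     * All names are illustrative; this is not a real NDIS interface. */
    #include <stddef.h>
    #include <string.h>

    struct rx_frame {
        unsigned char data[1514];     /* DMA'd by the NIC into a kernel buffer */
        size_t len;
    };

    /* Steps 5-8: the stack strips headers and copies payload to the app buffer. */
    size_t stack_deliver(const struct rx_frame *kbuf,
                         unsigned char *app_buf, size_t app_len)
    {
        size_t hdr = 14 + 20 + 20;    /* Ethernet + IPv4 + TCP headers, no options */
        size_t payload = (kbuf->len > hdr) ? kbuf->len - hdr : 0;
        if (payload > app_len)
            payload = app_len;
        /* The copy an offload/RDMA NIC eliminates: kernel buffer -> app buffer. */
        memcpy(app_buf, kbuf->data + hdr, payload);
        return payload;               /* the application is then notified */
    }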

iSCSI
- iSCSI provides a reliable, high-performance block storage service
- Microsoft operating system support for iSCSI accelerates iSCSI's deployment: the Microsoft iSCSI Software Initiator and iSCSI HBAs
- An iSCSI HBA provides better performance, iSCSI Boot, and iSER enablement
[Diagram: the Windows storage stack from storage applications through the file system, partition manager, class driver, and iSCSI port driver (iscsiprt.sys) down to the iSCSI miniport and HBA]
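
To illustrate the protocol processing an iSCSI HBA takes off the host, here is the layout of the 48-byte iSCSI Basic Header Segment from RFC 3720 as a C struct; this is the standard wire format, not vendor-specific code.

    /* Layout of the 48-byte iSCSI Basic Header Segment (RFC 3720), shown only
     * to illustrate what an iSCSI HBA parses and digests in hardware. */
    #include <stdint.h>

    #pragma pack(push, 1)
    struct iscsi_bhs {
        uint8_t  opcode;              /* I bit + opcode (e.g., 0x01 SCSI Command) */
        uint8_t  flags;
        uint8_t  rsvd[2];             /* opcode-specific in some PDUs */
        uint8_t  total_ahs_len;       /* additional header segments, in words */
        uint8_t  data_seg_len[3];     /* 24-bit data segment length */
        uint8_t  lun[8];              /* LUN or opcode-specific */
        uint32_t init_task_tag;       /* initiator task tag (big-endian on the wire) */
        uint8_t  opcode_specific[28];
    };
    #pragma pack(pop)
    /* 48 bytes total; header and data digests (CRC-32C), when negotiated,
     * follow the header and data segments on the wire. */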

The Value Of iSCSI Boot
- Storage consolidation: lower TCO, easier maintenance and replacement
- No need to replace a server blade for a hard-disk failure
- No disk on the blade/motherboard: space and power savings, smaller blades, higher density, simpler board design, no need for disk-specific mechanical restraints
- Higher reliability: hot replacement of disks if a disk fails, RAID protection of the boot disk, re-assignment of the disk to another server in case of server failure

WSD And RDMA
- Kernel bypass is attractive for High Performance Computing (HPC), databases, and any sockets application
- The WSD model supports RNICs with RDMA over Ethernet (a.k.a. iWARP)
- Because latency improvements are mainly due to kernel bypass, WSD is competitive with other RDMA-based technologies, e.g., InfiniBand
[Diagram: traditional model vs. WSD model – in the WSD model the WinSock switch routes sockets traffic via the WinSock SPI to the RDMA service provider and RDMA provider driver on the RNIC, bypassing the kernel TCP/IP transport driver and NDIS miniport used in the traditional path]
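
The appeal of WSD is that ordinary Winsock code needs no changes; a plain TCP client like the sketch below (standard Winsock calls, error handling omitted for brevity) can be routed by the WinSock switch over an RNIC's RDMA provider when one is present.

    /* Unmodified Winsock code: under WSD, the WinSock switch can route this
     * connection over an RNIC's RDMA provider with no application changes. */
    #include <winsock2.h>
    #include <ws2tcpip.h>
    #pragma comment(lib, "ws2_32.lib")

    int send_request(const char *host, const char *port, const char *msg, int len)
    {
        WSADATA wsa;
        struct addrinfo hints = {0}, *res = NULL;
        SOCKET s;

        WSAStartup(MAKEWORD(2, 2), &wsa);
        hints.ai_family   = AF_INET;
        hints.ai_socktype = SOCK_STREAM;
        getaddrinfo(host, port, &hints, &res);

        s = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        connect(s, res->ai_addr, (int)res->ai_addrlen);
        send(s, msg, len, 0);         /* kernel-bypass path if WSD + RNIC present */

        closesocket(s);
        freeaddrinfo(res);
        WSACleanup();
        return 0;
    }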

L2 Technology Can’t Efficiently Handle iSCSI And RDMA iSCSI HBA implementation concerns iSCSI Boot Digest overhead – CRC-32C Copy overhead – Zero Copy requires iSCSI protocol processing RDMA RNIC implementation concerns Throughput – high Software overhead for RDMA processing MPA – CRC-32C, Markers every 512B DDP/RDMA – protocol processing, zero copy, User mode interaction, special queues Minimal latency – Software processing doesn’t allow for kernel bypass Thus, for optimal performance specific offload is required

Convergence Over Ethernet: TOE, iSCSI, RDMA, Management
- Networking converges functions: multiple functions (SAN, LAN, IPC, management) can be consolidated onto a single fabric type
- Blade server storage connectivity (low cost); clustering (IPC, HPC)
- Consolidates ports; leverages Ethernet's pervasiveness, knowledge base, cost leadership, and volume
- Consolidates KVM over IP; leverages existing standard Ethernet equipment
- Lower TCO: one technology for multiple purposes
[Diagram: legacy model of separate NICs per function vs. the new model of a single C-NIC with TOE and RSS carrying LAN, cluster/IPC, block and file storage, and remote management traffic over standard Ethernet]

C-NIC Demo

C-NIC Hardware Design – Advantages/Challenges
- Performance – wire speed; find the right split between hardware and firmware
- Hardware for speed – e.g., connection lookup, frame validity, buffer selection, and offset computation; hardware connection lookup is significantly more efficient than software, and the 128-bit IPv6 address length exacerbates this (sketched below)
- Flexibility – firmware provides flexibility but may be slower than hardware; a specially optimized RISC CPU (it's not about MHz); accommodates future protocol changes, e.g., TCP ECN
- Minimal latency – from the wire to the application buffer (or from the application to the wire for Tx), without involving the CPU; flat ASIC architecture for minimal latency
- Scalability – 1G, 2.5G, 10G
- Zero-copy architecture – a match for server memory bandwidth and latency; any L2 solution incurs one or more additional copies
- Power goals – under 5 W per 1G/2.5G port, under 10 W per 10G port (for comparison, a server CPU consumes roughly 90 W)
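
A hypothetical software connection lookup, hashing the 4-tuple into a bucket chain, is sketched below; it is not the vendor's implementation, just a way to see why a per-packet lookup over 128-bit IPv6 addresses is cheaper in dedicated hardware than in host software.

    /* Illustrative software connection lookup: hash the 4-tuple, walk a bucket.
     * For IPv6 the tuple grows from 12 to 36 bytes, which is why hardware
     * CAM/hash lookup helps even more. Structures are hypothetical. */
    #include <stdint.h>
    #include <string.h>
    #include <stddef.h>

    struct conn {
        uint8_t  laddr[16], raddr[16];   /* IPv4-mapped or native IPv6 */
        uint16_t lport, rport;
        struct conn *next;               /* hash-bucket chain */
    };

    #define NBUCKETS 4096
    static struct conn *table[NBUCKETS];

    static uint32_t fnv1a(const void *p, size_t n, uint32_t h)
    {
        const uint8_t *b = (const uint8_t *)p;
        while (n--) { h ^= *b++; h *= 16777619u; }
        return h;
    }

    struct conn *conn_lookup(const uint8_t la[16], const uint8_t ra[16],
                             uint16_t lp, uint16_t rp)
    {
        uint32_t h = 2166136261u;
        h = fnv1a(la, 16, h); h = fnv1a(ra, 16, h);
        h = fnv1a(&lp, 2, h); h = fnv1a(&rp, 2, h);
        for (struct conn *c = table[h % NBUCKETS]; c; c = c->next)
            if (c->lport == lp && c->rport == rp &&
                !memcmp(c->laddr, la, 16) && !memcmp(c->raddr, ra, 16))
                return c;
        return NULL;                     /* miss: not an offloaded connection */
    }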

C-NIC Software Design – Advantages/Challenges
- Virtual Bus Driver: reconciles requests from all stacks – Plug and Play, reset, network control and speed, power
- Support of multiple stacks: resource allocation and management, resource isolation, run-time priorities, interface separation, interrupt moderation per stack, statistics

Summary: C-NIC Advantages
- TCP Offload Engine in hardware – better application performance, lower CPU utilization, and improved latency
- RDMA – for memory bandwidth and the lowest latency
- iSCSI – for networked storage and iSCSI Boot
- Flexible and efficient I/O for the data center of today and tomorrow

Brian Hausauer, Chief Architect, NetEffect, Inc. – BrianH@NetEffect.com (WinHEC 2005)

Today's Data Center
[Diagram: three separate fabrics per server – an Ethernet networking adapter for the LAN and NAS, a clustering adapter for Myrinet, Quadrics, InfiniBand, etc., and a Fibre Channel block storage adapter for the SAN, each with its own switch]

Datacenter Trends: Traffic Increasing 3x Annually
2006 I/O traffic requirements (typical per server):
- Front-end web servers – network: heavy (5-10 Gb/s); storage: 1.5-4 Gb/s; clustering/IPC: none; application requirements: pervasive standard, plug-n-play interop; total: 6.5-14.0 Gb/s
- Mid-tier application servers – network: intermediate (200-500 Mb/s); storage: <100 Mb/s; clustering/IPC: 2-4 Gb/s; application requirements: concurrent access, high throughput, low overhead; total: 2.3-4.6 Gb/s
- Back-end database servers – network: low (<200 Mb/s); storage: 3-6 Gb/s; clustering/IPC: 2-4 Gb/s; application requirements: fast access, low latency; total: 5.2-10.2 Gb/s
Sources: 2006 IA Server I/O Analysis, Intel Corporation; Oracle
- Scaling a 3-fabric infrastructure is expensive and cumbersome
- Server density complicates connections to three fabrics
- A successful solution must meet the different application requirements

High Performance Computing: Clusters Dominate
- Clusters continue to grow in popularity and now dominate the Top 500 fastest computers: 294 clusters in the Top 500
- Ethernet is the interconnect for over 50% of the top clusters
- Ethernet continues to increase its share as the cluster interconnect of choice for the top clusters in the world
[Chart: clusters in the Top 500 systems, 1997-2004, split into Ethernet-based clusters and all other clusters. Source: www.top500.org]

Next-generation Ethernet can be the solution
- Why Ethernet? Pervasive standard; multi-vendor interoperability; potential to reach high volumes and low cost; powerful management tools and infrastructure
- Why not? Ethernet does not meet the requirements for all fabrics; Ethernet overhead is the major obstacle
- The solution: iWARP extensions to Ethernet – industry-driven standards that address Ethernet's deficiencies, render Ethernet suitable for all fabrics at multi-Gb rates and beyond, and reduce cost, complexity, and TCO

Overhead & Latency In Networking
- Sources of CPU overhead: transport (TCP/IP) processing ~40%; intermediate buffer copies ~20%; application-to-OS command context switches ~40%
- Solutions: transport (TCP) offload in hardware; RDMA/DDP to eliminate intermediate buffer copies; user-level direct access / OS bypass to eliminate context switches
[Diagram: the path of an I/O command from the user application through the I/O library, kernel context switches, device driver, and OS/driver buffers to a standard Ethernet TCP/IP adapter, contrasted with the offloaded path in which the adapter implements TCP/IP and RDMA/DDP and places data directly into the application buffer]
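
Purely as a conceptual sketch (the rdma_* names are hypothetical, not a real verbs API), the receive path below shows how pre-registering and pre-posting the application's own buffer, with user-mode completion polling, removes the intermediate copies and context switches listed above.

    /* Conceptual RDMA receive path (hypothetical rdma_* helpers, not a real
     * verbs API): the application registers and pre-posts its own buffer, the
     * RNIC places payload directly into it, and completion is polled from
     * user mode, so the copies and context switches above disappear. */
    #include <stddef.h>

    typedef struct rdma_qp rdma_qp_t;            /* hypothetical queue pair handle */
    typedef struct { void *addr; size_t len; } rdma_region_t;

    /* Hypothetical primitives an RDMA-capable stack would expose. */
    rdma_region_t rdma_register(rdma_qp_t *qp, void *buf, size_t len);
    void          rdma_post_recv(rdma_qp_t *qp, rdma_region_t mr);
    int           rdma_poll_completion(rdma_qp_t *qp, size_t *bytes_received);

    size_t receive_direct(rdma_qp_t *qp, void *app_buf, size_t len)
    {
        /* Pin and register the application buffer once, up front. */
        rdma_region_t mr = rdma_register(qp, app_buf, len);
        rdma_post_recv(qp, mr);                  /* hand it to the RNIC */

        size_t got = 0;
        while (!rdma_poll_completion(qp, &got))
            ;                                    /* user-mode poll: no kernel entry */
        return got;                              /* data is already in app_buf */
    }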

Introducing NetEffect's NE01 Ethernet Channel Adapter (ECA)
- A single chip supports: transport (TCP) offload; RDMA/DDP; OS bypass / ULDA
- Meets requirements for: clustering (HPC, DBC, …); storage (file and block); networking
- Reduces overhead up to 100%
- Strategic advantages: patent-pending virtual pipeline and RDMA architecture; one die for all chips enables unique products for dual 10 Gb / dual 1 Gb
[Diagram: ECA block diagram – Ethernet ports (10 Gb or 1 Gb, copper or fibre), MACs, protocol engine, transaction switch, DDR2 SDRAM controller with external ECC DRAM, and a PCI Express or PCI-X host interface to the server chipset, memory, and host CPUs]

Future Server: Ethernet Channel Adapter (ECA) for a Converged Fabric
[Diagram: clustering, block storage, and networking OS/driver software using both O/S acceleration interfaces (RDMA accelerator – WSD, DAPL, VI, MPI; TCP accelerator; iSER) and existing interfaces (iSCSI, NIC), all mapped onto iWARP and TOE in the NetEffect ECA transaction switch and out to the Ethernet fabric(s)]
The NetEffect ECA delivers optimized file and block storage, networking, and clustering from a single adapter.

NetEffect ECA Architecture
[Diagram: host interface connected through a crossbar to basic networking, accelerated networking, clustering, and block storage engines, and out through the MACs]

NetEffect ECA Architecture: Networking
- Basic and accelerated networking paths: the sockets software stack, plus WSD, SDP, and TCP Accelerator interfaces over iWARP and TOE
- Related software standards: Sockets; Microsoft WinSock Direct (WSD); Sockets Direct Protocol (SDP)

NetEffect ECA Architecture: Storage
- Block storage path: iSER and R-NFS over iWARP; iSCSI and NFS over TOE; basic networking is used for connection setup/teardown and exceptions only
- Related software standards – file system: NFS, DAFS, R-NFS; block mode: iSCSI, iSER

NetEffect ECA Architecture: Clustering
- Clustering path: MPI and DAPL over the RDMA accelerator interfaces and iWARP; basic networking is used for connection setup/teardown and exceptions only
- Related software standards: MPI; DAPL API; IT API
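
Since MPI is one of the clustering APIs listed above, here is a minimal two-rank MPI program; on an iWARP RNIC the same unmodified code can move its messages over RDMA with kernel bypass. This is generic MPI, not NetEffect-specific.

    /* Minimal MPI ping between two ranks; build with an MPI compiler wrapper
     * (e.g., mpicc) and run with two processes. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, msg = 42;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %d\n", msg);
        }

        MPI_Finalize();
        return 0;
    }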

Tomorrow's Data Center: Separate Fabrics for Networking, Storage, and Clustering
[Diagram: the LAN, SAN, and clustering fabrics remain separate, but each migrates to iWARP Ethernet – replacing plain Ethernet for networking/NAS, Fibre Channel for block storage, and Myrinet, Quadrics, InfiniBand, etc. for clustering – with each server attaching through its networking, storage, and clustering adapters and a switch per fabric]

Fat Pipe for Blades & Stacks: Converged Fabric for Networking, Storage & Clustering
[Diagram: a single converged networking/storage/clustering adapter per server connects over iWARP Ethernet switches to the LAN/NAS, the SAN, and the clustering fabric]

Take-Aways
- Multi-gigabit networking is required for each tier of the data center
- Supporting multiple incompatible network infrastructures is becoming increasingly difficult as budget, power, cooling, and space constraints tighten
- With the adoption of iWARP, Ethernet for the first time meets the requirements for all connectivity within the data center
- NetEffect is developing a high-performance iWARP Ethernet Channel Adapter that enables the convergence of clustering, storage, and networking

Call to Action
- Deploy iWARP products for convergence of networking, storage, and clustering
- Deploy 10 Gb Ethernet for fabric convergence
- Develop applications to RDMA-based APIs for maximum server performance

Resources
- NetEffect: www.NetEffect.com
- iWARP Consortium: www.iol.unh.edu/consortiums/iwarp/
- Open Group (authors of ITAPI, RNIC PI & Sockets API Extensions): www.opengroup.org/icsc/
- DAT Collaborative: www.datcollaborative.org
- RDMA Consortium: www.rdmaconsortium.org
- IETF RDDP WG: www.ietf.org/html.charters/rddp-charter.html

Community Resources
- Windows Hardware and Driver Central (WHDC): www.microsoft.com/whdc/default.mspx
- Technical Communities: www.microsoft.com/communities/products/default.mspx
- Non-Microsoft Community Sites: www.microsoft.com/communities/related/default.mspx
- Microsoft Public Newsgroups: www.microsoft.com/communities/newsgroups
- Technical Chats and Webcasts: www.microsoft.com/communities/chats/default.mspx, www.microsoft.com/webcasts
- Microsoft Blogs: www.microsoft.com/communities/blogs