LHCONE NETWORK SERVICES: GETTING SDN TO DEV-OPS IN ATLAS
Shawn McKee / Univ. of Michigan
LHCONE/LHCOPN Meeting, Taipei, Taiwan, March 14th, 2016


Context for this Presentation

Within LHCONE we have had a point-to-point service effort for quite a while.
–It has been challenging to make progress beyond a few limited demonstrations.
Within the LHC experiments there has been interest in what might be possible with networking, and especially in how a future production-quality software-defined networking (SDN) capability would fit with the way the experiments manage, operate, and orchestrate their globally distributed resources.
–Network device support for SDN has not really been "production quality", and it has been hard to interest the experiments in even testing it because of problems getting anything enabled between the sites of interest.
How best to make some progress?

Challenges Getting SDN into LHC Production Systems

While we have dabbled as a community for years with various SDN capabilities, we have never managed to effectively bridge the gap into the core LHC experiment middleware and workflow systems. Why?
–The experiments have their own "stove-pipes" of effort, and there hasn't been much interaction with networking.
–The experiments have focused on what they perceive as the bigger problems they must face; we have helped ensure the network has been the most reliable and capable component of their distributed infrastructure.
–Our test implementations are typically one-offs designed to demonstrate features and capabilities, not easily translated into use with existing production systems.
–SDN itself (both software and hardware) has not been near "production quality" to date. This is improving rapidly: new hardware/chipsets are much more capable, and software usability problems are being fixed.

Getting SDN to the Ends using Dev-Ops

To make progress with SDN capabilities for LHC we need to start focusing on enabling new SDN features in production instances, blending production and development.
–Software and technology development has called doing this "dev-ops" (Development and Operations).
One shortcoming in the P2P effort to date has been the significant challenge of getting all the way to the ends: to the servers that source and sink our data.
–We have been able to create WAN circuits, but it then gets "messy" to arrange for those circuits to actually carry the right traffic for production activities.
We now have an interesting option to help us: Open vSwitch (openvswitch.org).
–This is well-tested, supported software that creates virtual switches on Linux (and other OSes) with traffic control and shaping as well as OpenFlow and OVSDB support.
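To make this concrete, a minimal sketch (not from the presentation) of what OVS provides: an OVS bridge with OpenFlow support can be brought up with a few ovs-vsctl commands. The bridge name and controller address below are placeholders.

    # Minimal sketch: create a virtual switch and attach an OpenFlow controller.
    # br0 and the controller address are illustrative values, not from the talk.
    ovs-vsctl add-br br0                             # create the OVS bridge
    ovs-vsctl set-controller br0 tcp:192.0.2.1:6653  # point it at an OpenFlow controller
    ovs-vsctl set bridge br0 fail_mode=standalone    # keep forwarding if the controller is lost
    ovs-vsctl show                                   # verify the configuration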

Details on Deploying Open vSwitch (OVS)

There is a web page on the wiki below documenting both the creation of RPMs for RedHat/CentOS/SL 6.x and their deployment onto existing hosts:
–
–This web site will soon provide detailed, tested configuration examples for implementing OVS on hosts with various types of network configuration (bonded, VLANs, multiple interfaces, etc.).
The idea is to move your system's IP addresses off their existing physical (or virtual OS) NICs and onto the OVS bridge you bring up. OVS can be installed and turned on without any impact to the running system (install RPM, activate service).
–It is actually moving the IP that is potentially disruptive and must be done with some care. The URL above has details.
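For a simple single-NIC host, the migration step the slide warns about might look roughly like the following sketch (interface eth0, bridge br0 and the documentation addresses are assumed names; the wiki page referenced above is the authoritative recipe, especially for bonded or VLAN setups):

    # Sketch of moving the host IP from a physical NIC onto an OVS bridge.
    # This is the disruptive step: run it from a console, not over eth0 itself.
    ovs-vsctl add-br br0                  # bring up the bridge (non-disruptive)
    ovs-vsctl add-port br0 eth0           # enslave the physical NIC to the bridge
    ip addr flush dev eth0                # drop the IP from the physical NIC
    ip addr add 192.0.2.10/24 dev br0     # re-add the host IP on the bridge
    ip link set br0 up
    ip route add default via 192.0.2.1    # restore the default route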

Advantages of OVS on Production Instances

By getting OVS in place on LHC production storage systems we immediately gain visibility and control all the way to the sources and sinks of LHC data-flows.
We have verified that OVS has almost no measurable impact when shaping traffic on 10G NICs (see Ramiro's presentation, LHCONE-AM_SDN_OVS_rv1.pdf, at the last LHCONE meeting).
Having OVS running on production systems, with the IPs moved to the OVS bridge, allows us to continue to operate all production services identically to how they were operated prior to installation and configuration.
–The big win is that we can start to do simple tests incorporating specific flows or sets of servers into end-to-end circuits.
–Gradually, we can verify the impact of using such capabilities with LHC production systems and, if positive, that makes a strong argument for other sites to begin joining the effort.
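For reference, egress shaping of the kind measured in that test can be configured on an OVS port along these lines (a sketch using OVS's linux-htb QoS type; the port name and the 5 Gbit/s rate are illustrative, not the values used in the measurement):

    # Illustrative egress shaping on a port attached to an OVS bridge.
    # Rates are in bits per second (here capped at 5 Gbit/s).
    ovs-vsctl set port eth0 qos=@shaped -- \
      --id=@shaped create qos type=linux-htb \
        other-config:max-rate=5000000000 queues:0=@q0 -- \
      --id=@q0 create queue other-config:max-rate=5000000000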

Diagram of a Possible Future SDN Dev-Ops Testbed

[Diagram, original slide from Ramiro/Azher, Caltech: Sites A and B each run an agent and an OVS-enabled transfer node (OVS + FDT/GridFTP), joined through site-dependent "OVS tails" to the LHCONE point-to-point multi-domain fabric (network service agents NSA_1 … NSA_N, with service termination points STP A and STP B). A PanDA/DaTri agent sits on the control plane above the data plane. Workflow: 1) request a WAN circuit, 2) integrate the circuit with OVS, 3) transfer on the new end-to-end path. The agents are in development; the fabric, transfer nodes and interfaces are currently in place.]

Challenges

While having OVS "at the ends" will be a huge step forward for our point-to-point work, there remain a number of challenges.
The primary challenge is integrating existing circuit creation systems with OVS as a participant.
–How can we incorporate the OVS-enabled end-systems seamlessly into the end-to-end circuit?
How best to use the many OVS features to improve the overall performance of the circuit?
The main "meta-question": how can SDN capabilities improve the LHC experiments' ability to manage, utilize and optimize their global infrastructure?
–There is a lot of work to do to investigate this: getting SDN "in-line" with production LHC work is our first step!
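One conceivable approach to that integration (a hypothetical sketch, not something prescribed by the talk) is to steer only the circuit's traffic onto a circuit-facing port with OpenFlow rules, leaving everything else on the default path:

    # Hypothetical sketch: traffic destined for a remote storage subnet is
    # tagged with the circuit VLAN and sent out a circuit-facing port; all
    # other traffic keeps normal L2 behavior. The subnet, VLAN id and port
    # name are illustrative placeholders.
    ovs-vsctl add-port br0 circuit0
    ovs-ofctl add-flow br0 \
      "priority=100,ip,nw_dst=203.0.113.0/24,actions=mod_vlan_vid:3985,output:circuit0"
    ovs-ofctl add-flow br0 "priority=0,actions=NORMAL"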

Next Steps

Finalize testing of the OVS configuration to support various network setups.
AGLT2 (Michigan, Michigan State) and MWT2 (Illinois, Indiana and University of Chicago) have agreed to deploy OVS onto their ATLAS dCache storage systems.
–A total of 8.7 petabytes of storage between the two.
–Most systems are dual-10G connected; the sites have 80 Gbits/s to the WAN.
–This will provide an example to experiment with SDN end-to-end using real ATLAS production traffic.
We want to expand as soon as is feasible. Interest from:
–DE-KIT
–Possible Canadian participation
–Seeking additional sites with real use cases (at least one more in North America and one in Europe)
Timescale: April-May 2016 for initial tests (assumes documentation and initial OVS configurations are documented and tested by end of March).
Contact Shawn McKee if your site is interested in participating.

QUESTIONS & COMMENTS
Shawn McKee