**DRAFT** Doctor+Congress OPNFV Summit 2016 15 June 2016 Doctor+Congress PoC team.


PoC Demo Description

Doctor: fast and dynamic fault management in OpenStack (DOCOMO, NTT, NEC, Nokia, Intel)

Doctor is an OPNFV project implementing a fault management framework for high service availability in OpenStack. The framework offers two options for handling fault events so that users are notified immediately: OpenStack Congress and OpenStack Vitrage.
–Congress is the policy evaluation engine that enforces a flexible and dynamic failure identification policy defined according to the operator's demands.
–Vitrage is a new root cause analysis engine for organizing, analyzing and expanding OpenStack alarms and events, yielding insights into the root cause of problems and deducing the existence of faults before they are directly detected.

This PoC shows how quickly fault recovery is performed with these two options, thereby ensuring the required service availability of telecom nodes.

Demo Scenario

[Figure: demo architecture diagram; only the labels are recoverable.]
Component labels: Manager, Application, Aodh, Alarm Conf., Ceilometer, Controller, Neutron, Congress, Resource Map, Failure Policy, Doctor Driver, Monitor, Collectd, Doctor Plugin, Virtualized Infrastructure (Resource Pool), VM0, VM1, VM2, Port0, Port1, Port2, Bonding, DPDK Switch, SR-IOV, NIC 0, NIC 1, NIC 2.
Flow labels: 0. Setup; 1. Set Alarm on Port event; 2. Monitor; 3. Notify Raw Failure; 4. Find Affected; 5. Update State; 6. Notify all; 7. Notify Error.
Policy A: When both NICs are down, propagate the error to the status of the ports connected to the DPDK switch.
Policy B: When one NIC is down, propagate the error to the status of the ports connected to the DPDK switch.
Annotation: API extension for Port state update.

Notification Logic for Demonstration

0. Setup: make Congress fetch Neutron Port info (vif_type and hostname) periodically.
1. Create the Aodh alarm definition, specifying the notification URI of the App Manager, the Neutron Port ID and the context of the port update event (Trigger Error).
2. collectd gets the NIC status from DPDK.
3. If collectd detects a failure(*), the Doctor Plugin of collectd posts a doctor event (containing vif_type=normal and hostname) to Congress.
4. Congress (policy engine) evaluates the policy against the received event and finds the affected ports.
5. Congress enforces the policy by marking the status of the affected ports as down (using the Neutron driver).
6. Neutron notifies Ceilometer of the port update events.
7. Aodh fires notifications of those port failures to the Manager.
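Step 3 of the notification logic can be illustrated with a minimal sketch of how a monitor plugin might assemble the doctor event before sending it to Congress. The slides only state that the event carries `vif_type` and `hostname`; the `time` and `type` fields, the helper names `build_doctor_event` and `encode_events`, and the endpoint path in the comment are assumptions made for illustration, not the PoC's actual code.

```python
import json
from datetime import datetime, timezone

# Assumed path for pushing rows into a Congress "doctor" datasource.
# It is not given in the slides and is shown only for orientation.
ASSUMED_EVENTS_PATH = "/v1/data-sources/doctor/tables/events/rows"


def build_doctor_event(hostname, vif_type, event_type, when=None):
    """Return one doctor event as a plain dict (hypothetical layout)."""
    when = when or datetime.now(timezone.utc)
    return {
        "time": when.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "type": event_type,      # e.g. "host.nic1.down", as used in the rules
        "hostname": hostname,    # matched against the port's host in Congress
        "vif_type": vif_type,    # matched against the port's vif_type
    }


def encode_events(events):
    """Serialize a list of events to a JSON request body (bytes)."""
    return json.dumps(list(events)).encode("utf-8")


if __name__ == "__main__":
    event = build_doctor_event("compute-1", "normal", "host.nic1.down")
    print(ASSUMED_EVENTS_PATH)
    print(encode_events([event]).decode("utf-8"))
```

The sketch stops at building the request body, so it runs without a Congress endpoint; sending it would additionally need authentication details that the slides do not describe.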

Difference from the PoCs so far
–Event-driven notification with DPDK stats and collectd
–Failure mapping in logical resource view

[Figure: logical resource view with the levels Host (Hypervisor), Server (VM) and Port (vNIC), annotated "OPNFV Summit 2015, OpenStack Summit" and "OPNFV Summit 2016".]

Options (Backup Plans / Improvements)

Option A. VM0 uses another port of the DPDK Switch instead of SR-IOV (the networks are divided by VLAN)
Option B. VM1 and VM2 use normal Open vSwitch instead of the DPDK Switch
Option C. Map the raw failure to a Nova Instance instead of a Neutron Port
Option D. Monitor packet loss instead of the link state of the pNIC port (Improvements)

Alarm in Aodh

    aodh alarm create -t event --name "NICFailureAlarm" \
      --alarm-action \
      --description "NIC failure" \
      --enabled True \
      --repeat-actions False \
      --severity "moderate" \
      --event-type port.update.end \
      --query "traits.forced_down=string::True;traits.resource_id=string:: "
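The alarm above fires when an event of type `port.update.end` satisfies every condition in the `--query` string. The following is a simplified illustration of that matching, not Aodh's implementation: it handles only `=` comparisons on string traits, and the helper names, the event layout and the port ID `port-1` are hypothetical.

```python
def parse_query(query):
    """Split 'field=type::value;...' into (field, type, value) tuples."""
    conditions = []
    for clause in query.split(";"):
        field, _, rest = clause.partition("=")
        type_name, _, value = rest.partition("::")
        conditions.append((field.strip(), type_name.strip(), value.strip()))
    return conditions


def event_matches(event, event_type, conditions):
    """Return True if the event has the given type and meets all conditions."""
    if event.get("event_type") != event_type:
        return False
    for field, _type_name, expected in conditions:
        prefix, _, trait = field.partition(".")
        if prefix != "traits":
            return False  # this sketch only understands trait conditions
        if str(event.get("traits", {}).get(trait)) != expected:
            return False
    return True


if __name__ == "__main__":
    # "port-1" is a made-up resource ID used only for this example.
    query = "traits.forced_down=string::True;traits.resource_id=string::port-1"
    event = {
        "event_type": "port.update.end",
        "traits": {"forced_down": "True", "resource_id": "port-1"},
    }
    print(event_matches(event, "port.update.end", parse_query(query)))
```

Restricting the alarm to one `resource_id` means one alarm definition per monitored port, which matches step 1 of the notification logic.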

Rules in Congress

[All bonded NIC ports down]

    execute[neutronv2:force_down_port(portid)] :-
        neutronv2:ports(id=portid, hostid=hostname, vif_type=viftype),
        doctor:events(hostname=hostname,vif_type=viftype,type="host.nic1.down"),
        doctor:events(hostname=hostname,vif_type=viftype,type="host.nic2.down")

[One NIC port down]

    execute[neutronv2:force_down_port(portid)] :-
        neutronv2:ports(id=portid, hostid=hostname, vif_type=viftype),
        doctor:events(hostname=hostname,vif_type=viftype,type="host.nic1.down")

    execute[neutronv2:force_down_port(portid)] :-
        neutronv2:ports(id=portid, hostid=hostname, vif_type=viftype),
        doctor:events(hostname=hostname,vif_type=viftype,type="host.nic2.down")
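The effect of the two rule sets can be checked with a small Python sketch that mimics the joins by hand. It is an illustration of the rule logic under assumed data shapes, not how Congress evaluates Datalog; the function name `ports_to_force_down` and the sample host, port and `vif_type` values are hypothetical.

```python
def ports_to_force_down(ports, events, policy):
    """Return IDs of ports that the given policy would force down.

    ports:  dicts with "id", "hostid", "vif_type" (mirrors neutronv2:ports)
    events: dicts with "hostname", "vif_type", "type" (mirrors doctor:events)
    policy: "A" = both bonded NICs down, "B" = either NIC down
    """
    seen = {(e["hostname"], e["vif_type"], e["type"]) for e in events}
    selected = []
    for port in ports:
        key = (port["hostid"], port["vif_type"])
        nic1_down = (key + ("host.nic1.down",)) in seen
        nic2_down = (key + ("host.nic2.down",)) in seen
        if policy == "A":
            hit = nic1_down and nic2_down   # single rule with two event joins
        elif policy == "B":
            hit = nic1_down or nic2_down    # two rules, one per NIC
        else:
            raise ValueError("policy must be 'A' or 'B'")
        if hit:
            selected.append(port["id"])
    return selected


if __name__ == "__main__":
    ports = [
        {"id": "p1", "hostid": "host1", "vif_type": "normal"},
        {"id": "p2", "hostid": "host2", "vif_type": "normal"},
    ]
    events = [{"hostname": "host1", "vif_type": "normal", "type": "host.nic1.down"}]
    print(ports_to_force_down(ports, events, "A"))
    print(ports_to_force_down(ports, events, "B"))
```

With a single `host.nic1.down` event, only the "one NIC port down" rules select a port; the bonded case waits until both NIC events are present for the same host and `vif_type`.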

Main Monitor (40 inch)
–Globe
–App Status
–App Manager Log (sample columns: time, Recovery, mon; sample value: 0.02)

Sub Monitor (Laptop)
–Console
–Slide Deck
–Horizon
  –VM List w/ status
  –Congress Rules (TBC)