Doctor + OpenStack Congress

Presentation transcript:

Doctor + OpenStack Congress
NTT DOCOMO, NEC, Intel

PoC Demo Description
Doctor: fast and dynamic fault management in OpenStack (DOCOMO, NTT, NEC, Nokia, Intel)
Doctor is an OPNFV project that implements a fault management framework for high service availability in OpenStack. The framework offers two options for handling fault events so that users are notified immediately: OpenStack Congress and OpenStack Vitrage. Congress is a policy evaluation engine that enforces flexible, dynamic failure identification policies defined according to the operator's demands. Vitrage is a new root cause analysis engine that organizes, analyzes and expands OpenStack alarms and events, yields insights into the root cause of problems, and deduces the existence of faults before they are directly detected. This PoC shows how quickly fault recovery is performed with these two options, thereby ensuring the required service availability of telecom nodes.
http://events.linuxfoundation.org/events/opnfv-summit/extend-the-experience/opnfv-poc-zone

Demo Scenario
[Diagram. Labels: Application Manager; Controller (Neutron, Aodh, Ceilometer, Resource Map, Alarm Conf., API extension for Port state update); Congress (Doctor Driver, Failure Policy); Monitor (Link Monitor Script); Virtualized Infrastructure (Resource Pool): VM0, VM1, VM2; Port0, Port1, Port2; Bridge, Bridge; NIC0, NIC1, NIC2 (Bonding)]
Steps shown in the diagram:
1. Set Alarm on Port event
2. Monitor
3. Notify Raw Failure
4. Find Affected
5. Update State
6. Notify all
7. Notify Error

Notification Logic for Demonstration
Setup:
- Make Congress fetch Neutron port info (vif_type and hostname) periodically.
- Create a Ceilometer/Aodh alarm definition, specifying the notification URI of the App manager, the Neutron port ID and the context of the port update event.
Trigger Error:
- The monitor gets the NIC status from the compute host.
- If a fault is observed, the monitor reports the fault event (as NIC down) to Congress.
- Congress (policy engine) evaluates the policy against the received event and finds the affected ports.
- Congress sends a request to update the status of the affected ports to down (using the Neutron driver).
- Neutron updates the status of the ports and notifies Ceilometer/Aodh of these port updates.
- Ceilometer/Aodh fires the alarm notifications for these port failures to the Manager.
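The monitor side of this flow can be sketched in Python. This is a minimal illustration, not the PoC's actual monitor script: the function names and the mapping from device names to labels are hypothetical, and the event fields (type, hostname, physical_network) are taken from the columns used by the Congress policy on the Configurations slide. It assumes a Linux compute host, where the kernel exposes link state in /sys/class/net/<device>/operstate. Sending the event to Congress is left out, since the transcript does not show that call.

```python
import tempfile
from pathlib import Path


def read_nic_state(device, sysfs_root="/sys/class/net"):
    """Return the kernel's operstate string for a NIC, e.g. "up" or "down"."""
    return (Path(sysfs_root) / device / "operstate").read_text().strip()


def build_fault_event(hostname, nic_label, physnet):
    """Build a raw failure event (hypothetical layout mirroring doctor:events)."""
    return {
        "type": f"host.{nic_label}.down",
        "hostname": hostname,
        "physical_network": physnet,
    }


def check_nics(hostname, nics, sysfs_root="/sys/class/net"):
    """Return one fault event per NIC whose operstate is "down".

    `nics` maps a device name to a (label, physical network) pair;
    this mapping is an assumption made for the sketch.
    """
    events = []
    for device, (label, physnet) in nics.items():
        if read_nic_state(device, sysfs_root) == "down":
            events.append(build_fault_event(hostname, label, physnet))
    return events


if __name__ == "__main__":
    # Demonstrate against a fake sysfs tree so no real NIC is needed.
    with tempfile.TemporaryDirectory() as root:
        for device, state in (("eth1", "down"), ("eth2", "up")):
            (Path(root) / device).mkdir()
            (Path(root) / device / "operstate").write_text(state + "\n")
        nics = {"eth1": ("nic1", "default"), "eth2": ("nic2", "default")}
        print(check_nics("compute1", nics, sysfs_root=root))
```

Passing sysfs_root as a parameter keeps the check testable without touching real hardware.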

Difference from the PoCs so far
Alarming "Single Point of Failure"
Failure mapping in the logical resource view:
@ OPNFV Summit 2015, OpenStack Summit Austin: Host (Hypervisor), Server (VM)
@ OPNFV Summit 2016: Port (vNIC), Network

Configurations

Policies in Congress

    nic_down(host, physnet) :-
        doctor:events(hostname=host, physical_network=physnet, type="host.nic1.down")
    nic_down(host, physnet) :-
        doctor:events(hostname=host, physical_network=physnet, type="host.nic2.down")
    execute[neutronv2:force_down_port(port)] :-
        neutron:ports(id=port, host_id=host, network_id=net),
        neutronv2:networks(id=net, physical_network="default"),
        nic_down(hostname=host, physical_network=physnet)

Alarm Definition in Aodh

    aodh alarm create -t event --name "NICFailureAlarm" --event-type port.update.end \
        --query "traits.forced_down=string::True;traits.resource_id=string::<Neutron Port ID>" \
        --alarm-action <URI to notify>

(…)
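To make the join performed by the Congress policy easier to follow, here is a small Python sketch that mimics it over in-memory rows. It is only an illustration of the rule's logic, not how Congress evaluates Datalog. The function names and the dictionary row layout are hypothetical; the column names come from the policy above. As in the rule as written, the network is matched against the literal "default" rather than against the physical network reported in the event.

```python
def nic_down(events):
    """Return (host, physnet) pairs that reported host.nic1.down or host.nic2.down."""
    down_types = {"host.nic1.down", "host.nic2.down"}
    return {
        (event["hostname"], event["physical_network"])
        for event in events
        if event["type"] in down_types
    }


def ports_to_force_down(events, ports, networks):
    """Return the sorted IDs of ports the execute[...] rule would select.

    A port is selected when its host reported a NIC-down event and its
    network has physical_network == "default".
    """
    failed_hosts = {host for host, _physnet in nic_down(events)}
    default_nets = {
        net["id"] for net in networks if net["physical_network"] == "default"
    }
    return sorted(
        port["id"]
        for port in ports
        if port["host_id"] in failed_hosts and port["network_id"] in default_nets
    )


if __name__ == "__main__":
    events = [
        {"type": "host.nic1.down", "hostname": "compute1", "physical_network": "default"},
    ]
    ports = [
        {"id": "port0", "host_id": "compute1", "network_id": "net0"},
        {"id": "port1", "host_id": "compute2", "network_id": "net0"},
    ]
    networks = [{"id": "net0", "physical_network": "default"}]
    print(ports_to_force_down(events, ports, networks))
```

In the demo flow, each selected port ID would then be passed to the force-down action, which in turn produces the port.update.end event that the Aodh alarm is waiting for.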