Ashiq Khan, NTT DOCOMO Ryota Mibu, NEC


1 Doctor's achievements and what else you can do with it!
OPNFV Summit 2017
Ashiq Khan, NTT DOCOMO
Ryota Mibu, NEC

2 Doctor Project
Feature project to build fault management and maintenance framework
Diagram labels: Applications (App); VIM User and Administrator; Virtualized Infrastructure Manager (VIM) = OpenStack; Virtualized Infrastructure (Virtual Compute, Virtual Storage, Virtual Network, Virtualization Layer, Hardware Resources); Doctor Scope

3 Assumption of VNF (NFV Application)
- Telco applications are generally deployed in active-standby or active-active fashion
- App state needs to be switched when a failure occurs
- App and App Manager (VNFM) cannot detect HW failures directly
Diagram labels: App (Active), App (Standby); VM, VM; Machine, Machine

4 Use Case 1: Fault management
1. Fault Monitoring
   - Hardware fault
   - Hypervisor fault
   - Host OS fault
2. Inform the Consumer? If YES, find owner of affected VMs from database
3. FaultNotification (VM ID, Fault ID)
4. Switch to SBY configuration
5. Instruction (VM ID)
6. Execute Instruction - e.g. migrate VM
Resource Map (Server – VM mapping)
   Server S1: VM-1, VM-2
   Server S2: VM-7
   Server S3: VM-4
Ownership information
   Consumer C1: VM-1, VM-7
   Consumer C2: VM-2
   Consumer C3: VM-4
Diagram labels: Consumer C1, Consumer C2, Consumer C3; OpenStack Northbound Interface; Virtualized Infrastructure Manager (VIM), e.g. OpenStack; Resource Pool; Hypervisor on each of Hardware Server S1, S2, S3; the fault is marked with an X
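
The lookup in step 2 and the notification in step 3 can be sketched as follows. This is only an illustration, assuming the Resource Map and ownership table are plain dictionaries; `FaultNotification`, `notifications_for_fault` and the fault ID string are hypothetical names, not part of OpenStack or Doctor. The server, VM and consumer values are the ones shown on the slide.

```python
from dataclasses import dataclass

# Values copied from the slide's Resource Map and ownership table.
SERVER_TO_VMS = {"S1": ["VM-1", "VM-2"], "S2": ["VM-7"], "S3": ["VM-4"]}
VM_OWNER = {"VM-1": "C1", "VM-7": "C1", "VM-2": "C2", "VM-4": "C3"}


@dataclass(frozen=True)
class FaultNotification:
    """Hypothetical message for step 3: FaultNotification (VM ID, Fault ID)."""
    consumer: str
    vm_id: str
    fault_id: str


def notifications_for_fault(server, fault_id):
    """Step 2: find the VMs on the faulty server and the consumer owning each."""
    return [
        FaultNotification(VM_OWNER[vm_id], vm_id, fault_id)
        for vm_id in SERVER_TO_VMS.get(server, [])
    ]


if __name__ == "__main__":
    # A fault on Server S1 affects VM-1 (Consumer C1) and VM-2 (Consumer C2).
    for notification in notifications_for_fault("S1", "hypervisor-fault"):
        print(notification)
```

An unknown server simply yields no notifications, which keeps the sketch free of error handling that the slide does not describe.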

5 Use Case 2: Maintenance
1. Maintenance Request (Server S3)
2. Which VMs are affected? Find Consumer owning the VM(s) from the database.
3. Maintenance Notification (VM ID)
4. Switch to SBY configuration
5. Instruction (VM ID)
6. Execute Instruction - e.g. migrate VM
Resource Map (Server – VM mapping)
   Server S1: VM-1, VM-2
   Server S2: VM-7
   Server S3: VM-4
Ownership information
   Consumer C1: VM-1, VM-7
   Consumer C2: VM-2
   Consumer C3: VM-4
Diagram labels: Administrator; Consumer C1, Consumer C2, Consumer C3; OpenStack Northbound Interface; Virtualized Infrastructure Manager (VIM), e.g. OpenStack; Resource Pool; Hypervisor on each of Hardware Server S1, S2, S3

6 Doctor Achievements
Design
- Requirement Document
- To-be-architecture
- Figured out basic use cases and minimal requirements in fault management
Implementation
- Open Source Project Mapping
- Gap Analysis
- Work in Upstream (OpenStack)
- Key features are available in OpenStack
- Good example of how to upstream
Integration and testing
- Functest / Installer
- Doctor CI
- Performance test support (Profiler)
- OPNFV users can test fault management scenario quickly with performance profile

7 Demo @ OpenStack Summit Barcelona
vEPC Failover keeping phone call session online

8 Key Requirements as VIM
- Consistent Resource State Awareness
- Immediate Notification
- Extensible Monitoring
- Fault Correlation

9 Doctor Architecture and Typical Scenario
0. Set Alarm
1. Raw Failure
2. Find Affected
3. Update State
4. Notify all
5. Notify Error
6-. Action
Diagram components: Application; Manager; Virtualized Infrastructure (Resource Pool); Controller (Resource Map); Notifier (Alarm Conf.); Inspector (Failure Policy); Monitor
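
The scenario steps can be read as a small event pipeline. The sketch below is a toy model of that flow, not Doctor's or OpenStack's code: the class and method names, the `"error"` state and the policy format are assumptions made for illustration, and the Manager is reduced to a callback.

```python
class Notifier:
    """Keeps the alarm configuration and delivers notifications."""

    def __init__(self):
        self.alarm_conf = {}  # vm_id -> list of callbacks

    def set_alarm(self, vm_id, callback):
        # Step 0: the Manager registers interest in a VM.
        self.alarm_conf.setdefault(vm_id, []).append(callback)

    def notify(self, vm_id, state):
        # Step 5: notify every Manager that set an alarm on this VM.
        for callback in self.alarm_conf.get(vm_id, []):
            callback(vm_id, state)


class Controller:
    """Owns the resource map and the resource state."""

    def __init__(self, resource_map, notifier):
        self.resource_map = resource_map
        self.notifier = notifier
        self.state = {vm: "active" for vms in resource_map.values() for vm in vms}

    def update_state(self, vm_id, state):
        self.state[vm_id] = state           # Step 3: update state
        self.notifier.notify(vm_id, state)  # Step 4: notify all


class Inspector:
    """Applies the failure policy to raw failures reported by Monitors."""

    def __init__(self, controller, failure_policy):
        self.controller = controller
        self.failure_policy = failure_policy  # failure type -> state to set

    def on_raw_failure(self, host, failure_type):
        # Step 1: raw failure arrives; ignore types the policy does not cover.
        new_state = self.failure_policy.get(failure_type)
        if new_state is None:
            return
        # Step 2: find the affected VMs on the failed host.
        for vm_id in self.controller.resource_map.get(host, []):
            self.controller.update_state(vm_id, new_state)


def run_demo():
    received = []
    notifier = Notifier()
    controller = Controller({"S1": ["VM-1", "VM-2"], "S2": ["VM-7"]}, notifier)
    inspector = Inspector(controller, {"link down": "error"})
    notifier.set_alarm("VM-1", lambda vm_id, state: received.append((vm_id, state)))
    inspector.on_raw_failure("S1", "link down")  # a Monitor would make this call
    return received, controller.state


if __name__ == "__main__":
    print(run_demo())
```

In this sketch only VM-1 has an alarm set, so the Manager hears about VM-1 alone even though VM-2 is also marked as failed; step 6 (the action) is left to the callback.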

10 Doctor Architecture and Typical Scenario
0. Set Alarm
1. Raw Failure
2. Find Affected
3. Update State
4. Notify all
5. Notify Error
6-. Action
Diagram components: Application; Manager; Virtualized Infrastructure (Resource Pool); Controller (Resource Map); Notifier (Alarm Conf.); Inspector (Failure Policy); Monitor
Key requirements overlaid on the diagram: Consistent Resource State Awareness; Immediate Notification; Extensible Monitoring; Fault Correlation

11 Doctor OSS Map
Scenario steps: 0. Set Alarm; 1. Raw Failure; 2. Find Affected; 3. Update State; 4. Notify all; 5. Notify Error; 6-. Action
Open source projects mapped onto the architecture:
- Controller (Resource Map): Nova, Neutron, Cinder
- Notifier (Alarm Conf.): Ceilometer/Aodh
- Inspector (Failure Policy): Congress
- Monitor: e.g. Zabbix
Other diagram labels: Application; Manager; Virtualized Infrastructure (Resource Pool)

12 Analyzed Gaps and Development Items
Scenario steps: 0. Set Alarm; 1. Raw Failure; 2. Find Affected; 3. Update State; 4. Notify all; 5. Notify Error; 6-. Action
Development items marked on the diagram:
- Event Alarm
- State Correction
- Event-driven Inspection
Diagram components: Application; Manager; Virtualized Infrastructure (Resource Pool); Controller (Resource Map): Nova, Neutron, Cinder; Notifier (Alarm Conf.): Ceilometer/Aodh; Inspector (Failure Policy): Congress; Monitor: e.g. Zabbix

13 Doctor Blueprints in OpenStack
Table columns: Project, Blueprint, Spec Drafter, Developer, Status
Ceilometer/Aodh
- Event Alarm Evaluator: Ryota Mibu (NEC); Completed (Liberty)
Nova
- New nova API call to mark nova-compute down: Tomi Juvonen (Nokia); Roman Dobosz (Intel)
- Support forcing service down: Carlos Goncalves (NEC)
- Get valid server state: Completed (Mitaka)
- Add notification for service status change: Balazs Gibizer (Ericsson)
Congress
- Push Type DataSource Driver: Masahito Muroi (NTT)
- Adds Doctor Driver: Completed (Newton)
Neutron
- Port data plane status: Completed (Pike)

14 Further Technical Information can be found …
- OPNFV Doctor Wiki Page
- Deliverables/Documents
- Presentation Slides
- Doctor blueprint tracker

15 Doctor Integration

16 Doctor CI Job
0. Patch uploaded
1. Trigger Jenkins
2. OPNFV Deploy (Not triggered now)
3. Launch Functest container
4. Trigger Feature Specific Testing Code
5. Run test
6. Report result
7. Store Logs
Note: We are now using pre-deployed OPNFV PoD
Teams and roles:
- Pharos / Infra team: putting infra in place
- Releng: CI Control
- Testing Team: Test Coordination
- Functest: Testing Framework
- Apex/Fuel: Integration
- Doctor: Feature Dev.
Diagram labels: Gerrit; Jenkins; Tester; Installer; TestCase; TestDB; Artifact

17 Testing tool enhancements
Multi-VM support -> Performance test
    VM_BASENAME=doctor_vm
    VM_FLAVOR=m1.tiny
    VM_COUNT=${VM_COUNT:-1}
Performance Profiler
User/Project Option -> RBAC check
    DOCTOR_USER=doctor
    DOCTOR_PROJECT=doctor
    DOCTOR_ROLE=_member_
Rewriting in Python
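
Since the slide mentions a rewrite in Python, here is one way the shell settings above might carry over. It is a sketch under assumptions: `load_doctor_config` and `vm_names` are hypothetical helpers, and the VM naming scheme (base name plus index) is a guess, not taken from the Doctor test code. The one detail worth preserving is that the shell form `${VM_COUNT:-1}` falls back to 1 when the variable is unset or empty.

```python
import os


def load_doctor_config(env=None):
    """Return the test settings from the slide, reading VM_COUNT from the environment."""
    env = os.environ if env is None else env
    return {
        "VM_BASENAME": "doctor_vm",
        "VM_FLAVOR": "m1.tiny",
        # Mirrors ${VM_COUNT:-1}: default applies when unset OR empty.
        "VM_COUNT": int(env.get("VM_COUNT") or 1),
        "DOCTOR_USER": "doctor",
        "DOCTOR_PROJECT": "doctor",
        "DOCTOR_ROLE": "_member_",
    }


def vm_names(config):
    """Hypothetical naming scheme for multi-VM tests: doctor_vm0, doctor_vm1, ..."""
    return [f"{config['VM_BASENAME']}{i}" for i in range(config["VM_COUNT"])]


if __name__ == "__main__":
    config = load_doctor_config({"VM_COUNT": "3"})
    print(vm_names(config))
```

Passing the environment as a dictionary keeps the function easy to test without touching the real process environment.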

18 Profiler
Total time cost: 472(ms)
Component columns: Monitor | Inspector | Controller | Notifier | Evaluator (per-component values appear only as "?")
Checkpoints (ms):
   link down: 0
   raw failure: 112
   found affected: ?
   set VM error: 312
   marked host down: 842
   notified VM error: ?
   transformed event: ?
   evaluated event: ?
   fired alarm: ?
   received alarm: 472
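
To show how such a profile can be read, the sketch below computes the total time cost and the gaps between checkpoints whose values are legible in this transcript. The function names are hypothetical and this is not the Doctor profiler itself; checkpoints shown as "?" are kept as `None` rather than guessed.

```python
# Checkpoint times in ms, as far as they are legible; None stands for "?".
CHECKPOINTS = {
    "link down": 0,
    "raw failure": 112,
    "found affected": None,
    "set VM error": 312,
    "marked host down": 842,
    "notified VM error": None,
    "transformed event": None,
    "evaluated event": None,
    "fired alarm": None,
    "received alarm": 472,
}


def total_time_cost(checkpoints):
    """Time from the injected failure to the alarm reaching the consumer."""
    return checkpoints["received alarm"] - checkpoints["link down"]


def gaps_along(checkpoints, path):
    """Elapsed time between consecutive checkpoints on `path` with known values."""
    known = [(name, checkpoints[name]) for name in path if checkpoints.get(name) is not None]
    return [(a, b, tb - ta) for (a, ta), (b, tb) in zip(known, known[1:])]


if __name__ == "__main__":
    print("total:", total_time_cost(CHECKPOINTS), "ms")
    path = ["link down", "raw failure", "found affected", "set VM error", "received alarm"]
    for start, end, ms in gaps_along(CHECKPOINTS, path):
        print(f"{start} -> {end}: {ms} ms")
```

Note that "marked host down" (842 ms) is later than "received alarm" (472 ms) in these numbers, so it is left off the path here; the slide itself does not explain that ordering.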

19 Doctor Status – Fault Management
Components and implementations:
- Notifier: Ceilometer/Aodh
- Controller: Nova, Neutron, Cinder
- Inspector: Sample, Congress, Vitrage
- Monitor: Sample
Stages tracked: To-Be Arch. Design; Gap Analysis; Blueprint; Coding; Integration; OPNFV Release
Legend: Done

20 Doctor Achievements
Design
- Requirement Document
- To-be-architecture
- Figured out basic use cases and minimal requirements in fault management
Implementation
- Open Source Project Mapping
- Gap Analysis
- Work in Upstream (OpenStack)
- Key features are available in OpenStack
- Good example of how to upstream
Integration and testing
- Functest / Installer
- Doctor CI
- Performance test support (Profiler)
- OPNFV users can test fault management scenario quickly with performance profile

21 Adopting Doctor Framework to NFVI Maintenance
- Tool chain for NFVI maintenance
- Exchange VNFM and VIM admin intentions via Nova Server Tag and Notification
0. Set Alarm
1. Inform Maintenance
2. Find Affected VM
3. Update State
4. Notify all
5. Notify Retirement
6. Action
7. Allow migration
8. Check App Readiness
9. Perform Maintenance
VM labels on the diagram: active, migrate-ng, retirement, inactive, migrate-ok
Diagram components: Manager; Application; Virtualized Infrastructure (Resource Pool); Controller: Nova; Notifier: Ceilometer/Aodh; Inspector: Congress; Admin
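
Step 8 (Check App Readiness) can be pictured as a check over the labels attached to each VM. The sketch below rests on an assumption the slide does not spell out: that a VM labelled `migrate-ok` may be moved and one labelled `migrate-ng` may not. `may_migrate` and `blocking_vms` are hypothetical helpers, and no Nova API is called.

```python
def may_migrate(tags):
    """Assumed rule: migration is allowed only with migrate-ok and without migrate-ng."""
    return "migrate-ok" in tags and "migrate-ng" not in tags


def blocking_vms(host_vms):
    """Return the VMs on a host that still prevent maintenance, sorted by name."""
    return sorted(vm_id for vm_id, tags in host_vms.items() if not may_migrate(tags))


if __name__ == "__main__":
    # Before the application has switched over, the active VM blocks maintenance.
    before = {"VM-4": {"active", "migrate-ng"}}
    # After the switch (steps 6 and 7), the VM is inactive and may be migrated.
    after = {"VM-4": {"inactive", "migrate-ok"}}
    print(blocking_vms(before))
    print(blocking_vms(after))
```

An administrator-side tool would proceed to step 9 only once `blocking_vms` returns an empty list for the host under maintenance.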

22 Monitoring Team Collaboration
- Doctor developed a “Framework” and supports typical scenarios; we need to expand support for various fault types
- Cross-project collaboration between Barometer and Doctor
- Design session “Monitoring Team Gathering”: Room Sculpture, 11: :15, Tuesday

23 Don’t miss out...
- “Faster, Higher, Stronger : Accelerating Fault Management to the Next Level”, Studio 4, Thursday, 1:50pm - 2:20pm
- “Developing an Open Source NFV Platform for Telecom: OPNFV Release Specifications and New Features”, DOCOMO booth, PoC Demo Zone

