NM functions Configuration, Performance, Fault, Accounting, Security.



1 NM functions Configuration, Performance, Fault, Accounting, Security

2 Configuration Management Middle and long range activities for controlling  Physical, electrical and logical inventories  Maintaining vendor files and trouble tickets  Supporting provisioning and order processing  Defining and supervising service level agreements  Managing changes  Distributing software

3 Configuration management is central to all other network management functions  All other management are supported by configuration details  Enhances control over configuring the network and devices  Quick access to vital configuration data  Helps initialization, maintenance and shutdown of individual components and logical subsystems

4 Primary Information Actual configuration Attributes of network elements Generated configuration Status indicators of network elements Vendor data Change requests and record Order data Actual inventory Status of service-level indicators

5 Secondary Information Traffic Volumes More details on indicators Performance indicators of the network elements etc

6 Configuration management functions Inventory management Network topology services Service Level agreements Designing, implementing and processing trouble tickets Order processing and provisioning Change Management

7 Inventory management Automated inventory – online record of  currently implemented components and spares,  contact vendors,  location of components,  maintenance requirements for certain equipment classes,  service statistics like number of outages, response for repair, repair time distribution

8 Good Inventory Management Less redundancy  If the same information is stored in different databases, resources and processing time are wasted backing up the databases Synchronized change management Unique names and addresses  Helps during troubleshooting Efficient troubleshooting Better capacity and contingency planning

9 Network Topology Services Requires current and historical configurations Layered configuration displays at network and component level of  Electrical layouts  Physical  Logical

10 Display of configuration details

11 Network details – click on icon

12 Protocol level

13 Auto Discovery tool An auto-discovery tool can discover devices on the network (periodically) Auto mapping produces the network map Discovery and mapping consume network bandwidth

14 SLA Need to evaluate long-term service levels Consistency in customer service level Increased planning and decreased crisis management Service levels  Responsiveness, accuracy, availability Performance reporting  Planned and actual workload characteristics and service levels during report period

15 Trouble tickets Linking trouble tickets Information in a trouble ticket  Time reported  Time received by responsible group  Time network service restored  Time vendor notified  Time vendor responded  Time vendor restored service  Total vendor time  Total user non-availability  Total service outage
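A minimal sketch of how the time fields on a trouble ticket could yield the totals listed on the slide. The class name, field names, and the exact definition of each total are assumptions made for illustration; the slides do not define them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TroubleTicket:
    """Hypothetical ticket record holding the timestamps named on the slide."""
    reported: datetime
    received_by_group: datetime
    vendor_notified: datetime
    vendor_responded: datetime
    vendor_restored: datetime
    service_restored: datetime

    def total_vendor_time(self) -> timedelta:
        # Assumption: vendor time runs from notification to vendor restoration.
        return self.vendor_restored - self.vendor_notified

    def total_service_outage(self) -> timedelta:
        # Assumption: the outage runs from the first report to service restoration.
        return self.service_restored - self.reported


t0 = datetime(2024, 1, 1, 9, 0)
ticket = TroubleTicket(
    reported=t0,
    received_by_group=t0 + timedelta(minutes=10),
    vendor_notified=t0 + timedelta(minutes=30),
    vendor_responded=t0 + timedelta(minutes=60),
    vendor_restored=t0 + timedelta(minutes=150),
    service_restored=t0 + timedelta(minutes=180),
)
print(ticket.total_vendor_time(), ticket.total_service_outage())
```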

16 Change Management

17 Tools for configuration management Simple tools  Provide simple storage for all network related information  Manually collecting and entering data Complex tools  Automatically gather data – latest information on configuration  Compare current configuration with stored configuration  Change a device’s configuration while running  Specify configuration errors that should generate warning messages
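As an illustration of the "compare current configuration with stored configuration" step, here is a small sketch that treats a device configuration as a flat dictionary. The function name and the sample keys are hypothetical; real tools work on vendor-specific formats.

```python
def diff_config(stored: dict, current: dict) -> dict:
    """Return {key: (stored_value, current_value)} for every setting that differs.

    A key missing on one side is reported with None on that side, so this
    sketch cannot distinguish "missing" from an explicit None value.
    """
    changes = {}
    for key in sorted(stored.keys() | current.keys()):
        old, new = stored.get(key), current.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


stored = {"hostname": "router1", "mtu": 1500, "snmp_community": "public"}
current = {"hostname": "router1", "mtu": 9000, "ntp_server": "10.0.0.1"}
changes = diff_config(stored, current)
for key, (old, new) in changes.items():
    print(f"{key}: stored={old!r} current={new!r}")
```

A complex tool could raise a warning message whenever the returned dictionary is non-empty.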

18 Performance Management Activities required to continuously evaluate principal performance indicators to check  Service level maintenance  Identify potential bottlenecks  Establish trend reports  Network utilization and error rates

19 Contd.. Involves  Collection of data on current utilization of network devices and links  Analyze data to discern high utilization trends  Setting utilization thresholds  Using off-line simulation and or analytical studies on how to maximize performance

20 Primary Information Actual configuration Generated configuration Performance indicators in real time or near-real time  Response time  Congested channels  Resource utilization Selected vendor data Performance histories for selected facilities Operational procedures

21 Performance Indicators Availability Response time Throughput Utilization – channel occupancy Grade of service Transmission volumes Offered load Accuracy

22 Indicators Service oriented indicators  Have priority Efficiency oriented indicators

23 Service Oriented Indicators Availability  Customer's perspective  Depends on technical reliability of components  Redundancy? Cost benefit  Total costs = cost of redundancy + cost of consequences

24 Availability
$$\text{Availability} = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTD} + \text{MTTR} + \text{MTOR}}$$
MTBF – Mean time between failures
MTTD – Mean time to diagnose
MTTR – Mean time to Repair (or report)
MTOR – Mean time of Repair
For better availability, keep MTTD, MTTR and MTOR low.
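The availability ratio can be checked numerically with a short sketch. The function name and the sample figures (in hours) are invented for illustration only.

```python
def availability(mtbf: float, mttd: float, mttr: float, mtor: float) -> float:
    """Availability = MTBF / (MTBF + MTTD + MTTR + MTOR); all terms in the same time unit."""
    return mtbf / (mtbf + mttd + mttr + mtor)


# Hypothetical figures in hours.
baseline = availability(mtbf=1000, mttd=2, mttr=3, mtor=5)
faster_diagnosis = availability(mtbf=1000, mttd=1, mttr=3, mtor=5)
print(f"baseline={baseline:.4f} faster_diagnosis={faster_diagnosis:.4f}")
```

Shrinking any of the three downtime terms raises the ratio, which is the point of the slide's closing remark.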

25 Response Time Propagation Delays, Processing delays, Transmission delays, Protocol delays

26 Contd.. Total Response Time Network Delays Processing delays Protocol delays – time outs Response time consideration depend on  Protocols and their behavior  Job priorities  Loads in the system

27 Accuracy Accuracy can be affected by  Erroneous transmission (wireless & fiber)  Characters transmitted but not delivered  Characters received which were not sent  Characters duplicated

28 Residual Error Rate
$$\text{Residual error rate} = \frac{CH_E + CH_V + CH_N + CH_D}{CH_T}$$
CH_E = erroneous characters  due to media & processing
CH_V = transmitted but not received
CH_N = extra characters received
CH_D = duplicated characters
CH_T = total characters
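A brief sketch of the residual error rate as a function. The parameter names mirror the slide's symbols; the counts in the example are made up.

```python
def residual_error_rate(ch_e: int, ch_v: int, ch_n: int, ch_d: int, ch_t: int) -> float:
    """(erroneous + not received + extra + duplicated) / total characters."""
    if ch_t <= 0:
        raise ValueError("total character count must be positive")
    return (ch_e + ch_v + ch_n + ch_d) / ch_t


# Hypothetical counts: 10 faulty characters out of 100,000.
rer = residual_error_rate(ch_e=5, ch_v=3, ch_n=1, ch_d=1, ch_t=100_000)
print(rer)
```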

29 Efficiency oriented indicators Efficiency oriented indicators represent the interest of the organization Service oriented monitoring and efficiency oriented monitoring  Conflicts?

30 Efficiency vs service

31 Throughput Measure of a server’s capacity - MIPS Line throughput – kilobits/sec Application oriented  Number of transaction / unit time  Number of customer sessions per application  Number of calls serviced  Number of jobs provided by a node

32 Utilization Dynamic measure of resources used Puts a practical limit on the throughput under operational conditions Helps study overlap among component processing, mutual waits etc.

33 Utilization Utilization vs Accuracy Utilization vs throughput Utilization vs Goodput

34 Overlap effects

35 Availability Availability of system depends on availability of individual components (Very difficult to measure and report on availability)  Check on each component and compare with configuration  Depends on how components are connected

36 Example Each component availability = 0.98 Availability of the serial combination is 0.98 * 0.98 = 0.9604 ≈ 0.96 Example: 2 modems, serial processing of data

37 Example (two redundant links) Probability that one link is not available = 0.02 Probability that both links are not available = 0.02 * 0.02 = 0.0004 Availability = 1 - 0.0004 = 0.9996
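The two worked examples (components in series, then two redundant links) generalize to any number of components. The sketch below assumes component failures are independent, which the slides imply but do not state; the function names are illustrative.

```python
from math import prod


def serial_availability(components: list[float]) -> float:
    """All components must be up: multiply the individual availabilities."""
    return prod(components)


def parallel_availability(components: list[float]) -> float:
    """Service is lost only if every redundant component is down at once."""
    return 1 - prod(1 - a for a in components)


serial = serial_availability([0.98, 0.98])      # two modems in series
parallel = parallel_availability([0.98, 0.98])  # two redundant links
print(f"serial={serial:.4f} parallel={parallel:.4f}")
```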

38 Performance measurements Data Gathering  Exhaustive  Statistical Distribution for sampling times Correlation effects Performance Analysis  Data presentation  Interpretation

39 Contd.. Historical trends Real time trends Graphical presentation and comparison Linking different performance indicators  Then set thresholds

40 Simulation studies To improve the performance or identify bottlenecks –  model the network and components – (primary)  Study effects of changes in the model  Target Optimal performance  Requires Synthetic traffic generation Analytical and simulation tools

41 Simple tools for PM Provides real-time information on network components  Graphical – bars, histograms Can help find bottlenecks Main information  Processor utilization  Memory utilization  Link – pkts/sec, bits/sec  Bit error rates

42 Complex Tools Set thresholds Take action once thresholds are exceeded  Alarm  Enable backup Near-threshold warning Store historical data
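A minimal sketch of the threshold behaviour described for complex tools: an alarm above the threshold, a warning when a reading comes near it, and a stored history. The 90% threshold, the 10-point warning margin, and all names are assumptions chosen for the example.

```python
def classify_utilization(percent: int, threshold: int = 90, warning_margin: int = 10) -> str:
    """Return 'alarm', 'warning' or 'ok' for a utilization reading in percent."""
    if percent > threshold:
        return "alarm"
    if percent >= threshold - warning_margin:
        return "warning"
    return "ok"


history = []  # stored historical data: (sample label, reading, state)


def record(label: str, percent: int) -> str:
    """Classify one reading and append it to the history."""
    state = classify_utilization(percent)
    history.append((label, percent, state))
    return state


states = [record(label, pct) for label, pct in [("t1", 45), ("t2", 84), ("t3", 100)]]
print(states)
```

In a real tool the "alarm" branch would trigger the configured action, such as notifying an operator or enabling a backup.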

43 A complex tool at work Performance problem Brief periods of interrupted service between systems (no information passes through) at 3 pm and 12 am

44 PM tool at work Check error rates in the network  Normal Check utilization  Peaks at 3 pm and 12 am – times of backup Check Gatsby and Daisy utilization  Peaked to 100% at the specified times Check for processor intensive applications  Negative

45 Contd.. Check network traffic type  Located an unknown protocol packet  Flooding the network – locating servers  Check originator  Send message to him  Or block his traffic

46 Fault management Activities needed to dynamically maintain the network service level High network availability

47 Primary Information Actual configuration Generated configuration Event reports and alarms Status indicators of network elements Performance indicators Spare components and their status Backup routes and their status Vendor data for problem dispatch Global traffic volumes Progress of trouble resolution

48 Steps in FM Identify the occurrence of fault Isolate the cause of fault Correct the fault if possible First is difficult, second is very difficult!

49 Network Status Supervision Layered configuration maps (status) (Tightly coupled to topology display) Zoom in on parts to isolate problems Real time traffic status displays Good monitoring devices/sensors Monitored information to be passed on to agents, or management elements Process and distribute messages, events and alarms

50 Status A measurement of the behavior of an object at a specific instant in time  Represented by a set of status information items and their values at a specific time

51 Event Change in the status of the element – which justifies notification i.e. significant to fault management Event report can be generated  Type of event  Change in status  Time stamp  Reporting entity -Object or process that generated event  Managed object whose status changed  Managed object information  Probable cause  Effect of event on the managed object
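The fields of an event report listed above map naturally onto a record type. The following is only a sketch: the class name, field names, and sample values are hypothetical and do not follow any particular management protocol.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class EventReport:
    """Hypothetical container for the event report fields named on the slide."""
    event_type: str
    status_change: str
    timestamp: datetime
    reporting_entity: str       # object or process that generated the event
    managed_object: str         # managed object whose status changed
    managed_object_info: dict
    probable_cause: str
    effect: str                 # effect of the event on the managed object


report = EventReport(
    event_type="link_status",
    status_change="up -> down",
    timestamp=datetime(2024, 1, 1, 15, 0, tzinfo=timezone.utc),
    reporting_entity="router-a",
    managed_object="router-a/port-3",
    managed_object_info={"speed_mbps": 100},
    probable_cause="loss of carrier",
    effect="inhibited",
)
print(report.managed_object, report.probable_cause)
```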

52 Event Filtering Multi-layered filtering

53 Filtering Process

54 Global filtering  First process applied to an event: is the event serious, and does it have to be processed?  Uses a set of criteria for this assessment  Cannot be function specific

55 Filtering Process Distribution Filtering  An event processor selects the event it wishes to receive  There are various event processes running simultaneously Event process filtering  Filtering done by the event processor  Specific to the functional
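The three filtering layers (global, distribution, event process) can be sketched as a small dispatch routine. Everything here is an illustrative assumption: the severity scale, the minimum severity of 3, and the class and function names are not taken from the slides.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Event:
    kind: str
    severity: int   # assumed scale: higher means more serious
    source: str


@dataclass
class EventProcessor:
    name: str
    wanted_kinds: set                    # distribution filter: kinds this processor selects
    accept: Callable[[Event], bool]      # event process filter: processor-specific check
    received: list = field(default_factory=list)


def global_filter(event: Event, min_severity: int = 3) -> bool:
    """Function-independent check: is the event serious enough to process at all?"""
    return event.severity >= min_severity


def dispatch(event: Event, processors: list) -> list:
    """Apply the three layers in order; return names of processors that took the event."""
    if not global_filter(event):
        return []
    delivered = []
    for proc in processors:
        if event.kind in proc.wanted_kinds and proc.accept(event):
            proc.received.append(event)
            delivered.append(proc.name)
    return delivered


processors = [
    EventProcessor("fault", {"link_down"}, lambda e: True),
    EventProcessor("performance", {"high_utilization"}, lambda e: e.source != "lab"),
]
print(dispatch(Event("link_down", 5, "router-a"), processors))
```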

56 Event Processor Examine and process event reports Passive processing  Sampling and logging Proactive processing  Takes automatic corrective action

57 Process for filtering

58 Event effect Permanent – external action required Temporary – will correct automatically Impending – will result in failure soon Impaired – services can be provided at reduced levels Inhibited – services stopped

59 Dynamic Troubleshooting Opens trouble tickets, links them, dispatches to the proper vendors, checks on-line progress of trouble tickets Problem detection –  Is something wrong? Problem determination  What is wrong and where is the problem in the network? Problem diagnosis & resolution  To isolate, fix or provide backup and fix

60 End-to-end testing To verify dynamically correct network operation  Conducted during normal network operation, without affecting it Can we have over-head free testing? What components should be tested? How should tasks be assigned?  Local sites  Central sites

61 Contd.. When to monitor and test?  Continually, periodically, on demand How to monitor and test  Disruptive, non-disruptive What indicators to monitor and test?  Service level, efficiency, loops, circuits What instruments to use?  Hw, sw, analog, digital What reports are to be generated?  Standard, adhoc with special evaluations What are the triggering events?  Time, single or combined events, alarms

62 Types of faults Unobservable  Deadlocks between processes  Instrument not capable of recording the events Partially observable  Node failure – actual reason – low level protocol Uncertainty in observation  Lack of device response Device is down, network partitioned, congestion delays, local timer faulty

63 Issues in isolating faults Multiple potential faults  Number of elements failing Too many related observations  One fault manifests itself as various events Interference between diagnosis and local recovery procedures  Error recovery sets in before diagnosis Absence of automated tools

64 Example FM Problem scenario – Sergeant fails due to buffer overflow

65 Contd.. Buffer in Sergeant is well provisioned  Fails due to traffic surge Pepper reports link failure to LAN3  Message sent to NM system NMS asks Pepper to check on carrier presence in link to LAN3  Carrier absence reported NMS asks Pepper to perform loopback on link3  OK

66 Contd.. NM resets Sergeant? Actual reason for failure not identified This could have been avoided if Sergeant had generated an event when its utilization exceeded 80% or 90%

67 Simple tool Points out the existence of a problem  E.g., ICMP ping tells you about the existence of a system A complex tool may perform all functions shown in the previous example
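A simple reachability check along the lines of the ping example might look like the sketch below. It shells out to the system `ping` command, assuming it is installed and that `-c` (Unix-like) or `-n` (Windows) sets the packet count; the function names and the injectable `run` parameter are illustrative choices made so the logic can be exercised without a network.

```python
import platform
import subprocess


def ping_command(host: str, count: int = 1) -> list:
    """Build the ping command line for the current platform."""
    flag = "-n" if platform.system() == "Windows" else "-c"
    return ["ping", flag, str(count), host]


def is_reachable(host: str, timeout_s: float = 5.0, run=subprocess.run) -> bool:
    """Return True if ping exits with status 0, False on failure or timeout."""
    try:
        result = run(
            ping_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0
```

As the slide notes, a positive answer only shows that the system exists and responds; it says nothing about why a service on it may be failing.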

