Fault Management IACT 418/918 Autumn 2005 Gene Awyzio SITACS University of Wollongong.

Presentation transcript:

Fault Management IACT 418/918 Autumn 2005 Gene Awyzio SITACS University of Wollongong

2 Overview
Fault Management is the process of locating and correcting network problems or faults.
Comprehensive fault management is probably the most important task in Network Management.

3 Benefits of Fault Management Process
Increased network reliability
–Provides tools allowing the engineer to quickly
    Detect problems
    Initiate recovery procedures
Need to maintain the illusion of complete and continuous connectivity
Also provides tools to extract information about the network's current state

4 Accomplishing Fault Management
Can be considered as a three (3) step process
–Identify the fault
–Isolate the cause of the fault
–Correct the fault if possible

5 Identifying the fault
Gathering information to identify a problem
–To learn that a problem exists we need to gather data about the current state of the network
Two approaches
–Log critical network events
–Poll network devices

6 Identifying the fault
Critical network events
–Examples
    Failure of a link
    Lack of response from host
–Transmitted by network device when fault conditions occur
–Reactive method
–If device fails it cannot send an event

7 Identifying the fault
Occasional Polling
–Can help find faults in a timely manner
–Tradeoff
    Degree of timeliness vs bandwidth consumption
–Other factors
    Number of devices to poll
    Bandwidth of links
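A single polling pass can be sketched as follows. This is a minimal illustration, not a real management protocol: `poll_once`, `fake_probe` and the device names are hypothetical, and the probe is a stand-in for whatever query the management station actually sends.

```python
from typing import Callable, Iterable, List


def poll_once(devices: Iterable[str], probe: Callable[[str], bool]) -> List[str]:
    """Query each device once and return the ones that did not respond."""
    unresponsive = []
    for device in devices:
        try:
            responded = probe(device)
        except OSError:
            # Treat a network-level error the same as no response.
            responded = False
        if not responded:
            unresponsive.append(device)
    return unresponsive


def fake_probe(device: str) -> bool:
    """Hypothetical probe for illustration: pretend router-2 is down."""
    return device != "router-2"


faults = poll_once(["router-1", "router-2", "switch-1"], fake_probe)
print(faults)
```

Because the station initiates the query, a device that has failed completely still shows up as a fault, which an event-only approach would miss.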

8 Identifying the fault
Example of Occasional Polling
–Assume each query and response is 100 bytes long (including data and header information)
–For a network of 30 devices
    (100 + 100) * 30 = 6,000 bytes/polling interval = 48,000 bits/polling interval
–Polling every minute
    48,000 bits / 60 secs = 800 bits/second
    48,000 bits/polling interval * 60 polls/hour = 2,880,000 bits = 2.88 Megabits/hour
–Polling every 10 minutes
    80 bits/second, or 0.288 Megabits/hour
    May not know about event for 10 minutes
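The polling arithmetic can be recomputed with a short sketch, which makes it easy to try other device counts or intervals. The function name `polling_overhead` and its parameters are illustrative; the inputs are the slide's stated assumptions (30 devices, 100-byte query, 100-byte response).

```python
def polling_overhead(devices, query_bytes, response_bytes, interval_seconds):
    """Return (bits per polling interval, average bits/second, bits/hour)."""
    bits_per_interval = (query_bytes + response_bytes) * devices * 8
    bits_per_second = bits_per_interval / interval_seconds
    bits_per_hour = bits_per_second * 3600
    return bits_per_interval, bits_per_second, bits_per_hour


# Slide assumptions: 30 devices, 100-byte query and 100-byte response.
every_minute = polling_overhead(30, 100, 100, 60)        # 48,000 bits/interval, 800 bits/s
every_ten_minutes = polling_overhead(30, 100, 100, 600)  # same interval cost, one tenth the rate

for label, result in (("1 min", every_minute), ("10 min", every_ten_minutes)):
    per_interval, per_second, per_hour = result
    print(f"{label}: {per_interval} bits/interval, {per_second} bits/s, "
          f"{per_hour / 1e6} Megabits/hour")
```

Lengthening the interval reduces bandwidth consumption in direct proportion, at the cost of a longer worst-case delay before a fault is noticed.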

9 Deciding Which Faults to Manage
Need to decide which faults to manage
–Need to prioritise faults
–If the number of fault reports is high, the network may not handle the volume
–Limiting event traffic can reduce redundant transmissions and storage
Factors to consider
–Scope of control over network
–Size of network
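Prioritising faults and limiting redundant reports could look something like the sketch below. The severity levels, the `FaultEvent` fields and the `prioritise` function are assumptions made for illustration; the slides do not define a particular scheme.

```python
from dataclasses import dataclass

# Assumed severity levels; lower rank means handled first.
SEVERITY_RANK = {"critical": 0, "major": 1, "minor": 2}


@dataclass(frozen=True)
class FaultEvent:
    device: str
    kind: str
    severity: str


def prioritise(events):
    """Drop exact duplicate reports (keeping the first), then sort most severe first."""
    seen = set()
    unique = []
    for event in events:
        if event not in seen:
            seen.add(event)
            unique.append(event)
    return sorted(unique, key=lambda e: SEVERITY_RANK[e.severity])


reports = [
    FaultEvent("host-7", "no-response", "minor"),
    FaultEvent("router-1", "link-down", "critical"),
    FaultEvent("router-1", "link-down", "critical"),  # redundant report
    FaultEvent("switch-3", "port-errors", "major"),
]
queue = prioritise(reports)
for event in queue:
    print(event.severity, event.device, event.kind)
```

Suppressing the duplicate before it is stored or forwarded is one simple way to keep event traffic within what the network and the management station can handle.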

10 Fault Management of a Network Management System
Simplest system
–Reports existence of fault but NOT location
More complex tool
–Uses capability of hosts and network devices to
    Send critical network events
    Facilitate isolation of fault cause
Advanced tool
–Correction of fault

11 Impact of a Fault on the Network
A fault management tool MUST be capable of analysing how a fault can affect other areas of the network
Need to know
–What services the fault
    STOPS
    IMPACTS
–Not only that a fault has occurred but also how that fault affects other network communication
Data can come from performance management tools
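One way to picture the difference between a service a fault stops and one it only impacts is the sketch below. The model is an assumption for illustration: each service lists its alternative paths as sets of devices, a service is stopped when every path uses the failed device, and impacted when only some paths do. The names `classify_impact` and `service_paths` are hypothetical.

```python
def classify_impact(service_paths, failed_device):
    """Split services into (stopped, impacted) for a single failed device."""
    stopped, impacted = [], []
    for service, paths in service_paths.items():
        broken = [failed_device in path for path in paths]
        if all(broken):
            stopped.append(service)   # no surviving path
        elif any(broken):
            impacted.append(service)  # redundancy lost, service still reachable
    return stopped, impacted


# Hypothetical topology: each service maps to its alternative device paths.
service_paths = {
    "email": [{"switch-1", "router-1"}],
    "web": [{"switch-1", "router-1"}, {"switch-2", "router-2"}],
    "print": [{"switch-2"}],
}

stopped, impacted = classify_impact(service_paths, "router-1")
print("stopped:", stopped)
print("impacted:", impacted)
```

In practice the path information would come from configuration or performance management data rather than a hand-written table.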

12 Form of Reporting Faults
Common forms of fault reporting
–Text
–Graphical
–Auditory signals
Text
–Will work on any type of terminal

13 Form of Reporting Faults
Graphical
–Considered to be very effective
–Can use flashing images to gain attention
–Colour can be used to indicate device status
Auditory signals
–Will quickly call attention to the occurrence of a fault