Nagios in Power Transmission Utilities Fernando Covatti

1 Nagios in Power Transmission Utilities Fernando Covatti fernando.covatti@ceee.com.br fcovatti@gmail.com

2 Introduction & Agenda
- Brazilian Electrical Sector Overview
- CEEE-GT experience with Nagios Core
- Motivation for different areas of the company
  - Telecommunications
  - Automation
  - Protection and Control
  - Supervision
- Results and Future Plans

8 Brazilian Electrical Sector
- Regionalized until the mid-1990s.
- Regional companies controlled their respective areas and could hold vertical expertise.
- Generation, transmission and distribution of electricity.
- In the second half of the 1990s, the rules changed.
- The increasing interconnection between states created the need to regulate and discipline the electrical sector.
- ANEEL and ONS were created.

13 Brazilian Electrical Sector
- National Interconnected System (SIN).
- Biggest of its kind in the world.
- More than 100,000 km of transmission lines (230 kV and above).
- Only 1.7% of the energy used in the country is outside the interconnected system.
- A failure in a substation or transmission line can impact the whole country (blackout).

17 Company Presentation
- CEEE was founded in 1943.
- Operates in the three main areas of the Brazilian electrical sector: power generation (G), transmission (T) and distribution (D).
- The state government has equity control of the company.
- Considerable participation (~32%) by Eletrobras, the federal government's main electricity provider.

22 Company Presentation
- 3,800 employees.
- 6th largest company in Rio Grande do Sul State (117th largest company in Brazil).
- Generates 75% of the State's hydroelectricity.
- Owns 5,781 km of transmission lines.
- Distributes electrical energy to one third of the State (3.5 million people).

29 Supervision Area
- The division was founded in the mid-1970s.
- Initially focused on state data of the electrical system.
- As the system grew, further demands were added.
- New devices were installed.
- New demands were made by the regulator.
- Need to reduce equipment downtime.
- Need to control substations remotely.

33 Supervision Area
- Composed mainly of electronic/electrical engineers and technicians.
- Weak computing knowledge among team members (no one holds a degree in the IT area).
- Large gap between new and old employees, due to a long period without new hires.
- Old concepts and techniques are very difficult to change.

40 Motivation
- The amount of data has been growing exponentially.
- Much of this data is not directly related to real-time operation.
- A growing number of data points to supervise, versus selective user interest.
- Several of these points remain in alarm for a long time.
- This disrupts the work of the real-time staff.
- The maintenance staff is not informed of problems.
- As a result, the system becomes discredited.

43 Motivation
- Reduced revenue led to a long-term reduction in staff (retirements and no new hires).
- Telecontrol of substations became a priority in order to reduce the substation operator workforce.
- Higher system availability is required when telecontrol is used.

50 Motivation Overview
- Substation Field Devices
- Substation Protection Relays
- Substation Automation Devices
- Operation Centers EMS
- Operation Center HMI
- National System Operator
- Database Servers
- Corporate Network

54 Motivation for Substation Devices
- Online graphical supervision of failures in Ethernet-based devices inside substations.
- Because of redundancy and the use of RSTP (or other redundancy protocols), failures often go unnoticed and failed devices are not replaced.
- Preventive maintenance, mainly in substations implemented with IEC 61850.
- Supervision also where there is a second communication channel to the Operation Center.
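
One way to surface failures that redundancy hides is to check every redundant device individually instead of only the end-to-end path. The Python sketch below is an illustration under stated assumptions, not the checks used at CEEE: the function name `redundancy_state` and the device names are hypothetical. The only fixed convention is the standard Nagios plugin return codes (0 OK, 1 WARNING, 2 CRITICAL).

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL = 0, 1, 2


def redundancy_state(member_up):
    """Map per-device reachability of a redundant group to a Nagios state.

    member_up is a dict of {device name: True if reachable}. The group still
    works while at least one member is up, but a single lost member is
    reported as WARNING so the failed device gets replaced.
    """
    down = sorted(name for name, up in member_up.items() if not up)
    if not down:
        return OK, "OK - all %d redundant devices reachable" % len(member_up)
    if len(down) < len(member_up):
        return WARNING, "WARNING - redundancy lost, down: " + ", ".join(down)
    return CRITICAL, "CRITICAL - all redundant devices down"
```

For example, `redundancy_state({"switch-a": True, "switch-b": False})` yields a WARNING that names `switch-b`, even though traffic still flows through `switch-a`.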

62 Motivation for Telecommunications
- Different communication devices and vendors:
  - Multiplexers (SDH and SONET).
  - Switches (Ethernet).
  - Analog and digital radio (serial communication).
  - Power line communication.
- Different management software packages.
- Architecture exists only as drawings (no online state).
- Most susceptible to failures (links shared with other companies, weather, ...).

68 Nagios Usage: Data Excess
- The excessive number of points became a critical problem.
- There were a few attempts to solve the problem by reducing the number of points.
- It did not work, for obvious reasons.
- We need to monitor an increasing number of data points.
- Another attempt was to add alarm filters.
- These alarm filters end up making users forget most of the filtered points.

74 Nagios Usage: Data Separation
- Monitoring a growing number of points requires more elaborate solutions.
- The interest in this information is not the same for all teams.
- Likewise, the frequency of monitoring does not need to be the same.
- Thus, we sought to separate the data roughly into real-time information and maintenance information.
- The real-time system remains SAGE.
- Nagios was introduced as the maintenance system.

78 Nagios Usage: Why Nagios?
- Stable; developed over an extensive period of time.
- Expandable and customizable, with a wide range of add-ons.
- Open software, which matches the preferences of the team.
- Active community of developers and users.

84 Nagios Usage: First Attempts
- In the past decade the Telecommunications Area made an attempt to monitor through Nagios.
- This experiment was not successful.
- Lack of interest from potential users.
- Fine tuning of Nagios was needed.
- A person was needed to give daily attention to the maturation process, and no such person was available.
- Only part of the company's telecommunications system was monitored.

90 Nagios Usage: Installation Conditions
- In early 2011, half of the team was renewed.
- The entry of new members brought new ideas.
- The telecommunications system expanded considerably with new multiplexers and switches.
- The telecommunications team experienced an influx of new employees.
- In 2012, the company lost more than 60% of its revenue due to contract renewals by the Federal Government.
- This led to a pressing need for increased monitoring of the system and for preventive action.

97 Nagios Usage: Installation
- Nagios was again considered as a way to monitor the status of various devices.
- Installation started in mid-2012.
- The primary focus was to monitor Linux systems.
- Soon it expanded to other systems and areas, such as the communication status of remote systems.
- Several features were added over these two years.
- Usage is growing in other areas of the company, such as Substation Automation.
- In June 2014 the system was expanded with a second version, now installed in Telecommunications.

99 Nagios Usage: Panorama
- Supervision
- Telecommunication

105 Nagios Usage: Customized Services
- Script to check RAID disks.
- Configuration backup (manually changed devices).
- Configuration check (differences between the database and the Operation Center configuration).
- Serial communication state (RX/TX bytes).
- Proprietary protocols of telecommunication system devices (via telnet).
- Expect language scripts.
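
The RAID check in the list above can be sketched as a small Nagios plugin. The original script is not shown in the slides, so everything here is an assumption for illustration: the sketch assumes Linux software RAID (md) and reads its status from `/proc/mdstat`, where a status map such as `[U_]` marks a missing member. The function names are hypothetical; the return codes follow the standard Nagios plugin convention.

```python
import re

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose status map shows a missing disk."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)  # start of a new array section
            continue
        status = re.search(r"\[([U_]+)\]", line)  # e.g. [UU] or [U_]
        if current and status:
            if "_" in status.group(1):
                degraded.append(current)
            current = None
    return degraded


def check(mdstat_text):
    """Turn the mdstat contents into a (return code, status line) pair."""
    bad = degraded_arrays(mdstat_text)
    if bad:
        return CRITICAL, "RAID CRITICAL - degraded: " + ", ".join(bad)
    return OK, "RAID OK - all arrays complete"


def main(path="/proc/mdstat"):
    """Read the status file, print one status line, return the exit code."""
    try:
        with open(path) as handle:
            code, message = check(handle.read())
    except OSError as error:
        code, message = UNKNOWN, "RAID UNKNOWN - %s" % error
    print(message)
    return code
```

A plugin wrapper would end with `sys.exit(main())` so that Nagios receives the return code. Hardware RAID controllers would need the vendor's own status tool instead of `/proc/mdstat`.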

112 Results
- Nagios has provided notice of failures that could not be detected in a normal situation:
  - Failure of one of the disks in a RAID system.
  - Failure of the Emergency Control Scheme system.
  - Failure of backup devices.
  - Failure of backup communication channels.
- Reduced response time of maintenance teams in attending to occurrences.
- Fault location with an integrated view.

118 Future and Beyond
- Transferring real-time data points to Nagios.
- Can be expanded to obtain more data from protection relays.
- Integration within the substation (IEC 61850, DNP LAN).
- More and more networked devices per substation, easily reaching 50 with today's equipment.
- The number of substation devices tends to increase.
- Usage of Nagios in smart grids (bigger networks).

123 Future and Beyond
- Usage of Nagios reports to analyze potential points of future failure.
- Provides perspective on where to invest budget resources.
- Relieving the burden of repetitive work.
- Using Nagios as a "management" tool: email to decentralized teams so they carry out maintenance on failed devices.
- Integration with other tools, such as automatic generation of maps, simulators, wiki, etc.
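
The idea of emailing decentralized teams about failed devices can be sketched as follows. This is only an illustration under assumptions: the region names, mailbox addresses and the function `build_notification` are hypothetical, and a real deployment would more likely rely on Nagios contacts and notification commands. The sketch uses Python's standard `email.message.EmailMessage` class.

```python
from email.message import EmailMessage

# Hypothetical mapping from region to the mailbox of its maintenance team.
TEAM_ADDRESS = {
    "north": "maintenance-north@example.com",
    "south": "maintenance-south@example.com",
}


def build_notification(region, host, service, state, output,
                       sender="nagios@example.com"):
    """Build the email that tells a regional team about a failed device."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = TEAM_ADDRESS[region]
    message["Subject"] = "[Nagios] %s: %s on %s" % (state, service, host)
    message.set_content(
        "Host: %s\nService: %s\nState: %s\nDetails: %s\n"
        % (host, service, state, output)
    )
    return message
```

The resulting message could then be handed to `smtplib.SMTP(...).send_message(message)`, with the mail server address depending on the local setup.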

124 Questions? Any questions? Thanks!

125 The End Fernando Covatti fernando.covatti@ceee.com.br fcovatti@gmail.com

