
1 Monitoring Dynamic IOC Installations Using the alive Record
Dohn Arms
Beamline Controls & Data Acquisition Group
Advanced Photon Source

2 Rationale for the alive Record
• Issue: Want a convenient central resource for the operational status of IOCs.
• Issue: Want adding a new IOC to the system to need no central configuration.
• Issue: Want to avoid having to be on the same subnet as the IOCs.
• Solution: Use a centralized heartbeat system as the failure detection model.
  – The alive record on the IOC sends UDP heartbeats to the central database server.
  – The database server accepts any IOC that sends a heartbeat, but it can always filter.
  – CA is not used, so there are no subnet issues.
EPICS Collaboration Meeting, Spring 2015

3 Rationale for the alive Record
• Issue: Want to know, automatically, the details of IOCs whose configurations change constantly.
  – A beamline wants the latest version of hardware or software.
  – A detector is moved between beamlines.
• Solution: Allow the database server to remotely query the alive record for configuration information.
  – The record has a TCP port that only the database server can access.
  – The query is usually done only when a new alive session is seen.

4 alive Record Design
• Has two parts:
  – A thread with a timer that sends a UDP heartbeat to the database server once per period.
  – A thread that binds a TCP port and waits for information requests from the database server.
• Uses a custom networking protocol.
• Processing the record doesn’t actually do anything, as it uses an internal timer.
• Has fields for turning things off if necessary.
• Heartbeat period, local TCP port, and heartbeat value are the only things that can’t be changed.

5 alive Record Fields
• Version (string)
• Remote address and port
• Heartbeat value
• Heartbeat period
• User message (next slide)
• Magic number for allowing the database server to filter UDP packets
• Remote Port
  – Status
  – Port number
  – Read request flag
  – Read suppression flag
• Environment variables to send

6 User Message
• MSG field
• Unsigned 32-bit integer
• Sent with each UDP heartbeat
• Changing the MSG value does not trigger a heartbeat.
• Allows simple messages to be passed along with heartbeats.
• Meant for debugging IOC issues by passing flags back to the database server and its clients.
• Use is up to the IOC maintainer and database clients.
  – A calc record could set a bit in this value if CPU usage crosses a threshold.
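
The flag-passing idea on this slide can be sketched as plain bit operations on an unsigned 32-bit value. The flag names and bit positions below are hypothetical; the slides leave the meaning of each MSG bit to the IOC maintainer and the database clients.

```python
# Hypothetical flag assignments; the alive record does not define MSG bit meanings.
MSG_CPU_HIGH = 1 << 0
MSG_DISK_LOW = 1 << 1

UINT32_MASK = 0xFFFFFFFF  # MSG is an unsigned 32-bit integer

FLAG_NAMES = {MSG_CPU_HIGH: "cpu_high", MSG_DISK_LOW: "disk_low"}


def set_flag(msg: int, flag: int) -> int:
    """Return msg with the given flag bit set, kept within 32 bits."""
    return (msg | flag) & UINT32_MASK


def clear_flag(msg: int, flag: int) -> int:
    """Return msg with the given flag bit cleared, kept within 32 bits."""
    return msg & ~flag & UINT32_MASK


def decode_flags(msg: int) -> list:
    """Client side: list the names of the flags set in a received MSG value."""
    return [name for bit, name in FLAG_NAMES.items() if msg & bit]
```

On the IOC the equivalent of `set_flag` would be done by a record such as the calc record mentioned above; a database client would apply something like `decode_flags` to the value it receives.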

7 Heartbeat Service
• Heartbeat period is set by the HPRD field (defaults to 15 seconds).
• Sending heartbeats can be toggled with the HRTBT field.
• VAL field increments with each heartbeat sent.
• Heartbeat UDP packet contents
  – Magic number (for filtering at the database server)
  – Protocol version (currently 5)
  – Incarnation (uses boot time) and current time
  – Heartbeat value
  – Flags (used for the information port)
  – IOC information port number
  – User message
  – IOC name
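
To make the packet contents concrete, here is a minimal sketch that packs and unpacks the listed fields with Python's `struct` module. The field order follows the slide, but the field widths, the network byte order, and the NUL-terminated name are assumptions made for illustration; the real wire format of the alive protocol is not given in this presentation.

```python
import struct

# Assumed layout (NOT the documented wire format): magic, version,
# incarnation, current time, heartbeat value, flags, info port, user message.
HEADER = struct.Struct("!IHIIIHHI")


def pack_heartbeat(magic, version, incarnation, now, value,
                   flags, info_port, msg, ioc_name):
    """Build one heartbeat datagram: fixed header plus NUL-terminated IOC name."""
    header = HEADER.pack(magic, version, incarnation, now, value,
                         flags, info_port, msg)
    return header + ioc_name.encode("ascii") + b"\0"


def unpack_heartbeat(data):
    """Split a datagram back into a dict of the fields listed on the slide."""
    (magic, version, incarnation, now, value,
     flags, info_port, msg) = HEADER.unpack_from(data)
    name = data[HEADER.size:].split(b"\0", 1)[0].decode("ascii")
    return {
        "magic": magic, "version": version, "incarnation": incarnation,
        "current_time": now, "value": value, "flags": flags,
        "info_port": info_port, "msg": msg, "name": name,
    }
```

An IOC-side sender would pass the result of `pack_heartbeat` to `socket.sendto` on a UDP socket once per heartbeat period.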

8 Information Port Service
• Port number can be set manually or assigned automatically by the OS.
  – If the port can’t be bound, the thread terminates and sets the status (IPSTS) to “Inoperable”.
• Connection is initiated by the database server.
  – Queries are allowed only from the IP address of the remote database (in RHOST).
  – The record can stop allowing connections with the ISUP field; a flag sent in the heartbeat message signals that the port is turned off.
• The record can ask the database server to read information with the ITRG field.
  – This is done automatically when fields are changed.
  – A user can do this in case the value of an environment variable has been changed.

9 Information Port Data
• Protocol version (currently 5)
• IOC type (currently only vxWorks, Linux, Darwin, and Windows)
• Environment variables set in the ENVxx fields
• IOC-specific information
  – vxWorks: boot configuration
  – Linux and Darwin: user, group, and hostname
  – Windows: user and machine

10 Implementing Database Server
• Heartbeat processing
  – Filter against the magic number.
  – Filter against supported protocol versions.
  – Find or create the IOC entry.
    · If the entry exists and the incarnation is different, read the IOC information.
    · If the heartbeat value is lower, toss the packet (old packet).
    · Calculate uptime from the time values sent.
    · Record the arrival time (database clock).
• Information reading
  – Open the TCP port (from the heartbeat) at the IOC.
  – Attach the information to the IOC record.
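
The heartbeat-processing steps above can be sketched as a single function. This is an illustration only, not the BCDA daemon (which is written in C): the `MAGIC` value, the `IocEntry` class, the dict-shaped heartbeat, and the `needs_info_read` flag are hypothetical, and uptime is computed on the assumption that the incarnation is the boot timestamp in the same units as the current time.

```python
from dataclasses import dataclass

MAGIC = 0x12345678        # hypothetical site-specific magic number
SUPPORTED_VERSIONS = {5}  # protocol version named in the slides


@dataclass
class IocEntry:
    name: str
    incarnation: int
    heartbeat: int = 0
    uptime: int = 0
    last_seen: float = 0.0
    needs_info_read: bool = True  # new entries always get an information read


def process_heartbeat(db, hb, arrival):
    """Apply one decoded heartbeat (a dict) to db; return the entry or None if dropped."""
    if hb["magic"] != MAGIC:
        return None                       # filter against magic number
    if hb["version"] not in SUPPORTED_VERSIONS:
        return None                       # filter against supported versions
    entry = db.get(hb["name"])
    if entry is None:
        entry = IocEntry(hb["name"], hb["incarnation"])
        db[hb["name"]] = entry            # any IOC may join
    elif entry.incarnation != hb["incarnation"]:
        entry.incarnation = hb["incarnation"]
        entry.needs_info_read = True      # new boot: read IOC information again
    elif hb["value"] < entry.heartbeat:
        return None                       # lower heartbeat value: old packet
    entry.heartbeat = hb["value"]
    entry.uptime = hb["current_time"] - hb["incarnation"]  # assumed: incarnation is boot time
    entry.last_seen = arrival             # database clock, not the IOC clock
    return entry
```

Recording the arrival time from the database clock means failure checks do not depend on the IOC clocks being synchronized.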

11 IOC Status Change
• Boot
  – When a new incarnation is seen.
• Failure
  – Periodically check each IOC for failure.
  – Failure time is set by the number of missing heartbeats and the heartbeat period. Four heartbeats with the default 15-second period is one minute.
  – Check the elapsed time since the last heartbeat against the failure time.
• Recovery
  – A failure happens, then heartbeats are seen with the incarnation last seen (e.g., the network link went down).
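
The failure arithmetic and the three status changes can be sketched as follows. The function names and the "alive" label for an ordinary heartbeat are assumptions for illustration; only the four-heartbeat, 15-second example comes from the slide.

```python
def failure_time(period=15.0, missed=4):
    """Seconds without a heartbeat before an IOC counts as failed (4 x 15 s = 60 s)."""
    return period * missed


def has_failed(last_seen, now, period=15.0, missed=4):
    """Periodic check: compare elapsed time since the last heartbeat to the failure time."""
    return (now - last_seen) > failure_time(period, missed)


def heartbeat_event(known_incarnation, new_incarnation, was_failed):
    """Classify an arriving heartbeat as a boot, a recovery, or an ordinary heartbeat."""
    if known_incarnation is None or new_incarnation != known_incarnation:
        return "boot"        # new incarnation seen
    if was_failed:
        return "recovery"    # same incarnation after a failure, e.g. a network outage
    return "alive"
```

Because the incarnation comes from the boot time, a heartbeat that resumes with the same incarnation tells the server the IOC itself never restarted.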

12 BCDA Implementation
• Has over 100 IOCs in the system, from beamlines and the detector pool.
• Any IOC is allowed to join (more ways to differentiate IOCs were added).
• Design
  – Written in C as a threaded daemon
  – In-memory database using a self-balancing binary tree, with a many-reader, single-writer model (preferring writes)
  – Clients access data over a TCP port, using API code
  – Records the state of each IOC in case of daemon restart
  – Records each boot of an IOC
  – Does failure determination
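
As an illustration of the many-reader, single-writer model with write preference, here is a minimal lock sketch in Python. It is not the BCDA code, which is a C daemon whose locking details are not shown in the slides; the class and method names are hypothetical.

```python
import threading


class WritePreferringRWLock:
    """Many concurrent readers or one writer; waiting writers block new readers."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer_active = False
        self._writers_waiting = 0

    def acquire_read(self):
        with self._cond:
            # Readers yield to both active and waiting writers (write preference).
            while self._writer_active or self._writers_waiting > 0:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            self._writers_waiting += 1
            # Wait until no writer is active and all current readers have left.
            while self._writer_active or self._readers > 0:
                self._cond.wait()
            self._writers_waiting -= 1
            self._writer_active = True

    def release_write(self):
        with self._cond:
            self._writer_active = False
            self._cond.notify_all()
```

In a server like the one described, client queries would take the read side while heartbeat updates take the write side, so a steady stream of queries cannot starve incoming heartbeats.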

13 BCDA Clients
• Clients use the API to talk to the database server over the network.
• Web page (CGI) that lists the database
  – Can filter by ENGINEER, GROUP, and subnet.
  – Can select an IOC to see its details.
• Command-line client
  – Can specify what type of information is sent back.
  – Useful for scripts that need up-to-date information. I have a script that takes me to the latest TOP directory for an IOC.
• XML clients
  – Can be run from the command line or as a web page (CGI).

14 Conclusion
• Has been working well for over a year at APS.
  – There have been no issues with IOCs from running the alive record.
• Released as version 1-0-0 recently.
• It is in synApps 5.8.
• Plan to release the database code soon.
  – Need to fix a few minor issues/inconveniences.
  – Need to simplify some of the state logic.
  – The code has some paths that need to be changed to compile-time options.

