Ok-Kyoon Ha, Guy Martin Tchamgoue, Jeong-Bae Suh, and Yong-Kee Jun Department of Informatics, Gyeongsang National University, Republic of Korea.


1 Ok-Kyoon Ha, Guy Martin Tchamgoue, Jeong-Bae Suh, and Yong-Kee Jun Department of Informatics, Gyeongsang National University, Republic of Korea

2 Contents
ARINC-653
ARINC-653 Health Management
Data Races
On-the-fly Race Healing Framework
Race Healing Mechanism
Development
Evaluation
Conclusion

3 ARINC-653 (DASC 2010)
The ARINC-653 standard defines an application executive (APEX)
– To provide OS or middleware services for integrated modular avionics (IMA)
The main objective of ARINC-653 is to provide temporal and spatial partitioning
– To enable applications, each executing in its own partition, to run simultaneously and independently on the same architecture
Temporal partitioning provides strict time slicing to guarantee that only one application accesses resources at any given time
Spatial partitioning provides strict memory management by guaranteeing that each partition has exclusive access to its memory area
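The strict time slicing described on this slide can be pictured as a fixed, repeating schedule of partition windows. The sketch below is only an illustration under assumed values: the partition identifiers, the window lengths, and the helpers `major_frame_length` and `partition_at` are hypothetical and are not taken from the presentation or from the ARINC-653 specification.

```c
#include <stddef.h>

/* Hypothetical partition window: which partition owns the CPU, and for how long. */
typedef struct {
    int partition_id;   /* illustrative partition identifier */
    unsigned duration;  /* window length in milliseconds (assumed unit) */
} window_t;

/* Assumed repeating schedule of 100 ms: partition 1, then 2, then 3. */
static const window_t schedule[] = {
    {1, 50}, {2, 30}, {3, 20}
};
static const size_t schedule_len = sizeof schedule / sizeof schedule[0];

/* Total length of one pass through the schedule. */
unsigned major_frame_length(void) {
    unsigned total = 0;
    for (size_t i = 0; i < schedule_len; i++)
        total += schedule[i].duration;
    return total;
}

/* Return the single partition allowed to run at time t (ms since start). */
int partition_at(unsigned t) {
    unsigned offset = t % major_frame_length();
    for (size_t i = 0; i < schedule_len; i++) {
        if (offset < schedule[i].duration)
            return schedule[i].partition_id;
        offset -= schedule[i].duration;
    }
    return -1; /* not reached: the windows cover the whole frame */
}
```

With these assumed windows, `partition_at(60)` selects partition 2, and the pattern repeats every 100 ms, so exactly one partition owns the processor at any instant.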

4 ARINC-653 Health Management (1/2)
An important feature of ARINC-653 is its health monitor (HM)
– It is responsible for detecting hardware and software failures and for providing recovery mechanisms
– Its objective is to contain and isolate faults before they propagate across the whole system
The HM manages recovery tables at three levels, indexed by both the error identifier and the system state, for precise error handling
– System HM Table
– Module HM Table
– Partition HM Table
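A recovery table indexed by error identifier and system state can be sketched as a two-dimensional lookup. The example below is a minimal illustration of that indexing idea for a single table; the error identifiers, system states, recovery actions, and the function `hm_lookup` are hypothetical names chosen for this sketch, not the contents of any real ARINC-653 configuration.

```c
/* Illustrative error identifiers and system states (hypothetical names). */
typedef enum { ERR_DEADLINE_MISSED, ERR_APPLICATION_ERROR,
               ERR_MEMORY_VIOLATION, ERR_COUNT } error_id_t;
typedef enum { STATE_PARTITION_INIT, STATE_PROCESS_EXECUTION,
               STATE_COUNT } system_state_t;

/* Illustrative recovery actions (hypothetical names). */
typedef enum { ACT_IGNORE, ACT_CALL_ERROR_HANDLER,
               ACT_RESTART_PARTITION, ACT_STOP_PARTITION } recovery_action_t;

/* One assumed table: rows are error identifiers, columns are system states. */
static const recovery_action_t partition_hm_table[ERR_COUNT][STATE_COUNT] = {
    /*                          PARTITION_INIT         PROCESS_EXECUTION      */
    [ERR_DEADLINE_MISSED]   = { ACT_IGNORE,            ACT_CALL_ERROR_HANDLER },
    [ERR_APPLICATION_ERROR] = { ACT_RESTART_PARTITION, ACT_CALL_ERROR_HANDLER },
    [ERR_MEMORY_VIOLATION]  = { ACT_STOP_PARTITION,    ACT_RESTART_PARTITION  },
};

/* Look up the action for (error, state); unknown pairs stop the partition. */
recovery_action_t hm_lookup(int error, int state) {
    if (error < 0 || error >= ERR_COUNT || state < 0 || state >= STATE_COUNT)
        return ACT_STOP_PARTITION;
    return partition_hm_table[error][state];
}
```

Indexing by both keys lets the same error lead to different handling depending on when it occurs: in this assumed table, an application error during process execution is routed to the error handler, while the same error during initialization restarts the partition.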

5 ARINC-653 Health Management (2/2)
For errors at process level, the HM invokes a user-defined aperiodic error handler
– The error handler should be efficient and execute as fast as possible so as not to monopolize the system
[Figure: RTOS configuration (XML), health monitor, module OS, and applications]

6 Data Races (1/2)
Data races may occur when two concurrent threads access a shared memory location without proper inter-thread coordination, and at least one of the accesses is a write
– Unpredictable and mysterious results due to data races may be reported to the programmer
An example of a multithreaded program (let's consider the dCount++ instruction)
Thread A: // dCount is shared
    Lock(L1);
    Read dCount;
    Add one;
    Write dCount;
    Unlock(L1);
Thread B: // dCount is shared
    Read dCount;
    Add one;
    Write dCount;
[Figure: Expected result, showing the Read and Write accesses of Thread A and Thread B]

7 Data Races (2/2)
Under the influence of the scheduler, the program may run into different interleavings and produce unexpected results
Synchronization errors lead to asymmetric races
– Symmetric races are usually benign, but asymmetric races are generally harmful
[Figure: Read/Write interleavings of Thread A and Thread B, one giving a satisfactory result and others giving unexpected results]
Our race healing is motivated by these harmful races
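The effect of different interleavings on the dCount++ example can be replayed deterministically. The sketch below is an illustrative simulation, not the authors' tool: the function `run_schedule` and the schedule strings are hypothetical. Each thread performs a Read step followed by a Write step, and the schedule string gives the order in which the scheduler runs those steps. Because Thread B never acquires L1, the lock held by Thread A does not exclude any of these schedules.

```c
#include <stddef.h>

/* Replay one interleaving of two threads that each execute dCount++.
 * schedule is a string over 'A' and 'B'; the first occurrence of a letter
 * is that thread's Read step, the second is its Add-one-and-Write step. */
int run_schedule(const char *schedule) {
    int dCount = 0;          /* the shared counter */
    int reg[2] = {0, 0};     /* private copy read by each thread */
    int step[2] = {0, 0};    /* how many steps each thread has done */

    for (size_t i = 0; schedule[i] != '\0'; i++) {
        int t = (schedule[i] == 'A') ? 0 : 1;
        if (step[t] == 0)
            reg[t] = dCount;         /* Read dCount */
        else if (step[t] == 1)
            dCount = reg[t] + 1;     /* Add one; Write dCount */
        step[t]++;
    }
    return dCount;
}
```

Under this model, `run_schedule("AABB")` returns 2, the expected result, while `run_schedule("ABAB")` returns 1 because both threads read the old value before either writes: one update is lost.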

8 On-the-fly Race Healing Framework (1/2)
We reinforce the native health monitoring function of ARINC-653 with race detection and healing abilities
[Figure: Concept of race healing in ARINC-653. Labels: Race Detection, Health Monitor, Partition OS, Race Healing (Add/Remove Lock, Value Checking); arrows: Notifies, Invokes, Heals]

9 On-the-fly Race Healing Framework (2/2)
[Figure: Instrumented Program, On-the-fly Race Detection Engine, On-the-fly Race Healing Engine, Log, Partition OS, Health Monitor, and Native Error Handler, connected by steps (1) to (5)]
(1) The instrumented program is monitored by the on-the-fly race detector
(2) Once a data race is detected, the HM is notified
(3) The race healer is invoked by the concerned partition OS as the error handler
(4) The race healer accesses the racing code and tries to heal the data race
(5) If the healer fails to do this, a notification is sent back to the HM, which might launch an emergency recovery function
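Steps (1) to (5) can be sketched as a small control flow. This is only an illustration of the sequence listed above under simplifying assumptions: every type and function name here (`health_monitor_t`, `monitor_access`, `hm_notify_race`, the two toy healers) is hypothetical, and the health monitor is a plain struct standing in for the real ARINC-653 services.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { OUTCOME_NO_RACE, OUTCOME_HEALED,
               OUTCOME_EMERGENCY_RECOVERY } outcome_t;

/* Hypothetical error handler type: returns true when the race was healed. */
typedef bool (*error_handler_t)(int race_id);

/* Stand-in for the health monitor (not the ARINC-653 API). */
typedef struct {
    error_handler_t handler;    /* healer registered as the error handler */
    int emergency_recoveries;   /* number of times step (5) was reached */
} health_monitor_t;

/* Step (2): the detector notifies the HM; steps (3) to (5) follow. */
outcome_t hm_notify_race(health_monitor_t *hm, int race_id) {
    /* (3) invoke the registered healer, (4) which tries to heal the race */
    if (hm->handler != NULL && hm->handler(race_id))
        return OUTCOME_HEALED;
    /* (5) healing failed: the HM falls back to emergency recovery */
    hm->emergency_recoveries++;
    return OUTCOME_EMERGENCY_RECOVERY;
}

/* Step (1): a monitored access; race_detected is the detector's verdict. */
outcome_t monitor_access(health_monitor_t *hm, bool race_detected, int race_id) {
    if (!race_detected)
        return OUTCOME_NO_RACE;
    return hm_notify_race(hm, race_id);
}

/* Two toy healers used only to exercise both branches. */
bool healer_always_succeeds(int race_id) { (void)race_id; return true; }
bool healer_always_fails(int race_id)    { (void)race_id; return false; }
```

Registering `healer_always_fails` and reporting a race drives the flow to the emergency-recovery branch, which mirrors the fallback in step (5).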

10 Race Detection Engine
For on-the-fly race detection, our framework uses the protocol presented by Dinning and Schonberg (1991)
– This protocol guarantees to detect at least one race for each shared variable, if any exists
– The protocol defines the structure and the maintenance policy for an access history with a locking mechanism
[Figure: example with threads T_M, T_A, and T_B and accesses R1, W2, R3, W4; access history with columns Read, Write, CS-Read, and CS-Write; reported races: W2-R3, R1-W4, W2-W4]
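The condition that makes two recorded accesses a race can be sketched with a simple pairwise check. This is not the Dinning and Schonberg protocol itself, which maintains a bounded access history rather than comparing every pair; it is a simplified illustration with hypothetical names (`access_t`, `is_race`, `count_races`). The sample data assumes that R1 and W2 belong to one thread, that R3 and W4 belong to another concurrent thread, and that no lock is held, which is one reading of the slide's figure.

```c
#include <stdbool.h>
#include <stddef.h>

/* One recorded access to a shared variable (illustrative structure). */
typedef struct {
    const char *label;  /* e.g. "R1" */
    int thread;         /* owning thread; distinct threads are assumed concurrent */
    bool is_write;      /* true for a write access */
    unsigned locks;     /* bitmask of locks held during the access */
} access_t;

/* Two accesses race if they come from different (concurrent) threads,
 * at least one is a write, and they share no common lock. */
bool is_race(const access_t *a, const access_t *b) {
    return a->thread != b->thread
        && (a->is_write || b->is_write)
        && (a->locks & b->locks) == 0;
}

/* Count racing pairs in a history by comparing every pair once. */
size_t count_races(const access_t *history, size_t n) {
    size_t races = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (is_race(&history[i], &history[j]))
                races++;
    return races;
}

/* Assumed reading of the slide's example: no locks held anywhere. */
static const access_t slide_history[] = {
    {"R1", 0, false, 0}, {"W2", 0, true, 0},
    {"R3", 1, false, 0}, {"W4", 1, true, 0},
};
```

Under these assumptions the check finds three racing pairs (W2-R3, R1-W4, and W2-W4), while R1-R3 is excluded because both accesses are reads.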

11 Race Healing Engine
To heal asymmetric races, our technique inserts a lock into the unsynchronized or incompletely synchronized thread to remove or change the interleaving
[Figure: Read/Write accesses of Thread A and Thread B at race detection, after healing, and at race detection again]
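The effect of the inserted lock can be shown at source level with OpenMP directives. This is a hedged sketch of the outcome of healing, not of the healing engine, which the slides describe as acting on the running program: the function `healed_count`, the iteration count, and the use of a named `critical` section for lock L1 are assumptions made for illustration.

```c
/* Both threads increment the shared counter; Thread B's critical section
 * represents the lock inserted by healing. Build with OpenMP enabled
 * (for example gcc -fopenmp); without it the pragmas are ignored and the
 * code simply runs sequentially. */
int healed_count(int iterations) {
    int dCount = 0;  /* shared counter, as in the dCount++ example */

    #pragma omp parallel sections shared(dCount)
    {
        #pragma omp section
        {   /* Thread A: already protected by L1 in the original program */
            for (int i = 0; i < iterations; i++) {
                #pragma omp critical (L1)
                dCount++;
            }
        }
        #pragma omp section
        {   /* Thread B: originally unprotected; L1 added by healing */
            for (int i = 0; i < iterations; i++) {
                #pragma omp critical (L1)
                dCount++;
            }
        }
    }
    return dCount;
}
```

With both sections guarded by the same named critical section, the read-add-write sequences can no longer interleave, so `healed_count(n)` yields `2 * n` on every run. Removing the directive from Thread B's loop restores the asymmetric race.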

12 Development
Environment
– Single Board Computer (SBC) with 2 Intel Xeon dual-core CPUs and 4 GB of memory
– RT-Linux operating system
– GNU C compiler 4.3 for OpenMP
– Simulated integrated modular avionics (SIMA) was installed to provide ARINC-653 services
The race detector and the race healer are both implemented as dynamic libraries in C
– The healing function is registered in each monitored program as its error handler
– Upon race detection, the SIMA HM is notified by the race detector using the RAISE_APPLICATION_ERROR system call

13 Evaluation
The efficiency of our framework was evaluated by analyzing the overhead of the race healing functions
– The overhead comes from the actions of the label generator, the race detector, and the race healer
The results show that our technique slows down the original program execution by about 2 times on average
– A set of synthetic programs that only consider asymmetric races was developed using OpenMP directives

14 Conclusion
Race Healing Framework
– This paper presents a framework that can be embedded in the ARINC-653 health monitor to detect and heal data races on-the-fly
– It ensures that the flight software runs safely
Experimentation and Result
– The framework was implemented on the simulated integrated modular avionics (SIMA) that provides ARINC-653 services
– The experimental results show that our framework slows down the original program execution by about 2 times on average
– The overhead introduced by our framework is manageable for a large class of soft real-time programs
We will extend the healing functionality to handle more general race patterns
