1
Hardware & Software Support for Mixed-Criticality Multicore Systems
Glenn Farrall, Infineon Technologies; Claus Stellwag, Elektrobit Automotive; Jonas Diemer, TU Braunschweig
2
Agenda
TriCore® Introduction
AURIX™ Devices
Support for Spatial & Temporal Isolation
Interrupt Reliability Example
Core to Core Communication
WICERT 2013 Presentation
3
TriCore Introduction
The most widely distributed microcontroller you’ve probably never heard of
  In approximately 50% of all automobiles produced this year
32-bit architecture with a focus on hard real time
  16/32-bit instruction length, register operations, support for C and DSP native data types and operations
Application areas
  Automotive powertrain
  Stability control systems
  EVehicle: charging, BMS, etc.
  Industrial control
4
TriCore Based Products
[Figures: the marketing view; the engineering view]
I would highlight that, apart from a great many peripherals, there is a second CPU hidden in this diagram: “the peripheral processor unit”, version 2 to be precise. So we have had some experience of multiprocessors, starting with the slightly harder heterogeneous style. In positioning these (and earlier generations) for safety-critical applications, we began to run up against some of the challenges that made the RECOMP project an ideal opportunity to explore better solutions to some of the isolation problems we had hit in these earlier generations.
5
Agenda
TriCore Introduction
AURIX Devices
Support for Spatial & Temporal Isolation
Interrupt Reliability Example
Core to Core Communication
6
AURIX MultiCore Devices
AURIX is the name of a family of TriCore based devices. This is a block diagram of a high-end member with 3 cores; the family ranges from uniprocessor devices through to 3-CPU devices such as this one. There are features such as “checker cores” associated with two of the CPUs here. This means the master CPU and the checker core operate as a lockstep pair: any difference between the outputs of the master and the checker is reported to the system safety manager as an error, and within a few clocks the system is indicating to the outside world that an error has occurred. I mention this here because this lockstep configuration is not novel and not part of the RECOMP developments.
7
Agenda
TriCore Introduction
AURIX Devices
Support for Spatial & Temporal Isolation
Interrupt Reliability Example
Core to Core Communication
8
Spatial Isolation: MPUs
Definitions
  MPU: Memory Protection Unit
  MMU: Memory Management Unit
An MPU provides functionality to check that memory and I/O accesses are allowed. At minimum:
  some description of the address region covered (explicit or implicit)
  some combination of fetch, read and write permissions
  a trap (exception) mechanism for when access is not permitted
Without additional storage in the system there is little utility for an MMU in embedded systems.
[Figure: the MPU holds a lower bound, an upper bound and access rights for each access range n. Access Range 0 covers 0000 to 1FFF with rights rd, wr, ex; Access Range 1 covers 2000 to 3FFF with rights rd, ex. The CPU’s LD D0, 0x01000 is permitted and returns the data at 0x01000; its ST D0, 0x03000 takes a trap.]

Much of this spatial isolation functionality was present in earlier TriCore based systems. The enhancements developed during RECOMP were along the lines of supporting a “virtualisation lite” for a hard real-time hypervisor. So the changes in the MPU system were to cover the previously ignored I/O space, so that I/O permission is granted explicitly by address range. Compare this with earlier generations, where the supervisor could access everything and the user could access nothing.
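The access-range example in the figure can be modelled in a few lines. This is a minimal sketch, not the TriCore register interface: the names `AccessRange`, `check_access` and `ProtectionTrap` are hypothetical, and treating both bounds as inclusive is an assumption read off the figure (0000 to 1FFF, 2000 to 3FFF).

```python
from dataclasses import dataclass


class ProtectionTrap(Exception):
    """Models the trap taken when no range permits the access."""


@dataclass(frozen=True)
class AccessRange:
    lower: int          # lowest address covered (inclusive in this model)
    upper: int          # highest address covered (inclusive in this model)
    rights: frozenset   # any of "rd", "wr", "ex"


# Values read off the slide's figure.
RANGES = (
    AccessRange(0x0000, 0x1FFF, frozenset({"rd", "wr", "ex"})),  # Access Range 0
    AccessRange(0x2000, 0x3FFF, frozenset({"rd", "ex"})),        # Access Range 1
)


def check_access(address, kind, ranges=RANGES):
    """Return True if some range covers `address` and grants `kind`, else trap."""
    for r in ranges:
        if r.lower <= address <= r.upper and kind in r.rights:
            return True
    raise ProtectionTrap(f"{kind} access to {address:#x} is not permitted")


def trapped(address, kind):
    """Demo helper: True if the access would take a trap."""
    try:
        check_access(address, kind)
    except ProtectionTrap:
        return True
    return False
```

With these ranges, `check_access(0x1000, "rd")` corresponds to the permitted LD, and `trapped(0x3000, "wr")` corresponds to the ST that traps because Access Range 1 grants no write right.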
9
Spatial Isolation: Risks
The MPU in each core allows memory regions to be allocated safely, with 2 caveats:
The MPU (in a core) doesn’t protect memory from other bus masters if they are misconfigured
  Mixed-criticality software: must presume one core is running “unsafe” software
The MPU doesn’t protect memory from soft error events (or hard errors which occur during runtime) after an address has been checked by the MPU
10
Risks Remaining with MPU
[Figure: protected memory for CPU0 in the LMU. Each CPU has a configured MPU; two of the masters are marked as possibly misconfigured (QM code), and soft errors are marked on the access paths.]
To make this concrete, consider a system with CPU0 requiring exclusive access to resources in the Local Memory Unit (LMU) [click]. All three CPUs will configure their MPUs to respect this allocation [click]. But if only the code on CPU0 is developed to the highest criticality level, then there can be no confidence that the code on the other CPUs will not, under some circumstances, violate this allocation, either directly [click] or indirectly [click] through misprogramming a DMA operation to overwrite LMU memory. In the past there was an “analogue” of a stripped-down MPU in the DMA, but once we realised that a more general mechanism was required (especially to support the QM-managed MPU), it was removed.
In addition to misprogramming we also have the issue of soft and hard errors [click]. It is possible that a correctly programmed address is modified by a soft error (flipping a high-order address bit, for example) so that it now targets the LMU SRAM.
To solve both of these problems (mixed-criticality code and soft errors) we have added an incoming gate to the LMU [click], so write accesses are checked to ensure that the *MASTER* attempting the modification is permitted to access the SRAM. This gate is not unique to the SRAM: in support of isolation of tasks and of showing “freedom from interference”, we have added a gate [click] to all resources in the system. The setup of the gate is expected to be done once (at startup), although it can be changed during system runtime if a number of hurdles to inadvertent modification are overcome.
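The incoming gate described in these notes checks who is writing rather than where. A minimal sketch of that idea follows; the class and function names (`GatedMemory`, `GateViolation`, `try_write`) and the master labels are hypothetical, and the runtime reconfiguration hurdles mentioned in the notes are not modelled.

```python
class GateViolation(Exception):
    """Models the error raised when a master is not allowed through the gate."""


class GatedMemory:
    """A memory block whose writes are checked against the requesting master."""

    def __init__(self, size, write_masters):
        self._cells = [0] * size
        # Configured once at construction, mirroring "done once (at startup)".
        self._write_masters = frozenset(write_masters)

    def write(self, master, offset, value):
        # The check uses the identity of the bus master, not the address, so
        # it still holds if that master's own MPU is misconfigured or a soft
        # error changed the address after the MPU had checked it.
        if master not in self._write_masters:
            raise GateViolation(f"{master} may not write this resource")
        self._cells[offset] = value

    def read(self, offset):
        return self._cells[offset]


def try_write(memory, master, offset, value):
    """Demo helper: True if the write passed the gate."""
    try:
        memory.write(master, offset, value)
    except GateViolation:
        return False
    return True


# LMU region reserved for CPU0, as in the example above.
lmu = GatedMemory(size=16, write_masters={"CPU0"})
```

A write from `"CPU0"` succeeds, while writes arriving from `"CPU1"` or `"DMA"` are rejected at the gate and leave the stored value unchanged.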
11
Temporal Isolation & WCET
For multicore systems, determining a WCET estimate can be problematic
  Co-running applications cause interference that lengthens the run time, and they may not be known when timing budgets are set
  To make matters worse, determining the highest-interfering co-application(s) is not easy either
A system which prevents (or avoids) high interference (i.e. provides temporal isolation) means that the pessimism in the WCET estimate can be much smaller.

At start of slide, announce: we now move from considering spatial isolation to temporal isolation. Consider running an application standalone on an MP system. Depending on the degree of resource contention, the application will see some slowdown.
After slide content: note that a system with temporal isolation does not have to provide identical timing to a legacy uniprocessor system; even when porting between different members of a product family you would expect performance in cycle counts or microseconds to differ by a small amount.
12
Interference Controllable System
The AURIX family has support for controlling timing interference.
  The main interconnect is a crossbar, with no contention for disjoint resource access
Achieves low interference with specific usage decisions
  allocation of resources to specific tasks and cores
  enforcement of that allocation by MPU (and access gate) configuration
Remaining temporal interference is just due to the peripheral bus
  interference between applications is now comparable to DMA and controllable by arbitration priority decisions
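The crossbar property can be illustrated with a toy one-cycle arbiter: masters that target different resources are all granted, and only masters that target the same resource compete. This is a sketch under stated assumptions, not the AURIX arbitration scheme: the function name `arbitrate`, the resource labels, and the rule that a lower priority number wins are all hypothetical.

```python
def arbitrate(requests, priority):
    """Resolve one cycle of crossbar requests.

    requests: dict mapping master name to the resource it wants this cycle.
    priority: dict mapping master name to an int (lower wins, by assumption).
    Returns (granted, stalled) as two sets of master names.
    """
    winner_for = {}
    for master, resource in requests.items():
        current = winner_for.get(resource)
        if current is None or priority[master] < priority[current]:
            winner_for[resource] = master
    granted = set(winner_for.values())
    stalled = set(requests) - granted
    return granted, stalled


PRIORITY = {"CPU0": 0, "CPU1": 1, "DMA": 2}

# Disjoint resources: nobody waits.
disjoint = arbitrate({"CPU0": "LMU", "CPU1": "FLASH"}, PRIORITY)

# Two masters on the shared peripheral bus: the priority setting decides.
shared = arbitrate(
    {"CPU0": "peripheral_bus", "CPU1": "peripheral_bus", "DMA": "LMU"}, PRIORITY
)
```

In the second case only `CPU1` stalls, which is the kind of interference that the slide says can be controlled through arbitration priority decisions.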
13
Agenda
TriCore Introduction
AURIX Devices
Support for Spatial & Temporal Isolation
Interrupt Reliability Example
Core to Core Communication
14
Interrupt Infrastructure
Interrupt Router (IRU) directs trigger events
  software interrupt
  pin signal
  peripheral state events, e.g. timer count down or comm data arrival
IRU is highly configurable per trigger event
  select service provider (CPUx or DMA)
  select priority (ISR priority on CPUx or DMA channel)
Soft (and hard) errors
  could corrupt stored state in the IRU
  could change values transferred to the Service Provider
[Figure: IRU and CPU, with soft errors marked at two points]

Read: the sources are pretty conventional [click]. The interrupt router is enhanced from earlier generations, where there were just two CPUs that could receive interrupts. We now have 4 service providers: up to 3 CPUs and the DMA engine. The hard real-time focus means each CPU has hardware management of 255 priorities. So far, so good, but for safety systems we need to ensure that any errors resulting in misexecution are detected. [click] We can see two areas where soft errors could disrupt things [click].
15
Protecting Interrupt Integrity
There is ECC on the protocol between the IRU and Service Providers to check that state is correct when it is used
  Ensures correct service provider is signaled
  Ensures correct priority/DMA channel is received
  Ensures trigger event was correctly enabled
An ISR executing on a CPU can be sure it has been initiated by the correct trigger source; the only remaining check required in SW is on interrupt rate
  too many interrupts: babbling idiot?
  too few interrupts, e.g. wrong time base, or perhaps none due to a broken trace
Programming of the IRU is protected by the gate mechanism => can restrict access to only trusted CPUs/tasks.
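The remaining software check on interrupt rate could look roughly like the following. It is a minimal sketch, assuming the ISR calls `on_interrupt` and some periodic supervisor task calls `close_window` once per monitoring window; the class name, method names and thresholds are hypothetical and not taken from the presentation.

```python
class InterruptRateMonitor:
    """Counts interrupts per monitoring window and classifies the rate."""

    def __init__(self, min_per_window, max_per_window):
        if min_per_window > max_per_window:
            raise ValueError("min_per_window must not exceed max_per_window")
        self._min = min_per_window
        self._max = max_per_window
        self._count = 0

    def on_interrupt(self):
        # Called from the ISR: only counts, so it stays cheap.
        self._count += 1

    def close_window(self):
        # Called once per window: classify the count, then start a new window.
        count, self._count = self._count, 0
        if count > self._max:
            return "too_many"   # e.g. a babbling source
        if count < self._min:
            return "too_few"    # e.g. wrong time base or a broken trace
        return "ok"


def run_window(monitor, interrupts):
    """Demo helper: deliver `interrupts` interrupts, then close the window."""
    for _ in range(interrupts):
        monitor.on_interrupt()
    return monitor.close_window()
```

For a monitor built as `InterruptRateMonitor(2, 4)`, a window with 3 interrupts is classified as `"ok"`, one with none as `"too_few"`, and one with 5 as `"too_many"`.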
16
Agenda
TriCore Introduction
AURIX Devices
Support for Spatial & Temporal Isolation
Interrupt Reliability Example
Core to Core Communication
17
Core-2-Core Module
Claus Stellwag (Elektrobit), March 2013, WICERT
18
Concepts (1)
Basic
  Should fit into an AUTOSAR system
  Static approach / no dynamics
  HW assumption: shared memory accessible on all cores
Safety
  No propagation of faults over C2C
  Lock-free behaviour (no deadlocks)
  Only “local” writes, allowing protection mechanisms (MPU)
  State handling (safety state)
13 November 2018, Slide 18
19
Concepts (2) Initializing
Requirement to update individual cores without needing to update all of them
How to find other cores’ structures after an update?
  Search for other cores
  Add / remove cores
20
Concepts (3) States of the C2C module
21
Concepts (4) Communication based on channels (“message box”)
Messages are sent/received with channels
“Last is best” semantics
1 sender (core) and multiple receivers (cores)
[Figure: a sender passes a message through a channel to several receivers]
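One common way to get “last is best” semantics with a single writer and read-only receivers is a sequence-counter (seqlock-style) message box. The sketch below uses that technique as an assumption; the presentation does not say how the C2C module implements its channels, and the `Channel` class and its methods are hypothetical. It models the protocol only: a real multicore implementation would also need memory barriers and atomic accesses.

```python
class Channel:
    """Single-sender message box; receivers only read, never write."""

    def __init__(self):
        self._seq = 0          # even: message stable, odd: write in progress
        self._message = None   # latest message, overwritten on every send

    def send(self, message):
        # Only the sender core writes, and only to its own channel.
        self._seq += 1         # mark write in progress (odd)
        self._message = message
        self._seq += 1         # mark message stable again (even)

    def receive(self, max_retries=3):
        # Bounded retries: a receiver never blocks on the sender.
        for _ in range(max_retries):
            before = self._seq
            message = self._message
            after = self._seq
            if before == after and before % 2 == 0:
                return message
        return None            # no consistent snapshot within the retry budget


channel = Channel()
for value in (1, 2, 3):
    channel.send(value)

# Several receivers read the same channel; reading does not consume.
seen_by_receivers = [channel.receive() for _ in range(3)]
```

Every receiver observes the newest value, and older values are simply overwritten, which is what distinguishes a message box from a queue.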
22
Concepts (5)
[Figure: sender core and receiver cores, each running tasks; the tasks exchange messages through channels held in local RAM]
23
Configuration
24
Questions
25
Prototyping
26
27
EcoSystem Debugging
28
Application Prototyping
29
Supporting Material if Required
30
TriCore Memory Protection Unit (MPU)
The memory protection system allows up to 8 code regions to be accessed concurrently
The memory protection system allows up to 16 data regions to be accessed concurrently
  these regions can grant access to peripheral addresses as well as memory addresses
  e.g. can configure protection so that TaskA can load/store directly to SPI0, but not SPI1, while TaskB can load/store directly to SPI1, but not SPI0
RECOMP DATE Tutorial 2012
31
Memory Range Definitions
Code Protection Range Registers: code ranges 0 to 7, each defined by a lower bound and an upper bound
Data Protection Range Registers: data ranges 0 to 15, each defined by a lower bound and an upper bound
Typically set 0 is used for regular (background) tasks and set 1 is used for ISRs
32
Memory Protection Sets (0..3)
Code Protection Range Registers: code ranges 0 to 7, each defined by a lower bound and an upper bound
Data Protection Range Registers: data ranges 0 to 15, each defined by a lower bound and an upper bound
Each protection set (0 to 3) has its own enable registers, which select ranges
  Execute Enable Register: execution permitted or execution NOT permitted for each code range
  Read Enable Register and Write Enable Register: together give no access, read only, write only, or read & write for each data range
Typically set 0 is used for regular (background) tasks and set 1 is used for ISRs
Note: by construction, any address not enabled in a range definition has no execute or data access
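The combination of shared range definitions and per-set enable registers can be sketched as bitmasks, where bit n of an enable register refers to data range n. This is an illustrative model only: the bit layout, the names `ProtectionSet` and `data_access_permitted`, and the two address windows standing in for SPI0 and SPI1 are assumptions, not values from a device manual.

```python
from dataclasses import dataclass

# Hypothetical address windows; real peripheral addresses are device specific.
DATA_RANGES = (
    (0x1000, 0x10FF),  # data range 0: stands in for SPI0
    (0x2000, 0x20FF),  # data range 1: stands in for SPI1
)


@dataclass(frozen=True)
class ProtectionSet:
    read_enable: int    # bit n set: reads allowed in data range n
    write_enable: int   # bit n set: writes allowed in data range n


def data_access_permitted(pset, address, kind, ranges=DATA_RANGES):
    """True if an enabled range of `pset` covers `address` for `kind`."""
    if kind not in ("rd", "wr"):
        raise ValueError("kind must be 'rd' or 'wr'")
    mask = pset.read_enable if kind == "rd" else pset.write_enable
    for n, (lower, upper) in enumerate(ranges):
        if (mask >> n) & 1 and lower <= address <= upper:
            return True
    # Addresses outside every enabled range get no access by construction.
    return False


# One protection set per task, as in the TaskA/TaskB example.
TASK_A = ProtectionSet(read_enable=0b01, write_enable=0b01)  # SPI0 only
TASK_B = ProtectionSet(read_enable=0b10, write_enable=0b10)  # SPI1 only
```

Switching tasks then only means selecting a different protection set; the range definitions themselves stay untouched.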
33
Memory Protection System Traps
Any non-permitted access (memory or peripheral) takes a protection trap
  PROTECTION READ
  PROTECTION WRITE
  PROTECTION EXECUTION
  with the violating address and other information available
This information allows the supervisor code to decide to either
  terminate the task, or
  emulate the access on behalf of the task, or
  reconfigure the memory protection system to permit the access and then return to the task to retry the access, or
  take any other action that is suitable to the system (e.g. suspend the task pending some other operation).
34
TriCore Interrupt Priority Numbers
The PN (interrupt Priority Number) of a core goes from 0 (lowest) to 255 (highest).
Event triggers from peripherals or software are managed by service request nodes in the interrupt router
  these contain an SRPN (Service Request Priority Number) between 0 and 255
An interrupt is taken by a core when two conditions are true
  the core’s interrupt enable bit (ICR.IE) is set (1)
  the SRPN of an incoming interrupt is greater than the core’s Current CPU Priority Number (ICR.CCPN)
If a CPU has a CCPN of 0 and IE is 1, then any interrupt with an SRPN > 0 will be taken
If a CPU is at a CCPN of 255, then regardless of the IE value no interrupt received will be taken
Programming an SRPN to 0 will cause it to never be taken by a CPU (but it will be taken by a DMA).
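The two acceptance conditions reduce to a single predicate. The sketch below restates the rule from this slide for a CPU service provider only; the function name `interrupt_taken` is hypothetical, and the DMA case (where an SRPN of 0 is still serviced) is deliberately not modelled.

```python
def interrupt_taken(ie, ccpn, srpn):
    """True if a CPU with ICR.IE == ie and ICR.CCPN == ccpn takes the request."""
    for name, value in (("ccpn", ccpn), ("srpn", srpn)):
        if not 0 <= value <= 255:
            raise ValueError(f"{name} must be in the range 0..255")
    # Both conditions must hold: interrupts enabled, and the request's
    # priority strictly greater than the current CPU priority.
    return bool(ie) and srpn > ccpn
```

The strict comparison explains the corner cases listed above: with a CCPN of 255 no SRPN can be greater, and an SRPN of 0 can never exceed any CCPN.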