1 “Embedded Operating Systems: The State of the Art”. Presented by: Jeff Schaffer, Sr. Field Applications Engineer, QNX Software Systems. QNX is a leading provider of real-time operating system (RTOS) software, development tools, and services for mission-critical embedded applications.
2 Role of the Embedded OS. Traditional: permit sharing of common resources of the computer (disks, printers, CPU); provide low-level control of I/O devices that may be complex, time dependent, and non-portable; provide device-independent abstractions (e.g. files, filenames, directories). Additional roles: prevent common causes of system failure and instability, and minimize impact when they occur; extend system life cycles; isolate problems during development and at runtime.
3 Architecture Comparison. REAL TIME EXECUTIVE. Advantage: single address space. Disadvantage: single address space, different binary images. Failure: means reboot. MONOLITHIC KERNEL. Advantage: apps run in own memory space. Disadvantage: kernel not protected, kernel testing. Failure: might mean reboot. TRUE MICROKERNEL. Advantages: modules run in own memory space; add/replace services on the fly; reusable modules; direct hardware access. Disadvantage: context switching. Failure: usually does not mean reboot.
4 MicroKernel – Neutrino. [Diagram labels: Flash fsys, Audio driver, TCP/IP, Process Manager, App, Serial driver, HTTP server, Java, Photon GUI, Microkernel; X86, PPC, MIPS, SH4, ARM, StrongARM, XScale.] Philosophy: a trusted kernel running a system of untrusted software components. Dynamic architecture makes hot-start and upgrades easy, even with drivers. Processes communicate via messages or other methods, such as shared memory; this permits loose inter-module coupling. Processes provide a reusable component model with well defined message interfaces. No requirement for filesystem, GUI, etc.
5 Typical Forms of IPC: Mailboxes, Pipes, Message Queues, Shared Memory. [Diagram labels: Process 1, Process 2, Kernel; message queue holding msg 2 to msg 5; shared memory object mapped into each process address map.]
6 Which Architecture for me? Depends on your application and processor! Simple apps (such as single control loops) generally only need a real-time executive. As the system becomes more complex, it typically needs a more complex operating system architecture. Need to look at factors such as scalability and reliability. Do standards matter?
7 Advantages of standard APIs. Two most common standards. Advantages of standards: portability of code; hiring of programmers.
8 Do I need Real-Time? Maybe ... What is Real Time? Less than 1 second response? Less than 1 millisecond response? Less than 1 microsecond response?
9 Real-Time. "A real-time system is one in which the correctness of the computations not only depends upon the logical correctness of the computation but also upon the time at which the result is produced. If the timing constraints of the system are not met, system failure is said to have occurred." Donald Gillies (comp.realtime FAQ)
10 A Simple Example... “it doesn’t do you any good if the signal that cuts fuel to the jet engine arrives a millisecond after the engine has exploded” Bill O. Gallmeister - POSIX.4 Programming for the Real World
11 “Hard” vs. “Soft” Real Time. Hard: absolute deadlines; late responses cannot be tolerated and may have a catastrophic effect on the system; example: flight control. Soft: systems which have reduced constraints on “lateness”, e.g. late responses may still have some value; still must operate very quickly and repeatably; example: cardiac pacemaker; ATM.
12 Real-time OS Requirements. Operating system factors that permit real-time: thread scheduling; control of priority inversion; time spent in kernel; interrupt processing.
14 Factor #2: Priority Inversion. [Diagram labels: Information Bus Manager, Communications Task, Meteorological Data Gathering Task. Source: Embedded Systems Programming.] Sequence: 1. Low priority task acquires bus mutex to transfer data. 2. High priority task blocks until mutex released. 3. Medium priority task pre-empts low priority task. 4. Watchdog timer resets since Bus Manager has not run in some time.
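The four-step sequence on this slide can be reproduced in a small tick-by-tick simulation. This is a minimal sketch in Python, not code from the presentation: the task names, priorities, arrival ticks, and work amounts are invented for illustration, and the `inherit` flag models priority inheritance, a common remedy that the slide itself does not name.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int       # larger number = more urgent (assumed convention)
    arrival: int        # tick at which the task becomes ready
    work: int           # ticks of CPU time still needed
    needs_mutex: bool   # True if all of its work happens while holding the mutex
    done_at: int = -1   # tick at which the task finished


def simulate(inherit: bool) -> dict:
    """Run three hypothetical tasks on one CPU; return their finish ticks."""
    tasks = [
        Task("low", 1, 0, 3, True),
        Task("medium", 2, 2, 5, False),
        Task("high", 3, 1, 1, True),
    ]
    owner = None  # task currently holding the single shared mutex
    tick = 0
    while any(t.work > 0 for t in tasks):
        ready = [t for t in tasks if t.arrival <= tick and t.work > 0]

        def effective(t):
            # With inheritance, the mutex owner runs at the priority of
            # the most urgent task waiting for that mutex.
            if inherit and t is owner:
                waiters = [w.priority for w in ready
                           if w.needs_mutex and w is not owner]
                return max([t.priority] + waiters)
            return t.priority

        # A task is blocked if it needs the mutex and another task owns it.
        runnable = [t for t in ready
                    if not (t.needs_mutex and owner is not None and owner is not t)]
        if runnable:
            current = max(runnable, key=effective)
            if current.needs_mutex:
                owner = current
            current.work -= 1
            if current.work == 0:
                current.done_at = tick + 1
                if owner is current:
                    owner = None  # release the mutex on completion
        tick += 1
    return {t.name: t.done_at for t in tasks}
```

With these assumed numbers, `simulate(False)` lets the medium task run to completion before the high-priority task (which finishes at tick 9), while `simulate(True)` finishes the high-priority task at tick 4.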
15 Factor #3: Kernel Time. Kernel operations must be pre-emptible; if they are not, an unknown amount of time can be spent in the kernel performing an operation on behalf of a user process, which can cause a real-time process to miss its deadline. All kernels have some window (or multiple windows) of time where pre-emption cannot occur. Some operating systems attempt to provide real-time capability by adding “checkpoints” within the kernel so they can be interrupted at these points.
16 Example. A kernel call is a software interrupt. [Timeline diagram labels, in order: int KER; Entry (a few opcodes, interrupts off); Unlocked; Kernel operation, which may include message pass (usecs to msecs, pre-emptable); Locked (usecs, no pre-emption, interrupts on); Unlocked (usecs, pre-emptable); Exit via iret (a few opcodes, interrupts off).]
17 Split Out Long Operations. [Diagram labels: Nto, Proc; Thread, Sync, Message, Sched, Signal, Channel, Clock, Timer, Intr; Fork, Exec, Pathname, Spawn, Mmap, Waitpid, Session, UID/GID, Debug; Process Manager.]
18 Factor #4: Interrupts. This is broken down into the following areas: method of handling the interrupt processing chain; handling of nested interrupts.
19 Interrupt Processing Chain. [Two diagrams, each with labels ISR, INT x, INT y, IST.] First: IST scheduled by normal OS scheduling, deterministic. Second: IST scheduled whenever queue emptied, non-deterministic.
20 Can I Make Any Conventional OS Real-Time? Method: add real-time layer below conventional OS, running conventional OS as a low priority real-time process; add real-time layer to hardware service layer. Problems: different APIs; real-time layer proprietary; existing OS apps not R/T; poor communication between operating systems; loss of control issue. [Diagram labels: Conventional OS, Real-time kernel.]
22 Scaling Solution #1: Single Board, Single Node. [Diagram labels: Mem., Bridge, PCI Bus, CPU, Peripherals.] The only scaling possible is a CPU replacement.
23 Scaling Solution #2: Single Board, Multiple Nodes. [Diagram labels: Mem., Bridge, PCI Bus, CPU, Peripherals, Node 1, Node 2.] Relatively simple to implement. Allows “scaling-on-demand”. Suitable if nodes have independent “work”. Inter-node IPC slower than memory access. Complexity in maintaining global view of data. Difficult to break up computationally-intensive tasks.
24 Scaling Solution #3: Single Board, Multiple Processors. [Diagram labels: Mem., Bridge, CPU, PCI Bus, CPU1, Peripherals.] Tightly-coupled symmetric multiprocessing (SMP). All processors have a symmetric and consistent view of physical memory and peripherals. Scales processing power. Need software (RTOS) support.
25 The SMP OS Dilemma. SMP systems to date use desktop operating systems, which are not responsive enough for real-time requirements: application servers; databases; web servers. Typical real-time operating systems (home-built or commercial), such as are commonly used in routers and switches today, do not have SMP support. SMP capable real-time operating systems run the CPUs as independent processors with independent operating systems.
26 SMP Support. True (tightly coupled) SMP support. Only the kernel needs SMP awareness. Transparent to application software and drivers: identical binaries for UP and SMP systems. Automatic scheduling across all CPUs.
27 QNX “True” SMP. STATE_RUNNING thread on each processor. Priority-based ready queues. Each thread can be locked to a specific CPU by using a processor affinity mask. Scheduler remembers last CPU thread ran on: minimize thread migration; optimize cache usage. Highest-priority READY thread always immediately scheduled. [Diagram labels: Thread, Running, CPU 0, Process, CPU 1, Ready queues, Priority 63, 62, 61, ..., Blocked states.]
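The scheduling rules listed on this slide (priority order, affinity mask, preference for the last CPU) can be sketched as a selection function. This is a simplified Python model under stated assumptions, not the QNX scheduler: the `Thread` fields, the two-CPU default mask, and the tie-breaking rule are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    priority: int          # 0..63, larger runs first (as in the slide's ready queues)
    affinity: int = 0b11   # bit n set = thread may run on CPU n (assumed two CPUs)
    last_cpu: int = 0      # CPU the thread most recently ran on


def pick_thread(ready, cpu):
    """Choose the next READY thread for `cpu`, or None if none is eligible."""
    # The affinity mask filters out threads locked to other CPUs.
    eligible = [t for t in ready if t.affinity & (1 << cpu)]
    if not eligible:
        return None
    # Highest priority wins; among equal priorities, prefer a thread that
    # last ran on this CPU, to limit migration and keep its cache warm.
    return max(eligible, key=lambda t: (t.priority, t.last_cpu == cpu))
```

For example, with two equal-priority threads eligible for CPU 0, the one whose `last_cpu` is 0 is chosen; a higher-priority thread pinned to CPU 1 is never selected for CPU 0.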
28 Why Is Cache Important? Cache efficiency is probably the single largest determinant of performance on SMP. Coherent view of physical memory is maintained using cache snooping. Cache snooping is done at the CPU bus level and so operates at lower speeds than the core. Coherency is “invisible” to software.
29 Performance Implications. Snoop traffic expected on SMP. Cache hits generally cause no bus transaction. Multiple processors writing to the same location degrades performance (ping-pong effect). Performance degrades when a large amount of data is modified on one processor and read on the other. Sometimes it is better to have specific threads in a process run on the same CPU.
30 Designing for SMP: One Big Task. Will not work with SMP. [Diagram labels: Single thread, Giant App.]
31 Designing for SMP: Single Threaded Tasks. [Diagram labels: App 1, App 2.] Works with SMP. Process data can be shared with shared memory. Good concurrency, some complexity. IPC not usually as efficient as memory sharing.
32 Designing for SMP: Scaling Software with Threads. Single copy server. All process data is implicitly shared and accessible. Can achieve good concurrency with less complexity. POSIX synchronization used: mutexes; semaphores; condition variables. Usually more efficient than inter-process synchronization. [Diagram labels: Threads, Server.] Note: SMP finds concurrency problems fast!
33 Optimizing Compute-intensive Applications. Pool of worker threads. Dispatch “work” to worker threads. Scales very well with SMP. The tricky part is “breaking up” the problem. [Diagram labels: Worker thread, Worker thread, Main thread, Threads, Application.]
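The worker-pool pattern on this slide can be sketched as follows. This is an illustrative Python example, not code from the presentation: the sum-of-squares workload and the helper names `split` and `parallel_sum_of_squares` are assumptions. It shows only the structure (main thread splits the problem, workers compute partial results, main thread combines them); in CPython, threads do not run CPU-bound code in parallel, so this does not demonstrate SMP speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Break `data` into at most `parts` contiguous chunks."""
    size = max(1, (len(data) + parts - 1) // parts)
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_sum_of_squares(data, workers=4):
    """Main thread dispatches one chunk per task to a pool of worker threads."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker computes a partial result on its own chunk, so the
        # workers share no mutable state and need no locking.
        partials = pool.map(lambda chunk: sum(x * x for x in chunk), chunks)
        return sum(partials)
```

The "tricky part" named on the slide corresponds to `split`: this workload divides cleanly because each chunk is independent, which is not true of every problem.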
34 Interrupt Handling. [Diagram labels: IST, ISR.] Interrupt processed on CPU that was targeted. Can distribute load by handling interrupts on different processors. Sometimes not the optimal strategy due to cache effects. [Diagram labels: CPU 0, CPU 1, IRQ, CPU1, IRQ 7, IRQ 10, IRQ 8, IRQ 9.]
36 Example Chassis. [Diagram labels: Line card, Network, High-speed interconnect, Low-speed bus, ...]
37 The QNET MicroNetwork. [Diagram labels: Microkernel, Flash Fsys, CDROM, TCP/IP, Process Manager, QNET, Audio, Photon, App; LAN or Internet or Backplane.] Messages flow transparently through QNET from one message bus to another. All applications and servers become network distributed without any special code.
38 QNX Qnet Manager. Extends message passing across multiple QNX microkernels. Over anything with a packet driver: Ethernet, RapidIO, 3GIO, InfiniBand, Stargen, etc. Class of service. Use symbolic prefixes to make client code independent of location of resource manager. [Diagram labels: Linecard, Control.]
39 QNET Class of Service. [Diagram labels: Control card, Linecard, Linecard.] One or multiple links can connect different nodes.
40 QNET: Load-Balanced Distribution. [Diagram labels: Linecard, Control.] Data is sent out the link which will deliver it the fastest. This is based upon link speed and queue length for each link.
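One way to read "link speed and queue length" is as an estimated delivery time per link. The Python sketch below makes that assumption explicit; the slide does not give Qnet's actual formula, so the `eta` calculation, the dictionary keys, and the function name `pick_link` are all hypothetical.

```python
def pick_link(links, packet_bytes):
    """Return the name of the link with the lowest estimated delivery time.

    Each link is a dict with hypothetical keys:
      name, bits_per_sec (link speed), queued_bytes (data already waiting).
    """
    def eta(link):
        # Time to drain the existing queue plus this packet, in seconds.
        return (link["queued_bytes"] + packet_bytes) * 8 / link["bits_per_sec"]

    return min(links, key=eta)["name"]
```

Under this model a fast link with a deep queue can lose to a slower idle link, which is the behaviour the slide describes.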
41 QNET: Ordered Distribution. [Diagram labels: Linecard, Control.] Data is sent out a primary link. If it fails, data is diverted to a secondary link. The primary link is probed and when it comes back online, data is diverted back to it.
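The failover and failback behaviour described here amounts to a two-state selector. This is a minimal Python sketch with invented names (`OrderedLinks`, `probe`, `route`); it models only the decision logic, not how Qnet actually probes a link.

```python
class OrderedLinks:
    """Prefer the primary link; use the secondary only while the primary is down."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        self.primary_up = True

    def probe(self, primary_alive):
        # Record the result of the latest (hypothetical) probe of the primary link.
        self.primary_up = primary_alive

    def route(self):
        # Data returns to the primary as soon as a probe reports it healthy.
        return self.primary if self.primary_up else self.secondary
```

A caller would invoke `probe` periodically and `route` for each outgoing message.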
42 QNET: Parallel Distribution. [Diagram labels: Linecard, Control.] Data is sent out both links at the same time. A failure on either of the links is handled gracefully.
43 Designing for Networked SMP: Single/Multi Threaded Tasks. [Diagram labels: Multiple threads, Single thread, App 1, App 2.] Different processes necessary for different nodes. Works with SMP. Process data can be shared with shared memory. IPC for networked communication.
44 Transparent Redirection. [Diagram labels: Client Node, /net/a/dev/service, A, Client, /service, B, /net/b/dev/service.] Simple link provides transparent redirection. Process has to monitor status of link. Switch over is not transparent to client.
45 Transparent Redirection. [Diagram labels: Client Node, /net/a/dev/service, A, Client, Service mgr, /dev/service, B, /net/b/dev/service.] Service manager acts as a proxy. Monitors health of and/or load on services/nodes. Switch over is transparent to client.
47 MOST BUS. [Diagram labels: Qnet, FLASH FSYS, TCP/IP, App, Bluetooth, Graphics, Browser, Audio, Photon, CDROM.]
48 MOST BUS. [Diagram labels: Qnet, FLASH FSYS, TCP/IP, App, Bluetooth, Photon, Graphics, CDROM FSYS, Browser, Audio.]
49 Reliability and Availability
50 Why? Embedded systems are different! Failure in an embedded system can have severe effects - like death … “Pilots really hate to be told they have to reboot their plane while in flight” Walter Shawlee
51 Definitions. MTBF: Mean Time Between Failure. The average number of hours between failures for a large number of components over a long time (e.g. MIL-HDBK-217). MTTR: Mean Time To Repair. Total amount of time spent performing all corrective maintenance repairs divided by the number of repairs. MTBI: Mean Time Between Interruptions. The average number of hours between failures while a redundant component is down.
52 Defining HA. Reliability: quantified by failure rate (MTBF); time to resume service after failure is MTTR. Availability: allows for failure, with quick service restoration; as MTTR → 0, Availability → 100%. 5 Nines: < 5 minutes downtime / year (> 99.999% uptime); assume faults exist: design to contain, notify, recover and restore rapidly.
53 Costs speak for themselves. Annual Cost of Downtime versus Availability. Source: Gartner Group ($13,000/minute cross-industry average).
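The relationship between MTBF, MTTR, availability, and downtime cost can be worked through numerically. The Python sketch below uses the standard steady-state formula, availability = MTBF / (MTBF + MTTR), which the slides imply but do not write out; the function names and the 365-day year are assumptions, and the cost rate is the $13,000/minute figure cited from Gartner.

```python
MINUTES_PER_YEAR = 365 * 24 * 60       # 525,600; assumes a 365-day year
COST_PER_MINUTE = 13_000               # cross-industry average cited on the slide


def availability(mtbf_hours, mttr_hours):
    """Fraction of time the system is up, given mean time between
    failures and mean time to repair (same units for both)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes_per_year(avail):
    """Expected minutes of downtime per year at a given availability."""
    return (1.0 - avail) * MINUTES_PER_YEAR


def annual_downtime_cost(avail, cost_per_minute=COST_PER_MINUTE):
    """Expected yearly cost of downtime in dollars."""
    return downtime_minutes_per_year(avail) * cost_per_minute
```

At 99.999% availability this gives roughly 5.26 minutes of downtime per year, or about $68,000 at the cited rate; at 99.9% it gives about 526 minutes per year. As MTTR approaches zero, `availability` approaches 1.0 regardless of MTBF.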
54 Availability via Reliability and Repair. Low MTTR -> high availability. System is composed of reliable components that are protected from each other and that communicate ONLY through well known interfaces. This leads to: fault isolation; speedy recovery (reset a component, not a board/system); dynamic control (stop/start, upgrade).
55 Software vs Hardware. HA utilizes redundancy of key components: a single fault cannot cause all redundant components to fail (no SPOF), e.g. mirrored disks, multiple system boards, I/O cards; active/active, active/spare, active/standby. But that’s only part of the problem!!! Software is a significant cause of downtime.
57 High Level Look at a Core Router/Switch. [Chassis diagram labels: slots 1 to 20 holding OCLD (1W to 4W), OCI (1A to 4B), OCM (A), OCM (B), OCLD (4E to 1E), Shelf Processor, and Filler cards; Maintenance Panel; Fiber Management Trough; Optical Multiplexer Tray (OMX); Cooling Unit.] One or more control elements.
58 Handling Failures. [Same chassis diagram as slide 57.] Isolate fault to a board. Switch to backup.
59 May not be in the Hardware. [Same chassis diagram as slide 57.] Isolate fault to a SW component. [Software stack diagram labels: Route Manager, TCP/IP stack, SNMP Manager, Application, Flash Drivers, Device Manager, Network Manager, RTOS, Hardware.]
60 Ideal: Identify and Fix Faulty Software Component. Isolate and contain. Repair (e.g. restart). Notify. Diagnose. Upgrade. [Diagram labels: Application, Network Manager, SNMP Manager, Device Manager, Route Manager, TCP/IP stack, Flash Drivers, RTOS.]
61 Component-level recovery rarely done. Lack of suitable protection and isolation. Lack of modularity. Tight component coupling. Few dynamic capabilities. Software failures normally handled by: hardware watchdogs; redundant boards.
62 Repair Time. Board replacement: hours. Reboot: minutes. Failover to standby: seconds. SW component restart: 10’s of milliseconds. SW failover: milliseconds.
63 High Availability Manager. Process memory violation. Kernel notifies HA Manager. Dump file for post-mortem analysis. HA Manager restarts service. [Diagram labels: DISK FSYS, FLASH FSYS, TCP/IP, Microkernel, HA Manager, ATM.]
64 Notification and Recovery. HA Manager (HAM) monitors components, sends notification of component failure. Heart-beat services detect component hangs. Core file on crash can be created for debugging and analysis. Checkpointing permits recovering current state. [Diagram labels: HAM, HAM Guardian, Stack, Driver, App, Checkpointed State.]
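Heartbeat-based hang detection can be sketched as a table of last-seen times checked against a timeout. This Python example is illustrative only and does not use the real HAM API: the class name `HeartbeatMonitor`, its methods, and the explicit `now` argument (passed in so the logic stays deterministic) are all assumptions.

```python
class HeartbeatMonitor:
    """Flag components that have not reported within `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_beat = {}   # component name -> time of most recent heartbeat

    def beat(self, component, now):
        # Called by (or on behalf of) a component to show it is still making progress.
        self.last_beat[component] = now

    def hung(self, now):
        # A crashed process is reported by the kernel; a hung one stays alive
        # but goes silent, so silence past the timeout is the only evidence.
        return sorted(name for name, seen in self.last_beat.items()
                      if now - seen > self.timeout)
```

A monitor loop would call `hung` periodically and trigger notification or a restart for each name returned.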
65 Recovery. A second “shadow” server attaches to the same name.
66 Recovery. A second “shadow” server attaches to the same name. If primary faults, new clients connect to shadow server. Old clients can re-connect to shadow server.
68 Service Upgrades. [Diagram labels: Client, Server v 1.0, /dev/service, New Client, Server v 1.1.] New version of server attaches to same name. New clients connect to new server. Old server exits when all old clients have exited.
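The upgrade sequence on this slide can be modelled with a small name registry. This is a hypothetical Python sketch, not the QNX pathname-space API: `NameRegistry` and its methods are invented, and "most recently attached server wins" is an assumption about how new clients are resolved.

```python
class NameRegistry:
    """Map a path such as /dev/service to the servers attached to it."""

    def __init__(self):
        self.servers = {}   # path -> list of server records, newest last

    def attach(self, path, version):
        server = {"version": version, "clients": set()}
        self.servers.setdefault(path, []).append(server)
        return server

    def connect(self, path, client):
        # New clients always reach the most recently attached server.
        server = self.servers[path][-1]
        server["clients"].add(client)
        return server["version"]

    def disconnect(self, path, client):
        for server in self.servers[path]:
            server["clients"].discard(client)
        # An older server exits once its last client has gone;
        # the newest server stays attached even with no clients.
        newest = self.servers[path][-1]
        self.servers[path] = [s for s in self.servers[path]
                              if s["clients"] or s is newest]
```

Existing clients keep talking to v 1.0 until they disconnect, so the upgrade needs no service interruption.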
70 Design Goals. Tools needed to be easy to learn. Tools which could take advantage of QNX. Tools which could integrate tools from other vendors, company designed tools, and industry specific tools and have them work with our tools and each other. Tools needed to be customizable to the user or the company.
71 The Best Tools and the Best RTOS. [Diagram labels: QNX® Momentics; QNX® Neutrino® RTOS; IDE Workbench (Eclipse framework); C/C++ code developer; Java code developer; Source debugger; Profiler; Memory analysis; Target information; System builder; Photon app builder; Command-line tools; Invoke command-line tools; 3rd-Party Tools; Rational; Virtio; … TBA; BSPs; DDKs; Ethernet, Serial, JTAG, ROMulator; Target agent; Neutrino runtime; Microkernel; Flash fsys; TCP/IP; Photon microGUI; Java; HTTP server; Windows, Solaris, QNX Neutrino; XScale.]
72 QNX IDE: Standards based. IBM donated framework. Java IDE. 200 person-years of effort. Open source. Consortium founding members include