Windows CE Real-Time Performance Architecture John Hatch Program Manager for CE Kernel Microsoft Corporation.

1 Windows CE Real-Time Performance Architecture John Hatch Program Manager for CE Kernel Microsoft Corporation

2 Agenda Real-Time Overview Interrupt Model Features Taking Control Measurement Tools

3 Agenda Real-Time Overview Interrupt Model Features Taking Control Measurement Tools

4 Real-Time Overview. Real time: applications where specific timings are requested. Hard real time: applications where the system fails if timings are not met. Soft real time: applications where the system tolerates large latencies. Actual timing requirements are system-specific.

5 Real Time Defined By OMAC. [Chart: cycle variation or jitter (µs) plotted against cycle time, with hard real-time and soft real-time regions; labels include Windows CE 2.X, Windows NT, and "90% Apps".] OMAC represents the industrial automation community.

6 Real World Example. Consumers wanted to know whether CE is HARD real-time: they wanted to know whether CE was capable of running the radio and the UI, and were concerned that CE was not HARD real-time enough to meet the requirements. Requirements: run a cellular radio DSP; meet tight timing requirements; ARM9 at 250 MHz; full Windows CE UI; and play video.

7 Real World Timing Requirements. So what were the actual requirements? An interrupt every 4.6 ms, with allowable jitter < 0.5 ms. [Figure: Actual Application Requirements, showing an interrupt every 4.6 ms with 0.5 ms jitter.]

8 Windows CE Test Results. Response time test using the following configuration: Samsung SMDK2410 development board; 200 MHz ARM with 16x16 cache; Windows CE 5.0 with full UI; running a WMV video.
Windows CE Real-Time Test Results (time in microseconds, µs):
          ISR starts   IST starts
Minimum   1.2 µs       31.7 µs
Average   3.3 µs       67.2 µs
Maximum   13.3 µs      103.0 µs

9 What We Learned. In terms of the 0.5 ms jitter alone: CE's longest ISR response time was 13.3 µs, about 2.7% of the maximum allowed; CE's longest IST response time was 103 µs, 20.6% of the maximum allowed. Conclusion: CE's response time was well within the requirements. The project went ahead and is progressing well.
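The percentages on this slide follow directly from the measured maxima and the 0.5 ms jitter allowance. A minimal sketch of that arithmetic, using only the figures quoted in the test results (the function and variable names are illustrative, not from the presentation):

```python
# Figures quoted on the test-results slide, in microseconds.
JITTER_BUDGET_US = 500.0   # 0.5 ms allowable jitter
MAX_ISR_US = 13.3          # longest observed ISR response time
MAX_IST_US = 103.0         # longest observed IST response time


def share_of_budget(latency_us, budget_us=JITTER_BUDGET_US):
    """Return a latency as a percentage of the allowable jitter."""
    return 100.0 * latency_us / budget_us


isr_share = share_of_budget(MAX_ISR_US)
ist_share = share_of_budget(MAX_IST_US)
print(f"ISR: {isr_share:.2f}% of budget, IST: {ist_share:.2f}% of budget")
```

Both shares stay far below 100%, which is the basis for the slide's conclusion.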

10 Agenda Real-Time Overview Interrupt Model Features Taking Control Measurement Tools

11 Definitions. Interrupt: hardware signal indicating an event has happened and needs to be serviced. Latency: the time from when the interrupt occurred to when the event is serviced. Jitter: range of allowable variation in service time.

12 Threads, Processes, And Drivers. Thread: a unit of execution; a piece of code that can be scheduled to run by the kernel; may be launched by a process or a driver. Process: a collection of threads with a common execution environment; a process has at least one thread; launched from an executable file; can create threads to handle interrupts. Driver: a DLL (dynamic-link library) loaded into the device manager process; supports the Device I/O Control Interface; can create threads to handle interrupts.

13 ISRs And ISTs. Interrupt Service Routine (ISR): a piece of code loaded into the kernel; assigned to a particular IRQ; called immediately to handle the hardware interrupt; should be written to run quickly with few outside dependencies; can be chained together if multiple devices might use the same IRQ; notifies the kernel which IST should run. Interrupt Service Thread (IST): a thread registered to handle an interrupt; can be created by either a process or a driver; scheduled like any other thread on the system; should be written to do the bulk of the interrupt handling work.

14 ISRs And ISTs Work Together. ISRs and ISTs usually work as pairs: the ISR handles the critical work and the IST handles the bulk of the work. They synchronize by using an Event Object. The IST creates an Event Object and uses the API WaitForSingleObject to sit and wait for that object to be signaled. The ISR tells the kernel which object to signal, which unblocks the IST and makes it runnable. If the IST is the highest-priority runnable thread, it will get scheduled to run immediately.
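The ISR/IST handshake can be pictured with an ordinary event object. The sketch below is only an analogy written with Python's standard threading module; it is not Windows CE code, and the function names are hypothetical. `Event.wait()` stands in for WaitForSingleObject, and `Event.set()` stands in for the kernel signaling the event on the ISR's behalf.

```python
import threading

interrupt_event = threading.Event()  # stands in for the Event Object the IST creates
handled = []                         # records the order in which work is done


def ist():
    """Interrupt service thread: block until signaled, then do the bulk of the work."""
    interrupt_event.wait()           # analogous to WaitForSingleObject on the event
    handled.append("bulk work done in IST")


def isr():
    """Interrupt service routine: do the minimal critical work, then signal the event."""
    handled.append("critical work done in ISR")
    interrupt_event.set()            # in CE, the kernel signals the event for the ISR


ist_thread = threading.Thread(target=ist)
ist_thread.start()                   # the IST is now blocked, waiting for an interrupt
isr()                                # simulate one hardware interrupt
ist_thread.join(timeout=5)
print(handled)
```

The IST cannot record its work until the event is set, so the ISR's entry always comes first.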

15 Priority Levels. Windows CE 5.0 has 256 levels of priority. Level 0 is the highest and 255 is the lowest. The old CE model of 8 levels now maps to the lowest 8 of the new model. The default level for a thread is 252. Levels 0 through 247 can be reserved by the OEM.
Levels            Description
0 through 96      Real-time above drivers
97 through 152    Default used by CE device drivers
153 through 247   Real-time below drivers
248 through 255   Non-real-time priorities
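The priority table can be expressed as a small lookup. This is a sketch of the bands as listed on the slide; the function names are hypothetical, and the in-order mapping of the old 8 levels onto 248 through 255 is an assumption (the slide only says they map to the lowest 8 levels).

```python
# Upper bound of each band, taken from the slide's table (0 is the highest priority).
PRIORITY_BANDS = [
    (96, "Real-time above drivers"),
    (152, "Default used by CE device drivers"),
    (247, "Real-time below drivers"),
    (255, "Non-real-time priorities"),
]


def priority_band(level):
    """Return the description of the band that a 0..255 priority level falls in."""
    if not 0 <= level <= 255:
        raise ValueError("priority level must be in the range 0..255")
    for upper_bound, description in PRIORITY_BANDS:
        if level <= upper_bound:
            return description


def legacy_to_new_level(old_level):
    """Map an old 8-level priority (0..7) onto 248..255, assuming the order is kept."""
    if not 0 <= old_level <= 7:
        raise ValueError("old priority level must be in the range 0..7")
    return 248 + old_level


print(priority_band(100), "|", priority_band(legacy_to_new_level(3)))
```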

16 Scheduler. Is responsible for determining which thread will run. Has a queue of threads for each priority level. Will always schedule the first thread at the highest priority level. A thread gets to run for a set length of time, called a quantum, typically 100 milliseconds. A quantum of 0 means the quantum never runs out: the thread can run until blocked or interrupted. A thread runs until its quantum runs out, it is interrupted by a higher-priority thread, or it is blocked by resource contention, such as access to a critical section or a mutex.
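The selection rule on this slide (one queue per priority level, always run the first thread at the highest level) can be modeled in a few lines. This is a toy model, not the CE kernel's implementation; the class and method names are hypothetical, and rotating the queue when a quantum expires is an assumption about how threads of equal priority take turns.

```python
from collections import deque


class Scheduler:
    """Toy model: one FIFO queue of runnable threads per priority level."""

    def __init__(self):
        self.queues = {}  # priority level -> deque of thread names

    def make_runnable(self, name, priority):
        """Append a thread to the tail of the queue for its priority level."""
        self.queues.setdefault(priority, deque()).append(name)

    def pick_next(self):
        """Return (name, priority) of the first thread at the highest level, or None."""
        for level in sorted(self.queues):      # lower number means higher priority
            if self.queues[level]:
                return self.queues[level][0], level
        return None

    def quantum_expired(self, priority):
        """Move the running thread at this level to the tail of its queue."""
        self.queues[priority].rotate(-1)


sched = Scheduler()
sched.make_runnable("ui_thread", 251)
sched.make_runnable("ist_a", 100)
sched.make_runnable("ist_b", 100)
first = sched.pick_next()
sched.quantum_expired(100)
second = sched.pick_next()
print(first, second)
```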

17 Fitting It All Together. Interrupt occurs. Interrupt handler calls the registered ISR. ISR runs and tells the kernel which event to signal. Kernel signals the event and the IST becomes runnable. Scheduler runs the IST. IST runs and resets the interrupt.

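18 Interrupt Architecture. [Diagram: interrupt handling timeline across the HW, Kernel, OAL, and Thread layers. Labels: ISH, ISR, KCall + Scheduler (SetEvent), IST, ISR Latency, IST Latency, and the interrupt-enable state at each stage (All; Higher-Priority Int. Enabled; All Except ID; All).]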

19 Latency Behavior. [Diagram: Maximum ISR Latency Path and Maximum IST Latency Path.]

20 Where Latency Occurs. For an ISR: time required for the kernel to vector to the ISR handler, saving registers, etc. (normal); the amount of time that interrupts are turned off (variation). For an IST: time to schedule a thread (normal); time spent in a KCall (variation). KCall = kernel code executing with preemption disabled.

21 Worst Case IST Latency. General case: the system is in the thread scheduler KCall and takes an IRQ that will trigger a different IST. Software-assisted TLB/cache miss on the IST thread.

22 Improvements To Latency. Non-preemptable code reduced: large KCalls split apart, with state saved to resume correctly; this reduces the latency for an IST. Kernel data structures moved to statically mapped virtual addresses: this avoids any TLB misses associated with accessing the kernel's data. Special-cased ISTs: an event registered for an IST can only be used in a WaitForSingleObject call. New priority inversion model reduces the upper bound; it was previously a large KCall.

23 Agenda Real-Time Overview Interrupt Model Features Taking Control Measurement Tools

24 Nested Interrupts. Higher-priority ISRs can preempt lower-priority ISRs, based on support by the CPU, additional hardware, and/or OEM code. ARM: uses a vectored interrupt table; single CPU interrupt level with an interrupt register; no built-in concept of IRQ priority, except FIQ. Interrupts are not turned on before entering the ISR; the OEM can re-enable the CPU interrupt. OEMs can prioritize the interrupts with bit masks to turn the different interrupts on and off.

25 Shared Interrupts. The hardware design might attach several devices to the same interrupt line. Multiple ISRs can be chained together to handle shared interrupts. Each ISR in turn determines whether it can handle the interrupt. If it can, it does its work and either completes the interrupt or returns the SYSINTR value indicating which IST is to run. If not, it returns SYSINTR_CHAIN, indicating that the kernel should try the next ISR in the chain.
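The chaining rule can be sketched as a loop over the ISRs registered for one IRQ. This is a model of the dispatch logic described on the slide, not kernel code: the string used for SYSINTR_CHAIN, the device names, the SYSINTR identifiers, and the `None` result when no ISR claims the interrupt are all placeholders chosen for illustration.

```python
SYSINTR_CHAIN = "SYSINTR_CHAIN"  # placeholder; the real constant is a numeric value


def make_isr(device, sysintr, pending):
    """Build a chained ISR that claims the interrupt only if its device raised it."""
    def isr():
        if pending.get(device):
            pending[device] = False   # the device's interrupt has been claimed
            return sysintr            # tells the kernel which IST to run
        return SYSINTR_CHAIN          # not ours: let the next ISR in the chain look
    return isr


def dispatch(chain):
    """Call each ISR in turn until one returns something other than SYSINTR_CHAIN."""
    for isr in chain:
        result = isr()
        if result != SYSINTR_CHAIN:
            return result
    return None                       # model choice: no ISR claimed the interrupt


# Two hypothetical devices sharing one interrupt line; only the second is pending.
pending = {"uart": False, "nic": True}
chain = [make_isr("uart", "SYSINTR_UART", pending),
         make_isr("nic", "SYSINTR_NIC", pending)]
claimed = dispatch(chain)
unclaimed = dispatch(chain)
print(claimed, unclaimed)
```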

26 Priority Inheritance. Higher-priority threads can get stuck waiting for a lower-priority thread to release a resource, such as a critical section, semaphore, or mutex. This causes priority inversion. The kernel detects priority inversion and handles it with priority inheritance, or boosting: the lower-priority thread inherits the higher-priority thread's priority, and its quantum is set to 0, which lets it run to completion. Supports only one level of inheritance: the kernel will only boost one thread. If the boosted thread is in turn blocked by a third thread, the third thread is not boosted.
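The one-level boosting rule can be modeled as follows. This is a sketch of the behavior the slide describes, not the kernel's algorithm; the class, the function, and the priority values are hypothetical.

```python
class Thread:
    """Toy thread record; a lower priority number means a higher priority."""

    def __init__(self, name, priority, quantum_ms=100):
        self.name = name
        self.priority = priority
        self.quantum_ms = quantum_ms
        self.blocked_on = None        # the thread that owns the resource we wait for


def block_on(waiter, owner):
    """Waiter blocks on a resource held by owner; boost the owner on inversion."""
    waiter.blocked_on = owner
    if waiter.priority < owner.priority:
        owner.priority = waiter.priority  # owner inherits the waiter's priority
        owner.quantum_ms = 0              # quantum 0: run to completion
    # One level only: whatever owner.blocked_on points at is left untouched.


low = Thread("low", 250)
mid = Thread("mid", 150)
high = Thread("high", 50)

block_on(mid, low)    # low is boosted to mid's priority
block_on(high, mid)   # mid is boosted to high's priority; low is not boosted again
print(mid.priority, mid.quantum_ms, low.priority)
```

In this model the third thread keeps the priority it already had, so the high-priority thread can still wait behind work running at a lower level, which is the limitation the slide points out.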

27 Thread Quantum. Per-thread quantum. Default set by the OEM in the OAL: dwDefaultThreadQuantum. APIs to set and get the quantum: CeSetThreadQuantum and CeGetThreadQuantum. A quantum of 0 sets the thread to run-to-completion at any priority; it is preempted only by higher-priority threads.

28 System Tick. 1 ms timer tick in normal mode. The tick interrupt causes a reschedule, which will run the next highest-priority runnable thread. Sleep(N) will generally wake up in N to N + 1 ms. In idle mode, the system tick is reset to the next scheduled event. On each system tick: check for a reschedule, or nop.
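The "N to N + 1 ms" window follows from the scheduler only acting on tick boundaries. A minimal model of that, assuming the thread is woken on the first 1 ms tick at or after N ms have elapsed (this wake-up rule is an assumption made for illustration, not a statement of the kernel's exact behavior):

```python
import math

TICK_MS = 1.0  # system tick period in normal mode, from the slide


def wake_time_ms(call_time_ms, sleep_ms, tick_ms=TICK_MS):
    """Return the time of the first tick at or after call_time_ms + sleep_ms."""
    return math.ceil((call_time_ms + sleep_ms) / tick_ms) * tick_ms


def elapsed_ms(call_time_ms, sleep_ms):
    """Return how long a Sleep(sleep_ms) issued at call_time_ms actually lasts."""
    return wake_time_ms(call_time_ms, sleep_ms) - call_time_ms


# Sleep(5) issued exactly on a tick, and at two points between ticks.
samples = [elapsed_ms(t, 5) for t in (10.0, 10.25, 10.75)]
print(samples)
```

Every sample falls between N and N + 1 ms; a call made just after a tick waits the longest.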

29 Full Kernel Mode. All threads are running in kernel mode. Security checks are disabled; no need to call SetKMode. Entire system is open to all processes: all statically mapped virtual addresses. Virtual protection is still in place. Optimizations for high-traffic functions, for example a router network box.

30 Agenda Real-Time Overview Interrupt Model Features Taking Control Measurement Tools

31 Taking Control. Real-time developers want to retain control at all times, including control of the schedule. Control is managed by understanding the hardware and the OS. Writing code to make optimal use of the features of both is key to real-time performance.

32 Understanding The Hardware. Accessing hardware can delay ISRs and ISTs. The same CPUs on different boards can produce a wide range of results. Devices and associated drivers can produce a wide range of delays.

33 Understand The Hardware. Understand device access: I/O-based access may incur a penalty. Certain devices can lock out a bus for many microseconds. For example, on x86 avoid access to the CMOS RTC; use a software RTC.

34 Understanding The OS. Priority-based preemptive thread scheduler. Virtual memory system: provides protection; there is some overhead. Synchronization objects (critical sections, mutexes, semaphores, MSQueues) can cause your thread to block. System call interactions: demand paging of non-XIP code and stack memory reclaiming can delay thread execution. Going idle can delay threads.

35 Gaining Control. Separate user interface operations from real-time threads: keeping UI calls out of the real-time threads prevents them from being blocked by the UI. The user interface involves many interactions across the OS, and it can block threads. Performance of UI threads is affected by all UI applications. Use shared buffers or MSQueues to communicate between UI and RT threads.

36 Gaining Control. Memory and objects: preallocate all memory; preallocate all threads and sync objects. Thread scheduling: set the appropriate priority; set the appropriate quantum; use a quantum of 0 to run to completion. Use DisableThreadLibraryCalls to prevent thread notifications to DLLs.

37 Gaining Control. Avoid making system calls on your real-time thread. Don't use SetTimer as a real-time timer. Avoid priority inversion conditions: use event tracking/Kernel Tracker. Use dwNKMaxPrioNoScav to prevent stack space recovery from real-time threads. The Trusted security model and real-time performance do not mix: security checks slow down untrusted applications, so launch RT threads from a Trusted process or driver.

38 Gaining Control. Disable idle processing: when the OS calls OEMIdle, return immediately instead of sleeping the device. Disable demand paging: LoadDriver locks in a single DLL; setting ROMFLAGS to 0x0001 in CONFIG.BIB locks in all modules; a file system block driver can disallow paging by not setting the DISK_INFO_FLAG_PAGEABLE flag.

39 Agenda Real-Time Overview Interrupt Model Features Taking Control Measurement Tools

40 ILTiming. Software-based real-time measurement tool. Measures both ISR and IST latencies. ISR latency: from the IRQ to the ISR. IST latency: from the end of the ISR to the start of the IST. Enabled for all sample platforms. Varying system loads.

41 OSBench. Scheduler performance-timing tests. Enables you to determine how long it takes to perform basic kernel tasks such as: acquire or release a critical section; wait on or signal an event; create a semaphore or mutex; yield a thread; call system APIs.

42 Kernel Tracker. Shows interaction between processes, threads, and interrupts. Tracks interrupts, TLB misses, priority inversion, and thread state such as running, blocked, sleeping, and migrating.

43 Summary. Windows CE is real-time. Windows CE provides all the functionality needed to qualify as a real-time operating system. Windows CE provides tools to optimize your real-time platform.

44 Real World ZMP Nuvo Robot

45 Real World KUKA Roboter Launching CeWin to help customers build blended real-time solutions based on Windows XP using Windows CE as the real-time scheduler

46 © 2006 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION. John.Hatch @

