The Problems You’re Having May Not Be the Problems You Think You’re Having: Results from a Latency Study of Windows NT John Regehr — University of Virginia.

Slides:



Advertisements
Similar presentations
Vassal: Loadable Scheduler Support for Multi-Policy Scheduling George M. Candea, Oracle Corporation Michael B. Jones, Microsoft Research.
Advertisements

Provide data pathways that connect various system components.
Tutorial 3 - Linux Interrupt Handling -
Operating Systems Process Scheduling (Ch 3.2, )
1 Augmented CPU Reservations John Regehr John A. Stankovic University of Virginia May 31, 2001.
IT Systems Multiprocessor System EN230-1 Justin Champion C208 –
 A quantum is the amount of time a thread gets to run before Windows checks.  Length: Windows 2000 / XP: 2 clock intervals Windows Server systems: 12.
CPU Reservations and Time Constraints: Implementation Experience on Windows NT John Regehr – University of Virginia Michael B. Jones – Microsoft Research.
Operating System Process Scheduling (Ch 4.2, )
Page 1 Building Reliable Component-based Systems Chapter 15 - Specification of Software Components Chapter 15 Specification of Software Components.
Nooks: an architecture for safe device drivers Mike Swift, The Wild and Crazy Guy, Hank Levy and Susan Eggers.
1 Predictable Scheduling for a Soft Modem Michael B. Jones – Microsoft Research Stefan Saroiu – University of Washington.
Home: Phones OFF Please Unix Kernel Parminder Singh Kang Home:
Chapter 13 Embedded Systems
An Overview of Systems and Networking Research at Microsoft Research Michael B. Jones Systems and Networking Research Group, Microsoft Research April 1999.
1 Process Description and Control Chapter 3 = Why process? = What is a process? = How to represent processes? = How to control processes?
G Robert Grimm New York University Receiver Livelock.
Operating System Process Scheduling (Ch 4.2, )
CSE 451: Operating Systems Winter 2010 Module 15 I/O Mark Zbikowski Gary Kimura.
I/O Tanenbaum, ch. 5 p. 329 – 427 Silberschatz, ch. 13 p
Performance Evaluation of Real-Time Operating Systems
Dreams in a Nutshell Steven Sommer Microsoft Research Institute Department of Computing Macquarie University.
0 Deterministic Replay for Real- time Software Systems Alice Lee Safety, Reliability & Quality Assurance Office JSC, NASA Yann-Hang.
Chapter 8 Input/Output. Busses l Group of electrical conductors suitable for carrying computer signals from one location to another l Each conductor in.
Introduction to Embedded Systems
3/11/2002CSE Input/Output Input/Output Control Datapath Memory Processor Input Output Memory Input Output Network Control Datapath Processor.
Uniprocessor Scheduling
1 CS503: Operating Systems Spring 2014 Dongyan Xu Department of Computer Science Purdue University.
Windows NT and Real-Time? Reading: “Inside Microsoft Windows 2000”, (Solomon, Russinovich, Microsoft Programming Series) “Real-Time Systems and Microsoft.
Predictable Scheduling for a Soft Modem Stefan Saroiu – University of Washington Michael.
Gregory PawloskiAugust 22, 2002 MPC Testing Progress.
A Comparative Study of the Linux and Windows Device Driver Architectures with a focus on IEEE1394 (high speed serial bus) drivers Melekam Tsegaye
CS 346 – Chapter 1 Operating system – definition Responsibilities What we find in computer systems Review of –Instruction execution –Compile – link – load.
1 Zhao Xia Chapter 2 Process , thread, and scheduling Chapter 2 Process , thread, and scheduling —— kernel services.
Operating Systems Lecture 7 OS Potpourri Adapted from Operating Systems Lecture Notes, Copyright 1997 Martin C. Rinard. Zhiqing Liu School of Software.
NT Kernel CS Spring Overview Interrupts and Exceptions: Trap Handler Interrupt Request Levels and IRT DPC’s, and APC’s System Service Dispatching.
Scheduling Lecture 6. What is Scheduling? An O/S often has many pending tasks. –Threads, async callbacks, device input. The order may matter. –Policy,
Lecture 7: Scheduling preemptive/non-preemptive scheduler CPU bursts
WHDC PowerPoint Template Notes & Handouts
ECE 720T5 Fall 2012 Cyber-Physical Systems Rodolfo Pellizzoni.
We will focus on operating system concepts What does it do? How is it implemented? Apply to Windows, Linux, Unix, Solaris, Mac OS X. Will discuss differences.
Operating Systems 1 K. Salah Module 1.2: Fundamental Concepts Interrupts System Calls.
LRPC Firefly RPC, Lightweight RPC, Winsock Direct and VIA.
Preemptive Context Switching
1/9/ :46 1 Priority Model Real-time class Idle Above Normal Normal Below Normal Lowest Highest 31 Time-critical Dynamic classes.
Real Time System with MVL. © 2007 MontaVista Confidential | Overview of MontaVista Agenda Why Linux Real Time? Linux Kernel Scheduler Linux RT technology.
Windows Operating System Internals - by David A. Solomon and Mark E. Russinovich with Andreas Polze Unit OS3: Concurrency 3.3. Advanced Windows Synchronization.
Interrupt driven I/O Computer Organization and Assembly Language: Module 12.
OSes: 2. Structs 1 Operating Systems v Objective –to give a (selective) overview of computer system architectures Certificate Program in Software Development.
Lecture 12 Page 1 CS 111 Online Using Devices and Their Drivers Practical use issues Achieving good performance in driver use.
System Programming Basics Cha#2 H.M.Bilal. Operating Systems An operating system is the software on a computer that manages the way different programs.
Low Overhead Real-Time Computing General Purpose OS’s can be highly unpredictable Linux response times seen in the 100’s of milliseconds Work around this.
Interrupts and Exception Handling. Execution We are quite aware of the Fetch, Execute process of the control unit of the CPU –Fetch and instruction as.
Interrupts and Interrupt Handling David Ferry, Chris Gill CSE 522S - Advanced Operating Systems Washington University in St. Louis St. Louis, MO
Embedded Real-Time Systems Processing interrupts Lecturer Department University.
CPU Scheduling Scheduling processes (or kernel-level threads) onto the cpu is one of the most important OS functions. The cpu is an expensive resource.
Mohit Aron Peter Druschel Presenter: Christopher Head
Kernel Design & Implementation
CS 6560: Operating Systems Design
Chapter 5a: CPU Scheduling
Realtime Linux Clark Williams Tech Lead
Unit OS2: Operating System Principles
CS 286 Computer Organization and Architecture
Plug-and-Play.
Chapter 6: CPU Scheduling
HW & Systems: Operating Systems IS 101Y/CMSC 101 Computational Thinking and Design Tuesday, October 22, 2013 Carolyn Seaman University of Maryland, Baltimore.
I/O BUSES.
Top Half / Bottom Half Processing
EE 472 – Embedded Systems Dr. Shwetak Patel.
CS703 – Advanced Operating Systems
Presentation transcript:

The Problems You’re Having May Not Be the Problems You Think You’re Having: Results from a Latency Study of Windows NT John Regehr — University of Virginia Mike Jones — Microsoft Research

Research Context  Developing soft real-time scheduler for Windows NT  Predictability issues:  How large are observed worst-case thread scheduling latencies?  Can they be improved?  Measured actual latencies

NT Latency Results  Typically can schedule tasks every small number of milliseconds  But ill-behaved drivers, hardware can take many milliseconds  Software delays of up to 16ms observed  Hardware delays of up to 30ms observed Results from NT 5, Pentium II-333

Deferred Procedure Calls  Analogous to Unix bottom halves  Are preempted by interrupts  Preempt normal threads  May not block  Are run in FIFO order  Typical Uses:  I/O Completion Processing  Background Driver Housekeeping

Non-Problems  Interrupts  Interrupt handlers needing substantial work queue DPCs  Never observed interrupt handler taking substantial fraction of ms  Ethernet Packet Processing  With back-to-back 100Mbit incoming packets of UDP or TCP data:  Longest observed DPC 600µs  Longest delay of user code ~2ms  Tested four common Ethernet cards

Problem: “Unimportant” Background Work  DEC dc21x4 PCI Fast 10/100 Ethernet  6ms periodic DPC every 5s  Autosense processing  Most of 6ms in five 0.88ms calls to routine that reads device register that:  Writes a HW register – 1.5µs  Stalls for 5µs  Writes HW register again – 1.5µs  Stalls for 5µs  Reads a HW register – 1.5µs  Stalls for 5µs  And does this 16 times! (once per bit)

Another Long DPC: Intel EE 16  Intel EtherExpress 16 ISA Ethernet  17ms DPC every 10s  Card reset for no received packets Amusing Observation  Unplugging Ethernet makes latency worse!  Despite conventional wisdom to the contrary

Even Worse: Video Cards  Video cards and drivers conspire to hog the PCI bus  Dragging large window locks out interrupts for up to 30ms  Obliterates sound I/O, for instance  Can set registry key to ask drivers to behave, but not default  No problem when set correctly  Manufacturers’ motivation: WinBench ~ 5% improvement

Video Card Misbehavior Details  Don’t check if card FIFO full before write  Eliminate a PCI read  Stalls PCI bus if full to prevent overflow  Even with AGP, big blits are slow  Problem observed on:  AccelStar II AGP  Matrox Millenium II  Several other major cards also do this

Example Bug: Multimedia Timers  MP HAL uses 976µs interrupt period  Multimedia timers compute absolute time for next wakeup in whole ms  Converted to relative wakeup time and passed to kernel  Interrupt occurs just before wakeup  Timer doesn’t fire  Next time, fires twice to catch up  Fix: compute wakeup in 100ns units

Lesson Your Intuition About Performance is Wrong Only Measurement Reveals the Truth!

Bottom Line  Yes, NT can do RT scheduling  Have done a prototype  But will be of limited value if unscheduled activities continue taking tens of milliseconds  NT developed, tested for throughput  Not small numbers of ms of latency  Improvement will require  Systematic latency testing  Latency requirements specifications

The End

Ethernet Cards Tested  Intel EtherExpress Pro 100b  SMC EtherPower II 10/100  Compaq Netelligent 10/100 Tx  DEC dc21x4 Fast 10/100 Ethernet

Offending Video Cards  Those we observed:  AccelStar II AGP – 3D Labs Permedia 2  Matrox Millenium II  Those we’ve read about:  Tseng Labs ET 6000  Hercules Dynamite 128  S3 cards  ATI – no option to fix!  Number 9 – no option to fix!  For more details:  