Introduction to Systems Programming
Lecture 8: Paging Design; Input-Output
Avishai Wool
Steps in Handling a Page Fault
Virtual-to-Physical Mapping
The CPU accesses virtual address 100000.
The MMU looks in the page table to find the physical address.
–But the page table is in memory too, so every access would need an extra memory access: unreasonable overhead!
TLB: Translation Lookaside Buffer
Idea: keep the most frequently used page table entries in a cache, inside the MMU chip.
The TLB holds a small number of page table entries: usually 8 – 64.
The TLB hit rate is very high because, e.g., instructions are fetched sequentially.
A TLB to Speed Up Paging
Example:
–Code loops through pages 19, 20, 21
–Uses a data array in pages 129, 130, 140
–Stack variables in pages 860, 861
Valid TLB Entries
TLB miss:
–Do a regular page table lookup
–Evict a TLB entry and store the new TLB entry
–A miniature paging system, done in hardware
When the OS context-switches to a new process, all TLB entries become invalid:
–Early instructions of the new process will cause TLB misses.
TLB Placement/Eviction
Done by hardware.
Placement rule:
–TLBIndex = VirtualAddr modulo TLBSize
–TLBSize is always 2^k, so TLBIndex = the k least-significant bits
–Keep a "tag" (the rest of the bits) to fully identify the virtual address
A virtual address can be in only one TLB index.
No explicit "eviction": simply overwrite whatever is in TLB[TLBIndex].
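A minimal C sketch of this direct-mapped placement rule; the entry layout, field widths, and k = 6 are illustrative assumptions, not values from the lecture:

    #include <stdbool.h>
    #include <stdint.h>

    #define TLB_BITS 6                      /* k = 6, so TLBSize = 2^6 = 64 entries */
    #define TLB_SIZE (1u << TLB_BITS)

    typedef struct {
        bool     valid;
        uint32_t tag;     /* remaining bits of the virtual page number */
        uint32_t frame;   /* physical frame number */
    } tlb_entry_t;

    static tlb_entry_t tlb[TLB_SIZE];

    /* Returns true on a TLB hit and fills *frame; on a miss the caller
     * does the regular page-table lookup and then calls tlb_insert(). */
    bool tlb_lookup(uint32_t vpn, uint32_t *frame)
    {
        uint32_t index = vpn & (TLB_SIZE - 1);   /* vpn modulo TLBSize       */
        uint32_t tag   = vpn >> TLB_BITS;        /* the rest of the bits     */

        if (tlb[index].valid && tlb[index].tag == tag) {
            *frame = tlb[index].frame;
            return true;
        }
        return false;
    }

    void tlb_insert(uint32_t vpn, uint32_t frame)
    {
        uint32_t index = vpn & (TLB_SIZE - 1);
        tlb[index] = (tlb_entry_t){ .valid = true,
                                    .tag   = vpn >> TLB_BITS,
                                    .frame = frame };   /* overwrite = implicit eviction */
    }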
TLB + Page Table Lookup
Virtual address:
–In TLB? Yes: form the physical address.
–No: in page table? Yes: update the TLB, form the physical address.
–No: page fault: copy the page from disk to memory, then retry.
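The same flow as a C sketch, reusing the hypothetical tlb_lookup()/tlb_insert() helpers above; page_table_lookup() and handle_page_fault() are stand-ins for the page-table walk and the fault handler, and the 4 KB page size is an assumption:

    #include <stdbool.h>
    #include <stdint.h>

    #define PAGE_SHIFT 12
    #define PAGE_SIZE  (1u << PAGE_SHIFT)

    /* From the TLB sketch above. */
    bool tlb_lookup(uint32_t vpn, uint32_t *frame);
    void tlb_insert(uint32_t vpn, uint32_t frame);

    /* Stand-ins for the page-table walk and the fault handler. */
    bool page_table_lookup(uint32_t vpn, uint32_t *frame);
    void handle_page_fault(uint32_t vpn);

    uint32_t translate(uint32_t vaddr)
    {
        uint32_t vpn    = vaddr >> PAGE_SHIFT;
        uint32_t offset = vaddr & (PAGE_SIZE - 1);
        uint32_t frame;

        if (!tlb_lookup(vpn, &frame)) {               /* TLB miss */
            while (!page_table_lookup(vpn, &frame))   /* not in memory?              */
                handle_page_fault(vpn);               /* copy from disk, then retry  */
            tlb_insert(vpn, frame);                   /* update the TLB              */
        }
        return (frame << PAGE_SHIFT) | offset;        /* physical address            */
    }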
TLB – cont.
If the address is in the TLB, the page is in physical memory:
–The OS invalidates the TLB entry when evicting a page
–So a page fault is not possible on a TLB hit
–The "page fault rate" is therefore computed only over TLB misses
Example: Average Memory Access Time
TLB lookup: 4ns; physical memory access: 10ns; disk access: 10ms.
TLB miss rate: 1%; page fault rate: 0.1%. Assume the page table is in memory.
–TLB hit: p = 0.99, time = 4ns + 10ns
–TLB miss, page hit: p = 0.01 * 0.999, time = 4ns + 10ns + 10ns
–TLB miss, page fault: p = 0.01 * 0.001, time = 4ns + 10ns + 10ms + 10ns
Average memory access: 114.1ns (1.141 * 10^-7 s)
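A short C check of the slide's arithmetic (all times converted to nanoseconds):

    #include <stdio.h>

    int main(void)
    {
        double tlb = 4.0, mem = 10.0, disk = 10e6;   /* 10 ms = 10^7 ns */

        double t_hit   = tlb + mem;                  /* TLB hit                     */
        double t_miss  = tlb + mem + mem;            /* TLB miss, page in memory    */
        double t_fault = tlb + mem + disk + mem;     /* TLB miss, page fault        */

        double avg = 0.99 * t_hit
                   + 0.01 * 0.999 * t_miss
                   + 0.01 * 0.001 * t_fault;

        printf("average access time = %.1f ns\n", avg);   /* prints about 114.1 */
        return 0;
    }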
Design Issues in Paging
Local versus Global Allocation Policies
Figure (physical memory):
a) Original configuration – 'A' causes a page fault
b) Local page replacement
c) Global page replacement
Local or Global?
Local: the number of frames per process is fixed
–If the working set grows: thrashing
–If the working set shrinks: waste
Global is usually better.
Some algorithms can only be local (working set, WSClock).
How Many Frames to Give a Process?
–A fixed number
–Proportional to its size (before load)
–Zero: let it issue page faults for all its pages. This is called pure demand paging.
–Monitor the page-fault frequency (PFF); give more frames if PFF is high.
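A minimal sketch of PFF-based allocation, run periodically per process; the thresholds, field names, and bookkeeping are illustrative assumptions, not from the lecture:

    /* Illustrative per-process bookkeeping. */
    typedef struct {
        unsigned faults_this_interval;   /* page faults since the last check */
        unsigned frames;                 /* frames currently allocated       */
    } process_t;

    #define PFF_HIGH 10   /* too many faults per interval: give a frame   */
    #define PFF_LOW   2   /* very few faults: a frame can be reclaimed    */

    /* Called by the OS once per measurement interval. */
    void pff_adjust(process_t *p, unsigned *free_frames)
    {
        if (p->faults_this_interval > PFF_HIGH && *free_frames > 0) {
            p->frames++;                 /* working set is growing   */
            (*free_frames)--;
        } else if (p->faults_this_interval < PFF_LOW && p->frames > 1) {
            p->frames--;                 /* working set has shrunk   */
            (*free_frames)++;
        }
        p->faults_this_interval = 0;     /* start a new interval     */
    }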
Page fault rate as a function of the number of page frames assigned
Load Control
Despite good designs, the system may still thrash.
When the PFF algorithm indicates that:
–some processes need more memory
–but no processes need less
Solution: reduce the number of processes competing for memory
–swap one or more to disk, divide up the frames they held
–reconsider the degree of multiprogramming
Cleaning Policy
Need for a background process, the paging daemon:
–periodically inspects the state of memory
When too few frames are free:
–selects pages to evict using a replacement algorithm
It can use the same circular list (clock) as the regular page replacement algorithm, but with a different pointer.
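A minimal sketch of such a daemon loop with its own clock hand; the frame-table layout, thresholds, and helper names are assumptions for illustration:

    #include <stdbool.h>

    #define NUM_FRAMES  1024
    #define FREE_TARGET   64    /* keep at least this many frames free */

    typedef struct {
        bool free;
        bool referenced;        /* R bit, cleared as the hand passes */
    } frame_t;

    static frame_t  frames[NUM_FRAMES];
    static unsigned daemon_hand;              /* separate hand from the regular clock */

    void     write_back_and_free(unsigned f); /* stand-in: clean the page, mark frame free */
    unsigned count_free_frames(void);         /* stand-in */

    /* Run periodically by the paging daemon. */
    void paging_daemon_tick(void)
    {
        while (count_free_frames() < FREE_TARGET) {
            frame_t *f = &frames[daemon_hand];
            if (!f->free) {
                if (f->referenced)
                    f->referenced = false;        /* second chance        */
                else
                    write_back_and_free(daemon_hand);
            }
            daemon_hand = (daemon_hand + 1) % NUM_FRAMES;
        }
    }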
Windows XP Page Replacement
Processes are assigned a working set minimum and a working set maximum.
–The working set minimum is the minimum number of page frames the process is guaranteed to have in memory.
–A process may be assigned page frames up to its working set maximum.
When the amount of free memory in the system falls below a threshold, automatic working set trimming is performed to restore the amount of free memory.
–Working set trimming removes frames from processes that have more than their working set minimum.
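A sketch of the trimming idea only; the data layout and the eviction step are simplified assumptions, not the actual Windows implementation:

    typedef struct {
        unsigned ws_min;      /* guaranteed frames          */
        unsigned ws_max;      /* upper bound on frames      */
        unsigned ws_size;     /* frames currently resident  */
    } ws_process_t;

    void evict_one_frame(ws_process_t *p);   /* stand-in: pick and free one frame */

    /* Called when free memory drops below a threshold. */
    void trim_working_sets(ws_process_t *procs, unsigned n,
                           unsigned *free_frames, unsigned target)
    {
        for (unsigned i = 0; i < n && *free_frames < target; i++) {
            /* Only trim processes above their guaranteed minimum. */
            while (procs[i].ws_size > procs[i].ws_min && *free_frames < target) {
                evict_one_frame(&procs[i]);
                procs[i].ws_size--;
                (*free_frames)++;
            }
        }
    }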
Devices, Controllers, and I/O Architectures
I/O Device Types
Block devices
–block size of 512-32768 bytes
–blocks can be read/written individually
–typical: disks / floppy / CD
Character devices
–deliver / accept a sequential stream of characters
–non-addressable
–typical: keyboard, mouse, printer, network
Other: monitor, clock
Typical Data Rates
Device Controllers
I/O devices have two components:
–a mechanical component
–an electronic component
The electronic component is the device controller
–may be able to handle multiple devices
Controller's tasks:
–convert a serial bit stream to a block of bytes
–perform error correction as necessary
–make the data available to main memory
Communicating with Controllers
Controllers have registers to deliver data, accept data, etc.
Option 1: special I/O commands and I/O ports
–in r0, 4
–"4" is not memory address 4, it is I/O port 4
Option 2: I/O registers mapped to memory addresses
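On x86, the special I/O instructions of option 1 can be wrapped in a C helper with GCC/Clang inline assembly. This is a hedged sketch: the slide's "in r0, 4" is generic pseudo-assembly, and running this from user space would additionally need I/O privileges (e.g. ioperm() on Linux):

    #include <stdint.h>

    /* Read one byte from an I/O port using the x86 "in" instruction
     * (port-mapped I/O: the port number is not a memory address). */
    static inline uint8_t inb(uint16_t port)
    {
        uint8_t value;
        __asm__ volatile ("inb %1, %0" : "=a"(value) : "Nd"(port));
        return value;
    }

    /* Example: read I/O port 4, the slide's "in r0, 4". */
    uint8_t read_port_4(void)
    {
        return inb(4);
    }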
Memory-Mapped Registers
The controller is connected to the bus.
It has a physical "memory address" like B0000000.
When this address appears on the bus, the controller responds (read/write to its I/O register).
RAM is configured to ignore the controller's addresses.
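For option 2, an ordinary load or store through a volatile pointer reaches the controller's register. A minimal sketch using the slide's example address; the 32-bit register width is an assumption:

    #include <stdint.h>

    /* The controller's register at the slide's example address B0000000. */
    #define DEV_REG (*(volatile uint32_t *)0xB0000000u)

    uint32_t dev_read(void)        { return DEV_REG; }   /* load hits the controller  */
    void     dev_write(uint32_t v) { DEV_REG = v; }      /* store hits the controller */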
Possible I/O Register Mappings
–Separate I/O and memory space (IBM 360)
–Memory-mapped I/O (PDP-11)
–Hybrid (Pentium: 640K-1M are for I/O)
Advantages of Memory-Mapped I/O
No special instructions; I/O code can be written in C.
Protection by not putting the I/O memory in the user's virtual address space.
All machine instructions can access I/O:
LOOP:  test *b0000004   // check if port_4 is 0
       beq READY
       branch LOOP
READY: ...
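Since the slide notes this can be written in C, here is a hedged C equivalent of the loop above, assuming (as the slide's example does) that the device at b0000004 reads zero when it is ready:

    #include <stdint.h>

    #define PORT_4 (*(volatile uint32_t *)0xB0000004u)

    /* Spin until port_4 reads 0, i.e. the device reports ready. */
    void wait_until_ready(void)
    {
        while (PORT_4 != 0)
            ;   /* busy-wait */
    }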
Disadvantages of Memory-Mapped I/O
Memory and I/O controllers have to be on the same bus:
–but modern architectures have a separate memory bus!
–the Pentium has 3 buses: memory, PCI, ISA
Bus Architectures
(a) A single-bus architecture
(b) A dual-bus memory architecture
Memory-Mapped I/O with a Separate Bus
I/O controllers do not see the memory bus.
Option 1: send every address to the memory bus first; if there is no response, try the I/O bus.
Option 2: a snooping device between the buses
–the speed difference is a problem
Option 3 (Pentium): filter addresses in the PCI bridge
Structure of a large Pentium system
Principles of I/O Software
Goals of I/O Software
Device independence
–programs can access any I/O device
–without specifying the device in advance (floppy, hard drive, or CD-ROM)
Uniform naming
–the name of a file or device is a string or an integer
–not depending on which machine it is attached to
Error handling
–handle errors as close to the hardware as possible
Goals of I/O Software (2)
Synchronous vs. asynchronous transfers
–blocked transfers vs. interrupt-driven
Buffering
–data coming off a device often cannot be stored directly in its final destination
Sharable vs. dedicated devices
–disks are sharable
–tape drives would not be
How is I/O Programmed?
–Programmed I/O
–Interrupt-driven I/O
–DMA (Direct Memory Access)
Programmed I/O
Steps in printing a string
Polling
Busy-waiting until the device can accept another character.
The example assumes memory-mapped registers.
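A hedged C sketch of the programmed-I/O printing loop the slide refers to, in the style of the textbook's printer example; the register names and addresses are illustrative assumptions:

    #include <stddef.h>

    /* Assumed memory-mapped printer registers (illustrative addresses). */
    #define printer_status_reg (*(volatile unsigned *)0xB0000000u)
    #define printer_data_reg   (*(volatile unsigned *)0xB0000004u)
    #define READY 1u

    /* Programmed I/O: the CPU busy-waits and copies every character itself. */
    void print_string(const char *p, size_t count)
    {
        for (size_t i = 0; i < count; i++) {
            while (printer_status_reg != READY)
                ;                          /* poll until the printer can take a char */
            printer_data_reg = p[i];       /* output one character                   */
        }
    }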
Properties of Programmed I/O
–Simple to program
–Ties up the CPU, especially if the device is slow
Interrupts Revisited
Interrupt-Driven I/O
–Code executed when the print system call is made
–Interrupt service procedure
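A hedged sketch of those two pieces for the same hypothetical printer: the system call starts the device and blocks the caller, and the interrupt service procedure sends one character per interrupt. The scheduler_* and acknowledge_interrupt helpers are illustrative stand-ins:

    #include <stddef.h>

    /* Assumed memory-mapped printer registers, as in the polling sketch. */
    #define printer_status_reg (*(volatile unsigned *)0xB0000000u)
    #define printer_data_reg   (*(volatile unsigned *)0xB0000004u)
    #define READY 1u

    void scheduler_block_caller(void);     /* stand-in: put the caller to sleep */
    void scheduler_unblock_caller(void);   /* stand-in: wake the caller up      */
    void acknowledge_interrupt(void);      /* stand-in: ack the controller      */

    static const char *buf;                /* string being printed     */
    static size_t next, count;             /* next char index, length  */

    /* Code executed when the print system call is made. */
    void sys_print(const char *p, size_t n)
    {
        buf = p; next = 0; count = n;
        while (printer_status_reg != READY)
            ;                              /* wait only for the first slot  */
        printer_data_reg = buf[next++];    /* start the device              */
        scheduler_block_caller();          /* CPU is free to run others     */
    }

    /* Interrupt service procedure: one interrupt per character. */
    void printer_isr(void)
    {
        if (next < count)
            printer_data_reg = buf[next++];   /* feed the next character */
        else
            scheduler_unblock_caller();       /* whole string is done    */
        acknowledge_interrupt();
    }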
Properties of Interrupt-Driven I/O
–An interrupt on every character or word
–Interrupt handling takes time
–Makes sense for slow devices (keyboard, mouse)
–For fast devices use a dedicated DMA controller, usually for disk and network
Direct Memory Access (DMA)
The DMA controller has access to the bus.
Registers:
–memory address to write to / read from
–byte count
–I/O port or mapped-memory address to use
–direction (read from / write to the device)
–transfer unit (byte or word)
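A hedged sketch of programming such a register set for one transfer; the register layout, controller address, and control bits are illustrative assumptions:

    #include <stdint.h>

    /* Illustrative register block of a DMA controller. */
    typedef struct {
        volatile uint32_t mem_addr;    /* physical memory address        */
        volatile uint32_t byte_count;  /* number of bytes to transfer    */
        volatile uint32_t dev_addr;    /* I/O port or mapped device addr */
        volatile uint32_t control;     /* direction, unit, start bit     */
    } dma_regs_t;

    #define DMA ((dma_regs_t *)0xB0001000u)   /* assumed controller address */

    #define DMA_DIR_DEV_TO_MEM  (1u << 0)
    #define DMA_UNIT_WORD       (1u << 1)
    #define DMA_START           (1u << 31)

    /* Program a device-to-memory transfer; completion is signalled by an
     * interrupt from the DMA controller (not shown here). */
    void dma_read(uint32_t phys_addr, uint32_t nbytes, uint32_t dev_port)
    {
        DMA->mem_addr   = phys_addr;
        DMA->byte_count = nbytes;
        DMA->dev_addr   = dev_port;
        DMA->control    = DMA_DIR_DEV_TO_MEM | DMA_UNIT_WORD | DMA_START;
    }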
Operation of a DMA transfer
I/O Using DMA
–Code executed when the print system call is made
–Interrupt service procedure
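Continuing the same hypothetical printer and DMA controller, a sketch of those two pieces: the system call only programs the DMA controller and blocks, and a single interrupt arrives when the whole buffer has been transferred. All helper names are stand-ins:

    #include <stdint.h>

    #define PRINTER_PORT 0x40u                 /* assumed device port          */

    void dma_program(uint32_t phys, uint32_t n,
                     uint32_t port);           /* stand-in: as sketched above  */
    void scheduler_block_caller(void);         /* stand-in                     */
    void scheduler_unblock_caller(void);       /* stand-in                     */
    void acknowledge_interrupt(void);          /* stand-in                     */

    /* Code executed when the print system call is made:
     * hand the whole (pinned) buffer to the DMA controller. */
    void sys_print_dma(uint32_t buf_phys, uint32_t n)
    {
        dma_program(buf_phys, n, PRINTER_PORT);
        scheduler_block_caller();              /* sleep until the transfer ends */
    }

    /* Interrupt service procedure: one interrupt per buffer, not per character. */
    void dma_done_isr(void)
    {
        acknowledge_interrupt();
        scheduler_unblock_caller();
    }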
DMA with Virtual Memory
–Most DMA controllers use physical addresses.
–What if the memory holding the buffer is paged out during the DMA transfer?
–Force the page to stay in memory ("pinning").
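Inside the kernel, pinning is applied to the frames backing the DMA buffer. From user space, the POSIX mlock() call is a rough analogue: it asks the OS to keep a buffer's pages resident. A hedged illustration:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>

    /* Pin a buffer so its pages cannot be paged out (POSIX mlock). */
    int main(void)
    {
        size_t len = 1 << 20;               /* 1 MiB buffer */
        char *buf = malloc(len);
        if (!buf) return 1;

        if (mlock(buf, len) != 0) {         /* pages now stay resident */
            perror("mlock");
            return 1;
        }

        /* ... buffer would now be safe to describe to a DMA engine ... */

        munlock(buf, len);                  /* allow paging again */
        free(buf);
        return 0;
    }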
Burst or Cycle-Stealing
The DMA controller grabs the bus one word at a time, competing with the CPU for bus access. This is called "cycle stealing".
In "burst" mode the DMA controller acquires the bus exclusively, issues several transfers, and then releases it.
–More efficient
–May block the CPU and other devices
Concepts for Review
–TLB
–Local/global page replacement
–Demand paging
–Page-fault-frequency monitoring
–I/O device controllers
–in/out commands
–Memory-mapped registers
–PCI bridge
–Programmed I/O (polling)
–Interrupt-driven I/O
–I/O using DMA
–Page pinning
–DMA cycle-stealing
–DMA burst mode