MBG 1 CIS501, Fall 99 Lecture 19: Input/Output (I/O): Buses and Peripherals Michael B. Greenwald Computer Architecture CIS 501 Fall 1999.

MBG 2 CIS501, Fall 99 Bus Options (See Figure 6.9, page 497)

MBG 3 CIS501, Fall 99 Administration
Sotiris will lecture on Thursday (Chapter 4).
Thoughts on exercise 6.10 (should not involve any changes to your homework!)

MBG 4 CIS501, Fall 99 Processor Interface Issues
How does the bus interface with/to the processor?
Interconnections/Buses
– Shared vs. separate memory/I/O buses
– Attach to memory, cache, or processor (separate only)
Processor communication interface
– I/O interface vs. memory-mapped I/O
I/O control structure
– Polling
– Interrupts
– DMA
– I/O controllers
– I/O processors

MBG 5 CIS501, Fall 99 How does the processor access I/O devices?
– Need to read and write control and status registers.
– Need to transfer data to/from the I/O device.

MBG 6 CIS501, Fall 99 I/O Interface
Independent I/O bus
[Diagram: CPU, Memory, Interface, Peripheral; a memory bus plus an independent I/O bus]
– Separate I/O instructions (in, out)
Common memory & I/O bus
[Diagram: CPU, Memory, Interface, Peripheral on one shared bus]
– Lines distinguish between I/O and memory transfers
– Examples: VME bus, Multibus-II, Nubus
– 40 Mbytes/sec optimistically
– A 10 MIP processor completely saturates the bus!

MBG 7 CIS501, Fall 99 Memory Mapped I/O
Single memory & I/O bus; no separate I/O instructions
[Diagram: CPU, Memory, Interface, Peripheral on one bus; address space divided among ROM, RAM, and I/O]
[Diagram: CPU, $, L2 $, memory bus, memory, bus adaptor, I/O bus]
The bus adaptor snoops memory bus transactions and converts I/O-space addresses to I/O operations on the I/O bus. (It converts I/O ops to memory reads and writes, too.)
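A minimal sketch of the address decoding a bus adaptor performs under memory-mapped I/O, assuming a made-up address map. The names `BusAdaptor` and `Device` and the ranges `RAM_BASE`, `IO_BASE` are hypothetical and chosen only for illustration; the lecture does not give a concrete map.

```python
# Hypothetical address map; these ranges are illustrative, not from the lecture.
RAM_BASE, RAM_SIZE = 0x0000, 0x8000
IO_BASE, IO_SIZE = 0xF000, 0x0100


class Device:
    """Toy peripheral exposing a small bank of control/status registers."""

    def __init__(self):
        self.regs = {}

    def read(self, offset):
        return self.regs.get(offset, 0)

    def write(self, offset, value):
        self.regs[offset] = value


class BusAdaptor:
    """Routes ordinary loads and stores to RAM or to the device by address."""

    def __init__(self, device):
        self.ram = bytearray(RAM_SIZE)
        self.device = device

    def _is_io(self, addr):
        return IO_BASE <= addr < IO_BASE + IO_SIZE

    def _is_ram(self, addr):
        return RAM_BASE <= addr < RAM_BASE + RAM_SIZE

    def load(self, addr):
        if self._is_io(addr):
            # An I/O-space address becomes an I/O read on the device.
            return self.device.read(addr - IO_BASE)
        if self._is_ram(addr):
            return self.ram[addr - RAM_BASE]
        raise ValueError(f"unmapped address {addr:#x}")

    def store(self, addr, value):
        if self._is_io(addr):
            # An I/O-space address becomes an I/O write on the device.
            self.device.write(addr - IO_BASE, value)
        elif self._is_ram(addr):
            self.ram[addr - RAM_BASE] = value & 0xFF
        else:
            raise ValueError(f"unmapped address {addr:#x}")
```

The point of the sketch is that the CPU side uses only `load` and `store`; no separate I/O instruction is needed to reach the device registers.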

MBG 8 CIS501, Fall 99 I/O Architecture Hardware covers interconnection point and number of buses. Software architecture: how I/O is managed by processor(s).

MBG 9 CIS501, Fall 99 Example: Communications Networks
– Performance limiter is the memory system and OS overhead, not protocols
– Send/receive queues in processor memories
– Network controller copies back and forth via DMA
– No host intervention needed
– Interrupt host when message sent or received

MBG 10 CIS501, Fall 99 I/O Data Flow
Impediment to high performance: multiple copies, complex hierarchy
Pipeline? Not if shared resource!

MBG 11 CIS501, Fall 99 Processor Interface Issues
How does the bus interface with/to the processor?
Interconnections/Buses
– Shared vs. separate memory/I/O buses
– Attach to memory, cache, or processor (separate only)
Processor communication interface
– I/O interface vs. memory-mapped I/O
I/O control structure
– Polling
– Interrupts
– DMA
– I/O controllers
– I/O processors

MBG 12 CIS501, Fall 99 Programmed I/O (Polling)
[Diagram: CPU, IOC, device, Memory]
[Flowchart: Is the data ready? no: ask again (busy-wait loop); yes: read data, store data. Done? no: repeat; yes: exit]
– The busy-wait loop is not an efficient way to use the CPU unless the device is very fast & very busy!
– But checks for I/O completion can be dispersed among computationally intensive code, at the cost of increased interrupt latency
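The flowchart on this slide can be sketched as a busy-wait loop against a simulated device. `SimulatedDevice` and `polled_read` are hypothetical names, and the "ready every third poll" behaviour is an assumption made only so the wasted polls are visible.

```python
class SimulatedDevice:
    """Hypothetical device that reports ready on every `period`-th poll."""

    def __init__(self, words, period=3):
        self.words = list(words)
        self.period = period
        self.polls = 0

    def ready(self):
        self.polls += 1
        return self.polls % self.period == 0

    def read_data(self):
        return self.words.pop(0)

    def done(self):
        return not self.words


def polled_read(device):
    """Programmed I/O: spin until ready, read a word, store it, repeat."""
    memory = []
    wasted_polls = 0
    while not device.done():          # "done?" test from the flowchart
        while not device.ready():     # "is the data ready?" busy-wait loop
            wasted_polls += 1
        memory.append(device.read_data())  # read data, then store data
    return memory, wasted_polls
```

Every iteration counted in `wasted_polls` is CPU time that did no useful work, which is the inefficiency the slide points at.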

MBG 13 CIS501, Fall 99 Performance of Polling
Consider a 1 Mbps serial line, one 32-bit word at a time.
External device delivers one word every 32 usecs.
16 usec to access device, 2 usec to xfer 1 word.
When machine is idle: 16 usecs / 32 usecs, so 50% of machine!
When busy: 56% of machine.
Buffer     Polling interval   Idle    Busy
(bytes)    (usecs)            (%)     (%)
0          32                 50.0    56
8          64                 25.0    32
16         128                12.5    19
512        4000               0.4     6
2048       16000              0.1     6
Here, latency = 1/2 polling interval.
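A rough model of the arithmetic on this slide, assuming each poll costs one 16 usec device access plus 2 usec for every word that arrived during the polling interval. The function name `polling_cost` and this accounting are assumptions; the result matches the slide's 50% and 56% figures for the unbuffered case and comes out close to, though not always identical to, its rounded entries for larger buffers.

```python
def polling_cost(interval_us, access_us=16.0, xfer_us=2.0, word_period_us=32.0):
    """Return (idle_fraction, busy_fraction) of CPU time spent polling.

    idle: the line carries no data, so each poll costs only the device access.
    busy: a word arrives every word_period_us, and each one costs xfer_us more.
    """
    idle = access_us / interval_us
    words_per_poll = interval_us / word_period_us
    busy = (access_us + xfer_us * words_per_poll) / interval_us
    return idle, busy


# Polling every 32 usec (no buffering) and every 128 usec (16-byte buffer).
no_buffer = polling_cost(32.0)
sixteen_byte_buffer = polling_cost(128.0)
```

Lengthening the interval shrinks both fractions, but the busy cost never drops below 2/32 (about 6%), the per-word transfer time, which is where the last rows of the table level off.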

MBG 14 CIS501, Fall 99 Interrupt Driven Data Transfer
[Diagram: CPU, IOC, device, Memory. Memory holds the user program (add, sub, and, or, nop) and the interrupt service routine (read, store, ..., rti). Labels: (1) I/O interrupt, (2) save PC, (3) interrupt service addr, (4)]
User program progress is only halted during the actual transfer.
Xfer: device xfer rate = 10 MBytes/sec => 0.1 x 10^-6 sec/byte => 0.1 µsec/byte => 1000 bytes = 100 µsec
1000 transfers x 100 µsecs = 100 ms = 0.1 CPU seconds
Overhead to set up xfer, 1000 transfers at 1 ms each:
– 1000 interrupts @ 2 µsec per interrupt
– 1000 interrupt services @ 98 µsec each
= 0.1 CPU seconds
Still far from device transfer rate! 1/2 in interrupt overhead.
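The slide's arithmetic can be checked with a few lines. This is only a restatement of the numbers above; the function name `interrupt_driven_cost` and the choice to work in microseconds are assumptions of the sketch.

```python
def interrupt_driven_cost(transfers=1000, bytes_per_transfer=1000,
                          device_bytes_per_us=10.0,
                          interrupt_us=2.0, service_us=98.0):
    """Return (transfer_us, overhead_us, overhead_fraction) of CPU time."""
    # 10 MBytes/sec is 10 bytes per microsecond, so 1000 bytes take 100 usec.
    transfer_us = transfers * bytes_per_transfer / device_bytes_per_us
    # Each transfer also pays for interrupt delivery and the service routine.
    overhead_us = transfers * (interrupt_us + service_us)
    fraction = overhead_us / (transfer_us + overhead_us)
    return transfer_us, overhead_us, fraction


transfer_us, overhead_us, fraction = interrupt_driven_cost()
# transfer_us and overhead_us are both 100000 usec (0.1 s); fraction is 0.5
```

With these parameters the CPU spends as long entering and leaving the handler as it does moving data, which is the "1/2 in interrupt overhead" on the slide.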

MBG 15 CIS501, Fall 99 Performance of Interrupts
Consider a 1 Mbps serial line, one 32-bit word at a time.
External device delivers one word every 32 usecs.
2 usec to deliver interrupt, 98 usec service routine, 2 usec/word.
When machine is idle: no overhead.
Same buffer trick: choose a 24 byte buffer to match latency.
[Table not recovered; column headings: Words, CPU time, Utilization of serial line]

MBG 16 CIS501, Fall 99 Performance of Interrupts
Consider a 1 Mbps serial line, one 32-bit word at a time.
External device delivers one word every 32 usecs.
2 usec to deliver interrupt, 98 usec service routine, 2 usec/word.
When machine is idle: no overhead.
Same buffer trick: choose a 512 byte buffer to match pkt size.
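The tables for these two slides did not survive in the transcript, so the following is only a sketch of how the stated costs might combine, not a reconstruction of the lost numbers. It assumes one interrupt per full buffer, costing 2 usec delivery plus 98 usec service plus 2 usec per buffered word; `interrupt_cpu_fraction` is a hypothetical name.

```python
def interrupt_cpu_fraction(words_per_interrupt, deliver_us=2.0, service_us=98.0,
                           per_word_us=2.0, word_period_us=32.0):
    """Fraction of CPU time used when one interrupt drains a whole buffer."""
    cpu_us = deliver_us + service_us + per_word_us * words_per_interrupt
    # The buffer takes this long to fill at one word per word_period_us.
    elapsed_us = word_period_us * words_per_interrupt
    return cpu_us / elapsed_us


one_word = interrupt_cpu_fraction(1)        # above 1.0: CPU cannot keep up
buffer_24_bytes = interrupt_cpu_fraction(6)     # 6 words
buffer_512_bytes = interrupt_cpu_fraction(128)  # 128 words
```

Under these assumptions an interrupt per word is unsustainable, while a 512-byte buffer brings the busy-line cost under 10% of the CPU, and an idle line costs nothing at all.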

MBG 17 CIS501, Fall 99 Direct Memory Access
[Diagram: CPU, IOC, device, Memory, DMAC]
Time to do 1000 xfers at 1 msec each:
– Setup: 1000 DMA set-up sequences @ 50 µsec
– 1000 interrupts @ 2 µsec
– 1000 interrupt service sequences @ 48 µsec
No xfer time! 0.1 sec of CPU time, 0.1 sec of device time.
CPU sends a starting address, direction, and length count to the DMAC, then issues "start".
DMAC provides handshake signals for the peripheral controller, and memory addresses and handshake signals for memory.
[Diagram: memory-mapped I/O address space from 0 to n: ROM, RAM, Peripherals, DMAC]
Can also have DMA on each IOC.
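A toy model of the programming sequence described on this slide: the CPU hands the DMAC a starting address, a direction, and a length count, issues "start", and hears back through an interrupt. The class `DMAC`, its method names, and the callback standing in for the interrupt are all hypothetical; real controllers expose these as registers.

```python
class DMAC:
    """Simulated DMA controller that copies between a device buffer and memory."""

    def __init__(self, memory, on_done):
        self.memory = memory      # list standing in for main memory
        self.on_done = on_done    # callback standing in for the completion interrupt
        self.address = 0
        self.count = 0
        self.to_memory = True

    def program(self, address, count, to_memory):
        """CPU side: starting address, length count, and direction."""
        self.address = address
        self.count = count
        self.to_memory = to_memory

    def start(self, device_buffer):
        """Move `count` words without any per-word CPU involvement."""
        for i in range(self.count):
            if self.to_memory:
                self.memory[self.address + i] = device_buffer[i]
            else:
                device_buffer[i] = self.memory[self.address + i]
        self.on_done(self.count)  # one interrupt per block, not per word


memory = [0] * 16
interrupts = []
dmac = DMAC(memory, interrupts.append)
dmac.program(address=4, count=3, to_memory=True)
dmac.start([7, 8, 9])
```

The CPU's cost is the set-up plus a single completion interrupt, which is why the slide counts no transfer time against the CPU.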

MBG 18 CIS501, Fall 99 Input/Output Channels
[Diagram: CPU, Channel, and Mem on the main memory bus; devices D1, D2, ..., Dn on the I/O bus. CPU issues instruction to Channel; Channel interrupts when done; steps labeled (1) to (4), with memory accesses in between]
Device to/from memory transfers are controlled by the Channel directly.
Channel steals memory cycles.
Limited programmability. Fixed task set.
CPU downloads program.
Like DMA, can be a single channel controlling all I/O devices, multiple channels each controlling many devices, or 1 channel per device.

MBG 19 CIS501, Fall 99 Input/Output Processors
[Diagram: CPU, IOP, and Mem on the main memory bus; devices D1, D2, ..., Dn on the I/O bus. CPU selects task set in IOP; IOP interrupts when done; steps labeled (1) to (4), with memory accesses in between]
Can do local processing (byte-swapping, echo negotiation, etc.)
IOP steals memory cycles.
CPU request to IOP: OP | Device | Address
– Device: target device
– Address: where cmnds are
IOP looks in memory for commands: OP | Addr | Cnt | Other
– OP: what to do
– Addr: where to put data
– Cnt: how much
– Other: special requests
Similar approach using FEP (PDP 11 FEP for KL10), or co-processor
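One way to picture the two command formats on this slide is as a pair of records plus a loop that walks the command list. Everything here (`CpuRequest`, `IopCommand`, `run_iop`, the `"read"` opcode, the dictionary layouts) is a hypothetical illustration; the slide gives only the field names.

```python
from dataclasses import dataclass


@dataclass
class CpuRequest:
    """What the CPU hands the IOP: OP | Device | Address."""
    op: str
    device: int    # target device
    address: int   # where the commands are in memory


@dataclass
class IopCommand:
    """What the IOP finds in memory: OP | Addr | Cnt | Other."""
    op: str        # what to do
    addr: int      # where to put data
    cnt: int       # how much
    other: str = ""  # special requests


def run_iop(request, command_memory, devices, data_memory):
    """Fetch the command list named by the request and execute each entry."""
    for cmd in command_memory[request.address]:
        if cmd.op == "read":
            source = devices[request.device]
            for i in range(min(cmd.cnt, len(source))):
                data_memory[cmd.addr + i] = source[i]
    return "done"  # stands in for the completion interrupt to the CPU


devices = {2: [11, 22, 33, 44]}
command_memory = {0x40: [IopCommand("read", addr=8, cnt=3)]}
data_memory = [0] * 16
status = run_iop(CpuRequest("start", device=2, address=0x40),
                 command_memory, devices, data_memory)
```

The CPU's request is small and fixed; the detail of what to move and where lives in memory, which the IOP reads on its own by stealing memory cycles.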

MBG 20 CIS501, Fall 99 Relationship to Processor Architecture
I/O instructions and buses have largely disappeared
Interrupt vectors have been replaced by jump tables
– Vector: PC <- M[IVA + interrupt number]
– Jump table: PC <- IVA + interrupt number
Interrupts:
– Stack replaced by shadow registers
– Handler saves registers and re-enables higher priority int's
– Interrupt types reduced in number; handler must query interrupt controller
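The two register-transfer expressions on this slide differ only in whether memory is read. A small sketch, with `memory` modelled as a dictionary and the function names invented for illustration; like the slide, it leaves out any scaling of the interrupt number by entry size.

```python
def dispatch_vectored(memory, iva, interrupt_number):
    """PC <- M[IVA + interrupt number]: the table holds handler addresses."""
    return memory[iva + interrupt_number]


def dispatch_jump_table(iva, interrupt_number):
    """PC <- IVA + interrupt number: execution begins inside the table itself."""
    return iva + interrupt_number


IVA = 0x100
memory = {IVA + 3: 0x4000}   # vector entry 3 points at a handler at 0x4000

pc_vectored = dispatch_vectored(memory, IVA, 3)   # 0x4000, after a memory read
pc_jump_table = dispatch_jump_table(IVA, 3)       # 0x103, no memory read
```

The jump-table form avoids a memory access on the dispatch path; the code at the table entry then branches to the real handler.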

MBG 21 CIS501, Fall 99 Relationship to Processor Architecture
Caches required for processor performance cause problems for I/O
– Flushing is expensive, I/O pollutes cache
– Solution is borrowed from shared memory multiprocessors: "snooping" cache coherency protocols
Virtual memory frustrates DMA:
– If > page, physical not contiguous; virtual requires wiring pages down
Load/store architecture at odds with atomic operations
– load locked, store conditional
Stateful processors hard to context switch
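Load locked / store conditional can be illustrated with a small single-threaded simulation. This is a sketch of the general idea only: the `Memory` class, the per-CPU reservation table, and the `interfere` hook used to inject a competing store are assumptions, not a model of any particular processor.

```python
class Memory:
    """Simulated memory with one load-locked reservation per CPU."""

    def __init__(self):
        self.cells = {}
        self.reservations = {}  # cpu id -> reserved address

    def load_locked(self, cpu, addr):
        self.reservations[cpu] = addr
        return self.cells.get(addr, 0)

    def store(self, addr, value):
        self.cells[addr] = value
        # Any store to the address cancels every reservation on it.
        for cpu, reserved in list(self.reservations.items()):
            if reserved == addr:
                del self.reservations[cpu]

    def store_conditional(self, cpu, addr, value):
        if self.reservations.get(cpu) != addr:
            return False          # reservation lost: caller must retry
        self.store(addr, value)
        return True


def atomic_increment(mem, cpu, addr, interfere=None):
    """Retry load-locked/store-conditional until the update goes through."""
    attempts = 0
    while True:
        attempts += 1
        old = mem.load_locked(cpu, addr)
        if interfere is not None and attempts == 1:
            interfere()           # simulate another agent writing in between
        if mem.store_conditional(cpu, addr, old + 1):
            return attempts


mem = Memory()
mem.store(0, 5)
attempts = atomic_increment(mem, cpu=0, addr=0, interfere=lambda: mem.store(0, 10))
```

The first attempt fails because the intervening store broke the reservation; the retry reads the new value and succeeds, so the increment is never applied to stale data.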

MBG 22 CIS501, Fall 99 Interconnect Trends
Interconnect = glue that interfaces computer system components
High speed hardware interfaces + logical protocols
Networks, channels, backplanes
– memory-mapped, wide pathways, centralized arb
– message-based, narrow pathways, distributed arb

MBG 23 CIS501, Fall 99 1990 Bus Survey (P&H, 1st Ed)
                   VME        FutureBus   Multibus II     IPI          SCSI
Signals            128        96          96              16           8
Addr/Data mux      no         yes         yes             n/a          n/a
Data width         16 - 32    32          32              16           8
Masters            multi      multi       multi           single       multi
Clocking           Async      Async       Sync            Async        either
MB/s (0ns, word)   25         37          20              25           1.5 (asyn), 5 (sync)
150ns word         12.9       15.5        10              =            =
0ns block          27.9       95.2        40              =            =
150ns block        13.6       20.8        13.3            =            =
Max devices        21         20          21              8            7
Max meters         0.5        0.5         0.5             50           25
Standard           IEEE 1014  IEEE 896.1  ANSI/IEEE 1296  ANSI X3.129  ANSI X3.131

MBG 24 CIS501, Fall 99 SCSI: Small Computer System Interface
Up to 8 devices communicate on a bus or "string" at sustained speeds of 4-5 MBytes/sec
SCSI-2 up to 20 MB/sec
Devices can be slave ("target") or master ("initiator")
SCSI protocol: a series of "phases", during which specific actions are taken by the controller and the SCSI disks
– Bus Free: no device is currently accessing the bus
– Arbitration: when the SCSI bus goes free, multiple devices may request (arbitrate for) the bus; fixed priority by address
– Selection: informs the target that it will participate (Reselection if disconnected)
– Command: the initiator reads the SCSI command bytes from host memory and sends them to the target
– Data Transfer: data in or out, between initiator and target
– Message Phase: message in or out, between initiator and target (identify, save/restore data pointer, disconnect, command complete)
– Status Phase: target, just before command complete
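A small sketch of two ideas from the phase list: fixed-priority arbitration and the order of phases for one simple command. The rule that the highest device address wins comes from the SCSI standard rather than from the slide, and the phase sequence is a simplified illustration (no disconnect or reselection); `Phase`, `arbitrate`, and `simple_command_phases` are hypothetical names.

```python
from enum import Enum


class Phase(Enum):
    BUS_FREE = "Bus Free"
    ARBITRATION = "Arbitration"
    SELECTION = "Selection"
    COMMAND = "Command"
    DATA = "Data Transfer"
    STATUS = "Status"
    MESSAGE = "Message"


def arbitrate(requesting_ids):
    """Fixed priority by address: the highest requesting ID wins the bus."""
    if not requesting_ids:
        return None
    return max(requesting_ids)


def simple_command_phases():
    """Phases for one command with no disconnect, in simplified order."""
    return [
        Phase.BUS_FREE,
        Phase.ARBITRATION,
        Phase.SELECTION,
        Phase.COMMAND,
        Phase.DATA,
        Phase.STATUS,    # status comes just before "command complete"
        Phase.MESSAGE,   # carries the "command complete" message
        Phase.BUS_FREE,
    ]


winner = arbitrate([2, 7, 5])   # device 7 wins
phases = simple_command_phases()
```

Because priority is fixed, a busy high-address device can starve lower ones, which is one reason the host adapter conventionally takes the highest ID.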

MBG 25 CIS501, Fall 99 SCSI "Bus": Channel Architecture
– peer-to-peer protocols
– initiator/target
– linear byte streams
– disconnect/reconnect

MBG 26 CIS501, Fall 99 1993 I/O Bus Survey (P&H, 2nd Ed)
Bus                 SBus      TurboChannel  MicroChannel    PCI
Originator          Sun       DEC           IBM             Intel
Clock Rate (MHz)    16-25     12.5-25       async           33
Addressing          Virtual   Physical      Physical        Physical
Data Sizes (bits)   8,16,32   8,16,24,32    8,16,24,32,64   8,16,24,32,64
Master              Multi     Single        Multi           Multi
Arbitration         Central   Central       Central         Central
32 bit read (MB/s)  33        25            20              33
Peak (MB/s)         89        84            75              111 (222)
Max Power (W)       16        26            13              25
