1 Computer Architecture & Assembly Language Spring 2001 Dr. Richard Spillman Lecture 19 – IO II.

2 1 Computer Architecture & Assembly Language Spring 2001 Dr. Richard Spillman Lecture 19 – IO II

3 2 Semester Topics (Diagram: course topic map. Labels: PLU 1, I/O, CPU, Disk, Memory, ALU, Assembly, Microprogramming, Alternatives, Cache, Virtual, Structure, Operation, Network.)

4 3 Review – Last Lecture Introduction to I/O I/O Devices

5 4 Review – I/O Devices I/O devices come in several forms with widely different requirements:

Device             Behavior          Partner    Data Rate (KB/sec)
Keyboard           Input             Human                 0.01
Mouse              Input             Human                 0.02
Line Printer       Output            Human                 1.00
Laser Printer      Output            Human               100.00
Graphics Display   Output            Human            30,000.00
Network-LAN        Input or Output   Machine             200.00
Floppy disk        Storage           Machine              50.00
Optical Disk       Storage           Machine             500.00
Magnetic Disk      Storage           Machine           2,000.00

6 5 Outline Disk Arrays RAID Systems Bus Structures It’s too much my circuits hurt

7 6 Disk Arrays GOAL: Increase the throughput of a disk system. Method: Instead of one large disk with one access link, construct an array of disks operating in parallel. (Diagram: a single I/O bottleneck versus multiple I/O ports.)

8 7 Evaluation An array of disks has some advantages: potential for large data and I/O rates; lower cost per MB; lower cost per kilowatt. They do present one major disadvantage: reliability.

9 8 Reliability Analysis The reliability of a single disk system can be quite high: a mean time to failure (MTTF) on the order of 50,000 hours (6 years). But the MTTF of an array of N disks is: MTTF(N disks) = MTTF(1 disk) / N. For a 70-disk array this is 50,000 / 70 = 714 hours, or from 6 years down to 1 month.
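The arithmetic on this slide can be checked with a short sketch. It assumes, as the slide does, that the array has no redundancy, so the first disk failure counts as an array failure. The function name `array_mttf_hours` is illustrative and not from the lecture.

```python
def array_mttf_hours(disk_mttf_hours: float, num_disks: int) -> float:
    """MTTF of an array with no redundancy: any single disk failure is fatal."""
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return disk_mttf_hours / num_disks


single_disk = 50_000                      # hours, figure from the slide
array = array_mttf_hours(single_disk, 70)

print(f"1 disk:   {single_disk / (24 * 365):.1f} years")
print(f"70 disks: {array:.0f} hours, about {array / 24:.0f} days")
```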

10 9 Data Integrity A large array of disks is necessary for fast data transfer BUT – the reliability drops significantly Backups might help but... If a drive fails you lose all your data and backups take time and only recover data to the last backup

11 10 RAID Systems Redundant Array of Inexpensive Disks When a disk fails the contents can be reconstructed from the remaining disks Capacity penalty to store redundant data Bandwidth penalty to update the disks Redundancy is created in one or more of: mirroring (duplicate disks) Coding (error correction codes)

12 11 Levels of RAID There are several levels of RAID implementation: RAID 0 - Data is striped across several disks; RAID 1 - Mirrored disks; RAID 2 - Synchronized disks with Hamming code; RAID 3 - Bit interleaved with parity; RAID 4 - Block interleaved with parity; RAID 5 - Block interleaved with distributed parity

13 12 RAID 0 This level is designed to increase disk throughput and not reliability: no reliability techniques are used. Files are “striped” across multiple disks, which increases the data rate by parallel access. (Diagram: File A, four blocks A0, A1, A2, A3, one block on each of four disks.)
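A minimal sketch of how striping might map a file's logical blocks onto disks. It assumes simple round-robin placement with one block per stripe unit, which matches the four-block example on the slide; `stripe_location` is a hypothetical helper name.

```python
def stripe_location(block: int, num_disks: int) -> tuple[int, int]:
    """Return (disk index, block offset on that disk) for a logical block."""
    return block % num_disks, block // num_disks


# File A from the slide: blocks A0..A3 land on four different disks,
# so all four can be transferred in parallel.
for block in range(4):
    disk, offset = stripe_location(block, num_disks=4)
    print(f"A{block} -> disk {disk}, offset {offset}")
```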

14 13 RAID 1 This level is designed to provide a high degree of data protection. Each disk in the array has a duplicate (mirror) disk. 100% capacity overhead. Bandwidth sacrifice on a write: 1 logical write = 2 physical writes. (Diagram: blocks A0 and A1, each stored on a disk and on its mirror disk.)

15 14 RAID 2 This level uses bit interleaving with a Hamming code. Bits are distributed across disks with Hamming code parity bits added. (Diagram: data bits b0 to b6, one per disk, plus disks holding the Hamming code parity bits.) How many bit errors will this correct?
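As a hedged illustration of the slide's closing question, the sketch below uses the small Hamming(7,4) code, which corrects any single-bit error in a codeword. The slide's figure shows seven data bits, which would need a larger code; the four-bit version here is only a stand-in, and the function names are made up for this example.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4          # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4          # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected codeword, 1-based error position or 0 if none)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4    # syndrome names the bad position
    if position:
        c[position - 1] ^= 1           # flip the faulty bit back
    return c, position


good = hamming74_encode([1, 0, 1, 1])
bad = list(good)
bad[4] ^= 1                            # simulate one failed bit (position 5)
fixed, where = hamming74_correct(bad)
print(f"error at position {where}, repaired: {fixed == good}")
```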

16 15 RAID 3 This level uses bit interleaving with a single parity bit. If one drive fails, the parity bit combined with the valid data may be used to recover the lost data. (Diagram: data bits b0 to b5 spread across the data drives, with parity bit p on a dedicated parity drive.)
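The recovery step can be sketched with XOR parity: the parity is the XOR of all data, so XOR-ing the parity with the surviving data yields the missing piece. The drive contents below are invented values, and each integer stands for a run of bits from one drive.

```python
from functools import reduce
from operator import xor


def parity(values):
    """XOR parity over the contents of the data drives."""
    return reduce(xor, values, 0)


def rebuild(surviving, parity_value):
    """Recover the contents of the one failed drive."""
    return parity(surviving) ^ parity_value


data = [0b1011, 0b0110, 0b1110]        # hypothetical contents of three data drives
p = parity(data)                       # stored on the parity drive

failed = 1                             # pretend drive 1 is lost
surviving = data[:failed] + data[failed + 1:]
recovered = rebuild(surviving, p)
print(f"recovered {recovered:04b}, original {data[failed]:04b}")
```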

17 16 RAID 4 This level uses block interleaving with a single parity block. A write operation requires 2 physical writes: one to the data disk and one to the parity disk. (Diagram: data blocks B0 to B5 on the data drives, parity block Bp on the parity drive.)

18 17 RAID 5 This level uses block and parity interleaving: the parity blocks are mixed in with the data blocks. Result: there is no dedicated parity disk. (Diagram: four stripes across seven disks: B0 B1 B2 B3 B4 B5 Bp / B6 B7 B8 B9 B10 Bp B11 / B12 B13 B14 B15 Bp B16 B17 / B18 B19 B20 Bp B21 B22 B23.)
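One way to picture distributed parity is a layout in which the parity block moves one disk to the left on each successive stripe. That rotation is an assumption read off the slide's figure as best it can be recovered; real arrays use a variety of layouts, and `raid5_stripe` is a hypothetical name.

```python
def raid5_stripe(stripe: int, num_disks: int) -> list[str]:
    """Return the block labels stored on each disk for one stripe."""
    parity_disk = (num_disks - 1) - (stripe % num_disks)
    next_block = stripe * (num_disks - 1)   # data blocks per stripe = disks - 1
    row = []
    for disk in range(num_disks):
        if disk == parity_disk:
            row.append("Bp")
        else:
            row.append(f"B{next_block}")
            next_block += 1
    return row


# Seven disks, four stripes: parity lands on a different disk each time.
for stripe in range(4):
    print(" ".join(f"{label:>4}" for label in raid5_stripe(stripe, 7)))
```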

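19 18 RAID 5 Write A Level 5 logical write requires 2 physical reads and 2 physical writes in order to update the parity block: 1. Read the old data block and find the differences from the new block. 2. Read the old parity block and apply the differences to it. 3. Write the new data block. 4. Write the new parity block. (Diagram: new block nB0 replacing B0 in a stripe B0 B1 B2 B3 pB.)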
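The two reads and two writes can be sketched with XOR parity: the new parity is the old parity with the data differences applied, so the other data blocks in the stripe never need to be read. Block values are invented, and `small_write` is an illustrative name.

```python
from functools import reduce
from operator import xor


def small_write(data_blocks, parity_block, index, new_block):
    """Update one data block and its parity; return (new blocks, new parity)."""
    old_block = data_blocks[index]           # 1. read old data block
    old_parity = parity_block                # 2. read old parity block
    difference = old_block ^ new_block       # bits that changed
    new_parity = old_parity ^ difference     # flip the same bits in parity
    updated = list(data_blocks)
    updated[index] = new_block               # 3. write new data block
    return updated, new_parity               # 4. write new parity block


blocks = [0x0F, 0x33, 0x55, 0xAA]            # hypothetical B0..B3
parity_block = reduce(xor, blocks)

blocks, parity_block = small_write(blocks, parity_block, 0, 0xC3)
print("parity still consistent:", parity_block == reduce(xor, blocks))
```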

20 19 Communication Links Problem: A conventional computer consists of several components – How can they communicate with each other? CPU Main Memory USB Port I/O Controller

21 20 Point to Point Links Wires from every subsystem to every other subsystem Highest bandwidth High system cost Connector costs or pin costs Combinatorial explosion as number of subsystems grows Problems with designing for expandability Scales poorly Wires for every path Connector on each node for every connection

22 21 Crossbar Switch Crossbar switch permits connecting, for example, n CPUs to m memory banks for simultaneous accesses Cost is n*m switches Latency is a single switch delay Used for high-bandwidth with few resources Connecting a few processors to interleaved memory Vector Register File to Vector Data Path Scales poorly for large n or m P P P MMMM
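A rough sketch of why both schemes scale poorly. The crossbar cost of n*m switches comes from the slide; the count of n(n-1)/2 links for fully connected point-to-point wiring is the standard count of pairs and is added here as an assumption about what "wires from every subsystem to every other subsystem" implies.

```python
def point_to_point_links(subsystems: int) -> int:
    """Links needed to wire every subsystem directly to every other one."""
    return subsystems * (subsystems - 1) // 2


def crossbar_switches(cpus: int, memory_banks: int) -> int:
    """Switch count for a crossbar joining n CPUs to m memory banks."""
    return cpus * memory_banks


for n in (4, 8, 16, 32):
    print(f"{n:>2} subsystems: {point_to_point_links(n):>4} links | "
          f"{n}x{n} crossbar: {crossbar_switches(n, n):>5} switches")
```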

23 22 Bus Structures BUS: shared communication link between subsystems Advantages low cost versatile - easy to add additional devices Disadvantages bottleneck speed is affected by a variety of issues such as bus length, the nature of the devices,...

24 23 General Bus Organization Control lines: signal requests and acknowledgments; indicate what type of information is on the data lines. Data lines carry information between the source and the destination: data and addresses; complex commands. A bus transaction includes two parts: sending the address; receiving or sending the data.

25 24 Bus Limitations Length: long wires take longer to change. Width: more data wires means faster throughput, but higher cost. Number of devices: devices increase load on the bus, so it takes longer to drive; more devices means more competition for a shared communication link.

26 25 Important Questions How is the bus used? The bus protocol describes the sequence of signals that cause data to be transferred between devices. The protocol also controls the timing of the sequence: what happens when? How fast can we transfer data? Many things determine the effective bus bandwidth: width (number of data signals) clock speed protocol (can steps be overlapped)? How does a device get control of the bus? This is known as bus arbitration.

27 26 Bus Access One of the most important issues in bus design: How is the bus reserved by a device that wishes to use it? Chaos is avoided by a master-slave arrangement: Only the bus master can control access to the bus: It initiates and controls all bus requests A slave responds to read and write requests The simplest system: Processor is the only bus master All bus requests must be controlled by the processor Major drawback: the processor is involved in every transaction

28 27 Master/Slave A bus transaction includes two parts: sending the address; receiving or sending the data. The master is the one who starts the bus transaction by sending the address. The slave is the one who responds to the address by: sending data to the master if the master asks for data; receiving data from the master if the master wants to send data. (Diagram: bus master sends the address to the bus slave; data moves either way.)

29 28 Bus Protocols Each bus defines a set of rules for devices to communicate General Sequence arbitrate for bus mastership Master sends address to slave data transferred between master and slave Control of state sequencing can become complex timing may vary data lengths may vary

30 29 Multiple Bus Controllers Bus arbitration scheme: A bus master wanting to use the bus asserts the bus request A bus master cannot use the bus until its request is granted A bus master must signal to the arbiter the end of the bus utilization Bus arbitration schemes usually try to balance two factors: Bus priority: the highest priority device should be serviced first Fairness: Even the lowest priority device should never be completely locked out from the bus Bus arbitration schemes can be divided into four broad classes: Daisy chain arbitration Centralized, parallel arbitration Distributed arbitration by self-selection: each device wanting the bus places a code indicating its identity on the bus. Distributed arbitration by collision detection: Each device just “goes for it”. Problems found after the fact.

31 30 Daisy Chain Arbitration An arbitration unit (AU) grants the bus to the first requesting device along a chain. (Diagram: AU connected to Device 1, Device 2, and Device 3 by request, release, and grant lines.) Advantage: simple. Disadvantages: cannot assure fairness, since a low-priority device may be locked out indefinitely; the use of the daisy chain grant signal also limits the bus speed.
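A minimal sketch of the daisy chain rule and of the fairness problem it creates. Devices are listed in chain order starting from the arbitration unit; the function name and the request patterns are hypothetical.

```python
def daisy_chain_grant(requests):
    """Return the index of the first requesting device in the chain, or None."""
    for position, wants_bus in enumerate(requests):
        if wants_bus:
            return position        # grant stops here and is not passed on
    return None


# Device 0 requests on every round, so device 2 is never served
# even though it is also requesting: the lockout named on the slide.
grants = [daisy_chain_grant([True, False, True]) for _ in range(5)]
print("granted device per round:", grants)
```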

32 31 Centralized Parallel Arbitration Used in essentially all processor-memory busses and in high-speed I/O busses AU Device 1 Device 2 Device 3 Req grant Req grant

33 32 Synchronous Bus A clock is part of the control signal. All devices time their actions based on common clock. NOTE: This is NOT the same as the processor clock. Transactions take a fixed number of clock cycles. Control is simple: (1) Ask for something. (2) Wait for a fixed number of cycles. (3) Complete the transaction. Usually faster than asynchronous, so often used for processor-memory bus.

34 33 Synchronous Protocol Suppose the protocol for a read is as follows: 1. Processor raises “Read Request” signal; this signal stays high until the request is complete. -- No other device gets the bus while ReadReq is high. 2. At the same time, the processor places the address on the data lines; the address must stay on the bus for two clock cycles. 3. After the address, the memory takes four cycles to access data. 4. The memory places the data on the bus and keeps it there for two clock cycles. 5. The processor drops the Read Request signal.

35 34 Timing (Timing diagram with Clock, Addr, Data, and Read Req signals.) Processor asserts Read Req and places the address on the bus for 2 clock cycles. Wait four cycles. Memory puts data on the bus for 2 clock cycles. Processor drops Read Req.
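The read protocol can be laid out cycle by cycle as a sketch. The cycle counts are the ones given for this protocol; showing the release of Read Req as its own final cycle is an assumption made for readability, and the names are illustrative.

```python
ADDRESS_CYCLES = 2   # address held on the bus
MEMORY_CYCLES = 4    # memory access time
DATA_CYCLES = 2      # data held on the bus


def sync_read_schedule():
    """Return one (ReadReq level, bus contents) entry per clock cycle."""
    schedule = []
    schedule += [(1, "address")] * ADDRESS_CYCLES
    schedule += [(1, "idle")] * MEMORY_CYCLES
    schedule += [(1, "data")] * DATA_CYCLES
    schedule.append((0, "idle"))       # processor drops ReadReq
    return schedule


for cycle, (read_req, bus) in enumerate(sync_read_schedule(), start=1):
    print(f"cycle {cycle}: ReadReq={read_req} bus={bus}")
```

Because every step takes a fixed number of cycles, neither side needs to signal the other: both simply count clock edges.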

36 35 Asynchronous Bus No clock signal. Devices use “handshake” to coordinate. Move to next protocol step when both parties agree. Accommodates a wide variety of devices. Varying latency ok. Easier to make bus longer, because timing is not as precise.

37 36 Asynchronous Example I/O device wants to read from memory: 1. Device places address on bus and raises ReadReq; memory raises Ack to indicate that it has seen the request and that it has read the address. 2. Device sees Ack, drops ReadReq and address. 3. Memory drops Ack when it sees ReadReq dropped. 4. When data ready, memory places on bus and raises DataRdy signal. 5. Device sees DataRdy, reads data from the bus, and raises Ack. 6. Memory sees Ack, drops data and DataRdy. 7. Device drops Ack.

38 37 Timing Device wants to read from memory; it places the address on the bus and raises ReadReq. Memory sees ReadReq, raises Ack when it’s ready. Device sees Ack, drops ReadReq and the address. Memory sees ReadReq dropped, drops Ack. When memory has data, it puts data on the bus and raises DataRdy. Device sees DataRdy; it reads data from bus and raises Ack. Memory sees Ack, drops data and DataRdy. Device drops Ack when DataRdy is dropped.
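The handshake can be traced in software as a sketch. Each signal is a dictionary entry and every action is taken only after the previous one is visible on the bus; there is no clock. The `lines` entry stands for the shared address/data lines, and the memory contents are invented.

```python
def async_read(memory, address):
    """Walk through the handshake; return (data read, trace of bus states)."""
    bus = {"ReadReq": 0, "Ack": 0, "DataRdy": 0, "lines": None}
    trace = []

    def record(actor, action):
        trace.append((actor, action, dict(bus)))

    bus["lines"], bus["ReadReq"] = address, 1
    record("device", "places address, raises ReadReq")
    latched_address = bus["lines"]
    bus["Ack"] = 1
    record("memory", "reads address, raises Ack")
    bus["ReadReq"], bus["lines"] = 0, None
    record("device", "drops ReadReq and address")
    bus["Ack"] = 0
    record("memory", "drops Ack")
    bus["lines"], bus["DataRdy"] = memory[latched_address], 1
    record("memory", "places data, raises DataRdy")
    data = bus["lines"]
    bus["Ack"] = 1
    record("device", "reads data, raises Ack")
    bus["lines"], bus["DataRdy"] = None, 0
    record("memory", "drops data and DataRdy")
    bus["Ack"] = 0
    record("device", "drops Ack")
    return data, trace


data, trace = async_read({0x10: 0xCAFE}, 0x10)
for actor, action, _ in trace:
    print(f"{actor:>6}: {action}")
```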

39 38 Synchronous Performance Find max. bandwidth with the following assumptions: Clock period is 50 ns. Each bus transmission takes 1 clock cycle. Memory latency is 200 ns. Data transfer is 4 bytes. Time: (1) Send address to memory: 50 ns. (2) Memory latency: 200 ns. (3) Send data to reader: 50 ns. TOTAL = 300 ns. Bandwidth: 4 B / 300 ns = 13.3 MB/sec
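The same calculation as a short sketch, taking 1 MB as 10^6 bytes. Variable and function names are illustrative.

```python
def bandwidth_mb_per_s(bytes_transferred: float, total_ns: float) -> float:
    """Convert bytes per nanosecond to MB/sec (1 B/ns = 1000 MB/sec)."""
    return bytes_transferred / total_ns * 1000


clock_ns = 50
memory_latency_ns = 200

# address cycle + memory access + data cycle
sync_total_ns = clock_ns + memory_latency_ns + clock_ns
sync_bandwidth = bandwidth_mb_per_s(4, sync_total_ns)
print(f"{sync_total_ns} ns per transfer, {sync_bandwidth:.1f} MB/sec")
```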

40 39 Asynchronous Performance Find max. bandwidth with the following assumptions: Each handshake takes 40 ns. Memory latency is 200 ns. Data transfer is 4 bytes. Time: (1) First handshake: 40 ns. (2) Memory latency, overlapped with steps 2 and 3: 200 ns. (3) Steps 5, 6, 7: 120 ns. TOTAL = 360 ns. Bandwidth: 4 B / 360 ns = 11.1 MB/sec
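A sketch of the asynchronous total. It reads "overlapped with steps 2 and 3" as meaning the memory access runs while those two handshake steps complete, so that phase costs whichever is longer; that reading is an assumption, and the names are illustrative.

```python
handshake_ns = 40
memory_latency_ns = 200
bytes_per_transfer = 4

first_handshake = handshake_ns                            # step 1
overlapped = max(2 * handshake_ns, memory_latency_ns)     # steps 2-3 hidden under memory access
final_handshakes = 3 * handshake_ns                       # steps 5, 6, 7

async_total_ns = first_handshake + overlapped + final_handshakes
async_bandwidth = bytes_per_transfer / async_total_ns * 1000   # B/ns to MB/sec
print(f"{async_total_ns} ns per transfer, {async_bandwidth:.1f} MB/sec")
```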

41 40 Standard PC Busses There are two standard buses used in desktop computers. Peripheral Component Interconnect (PCI): parallel bus used for high-bandwidth, block-oriented peripherals. Universal Serial Bus (USB): serial bus used for low-bandwidth, cost-sensitive peripherals.

42 41 Summary Disk Arrays RAID Systems Bus Structures It wasn’t so bad after all

