We think you have liked this presentation. If you wish to download it, please recommend it to your friends in any social system. Share buttons are a little bit lower. Thank you!
Presentation is loading. Please wait.
Published byDestiney Kingston
Modified over 2 years ago
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 CS252 Graduate Computer Architecture Spring 2014 Lecture 17: I/O Krste Asanovic
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 Last Time in Lecture 16 Virtual Machines User-Level -ABI = ISA + I/O -Emulating ISAs in software System-Level (Virtual Machine Monitors) -Requirements for virtualizable ISA -Impact on virtual memory (page tables) and I/O 2
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 (I/O) Input/Output Computers useless without I/O -Over time, literally thousands of forms of computer I/O: punch cards to brain interfaces Broad categories: Secondary/Tertiary storage (flash/disk/tape) Network (Ethernet, WiFi, Bluetooth, LTE) Human-machine interfaces (keyboard, mouse, touchscreen, graphics, audio, video, neural,…) Printers (line, laser, inkjet, photo, 3D, …) Sensors (process control, GPS, heartrate, …) Actuators (valves, robots, car brakes, …) Mix of I/O devices is highly application-dependent 3
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 Interfacing to I/O Devices Two general strategies Memory-mapped -I/O devices appear as memory locations to processor -Reads and writes to I/O device locations configure I/O and transfer data (using either programmed I/O or DMA) I/O channels -Architecture specifies commands to execute I/O commands over defined channels -I/O channel structure can be layered over memory- mapped device structure In addition to data transfer, have to define synchronization method -Polling: CPU checks status bits -Interrupts: Device interrupts CPU on event 4
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 Memory-Mapped I/O Programmed I/O uses CPU to control I/O device using load and store instructions, with address specifying device register to access -Load and store can have side effect on device Usually, only privileged code can access I/O devices directly, to provide secure multiprogramming -System calls sometimes provided for application to open and reserve a device for exclusive access Processors provide uncached loads and stores to prevent caching of device registers -Usually indicated by bits in page table entries or by reserving portions of physical address space 5
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 Simple I/O Bus Structure Some range of physical memory addresses map to I/O bus devices I/O bus bridge reduces loading on critical CPU-DRAM bus Devices can be slaves, only responding to I/O bus requests Devices can be masters, initiating I/O bus transfers 6 CPU Caches DRAM I/O Bus Bridge Memory Bus I/O Bus I/O Device #1 I/O Device #2 I/O Device #3
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 DMA (Direct Memory Access) DMA engines offload CPU by autonomously transferring data between I/O device and main memory. Interrupt/poll for done -DMA programmed through memory-mapped registers -Some systems use dedicated processors inside DMA engines Often, many separate DMA engines in modern systems -Centralized in I/O bridge (usually supporting multiple concurrent channels to different devices), works on slave-only I/O busses -Directly attached to each peripheral (if I/O bus supports mastering) 7 CPU Caches DRAM I/O Bus Bridge Memory Bus I/O Bus I/O Device #1 I/O Device #2 I/O Device #3 DMA
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 More Complex Bus Structures 8 CPU Caches DRAM I/O Bus Bridge Memory Bus Fast I/O Bus I/O Device #1 I/O Device #3 Slow I/O Bus Bridge DMA I/O Device #2 Match speed of I/O connection to device demands -Special direct connection for graphics -Fast I/O bus for disk drives, ethernet -Slow I/O bus for keyboard, mouse, touchscreen -Reduces load on fast I/O bus + less bus logic needed on device Slow I/O Bus DMA I/O Device #4 Graphics DMA
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 Move from Parallel to Serial I/O Off-chip 9 CPU I/O IF I/O 1I/O 2 Central Bus Arbiter Shared Parallel Bus Wires CPU I/O IF I/O 1 I/O 2 Dedicated Point-to-point Serial Links Parallel bus clock rate limited by clock skew across long bus (~100MHz) High power to drive large number of loaded bus lines Central bus arbiter adds latency to each transaction, sharing limits throughput Expensive parallel connectors and backplanes/cables (all devices pay costs) Examples: VMEbus, Sbus, ISA bus, PCI, SCSI, IDE Point-to-point links run at multi-gigabit speed using advanced clock/signal encoding (requires lots of circuitry at each end) Lower power since only one well-behaved load Multiple simultaneous transfers Cheap cables and connectors (trade greater endpoint transistor cost for lower physical wiring cost), customize bandwidth per device using multiple links in parallel Examples: Ethernet, Infiniband, PCI Express, SATA, USB, Firewire, etc.
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 Move from Bus to Crossbar On-Chip Busses evolved in era where wires were expensive and had to be shared Bus tristate drivers problematic in standard cell flows, so replace with combinational muxes Crossbar exploits density of on-chip wiring, allows multiple simultaneous transactions 10 ABC Tristated Bus AA BB CC Crossbar
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 I/O and Memory Mapping I/O busses can be coherent or not -Non-coherent simpler, but might require flushing caches or only non-cacheable accesses (much slower on modern processors) -Some I/O systems can cache coherently also (SGI Origin) I/O can use virtual addresses -Simplifies DMA into user address space, otherwise contiguous user segment needs scatter/gather by DMA engine -Provides protection from bad device drivers -Adds complexity to I/O device 11
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 Interrupts versus Polling Two ways to detect I/O device status: Interrupts +No CPU overhead until event Large context-switch overhead on each event (trap flushes pipeline, disturbs current working set in cache/TLB) Can happen at awkward time Polling -CPU overhead on every poll -Difficult to insert in all code +Can control when handler occurs, reduce working set hit Hybrid approach: -Interrupt on first event, keep polling in kernel until sure no more events, then back to interrupts 12
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 Example ARM SoC Structure 13 [©ARM]
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 ARM Sample Smartphone Diagram 14 [©ARM]
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 Intel Ivy Bridge Server Chip I/O 15 [©Intel]
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 Intel Romley Server Platform 16 [©Intel]
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 17 Acknowledgements This course is partly inspired by previous MIT and Berkeley CS252 computer architecture courses created by my collaborators and colleagues: -Arvind (MIT) -Joel Emer (Intel/MIT) -James Hoe (CMU) -John Kubiatowicz (UCB) -David Patterson (UCB) 17
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 16 CS252 Graduate Computer Architecture Spring 2014 Lecture 16: Virtual Machines Krste Asanovic
PCI Bus CENG Spring Dr. Yuriy ALYEKSYEYENKOV 2 The PCI (Peripheral Component Interconnect) bus was developed as a low-cost, processor-independent.
CPEG3231 CPEG 323 Computer Architecture I/O Systems.
Computer Buses For many of you who have depopulated his/her breadboards, THANKS! There are still some to be depopulated. Please dont forget to do it this.
Cs 152 buses.1 ©DAP & SIK 1995 CpE 242 Computer Architecture and Engineering Busses and OSs Responsibilities.
Computer Systems Lecturer: Szabolcs Mikulas URL: Textbook: W. Stallings,
Lecture 21Comp. Arch. Fall 2006 Chapter 8: I/O Systems (continued) Adapted from Mary Jane Irwin at Penn State University for Computer Organization and.
Chapter 7 Input/Output Computer Organization and Architecture William Stallings 8th Edition.
Motherboards Basic PC Maintenance, Upgrade and Repair Mods 1 & 2.
Computer Buses Ref: Burd, Chp – 220 Englander, Chp 7 p Chp 8, p ,
Chapter 6 Storage and Other I/O Topics. Chapter 6 Storage and Other I/O Topics 2 Introduction I/O devices can be characterized by Behaviour: input, output,
Chapter 7 Input/Output HW: 7:13 & 7:18 Due Wed, 11/8/06.
Buses – Page 1CSCI 4717 – Computer Architecture CSCI 4717/5717 Computer Architecture Topic: Buses Reading: Stallings, Sections 3.4, 3.5, and 7.7.
1. OBJECTIVES: Defining the different types of buses Discussing bus arbitration and handshaking schemes Introducing I2C and PCI bus examples Interconnection.
JR.S00 1 Lecture 15: Busses and Networking (1) Prof. Jan Rabaey Computer Science 252, Spring 2000 Based on slides from Dave Patterson, John Kubiatowicz.
SCSC 311 Information Systems: hardware and software.
Virtualization Technique Virtualization Technique System Virtualization I/O Virtualization.
What is an Operating System? A program that acts as an intermediary between a user of a computer and the computer hardware. Operating system goals: Execute.
William Stallings Computer Organization and Architecture 8 th Edition Chapter 3 Top Level View of Computer Function and Interconnection.
IT253: Computer Organization Lecture 13: Input/Output Tonga Institute of Higher Education.
Hardware Fundamentals Week 4 - Lesson 1. Learning Outcomes Define the term bus Explain the different bus characteristics Calculate bus throughput in bps.
Computer Architecture and Organization Computer Components and System Buses.
Computer Architecture & Operating Systems Workshop 2 - Lecture 1 Plagiarism & Citing Sources / Program Execution, Buses & I/O Devices.
Copyright © 2006 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill Technology Education Chapter 4A Transforming Data Into Information.
Computer Graphics Prof. Muhammad Saeed. 2 Hardware ( Graphic Cards ) II Hardware II Computer Graphics 1 August 2012.
1 Operating-System Structures Operating System & its purpose Operating Systems lead to new Hardware Features System Components System Services System Calls.
1 Homework Reading –Tokheim, Section 13-5 (Three State Buffers only) Machine Projects –Continue on MP5 Labs –Continue labs with your assigned section.
Chapter 6 System Integration and Performance. 2 Chapter Goals Implementation of the system bus and bus protocol. Interaction of the CPU with peripheral.
System Integration and Performance. System Bus Connects the CPU with main memory and other system components. Connects the CPU with main memory and other.
Compiled by : S. Agarwal, Lecturer & Systems Incharge St. Xaviers Computer Centre, St. Xaviers College Kolkata. March-2003.
© 2016 SlidePlayer.com Inc. All rights reserved.