Lecturer: Simon Winberg Lecture 19 Configuration architectures … & other FPGA-based RC Building Blocks (Planned as double lecture) Attribution-ShareAlike.

Slides:



Advertisements
Similar presentations
Chapter 5 Internal Memory
Advertisements

Computer Organization and Architecture
+ CS 325: CS Hardware and Software Organization and Architecture Internal Memory.
These slides incorporate figures from Digital Design Principles and Practices, third edition, by John F. Wakerly, Copyright 2000, and are used by permission.
Lecturer: Simon Winberg Lecture 19 Configuration architectures … & other FPGA-based RC Building Blocks.
Chapter 10 Flip-Flops and Registers Copyright ©2006 by Pearson Education, Inc. Upper Saddle River, New Jersey All rights reserved. William Kleitz.
What is memory? Memory is used to store information within a computer, either programs or data. Programs and data cannot be used directly from a disk or.
Chapter 9 Memory Basics Henry Hexmoor1. 2 Memory Definitions  Memory ─ A collection of storage cells together with the necessary circuits to transfer.
1 Lecture 16B Memories. 2 Memories in General Computers have mostly RAM ROM (or equivalent) needed to boot ROM is in same class as Programmable Logic.
Overview Memory definitions Random Access Memory (RAM)
Registers –Flip-flops are available in a variety of configurations. A simple one with two independent D flip-flops with clear and preset signals is illustrated.
Overview Logic Combinational Logic Sequential Logic Storage Devices SR Flip-Flops D Flip Flops JK Flip Flops Registers Addressing Computer Memory.
Registers  Flip-flops are available in a variety of configurations. A simple one with two independent D flip-flops with clear and preset signals is illustrated.
Lecture 4: Computer Memory
IT Systems Memory EN230-1 Justin Champion C208 –
Memory Key component of a computer system is its memory system to store programs and data. ITCS 3181 Logic and Computer Systems 2014 B. Wilkinson Slides12.ppt.
1 Lecture 16B Memories. 2 Memories in General RAM - the predominant memory ROM (or equivalent) needed to boot ROM is in same class as Programmable Logic.
Sequential circuits part 2: implementation, analysis & design All illustrations  , Jones & Bartlett Publishers LLC, (
CompE 460 Real-Time and Embedded Systems Lecture 5 – Memory Technologies.
Charles Kime & Thomas Kaminski © 2008 Pearson Education, Inc. (Hyperlinks are active in View Show mode) Chapter 8 – Memory Basics Logic and Computer Design.
EKT 221 Digital Electronics II
Lecture on Electronic Memories. What Is Electronic Memory? Electronic device that stores digital information Types –Volatile v. non-volatile –Static v.
C.S. Choy95 COMPUTER ORGANIZATION Logic Design Skill to design digital components JAVA Language Skill to program a computer Computer Organization Skill.
Faculty of Information Technology Department of Computer Science Computer Organization and Assembly Language Chapter 5 Internal Memory.
Higher Computing Computer Systems S. McCrossan 1 Higher Grade Computing Studies 2. Computer Structure Computer Structure The traditional diagram of a computer...
Lecture#14. Last Lecture Summary Memory Address, size What memory stores OS, Application programs, Data, Instructions Types of Memory Non Volatile and.
Logic and Computer Design Dr. Sanjay P. Ahuja, Ph.D. FIS Distinguished Professor of CIS ( ) School of Computing, UNF.
Chapter 5 Internal Memory. Semiconductor Memory Types.
Memory and Storage Dr. Rebhi S. Baraka
Charles Kime & Thomas Kaminski © 2004 Pearson Education, Inc. Terms of Use (Hyperlinks are active in View Show mode) Terms of Use ECE/CS 352: Digital Systems.
Memory System Unit-IV 4/24/2017 Unit-4 : Memory System.
CPEN Digital System Design
Digital Logic Design Instructor: Kasım Sinan YILDIRIM
SEQUENTIAL CIRCUITS Component Design and Use. Register with Parallel Load  Register: Group of Flip-Flops  Ex: D Flip-Flops  Holds a Word of Data 
Basic Sequential Components CT101 – Computing Systems Organization.
CSI-2111 Computer Architecture Ipage Control, memory and I/O v Objectives: –To define and understand the control units and the generation of sequences.
+ CS 325: CS Hardware and Software Organization and Architecture Memory Organization.
SKILL AREA: 1.2 MAIN ELEMENTS OF A PERSONAL COMPUTER.
Chapter 4: MEMORY Internal Memory.
Computer Architecture Lecture 24 Fasih ur Rehman.
Semiconductor Memory Types
Computer operation is of how the different parts of a computer system work together to perform a task.
CO5023 Latches, Flip-Flops and Decoders. Sequential Circuit What does this do? The OUTPUT of a sequential circuit is determined by the current output.
Charles Kime & Thomas Kaminski © 2008 Pearson Education, Inc. (Hyperlinks are active in View Show mode) Chapter 8 – Memory Basics Logic and Computer Design.
1 KU College of Engineering Elec 204: Digital Systems Design Lecture 22 Memory Definitions Memory ─ A collection of storage cells together with the necessary.
07/11/2005 Register File Design and Memory Design Presentation E CSE : Introduction to Computer Architecture Slides by Gojko Babić.
Computer Architecture Chapter (5): Internal Memory
RAM RAM - random access memory RAM (pronounced ramm) random access memory, a type of computer memory that can be accessed randomly;
Sequential Logic Design
EEE4084F Digital Systems Lecture 21
Chapter 5 Internal Memory
William Stallings Computer Organization and Architecture 7th Edition
EEE4084F Digital Systems NOT IN 2017 EXAM Lecture 25
Memory Units Memories store data in units from one to eight bits. The most common unit is the byte, which by definition is 8 bits. Computer memories are.
William Stallings Computer Organization and Architecture 7th Edition
Chapter 11 Sequential Circuits.
William Stallings Computer Organization and Architecture 8th Edition
EEE4084F Digital Systems NOT IN 2018 EXAM Lecture 24X
William Stallings Computer Organization and Architecture 7th Edition
William Stallings Computer Organization and Architecture 8th Edition
BIC 10503: COMPUTER ARCHITECTURE
part 2: implementation, analysis & design
William Stallings Computer Organization and Architecture 8th Edition
Presentation transcript:

Lecturer: Simon Winberg Lecture 19 Configuration architectures … & other FPGA-based RC Building Blocks (Planned as double lecture) Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

 Configuration architectures  Short video on NIOS II  RC Building blocks  Memories  DMA  Digital Signals  Signal Latching

Configuration Architectures RC Architecture

 Configuration architecture =  Underlying circuitry that loads configuration data and keeps it at the correct locations CPU Finite State Machine ROM Configuration controller FPGA Configuration data Configuration control Configuration requests Adapted from Hauck and Dehon Ch4 (2008) Could store pre-configured bitmaps in memory on the platform without having to send it each time from the CPU. Include hardware for programming the hardware (instead of the slower process of e.g., programming devices via JTAG from the host)

 Larger systems (e.g., the VCC) may have many FPGAs to be programmed)  Models:  Sequentially programming FPGAs by shifting in data  Multi-context – having a MUX choose which FPGA to program Configuration bit … FPGA Configuration clock Configuration enable IN OUT IN OUT IN OUT Adapted from Hauck and Dehon Ch4 (2008)

 Partially reconfigurable systems  Not all configurations may need entire chip  Could leave parts of chips unallocated  Partial configuration decreases configuration time  Modifying part of a previously configured system  E.g., a placement and routing configuration based on a currently configured state Initial ConfigurationUpdated Configuration

 Block configurable architecture  Not the same as “logical blocks” in an FPGA  Relocating configurations to different blocks at run time also referred to as “swappable logic units” (SLUs)  Example: SCORE* relocatable architecture in which configurable blocks are handled in the same way as a virtual memory system * Capsi & DeHon and Wawrzynek. “A streaming multithreaded model” In Third workshop on media and stream processors. 2001

 Reading  Hauck, Scott (1998). “The Roles of FPGAs in Reprogrammable Systems” In Proceedings of the IEEE. 86(4) pp

Short Video… Case study of PIC32 DMA Controller

 Volatile  Non-volatile

 DRAM  Capacitor stores “memory” that leaks away and needs to be periodically refreshed  High memory capacity  SDRAM = Synchronous DRAM  Runs in synch with system* clock  DDR SDRAM = Double-data rate SDRAM, runs at 2x the system clock * Note the system clock in this case is closer to the “motherboard” clock. Usually considerably slower than the processor clock (standard DRAM may have its own even slower clock and synchronization hassles)

 SRAM  Static RAM  Does not need refreshing  Uses “bistable latching circuitry” (i.e. a flip flop) to store each bit  Can be very fast compared to DRAM  A small amount of SRAM (~16 Kb) is typically used within a microcontroller / FPGA to hold things such as a boot loader and interrupt vectors, and as CACHE SR Latch to hold a bit of SRAM * SR Latch implemented using two NOR gates * * Images from

 BRAM or Block RAM  This refers to a small block of RAM (a few Kilobytes) integrated within the FPGA (connected some LBs)  Generally only found in higher-end FPGAs (e.g. 16Kb takes ~ 256K transistors if not more for connection and addressing logic)  Block SRAM is more common and easier to use; the FPGA may include Block DRAM  Generally can be set to RAM or ROM  As ROM it can be used as a (big) LUT  Usually not directly accessible form outside the FPGA (need to provide circuitry / softcore and comms protocol to access it from a PC)

 Under development  Z-RAM : Zero-capacitor RAM  Single transistor  Higher density than DRAM  Although it is called zero-capacitor, the capacitor is actually there in the form of a “floating body effect” caused by the transistor substrate  See:

 Trusty old ROM and EEPROM  Still widely used as it is highly robust  Current versions store large amounts of data  Fairly simple technology (i.e. fused connections) and (in EEPROM ability to fuse and then program/un-fuse connections)  Usually ROM is slower than RAM  Shadowing ROM (i.e. copy to RAM) to make it faster – especially for EEPROMs  EEPROM very slow write; faster read

 Flash memory  Can be electrically erased and programmed  High capacity (e.g., millions of bytes/chip)  Needs to be programmed one block at a time (~8Kb / block)  Erased (all bits in block set to 1)  Programmed one block at a time  Memory wear  Limited to about 100,000 erase – write cycles  Usually a file system (e.g. ext3) will keep track of bad sectors (i.e., mark deteriorated blocks). But this deterioration might happen a certain time after the erase and write is complete and verified.

NAND Flash memory model The above diagram provides a macro circuit model for a single flash memory cell, showing a Effective-Control-Gate (ECG) equivalent circuit and the Ideal- Current-Mirror (ICM) used to calculate the floating gate (FG*) voltage. MOSFET1 is the equivalent N-MOSFET model of a flash memory cell, and MOSFET2 is the model of a N-MOSFET test structure that is identical with the flash memory cell (excluding the short between FG and CG). * Image source: IEEE Electron Device Letters, Vol. 26, No. 8, AUGUST 2005, pg 564 Available at: 203/1570/1/ pdf 203/1570/1/ pdf

RC Building Blocks: Digital Signals and Data Transfers Reconfigurable Computing

 Although our objective is towards parallel operations, there are still sequential issues involved, for example a device B waiting for a device A to provide input  Furthermore the input to a device A might disappear (become invalid) before device A has completed its computations. In Out Device ADevice B

 There are other issues involved such as:  How does device A know when new data has arrived?  How does device B know when device A has completed?  What if both devices need to be clocked, but aren’t active all the time?  What if you want to share address and data lines? In Out Device ADevice B handshaking lines

OUTPUTS  A sequential logic system typically involves two parts:  Storage (aka “bistable” device)  Combinational logic (OR, AND, etc gates) INPUTS Combinational Logic Device Storage Data Another combinational logic device(s) Another combinational logic device(s) potentially shared data busses, possibly 2 separate busses for full-duplex, one for read one for write control lines (e.g., do you want to read or write, are you done setting all the bits, etc.)

 Usually need the following  Address bus  Data bus  Control lines  Chip / Device select lines  Write enable lines  Read enable lines

RC Building Blocks: DMA – Efficient Data Transfer Reconfigurable Computing

 Originally direct memory access (DMA) referred to a feature provided on a computer systems whereby peripherals within the computer can access the system memory for reading and/or writing independently of the central processing unit.  This is still an appropriate definition; except rather consider DMA as a more general description, whereby separate hardware can both access memory directly (without the CPU doing any work), and can request the memory subsystem (really the DMAC) to perform memory copies or transfers.

Typical computer design without DMA address Address Decoder Memory (Device 0) address CS0* CS* UART (Device 1) CS1* CS2* CS* RD* … CPU WR* data Signals: Address : address line (e.g. 32 bits) Data : data line (e.g., 32 bits) IRQ : Interrupt ReQuest line CS* : Chip select (active low) RD* : Read enable (active low) WR*: Write enable (active low) IRQ In this approach, each peripheral signals the CPU and tells it to receive data and r/w memory

DMA : Direct Memory Access address Memory CPU IRQ DMA Direct memory access is system in which memory is accessed without using the CPU. A certain stimulus (e.g. a device needing data sent/received) can have this data sent/received directly from/to a block of memory location via the DMA controller (DMAC). Peripherals such as ADCs, GPUs and Ethernet, which require frequent movements of memory, typically support DMA. DMA controllers can be configured to handle moving collected data from peripherals into specific memory locations (e.g., arrays directly accessible from a C program). Additional control logic is required to manage the sharing of the address and data bus. IRQ Further reading: data Device (e.g., Graphics Card) DMA Controller

 Standard block transfer  DMAC does sequence of memory transfers  Load operation from source address, store operation to destination  Initiated under software control (e.g., copying data from one memory area to another) i.e., array X = array Y  Demand-mode transfer  Same as block transfer, but controlled by external device. I/O device requests and synchronizes the operation Ref: Catsoulis, J. (2003). Designing Embedded Hardware. O’Reilly.

 Fly-by transfers  High speed operation  Memory and I/O on different bus  E.g., I/O given read request at same time that memory is given write request  Can simultaneously read/write I/O device and write/read memory  Data-chaining transfers  Linked list in memory  DMAC given pointer to descriptor  Descriptor indicates: size, src address, dest address, next descriptor

DMAC Modes of Operation Adapted from source: Byte Mode Burst Mode DMA Controller can support a range of modes. The three modes shown left are commonly supported. Block Transfer Mode 123w123w123w… w 1 2 w 3 12w2w2w3…12w2w2www… 1 2 w Sequence of states CPU deactivates

RC Building Blocks: Latching (capturing Signals) Reconfigurable Computing

 In order to capture the signals, you need some storage  Two basic types of storage: Latches Flip-flops

 Latches Q = D  Changes state when the input states change (referred to as “transparency”)  Can include an enable input bit – in which case the output (Q) is set to D only when the enable input is set.  Flip-flop Q = D (Q changes when clocked)  A flip-flop only change state when the clock is pulsed.

 Latches are used more in asynchronous designs  Flip-flips are used in synchronous designs  A “synchronous design” is a system that contains a clock You can of course mix synchronous and asynchronous, and this is particularly applicable to parallel systems in which different parts of the system may run at different speeds (e.g., the main processor working at 1GHz and specialized hardware possibly operating asynchronously as fast as their composite pipelined operations are able to complete)

SR Latch ABXY XX A basic latch has two stable states: State 1 Q = 1 not Q = 0 State 2 Q = 0 not Q = 1 And an unstable state in which both S and R are set (which can cause the Q and not Q lines to toggle) S-R Latch (set / reset latch)Symbol

Gated SR Latch: a latch with enable Combinational logic circuit with a clock (or enable) input connected Usually the type used in digital systems. It of course costs more in transistors!! Example signals or “gate” input Only changed on clock pulse Gated SR-Latch Symbol

Flip-flop The standard JK flip-flop is much the same as a gated SR latch, modified so that Q toggles when J = K = 1 The D-type flip flop (which you may want to use in Prac3 to store data) is a JK flop flop modified (see left) to hold the state of input D at each clock pulse. JK flip-flop D flip-flop clockDQ 00X ………

T-type Flip-flop The T-type flip-flops toggle the input. Q = not Q each time T is set to 1 when the clock pulses The D-type flip flop (which you may want to use in Prac3 to store data) is a JK flop flop modified (see left) to hold the state of input D at each clock pulse. T flip-flop D flip-flop ClockTQ ………

Preset and Clocking Preset line (PR) and clear line (CL) are asynchronous inputs used to set (to 1) or clear the value stored by the flip-flop.

Edge triggered devices A note on notation: Edge-triggered inputs are shown using a triangle. Negative edges triggered inputs are shown without a circle on the incoming line. in Negative edge triggered in Positive edge triggered

End of Lecture Any Question??

Image sources: Latch image – Wikipedia open commons (public domain) Video frame showing representation of suggested video clip – YouTube Various screenshots from Altera Quartus II Disclaimers and copyright/licensing details I have tried to follow the correct practices concerning copyright and licensing of material, particularly image sources that have been used in this presentation. I have put much effort into trying to make this material open access so that it can be of benefit to others in their teaching and learning practice. Any mistakes or omissions with regards to these issues I will correct when notified. To the best of my understanding the material in these slides can be shared according to the Creative Commons “Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)” license, and that is why I selected that license to apply to this presentation (it’s not because I particulate want my slides referenced but more to acknowledge the sources and generosity of others who have provided free material such as the images I have used).