PLD Introduction to PLD. Presented by:.

PLD Introduction to PLD. Presented by:

Contents: Idea; History;
High-Capacity PLD’s Architecture & Overview of ALTERA PLD; Computer aided design (CAD) flow for PLD Introduction to VHDL/VerilogHDL; Getting started. 9/17/2018

9/17/2018 Contents: Idea; History; High-Capacity FPGA’s Architecture & Overview of ALTERA FPGA; Computer aided design (CAD) flow for PLD Introduction to VHDL/VerilogHDL; Getting started. 9/17/2018

Today a chips are distributed into three groups:
Idea – Definition. Today a chips are distributed into three groups: ASIC’s (Application Specific Integrated Circuits) Chips with hardware realization of data processing algorithms (microprocessors & microcontrollers) FPGA & CPLD 9/17/2018

Idea – Definition. A programmable logic device can be defined as …
an integrated circuit containing configurable logic and/or storage elements, which are linked together using programmable interconnect In general the following resources are distinguished: Logic; Interconnect; I/O. 9/17/2018

Idea – Details. One or more resources are configurable;
Reprogrammable or in system programmable (ISP); Reduces design and debug cycle Allows field upgrades of existing logic systems A small amount of fault tolerance Raises the possibility of reconfigurable computing – but there are still many problems to be solved before this is realized Nowadays configuration of all resources; Special structures added to the devices, like RAMs, DLLs, … 9/17/2018

Idea - Advantages of PLDs.
The programmability offers: Short development time; Short turnaround time; Rapid prototyping; Flexibility with respect to engineering change orders; Save board space; Small NRE (non recurring engineering ) costs. 9/17/2018

High-Capacity FPGA’s Architecture & Overview of ALTERA FPGA; Computer aided design (CAD) flow for PLD Introduction to VHDL/VerilogHDL; Getting started. 9/17/2018

9/17/2018 History – PLA. The first device developed specifically for implementing logic circuits was PLA. A B C AND plane Programmable switch or fuse OR plane A PLA is structured so that any of its inputs (or their complements) can be AND’ed together in the AND-plane; each AND-plane output can thus correspond to any product term of the inputs. Similarly, each Orplane output can be configured to produce the logical sum of any of the AND-plane outputs. With this structure, PLAs are well-suited for implementing logic functions in sum-of-products form. They are also quite versatile, since both the AND terms and OR terms can have many inputs (this feature is often referred to as wide AND and OR gates). When PLAs were introduced in the early 1970s, by Philips, their main drawbacks were that they were expensive to manufacture and offered somewhat poor speed-performance. Both disadvantages were due to the two levels of configurable logic, because programmable logic planes were difficult to manufacture and introduced significant propagation delays. To overcome these weaknesses, Programmable Array Logic (PAL) devices were developed. A PLA consists of two logic gates levels: programmable “wired” OR-plane; programmable “wired” AND-plane. 9/17/2018

Programmable switch or fuse
9/17/2018 History – PAL. Programmable switch or fuse A B C AND plane To compensate for lack of generality incurred because the OR-plane is fixed, several variants of PALs are produced, with different numbers of inputs and outputs, and various sizes of OR-gates. PALs usually contain flip-flops connected to the OR-gate outputs so that sequential circuits can be realized. PAL devices are important because when introduced they had a profound effect on digital hardware design, and also they are the basis for some of the newer, more sophisticated architectures that will be described shortly. Variants of the basic PAL architecture are featured in several other products known by different acronyms. All small PLDs, including PLAs, PALs, and PAL-like devices are grouped into a single category called Simple PLDs (SPLDs), whose most important characteristics are low cost and very high pin-to-pin speed-performance. PALs feature only a single level of programmability, consisting of a programmable “wired” AND-plane that feeds fixed OR-gates. 9/17/2018

9/17/2018 History – SPLD. A B C Flip-flop Select Enable D Q Clock AND plane MUX As technology has advanced, it has become possible to produce devices with higher capacity than SPLDs. The difficulty with increasing capacity of a strict SPLD architecture is that the structure of the programmable logic-planes grow too quickly in size as the number of inputs is increased. All small PLDs, including PLAs, PALs, and PAL-like devices are grouped into a single category called Simple PLDs (SPLDs), whose most important characteristics are low cost and very high pin-to-pin speed-performance. 9/17/2018

Interconnection Matrix
9/17/2018 History – CPLD. PLD Block Interconnection Matrix I/O Block CPLDs were pioneered by Altera, first in their family of chips called Classic EPLDs, and then in three additional series, called MAX 5000, MAX 7000 and MAX Because of a rapidly growing market for large FPDs, other manufacturers developed devices in the CPLD category and there are now many choices available. CPLDs provide logic capacity up to the equivalent of about 50 typical SPLD devices, but it is somewhat difficult to extend these architectures to higher densities. To build FPDs with very high logic capacity, a different approach is needed. The way to provide large capacity devices is to integrate multiple SPLDs onto a single chip and provide interconnect to programmable SPLD blocks together. PLD that has this basic structure are referred as Complex PLDs (CPLDs). 9/17/2018

9/17/2018 History - FPGA. I/O Logic block Interconnection switches FPGAs comprise an array of uncommitted circuit elements, called logic blocks, and interconnect resources. FPGA configuration is performed through programming by the end user. The highest capacity general purpose logic chips available today are the traditional gate arrays sometimes referred to as Mask-Programmable Gate Arrays (MPGAs). MPGAs consist of an array of pre-fabricated transistors that can be customized into the user’s logic circuit by connecting the transistors with custom wires. Customization is performed during chip fabrication by specifying the metal interconnect, and this means that in order for a user to employ an MPGA a large setup cost is involved and manufacturing time is long. Although MPGAs are clearly not FPDs, they are mentioned here because they motivated the design of the user-programmable equivalent: Field- Programmable Gate Arrays (FPGAs). FPGAs are only one type of PLD that supports very high logic capacity, It has been responsible for a major shift in the way digital circuits are designed! 9/17/2018

High-Capacity PLDs Architecture. Difference between FPGAs and CPLDs
CPLDs often consist of a limited set of complex reconfigurable blocks - Complex Programmable Logic Device consists of multiple SPLD blocks that are interconnected to realize larger digital systems FPGAs are configured at a fine grain level from many equivalent logic blocks or array of configurable gates - Field Programmable Gate Array has narrower logic choices and more memory elements. LUT (Lookup Table) may replace actual logic gates. 9/17/2018

High-Capacity PLDs Architecture. Difference between FPGAs and CPLDs
I/O PLD Block Interconnection Matrix I/O Block 9/17/2018

History - important terminology. Definitions of Relevant Terminology:
CPLD — a Complex PLD that consists of an arrangement of multiple SPLD-like blocks on a single chip; FPGA — a Field-Programmable Gate Array is an PLD featuring a general structure that allows very high logic capacity; Interconnect — the wiring resources in an PLD; Programmable Switch — a user-programmable switch that can connect a logic element to an interconnect wire, or one interconnect wire to another; Logic Cell – The basic building block of an Altera device Macrocell – The basic building block of Product Term-based device MAX9000, MAX7000 9/17/2018

History - important terminology.
Logic Element – The basic building block of Look-Up Table-based device FLEX10K, FLEX8000, FLEX6000 Logic Array Block (LAB) – A collection or group of logic cells Logic Capacity — the capacity of an PLD is measured by the size of gate array which is comparable to logic capacity. It can be thought as “number of 2-input NAND gates”; Logic Density — the amount of logic blocks per unit area in an PLD. Speed-Performance — measures the maximum operable speed of a circuit when implemented in an PLD. 9/17/2018

9/17/2018 History - Overview. Fairchild’s “Micromosaic” Early 70’s - First appearance of PLDs Late 70’s - Arrival of CPLDs First SRAM based devices Late 80’s - Arrival of FPGAs ISP (Lattice) First 1 million gates device First 3 million gates device First 10 million gates device The chart summarizes the categories of FPDs by listing the logic capacities available in each of the three categories. The chart summarizes the categories of FPDs by listing the logic capacities available in each of the three categories. In the figure, “equivalent gates” refers loosely to “number of 2-input NAND gates”. The chart serves as a guide for selecting a specific device for a given application, depending on the logic capacity needed. However, as we will discuss shortly, each type of FPD is inherently better suited for some applications than for others. It should also be mentioned that there exist other special-purpose devices optimized for specific applications (e.g. state machines, analog gate arrays, large interconnection problems). However, since use of such devices is limited they will not be described here. 9/17/2018

High-Capacity FPGA’s Architecture & Overview of ALTERA FPGA; Computer aided design (CAD) flow for PLD Introduction to VHDL/VerilogHDL; Getting started. 9/17/2018

FPGA - Generic Structure
9/17/2018 FPGA - Generic Structure FPGA building blocks: Programmable logic blocks Implement combinatorial and sequential logic. Based on programmable logic (LUT) and DFF. Look Up Tables made from small RAM cells. Programmable logic blocks can also be used as small memory blocks Programmable interconnect Wires to connect inputs and outputs to logic blocks. Programmable interconnects using switching matrixes. Several types of interconnects: clocks, short distance local connections, long distance connections across chip Programmable I/O blocks Special logic blocks at the periphery of device for external connections. I/O buffers have various voltage support and tri-state option. I/O Logic block Interconnection switches 9/17/2018

High-Capacity PLDs Architecture.
9/17/2018 High-Capacity PLDs Architecture. Today’s FPGA Devices Meet Embedded System Requirements Embedded RAM Wide range of fast I/O High-performance Digital Signal Processing (DSP) blocks Abundant logic Substantial embedded memory Low Cost FPGA and Structured ASIC families Soft Processor cores Today’s leading edge FPGAs have the density, features, and performance to support new designs, Wide Range of Fast I/O Up to Gbps PCI Express, SDI, Gigabit Ethernet, Serial RapidIO™, XAUI, SONET etc. 267 MHz DDR2 Memory Interfaces etc. Substantial Embedded Memory Up to 9 Mbits High Performance DSP Blocks 400-MHz MAC, x9, x18, or x36 Abundant Logic Up to 2.2M True Useable ASIC Gates Low Cost FPGA and Structured ASIC Families. Here is just one example 9/17/2018

ALTERA Cyclone II FPGA Overview
9/17/2018 ALTERA Cyclone II FPGA Overview Highest Density 68K Logic Elements (700K ASIC Gates) Highest Performance High-Performance DSP 250 MHz Performance Embedded Systems Software Nios II Processor Key Messages: Cyclone II was built on the success of the first-generation Cyclone family. Altera has shipped over 10 million units of Cyclone in the first 2 years. Cyclone II FPGAs were built from the ground up to provide the lowest cost possible. Since Cyclone II offers the industry’s smallest die size for a low-cost FPGA family, we get a better cost structure that allows Altera to pass the savings to our customers by offering the lowest price. These low prices combined with the functionality & performance of Cyclone II devices allows Cyclone II FPGAs to become an ideal replacement for low-to-mid density ASICs & ASSPs & other FPGAs. Cyclone II FPGAs offer many advantages versus competing low-cost FPGAs. - Lowest Price: Cyclone II offers the industry’s smallest die size. Cyclone II devices offer 30% lower cost (on a cost per logic basis) than Cyclone devices. - Highest Performance: Cyclone II FPGAs are 60% faster than competing 90 nanometer low-cost FPGAs. - High Performance DSP: Cyclone II FPGAs offer up to 150 multipliers capable of running at 250 MHz for high-performance DSP applications. - Low-Cost Embedded Processing: Cyclone II devices support Nios II processors for as low as 35 cents of logic with the Nios II/e (economy) processor in a EP2C20 at 250Ku volumes. The Nios II/f processor offers over 100 DMIPS performance in a Cyclone II FPGA. Use Nios II processors in Cyclone II devices to offload external processors or to replace them & reduce system costs. - Low Power: Cyclone II FPGAs consume less static & dynamic power than Spartan-3 devices. 9/17/2018

ALTERA Cyclone II FPGA Architecture
Features: 90-nm dielectric process High density architecture with 4,608 to 68,416 LEs Up to 1.1 Mbits of RAM True dual-port operations (one read one write, two reads or two writes) for x1, x2, x4, x8, x16 and x18 modes Variable port configurations (x1, x2, x4, x8, x16, x32 and x36) Up to 260MHz operation Embedded Multipliers Advanced I/O support Flexible clock management circuitry Hierarchical clock network for up to MHz Up to 4 PLLs Up to 16 global clock lines 9/17/2018

Overview of ALTERA Cyclone II Device
Embedded Multipliers Logic Array Side I/O Elements with Support for PCI/PCI-X & Memory Interfaces M4K Memory Blocks Top & Bottom I/O Elements with Support for Memory Interfaces Phase-Locked Loops 9/17/2018

High-Capacity FPDs Architecture. Configurable logic
Clock Flip-flop D Q MUX D0 D1 D3 D2 Y S1 S0 CLR LUT 1 Logic resources arranged in arrays or in rows. Logic blocks hold large or small amount of logic (fine grained architecture) 9/17/2018

9/17/2018 Look-Up Tables (LUT) LUT with N-inputs can be used to implement any combinatorial function of N inputs LUT is programmed with the truth-table LUT A B C D Z LUT implementation A B C D Z Truth-table Gate implementation 9/17/2018

LUT Implementation Example: 3-input LUT
0/1 X1 X2 X3 F Example: 3-input LUT Based on multiplexers (pass transistors) LUT entries stored in configuration memory cells Configuration memory cells 9/17/2018

ALTERA Cyclone II Logic Element
Look-up table (LUT) to implement combinatorial logic Register for sequential circuits Additional logic (below): Carry logic for arithmetic functions Expansion logic for functions requiring more than 4 inputs LUT Chain Carry In0 Carry In1 Register Chain Local Routing In1 In2 In3 In4 LUT REG General Routing Clock General Routing Carry Out0 Carry Out1 Register Chain The basic logic block, called a Logic Element (LE) contains a four input LUT, a flip-flop, and special-purpose carry circuitry for arithmetic circuits. The LE also includes cascade circuitry that allows efficient implementation of wide AND functions. 9/17/2018

Logic Element: Normal Mode
9/17/2018 Logic Element: Normal Mode LUT Chain Input Register Chain Input Register Control Signals addnsub cin (2) data1 4-Input LUT Sync Load & Clear Logic data2 D DATA Row, Column & DirectLink Routing data3 data4 Local Routing Register Feedback Each LE’s programmable register can be configured for D, T, JK, or SR operation. Each register has data, true asynchronous load data, clock, clock enable, clear, and asynchronous load/preset inputs. Global signals, general-purpose I/O pins, or any internal logic can drive the register’s clock and clear control signals. LUT Chain Output Register Chain Output The normal mode is suitable for general logic applications and wide decoding functions that can take advantage of a cascade chain. In normal mode, four data inputs from the LAB local interconnect and the carry-in signal are the inputs to a 4-input LUT. 9/17/2018

Logic Element: Dynamic Arithmetic Mode
LAB Carry-In Register Chain Input Register Control Signals Carry-In Logic Carry-In0 Carry-In1 addnsub data1 Sum Calculator Sync Load & Clear Logic D DATA data2 Row, Column & DirectLink Routing data3 Carry Calculator Local Routing Carry-In0 Carry-Out Logic Carry-In1 Register Chain Output The arithmetic mode offers two 3-input LUTs that are ideal for implementing adders, accumulators, and comparators. One LUT provides a 3-bit function; the other generates a carry bit. Carry-Out1 Carry-Out0 9/17/2018

Dynamic Arithmetic Mode Full adder
S = (A xor B) xor C Co = (A * B) + (C*(A xor B)) A B S Co Ci S Co Ci B A 1 2 3 4 5 6 7 9/17/2018

Logic Element: Dynamic Arithmetic Mode
B S Co Ci LAB Carry-In Register Chain Input Register Control Signals Carry-In Logic Carry-In0 Carry-In1 addnsub data1 Sum Calculator Sync Load & Clear Logic D DATA data2 Row, Column & DirectLink Routing data3 Carry Calculator Local Routing Carry-In0 Carry-Out Logic Carry-In1 Register Chain Output Carry-Out1 9/17/2018 Carry-Out0

Counters Ripple counter Ripple-carry counter
B clk Flip-flop D Q R Reset D2 D1 D0 Ripple counter Ripple carry counter is not recommended in FPGA designs due to their asynchronous nature clk Reset Flip-flop D Q R D2 D1 D0 CIN COUT D0 = Q0 xor Cin C0 = Q0 and Cin Ripple-carry counter Synchronous design 9/17/2018

Dynamic Arithmetic Mode Ripple-carry counter
clk Reset Flip-flop D Q R D2 D1 D0 CIN COUT LAB Carry-In Register Chain Input Register Control Signals Carry-In Logic Carry-In0 Carry-In1 addnsub data1 Sum Calculator Sync Load & Clear Logic D DATA data2 Row, Column & DirectLink Routing data3 Carry Calculator Local Routing Carry-In0 Carry-Out Logic Carry-In1 Register Chain Output Carry-Out1 9/17/2018 Carry-Out0

Dynamic Arithmetic Mode Comparator
A=B if A(0) = B(0) and A(1) = B(1) … and A(n-1)=B(n-1) or AeqB = (A(0) xnor B(0)) and (A(1) xnor B(1)) and ….etc. B(2) A(2) B(1) A(1) B(0) A(0) B(2) A(2) B(1) A(1) B(0) A(0) A eq B A eq B 9/17/2018

Logic Array Blocks (LAB)
9/17/2018 Logic Array Blocks (LAB) Control Signals: 2 CLK 2 CLK EN 2 ACLR 1 SCLR 1 SLOAD Direct link interconnect from left and right LAB, MK4 memory block, embedded multiplier, PLL or IOE output 4 LE1 4 LE2 4 LE3 16 LEs Local Interconnect LAB Control Signals LE carry chains Register chains Direct link interconnect to left (up to 48 LEs) 4 Direct link interconnect to right (up to 48 LEs) LE4 Fast Local Interconnect 4 LE13 4 LE14 4 30 LAB Input Lines 10 LE Feedback Lines LE15 4 LE16 9/17/2018

High-Capacity PLDs Architecture
High-Capacity PLDs Architecture. Several types of configurable interconnects Before Programming After Programming Switch matrix 6 pass transistors per switch matrix interconnect point Pass transistors act as programmable switches Pass transistor gates are driven by configuration memory cells Interconnect point 9/17/2018

Programmable Interconnect
9/17/2018 Programmable Interconnect Interconnect hierarchy (not shown) Fast local interconnect Horizontal and vertical lines of various lengths LE LE LE Switch Matrix Switch Matrix LE LE LE 9/17/2018

High-Capacity PLDs Architecture. Several types of I/O
Configurability of the user I/O varies to a great extent there are dedicated pins for power and configuration some of the user I/O pins may be reserved for special function during configuration. In/out/tri-state; Flip-flops, latches; Pull-up/pull-down; DDR; Series resistors; Bus keeper; Drive strength control; Slew rate control; Single ended/differential. 9/17/2018

High-Capacity FPDs Architecture. Several types of I/O
Several low-voltage I/O standards; Mixed voltage I/O bank capability; Delay lines; Boundary scan (JTAG); Differential I/O. Enable Flip-flop From array D Q Clock Flip-flop To array Q D Clock 9/17/2018

Basic I/O Block Structure
Three-State D Q Three-State Control Clock Output D Q Output Path Direct Input Input Path Registered Input Q D 9/17/2018

I/O Bank I/O Bank Numbers & Locations
9/17/2018

High-Capacity FPDs Architecture. Special structures
On chip RAMs and ROMs: Nearly all vendors offer devices with on chip RAM Blocks; RAM blocks may be cascaded; RAM blocks can be configured in different ways (single ported, dual ported, synchronous, asynchronous, CAM). Clock management - on chip DLLs or PLLs: High end devices have up to 8 DLLs/PLLs on chip; Used to deskew on/off chip clock signals (e. g. to RAM banks); Provide clock division and multiplication capabilities; DLLs have a minimum operating frequency. DSP options and applications: On chip Multipliers; On chip MACs. On chip Microprocessor Cores; Support for various interface standards High-speed serial I/Os Boundary Scan/JTAG. 9/17/2018

CycloneII Embedded Memory
4-Kbit Blocks 250-MHz Performance Fully Synchronous True Dual-Port Mode Simple Dual-Port Mode Single-Port memory Flexible Capabilities Mixed-Clock Mode Mixed-Width Mode Shift Register Mode Read-Only Mode Byte Enables Initialization Support Port A Port B DATA ADDR WREN CLK CLKENA OUT CLR DATA ADDR WREN CLK CLKENA OUT CLR 9/17/2018

Block RAM Port Aspect Ratios
1 2 4 8 512 x 8 1k x 4 8+1 2k x 2 512k x (8+1) 16 256 x 16 16+2 4k x 1 256 x (16+2) 32 128 x 32 32+4 9/17/2018 128 x (32+4)

Overview of ALTERA PLDs. Embedded Array Block
9/17/2018 Overview of ALTERA PLDs. Embedded Array Block Use of EAB: Logic Functions – Area-efficient and fast for complex functions – DSP – Arithmatic Logic – Microprocessor / Microcontroller Memory Functions – RAM – ROM – Dual-port RAM – FIFO EAB size is flexible Combine EABs to create larger blocks 9/17/2018

Global Clock Network & Phase-Locked Loops
9/17/2018 Global Clock Network & Phase-Locked Loops Clock management is important within digital systems design High speed designs require low latency, low skew clock solutions Low latency – a minimum propagation delay time throughout the device Low skew – a minimum difference between actual clock edges as seen on various points on the device Sources for clock skew? Cyclone II devices provide the following for clock management A global clock network Up to four phase-locked loops (PLLs) 9/17/2018

PLLs and global clock network
9/17/2018 PLLs and global clock network 3 2 1 4 GCLK Parameter Cyclone II PLL Input Frequency Range 11 to 311 MHz Output Frequency Range 10 to MHz Time to Lock from Power up 1 ms VCO Operating Range 300 MHz to 1 GHz 16 Total Nets Used as Clock Sources for All Device Blocks Fed by Dedicated Clock Pins PLL Outputs Internal Logic CLK[3..0] CLK[15..12] CLK[11..8] CLK[7..4] Contain up to four PLLs arranged in the four corners of the Cyclone II device Provide robust clock management and synthesis for device clock management, and external system clock management PLLs offer clock multiplication and division, phase shifting, and programmable duty cycle Can be used to minimize clock delay and clock skew 9/17/2018

Phase-Locked Loops (PLLs)
9/17/2018 Phase-Locked Loops (PLLs) A PLL is a closed-loop feedback control system that maintains a generated signal in a fixed phase relationship to a reference signal Applications include: Frequency synthesizers for digitally-tuned radio receivers and transmitters FM and AM radio signal demodulation Clock multipliers in digital systems that allow internal elements to run faster (or slower) than external connections, while maintaining precise timing relationships (our basic application in this course) Cyclone II PLLs provide general-purpose clocking with clock multiplication and phase shifting as well as outputs for differential I/O support 9/17/2018

Cyclone II PLL Details Basic PLL Operation
CP LF PFD VCO G0 G1 EG Global Clock Network I/O Buffer N M Lock Detect I/O & Global Routing Reference Clock The main purpose of a PLL is to synchronize the phase and frequency of a voltage controlled oscillator (VCO) to an input reference clock There are a number of components that comprise a PLL to achieve this phase alignment The PLL compares the rising edge of the reference input clock to a feedback clock using a phase-frequency detector (PFD) The PFD produces an up or down signal that determines whether the VCO needs to operate at a higher or lower frequency The PFD output is applied to a charge pump and loop filter, which produces a control voltage for setting the frequency of the VCO If the PFD transitions the up signal high, then the VCO frequency increases If the PFD transitions the down signal high, then the VCO frequency decreases 9/17/2018

Cyclone II PLL Details Basic PLL Operation
9/17/2018 Cyclone II PLL Details Basic PLL Operation The loop filter converts these up/down signals to a voltage that is used to bias the VCO If the charge pump receives a logic high on the up signal, current is driven into the loop filter If the charge pump receives a logic high on the down signal, current is drawn from the loop filter The voltage from the charge pump determines how fast the VCO operates The VCO is implemented as an four-stage differential ring oscillator A divide counter, m, is inserted in the feedback loop to increase the VCO frequency above the input reference frequency, making the VCO frequency fVCO = m × fREF The feedback clock, fFB, applied to one input of the PFD, is locked to the input reference clock, fREF (fIN/n), applied to the other input of the PFD The VCO output can feed up to three post-scale counters (c0, c1, c2) These post-scale counters allow a number of harmonically related frequencies to be produced by the PLL 9/17/2018

CycloneII Embedded Multiplier
9/17/2018 Sign_X 18 X 36 36 Input Registers Output Registers 18 Y Sign_Y Clock Clear 250-MHz Performance 9/17/2018 Note: Fastest Speed Grade with Registers Activated in 18x18 or 9x9 Mode

High-Capacity FPDs Architecture. Programming technologies
9/17/2018 High-Capacity FPDs Architecture. Programming technologies An EEPROM or EPROM transistor is used as a programmable switch for CPLDs (and also for many SPLDs) by placing the transistor between two wires in a way that facilitates implementation of wired-AND functions. The slide shows EPROM transistors as they might be connected in an AND-plane of a CPLD. An input to the AND-plane can drive a product wire to logic level ‘0’ through an EPROM transistor, if that input is part of the corresponding product term. For inputs that are not involved for a product term, the appropriate EPROM transistors are programmed to be permanently turned off. A diagram for an EEPROMbased device would look similar. 9/17/2018

High-Capacity FPDs Architecture. Programming technologies
FPGA products are based either on SRAM or antifuse technologies. An example of usage of SRAM-controlled switches showing two applications of SRAM cells: for controlling the gate nodes of pass-transistor switches and to control the select lines of multiplexers that drive logic block inputs. 9/17/2018

High-Capacity FPGA’s Architecture & Overview of ALTERA FPGA; Computer aided design (CAD) flow for PLD; Introduction to VHDL/VerilogHDL; Getting started. 9/17/2018

Computer aided design (CAD) flow for PLD
9/17/2018 CAD tools are important not only for complex devices like CPLDs and FPGAs, but also for SPLDs. A typical CAD system for PLDs would include software for the following tasks: initial design entry, logic optimization, device fitting, simulation, and configuration. Text entry Schematic capture Marge & translate Optimize equations Device fitter Simulate PLD When designing circuits for implementation in FPDs, it is essential to employ Computer-Aided Design (CAD) programs. Such software tools are discussed briefly in this section to provide a feel for the design process involved. This design flow is illustrated in picture, which also indicates how some stages feed back to others. Design entry may be done either by creating a schematic diagram with a graphical CAD tool, by using a textbased system to describe a design in a simple hardware description language, or with a mixture of design entry methods. Since initial logic entry is not usually in an optimized form, algorithms are employed to optimize the circuits, after which additional algorithms analyse the resulting logic equations and “fit” them into the SPLD. Simulation is used to verify correct operation, and the user would return to the design entry step to fix errors. When a design simulates correctly it can be loaded into a programming unit and used to configure an SPLD. One final detail to note about picture is that while the original design entry step is performed manually by the designer, all other steps are carried out automatically by most CAD systems. Evaluation Board Programming unit Configuration file HW (Board) Simulation & Test 9/17/2018

Quartus II Development System
9/17/2018 Quartus II Development System Fully-Integrated Design Tool Multiple Design Entry Methods Logic Synthesis Place & Route Simulation Timing & Power Analysis Device Programming

Text entry and Schematic Capture 9/17/2018

Marge & translate LUT0 LUT4 LUT1 FF1 LUT5 LUT2 FF2 LUT3 9/17/2018

Optimize equations FPGA 9/17/2018

Device fitter : Placing & Routing FPGA Programmable Connections 9/17/2018

Simulate PLD 9/17/2018

Contents: Idea; History; High-Capacity FPGA’s Architecture;
Computer aided design (CAD) flow for PLD; Introduction to VHDL/VerilogHDL; Getting started. 9/17/2018

VHDL Coding for Synthesis

Non-synthesizable VHDL
Delays: Are not synthesizable. Statements, such as wait for 5 ns a <= b after 10 ns will not produce the required delay, and should not be used in the code intended for synthesis. Initializations: Declarations of signals (and variables) with initialized values, such as SIGNAL a : STD_LOGIC := ‘0’; cannot be synthesized, and thus should be avoided. If present, they will be ignored by the synthesis tools. Use set and reset signals instead. 9/17/2018

Register Transfer Level (RTL) Design Description
Synthesizable VHDL Register Transfer Level (RTL) Design Description Combinational Logic Combinational Logic Flip-flop D Q Clock From array Flip-flop D Q Clock … Registers 9/17/2018

VHDL Design Styles synthesizable VHDL Design Styles dataflow
structural behavioral Concurrent statements Components and interconnects Sequential statements Registers Shift registers Counters State machines synthesizable and more if you are careful 9/17/2018

Combinational Logic Synthesis for Beginners

Simple rules for beginners
For combinational logic, use only concurrent statements concurrent signal assignment () conditional concurrent signal assignment (when-else) selected concurrent signal assignment (with-select-when) generate scheme for equations (for-generate) 9/17/2018

For circuits composed of - simple logic operations (logic gates) - simple arithmetic operations (addition, subtraction, multiplication) - shifts/rotations by a constant Use concurrent signal assignment () 9/17/2018

For circuits composed of - multiplexers - decoders, encoders - tri-state buffers use: conditional concurrent signal assignment (when-else) selected concurrent signal assignment (with-select-when) 9/17/2018

Left vs. right side of the assignment
Left side <= <= when-else with-select <= Right side Internal signals (defined in a given architecture) Ports of the mode - out - inout - buffer Expressions including: Internal signals (defined in a given architecture) Ports of the mode - in - inout - buffer 9/17/2018

Arithmetic operations
Synthesizable arithmetic operations: Addition: + Subtraction: - Comparisons: >, >=, <, <= Multiplication: * Division by a power of 2, /2**6 (equivalent to right shift) Shifts by a constant, SHL, SHR 9/17/2018

Arithmetic operations
The result of synthesis of an arithmetic operation is a - combinational circuit without pipelining. The exact internal architecture used (and thus delay and area of the circuit)may depend on the timing constraints specified during synthesis (e.g., the requested maximum clock frequency). 9/17/2018

Operations on Unsigned Numbers
For operations on unsigned numbers: USE ieee.std_logic_unsigned.all and signals (inputs/outputs) of the type STD_LOGIC_VECTOR or USE ieee.std_logic_arith.all UNSIGNED 9/17/2018

Operations on Signed Numbers
For operations on signed numbers USE ieee.std_logic_signed.all and signals (inputs/outputs) of the type STD_LOGIC_VECTOR or USE ieee.std_logic_arith.all SIGNED 9/17/2018

Integer Types TYPE day_of_month IS RANGE 0 TO 31;
Operations on signals (variables) of the integer types: INTEGER, NATURAL, and their sybtypes, such as TYPE day_of_month IS RANGE 0 TO 31; are synthesizable in the range -(231-1) for INTEGERs and their subtypes for NATURALs and their subtypes Operations on signals (variables) of these types: are less flexible and more difficult to control than operations on signals (variables) of the type STD_LOGIC_VECTOR UNSIGNED SIGNED, and thus are recommened to be avoided by beginners. 9/17/2018

Addition of Signed Numbers (1)
LIBRARY ieee ; USE ieee.std_logic_1164.all ; USE ieee.std_logic_signed.all ; ENTITY adder16 IS PORT ( Cin : IN STD_LOGIC ; X, Y : IN STD_LOGIC_VECTOR(15 DOWNTO 0) ; S : OUT STD_LOGIC_VECTOR(15 DOWNTO 0) ; Cout, Overflow : OUT STD_LOGIC ) ; END adder16 ; ARCHITECTURE Behavior OF adder16 IS SIGNAL Sum : STD_LOGIC_VECTOR(16 DOWNTO 0) ; BEGIN Sum <= ('0' & X) + Y + Cin ; S <= Sum(15 DOWNTO 0) ; Cout <= Sum(16) ; Overflow <= Sum(16) XOR X(15) XOR Y(15) XOR Sum(15) ; END Behavior ; 9/17/2018

LIBRARY ieee ; USE ieee.std_logic_1164.all ; USE ieee.std_logic_arith.all ; ENTITY adder16 IS PORT ( Cin : IN STD_LOGIC ; X, Y : IN SIGNED (15 DOWNTO 0) ; S : OUT SIGNED (15 DOWNTO 0) ; Cout, Overflow : OUT STD_LOGIC ) ; END adder16 ; ARCHITECTURE Behavior OF adder16 IS SIGNAL Sum : SIGNED(16 DOWNTO 0) ; BEGIN Sum <= ('0' & X) + Y + Cin ; S <= Sum(15 DOWNTO 0) ; Cout <= Sum(16) ; Overflow <= Sum(16) XOR X(15) XOR Y(15) XOR Sum(15) ; END Behavior ; 9/17/2018

ENTITY adder16 IS PORT ( X, Y : IN INTEGER RANGE TO 32767; S : OUT INTEGER RANGE TO ); END adder16 ; ARCHITECTURE Behavior OF adder16 IS BEGIN S <= X + Y ; END Behavior ; 9/17/2018

Addition of Unsigned Numbers
LIBRARY ieee ; USE ieee.std_logic_1164.all ; USE ieee.std_logic_unsigned.all ; ENTITY adder16 IS PORT ( Cin : IN STD_LOGIC ; X, Y : IN STD_LOGIC_VECTOR(15 DOWNTO 0) ; S : OUT STD_LOGIC_VECTOR(15 DOWNTO 0) ; Cout : OUT STD_LOGIC ) ; END adder16 ; ARCHITECTURE Behavior OF adder16 IS SIGNAL Sum : STD_LOGIC_VECTOR(16 DOWNTO 0) ; BEGIN Sum <= ('0' & X) + Y + Cin ; S <= Sum(15 DOWNTO 0) ; Cout <= Sum(16) ; END Behavior ; 9/17/2018

Multiplication of signed and unsigned numbers (1)
LIBRARY ieee; USE ieee.std_logic_1164.all; USE ieee.std_logic_arith.all ; entity multiply is port( a : in STD_LOGIC_VECTOR(15 downto 0); b : in STD_LOGIC_VECTOR(7 downto 0); cu : out STD_LOGIC_VECTOR(11 downto 0); cs : out STD_LOGIC_VECTOR(11 downto 0)); end multiply; architecture dataflow of multiply is SIGNAL sa: SIGNED(15 downto 0); SIGNAL sb: SIGNED(7 downto 0); SIGNAL sres: SIGNED(23 downto 0); SIGNAL sc: SIGNED(11 downto 0); SIGNAL ua: UNSIGNED(15 downto 0); SIGNAL ub: UNSIGNED(7 downto 0); SIGNAL ures: UNSIGNED(23 downto 0); SIGNAL uc: UNSIGNED(11 downto 0); 9/17/2018

Multiplication of signed and unsigned numbers (2)
begin -- signed multiplication sa <= SIGNED(a); sb <= SIGNED(b); sres <= sa * sb; sc <= sres(11 downto 0); cs <= STD_LOGIC_VECTOR(sc); -- unsigned multiplication ua <= UNSIGNED(a); ub <= UNSIGNED(b); ures <= ua * ub; uc <= ures(11 downto 0); cu <= STD_LOGIC_VECTOR(uc); end dataflow; 9/17/2018

Describing combinational logic using processes
LIBRARY ieee ; USE ieee.std_logic_1164.all ; ENTITY dec2to4 IS PORT ( w : IN STD_LOGIC_VECTOR(1 DOWNTO 0) ; En : IN STD_LOGIC ; y : OUT STD_LOGIC_VECTOR(0 TO 3) ) ; END dec2to4 ; ARCHITECTURE Behavior OF dec2to4 IS BEGIN PROCESS ( w, En ) IF En = '1' THEN CASE w IS WHEN "00" => y <= "1000" ; WHEN "01" => y <= "0100" ; WHEN "10" => y <= "0010" ; WHEN OTHERS => y <= "0001" ; END CASE ; ELSE y <= "0000" ; END IF ; END PROCESS ; END Behavior ; 9/17/2018

LIBRARY ieee ; USE ieee.std_logic_1164.all ; ENTITY seg7 IS PORT ( bcd : IN STD_LOGIC_VECTOR(3 DOWNTO 0) ; leds : OUT STD_LOGIC_VECTOR(1 TO 7) ) ; END seg7 ; ARCHITECTURE Behavior OF seg7 IS BEGIN PROCESS ( bcd ) CASE bcd IS abcdefg WHEN "0000" => leds <= " " ; WHEN "0001" => leds <= " " ; WHEN "0010" => leds <= " " ; WHEN "0011" => leds <= " " ; WHEN "0100" => leds <= " " ; WHEN "0101" => leds <= " " ; WHEN "0110" => leds <= " " ; WHEN "0111" => leds <= " " ; WHEN "1000" => leds <= " " ; WHEN "1001" => leds <= " " ; WHEN OTHERS => leds <= “zzzzzzz" ; END CASE ; END PROCESS ; END Behavior ; 9/17/2018

LIBRARY ieee ; USE ieee.std_logic_1164.all ; ENTITY compare1 IS PORT ( A, B : IN STD_LOGIC ; AeqB : OUT STD_LOGIC ) ; END compare1 ; ARCHITECTURE Behavior OF compare1 IS BEGIN PROCESS ( A, B ) AeqB <= '0' ; IF A = B THEN AeqB <= '1' ; END IF ; END PROCESS ; END Behavior ; 9/17/2018

Incorrect code for combinational logic - Implied latch (1)
LIBRARY ieee ; USE ieee.std_logic_1164.all ; ENTITY implied IS PORT ( A, B : IN STD_LOGIC ; AeqB : OUT STD_LOGIC ) ; END implied ; ARCHITECTURE Behavior OF implied IS BEGIN PROCESS ( A, B ) IF A = B THEN AeqB <= '1' ; END IF ; END PROCESS ; END Behavior ; 9/17/2018

Incorrect code for combinational logic - Implied latch (2)
AeqB 9/17/2018

Rules that need to be followed: All inputs to the combinational circuit should be included in the sensitivity list No other signals should be included in the sensitivity list None of the statements within the process should be sensitive to rising or falling edges All possible cases need to be covered in the internal IF and CASE statements in order to avoid implied latches 9/17/2018

Covering all cases in the IF statement
Using ELSE IF A = B THEN AeqB <= '1' ; ELSE AeqB <= '0' ; Using default values AeqB <= '0' ; IF A = B THEN AeqB <= '1' ; 9/17/2018

Covering all cases in the CASE statement
Using WHEN OTHERS CASE y IS WHEN S1 => Z <= "10"; WHEN S2 => Z <= "01"; WHEN OTHERS => Z <= "00"; END CASE; CASE y IS WHEN S1 => Z <= "10"; WHEN S2 => Z <= "01"; WHEN S3 => Z <= "00"; WHEN OTHERS => Z <= „ZZ"; END CASE; Using default values Z <= "00"; CASE y IS WHEN S1 => Z <= "10"; WHEN S2 => Z <= "10"; END CASE; 9/17/2018

Combinational Logic Synthesis for Advanced

Advanced VHDL for synthesis
For complex, generic, and/or regular circuits you may consider using PROCESSES with internal VARIABLES and FOR LOOPs 9/17/2018

N-bit NAND LIBRARY ieee; USE ieee.std_logic_1164.all; ENTITY NANDn IS
GENERIC (n: INTEGER := 8) PORT ( X : IN STD_LOGIC_VECTOR(1 TO n); Y : OUT STD_LOGIC); END NANDn; 9/17/2018

N-bit NAND architecture using variables
ARCHITECTURE behavioral1 OF NANDn IS BEGIN PROCESS (X) VARIABLE Tmp: STD_LOGIC; Tmp := X(1); AND_bits: FOR i IN 2 TO n LOOP Tmp := Tmp AND X( i ) ; END LOOP AND_bits ; Y <= NOT Tmp ; END PROCESS; END behavioral1 ; 9/17/2018

Incorrect N-bit NAND architecture using signals
ARCHITECTURE behavioral2 OF NANDn IS SIGNAL Tmp: STD_LOGIC; BEGIN PROCESS (X) Tmp <= X(1); AND_bits: FOR i IN 2 TO n LOOP Tmp <= Tmp AND X( i ) ; END LOOP AND_bits ; Y <= NOT Tmp ; END PROCESS; END behavioral2 ; 9/17/2018

Correct N-bit NAND architecture using signals
ARCHITECTURE dataflow1 OF NANDn IS SIGNAL Tmp: STD_LOGIC_VECTOR(1 TO n); BEGIN Tmp(1) <= X(1); AND_bits: FOR i IN 2 TO n GENERATE Tmp(i) <= Tmp(i-1) AND X( i ) ; END LOOP AND_bits ; Y <= NOT Tmp(n) ; END dataflow1 ; 9/17/2018

Parity generator entity
LIBRARY ieee; USE ieee.std_logic_1164.all; ENTITY oddParityLoop IS GENERIC ( width : INTEGER := 8 ); PORT ( ad : in STD_LOGIC_VECTOR (width - 1 DOWNTO 0); oddParity : out STD_LOGIC ) ; END oddParityLoop ; ARCHITECTURE dataflow OF oddParityGen IS SIGNAL genXor: STD_LOGIC_VECTOR(width DOWNTO 0); BEGIN genXor(0) <= '0'; parTree: FOR i IN 1 TO width GENERATE genXor(i) <= genXor(i - 1) XOR ad(i - 1); END GENERATE; oddParity <= genXor(width) ; END dataflow ; 9/17/2018

Parity generator architecture using variables
ARCHITECTURE behavioral OF oddParityLoop IS BEGIN PROCESS (ad) VARIABLE loopXor: STD_LOGIC; loopXor := '0'; FOR i IN 0 to width -1 LOOP loopXor := loopXor XOR ad( i ) ; END LOOP ; oddParity <= loopXor ; END PROCESS; END behavioral ; 9/17/2018

Sequential Logic Synthesis for Beginners

For Beginners Use processes with very simple structure only
to describe - registers - shift registers - counters - state machines. Use examples discussed in class as a template. Create generic entities for registers, shift registers, and counters, and instantiate the corresponding components in a higher level circuit using GENERIC MAP PORT MAP. Supplement sequential components with combinational logic described using concurrent statements. 9/17/2018

For Intermmediates Use Processes with IF and CASE statements only. Do not use LOOPS or VARIABLES. Sensitivity list of the PROCESS should include only signals that can by themsleves change the outputs of the sequential circuit (typically, clock and asynchronous set or reset) Do not use PROCESSes without sensitivity list (they can be synthesizable, but make simulation inefficient) 9/17/2018

For Intermmediates (2) Given a single signal, the assignments to this signal should only be made within a single process block in order to avoid possible conflicts in assigning values to this signal. Process 1: PROCESS (a, b) BEGIN y <= a AND b; END PROCESS; Process 2: PROCESS (a, b) y <= a OR b; 9/17/2018

For Advanced Describe the algorithm you are trying to implement in pseudocode. Translate the pseudocode directly to VHDL using processes with IF, CASE, LOOP, and VARIABLES. 9/17/2018

Introduction to VHDL/Verilog
LIBRARY IEEE; USE IEEE.STD_LOGIC_1164.ALL; ENTITY SHREG IS PORT(SHIFT,CLK : IN STD_LOGIC; A : IN STD_LOGIC; Y : OUT STD_LOGIC); END SHREG; ARCHITECTURE RTL OF SHREG IS SIGNAL TMP : STD_LOGIC_VECTOR(7 DOWNTO 0) := " "; BEGIN PROCESS(CLK,A,SHIFT,TMP) TMP <= TMP; IF RISING_EDGE(CLK) THEN IF SHIFT = '1' THEN TMP <= TMP(6 DOWNTO 0) & A; END IF; END PROCESS; Y <= TMP(7); END RTL; module SHREG (shift,clk,a,y); input shift,clk,a; output y; reg [7:0] tmp; clk) begin tmp = tmp; if (shift) tmp = {tmp[6:0],a}; end assign y = tmp[7]; endmodule 9/17/2018

reset Y =1 ST0 Y =2 control ST3 ST1 Y =0 ST2 Y =3 9/17/2018

LIBRARY IEEE; USE IEEE.STD_LOGIC_1164.ALL; ENTITY FSM IS PORT(CLK,RST : IN STD_LOGIC; CONTROL: IN STD_LOGIC; Y : OUT STD_LOGIC_VECTOR(1 DOWNTO 0)); END FSM; ARCHITECTURE RTL OF FSM IS TYPE StateTape IS (ST0,ST1,ST2,ST3); SIGNAL STATE : StateTape; BEGIN PROCESS(CLK,RST) IF (RST = '1') THEN STATE <= ST0; ELSIF (CLK'EVENT AND CLK = '1') THEN CASE (STATE) IS WHEN ST0 => STATE <= ST1; WHEN ST1 => IF (CONTROL = '1') THEN STATE <= ST2; ELSE STATE <= ST3; END IF; WHEN ST2 => STATE <= ST3; WHEN ST3 => STATE <= ST0; WHEN OTHERS => NULL; END CASE; END PROCESS; WITH STATE SELECT Y <= "01" WHEN ST0, "10" WHEN ST1, "11" WHEN ST2, "00" WHEN ST3, "00" WHEN OTHERS; END RTL; module FSM (clk,rst,control,y); input clk,rst; input control; output [1:0] y; reg [1:0] y; parameter [1:0] ST0 = 0,ST1 = 1,ST2 = 2,ST3 = 3; reg [1:0] STATE; clk or posedge rst) begin if (rst) STATE = ST0; else case(STATE) ST0: STATE = ST1; ST1: if (control) STATE = ST2; STATE = ST3; ST2: STATE = ST3; ST3: STATE = ST0; endcase end case (STATE) ST0: y = 1; ST1: y = 2; ST2: y = 3; ST3: y = 0; default: y = 1; endmodule 9/17/2018

Overview of ALTERA FPDs. Important points for design
Based on Synthesis together with specialized Place and Route tool. Small local blocks can run quite fast. Large complicated blocks has limited speed because of interconnect delays. General functional blocks optimized for given FPGA architecture often given as macro blocks: Counters, adders, multipliers, FIFO’s, etc. Important to use such blocks in synthesis to obtain good performance Dedicated timing estimator/calculator: LUTs fixed delays, F-F fixed delays, interconnects depends on length and number of routing switches. If design gets close to full utilization then design time increases significantly (problem with place and route).

Contents: Idea; History; High-Capacity FPGA’s Architecture;
Computer aided design (CAD) flow for PLD; Introduction to VHDL/VerilogHDL; Getting started. 9/17/2018

Getting started 9/17/2018

Introduction to FPD. The END 9/17/2018

Xilinx Virtex-II FPGA with hard PowerPC 405 core + >20K CLBs
FPGA IP Programmable logic vendors are adding Hard and Soft IP (“Intellectual Property”) to their arrays, such as small RISC processors, DSPs, multipliers, digital filters, etc. Xilinx Virtex-II FPGA with hard PowerPC 405 core + >20K CLBs Embedded 400 MHz, 600+ D-MIPS RISC core (32-bit Harvard architecture) 5-stage data path pipeline Hardware multiply and divide 32 x 32-bit general-purpose registers 16 KB 2-way set-associative instruction cache 16 KB 2-way set-associative data cache, write back/write through Implements PowerPC User Instruction Set Architecture (UISA) Xilinx MicroBlaze 32-bit processor soft core Altera Excalibur FPGA with Hard ARM bit RISC CPU Altera NIOS 32-bit processor soft core 9/17/2018

PLD Introduction to PLD. Presented by:.

Similar presentations

Presentation on theme: "PLD Introduction to PLD. Presented by:."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

PLD Introduction to PLD. Presented by:.

Similar presentations

Presentation on theme: "PLD Introduction to PLD. Presented by:."— Presentation transcript:

Similar presentations

About project

Feedback