Presentation is loading. Please wait.

Presentation is loading. Please wait.

Chris Martin Member Technical Staff Embedded Applications

Similar presentations


Presentation on theme: "Chris Martin Member Technical Staff Embedded Applications"— Presentation transcript:

1 Building Multi-Processor FPGA Systems Hands-on Tutorial to Using FPGAs and Linux
Chris Martin Member Technical Staff Embedded Applications Good morning everyone And welcome It looks like it is about that time, so let’s go ahead and get started. Firstly, before we do get started, can anyone in the back not hear me. Ok. Great. OK, as a brief introduction: My name is Chris Martin. I am a member techincal staff embedded apps eng at Altera And this morning, I am going to talk about building multiprocessor systems Using FPGAs and Linux This is a hands-on tutorial and we have several boards setup If you have a laptop with a terminal emulator program already installed

2 Agenda Introduction Problem: How to Integrate Multi-Processor Subsystems Why… Why would you do this? Why use FPGAs? Lab 1: Getting Started - Booting Linux and Boot-strapping NIOS Lab 2: Inter-Processor Communication and Shared Peripherals Lab 3: Locking and Tetris Building Hardware: FPGA Hardware Tools & Build Flow Building/Debugging Software: Software Tools & Build Flow References Q&A – All through out. These are the topics we will be discussing today. After this introduction, I want to discuss the problem we are trying to solve And why this problem occurs And why you might want to do this in FPGAs And of course questions are welcome throughout… Feel free to interrupt, that is fine.

3 The Problem – Integrating Multi-Processor Subsystems
Given a system with multiple processor sub-systems, these architecture decisions must be considered: Inter-processor communication Partitioning/sharing Peripherals (locking required) Bandwidth & Latency Requirements Subsystem 1 Processor Periph 1 Periph 2 Periph 3 ??? What is the problem we are trying to solve? Or what problems do we run into when we integrate processors together? Subsystem 2 Processor Periph 1 Periph 2 Periph 3

4 Why Do We Need to Integrate Multi-Processor Subsystems?
May have inherited processor subsystem from another development team or 3rd party Risk Mitigation by reducing change Fulfill Latency and Bandwidth Requirements Real-time Considerations If main processor not Real-Time enabled, can add a real-time processor subsystem Design partition / Sandboxing Break the system into smaller subsystems to service task Smaller task can be designed easily Leverage Software Resources Sometimes problem is resolved in less time by Processor/Software rather than Hardware design Sequencers, State-machines That was the problem. Why does this problem occur? Now, these are some reasons I have run into, having worked on several design teams. There may be others as well. Here, I wanted to mention my colleague, Vince Bridgers, has a very good presentation on an analysis of the RT Kernel patches.

5 Why do we want to integrate with FPGA. (or rather, HOW can FPGAs help
Bandwidth & Latency can be tailored Addresses Real-time aspects of System Solution FPGA logic has flexible interconnect Trade Data width with clock frequency with latency Experimentation Many processor subsystems can be implemented Allows you to experiment changing microprocessor subsystem hardware designs Altera FPGA under-the-hood However: Generic Linux interfaces used and can be applied in any Linux system. Simple Multiprocessor System A Peripheral ARM Shared Peripheral Mailbox NIOS N Peripheral And, why is Altera involved with Embedded Linux…

6 Why is Altera Involved with Embedded Linux?
With Embedded Processor Without Embedded Processor Design Starts 50% Source: Gartner September 2010 More than 50% of FPGA designs include an embedded processor, and growing. Many embedded designs using Linux Open-source re-use. Altera Linux Development Team actively contributes to Linux Kernel

7 SoCKit Board Architecture Overview
Lab focus UART DDR3 LEDs Buttons We are Raffling the board!

8 SoC/FPGA Hardware Architecture Overview
DDR ARM-to-FPGA Bridges Data Width configurable FPGA 42K Logic Macros Using no more than 14% A9 A9 EMIF DMA I$ D$ I$ D$ ROM UART L2 RAM SD/MMC AXI Bridge FPGA2HPS AXI Bridge HPS2FPGA AXI Bridge LWHPS2FPGA FPGA Fabric “Soft Logic” 32/64/128 32 32/64/128 SYS ID RAM GPIO 32 NIOS

9 Lab 1: Getting Started Booting Linux and Boot-strapping NIOS
Topics Covered: Configuring FPGA from SD/MMC and U-Boot Booting Linux on ARM Cortex-A9 Configuring Device Tree Resetting and Booting NIOS Processor Building and compiling simple Linux Application Key Example Code Provided: C code for downloading NIOS code and resetting NIOS from ARM Using U-boot to set ARM peripheral security bits Full step-by-step instructions are included in lab manual.

10 Lab 1: Hardware Design Overview
Subsystem 1 NIOS Subsystem 1 NIOS Gen 2 processor 64k combined instruction/data RAM (On-Chip RAM) GPIO peripheral ARM Subsystem 2 Cortex-A9 (only using 1) DDR3 External Memory SD/MMC Peripheral UART Peripheral SD/MMC EMIF Cortex-A9 UART Subsystem 2 RAM NIOS 0 GPIO Shared Peripherals Dedicated Peripherals

11 Lab1: Programmer View - Processor Address Maps
NIOS ARM Cortex-A9 Address Base Peripheral 0xFFC0_2000 ARM UART 0x0003_0000 GPIO (LEDs) 0x0002_0000 System ID 0x0000_0000 On-chip RAM Address Base Peripheral 0xFFC0_2000 UART 0xC003_0000 GPIO (LEDs) 0xC002_0000 System ID 0xC000_0000 On-chip RAM

12 Lab 1: Peripheral Registers
Address Offset Access Bit Definitions Sys ID 0x0 RO [31:0] – System ID. Lab Default = 0x00001ab1 GPIO R/W [31:0] – Drive GPIO output. Lab Uses for LED control, push button status and NIOS processor resets (from ARM). [3:0] - LED 0-3 Control. ‘0’ = LED off . ‘1’ = LED on [4] – NIOS 0 Reset [5] – NIOS 1 Reset [1:0] – Push Button Status UART 0x14 Line Status Register [5] – TX FIFO Empty [0] – Data Ready (RX FIFO not-Empty) 0x30 Shadow Receive Buffer Register [7:0] – RX character from serial input 0x34 Shadow Transmit Register [7:0] – TX character to serial output

13 Lab 1: Processor Resets Via Standard Linux GPIO Interface
int main(int argc, char** argv) { int fd, gpio=168; char buf[MAX_BUF]; /* Export: echo ### > /sys/class/gpio/export */ fd = open("/sys/class/gpio/export", O_WRONLY); sprintf(buf, "%d", gpio); write(fd, buf, strlen(buf)); close(fd); /* Set direction to Out: */ /* echo "out“ > /sys/class/gpio/gpio###/direction */ sprintf(buf, "/sys/class/gpio/gpio%d/direction", gpio); fd = open(buf, O_WRONLY); write(fd, "out", 3); /* write(fd, "in", 2); */ /* Set GPIO Output High or Low */ /* echo 1 > /sys/class/gpio/gpio###/value */ sprintf(buf, "/sys/class/gpio/gpio%d/value", gpio); write(fd, "1", 1); /* write(fd, "0", 1); */ /* Unexport: echo ### > /sys/class/gpio/unexport */ fd = open("/sys/class/gpio/unexport", O_WRONLY); } NIOS resets connected to GPIO GPIO driver uses /sys/class/gpio interface

14 NIOS RAM address accessed via mmap()
Lab 1: Loading External Processor Code Via Standard Linux shared memory (mmap) /* Map Physical address of NIOS RAM to virtual address segment with Read/Write Access */ fd = open("/dev/mem", O_RDWR); load_address = mmap(NULL, 0x10000, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0xc ); /* Set size of code to load */ load_size = sizeof(nios_code)/sizeof(nios_code[0]); /* Load NIOS Code */ for(i=0; i < load_size ;i++) { *(load_address+i) = nios_code[i]; } /* Set load address segment to Read-Only */ mprotect(load_address, 0x10000, PROT_READ); /* Un-map load address segment */ munmap(load_address, 0x10000); NIOS RAM address accessed via mmap() Can be shared with other processes R/W during load Read-only protection after load

15 Lab 2: Mailboxes NIOS/ARM Communication
Topics Covered: Altera Mailbox Hardware IP Key Example Code Provided: C code for sending/receiving messages via hardware Mailbox IP NIOS & ARM C Code Simple message protocol Simple Command parser Full step-by-step instructions are included in lab manual. User to add second NIOS processor mailbox control.

16 Lab 2: Hardware Design Overview
Subsystem 1 NIOS 0 & 1 Subsystems NIOS Gen 2 processor 64k combined instruction/data RAM GPIO (4 out, LED) GPIO (2 in, Buttons) Mailbox ARM Subsystem 2 Cortex-A9 (only using 1) DDR3 External Memory SD/MMC Peripheral UART Peripheral SD/MMC EMIF Cortex-A9 UART GPIO Subsystem 2 Subsystem 3 MBox RAM MBox RAM NIOS 0 NIOS 1 GPIO GPIO Shared Peripherals Dedicated Peripherals

17 Lab2: Programmer View - Processor Address Maps
NIOS 0 & 1 ARM Cortex-A9 Address Base Peripheral 0xFFC0_2000 ARM UART 0x0007_8000 Mailbox (from ARM) 0x0007_0000 Mailbox (to ARM) 0x0005_0000 GPIO (In Buttons) 0x0003_0000 GPIO (Out LEDs) 0x0002_0000 System ID 0x0000_0000 On-chip RAM Address Base Peripheral 0xFFC0_2000 UART 0x0007_8000 Mailbox (to NIOS 1) 0x0007_0000 Mailbox (from NIOS 1) 0x0006_8000 Mailbox (to NIOS 0) 0x0006_0000 Mailbox (from NIOS 0) 0xC003_0000 GPIO (LEDs) 0xC002_0000 System ID 0xC001_0000 NIOS 1 RAM 0xC000_0000 NIOS 0 RAM

18 Lab 2: Additional Peripheral (Mailbox) Registers
Address Offset Access Bit Definitions Mailbox 0x0 R/W [31:0] – RX/TX Data 0x8 [1] – RX Message Queue Has Data [0] – TX Message Queue Empty

19 LAB 2: Designing a Simple Message Protocol
Design Decisions: Short Length: A single 32-bit word Human Readable Message transactions are closed-loop. Includes ACK/NACK Format: Message Length: Four Bytes First Byte is ASCII character denoting message type. Second Byte is ASCII char from 0-9 denoting processor number. Third Byte is ASCII char from 0-9 denoting message data, except for ACK/NACK. Fourth Byte is always null character ‘\0’ to terminate string (human readable). Byte 0 Byte 1 Byte 2 Byte3 ‘L’ ‘0’ ‘\0’ ‘A’ Message Types: “G00”: Give Access to UART (Push) “A1A”: ACK “N1A”:NACK Can be Extended: “L00”: LED Set/Ready “B00”: Button Pressed “R00”: Request UART Access (Pull) “G00” Cortex-A9 NIOS 0 “A0A” “N0N”

20 Lab 2: Inter-Processor Communication with Mailbox HW Via Standard Linux Shared Memory (mmap)
/* Map Physical address of Mailbox to virtual address segment with Read/Write Access */ fd = open("/dev/mem", O_RDWR); mbox0_address = mmap(NULL, 0x10000, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0xff260000); <snip> /* Waiting for Message Queue to empty */ while((*(volatile int*)(mbox0_address+0x2000+2) & 1) != 0 ) {} /* Send Granted/Go message to NIOS */ send_message = "G00"; *(mbox0_address+0x2000) = *(int *)send_message; /* Disable ARM/Linux Access to UART (be careful here)*/ config.c_cflag &= ~CREAD; if(tcsetattr(fd, TCSAFLUSH, &config) < 0) { } /* Wait for Received Message */ while((*(volatile int*)(mbox0_address+2) & 2) == 0 ) {} /* Re-enable UART Access */ config.c_cflag |= CREAD; tcsetattr(fd, TCSAFLUSH, &config); /* Read Received Message */ printf(" - Message Received. DATA = '%s'.\n", (char*)(mbox0_address)); Wait for Mailbox Hardware message empty flag Send message (4 bytes) Disable ARM/Linux Access to UART Wait for RX message received flag Re-enable ARM/Linux UART Access

21 Lab 3: Putting It All Together – Tetris
Lab 3: Putting It All Together – Tetris! Combining Locking and Communication Topics Covered: Linux Mutex Key Example Code Provided: C code showcasing using Mutexes for locking shared peripheral access C code for multiple processor subsystem bringup and shutdown Full step-by-step instructions are included in lab manual. User to add code for second NIOS processor bringup, shutdown and locking/control.

22 Lab 3: Hardware Design Overview (Same As Lab 2)
Subsystem 1 NIOS 0 & 1 Subsystems NIOS Gen 2 processor 64k combined instruction/data RAM GPIO (4 out, LED) GPIO (2 in, Buttons) Mailbox ARM Subsystem 2 Cortex-A9 (only using 1) DDR3 External Memory SD/MMC Peripheral UART Peripheral SD/MMC EMIF Cortex-A9 UART GPIO Subsystem 2 Subsystem 3 MBox RAM MBox RAM NIOS 0 NIOS 1 GPIO GPIO Shared Peripherals Dedicated Peripherals

23 Lab 3: Programmer View - Processor Address Maps
NIOS 0 & 1 ARM Cortex-A9 Address Base Peripheral 0xFFC0_2000 ARM UART 0x0007_8000 Mailbox (from ARM) 0x0007_0000 Mailbox (to ARM) 0x0005_0000 GPIO (In Buttons) 0x0003_0000 GPIO (Out LEDs) 0x0002_0000 System ID 0x0000_0000 On-chip RAM Address Base Peripheral 0xFFC0_2000 UART 0x0007_8000 Mailbox (to NIOS 1) 0x0007_0000 Mailbox (from NIOS 1) 0x0006_8000 Mailbox (to NIOS 0) 0x0006_0000 Mailbox (from NIOS 0) 0xC003_0000 GPIO (LEDs) 0xC002_0000 System ID 0xC001_0000 NIOS 1 RAM 0xC000_0000 NIOS 0 RAM

24 Available Linux Locking/Synchronization Mechanisms
Need to share peripherals Choose a Locking Mechanism Available in Linux Mutex <- Chosen for this Lab Completions Spinlocks Semaphores Read-copy-update (decent for multiple readers, single writer) Seqlocks (decent for multiple readers, single writer) Available for Linux MCAPI - openmcapi.org

25 Tetris Message Protocol – Extended from Lab 2
NIOS 0 “B00” Cortex-A9 NIOS Control Flow: Wait for button press Send Button press message Wait for ACK (Free to write to LED GPIO) Write to LED GPIO Send LED ready msg Wait for ACK ARM Control Flow: Wait for button press message Lock LED GPIO Peripheral Send ACK (Free to write to LED GPIO) Wait for LED ready msg Send ACK Read LED value Release Lock/Mutex “A0A” “L00” “A0A” NIOS 1 “B10” “A1A” “L10” “A1A”

26 Lab 3: Locking Hardware Peripheral Access Via Linux Mutex
pthread_mutex_t lock; <snip – Initialize/create/start> /* Initialize Mutex */ err = pthread_mutex_init(&lock, NULL); /* Create 2 Threads */ i=0; while(i < 1) { err = pthread_create(&(tid[i]), NULL, &nios_buttons_get, &(nios_num[i])); i++; } <snip – Critical Section> pthread_mutex_lock(&lock); /* Critical Section */ pthread_mutex_unlock(&lock); <snip Stop/Destroy> /* Wait for threads to complete */ pthread_join(tid[0], NULL); pthread_join(tid[1], NULL); /* Destroy/remove lock */ pthread_mutex_destroy(&lock); In this example, LED GPIO is accessed by multiple processors Wrap LED critical section (LED status reads) with: pthread_mutex_lock() pthread_mutex_unlock() Also need Mutex init/destroy: pthread_mutex_init() pthread_mutex_destroy()

27 References

28 Altera References System Design Tutorials:
Designing_with_AXI_for_Altera_SoC_ARM_Devices_Workshop_Lab Simple_HPS_to_FPGA_Comunication_for_Altera_SoC_ARM_Devices_Workshop Multiprocessor NIOS-only Tutorial: Quartus Handbook: Qsys: System Design with Qsys (PDF) section in the Handbook Qsys Tutorial: Step-by-step procedures and design example files to create and verify a system in Qsys Qsys 2-day instructor-led class: System Integration with Qsys Qsys webcasts and demonstration videos SoC Embedded Design Suite User Guide:

29 Related Articles Performance Analysis of Inter-Processor Communication Methods Communicating Efficiently between QorlQ Cores in Medical Applications Linux Inter-Process Communication: Linux locking mechanisms (from ARM): OpenMCAPI: Mutex Examples:

30 Full Tutorial Resources Online
Project Wiki Page: Includes: Source code Hardware source Hardware Quartus Projects Software Eclipse Projects

31 BACKUP SLIDES

32 Post-Lab 1 Additional Topics
Hardware Design Flow and FPGA Boot with U-boot and SD/MMC

33 Building Hardware: Qsys (Hardware System Design Tool) User Interface
Interfaces Exported In/out of system Connections between cores

34 Hardware and Software Work Flow Overview
Quartus& Qsys Preloader & U-Boot Eclipse DS-5 & Debug Tools Device Tree RBF Inputs: Hardware Design (Qsys or RTL or Both) Outputs (to load on boot media): Preloader and U-boot Images FPGA Programmation File: Raw Binary Format (RBF) Device Tree Blob

35 SDCARD Layout Partition 1: FAT Partition 2: EXT3 – Rootfs
Uboot scripts FPGA HW Designs (RBF) Device Tree Blobs zImage Lab material Partition 2: EXT3 – Rootfs Partition 3: Raw Uboot/preloader Partition 4: EXT3 – Kernel src

36 Updating SD Cards More info found on Rocketboards.org
File Update Procedure zImage Mount DOS SD card partition 1 and replace file with new one: $ sudo mkdir sdcard $ sudo mount /dev/sdx1 sdcard/  $ sudo cp <file_name> sdcard/  $ sudo umount sdcard soc_system.rbf soc_system.dtb u-boot.scr preloader-mkpimage.bin $ sudo dd if=preloader-mkpimage.bin of=/dev/sdx3 bs=64k seek=0 u-boot-socfpga_cyclone5.img $ sudo dd if=u-boot-socfpga_cyclone5.img of=/dev/sdx3 bs=64k seek=4 root filesystem $ sudo dd if=altera-gsrd-image-socfpga_cyclone5.ext3 of=/dev/sdx2 More info found on Rocketboards.org Automated Python Script to build SD Cards: make_sdimage.py

37 Post-Lab 2 Additional Topic
Using Eclipse to Debug: NIOS Software Build Tools

38 Altera NIOS Software Design and Debug Tools
Nios II SBT for Eclipse key features: New project wizards and software templates Compiler for C and C++ (GNU) Source navigator, editor, and debugger Eclipse project-based tools Download code to hardware

39 Key Multi-Processor System Design Points
Startup/Shutdown Processor Peripheral Covered in Lab 1. Communication between processors What is the physical link? What is the protocol & messaging method? Message Bandwidth & Latency Covered in Lab 2 Partitioning peripherals Declare dedicated peripherals – only connected/controlled by one processor Declare shared peripherals – Connected/controlled by multiple processors Decide Upon Locking Mechanism Covered in Lab 3

40 Post Lab 3 Additional Topic
Altera SoC Embedded Design Suite

41 Altera Software Development Tools
Eclipse For ARM Cortex-A9 (ARM Development Studio 5 – Altera Edition) For NIOS Pre-loader/U-Boot Generator Device Tree Generator Bare-metal Libraries Compilers GCC (for ARM and NIOS) ARMCC (for ARM with license) Linux Specific Kernel Sources Yocto & Angstrom recipes: Buildroot:

42 System Development Flow
FPGA Design Flow Software Design Flow Hardware Development Software Development Quartus II design software Qsys system integration tool Standard RTL flow Altera and partner IP Eclipse GNU toolchain OS/BSP: Linux, VxWorks Hardware Libraries Design Examples Design Simulate ModelSim, VCS, NCSim, etc. AMBA-AXI and Avalon bus functional models (BFMs) Debug SignalTap™ II logic analyzer System Console GDB, Lauterbach, Eclipse Release Quartus II Programmer In-system Update Flash Programmer

43 Inside the Golden System Reference Design
Complete system example design with Linux software support Target Boards: Altera SoC Development Kits Arrow SoC Development Kits Macnica SoC Development Kits Hardware Design: Simple custom logic design in FPGA All source code and Quartus II / Qsys design files for reference Software Design: Includes Linux Kernel and Application Source code Includes all compiled binaries

44 ---Topics – Back Up--- Introductions: Altera and SoC FPGAs
Development Tools How to Build Hardware: FPGA Hardware Tools & Build Flow How to Build Software: Software Tools & Build Flow How to Debug all-of-the-above: Debug Tools Key Multi-processor System Design Points Hardware design Shared peripherals Available Hardware IP Software design Message Protocols Linux tools/mechanism available today

45 Quartus – Hardware Development Tool

46 Quartus II User Interface
Quartus II main window provides a high level of visibility to each stage of the design flow Project navigator provides direct visual access to most of the key project information Tasks window allows you to use the tools and features of the Quartus II software and monitor their progress from a flow-based layout Tool View window shows various tools and design files Messages window outputs messages from each process of the run Project Navigator Tool View window Tasks window Altera tools have the same look and feel; timing analyzer, Qsys… Messages window

47 Typical Hardware Design Flow
In-system debug Functional verification Design creation Project creation Project definition Design compilation Logic I/O Memory Design entry/RTL coding and early pin planning Behavioral or structural description of design Early pin planning allows board development in parallel Functional verification Verify design behavior Synthesis (mapping) Translate design into device-specific primitives Optimization to meet required area and performance constraints Placement and routing (fitting) Place design in specific device resources with reference to area and performance constraints Connect resources with routing lines Timing analysis Verify performance specifications were met Static timing analysis Functional verification Verify design will work in target technology PC board simulation and test Simulate board design Program and test device on board On-chip tools for debugging

48 Quartus II Feature Overview
In-system debug Functional verification Design creation Project creation Project definition Design compilation Logic I/O Memory Fully integrated development tool Multiple design entry methods Includes intellectual property- (IP-) based system design Up-front I/O assignment and validation Enables printed circuit board (PCB) layout early in the design process Incremental compilation Reduces design compilation and improves timing closure Logic synthesis Includes comprehensive integrated synthesis solution Advanced integration with third-party EDA synthesis software Timing-driven placement and routing Physical synthesis Improves performance without user intervention Verification solution TimeQuest timing analyzer PowerPlay power analysis and optimization Functional simulation On-chip debug and verification suite

49 Quartus II Feature Overview (1/2)
Quartus II Software Project creation New project wizard Design entry HDL editor Schematic editor State machine editor MegaWizard™ Plug-In Manager Customization and generation of IP Qsys system integration tool Design constraint assignments Assignment editor Pin planner Synopsys Design Constraint (SDC) editor Synthesis Quartus II Integrated Synthesis (QIS) Third-party EDA synthesis Design assistant Fitting and placing design into FPGA to meet user requirements Fitter (including physical synthesis) Design analysis and debug Netlist viewers Advisors Power analysis PowerPlay power analyzer

50 Quartus II Feature Overview (2/2)
Quartus II Software Static timing analysis on post-fitted design TimeQuest timing analyzer Viewing and editing design placement Chip Planner Functional verification ModelSim®-Altera edition Third-party EDA simulation tools Generation of device programming file Assembler On-chip debug and verification SignalTapTM II (embedded logic analyzer) In-system memory content editor Logic analyzer interface editor In-system sources and probes editor SignalProbe pins Transceiver Toolkit External memory interface toolkit Technique to optimize design and improve productivity Quartus II incremental compilation Physical synthesis optimization Design Space Explorer (DSE)

51 Quartus II Subscription Edition vs. Web Edition
Device supported MAX series devices: All (Excluding MAX7000 / 3000) Cyclone III/IV/V FPGAs: All Arria II/V FPGAs: All Stratix III, IV, V FPGAs: All Cyclone V SoCs: All Cyclone V FPGAs: All (Excluding 5CEA9, 5CGXC9, and 5CGTD9) Cyclone III/IV FPGAs: All Arria II GX FPGA: EP2AGX45 Software features: Incremental compilation and team-based design SSN Analyzer Transceiver Toolkit Yes No SignalTap II, SignalProbe If TalkBack feature is enabled Multi-processor support IP Base Suite MegaCore® functions No license required for OpenCore Plus hardware evaluation License fee required for production use Platform support Windows 32/64-bit Linux 32/64-bit License and maintenance terms Perpetual (continues to work after expiration) No license required except for IP core Price $ Free

52 How to Get Started Using Quartus II Software
Download Quartus II software today and start designing with Altera programmable logic devices Quartus II Handbook - Guides you through the programmable logic design cycle from design to verification Also covers third-party EDA vendor tool interfaces Online demonstrations - Easiest way to learn about the latest Quartus II software features and design flows Training classes - Offers online training classes and live presentation coupled with hands-on exercises to learn about Quartus II features and design flows Agenda

53 Qsys – System Integration Platform

54 Qsys System Integration Platform
High-Performance Interconnect Based on Network-on-a-Chip (NoC) Architecture Hierarchy Design Reuse Design System Add to Library Package as IP Automated Testbench Generation Industry-Standard Interfaces Avalon® Interfaces AMBA® AXI, APB, AHB Real-Time System Debug Qsys is Altera’s design environment for Deployment of IP, with hierarchal support Development platform for Altera custom solutions Design platform for customers to quickly create system designs

55 Improved Validation Display
Qsys User Interface Interfaces Exported for Hierarchy Toolbar Improved Validation Display

56 Qsys improves productivity
Qsys Benefits Raises the level of design abstraction System-level design and system visualization Simplifies complex hierarchal system development Automated interconnect generation Provides a standard platform IP integration, custom IP authoring, IP verification Enables design re-use Reduces time to market System-level design reduces development time Facilitates verification Qsys improves productivity

57 Network-on-Chip Architecture
Transaction Layer Converts transactions to command packets and responses packets to responses Transport Layer Transfers packets to destination Transaction Layer Converts command packets to transactions and responses to response packets Master Network Interface Avalon-ST Avalon-MM AXI-MM Avalon-MM AXI-MM Avalon ST (Response) (Command) Slave Packet transactions and transport Each command encapsulated in a packet to be sent to a slave Each response encapsulated in a packet to be sent back to a master

58 Benefits of Network-On-Chip Approach
See white paper: Applying the Benefits of NoC Architecture to FPGA System Design Independent implementation of transaction/transport layers Different transport layer network topologies can be implemented without transaction layer modification e.g. High performance components on a wide high-frequency crossbar network Supports standard interface interoperability Mix and match interface types on transaction layer without transport layer modification Scalability Segment network into sub-networks using Bridges Clock crossing logic

59 Industry-Standard Interfaces
Developer Standard Interface Protocol Avalon® Interfaces AMBA® AXI, AMBA APB, and AMBA AHB Many IP cores use industry standard interfaces for easier integration. However it is unlikely that IP vendors, FPGA vendors and FPGA designers like you all use the same industry standard interface. So that’s why Qsys provides an interconnect that supports mixing of different interfaces. For example, here you have 3 masters and 3 slaves. There is a mix of standard interfaces, some using Avalon interfaces and some using AXI. The Qsys interconnect, based on a high performance network on a chip architecture, is able to create and transport packets between the masters and slaves. These packets encapsulate AXI or Avalon based transactions. Qsys supports mixing of different interfaces

60 Target Qsys Applications
Qsys can be used in almost every FPGA design Designs fall into two categories Control plane Memory mapped Reading and writing to control and status registers Data plane Streaming Data switching (muxing, demuxing), aggregation, bridges

61 “Packets………I care about Latency!”
Qsys packet format is wide Packet format contains a complete transaction in a single clock cycle Supports: Writes with 0 cycles of latency Reads with a round-trip latency of 1 cycle You can control latency via Qsys configuration Separate command and response network Increases concurrency Command traffic and Response traffic don’t compete for resources

62 Qsys: Wide Range of Compliant IP
Wide range of plug-and-play intellectual property (IP): Interface protocol IP E.g. PCIe, Ethernet 10/100/1000 Mbps (Triple-Speed Ethernet), Interlaken, JTAG, UART, SPI External memory interface IP E.g. DDR/DDR2/DDR3 Video and imaging processing (VIP) IP E.g. VIP Suite including scaler, switch, deinterlacer, and alpha blending mixer Embedded processor IP E.g. Hardened ARM processor system, Nios II processor Verification IP E.g. Avalon-MM/-ST, AXI4, APB So what standard cores are available? Well, Qsys offers a wide range of plug ‘n play IP also known as Qsys compliant IP cores. Qsys compliant cores range from interface protocol IPs like PCI express to memory IPs like DDR3 to Video and imaging processing IPs like the scaler or deinterlacer IP cores. In the embedded IP world, the Nios II software processors, JTAG, UART, SPI, and RS232 IP are all Qsys compliant. Today, there are well over 100 Qsys compliant IP cores with more to come. >100 Qsys compliant IP available

63 Qsys as a Platform for System Integration
Library of Available IP Connect IP and Systems Interface protocols Memory DSP Embedded Bridges PLL Custom systems IP 1 Custom 1 IP 2 IP 3 Custom 2 So we just looked at the wide variety of Qsys compliant IP, and how these IP cores can help you accelerate your development. So now lets look at how Qsys can help you with your glue logic. First of all, how much fun is it to connect hundreds of signals and buses between your multiple design blocks? And if you’ve done it in the past, how many times did you end up going back to fix an error? I think you’ll agree that integrating multiple design blocks together can be a pretty mundane and error-prone process. So why not let Qsys do some, or all of the integration work for you. Qsys provides an intuitive GUI interface that allows you to make connections between different blocks, effectively connecting many signals and busses together, with a single mouse click. After you’ve described the connections between your design blocks, Qsys will automatically generate the interconnect for you and the interconnect HDL. Accelerate Development HDL Simplify Integration Automate Error-Prone Integration Tasks

64 Additional Resources Watch online demos (3-5 min) www.altera.com/qsys
Complete the Qsys tutorial (2-3 hrs) Watch free webcasts (10-15 mins) Sign up for Qsys training To learn more about Qsys, you can watch short online demos, download and complete the qsys tutorial or sign up for qsys technical training. You can find all this and much more on

65 In-system Verification

66 Debug Can Be Costly Debug Challenges
Accessing and viewing internal signals Not enough pins to use as test points Capabilities in creating trigger conditions that correctly capture data Verification of standard or proprietary protocol interfaces Overall design process bottleneck Where do designers spend more time debugging ? Our FPGA customers told us that a big portion of the debug time after diagnosing the functional failure is spent selecting the design signals of interest, and in creating trigger conditions. FPGA designs now include standard protocol interfaces at the board level. This is another area of focus for Altera. Debug Can Be Costly

67 Reduce Debug Cycles by Using On-chip Debug Tools
Access and view internal signals Store captured data in FPGA embedded memory Use JTAG interface as debug ports Incrementally add internal signals to view Reduce Debug Cycles by Using On-chip Debug Tools

68 On-chip Debug Technology
Debug tools communicate with the FPGA via standard JTAG interface Multiple debug functions can share the JTAG interface simultaneously Altera’s system-level debugging (SLD) hub technology makes this possible All Altera tools and some third-party tools support the SLD hub JTAG interface User's Design (Core Logic) FPGA Node 1 Node 2 Download Cable Node N JTAG Tap Controller SLD Hub Node N-1

69 On-chip Debug Tools in Quartus II Software
SignalTap II logic analyzer Captures and displays hardware events, fast turnaround times Incrementally creates trigger conditions and adds signals to view Uses captured data stored in on-chip RAM and JTAG interface for communication In-system memory content editor Displays content of on-chip memory Enables modification of memory content in a running system External logic analyzer interface Uses external logic analyzer to view internal signals Dynamically switches internal signals to output In-system sources and probes Stimulate and monitor internal signals without using on-chip RAM Exception: SignalProbe incremental routing feature does not use JTAG interface (i.e. SLD hub technology) Quickly routes an internal node to a pin for observation

70 SignalTap II Logic Analyzer
Provides the most advanced triggering capabilities available in an FPGA-embedded logic analyzer Proven to be invaluable in the lab Captures bugs that would take weeks of simulation to uncover Has broad customer adoption Features and benefits An embedded logic analyzer Uses available internal memory Probes state of internal signals without using external equipment or extra I/O pins Incremental compilation support Fast turnaround time when adding signals to view Advanced triggering for capturing difficult events/transactions Power-up trigger support Debug the initialization code Megafunction support Optionally, instantiate in HDL

71 In-system Memory Content Editor
Enables FPGA memory content and design constants to be updated in-system, via JTAG interface, without recompiling a design or reconfiguring the rest of the FPGA Fault injection into system Update memory while system is running Change value of coefficients in DSP applications Easily perform “what if?” type experiments in-system in just seconds Supports MIF and HEX formats for data interchange Megafunctions supported LPM_CONSTANT, LPM_ROM, LPM_RAM_DQ, ALTSYNCRAM (ROM and single-port RAM mode) Enable memory content editor

72 In-system Memory Content Editor
Under Tools menu  In-system Memory Content Editor

73 Altera SoC Embedded Design Suite

74 Included in SoC Embedded Design Suite (EDS)
Development Studio 5 Altera Edition Awesome debugger, especially when combined with USB Blaster II Altera SoC FPGA System Trace Macrocells Application development environment Streamline system analyzer Hardware Libraries GNU-based bare-metal (EABI) compiler tools U-Boot Root file system to jump start software development Pre-built Linux kernel for source trees and community access

75 System Development Flow
FPGA Design Flow Software Design Flow Hardware Development Software Development Quartus II design software Qsys system integration tool Standard RTL flow Altera and partner IP ARM Development Studio 5 GNU toolchain OS/BSP: Linux, VxWorks Hardware Libraries Design Examples Design Simulate ModelSim, VCS, NCSim, etc. AMBA-AXI and Avalon bus functional models (BFMs) Debug SignalTap™ II logic analyzer System Console GNU, Lauterbach, DS5 Release Quartus II Programmer In-system Update Flash Programmer

76 Altera SoC Embedded Design Suite
FPGA Design Flow Software Design Flow Hardware Development Software Development Quartus II design software Qsys system integration tool Standard RTL flow Altera and partner IP ARM Development Studio 5 GNU toolchain OS/BSP: Linux, VxWorks Hardware Libraries Design Examples Design Software Development HW/SW Handoff Simulate ModelSim, VCS, NCSim, etc. AMBA-AXI and Avalon bus functional models (BFMs) Virtual Target Debug SignalTap™ II logic analyzer System Console GNU, Lauterbach, DS5 FPGA-Adaptive Debugging Release Quartus II Programmer In-system Update Flash Programmer

77 Altera SoC Embedded Design Suite
Firmware Development Hardware-to-Software Handoff Linux Application Development FPGA-Adaptive Debugging Comprehensive Suite SW Dev Tools Hardware / software handoff tools Linux application development Yocto Linux build environment Pre-built binaries for Linux / U-Boot Work in conjunction with the Community Portal Bare-metal application development SoC Hardware Libraries Bare-metal compiler tools FPGA-adaptive debugging ARM DS-5 Altera Edition Toolkit Design examples Free Web Edition Subscription Edition Free 30-day Eval

78 Hardware-to-Software Handoff
Qsys system info, SDRAM calibration files, ID / timestamp, HPS IOCSR data Hardware system.iswinfo system.sopcinfo Preloader Generator Device Tree Generator .c & .h source files Linux Device Tree Software

79 Hardware / Software Handoff Tools
Allow hardware and software teams to work independently and follow their familiar design flows Take Altera Quartus® II / Qsys output files and generate handoff files for the software design flow Device Tree standard specifies hardware connectivity so that Linux kernel can boot up correctly

80 Linux Application Development
Yocto build support for Linux Yocto standard enables open, versatile, and cost-effective embedded software development Allows a smooth transition to commercial Linux distributions Pre-built Linux kernel, U-Boot, and root file system to jump start software development Link to community portal for source trees and community access

81 Bare-metal Application Development
Hardware Libraries Software interface to all system registers Functions to configure some basic system operations (e.g. clock speed settings, cache settings, FPGA configuration, etc.) Support board bring-up and diagnostics development Can be used by bare-metal application, device drivers, or RTOS GNU-based bare-metal (EABI) compiler tools Operating System SoC FPGA Application HAL BMAL PAL BSP Bare-metal App Hardware Libraries

82 Golden System Reference Design
Complete system design with Linux software support Simple custom logic design in FPGA All source code and Quartus II / Qsys design files for reference Include all compiled binaries- example can run on an Altera SoC Development Kit to jumpstart development

83 DS-5 Altera Edition- One Tool, Three Usages
1 2 3 JTAG-Based Debugging Board Bring-up OS porting, Drivers Dev, Kernel Debug Application Debugging Linux User Space Code RTOS App Code System Integration System Debug FPGA-Adaptive Debugging

84 One Device, Two Debugging Tools?
ARM® DS-5™ Toolkit Altera Quartus™ II Software JTAG DSTREAM™ JTAG Dedicated JTAG connection Visualize & control CPU subsystem Dedicated JTAG connection Visualize & control FPGA

85 One Device, Two Debugging Tools?
ARM® DS-5™ Toolkit Altera Quartus™ II Software Debugging Barrier No single tool/cable to visualize and control both CPU and FPGA domains No way for CPU and FPGA to cross trigger and correlate hardware and software events No “fixed” debugger can address the needs of “changing” FPGA hardware JTAG DSTREAM™ JTAG Dedicated JTAG connection Visualize & control CPU subsystem Dedicated JTAG connection Visualize & control FPGA

86 Industry First: FPGA-Adaptive Debugging
Altera USB-Blaster™II Connection ARM® Development Studio 5 (DS-5™) Altera® Edition Toolkit Removes debugging barrier between CPUs and FPGA Exclusive OEM agreement between Altera and ARM Result of innovation in silicon, software, and business model

87 FPGA-Adaptive Debugging Features
Single USB-Blaster II cable for simultaneous SW and HW debug Automatic discovery of FPGA peripherals and creation of register views Hardware cross-triggering between the CPU and FPGA domains Correlation of CPU software instructions and FPGA hardware events Simultaneous debug and trace for Cortex-A9 cores and CoreSight™-compliant cores in FPGA Statistical analysis of software load and bus traffic spanning the CPUs and FPGA

88 DS-5 Altera Edition Productivity-Boosting Features
Industry’s most advanced multicore debugger for ARM JTAG based system-level debugging, gdbserver-based application debugging in one package Yocto plugin to enable Linux based application development Integrated OS-aware analysis and debug capability

89 Visualization of SoC Peripherals
Register views assist the debug of FPGA peripherals File generated by FPGA tool flow Automatically imported in DS-5 Debugger Debug views for debug of software drivers Self-documenting Grouped by peripheral, register and bit-field CMSIS Peripheral register descriptions

90 FPGA-Adaptive, Unified Debugging
FPGA connected to debug and trace buses for non-intrusive capture and visualization of signal events Simultaneous debug and trace connection to CPU cores and compatible IP Correlate FPGA signal events with software events and CPU instruction trace using triggers and timestamps

91 Cross-Domain Debug 1 Trigger from software world to FPGA world
SOFTWARE TRIGGER HARDWARE TRIGGER!

92 Cross-Domain Debug 2 Trigger from FPGA world to software world
HARDWARE TRIGGER EXECUTION STOP OR HW TRACE TRIGGER EXECUTION STOP OR SW TRACE TRIGGER

93 Correlate HW and SW Events
Debug event trigger point set from either: SignalTap™ II Logic Analyzer or DS-5 debugger Captured trace can then be analyzed using timestamp-correlated events ARM® DS-5™ Toolkit Timestamp Correlated SignalTap II Logic Analyzer

94 System-Level Performance Analysis
Performance bottlenecks in SoCs often come from the CPU interaction with the rest of the SoC Streamline visualizes software activity with performance counters from the SoC and FPGA to enable full system-level analysis Streamline only requires a TCP/IP connection to the SoC ARM® DS-5™ Streamline Linux OS Counters Processor Counters, Aggregated, or Per Core Power Consumption FPGA Block Counters Process/Thread Heat Map Application Events

95 Altera SoC EDS- Key Benefits
One-stop shop from Altera All the tools and examples for rapid starts Familiar tools interface, easy to use Share tools and knowledge to increase team productivity Best multicore debugger tools for ARM architecture Unprecedented visibility and control across processor cores and across CPU, FPGA domains Faster time to market, lower development costs! Click to add notes DS-5 has a large ecosystem

96 Target Users and Usages
Web Edition Subscription Board Bring-up Yes Device Drivers Dev OS Porting Baremetal Programming RTOS Based App Dev Linux Based App Dev Multicore App Debugging System Debugging

97 SoC EDS Editions Summary
Component Key Feature Web Edition Subscription Edition 30-Day Evaluation Hardware/Software Handoff Tools Preloader Image Generator x Flash Image Creator Device Tree Generator (Linux) ARM DS-5 Altera Edition Eclipse IDE ARM Compiler* Debugging over Ethernet (Linux) Debugging over USB-Blaster II JTAG Automatic FPGA Register Views Hardware Cross-triggering CPU/FPGA Event Correlation Compiler Tool Chains Linaro Tool Chain (Linux) CodeBench Lite EABI (Bare-metal) Hardware Libraries Bare-metal programming Support SoC Programming Examples Golden System Reference Design *ARM Compiler is available in DS-5 Professional Edition, available directly from ARM

98 Coordinated Multi-Channel Delivery
Altera.com Quartus II Programmer SignalTap II Altera.com Pre-built Binaries Kernel U-Boot Yocto Minimal RFS Tool chains Handoff tools HW Libraries Examples Documentation RocketBoards.org Frequent Updates Kernel source U-Boot source Yocto source RFS source Toolchain source Public git Wiki Mailman Partners BSPs Middleware 3rd Party Tools

99 Altera NIOS Software Design Tools
Nios II SBT for Eclipse key features: New project wizards and software templates Compiler for C and C++ (GNU) Source navigator, editor, and debugger Eclipse project-based tools


Download ppt "Chris Martin Member Technical Staff Embedded Applications"

Similar presentations


Ads by Google