Survey of multicore architectures Marko Bertogna Scuola Superiore S.Anna, ReTiS Lab, Pisa, Italy.


Summary
- CELL processor
- Reconfigurable devices
- Software-Hardware co-design
- Parallel programming problems
  - data dependencies
  - process synchronization
  - memory barriers
  - locking mechanisms
- Language extensions for parallel programming
- Real-time multiprocessor scheduling

Cell processor A Cell Processor

Cell History

Cell basic concepts

Cell synergy

Cell Chip

Cell features

Cell Processor Components

Synergistic Processor Element (SPE)

SPE

SPE details

Element Interconnect Bus (EIB)

EIB: Data topology

Example: 8 concurrent transactions

Theoretical peak operations

Cell BE performance

Why is Cell Processor so fast?

CELL software environment

System Level Simulator

SPE management library

CELL parallelism

Typical CELL sw development flow

ARM’s MPCore

PicoArray (by PicoChip)

PicoArray scaling

FPGA and Reconfigurable devices

Field Programmable Gate Arrays
- SRAM-based matrix of integrated elements whose interconnections can be programmed statically or even dynamically
- Basic block is the Logic Element (LE)
- Chip capacities range from 1k to 1000k LEs
- Each LE is typically composed of logic gates, LUTs, flip-flops and latches
- Need for optimized CAD tools or pre-bound design libraries
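To make the role of the LUT inside a Logic Element more concrete, the following is a minimal sketch, not taken from the slides, that models a k-input look-up table as a stored truth table. The names `make_lut` and `lut_read` are hypothetical helpers introduced only for illustration, and real LEs also contain flip-flops and routing that this model ignores.

```python
from itertools import product

def make_lut(func, k):
    """Build the 2**k-entry truth table for a k-input boolean function.

    This stands in for the SRAM configuration bits of one LUT (assumption:
    the first input is treated as the most significant address bit).
    """
    return [int(bool(func(*bits))) for bits in product((0, 1), repeat=k)]

def lut_read(table, *inputs):
    """Evaluate the LUT: the input bits form the address into the table."""
    address = 0
    for bit in inputs:
        address = (address << 1) | (bit & 1)
    return table[address]

# A 3-input LUT configured as the sum bit of a full adder (a XOR b XOR cin).
sum_lut = make_lut(lambda a, b, cin: a ^ b ^ cin, 3)

# The same structure, loaded with a different table, gives the carry output.
carry_lut = make_lut(lambda a, b, cin: (a & b) | (cin & (a ^ b)), 3)

if __name__ == "__main__":
    for a, b, cin in product((0, 1), repeat=3):
        print(a, b, cin, "->", lut_read(sum_lut, a, b, cin),
              lut_read(carry_lut, a, b, cin))
```

The point of the sketch is that "programming" the device only changes the stored tables, not the evaluation logic, which is why the same LE can implement any function of its inputs.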

FPGA CSL organization: Basic Logic Element:

Altera’s Stratix IV basic block: Adaptive Logic Module (ALM)

Flexibility vs efficiency

Reconfigurable devices advantages
- Efficiency AND Flexibility
- Time to market
- Easier upgrade
- Lower cost (in volume production)
- Reusable IP
- Customizable interface

Reconfigurable devices parameters
- Block granularity
  - Coarse grained: Functional Units, Processor Cores, Memory Tiles
  - Fine grained: gate and register level
- Density
- Reconfiguration time
  - Compile-Time Reconfiguration (CTR)
  - Run-Time Reconfiguration (RTR)
- Partial or Total reprogramming

Triscend’s A7S chip

Example: multiplier on Altera’s Stratix IV

Typical FPGA software development environment
- FPGA optimized module library
- IO Editor
- Generate → file.h
- Bind (place and route) → file.csl
- Config → file.cfg
- Download

Typical FPGA module library

Altera’s Nios II
- Nios II is a soft-core processor IP that can be downloaded into an Altera FPGA, obtaining the functionality of a real RISC CPU
- Logic elements are programmed so as to behave like the gates of a classic ASIC processor
- Different Nios versions are available
  - faster and with full functionality → bigger size
  - medium sized
  - compact but slower and with limited functionality

Nios II core

Selecting Nios II e/s/f

Example of a Nios II Processor system

Final global layout

Soft-core processors and FPGAs
- Possible to have multiple cores on a single chip
- Customizable hardware can be used to coordinate the various cores
- Build and test a whole multicore system in less time
- Detect and solve bottlenecks without needing to repeatedly return to the integration phase

Co-design problems with FPGAs
- A task may be executed by a (soft-core or ASIC) processor or may be entirely implemented in hardware on the reconfigurable logic
- “Programming in Space” versus “Programming in Time”
- Centralized vs Distributed computing
- Sequential vs Parallel programming
- Interconnect Network
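One way to picture “Programming in Space” versus “Programming in Time” is to sum a list of values twice: once by reusing a single adder over successive steps, and once by laying out a tree of adders that work side by side. The sketch below is an illustrative model under that assumption, not something the slides specify; `sum_in_time` and `sum_in_space` are hypothetical names, and both expect a non-empty list.

```python
def sum_in_time(values):
    """One adder reused step after step: returns (total, sequential steps)."""
    total, steps = values[0], 0
    for v in values[1:]:
        total += v          # each addition depends on the previous one
        steps += 1
    return total, steps

def sum_in_space(values):
    """A tree of adders laid out side by side.

    Returns (total, adders used, tree levels). All adders on one level
    could operate simultaneously in hardware, so the level count models
    the depth of the circuit rather than the amount of work.
    """
    level, adders, levels = list(values), 0, 0
    while len(level) > 1:
        pairs = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        adders += len(pairs)
        if len(level) % 2:
            pairs.append(level[-1])  # odd element passes through unchanged
        level = pairs
        levels += 1
    return level[0], adders, levels

if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8]
    print(sum_in_time(data))   # total and number of dependent steps
    print(sum_in_space(data))  # total, adder count, and tree depth
```

Both versions perform the same number of additions; the spatial version spends more hardware (one adder per addition) to shorten the chain of dependent operations.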

What is a task in hardware?
Software programming:
    c=a+b;
    result=c/2;
Assembler expansion (5 operations):
    ldr r0,a
    ldr r1,b
    add r0,r0,r1
    mov r0,r0,LSR #1
    str r0,result
Hardware implementation: a, b → adder (+) → c → shifter → result
All in one clock cycle!
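The contrast on this slide can be mimicked in a few lines. The sketch below is only a behavioural model under stated assumptions: operands are non-negative integers, word-size overflow is ignored, and the division by two is written as a logical shift right by one. `run_software` and `run_hardware` are hypothetical names introduced for illustration.

```python
def run_software(a, b):
    """Execute the five-instruction expansion one step at a time.

    Returns (result, number of instructions executed).
    """
    memory = {"a": a, "b": b, "result": None}
    regs = {"r0": 0, "r1": 0}
    program = [
        lambda: regs.update(r0=memory["a"]),              # ldr r0, a
        lambda: regs.update(r1=memory["b"]),              # ldr r1, b
        lambda: regs.update(r0=regs["r0"] + regs["r1"]),  # add r0, r0, r1
        lambda: regs.update(r0=regs["r0"] >> 1),          # shift right by 1
        lambda: memory.update(result=regs["r0"]),         # str r0, result
    ]
    for step in program:
        step()
    return memory["result"], len(program)

def run_hardware(a, b):
    """Adder wired straight into a shifter: one combinational evaluation."""
    return (a + b) >> 1

if __name__ == "__main__":
    print(run_software(6, 4))  # result plus the count of sequential steps
    print(run_hardware(6, 4))  # same result from a single evaluation
```

The software path needs one step per instruction, while the hardware path is a single expression whose operators correspond to the adder and shifter blocks in the diagram.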

Conclusions
- FPGAs are interesting devices for multicore system developers
- Valid benchmark upon which to compare classic serial programming methods and parallel computing approaches
- Allow reducing time-to-market for next-generation multicore systems
- Provide common platforms that can easily reproduce any architecture (given a proper VHDL/Verilog description)