Reconfigurable HPC Notes on datastream-based FFT

Presentation transcript:

Reconfigurable HPC: Notes on datastream-based FFT
Reiner Hartenstein, TU Kaiserslautern
Baden-Baden, 12 June 2013
Derived from: R. Hartenstein: Reconfigurable Technologies; seminar given at Kyushu University, Fukuoka, Japan, 23 July 2004

© 2004, TU Kaiserslautern
Slide 2: Application-specific distributed memory*
Application-specific memory: rapidly growing markets:
– IP cores
– Module generators
– EDA environments
Optimization of memory bandwidth for application-specific distributed memory
*) see Herz et al.: Proc. IEEE ICECS 2002

Slide 3: MoM anti machine, an Xputer architecture
[Figure labels: multiple scan windows, data counter, memory bank, asM, asMA, distributed memory, rDPU, smart memory interface. Example: 4x4 scan windows]

Slide 4: 16-point CGFFT mapped onto 2-D memory space
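As a rough illustration of why this kind of FFT suits a fixed scan pattern, here is a minimal Python sketch. It assumes that CGFFT stands for constant-geometry FFT and uses one common decimation-in-frequency variant in which every stage reads and writes the same addresses. The slides do not say which variant or addressing the MoM mapping uses, so the names `cgfft` and `bit_reverse` and the exact access pattern are illustrative assumptions, not the author's implementation.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def cgfft(x):
    """Constant-geometry radix-2 FFT (illustrative sketch).

    Every stage uses the same addressing: read data[i] and data[i + n/2],
    write out[2i] and out[2i + 1]. Only the twiddle exponent changes.
    """
    n = len(x)
    stages = n.bit_length() - 1
    if n < 1 or (1 << stages) != n:
        raise ValueError("length must be a power of two")
    w = cmath.exp(-2j * cmath.pi / n)  # primitive n-th root of unity
    data = [complex(v) for v in x]
    half = n // 2
    for s in range(stages):
        out = [0j] * n
        for i in range(half):
            a, b = data[i], data[i + half]
            # Twiddle exponent: i with its lowest s bits cleared.
            twiddle = w ** ((i >> s) << s)
            out[2 * i] = a + b
            out[2 * i + 1] = (a - b) * twiddle
        data = out
    # The stages leave the spectrum in bit-reversed order; undo that here.
    return [data[bit_reverse(k, stages)] for k in range(n)]
```

Because the read and write addresses are identical in every stage, an address generator only needs one fixed pattern, and only the coefficient stream differs between stages.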

Slide 5: CGFFT: Nested and Parallel Scan Pattern
[Figure labels: input, coeff., temp, output; scan window contents: in i, in i+1, coeff., empty; MAC]

Slide 6: CGFFT: Parallel Scan Pattern Animation
[Figure labels: in i, in i+1, coeff., empty; MAC; out j, out k; 32 steps]

Slide 7: CGFFT: Parallel Scan Pattern Animation
[Figure labels: in i, in i+1, in i+2, in i+3, coeff., empty; MAC; out j, out j+1, out k, out k+1; 4 MAC units in parallel, 8 MAC units in parallel; 16 steps, 8 steps, 4 steps]
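The step counts on these slides (32, 16, 8, 4) match a simple model in which a 16-point radix-2 FFT needs (16/2) x log2(16) = 32 butterflies and each parallel unit completes one butterfly per step. The sketch below only works through that arithmetic. The function name `butterfly_steps` and the one-butterfly-per-unit-per-step assumption are hypothetical, and the transcript does not preserve how the slide pairs its MAC-unit counts with its step counts.

```python
import math


def butterfly_steps(n_points, units):
    """Steps needed if each unit performs one radix-2 butterfly per step.

    Assumption (not from the slides): a 'step' is one butterfly per unit.
    """
    stages = n_points.bit_length() - 1
    if n_points < 2 or (1 << stages) != n_points:
        raise ValueError("n_points must be a power of two >= 2")
    if units < 1:
        raise ValueError("units must be positive")
    butterflies = (n_points // 2) * stages
    return math.ceil(butterflies / units)


# Step counts for a 16-point FFT under this model.
for units in (1, 2, 4, 8):
    print(units, "unit(s):", butterfly_steps(16, units), "steps")
```

Under this model, halving the step count requires doubling the number of units, and the memory must then deliver the corresponding operands in every step, which is where the parallel scan windows come in.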

Slide 8: CGFFT: Nested and Parallel Scan Pattern