Verilog to Routing CAD Tool Optimization

Presentation transcript:

Verilog to Routing CAD Tool Optimization
  Yue Zha
  Department of Electrical and Computer Engineering
  University of Wisconsin-Madison

Background
  Crossbar-based reconfigurable architecture
  Tiles can be configured to implement logic and routing

Background
  CAD framework for mapping applications
    Verilog parser
    Technology mapper
    Placer and router

Motivation
  Long processing time, high cost
  Few long-distance interconnections
    Route through more than 5 tiles
    Fewer than 10%, on average
  Logically partition the architecture
    Divide the complex problem into multiple simple problems
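The "fewer than 10%" observation can be checked on a placed design with a short script. The sketch below is illustrative only: it assumes each gate has an (x, y) tile coordinate and uses Manhattan distance as a stand-in for the number of tiles a route passes through. The function name, the data layout, and the distance proxy are assumptions, not part of the original framework.

```python
def long_connection_fraction(positions, connections, threshold=5):
    """Fraction of connections whose Manhattan span exceeds `threshold` tiles.

    positions:   dict mapping gate name -> (x, y) tile coordinate (assumed layout)
    connections: list of (source, sink) gate-name pairs
    """
    if not connections:
        return 0.0
    long_count = 0
    for src, dst in connections:
        (x1, y1), (x2, y2) = positions[src], positions[dst]
        # Manhattan distance is only a proxy for the routed tile count.
        if abs(x1 - x2) + abs(y1 - y2) > threshold:
            long_count += 1
    return long_count / len(connections)


positions = {"a": (0, 0), "b": (1, 0), "c": (4, 3), "d": (0, 2)}
connections = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "d")]
print(long_connection_fraction(positions, connections))
```

If the fraction is small, most nets stay inside a local region, which is what makes partitioning the architecture attractive.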

Implementation
  Implement the Placer and Router in three parts
    Group logic gates into clusters
      Simulated annealing algorithm
      Minimize inter-cluster interconnect
      Evenly distribute logic gates among clusters
    Local Placer and Router
      Reuse the P&R tool in the original framework
    Global Placer and Router (not completed)
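The clustering step can be sketched as a small simulated-annealing loop. This is a minimal illustration, not the presenter's implementation: the cost function (cut edges plus a weighted size imbalance), the single-gate move, the geometric cooling schedule, and all names and default parameters are assumptions chosen for clarity.

```python
import math
import random


def cut_size(edges, assign):
    """Number of edges whose endpoints sit in different clusters."""
    return sum(1 for a, b in edges if assign[a] != assign[b])


def imbalance(assign, k):
    """Size difference between the largest and smallest cluster."""
    sizes = [0] * k
    for c in assign.values():
        sizes[c] += 1
    return max(sizes) - min(sizes)


def cost(edges, assign, k, balance_weight=1.0):
    """Assumed objective: inter-cluster edges plus a balance penalty."""
    return cut_size(edges, assign) + balance_weight * imbalance(assign, k)


def anneal_clusters(gates, edges, k, steps=20000, t0=2.0, alpha=0.9995, seed=0):
    """Assign each gate to one of k clusters by simulated annealing."""
    rng = random.Random(seed)
    assign = {g: i % k for i, g in enumerate(gates)}  # balanced starting point
    cur = cost(edges, assign, k)
    best, best_cost = dict(assign), cur
    t = t0
    for _ in range(steps):
        g = rng.choice(gates)
        old, new = assign[g], rng.randrange(k)
        if new == old:
            continue
        assign[g] = new  # tentative move of one gate to another cluster
        cand = cost(edges, assign, k)
        delta = cand - cur
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            cur = cand
            if cur < best_cost:
                best, best_cost = dict(assign), cur
        else:
            assign[g] = old  # reject the move
        t = max(t * alpha, 1e-3)  # geometric cooling with a floor
    return best, best_cost


# Two 4-gate cliques joined by a single edge; a good split cuts only that edge.
gates = list(range(8))
edges = [(a, b) for a in range(4) for b in range(a + 1, 4)]
edges += [(a, b) for a in range(4, 8) for b in range(a + 1, 8)]
edges.append((3, 4))
assignment, final_cost = anneal_clusters(gates, edges, k=2)
print(assignment, final_cost)
```

Recomputing the full cost on every move keeps the sketch short; a real tool would update the cut size incrementally from the moved gate's neighbors.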

Experimental Setup
  MCNC benchmark suite
  Original CAD framework as the baseline
  Sequential mode (one core) and parallel mode (four cores)
  All CAD tool sets run on the same system
    Core i7 and 8 GB DDR3 memory

Experimental Results

Experimental Results

Questions?