0 Functional Verification of the SiCortex Multiprocessor System-on-a-Chip June 7, 2007 Oleg Petlin, Wilson Snyder

Slides:



Advertisements
Similar presentations
SOC Design: From System to Transistor
Advertisements

Using emulation for RTL performance verification
Introduction CSCI 444/544 Operating Systems Fall 2008.
Types of Parallel Computers
Some Thoughts on Technology and Strategies for Petaflops.
Design For Verification Synopsys Inc, April 2003.
1 BGL Photo (system) BlueGene/L IBM Journal of Research and Development, Vol. 49, No. 2-3.
ENGIN112 L38: Programmable Logic December 5, 2003 ENGIN 112 Intro to Electrical and Computer Engineering Lecture 38 Programmable Logic.
11/14/05ELEC Fall Multi-processor SoCs Yijing Chen.
Principle of Functional Verification Chapter 1~3 Presenter : Fu-Ching Yang.
From Concept to Silicon How an idea becomes a part of a new chip at ATI Richard Huddy ATI Research.
Copyright Arshi Khan1 System Programming Instructor Arshi Khan.
Prardiva Mangilipally
Embedded Systems Design at Mentor. Platform Express Drag and Drop Design in Minutes IP Described In XML Databook s Simple System Diagrams represent complex.
1b.1 Types of Parallel Computers Two principal approaches: Shared memory multiprocessor Distributed memory multicomputer ITCS 4/5145 Parallel Programming,
Churning the Most Out of IP-XACT for Superior Design Quality Ayon Dey Lead Engineer, TI Anshuman Nayak Senior Product Director, Atrenta Samantak Chakrabarti.
(1) Introduction © Sudhakar Yalamanchili, Georgia Institute of Technology, 2006.
© Copyright Alvarion Ltd. Hardware Acceleration February 2006.
Role of Standards in TLM driven D&V Methodology
1 Chapter 2. The System-on-a-Chip Design Process Canonical SoC Design System design flow The Specification Problem System design.
Codelink Using Embedded Code for Verification Stimulus.
Chap. 1 Overview of Digital Design with Verilog. 2 Overview of Digital Design with Verilog HDL Evolution of computer aided digital circuit design Emergence.
Introduction and Overview Questions answered in this lecture: What is an operating system? How have operating systems evolved? Why study operating systems?
Why do so many chips fail? Ira Chayut, Verification Architect (opinions are my own and do not necessarily represent the opinion of my employer)
Infrastructure design & implementation of MIPS processors for students lab based on Bluespec HDL Students: Danny Hofshi, Shai Shachrur Supervisor: Mony.
1b.1 Types of Parallel Computers Two principal approaches: Shared memory multiprocessor Distributed memory multicomputer ITCS 4/5145 Parallel Programming,
SystemC: A Complete Digital System Modeling Language: A Case Study Reni Rambus Inc.
1 Integration Verification: Re-Create or Re-Use? Nick Gatherer Trident Digital Systems.
Some Course Info Jean-Michel Chabloz. Main idea This is a course on writing efficient testbenches Very lab-centric course: –You are supposed to learn.
Design Verification An Overview. Powerful HDL Verification Solutions for the Industry’s Highest Density Devices  What is driving the FPGA Verification.
A New Method For Developing IBIS-AMI Models
집적회로 Spring 2007 Prof. Sang Sik AHN Signal Processing LAB.
Lecture 2 1 ECE 412: Microcomputer Laboratory Lecture 2: Design Methodologies.
ATCA based LLRF system design review DESY Control servers for ATCA based LLRF system Piotr Pucyk - DESY, Warsaw University of Technology Jaroslaw.
Chonnam national university VLSI Lab 8.4 Block Integration for Hard Macros The process of integrating the subblocks into the macro.
The Macro Design Process The Issues 1. Overview of IP Design 2. Key Features 3. Planning and Specification 4. Macro Design and Verification 5. Soft Macro.
Developing software and hardware in parallel Vladimir Rubanov ISP RAS.
Functional Verification of Dynamically Reconfigurable Systems Mr. Lingkan (George) Gong, Dr. Oliver Diessel The University of New South Wales, Australia.
02/09/2010 Industrial Project Course (234313) Virtualization-aware database engine Final Presentation Industrial Project Course (234313) Virtualization-aware.
March, 2007Intro-1http://csg.csail.mit.edu/arvind Design methods to facilitate rapid growth of SoCs Arvind Computer Science & Artificial Intelligence Lab.
SOC Virtual Prototyping: An Approach towards fast System- On-Chip Solution Date – 09 th April 2012 Mamta CHALANA Tech Leader ST Microelectronics Pvt. Ltd,
Simics: A Full System Simulation Platform Synopsis by Jen Miller 19 March 2004.
Tackling I/O Issues 1 David Race 16 March 2010.
Background Computer System Architectures Computer System Software.
ISCUG Keynote May 2008 Acknowledgements to the TI-Nokia ESL forum (held Jan 2007) and to James Aldis, TI and OSCI TLM WG Chair 1 SystemC: Untapped Value.
Problem: design complexity advances in a pace that far exceeds the pace in which verification technology advances. More accurately: (verification complexity)
Constructing a system with multiple computers or processors 1 ITCS 4/5145 Parallel Programming, UNC-Charlotte, B. Wilkinson. Jan 13, 2016.
1 The user’s view  A user is a person employing the computer to do useful work  Examples of useful work include spreadsheets word processing developing.
April 15, 2013 Atul Kwatra Principal Engineer Intel Corporation Hardware/Software Co-design using SystemC/TLM – Challenges & Opportunities ISCUG ’13.
Benefits of a Virtual SIL
Programmable Logic Devices
ASIC Design Methodology
Mixed-Digital/Analog Simulation and Modeling Research
How to Quick Start Virtual Platform Development
Microprocessor and Assembly Language
ECE 551: Digital System Design & Synthesis
Constructing a system with multiple computers or processors
CSCI/CMPE 3334 Systems Programming
Agenda Why simulation Simulation and model Instruction Set model
Constructing a system with multiple computers or processors
Constructing a system with multiple computers or processors
SiCortex Update IDC HPC User Forum
The performance requirements for DSP applications continue to grow and the traditional solutions do not adequately address this new challenge Paradigm.
A High Performance SoC: PkunityTM
Constructing a system with multiple computers or processors
HIGH LEVEL SYNTHESIS.
Automation of Control System Configuration TAC 18
UNISIM (UNIted SIMulation Environment) walkthrough
Types of Parallel Computers
Presentation transcript:

0

Functional Verification of the SiCortex Multiprocessor System-on-a-Chip June 7, 2007 Oleg Petlin, Wilson Snyder

2 Agenda What we’ve built Verification challenges Verification productivity Substitution modelling Co-verification Verification statistics Conclusions Q&A

3 What We’ve Built Complete computer system –Rethought how a cluster should be built Custom processor chip –Reduced power –Maximized memory performance –Integrated high performance interconnect Software –Open Source: Linux, GNU, and MPI

4 Our Product: SC Gigaflops 7776Gigabytes DDR memory 9726-core 64-bit nodes GByte/s fabric links 500GByte/s bisection bandwidth 270GByte/s PCI-E bandwidth 18KW 1Cabinet There’s a small version too…

5 Verification Challenges High design complexity –1.2 million lines of RTL -> 198 million transistors Both purchased IP and internal developments High programmability - 6 CPUs and DMA Co-verification of Linux kernel and device drivers Control the overall verification cost –Verilog simulator speed and license limitations –How to find more bugs for less w/o compromising the quality of verification? –Project deadlines

6 Verification Productivity Verification Tools –Languages, libraries, and simulators C++, STL SystemC verification library and TLM OSCI SystemC Cadence Incisive –Open source productivity tools ( ) Vregs (Register documentation -> headers & verification code) Verilog-Mode (saves writing 30% of the Verilog lines !) SystemPerl (saves writing 40% of SystemC lines !) Verilator (Verilog RTL -> C++/SystemC cycle-accurate model)

7 Verilator and Incisive Verilator + OSCIncisive Full SystemC Synthesizable Verilog-2005 Full SystemC Fully Verilog-2005 compliant C++ InterfacePLI/VPI compliant interface Two-StateFour-State (0,1,X,Z) and strengths Cycle accurateTiming accurate (thus required for PLL, PHY and gate simulations) Limited PSL assertionsFull PSL assertions Line and Block coverageBlock, FSM, expression coverage Waveforms, GDB/DDDWaveforms, source debugger Faster simulations (2-5x) Limited support Slower simulations Excellent customer support FreeNot quite

8 Verification Productivity (cont.) Code Reuse –C++ encapsulation, inheritance –Verification infrastructure –Test components Test Writing Methodology –Every test is a class that inherits the test base class –The test base class specifies the execution order for a set of virtual methods –Chip-level tests are constructed from subchip tests Regression Testing –Hourly, nightly and weekly runs with random seeds –Background random runs –Automated web-based reporting system (next slide)

9 Rev Num Run History Rev User Rev Description r53331 faildenneyDMA engine support for mandelbrot r5332denneyAdd PCI express test r53312 pass 2 fail w/mod wsnyderDoing something nasty r5330 r53291 passpholmesNew incredible MPI fabric test added Tracking all Tests All tests were tracked in a database with web front-end: –Did this test ever work, and when? –What versions did the test work in? –What changes were made?

10 VerBfm.sp Bus Functional Model Substitution Modeling We allowed many types of modules to be substituted into the same consistent chip model cell, and can compare outputs: VerShad.sp Shadow Module VerBeh.sp Behavioral Module VerRtl.sp RTL Wrapper Module Ver.sp Translated automatically from Ver.v Conversion Wrapper(s)

11 Co-Verification: Booting Linux Our major software goal was to boot Linux –Linux kernel 2.6 with some modifications –Initialization trimmed to 16 million instructions

12 Verification Statistics 20,000 tests 5,000+ per night (<20% of the tests required any license) 22,000,000 test runs over the last 12 months: –230 compute years –2.1 hours of “real chip” time 1,300 critical bugs found, as follows:

13 20,000 tests 5,000+ per night (<20% of the tests required any license) 22,000,000 test runs over the last 12 months: –230 compute years –2.1 hours of “real chip” time 1,300 critical bugs found, as follows: Verification Statistics 230 compute years cost us $250K in Opteron Servers. Our system with 648 cores delivers more than 3x the simulations per dollar, in 1/12 th the space. And this isn’t even in the market it is tuned for. And we’d save $17,000 in power/year! 9 cents /kWH, 24KW for 86 servers drops to 2KW per SC648 Teaser

14 How did we do? Initial debug went smoothly –Dec 28, 2006: Chips arrive –Jan 22, 2007: Linux, NFS, Emacs, make…. –Jan 29, 2007: MPI networking node-to-node. Only a few bugs found in Silicon, all with workarounds –All due to verification holes –Filled now, of course

15 Conclusions Mixing Verilog RTL and SystemC/C++ worked well Fast simulation models enabled early software debug Our strategy provided for a higher level of control over: –Simulation speed –Accuracy –License usage & overall verification cost Open source tools allowed the team to run more tests and find bugs earlier Used the best public domain and commercial tools each for what they do best

16 Public Tool Sources The public domain design tools we used are available at –Make::Cache - Object caching for faster compiles –Schedule::Load – Load Balancing (ala LSF) –SystemPerl - /*AUTOs*/ for SystemC –Verilator – Compile SystemVerilog into SystemC –Verilog-Mode - /*AUTO…*/ Expansion –Verilog-Perl – Verilog preprocessor –Vregs – Extract register and class declarations from documentation The SIMH instruction set simulator is at

17 Compute Node Layout

18 Compute Node Architecture