A Monte Carlo Simulation Accelerator using FPGA Devices Final Year project : LHW0304 Ng Kin Fung && Ng Kwok Tung Supervisor : Professor LEONG, Heng Wai.


A Monte Carlo Simulation Accelerator using FPGA Devices Final Year project : LHW0304 Ng Kin Fung && Ng Kwok Tung Supervisor : Professor LEONG, Heng Wai Philip

Overview

Overview • Objective • Background • Software-only Implementation • Hardware Implementation • FPGA • Soft-Core Micro-Processor

Overview • Background • Interest Rate Modeling • Brace-Gatarek-Musiela (BGM) Model • Motivation and Contribution • System Design • System Design Overview • System Components • System Operations

Overview • Experiment and Result • Resources • Performance • Data Transmission Overhead • Conclusion • Future Improvement • Q & A Section

Objective

Objective • What we achieved last semester • Studied and became familiar with the development tools • Implemented some simple examples to gain experience in developing FPGA systems with a Soft-core Micro-processor • First ever successful port of the Microblaze system to the Celoxica RC200 development board • Studied the performance and power consumption of the system

Objective • Goals for this semester • Build a Monte Carlo Simulation Accelerator using FPGA technology and a Soft-core Micro-processor • Study the speed-up and performance • Study the overhead of the transmission channel between the user core and the Soft-core Micro-processor

FPGA and Soft-Core Micro-Processor

Software-only implementation • The performance is NOT satisfactory • Sequential execution of instructions instead of parallel execution • Slow memory access • Lack of ability to customize hardware • No way to save power by switching off hardware modules • There is a need to solve the problem with another approach

FPGA Technology • More and more popular in system design • Higher degree of parallelism • Fewer clock cycles required

FPGA Technology • Explicitly hardwired to perform a certain operation • Optimized for a specific purpose → higher performance • Enables customization of hardware modules • Power saving • Reconfigurable • Enables reuse of hardware • Able to simulate and synthesize the circuits from a high-level program-like description • Easy system development and system testing • Shorter time to market → higher profit

Soft-Core Micro-Processor • Most systems use a PC+FPGA accessed through a PCI bus • Bottleneck for the entire system • Use of a Soft-Core Micro-Processor • Everything is implemented in the FPGA • Transmission of data stays within the FPGA • A higher transmission bandwidth and lower latency

Soft-Core Micro-Processor • Other advantages • Easier to develop • Retains the advantages of using FPGA • Flexible • Retargetable • Conclusion • FPGA technology + Soft-Core Micro-Processor

Interest Rate Modeling

• Importance of interest rate modeling • Simulate market behavior with historical parameter values • Explain interest rate movements in terms of an underlying model → Decision making on economic policy → Risk management

Brace-Gatarek-Musiela (BGM) Model • One of the most popular interest rate models • Based on the Monte Carlo method • Looping part (most computationally expensive)

Implementing BGM Model using FPGA and Soft-Core Micro-Processor • The BGM core generates 50 paths with 9 fixed points

Implementing BGM Model using FPGA and Soft-Core Micro-Processor • BGM core implemented in the FPGA in parallel style • Post-processing calculation by Microblaze • Average and standard error • Fast Simplex Link bus for data transmission between BGM core and Microblaze
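
The post-processing step named on this slide (average and standard error over the generated paths) could look roughly like the C sketch below. It is an illustration under assumptions, not the project's code: the names `sample_mean`, `standard_error` and `newton_sqrt` are hypothetical, each path is assumed to yield one `double` result, and a short Newton iteration stands in for a library square root so the sketch does not depend on the math library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: square root by Newton iteration (no libm needed). */
static double newton_sqrt(double v)
{
    if (v <= 0.0)
        return 0.0;
    double x = v > 1.0 ? v : 1.0;      /* any positive start converges */
    for (int i = 0; i < 64; ++i)
        x = 0.5 * (x + v / x);
    return x;
}

/* Average of the n path results. */
double sample_mean(const double *x, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += x[i];
    return sum / (double)n;
}

/* Standard error of the mean: s / sqrt(n), where s is the sample
 * standard deviation computed with the (n - 1) denominator. */
double standard_error(const double *x, size_t n)
{
    assert(n > 1);
    double mean = sample_mean(x, n);
    double ss = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double d = x[i] - mean;
        ss += d * d;
    }
    double variance = ss / (double)(n - 1);
    return newton_sqrt(variance / (double)n);
}
```

With 50 paths, as on the slide, the caller would pass an array of 50 results and `n = 50`.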

Contribution

Contribution • Improve the performance of the system
Implementation                     Responsibility   Performance
Software-only                      On market        Lowest
FPGA + PC                          CSE research     High
FPGA + Soft-Core Micro-Processor   Our task         Highest

System Design

System Design Overview

System Component

Microblaze • A soft-core microprocessor • Delivered as HDL source code for synthesis • Designed in VHDL • Specially optimized for Xilinx FPGAs • A reduced instruction set computer (RISC) • Speed of Microblaze across different devices from Xilinx
Device (speed grade)    Clock     Performance
Virtex™-II Pro (-6)     150 MHz   101 D-MIPS
Virtex-II (-5)          125 MHz   82 D-MIPS
Virtex-E (-7)           75 MHz    49 D-MIPS
Spartan-IIE (-6)        75 MHz    49 D-MIPS
Spartan™-II (-4)        65 MHz    43 D-MIPS

User Core – BGM • Connect the core designed in VHDL to the Microblaze system • Solve the most computationally expensive task fully in hardware • Need to follow the signals and timing of the connected bus • A Microprocessor Peripheral Definition (MPD) file • Defines the interface of the peripheral • Ports, buses • A Peripheral Analyze Order (PAO) file • A list of the HDL files needed for synthesis, in order of compilation

Fast Simplex Link (FSL) • 32-bit wide bus • Unidirectional point-to-point data streaming interfaces • Control and data communication support • FIFO-based communication • Fast internal data and control transmission • Peak bandwidth 300 MB/s
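
To make the FIFO-based, unidirectional behaviour listed on this slide concrete, here is a small host-side software model in C. It is only a sketch for reasoning about the channel, not Xilinx code: the type `fsl_model`, the functions `fsl_model_init`, `fsl_model_write` and `fsl_model_read`, and the depth of 16 words are all assumptions made for this illustration. Each entry carries a 32-bit word plus a flag that marks it as control or data.

```c
#include <assert.h>
#include <stdint.h>

#define FSL_MODEL_DEPTH 16u   /* assumed depth, for this model only */

/* Hypothetical model of one unidirectional FSL channel. */
typedef struct {
    uint32_t data[FSL_MODEL_DEPTH];     /* 32-bit payload words */
    uint8_t  control[FSL_MODEL_DEPTH];  /* 1 = control word, 0 = data word */
    unsigned head;                      /* next slot to read */
    unsigned count;                     /* words currently stored */
} fsl_model;

void fsl_model_init(fsl_model *f)
{
    assert(f != 0);
    f->head = 0;
    f->count = 0;
}

/* Producer side (for example the BGM core). Returns 0 when the FIFO is full. */
int fsl_model_write(fsl_model *f, uint32_t word, int is_control)
{
    if (f->count == FSL_MODEL_DEPTH)
        return 0;
    unsigned tail = (f->head + f->count) % FSL_MODEL_DEPTH;
    f->data[tail] = word;
    f->control[tail] = (uint8_t)(is_control != 0);
    f->count++;
    return 1;
}

/* Consumer side (for example the Microblaze). Returns 0 when the FIFO is empty. */
int fsl_model_read(fsl_model *f, uint32_t *word, int *is_control)
{
    if (f->count == 0)
        return 0;
    *word = f->data[f->head];
    *is_control = f->control[f->head];
    f->head = (f->head + 1) % FSL_MODEL_DEPTH;
    f->count--;
    return 1;
}
```

The model reports full and empty through return values instead of blocking, which keeps it easy to test on a PC; a blocking read on the real hardware would wait until a word is available.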

Fast Simplex Link (FSL)

Xilinx Fast Simplex Link Channel Product Specification DS449 (v1.1) Aug 06, 2003

Fast Simplex Link (FSL) • Use the read macro microblaze_bread_datafsl(val, id) to read data from the FSL FIFO into the Microblaze (Source: Xilinx Fast Simplex Link Channel Product Specification DS449 (v1.1), Aug 06, 2003)

On-Chip Memory, Local Memory Bus and Memory Bus Controller • On-chip memory • Storage medium for data and instructions • Minimizes the transmission overhead between the Microblaze and the memory • Local Memory Bus (LMB) • Single-cycle access to on-chip dual-port block RAM • Performance of 125 MHz • LMB BRAM Interface Controller • Interface between the LMB and the bram_block peripheral • Separate controller for data and control

On-Chip Peripheral Bus (OPB) • Connection between the main system and the peripherals • Makes the Microblaze system more functional • In this project • UART • OPB Timer • GPIO

Universal Asynchronous Receiver-Transmitter (UART) • Handles asynchronous serial communication • Libgen allows the mapping of standard input and output • Use of scanf and printf for communication with the user

OPB Timer • Facilitates the correct measurement of the performance • Initiate timer: XStatus XTmrCtr_Initialize • Start timer: void XTmrCtr_Start • Stop timer: void XTmrCtr_Stop • Get timer value: Xuint32 XTmrCtr_GetValue
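
A timer value read back as a 32-bit count still has to be turned into elapsed time. The C sketch below shows one way to do that on the values alone, so it runs on any host. The function names `elapsed_ticks` and `ticks_to_ms` are hypothetical, the counter is assumed to count up and to wrap at most once between the two readings, and the clock frequency is a parameter the caller must supply (it is not stated on the slide).

```c
#include <assert.h>
#include <stdint.h>

/* Ticks between two counter readings of an up-counting 32-bit timer.
 * Unsigned subtraction gives the right answer across a single wraparound. */
uint32_t elapsed_ticks(uint32_t start, uint32_t stop)
{
    return (uint32_t)(stop - start);
}

/* Convert a tick count to milliseconds for a given timer clock in Hz. */
double ticks_to_ms(uint32_t ticks, uint32_t clock_hz)
{
    assert(clock_hz > 0);
    return (double)ticks * 1000.0 / (double)clock_hz;
}
```

For example, with an assumed 100 MHz timer clock, 100000 ticks correspond to 1 ms.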

General Purpose Input Output (GPIO) • Problem found on the FSL bus • Reset signal connected to ground • No way to reset the BGM core through the FSL bus • Solution • Make changes to the VHDL source code • Use GPIO

General Purpose Input Output (GPIO) [Diagram: reset by FSL (Microblaze → FSL → BGM core reset, not possible) versus reset by GPIO (Microblaze → GPIO → BGM core reset)]

System Operations [Flowchart: Microblaze system start → BGM core is reset → Timer is started → BGM process → Any more data? (yes: BGM process again; no: continue) → Post-processing calculation by Microblaze → Timer is stopped → Result is printed out → End of Microblaze system]

System Operations [Flowchart: BGM process start → BGM core generates a path → Data transfer from BGM core to Microblaze system → Data format transformation → Temporary storage of data → End of Microblaze system]

Experimental Results

Resources
Selected device: 2v1000fg456-4
Resources for BGM core alone
Resource           Used   Total   Percentage
Slices                            %
Slice Flip Flops                  %
4-input LUTs                      %
Bonded IOBs                       %
MULT18X18s         37     40      92%
GCLKs              3      16      18%
DCMs               1      8       12%
• Unable to place the whole system on the FPGA board • System simulation by ModelSim

Performance • Comparison of the performance of the BGM core running in FPGA and in PC (by Dr. Zhang) • Speed-up factor: 19.87

Performance • Comparison of the performance of the BGM core running in FPGA and in PC with different numbers of generated paths (by Dr. Zhang) • Stable performance across different path numbers

Performance • Simulation of the Microblaze system • Total time required for generating 50 paths: 2.871 ms • Speed-up factor: 21.94

Transmission Bandwidth
Transmission Media   Peak Transmission Bandwidth
Serial Port          15 KB/s
Parallel Port        150 KB/s
10M Ethernet         1.2 MB/s
USB                  1.5 MB/s
100M Ethernet        12 MB/s
PCI Bus              100 MB/s
FSL Bus              300 MB/s

Transmission Bandwidth • On the FSL bus, 32 bits of data are sent in about 40000 ps • Transmission bandwidth is around 100 MB per second • Same order of magnitude as the peak transmission bandwidth stated in the specification
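
The bandwidth figure on this slide follows from simple arithmetic: 32 bits are 4 bytes, and 40000 ps are 40 ns, so 4 bytes / 40 ns = 100 MB per second. The C sketch below reproduces that calculation. The function name `bandwidth_mb_per_s` is hypothetical, and 1 MB is assumed to mean 10^6 bytes.

```c
#include <assert.h>

/* Bandwidth in MB/s (1 MB = 10^6 bytes, an assumption) for a transfer
 * of `bits` bits that takes `picoseconds` picoseconds. */
double bandwidth_mb_per_s(unsigned bits, double picoseconds)
{
    assert(picoseconds > 0.0);
    double bytes = bits / 8.0;
    double seconds = picoseconds * 1e-12;
    return bytes / seconds / 1e6;
}
```

Calling `bandwidth_mb_per_s(32, 40000.0)` gives approximately 100, matching the measured value quoted above.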

Conclusion • A Monte Carlo Simulation Accelerator was implemented using FPGA technology and the Xilinx Microblaze Soft-core Micro-processor • A speed-up factor was obtained when compared with the software-only implementation • Higher bandwidth and lower latency can be achieved using the FSL link between Microblaze and BGM core • High performance, parallel execution of instructions, reconfigurability and reusability, and short development time

Future Development • Put the whole system on the FPGA board • Implement other applications for which high performance and short development time are the major considerations • Study the other IP cores included and make improvements to the system

Q & A