A Monte Carlo Simulation Accelerator using FPGA Devices Final Year project : LHW0304 Ng Kin Fung && Ng Kwok Tung Supervisor : Professor LEONG, Heng Wai.


1 A Monte Carlo Simulation Accelerator using FPGA Devices  Final Year Project: LHW0304  Ng Kin Fung and Ng Kwok Tung  Supervisor: Professor LEONG, Heng Wai Philip

2 Overview

3 Overview  Objective  Background  Software-only Implementation  Hardware Implementation  FPGA  Soft-Core Micro-Processor

4 Overview  Background  Interest Rate Modeling  Brace-Gatarek-Musiela (BGM) Model  Motivation and Contribution  System Design  System Design Overview  System Components  System Operations

5 Overview  Experiment and Result  Resources  Performance  Data Transmission Overhead  Conclusion  Future Improvement  Q & A Section

6 Objective

7 Objective  What we achieved last semester  Studied and became familiar with the development tools  Implemented some simple examples to gain experience in developing FPGA systems with a Soft-core Micro-processor  First ever successful port of the Microblaze system to the Celoxica RC200 development board  Studied the performance and power consumption of the system

8 Objective  Goals for this semester  Build a Monte Carlo Simulation Accelerator using FPGA technology and a Soft-core Micro-processor  Study the speed-up and performance  Study the overhead of the transmission channel between the user core and the Soft-core Micro-processor

9 FPGA and Soft-Core Micro-Processor

10 Software-only implementation  The performance is NOT satisfactory  Sequential execution of instructions instead of parallel execution  Slow memory access  Lack of ability to customize hardware  No way to save power by switching off hardware modules  The problem needs to be solved with another approach

11 FPGA Technology  More and more popular in system design  Higher degree of parallelism  Fewer clock cycles required

12 FPGA Technology  Explicitly hardwired to perform a certain operation  Optimized for a specific purpose, giving higher performance  Enables customization of hardware modules  Power saving  Reconfigurable  Enables reuse of hardware  Circuits can be simulated and synthesized from a high-level, program-like description  Easy system development and system testing  Shorter time to market, giving higher profit

13 Soft-Core Micro-Processor  Most systems use a PC+FPGA accessed through a PCI bus  Bottleneck for the entire system  Use of a Soft-Core Micro-Processor  Everything is implemented in the FPGA  Transmission of data is within the FPGA  A higher transmission bandwidth and lower latency

14 Soft-Core Micro-Processor  Other advantages  Easier to develop  Retains the advantages of using an FPGA  Flexible  Retargetable  Conclusion  FPGA technology + Soft-Core Micro-Processor

15 Interest Rate Modeling

16  Importance of interest rate modeling  Simulate market behavior with historical parameter values  Explain interest rate movements in terms of an underlying model  Decision making on economic policy  Risk management

17 Brace-Gatarek-Musiela (BGM) Model  One of the most popular interest rate models  Based on the Monte Carlo method  Looping part (most computationally expensive)

18 Implementing BGM Model using FPGA and Soft Core Microprocessor  BGM core generates 50 paths with 9 fixed points

19 Implementing BGM Model using FPGA and Soft Core Microprocessor  Implemented by FPGA in parallel style  Post-processing calculation by Microblaze  Average and Standard error  Fast Simplex Link Bus for data transmission between BGM core and Microblaze
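
The post-processing step named on this slide (average and standard error) can be illustrated with a short host-side sketch. This is not the project's Microblaze code; the function name and the sample values are hypothetical, and it assumes the standard error is the sample standard deviation divided by the square root of the number of paths.

```python
import math
import statistics


def mean_and_standard_error(samples):
    """Return (mean, standard error of the mean) for a list of path results."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = statistics.fmean(samples)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    std_err = statistics.stdev(samples) / math.sqrt(n)
    return mean, std_err


# Hypothetical path results, not data from the project.
example = [1.0, 2.0, 3.0, 4.0, 5.0]
mean, std_err = mean_and_standard_error(example)
```

With 50 paths, as generated by the BGM core, the same function would be called on the 50 per-path results.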

20 Contribution

21 Contribution  Improve the performance of the system
Implementation | Responsibility | Performance
Software-only | On Market | Lowest
FPGA + PC | CSE Research | High
FPGA + Soft-Core Micro-Processor | Our Task | Highest

22 System Design

23 System Design Overview

24 System Component

25 Microblaze  A soft-core microprocessor  Delivered as HDL source code for synthesis  Designed in VHDL  Specially optimized for Xilinx FPGAs  A reduced instruction set computer (RISC)  Speed of Microblaze across different devices from Xilinx
Device (speed grade) | Clock | Performance
Virtex-II Pro (-6) | 150 MHz | 101 D-MIPS
Virtex-II (-5) | 125 MHz | 82 D-MIPS
Virtex-E (-7) | 75 MHz | 49 D-MIPS
Spartan-IIE (-6) | 75 MHz | 49 D-MIPS
Spartan-II (-4) | 65 MHz | 43 D-MIPS

26 User Core – BGM  Connect the core designed in VHDL to the Microblaze system  Solve the most computationally expensive task fully in hardware  Need to follow the signals and timing of the connected bus  A Microprocessor Peripheral Definition (MPD) file  Defines the interface of the peripheral  Ports, buses  A Peripheral Analyze Order (PAO) file  A list of the HDL files needed for synthesis, in order of compilation

27 Fast Simplex Link (FSL)  32-bit wide bus  Unidirectional point-to-point data streaming interfaces  Control and data communication support  FIFO-based communication  Fast internal data and control transmission  Peak bandwidth 300 MB/s

28 Fast Simplex Link (FSL)

29 Xilinx Fast Simplex Link Channel Product Specification DS449 (v1.1) Aug 06, 2003

30 Fast Simplex Link (FSL)  Use the read macro microblaze_bread_datafsl(val, id) to read data from the FSL FIFO into the Microblaze  Source: Xilinx Fast Simplex Link Channel Product Specification DS449 (v1.1), Aug 06, 2003
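
The FIFO behaviour of an FSL channel (one-way, 32-bit words, first in first out) can be modelled in a few lines. This is a toy host-side model, not Xilinx code: the class name, the depth of 16 and the non-blocking return values are assumptions made for illustration. On the real system the blocking read macro waits for data, whereas this model simply returns None when the FIFO is empty.

```python
from collections import deque


class FifoChannel:
    """Toy model of a one-way, fixed-depth FIFO carrying 32-bit words."""

    def __init__(self, depth=16):  # depth is an assumption, not from the slides
        self.depth = depth
        self._words = deque()

    def full(self):
        return len(self._words) >= self.depth

    def empty(self):
        return not self._words

    def write(self, word):
        """Producer side (e.g. the BGM core): False means the FIFO was full."""
        if self.full():
            return False
        self._words.append(word & 0xFFFFFFFF)  # keep only the low 32 bits
        return True

    def read(self):
        """Consumer side (e.g. the Microblaze): oldest word, or None if empty."""
        if self.empty():
            return None
        return self._words.popleft()
```

A producer would call write() once per result word and a consumer would call read() until it has collected all words for a path.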

31 On-Chip Memory, Local Memory Bus and Memory Bus Controller  On-Chip Memory  Storage medium for the data and instructions  Minimize the transmission overhead between the Microblaze and the memory  Local Memory Bus  Single-cycle access to on-chip dual-port block RAM  Performance of 125 MHz  LMB BRAM Interface Controller  Interface between the LMB and the bram_block peripheral  Separate controller for data and control

32 On-Chip Peripheral Bus (OPB Bus)  Connection between the main system and the peripherals  Make the Microblaze system more functional  In this project  UART  OPB Timer  GPIO

33 Universal Asynchronous Receiver-Transmitter (UART)  Handles asynchronous serial communication  Libgen allows the mapping of standard input and output  Use of scanf and printf for communication with the user

34 OPB Timer  Facilitate the correct measurement of the performance  Initialize timer: XStatus XTmrCtr_Initialize  Start timer: void XTmrCtr_Start  Stop timer: void XTmrCtr_Stop  Get timer value: Xuint32 XTmrCtr_GetValue
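
Turning two 32-bit timer readings into an elapsed time is simple arithmetic, sketched below on the host side. The function name is hypothetical, the timer clock frequency is not given in the slides and must be supplied by the caller, and the sketch assumes an up-counting 32-bit counter that wraps at most once between the two readings.

```python
def elapsed_ms(start_ticks, stop_ticks, clock_hz):
    """Elapsed time in milliseconds between two 32-bit up-counter readings."""
    # Masking to 32 bits makes the subtraction correct across one wrap-around.
    ticks = (stop_ticks - start_ticks) & 0xFFFFFFFF
    return ticks / clock_hz * 1000.0


# Hypothetical readings with an assumed 100 MHz timer clock.
example_ms = elapsed_ms(0, 100_000, 100_000_000)
```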

35 General Purpose Input Output (GPIO)  Problem found on FSL Bus  Reset signal connected to Ground  No way to reset the BGM core through FSL Bus  Solution  Make change to the VHDL source code  Use GPIO

36 General Purpose Input Output (GPIO)  [Diagram: Reset by FSL (Microblaze, FSL, BGM Core; reset path marked X) versus Reset by GPIO (Microblaze, GPIO, BGM Core)]

37 System Operations (flowchart)  Microblaze system start  BGM core is reset  Timer is started  BGM process  Any more data? If yes, repeat the BGM process; if no, continue  Post-processing calculation by Microblaze  Timer is stopped  Result is printed out  End of Microblaze system

38 System Operations (BGM process flowchart)  BGM process start  BGM core in process of generating a path  Data transfer from BGM core to Microblaze system  Data format transform  Temporary storage of data  End of Microblaze System

39 Experimental Results

40 Resources  Selected device: 2v1000fg456-4  Resources for BGM core alone
Resource | Used | Total | Percentage
Slices | 6455 | 5120 | 126%
Slice Flip Flops | 5768 | 10240 | 56%
4-input LUTs | 10974 | 10240 | 107%
Bonded IOBs | 42 | 324 | 12%
MULT18X18s | 37 | 40 | 92%
GCLKs | 3 | 16 | 18%
DCMs | 1 | 8 | 12%
 Unable to place the whole system on the FPGA board  System simulation by ModelSim
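
The percentages on this slide follow from used / total, and the two resources above 100% explain why the design does not fit the device. The sketch below recomputes them; the used and total counts are as read from the flattened slide table, and the dictionary and function names are hypothetical.

```python
# (used, total) pairs as read from the slide's resource table.
resources = {
    "Slices": (6455, 5120),
    "Slice Flip Flops": (5768, 10240),
    "4-input LUTs": (10974, 10240),
    "Bonded IOBs": (42, 324),
    "MULT18X18s": (37, 40),
    "GCLKs": (3, 16),
    "DCMs": (1, 8),
}


def utilization_percent(used, total):
    """Utilization of one resource as a percentage of what the device offers."""
    return 100.0 * used / total


# Resources that need more than the device provides.
over_budget = [name for name, (used, total) in resources.items() if used > total]
```

Here over_budget lists the slices and the 4-input LUTs, which matches the slide's statement that the whole system could not be placed on the board.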

41 Performance  Comparison of the performance of the BGM core running on the FPGA and on a PC (by Dr. Zhang)  Speed-up factor: 19.87

42 Performance  Comparison of the performance of the BGM core running on the FPGA and on a PC for different numbers of generated paths (by Dr. Zhang)  Stable performance across different path numbers

43 Performance  Simulation of the Microblaze system  Total time required for generating 50 paths: 2.871 ms  Speed-up factor: 21.94
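
As a worked check of what a speed-up factor implies, the sketch below assumes the factor is defined as software time divided by accelerated time for the same 50 paths. The slides do not state the software baseline, so the resulting figure is only implied by the two numbers on this slide, and the function name is hypothetical.

```python
def implied_baseline_ms(accelerated_ms, speedup):
    """Software-only time implied by speedup = baseline / accelerated."""
    return accelerated_ms * speedup


# Figures from the slide: 2.871 ms for 50 paths, speed-up factor 21.94.
baseline_ms = implied_baseline_ms(2.871, 21.94)  # roughly 63 ms under this assumption
```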

44 Transmission Bandwidth
Transmission Media | Peak Transmission Bandwidth
Serial Port | 15 KB/s
Parallel Port | 150 KB/s
10M Ethernet | 1.2 MB/s
USB | 1.5 MB/s
100M Ethernet | 12 MB/s
PCI Bus | 100 MB/s
FSL Bus | 300 MB/s

45 Transmission Bandwidth  On the FSL bus, 32 bits of data are sent in about 40000 ps  Transmission bandwidth is around 100 MB per second  Same order of magnitude as the peak transmission bandwidth stated in the specification
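
The 100 MB per second figure follows directly from the two numbers on this slide: 32 bits is 4 bytes, and 40000 ps is 40 ns. A minimal sketch of the arithmetic, taking 1 MB as 10^6 bytes (an assumption) and using a hypothetical function name:

```python
def bandwidth_mb_per_s(bits, picoseconds):
    """Bandwidth in MB/s (1 MB = 10**6 bytes) for `bits` sent in `picoseconds`."""
    bytes_sent = bits / 8
    bytes_per_second = bytes_sent * 1e12 / picoseconds
    return bytes_per_second / 1e6


measured = bandwidth_mb_per_s(32, 40000)   # 4 bytes every 40 ns
fraction_of_peak = measured / 300.0        # against the 300 MB/s peak from the FSL slide
```

Under these assumptions the measured rate is about one third of the quoted peak.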

46 Conclusion  A Monte Carlo Simulation Accelerator was implemented using FPGA technology and the Xilinx Microblaze Soft-core Micro-processor  A speed-up factor of 21.94 compared with the software-only implementation  Higher bandwidth and lower latency can be achieved using the FSL between the Microblaze and the BGM core  High performance, parallel execution of instructions, reconfigurability, reusability and short development time

47 Future Development  Fit the whole system onto the FPGA board  Implement other applications for which high performance and short development time are the major considerations  Study the other included IP cores and improve the system

48 Q & A

