MAIS Project - WP5: Exploration of Alternative Architectures
Report on Activities Carried Out
Andrea Pagni
STMicroelectronics, Advanced System Architectures Group
Milan, November 2004
WP5 Report
Topics
- Part 1: VLIW-SIM Overview
- Part 2: VLIW-SIM Performance
- Part 3: VLIW-SIM Library
- Part 4: Next Steps
Part 1: VLIW-SIM Overview
- Simulation Approach (1-7)
- Modeled Target Architectures
- Supported Platforms
- Simulation Functionalities
Simulation Approach 1/7: Overview
- Interpretative simulation approach
- Simulation technology based on a set of re-usable sub-blocks:
  - Pipeline modeling
  - Instruction execution
  - Memory modeling
  - Register file management
  - I/O simulation
- Efficient host resource allocation
- Target Architecture Description capability (IS, TAD)
- Challenging compromise between speed and accuracy
Simulation Approach 2/7: Pipeline Modelling
During simulation, the pipeline is represented as a three-dimensional space (phase, operation, time): operation is the instruction's position in the bundle, phase is the pipeline phase, and time is the given time stamp.
Simulation Approach 3/7: Pipeline Modelling
- The pipeline status is modelled as a two-dimensional array: the first index is the pipeline phase and the second is the position of an instruction in the fetch packet.
- The simulation process uses two such arrays, one for the current pipeline status and one for the following status.
- At each machine cycle the pipeline status is processed: the actions required by the instructions in each phase are performed, and the instructions are then moved to the next pipeline phase.
Simulation Approach 4/7: Pipeline Status Update
At each machine cycle the pipeline status is processed.
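The current/following array scheme from the pipeline-modelling slides can be sketched as below. This is only an illustration under stated assumptions, not the VLIW-SIM code: the sizes `NUM_PHASES` and `BUNDLE_WIDTH`, the helper names `empty_pipeline` and `step`, and the `on_phase` callback are all hypothetical, and stalls are ignored.

```python
# Hypothetical sizes, not taken from any of the modelled cores.
NUM_PHASES = 3      # e.g. fetch, decode, execute
BUNDLE_WIDTH = 2    # instruction slots per fetch packet


def empty_pipeline():
    """Return a [phase][slot] array with every entry empty (None)."""
    return [[None] * BUNDLE_WIDTH for _ in range(NUM_PHASES)]


def step(current, fetched, on_phase):
    """Advance the pipeline by one machine cycle.

    current  -- [phase][slot] array holding the current pipeline status
    fetched  -- list of BUNDLE_WIDTH instructions entering phase 0
    on_phase -- callback(phase, slot, instr) performing the phase's action
    Returns the following pipeline status as a new array.
    """
    following = empty_pipeline()
    following[0] = list(fetched)              # new bundle enters phase 0
    for phase in range(NUM_PHASES):
        for slot in range(BUNDLE_WIDTH):
            instr = current[phase][slot]
            if instr is None:
                continue
            on_phase(phase, slot, instr)      # act on the instruction
            if phase + 1 < NUM_PHASES:        # then move it forward
                following[phase + 1][slot] = instr
    return following


# Feed two bundles, then empty bundles until the pipeline drains.
trace = []
pipe = empty_pipeline()
empty = [None] * BUNDLE_WIDTH
program = [["add", "mul"], ["ld", "nop"], empty, empty, empty]
for cycle, bundle in enumerate(program):
    pipe = step(pipe, bundle,
                lambda p, s, i, c=cycle: trace.append((c, p, s, i)))
```

Each entry of `trace` records (cycle, phase, slot, instruction), which gives the three-dimensional view (phase, operation, time) of the pipeline. Building the following status in a separate array means no instruction is processed twice within the same cycle.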
Simulation Approach 5/7: Instruction Execution
Instruction execution is simulated through an Instruction Table, which contains the address of each instruction's routine and the instruction's latency value.
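A table-driven dispatch of this kind might look like the following sketch. In Python a function reference stands in for the routine address; the opcodes, the routines `op_add` and `op_mul`, and the latency values are hypothetical and do not come from the ST210 or TI C6x instruction sets.

```python
from collections import namedtuple

# One table entry: the routine to call and the latency in machine cycles.
Entry = namedtuple("Entry", ["routine", "latency"])

MASK32 = 0xFFFFFFFF  # keep results within a 32-bit register


def op_add(a, b):
    return (a + b) & MASK32


def op_mul(a, b):
    return (a * b) & MASK32


# Hypothetical opcodes and latencies, for illustration only.
INSTRUCTION_TABLE = {
    "add": Entry(op_add, 1),
    "mul": Entry(op_mul, 3),
}


def execute(opcode, a, b):
    """Look up the opcode, run its routine, and return (result, latency)."""
    entry = INSTRUCTION_TABLE[opcode]
    return entry.routine(a, b), entry.latency


result, latency = execute("mul", 6, 7)
```

Keeping the latency next to the routine lets the pipeline model decide when a result becomes visible without knowing anything about the individual instruction.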
Simulation Approach 6/7: Register File Status Update
- The simulation environment progressively updates the pipeline status while preserving data coherence in memory locations and in the register file.
- To support data coherence, two register files are used: one holds the current register file status and the other holds the following status.
- Each time an instruction is executed, its operands are read from the current register file and its results are written to the following one.
- This allows parallel instruction execution to be simulated sequentially.
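The effect of the two register files can be seen in a small sketch. It assumes a hypothetical bundle format of (destination, operation, source 1, source 2) tuples and a helper named `execute_bundle`; neither is taken from VLIW-SIM.

```python
import operator


def execute_bundle(current, bundle):
    """Execute all instructions of one bundle as if they ran in parallel.

    Operands are always read from `current`; results are written to a
    separate `following` register file, which is returned.
    """
    following = list(current)
    for dst, op, src1, src2 in bundle:
        following[dst] = op(current[src1], current[src2])
    return following


# r0 = 0, r1 = 10, r2 = 20. The bundle swaps r1 and r2:
#   r1 = r2 + r0   and   r2 = r1 + r0   issued in the same cycle.
regs = [0, 10, 20, 0]
bundle = [(1, operator.add, 2, 0), (2, operator.add, 1, 0)]
regs = execute_bundle(regs, bundle)
```

With a single register file the second instruction would read the value of r1 already overwritten by the first one, and both registers would end up holding 20. Reading from the current file and writing to the following one yields the swap that parallel hardware would produce.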
Simulation Approach 7/7: I/O Simulation
- I/O features specific to the target architecture are kept separate from the simulation kernel.
- The SYSCALL pseudo-instruction manages the interface between internal I/O instructions (processor side) and file system I/O calls (OS side).
- SYSCALL also handles general exception handling.
- This mechanism is transparent to the other simulator modules: performance and data flow are not affected when no I/O operations are present.
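One possible shape for such an interface is a dispatch table kept outside the simulation kernel, sketched below. The class `SyscallInterface`, the call numbers `WRITE` and `EXIT`, and the argument conventions are hypothetical; the slides do not describe the actual SYSCALL encoding.

```python
import io


class SyscallInterface:
    """Maps SYSCALL numbers (processor side) to host I/O calls (OS side)."""

    WRITE = 1   # hypothetical call number
    EXIT = 2    # hypothetical call number

    def __init__(self, memory, stdout):
        self.memory = memory      # simulated target memory (bytearray)
        self.stdout = stdout      # host-side text stream
        self.exit_code = None
        self.handlers = {self.WRITE: self._write, self.EXIT: self._exit}

    def syscall(self, number, *args):
        """Dispatch one SYSCALL; unknown numbers are reported as errors."""
        handler = self.handlers.get(number)
        if handler is None:
            raise ValueError("unsupported SYSCALL number: %d" % number)
        return handler(*args)

    def _write(self, addr, length):
        # Copy bytes out of target memory and hand them to the host stream.
        data = bytes(self.memory[addr:addr + length])
        self.stdout.write(data.decode("ascii"))
        return length

    def _exit(self, code):
        self.exit_code = code
        return 0


memory = bytearray(b"hello world")
out = io.StringIO()
io_if = SyscallInterface(memory, out)
written = io_if.syscall(SyscallInterface.WRITE, 0, 5)
```

The simulation kernel only needs to call `syscall` when it decodes the pseudo-instruction, so a program that performs no I/O never enters this code.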
Modeled Target Architectures
ST210:
- Multi-cluster architecture
- 4-issue VLIW core
- I/D-cache memories
- 6-stage pipeline
- RISC-like instruction set
- bit general registers, 8 1-bit special registers
TI C62x:
- 8-issue VLIW core
- Optional I-cache memory
- 11-stage pipeline
- RISC-like instruction set
- bit general registers
TI C64x:
- 8-issue VLIW core
- I/D-cache memories
- 11-stage pipeline
- RISC/SIMD instruction set
- bit general registers
Supported Platforms
- Windows OS (Visual C++):
  - text mode: project file in vliw_sim/vliw_sim
  - graphical mode: project file in vliw_sim/gui/gui
- Windows OS (Cygwin, gcc):
  - text mode: makefile in vliw_sim/vliw_sim
  - graphical mode (with XWindows on Cygwin)
- Linux OS (RedHat, gcc):
  - text mode: makefile in vliw_sim/vliw_sim
  - graphical mode: makefile in vliw_sim/gui/gui
- Sun OS (Solaris, gcc):
  - text mode: makefile in vliw_sim/vliw_sim
  - graphical mode: makefile in vliw_sim/gui/gui
Directory layout:
vliw_sim
  bin_loader
  cache
  gui/gui
  instruction_set
  io_interf
  memory
  pipeline
  profdebug
  registers
  vliw_sim
  vliw_sim_dll
Simulation Functionalities
- Debug support:
  - Step-by-step execution
  - Breakpoints
  - Register and memory access
  - Pipeline visibility (instructions and addresses)
- Profiling:
  - Application code region profile
  - Statistics extraction for profiled code
- Simulator dynamic library:
  - Simulation API
  - SoC simulation facilities
- Exception handling simulation
- Efficient I/O interface simulation
Part 2: VLIW-SIM Performance
- Tested Applications
- SW apps on ST210
- SW apps on TI C62x
- SW apps on TI C64x
- SW apps on ST210 (1-2)
Tested Applications
ST210:
- MPEG-2 Intra Video Encoder (0.2 s, 5 frames, 15 Mbit/s)
- MPEG-1 Layer 2 Audio Encoder (1 s, 32 kHz, 256 kbit/s)
- MPEG-2 M=3 Video Decoder (1 s, 25 frames/s, 15 Mbit/s)
- MPEG-4 QCIF Video Decoder (1 s, 25 frames/s, 512 kbit/s)
- MPEG-4 QCIF Video Encoder (27 frames, 64 kbit/s, QP=12)
- H.263+ QCIF Video Encoder (10 frames, no rate control)
- G Audio Enc-Dec (20 frames, 8 kHz, 5.3 kbit/s)
- Automatic Speech Recognition (HMM, 5 words, 8 MEL, 50 active words)
TI C62x and C64x:
- H.263+ QCIF Video Encoder (5 frames, no rate control)
- G.726 Audio Enc-Dec (10 frames, 8 kHz, 32 kbit/s)
SW apps on TI C62x
Operation = one syllable (an elementary 32-bit RISC instruction).
SW apps on TI C64x
Bundle = the group of syllables issued in one clock cycle (max 8 for TI C6xx, max 4 for ST210).
SW apps on ST210 1/3
HP ISS configured with:
- ignore_non_cacheable_areas TRUE
- profile_gprof_on FALSE
SW apps on ST210 2/3
HP ISS configured with:
- ignore_non_cacheable_areas TRUE
- profile_gprof_on FALSE
SW apps on ST210 3/3
MOPS = millions of operations per second.
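The units used on the performance slides (operation, bundle, MOPS) relate to each other as in the sketch below. It assumes that MOPS counts simulated operations per second of elapsed time; the function names and the sample figures are hypothetical and are not measurements from this report.

```python
def mops(operations, seconds):
    """Millions of operations per second."""
    return operations / seconds / 1e6


def operations_per_bundle(operations, bundles):
    """Average number of syllables issued per clock cycle."""
    return operations / bundles


# Hypothetical run: 50 million operations in 4 million bundles, 20 seconds.
speed = mops(50_000_000, 20.0)
density = operations_per_bundle(6_000_000, 4_000_000)
```

The average bundle density is bounded by the issue width of the core, that is at most 8 for the TI C6xx and at most 4 for the ST210.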
Part 3: VLIW-SIM Library
- VLIW-SIM Library (1-2)
VLIW-SIM Library 1/2
VLIW-SIM can be built either as a stand-alone program or as a dynamic library (DLL). The library form is extremely useful for interfacing VLIW-SIM with other applications (system-on-chip simulation environments, graphical user interfaces, etc.).
The functionalities exported by the simulator fall into two groups:
- Command functionalities: used to control the simulation (Run, Stop, Insert/Remove breakpoint, Continue, Step, etc.)
- Status functionalities: used to retrieve the simulator's internal status and resource allocation (pipeline status and size, register file content and size, etc.)
VLIW-SIM Library 2/2
The simulator DLL exports the following functionalities:
- Control functions:
  - Load
  - Init
  - Step / Step N / Stall
  - Run
  - Restart
- Debug support:
  - View simulator status (pipeline, register file, memory)
  - Breakpoint
- Utility functions:
  - Code profiling
  - Simulated program arguments
Part 4: Next Steps
- Where we are
- VLIW-SIM developments
Where we are
Released versions 2.0 and 3.0 of VLIW-SIM.
A lot of software engineering work to improve:
- Modularity
- Readability (doxygen-generated documentation)
- Simulation speed
- Architectural accuracy:
  - ST210: IPU, DPU, Interrupt Controller, Core Memory Controller, I-cache, D-cache
  - TI C6x: I-cache and D-cache for CPU style, program memory and data memory for DSP style
- Accurate and non-invasive flat profiling (GNU format compatible)
- Flexible architectural re-configurability
- Host platform independence
- Future integration into high-level system tools
VLIW-SIM developments
- Accurate modelling of the ST220
- Integration into the MaxSim system simulation tools, and related experiments
End
Questions?