Experiments with the Peripheral Virtual Component Interface
Roman L. Lysecky, Frank Vahid*, Tony D. Givargis
Dept. of Computer Science & Engineering, University of California, Riverside
*Also with the Center for Embedded Computer Systems, UC Irvine
This work was supported by the National Science Foundation under grant # CCR-9811164, and by a Design Automation Conference graduate scholarship.

Introduction
- Advent of systems-on-a-chip (SOCs) and cores
- Peripheral cores
  - Microprocessor support components
  - UARTs, DMA controllers, CODECs, off-chip bus interfaces, etc.
- Problem: how to integrate cores into different SOCs that have different on-chip peripheral buses?
[Figure: system-on-a-chip block diagram. Labels: on-chip system bus, microprocessor, memory, bridge, on-chip peripheral bus, peripheral core, core library, "to other systems".]

Introduction: The Core Integration Problem
- Solution 1: User modifies the core for a specific bus
  - Could accidentally change the core's functionality
- Solution 2: A different core version per bus
  - Can't consider all buses
- Solution 3: Standard bus
  - Not likely [VSIA]
- Solution 4: Bus wrappers
  - Promising, but how much overhead?
[Figure: one diagram per solution. Labels: core library, peripheral core, peripheral bus X; peripheral core for X, for Y, for Z; peripheral core for std, standard bus (std), bus wrapper; bus wrapper for X.]

Introduction: Bus wrapper approach
- Bus wrapper approach
  - Proposed by the Virtual Socket Interface Alliance
  - Separate core into internals and bus wrapper
  - What overhead comes with a bus-wrapper solution?
- PVCI: Peripheral Virtual Component Interface, a standard between wrapper and internals
  - Eases integration
  - Only the bus wrapper need be modified for different buses
[Figure: system-on-a-chip block diagram in which each peripheral core is split into a bus wrapper and peripheral core internals, joined by PVCI. Labels: on-chip system bus, microprocessor, memory, bridge, on-chip peripheral bus.]

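In software terms, the split between core internals and bus wrapper resembles an adapter. The Python sketch below is only an illustration of that idea, not the authors' VHDL. The class names, method names, and the base-address scheme are all hypothetical.

```python
class CoreInternals:
    """Hypothetical peripheral internals with a PVCI-like read/write interface."""

    def __init__(self):
        self.registers = {}  # register offset -> value

    def pvci_write(self, address, wdata):
        self.registers[address] = wdata

    def pvci_read(self, address):
        # Unwritten registers read as 0 in this toy model (assumption).
        return self.registers.get(address, 0)


class WrapperForBusX:
    """Hypothetical wrapper: translates bus X accesses into PVCI-like calls."""

    def __init__(self, internals, base_address):
        self.internals = internals
        self.base_address = base_address

    def bus_x_write(self, bus_address, data):
        # Only the wrapper knows about bus X addressing.
        self.internals.pvci_write(bus_address - self.base_address, data)

    def bus_x_read(self, bus_address):
        return self.internals.pvci_read(bus_address - self.base_address)


core = CoreInternals()
wrapper = WrapperForBusX(core, base_address=0x1000)
wrapper.bus_x_write(0x1004, 0xAB)
value = wrapper.bus_x_read(0x1004)
```

Moving the same `CoreInternals` to another bus would, in this sketch, mean writing a different wrapper class and leaving the internals untouched.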
Setup for evaluating PVCI overhead
- Digital camera example
  - Synthesizable RTL VHDL
  - Synopsys synthesis, simulation and power analysis
  - About 100,000 cells
- 3 versions of the CCD and CODEC peripherals
  - Integrated
  - Non-PVCI wrapper (bi-directional), designed before PVCI
  - PVCI wrapper (uni-directional)
- 2 peripheral buses
  - ISA
  - Custom
[Figure: digital camera block diagram. Labels: MIPS, MEM., BIOS, BRIDGE, CCD, CODEC, system bus, on-chip peripheral bus.]

PVCI general structure
- Two uni-directional buses
- Handshake control
- Synchronous
[Figure: peripheral core on the on-chip peripheral bus, split into bus wrapper and peripheral core internals joined by PVCI. Signals: val, wdata, ack, rdata, address, read, clock.]

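The signal list suggests a synchronous request/acknowledge handshake with separate write-data and read-data paths. The Python model below is a minimal sketch under that assumption. The slide gives no timing rules, so the `wait_cycles` parameter, the class name, and the rule that `val` is held until `ack` are illustrative choices, not taken from the PVCI standard.

```python
class PvciSlaveModel:
    """Toy model of core internals seen through val/ack/address/read/wdata/rdata."""

    def __init__(self, wait_cycles=0):
        self.wait_cycles = wait_cycles  # hypothetical internal latency
        self.memory = {}
        self._countdown = None

    def clock(self, val, address, read, wdata):
        """Advance one clock edge and return (ack, rdata)."""
        if not val:
            self._countdown = None
            return False, None
        if self._countdown is None:
            self._countdown = self.wait_cycles  # a new request starts
        if self._countdown > 0:
            self._countdown -= 1
            return False, None  # not ready yet, ack stays low
        self._countdown = None
        if read:
            return True, self.memory.get(address, 0)
        self.memory[address] = wdata
        return True, None


def transfer(slave, address, read, wdata=None, max_cycles=100):
    """Hold val high until ack is seen; return (cycles_used, rdata)."""
    for cycle in range(1, max_cycles + 1):
        ack, rdata = slave.clock(True, address, read, wdata)
        if ack:
            return cycle, rdata
    raise TimeoutError("slave never asserted ack")


slave = PvciSlaveModel(wait_cycles=2)
write_result = transfer(slave, 0x10, read=False, wdata=7)
read_result = transfer(slave, 0x10, read=True)
```

Because write data and read data travel on separate one-way paths, the model never needs a turnaround step between a write and a following read.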
Experiments with the ISA bus
- 23-bit address bus
- 32-bit bi-directional data bus
- 4 cycles per access minimum
- Slower peripherals can extend access time using the iochrdy signal
[Figure: bus master and peripheral (bus slave) connected by isa_ale, isa_addr, isa_iochrdy, ack_data, isa_data, isa_ior, isa_iow. Timing diagram: clock, isa_addr, isa_ale, isa_data, isa_ior, isa_iow, isa_iochrdy. Annotations: "start transfer", "data ready".]

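The access-time rule can be sketched as a one-line model. This is a simplification and an assumption: the peripheral is taken to hold iochrdy low until it is ready, so an access lasts whichever is longer, the bus minimum or the peripheral's own latency. The function name and its parameter are hypothetical.

```python
ISA_MIN_CYCLES = 4  # minimum cycles per access, as stated on the slide


def isa_access_cycles(peripheral_latency_cycles):
    """Cycles for one ISA access under the simplified model described above."""
    if peripheral_latency_cycles < 1:
        raise ValueError("latency must be at least one cycle")
    # iochrdy only ever lengthens an access; it cannot go below the minimum.
    return max(ISA_MIN_CYCLES, peripheral_latency_cycles)


fast = isa_access_cycles(2)
medium = isa_access_cycles(4)
slow = isa_access_cycles(7)
```

Under this model, a peripheral that responds in 2 cycles and one that responds in 4 cycles both occupy the bus for 4 cycles, so extra latency inside the peripheral stays hidden until it exceeds the bus minimum.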
Experiments with the ISA bus: PVCI vs. Integrated
- Size overhead of about 1000 gates per peripheral
- Power overhead of about 0.05 milliwatts (<1%)
- No performance overhead, since ISA has a 4-cycle minimum access delay

Experiments with a custom peripheral bus
- Similar to ISA, but...
  - No 4-cycle minimum
  - Handshake
- Performance overhead on reads can occur
[Figure: read timing diagrams. Integrated version: clock, bus_addr, bus_data, bus_ior, bus_rdy. Wrapper version: clock, bus_addr, bus_data, bus_ior, bus_rdy, wrp_addr, wrp_data, wrp_read, wrp_ack. Annotations: "asserted by core", "asserted by bus wrapper", "asserted by core internals", "data ready", "performance overhead".]

Experiments with a custom peripheral bus: PVCI vs. Integrated
- Size overhead of about 1000 gates per peripheral
- Power overhead of about 0.05 milliwatts (<1%)
- Performance overhead of about 5% in this example

Experiments
- 1000 gates per core overhead is fairly small
  - A typical peripheral core may have 5000 to 20000 gates [Inventra library]
- 0.05 milliwatts per core overhead is also small
- No performance overhead with ISA bus
- Performance overhead of 5% on reads with faster bus
  - Essentially due to reads taking 4 cycles instead of 2 cycles

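The figures above can be put side by side with a little arithmetic. The sketch below uses only the numbers on the slide (1000 gates, 5000 to 20000 gates, 2 versus 4 read cycles). The `read_fraction` parameter is a hypothetical quantity that the slides do not report, and the linear slowdown model is an assumption.

```python
WRAPPER_GATES = 1000
CORE_GATES_LOW, CORE_GATES_HIGH = 5000, 20000
INTEGRATED_READ_CYCLES = 2
WRAPPED_READ_CYCLES = 4


def size_overhead(core_gates, wrapper_gates=WRAPPER_GATES):
    """Wrapper size as a fraction of the core size."""
    return wrapper_gates / core_gates


def slowdown(read_fraction):
    """Overall slowdown if `read_fraction` of baseline cycles are peripheral
    reads and each such read grows from 2 to 4 cycles (assumed model)."""
    return read_fraction * (WRAPPED_READ_CYCLES / INTEGRATED_READ_CYCLES - 1)


small_core = size_overhead(CORE_GATES_LOW)
large_core = size_overhead(CORE_GATES_HIGH)
example = slowdown(0.05)
```

In this model the wrapper adds between 5% and 20% to the gate count of a typical core, and a 5% overall slowdown would correspond to roughly 5% of baseline cycles being spent in peripheral reads.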
Conclusions
- Overheads in size, power and performance of PVCI vs. integrated core were small
  - The only significant overhead was performance, in certain cases
    - Our earlier work on pre-fetching can reduce or eliminate this overhead [ISSS'99, DATE'00]
    - Remerging the bus wrapper with core internals can also reduce this overhead
- PVCI and non-PVCI cores were competitive
- Integration advantages of the bus-wrapper approach seem to come with acceptable overhead