Presenter: Cheng-Ta Wu
Authors: Kenichiro Anjo, Member, IEEE, Atsushi Okamura, and Masato Motomura
IEEE Journal of Solid-State Circuits, vol. 39, no. 5, May 2004


Outline:
Abstract
What's the problem
Related works
The proposed method
Experiment results

A low-cost wrapper-based bus implementation is described that performs well in system-on-chip (SOC) designs. Novel wrapper implementation techniques are used to create wrappers without embedded data buffers. The bus uses 1) a novel slave wrapper interface that supports flow control signals, 2) a write buffer switching technique for the master wrappers to achieve good performance at a small hardware cost, 3) a novel retry management technique called slave designated retry control (SDRC) to enable slow IP core connections, together with a livelock avoidance scheme using the SDRC technique, and 4) a novel bit-width conversion technique using data-width converters embedded in the bus multiplexers. A CPU-based SOC designed with the proposed bus showed that these techniques can increase throughput by about 14% and reduce read and write latencies by about 16% and 11%, respectively, compared to a conventional wrapper-based bus when running a modeled average traffic pattern for this chip. The implementation results show that these techniques can reduce the hardware costs by 28% or 50% compared with two conventional wrapper-based conversion techniques. The chip is implemented using […]-µm CMOS process technologies. The area for the on-chip bus is 3.3 mm² and the operation clock frequency is 200 MHz.

A reliable scheme to reuse IP cores is thus important, and on-chip communication is considered a key technology for this. On-chip buses can be classified into standard buses and wrapper-based buses. Standard buses (such as the AMBA bus) specify protocols over wiring connections between IP cores. IP cores designed to comply with one of these protocols can be reused in another SOC that uses the same bus. However, they cannot be connected to a different bus without changing their bus interface logic. A wrapper-based bus (such as OCP) is a promising technology for reusing IP cores because it separates the communication logic from the cores, thereby avoiding the connectivity problems related to physical bus protocols. Hence, IP cores complying with the interface protocol can be integrated into SOCs that have different physical buses as backbones. However, attaching simple wrapper hardware increases the access latencies, so the wrapper must be given more hardware logic to optimize its performance.

Standard on-chip bus protocols: [1] AMBA, [2] CoreConnect (IBM)
Wrapper interface definitions: [3] OCP, [4] Virtual Component Interface (VCI)
[8] Embed FIFOs to buffer request and write data in the wrapper
This paper

Defined unique wrapper interfaces, including a flow-based slave interface.
Developed wrapper implementation techniques:
Write buffer switching (WBS)
Slave designated retry control (SDRC)
Bit-width conversion technique using data-width converters embedded in the bus multiplexer

The slave interface adds status signals that indicate whether the request buffer and the write data buffer in the slave core are busy, and retry interval signals that specify the back-off interval for retrying.

The CWD bus conveys requests, addresses, commands, sizes, command IDs, master IDs, write data, acknowledgments (ACKs), nonacknowledgments (NACKs), and retry information. The RD bus conveys responses, read data, command IDs, and master IDs.
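The two signal groups listed above can be pictured as simple records. The following is only an illustrative sketch: the class names, field names, and types are hypothetical, and using the command ID and master ID pair to match a response to its request is an assumption, not something the slides state.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Handshake(Enum):
    ACK = "ack"
    NACK = "nack"


@dataclass
class CwdTransfer:
    """Information carried on the CWD bus (hypothetical field names)."""
    address: int
    command: str                                # e.g. "read" or "write"
    size: int                                   # transfer size in bytes (assumed unit)
    command_id: int
    master_id: int
    write_data: Optional[bytes] = None          # present only for writes
    handshake: Optional[Handshake] = None       # ACK or NACK for the request
    retry_interval: Optional[int] = None        # retry information (assumed to be a cycle count)


@dataclass
class RdTransfer:
    """Information carried on the RD bus (hypothetical field names)."""
    command_id: int
    master_id: int
    read_data: bytes


def matches(request: CwdTransfer, response: RdTransfer) -> bool:
    # Assumption: the ID pair is what ties a response back to its request.
    return (request.command_id, request.master_id) == (
        response.command_id,
        response.master_id,
    )
```

For example, a read issued with `command_id=3` and `master_id=1` would be paired with the `RdTransfer` carrying the same two IDs.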

With a write-data buffer, the master wrapper can output the buffered write data onto the CWD bus as soon as it receives an ACK.

The master wrapper determines whether to store the write data in the buffer by comparing the requested command size to the buffer size.
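The buffering decision can be sketched in a few lines. This is a minimal model under stated assumptions, not the authors' hardware: it assumes that a write is buffered when its size does not exceed the buffer size and bypasses the buffer otherwise, and the names `use_write_buffer`, `MasterWrapper`, `request_write`, and `on_ack` are hypothetical.

```python
from typing import Optional


def use_write_buffer(command_size: int, buffer_size: int) -> bool:
    # Assumption: data that fits in the buffer is stored; larger writes bypass it.
    return command_size <= buffer_size


class MasterWrapper:
    """Toy model of a master wrapper with a single write-data buffer."""

    def __init__(self, buffer_size: int) -> None:
        self.buffer_size = buffer_size
        self.buffer: Optional[bytes] = None

    def request_write(self, data: bytes) -> str:
        """Decide how a write is handled and report the chosen path."""
        if use_write_buffer(len(data), self.buffer_size):
            self.buffer = data
            return "buffered"
        self.buffer = None
        return "bypass"

    def on_ack(self) -> Optional[bytes]:
        """Return buffered data to drive onto the CWD bus when an ACK arrives."""
        data, self.buffer = self.buffer, None
        return data
```

With a 16-byte buffer, an 8-byte write takes the buffered path and is released by `on_ack()`, while a 128-byte write takes the bypass path and leaves the buffer empty.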

CPU-based SOC with the developed wrapper-based bus. This SOC has a 400-MHz 64-bit 2-issue superscalar processor core, a 256-KB level-2 cache, and a DDR-SDRAM interface. The SDRAM is accessible through the L2 cache from an on-chip bus. On the on-chip bus, a CPU core, a PCI-X interface, two 10/100-base Ethernet MACs, a local bus interface, and a performance monitor are connected as five masters and seven slaves.

In the SDRC evaluation, 75% of the traffic from the master operating as a CPU was for the slow slave core and 25% was for the fast one. The traffic of the other master was randomly generated, with an even ratio of read and write commands. When slaves became available in 1 and 16 cycles, the throughput decreased as the retry interval was increased. When they became available in 32 and 64 cycles, the tendency changed. When slaves became available in 64 cycles, the bus with a 64-cycle retry interval had the highest throughput.
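A toy model helps show why the best retry interval depends on how long the slave stays busy. It is not the simulation behind the results above: it assumes a single master that first requests at cycle 0, a slave that becomes available after a fixed number of cycles, and retries spaced exactly one retry interval apart. The function name `retry_outcome` is hypothetical.

```python
from typing import Tuple


def retry_outcome(available_after: int, retry_interval: int) -> Tuple[int, int, int]:
    """Return (nacked_attempts, accepted_at_cycle, idle_cycles_after_ready).

    nacked_attempts counts requests that occupied the bus but were rejected;
    idle_cycles_after_ready counts cycles the slave was ready while the
    master was still backing off.
    """
    if retry_interval < 1:
        raise ValueError("retry_interval must be at least 1 cycle")
    cycle = 0
    nacks = 0
    while cycle < available_after:      # slave still busy: request is NACKed
        nacks += 1
        cycle += retry_interval         # back off before the next attempt
    return nacks, cycle, cycle - available_after


if __name__ == "__main__":
    for ready in (1, 16, 64):
        for interval in (1, 16, 64):
            print(ready, interval, retry_outcome(ready, interval))
```

In this model a slave that needs 64 cycles costs 64 rejected attempts with a 1-cycle interval but only one with a 64-cycle interval, whereas a slave that is ready after 1 cycle leaves a master with a 64-cycle interval waiting 63 cycles longer than necessary.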

We evaluated the throughput for three sizes of the write buffer in a master wrapper for four traffic patterns. Traffic pattern A was the average case in which all five masters accessed the slaves using bursts with random sizes from five supported sizes up to 128 bytes. In pattern B, one high-performance master required mostly longer burst transfers, while the other masters required shorter burst transfers. In pattern C, one master generated the same transactions we used in the SDRC evaluation as a CPU model, while the other masters generated shorter burst transfers with request intervals up to 20–30 cycles. Pattern D was the modeled traffic of the targeted SOC. One master was modeled as a CPU, as in pattern C; another master was modeled as a PCI-X interface that required frequent DMA transactions from off-chip I/O devices. The other masters required shorter burst transfers corresponding to those of Ethernet interfaces. With the developed WBS technique, using a 16-byte buffer improved the performance by about 1% to 9%, while a 128-byte buffer improved it by about 6% to 12%.

The system had two 32-bit masters, two 32-bit slaves, three 64-bit masters, three 64-bit slaves, a CWD data-width converter, and an RD data-width converter, and it used an early bus request (EBR), our WBS technique, and a flow-controlled interface (I/F). The EBR is an arbitration request that can be asserted several cycles before the read response request is initiated.
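The job of a data-width converter between 64-bit and 32-bit cores can be illustrated with plain integer arithmetic. This is a behavioral sketch only: the slides do not give the beat ordering, so sending the low 32 bits first is an assumption, and the function names are hypothetical.

```python
from typing import Tuple

MASK32 = 0xFFFFFFFF


def split_64_to_32(word: int) -> Tuple[int, int]:
    """Split one 64-bit word into two 32-bit beats (low half first, by assumption)."""
    if not 0 <= word < (1 << 64):
        raise ValueError("word must fit in 64 bits")
    return word & MASK32, (word >> 32) & MASK32


def merge_32_to_64(beats: Tuple[int, int]) -> int:
    """Merge two 32-bit beats (low half first) back into one 64-bit word."""
    low, high = beats
    if not (0 <= low <= MASK32 and 0 <= high <= MASK32):
        raise ValueError("each beat must fit in 32 bits")
    return (high << 32) | low
```

For instance, `split_64_to_32(0x1122334455667788)` yields `(0x55667788, 0x11223344)`, and `merge_32_to_64` applied to that pair restores the original word.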

Automatic Interface Synthesis based on the Classification of Interface Protocols of IPs
Protocol Transducer Synthesis using Divide and Conquer approach
Efficient Network Interface Architecture for Network-on-chip
Automatic synthesis interface
Out of order wrapper architecture
An Interface-Circuit Synthesis Method with Configurable Processor Core in IP-Based SoC Designs
Wrapper-Based Bus Implementation Techniques for Performance Improvement and Cost Reduction
Wrapper-Based Bus Implementation Techniques