Azeddien M. Sllame, Amani Hasan Abdelkader


A COMPARATIVE STUDY BETWEEN FAT TREE AND MESH NETWORK-ON-CHIP INTERCONNECTION ARCHITECTURES
Azeddien M. Sllame, Amani Hasan Abdelkader
Tripoli University, Tripoli, Libya
Aziz239@yahoo.com

SOC
- Building systems using billions of transistors on a single chip
- An SOC contains many modules with different signal types: digital, analog, and mixed-signal
- Often includes radio frequency (RF) functions
- Buses, interconnection networks
- Memory elements
- Image processing blocks (e.g. MPEG core)
- Digital signal processing (DSP) cores
- CPUs, FPGA blocks

SOC
Examples of SOC systems: mobile phones, portable media devices, cable and satellite TV set-top boxes

Definitions
- Core (node): any reusable design block, i.e. one that can be used as a building block within chip designs in hardware or as a sub-component in software programs.
- IP (Intellectual Property): refers to copyrights.
- Switch: responsible for forwarding (switching and routing) packets from the sender to the intended receiver, using suitable techniques to guarantee this function with proper flow control and reasonable quality of service.

Definitions
- Packet: the smallest unit of communication that contains routing information (e.g., destination address) and sequencing information in its header. Its size is on the order of hundreds or thousands of bytes or words. It consists of a header flit and data flits.

Definitions
- Flit: the smallest unit of information at the link layer; its size is one or several words. Flits can be of several types, and the flit exchange protocol typically requires several cycles.
- Phit: the smallest unit of information at the physical layer; it is transferred across one physical channel in one cycle.
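The packet, flit, and phit hierarchy defined above can be sketched as follows. This is only an illustration: the flit size, the phit size, the two-byte header layout, and the names `packetize` and `to_phits` are assumptions, since the slides do not specify them.

```python
from dataclasses import dataclass
from typing import List

FLIT_BYTES = 4  # assumed flit size (one word); not given in the slides
PHIT_BYTES = 2  # assumed width of the physical channel per cycle


@dataclass
class Flit:
    kind: str       # "header" or "data"
    payload: bytes  # always FLIT_BYTES long (zero-padded)


def packetize(dest: int, seq: int, data: bytes) -> List[Flit]:
    """Build one packet: a header flit followed by the data flits."""
    # Hypothetical header layout: destination address, then sequence number.
    header = bytes([dest, seq]).ljust(FLIT_BYTES, b"\x00")
    flits = [Flit("header", header)]
    for i in range(0, len(data), FLIT_BYTES):
        chunk = data[i:i + FLIT_BYTES].ljust(FLIT_BYTES, b"\x00")
        flits.append(Flit("data", chunk))
    return flits


def to_phits(flit: Flit) -> List[bytes]:
    """Split a flit into phits; each phit crosses the channel in one cycle."""
    return [flit.payload[i:i + PHIT_BYTES]
            for i in range(0, FLIT_BYTES, PHIT_BYTES)]


flits = packetize(dest=6, seq=0, data=b"abcdefghij")
phits = to_phits(flits[1])
```

With these assumed sizes, a 10-byte message becomes one header flit plus three data flits, and every flit needs two cycles on the physical channel.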

SOC
- In the past: point-to-point communication links were used
- Problems: power dissipation, crosstalk, and delays due to routing wires inside the chip
- Now: interconnection networks are used to route packets between IP cores
- Advantages: modular, well-structured, flexible, and efficient; when one IP core is idle, other IP cores continue to make use of the network resources
- Interconnection networks have already been used in supercomputers for many years

NOC
- The most distinguishing characteristic of an SOC: structure and connectivity complexity
- In practice, most SOCs are MPSOCs
- A "Network-on-Chip" (NOC) is an on-chip communication subsystem, typically built between the IP cores that compose an SOC
- Interconnection networks are adopted from computer architecture best practice to implement the NOC for SOC cores
- Hence: "route packets, not wires"

Interconnection Networks
[Figure: example topologies. Mesh; hypercube with 16 nodes labelled 0000 to 1111; multistage interconnection networks: butterfly network, fat tree]

Interconnection Networks Topology
- Static (or direct): point-to-point links interconnect the network nodes in some fixed (regular) topology, e.g. mesh and hypercube
- Dynamic (or indirect): the interconnection pattern between the network nodes can be varied dynamically (using switching), e.g. fat trees and multistage networks
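To make the fixed wiring of the two direct topologies concrete, here is a minimal sketch that lists the neighbours of a node. The function names are illustrative. It relies only on the standard definitions: a 2D mesh node links to its grid neighbours, and two hypercube nodes are linked when their binary labels differ in exactly one bit.

```python
from typing import List, Tuple


def mesh_neighbors(x: int, y: int, cols: int, rows: int) -> List[Tuple[int, int]]:
    """Neighbours of node (x, y) in a cols x rows 2D mesh (no wrap-around)."""
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(cx, cy) for cx, cy in candidates
            if 0 <= cx < cols and 0 <= cy < rows]


def hypercube_neighbors(node: int, dims: int) -> List[int]:
    """Neighbours of a hypercube node: flip each of the dims label bits once."""
    return [node ^ (1 << bit) for bit in range(dims)]


corner = mesh_neighbors(0, 0, cols=4, rows=4)
cube = hypercube_neighbors(0b0101, dims=4)
```

For the 16-node hypercube, every node has exactly four neighbours, while a mesh node has between two (corner) and four (interior).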

Switching, Routing
- Routing and switching are the two main factors that control network latency and throughput, and thus determine overall network performance
- The most commonly used switching techniques include circuit switching, packet switching, virtual cut-through, and wormhole routing
- Wormhole routing is the most common switching technique in commercial machines because it allows simple, small, cheap, and fast routers
- Wormhole routing is supported by virtual channels

Wormhole Switching
- A physical channel may support several virtual channels that are time-multiplexed across it
- Each virtual channel has its own buffer, but all of them share the single physical channel medium
- Each unidirectional virtual channel can hold, for example, four flits of the same packet; mixing flits from different packets is not allowed (see figure)

Virtual Channels
- Packets can share the physical channel on a flit-by-flit basis
- The physical channel protocol must be able to distinguish between the virtual channels
- Adding more virtual channels further reduces blocking, which increases network throughput (in flits/second) because physical channel utilization increases
- Increasing channel multiplexing reduces the data rate of each individual message and so increases its latency; nevertheless, overall network throughput increases
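The flit-by-flit sharing described above can be illustrated with a small sketch. Round-robin arbitration between the virtual channels is an assumption made here for simplicity (the slides only say the channels are time-multiplexed), and `multiplex` is a hypothetical name.

```python
from collections import deque
from typing import Deque, List, Tuple


def multiplex(vcs: List[Deque[str]]) -> List[Tuple[int, str]]:
    """Send one flit per cycle over a single physical channel.

    Non-empty virtual channels are served in round-robin order (assumed
    policy). Each flit is tagged with its virtual channel number so the
    receiver can tell the channels apart.
    """
    sent: List[Tuple[int, str]] = []
    start = 0
    while any(vcs):  # an empty deque is falsy
        for offset in range(len(vcs)):
            vc = (start + offset) % len(vcs)
            if vcs[vc]:
                sent.append((vc, vcs[vc].popleft()))
                start = (vc + 1) % len(vcs)
                break
    return sent


# Two packets, one per virtual channel; flits of a packet stay in one buffer.
channels = [deque(["A0", "A1", "A2"]), deque(["B0", "B1"])]
order = multiplex(channels)
```

In this run the physical channel is busy in every cycle, but the last flit of packet A leaves in cycle 5 instead of cycle 3, which is the latency cost of multiplexing mentioned above.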

Typical switch
It consists of:
- Switching
- Routing
- Arbitration
- Input link controller
- Output link controller

IP Switch
- Performs the functions of routing and switching
- Stores packets (flits) that are to be transferred to other intermediate switches of the fat tree network
- Each switch is bidirectional; every port is associated with a pair of opposite unidirectional channels, one for input and one for output
- It consists of: routing unit, arbitration unit, input link controller unit, output link controller unit

IP Switch: Input Link Controller Unit
Responsible for receiving incoming flits from different IPs (and switches) and forwarding them to the associated units, using the virtual channel technique. It:
- checks whether a free input virtual channel is available and, if so, returns its number;
- manages sending out the flit available in the input virtual channel buffer;
- supports the routing function by setting the outgoing physical link number that the flit occupying the virtual channel must follow to reach its destination;
- keeps track of the outgoing physical link number currently used by the flit occupying the virtual channel;
- sets up the path using the outgoing virtual channel number for the flit occupying the virtual channel;
- sends a flit by passing the front flit of the specified input virtual channel onto the corresponding input physical link; and
- performs buffer management.
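The responsibilities listed above might be organised roughly as in the sketch below. It is not the authors' implementation: the class name, the method names, and the explicit `release` step are hypothetical, and the buffer depth of four flits simply follows the example given for wormhole switching.

```python
from collections import deque
from typing import Deque, List, Optional


class InputLinkController:
    """Hypothetical input link controller with per-virtual-channel buffers."""

    def __init__(self, num_vcs: int, depth: int = 4) -> None:
        self.depth = depth
        self.buffers: List[Deque[str]] = [deque() for _ in range(num_vcs)]
        # Outgoing physical link chosen for the packet holding each VC.
        self.out_link: List[Optional[int]] = [None] * num_vcs

    def free_vc(self) -> Optional[int]:
        """Return the number of a free virtual channel, or None if all are busy."""
        for vc, buf in enumerate(self.buffers):
            if not buf and self.out_link[vc] is None:
                return vc
        return None

    def set_route(self, vc: int, link: int) -> None:
        """Record the outgoing physical link the packet in this VC must follow."""
        self.out_link[vc] = link

    def accept(self, vc: int, flit: str) -> bool:
        """Buffer an incoming flit; refuse it when the VC buffer is full."""
        if len(self.buffers[vc]) >= self.depth:
            return False
        self.buffers[vc].append(flit)
        return True

    def send_front(self, vc: int) -> Optional[str]:
        """Pass on the front flit of the given VC, if there is one."""
        return self.buffers[vc].popleft() if self.buffers[vc] else None

    def release(self, vc: int) -> None:
        """Free the VC once the whole packet has passed (assumed explicit step)."""
        self.buffers[vc].clear()
        self.out_link[vc] = None


ctrl = InputLinkController(num_vcs=2)
vc = ctrl.free_vc()
ctrl.set_route(vc, link=3)
ctrl.accept(vc, "header")
```

Keeping the outgoing link per virtual channel means only the header flit needs routing; the data flits of the same packet follow the recorded link.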

IP Switch: Output Link Controller Unit
- Responsible for receiving incoming flits from the input link controller unit (after the appropriate output link controller number has been determined)
- It then forwards them to the destination or to other intermediate switches, using the virtual channel technique
- Buffers at a specified virtual channel help the output link perform its functions

Procedure of moving a flit from switch output buffer to NEXT switch input buffer

Fat Tree Simulator Structure Packet communication flow for the proposed fat tree based

IP Node: Traffic Generator
- The traffic generator unit creates messages of random length; it works at the IP node level, generating random data to pass through the fat tree NOC model
- Each message carries random data, is generated at a random time stamp, and has a random length (different packet sizes)
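A traffic generator of this kind could look like the sketch below. The uniform choice of destination, the length and gap limits, and all names (`Message`, `generate_traffic`, `max_len`, `max_gap`) are assumptions made for illustration; the slides only state that data, time stamps, and lengths are random.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    source: int
    dest: int
    timestamp: int  # generation time, in simulation cycles (assumed unit)
    data: bytes


def generate_traffic(source: int, num_nodes: int, count: int,
                     max_len: int = 64, max_gap: int = 20,
                     seed: Optional[int] = None) -> List[Message]:
    """Create `count` messages with random data, length, and time stamp."""
    rng = random.Random(seed)
    others = [n for n in range(num_nodes) if n != source]
    messages: List[Message] = []
    now = 0
    for _ in range(count):
        now += rng.randint(1, max_gap)      # random inter-arrival gap
        length = rng.randint(1, max_len)    # random message length in bytes
        data = bytes(rng.randrange(256) for _ in range(length))
        messages.append(Message(source, rng.choice(others), now, data))
    return messages


msgs = generate_traffic(source=3, num_nodes=16, count=10, seed=1)
```

Passing a fixed `seed` makes a run repeatable, which helps when the same traffic has to be replayed against different network configurations.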

IP Node
- IP nodes are placed at the leaves (level zero) and are connected to their parent switches by two unidirectional physical links
- Each IP node generates its own messages to be sent to certain destinations

IP Node
Each node has:
- its address
- a generated message list, holding the messages it has generated
- a received message list, holding the messages received from different nodes

Results
Example: IP node #3 sends a message to IP node #6
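For an example such as node 3 sending to node 6, the height a packet must climb in a tree-shaped network can be estimated with the sketch below. It assumes a plain k-ary tree in which leaves are numbered left to right and each switch has `arity` children; the default of four children is an assumption, not a parameter taken from the slides, and the real simulator may number nodes differently.

```python
def common_ancestor_level(src: int, dst: int, arity: int = 4) -> int:
    """Lowest switch level whose subtree contains both leaves (leaves are level 0)."""
    if src == dst:
        raise ValueError("source and destination must differ")
    level = 0
    while src != dst:
        # Moving up one level maps each node to the index of its parent.
        src //= arity
        dst //= arity
        level += 1
    return level


level = common_ancestor_level(3, 6)
links = 2 * level  # links traversed: `level` going up, then `level` coming down
```

Under these assumptions, nodes 3 and 6 sit under different level-1 switches, so the packet climbs to level 2 and crosses four links in total.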

Criteria of comparison
- Topology
- Out-of-order reception of packets
- Network traffic balance
- Deadlock, livelock, starvation
- Routing
- Fault tolerance
- Congestion control
- Latency and throughput
- Network utilization
- Scalability
- Energy dissipation
- Physical realization

MESH vs. FAT tree
- Mesh networks belong to the direct interconnection networks: point-to-point links interconnect the network nodes in a fixed, regular topology
- The fat tree is the typical example of an indirect interconnection network: the interconnection arrangement among the network nodes can be changed dynamically through the network's switches

MESH vs. FAT tree
- Fat tree designs employ adaptive routing, in which livelock and starvation are possible; hence special care must be taken during switch design to avoid deadlock, livelock, and starvation
- Adaptive routing can adapt to network congestion conditions (re-routing, out-of-order transmission)
- Mesh uses deterministic XY routing, which is considered deadlock-free and livelock-free
- Deterministic routing in a 2D mesh delivers packets in order, which makes it simple to implement
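Deterministic XY routing on a 2D mesh can be sketched in a few lines: the packet first travels along the X dimension until its column matches the destination, then along the Y dimension. The function name and the coordinate convention are illustrative choices.

```python
from typing import List, Tuple

Coord = Tuple[int, int]


def xy_route(src: Coord, dst: Coord) -> List[Coord]:
    """Return the nodes visited from src to dst under XY routing."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:          # first correct the X coordinate
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:          # then correct the Y coordinate
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


route = xy_route((0, 0), (2, 1))
```

Because every packet between the same pair of nodes follows the same path, flits and packets cannot overtake one another, which is why delivery stays in order.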

The simulator: gpNoCsim (General Purpose Simulator for Network-on-Chip Architectures)
- gpNoCsim is an open-source, component-based simulation framework for NOC architectures, developed in Java
- Version 1.0 of gpNoCsim implements mesh, torus, butterfly fat tree, and extended butterfly fat tree networks
- Described in (Hemayet et al. 2007)
- Fat tree simulator: described in (Sllame et al 2012)

Topology  Number of flits / Buffer  Throughput  [Flits leaving Switch]  Avg Packet Delay
Mesh      2                         0.1605      0.54325                 45.957446808
Mesh      4                         0.2135      0.781312                65.427350427
Mesh      8                         0.185812    0.66375                 56.030927835
Fat tree  (not given)               0.16525     1.1795                  43.915254237
Fat tree  (not given)               0.171625    1.185333                43.33620689
Fat tree  (not given)               0.164812    1.120666                50.234234234

Throughput and Number of Virtual Channels for 16 IP Cores

Average Throughput per Switch (Flits Leaving Switch) and Number of Virtual Channel for 16 IP Cores

Average Packet Delays (ns) with Average Message Length (bytes) for 64 IP Cores

The Relation Between Average Packet Delay and Number of Virtual Channels for 64 IP Cores

Different Values of Buffer Size with Different Virtual Channels for Fat Tree with 64 IP Cores

Different Values of Buffer Size with Different Virtual Channels for Mesh with 64 IP Cores

Conclusions
- The main goal of this paper was to analyze the 2D mesh and fat tree architectures as candidate NOC interconnection networks.
- The evaluation was carried out using available open-source simulators.
- The comparison covers routing, the switching methods used in the switches, and the effects of buffering, the virtual channel technique, and packet length.
- We believe that the scalability and higher bandwidth of the fat tree network make it the preferred NOC for future massively parallel NOC systems.

Thank you