NETWORK-ON-CHIP (NOC): A New SoC Paradigm


NETWORK-ON-CHIP (NOC): A New SoC Paradigm Dr. Konstantinos Tatas

PRESENTATION OUTLINE Introduction. Part A: Motivation – SoC Communication, Current Solutions, NoC Concept. Part B: Work@MicroLab. Summary.

THE MANY CORES ERA Source: International Technology Roadmap for Semiconductors, 2007 edition (http://www.itrs.net/)

THE GROWING GAP: COMPUTATION VS. COMMUNICATION 2:1 9:1 Taken From ITRS, 2001

GROWING CHIP DENSITY 1998: ASIC, 0.35 µm. 2012: SoC, 22 nm (memory, I/O, µP). Future? Design complexity: high IP reuse. Efficient high-performance interconnect. Scalability of communication architecture.

Traditional SoC Nightmare: The “Board-on-a-Chip” Approach. The architecture is tightly coupled. [Figure: DMA, CPU, DSP, memory controller, bridge, MPEG and I/O blocks attached to a system bus, control wires and a peripheral bus.] Variety of dedicated interfaces. Poor separation between computation and communication. Design complexity. Unpredictable performance.

Computational demands of future multimedia applications: memory bandwidth scales proportionally. Sources: K. Uchiyama, “Power-Efficient Heterogeneous Parallelism for Digital Convergence”, VLSI Circuit Digest of Technical Papers, IEEE, pp. 6-9, June 2008; Jian Li, “3D Integration opportunities and challenges”, ISCAS 2008 tutorial on 3D.

Shared address space communications

System bus

Cross-bar

Multi-stages network on chip

An NoC example Source: ossum, Intel @ MPSoC’07

NoC Topologies Regular topologies: general-purpose on-chip multiprocessors. Custom topologies:

NoC vs. “Off-Chip” Networks: What is Different? Routers on a planar grid topology. Short point-to-point links between routers. Unique VLSI cost sensitivity: area (routers and links) and power.

NoC vs. “Off-Chip” Networks No legacy protocols to be compliant with. No software, hence simple and hardware-efficient protocols. Different operating environment (no dynamic changes and failures). Custom Network Design – You design what you need!

Custom Network Design, Example 1: Replace modules.

Custom Network Design, Example 2: Adapt links.

NoC Cost Scalability vs. Alternatives Compare the cost of: NoC, Non-Segmented Bus (NS-Bus), Segmented Bus (S-Bus), Point-To-Point (PTP).

Why NoC? Bus versus NoC:
Bus: longer connections lead to higher parasitic capacitance. NoC: performance does not degrade with network scaling.
Bus: arbitration grows and becomes a bottleneck. NoC: arbitration and routing are distributed.
Bus: bandwidth is limited and shared by all cores. NoC: aggregated bandwidth scales with network size.
Bus: latency is wire-speed once arbitration has granted control. NoC: multiple hops increase latency.
Bus: well-known and simple concepts. NoC: further study needed.

What are the main challenges? Communication infrastructure. Communication paradigm selection. Application mapping optimization. Programming model. Physical design. Design automation/tool-flow integration.

Basic Switching Techniques
Circuit switching: a real or virtual circuit establishes a direct connection between source and destination.
Packet switching: each packet of a message is routed independently, so the destination address has to be provided with each packet.
Store-and-forward packet switching: the entire packet is stored and then forwarded at each switch.
Cut-through packet switching: the flits of a packet are pipelined through the network. The packet is not completely buffered in each switch.
Virtual cut-through packet switching: the entire packet is stored in a switch only when the header flit is blocked due to congestion.
Wormhole switching: a form of cut-through switching in which, when the header flit is blocked, all flits of the packet stall in place.
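The difference between store-and-forward and cut-through switching can be illustrated with a zero-load latency estimate. The sketch below is not from the slides; it assumes one flit crosses one link per cycle, no router pipeline delay and no contention, and the function names are made up for illustration.

```python
def store_and_forward_cycles(hops: int, packet_flits: int) -> int:
    """Each switch receives the whole packet before forwarding it,
    so every link traversal costs the full packet length."""
    return hops * packet_flits


def cut_through_cycles(hops: int, packet_flits: int) -> int:
    """The header flit advances one hop per cycle and the remaining
    flits follow it in a pipeline (also the unblocked wormhole case)."""
    return hops + packet_flits - 1


if __name__ == "__main__":
    # Assumed example: an 8-flit packet crossing 4 links.
    print(store_and_forward_cycles(4, 8))
    print(cut_through_cycles(4, 8))
```

Under these assumptions the store-and-forward cost grows with the product of distance and packet length, while the cut-through cost grows with their sum.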

Circuit Switching (are they NoC?)
Phases: circuit setup, transmission, tear down.
Disadvantages: exclusive allocation of resources; long setup phase.
Advantages: high performance (throughput and latency); low power consumption; low overhead during transmission phase; predictable transmission.

Packet Switching vs Circuit Switching

NoC Router

NoC-based MPSoC components: nodes, i.e. Processing Elements (PEs) such as CPUs, custom IPs and DSPs, and storage elements (embedded memory blocks); routers; links; Network Interfaces (NIs). Often a switch together with its host node memory is referred to as a tile.

NoC Topologies: regular or irregular; direct or indirect. In a direct topology, each node has a direct point-to-point link to a subset of other nodes in the system, called neighboring nodes.

2D Mesh: the simplest and most popular topology for NoCs. Every switch, except those at the edges, is connected to four neighboring switches and one node.
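As a small illustration of the mesh connectivity described above, the following sketch lists the neighboring switches of a switch at grid position (x, y). The coordinate scheme and the function name are assumptions made for this example only.

```python
def mesh_neighbors(x: int, y: int, cols: int, rows: int) -> list[tuple[int, int]]:
    """Return the switches directly linked to switch (x, y) in a
    cols x rows 2D mesh. Edge and corner switches have fewer links."""
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(cx, cy) for cx, cy in candidates
            if 0 <= cx < cols and 0 <= cy < rows]


if __name__ == "__main__":
    print(mesh_neighbors(1, 1, 4, 4))  # interior switch
    print(mesh_neighbors(0, 0, 4, 4))  # corner switch
```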

2D Torus: the layout of a regular mesh, except that nodes at the edges are connected to switches at the opposite edge via wrap-around routing channels. Every switch has five ports. The main limitation of this topology is its long end-around connections.
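The wrap-around channels can be sketched with modular arithmetic. This is an illustrative sketch, not taken from the slides; it assumes at least three switches per dimension so that the four neighbors are distinct, and the function names are hypothetical. The fifth port of each switch is the one toward its local node.

```python
def torus_neighbors(x: int, y: int, cols: int, rows: int) -> list[tuple[int, int]]:
    """Every switch has four switch neighbors; edge switches reach the
    opposite edge through the wrap-around channel (modulo arithmetic)."""
    return [((x + 1) % cols, y), ((x - 1) % cols, y),
            (x, (y + 1) % rows), (x, (y - 1) % rows)]


def torus_hops(src: tuple[int, int], dst: tuple[int, int],
               cols: int, rows: int) -> int:
    """Minimal hop count: in each dimension take the shorter of the
    direct path and the wrap-around path."""
    dx = abs(src[0] - dst[0])
    dy = abs(src[1] - dst[1])
    return min(dx, cols - dx) + min(dy, rows - dy)


if __name__ == "__main__":
    print(torus_neighbors(0, 0, 4, 4))
    print(torus_hops((0, 0), (3, 3), 4, 4))
```

In a 4 x 4 torus the opposite corners are two hops apart, whereas the same pair would be six hops apart in a plain mesh.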

Octagon: a well-established direct topology found in NoCs. It is a ring of 8 nodes connected by 12 bi-directional links. The links provide two-hop communication between any pair of nodes in the ring, and allow simple algorithms for fast yet efficient shortest-path routing. In case a platform consists of more than eight nodes, the octagon is extended to multidimensional space.
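The 12-link and two-hop properties can be checked with a short sketch. It assumes the usual octagon construction (an 8-node ring plus a link from each node to the opposite node) and a relative-address routing rule commonly described for this topology; neither detail is spelled out on the slide, and all names are hypothetical.

```python
from collections import deque

NODES = 8


def octagon_links() -> set[frozenset[int]]:
    """8 ring links plus 4 cross links between opposite nodes (assumed)."""
    links = set()
    for i in range(NODES):
        links.add(frozenset((i, (i + 1) % NODES)))  # ring link
        links.add(frozenset((i, (i + 4) % NODES)))  # cross link
    return links


def next_hop(current: int, dst: int) -> int:
    """Route on the relative address (dst - current) mod 8."""
    rel = (dst - current) % NODES
    if rel == 0:
        return current                   # already at the destination
    if rel in (1, 2):
        return (current + 1) % NODES     # clockwise
    if rel in (6, 7):
        return (current - 1) % NODES     # counter-clockwise
    return (current + 4) % NODES         # across the octagon


def route(src: int, dst: int) -> list[int]:
    path = [src]
    while path[-1] != dst:
        path.append(next_hop(path[-1], dst))
    return path


def bfs_hops(src: int, dst: int, links: set[frozenset[int]]) -> int:
    """Reference shortest-path length, used only to check the rule above."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for link in links:
            if node in link:
                (other,) = link - {node}
                if other not in dist:
                    dist[other] = dist[node] + 1
                    queue.append(other)
    raise ValueError("unreachable")


if __name__ == "__main__":
    print(len(octagon_links()))
    print(route(0, 3))
```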

Fat-tree and butterfly fat-tree: indirect topologies in which nodes are connected to an external switch and switches have point-to-point links to other switches. Processing units and memory modules are assigned to the leaves of the tree, and switches are placed at the vertices; communication involves climbing up and down some part of the tree. A pair of coordinates ($l$, $p$) is used to label each node, where $l$ denotes a node's level and $p$ gives its position within this level.

Polygon: a widely accepted topology in which packets travel in a loop from one router to the next. Chords can be added to the circle; if chords are inserted only between opposite routers, the topology is called a spidergon.

Star: a central router sits in the middle of the star, with computational resources, or subnetworks, in the spikes of the star. The capacity requirements of the central router are quite large, and there is a significant possibility of congestion in the middle of the star.

Flow Control Levels: intra-switch, switch-to-switch, end-to-end. Types: buffered, bufferless.

ACK/NACK handshaking protocol: when a sender puts data on the link, it activates a VALID signal. When the receiver is ready to consume the valid data, it activates the corresponding ACK signal. If the data is corrupt or there is no buffer space to store it, a NACK signal is activated instead. Upon receipt of a NACK, the sender resends flits starting from the one that was not acknowledged. The protocol inherently supports fault tolerance, but additional buffer space is required to keep sent flits in case retransmission is required.
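The retransmission behavior can be sketched as a small simulation. This is an illustration under stated assumptions, not the protocol's actual hardware: the sender is assumed to put up to `window` flits on the link before the responses come back, the receiver is modeled by a hypothetical `accept(index, attempt)` callback that returns True for ACK and False for NACK, and the receiver is assumed to discard flits that follow a NACKed one.

```python
from typing import Callable


def transmit(flits: list[str],
             accept: Callable[[int, int], bool],
             window: int = 3) -> tuple[list[str], list[int]]:
    """Return (delivered flits, indices of flits put on the link).

    Sent flits stay buffered at the sender (here simply the `flits`
    list) until acknowledged, which is the extra buffer cost noted above.
    Note: loops forever if `accept` never acknowledges some flit.
    """
    delivered: list[str] = []
    link_log: list[int] = []
    attempts: dict[int, int] = {}
    next_to_send = 0  # oldest flit that has not been acknowledged
    while next_to_send < len(flits):
        burst = range(next_to_send, min(next_to_send + window, len(flits)))
        link_log.extend(burst)              # VALID raised for each flit
        for i in burst:
            attempts[i] = attempts.get(i, 0) + 1
            if accept(i, attempts[i]):      # ACK
                delivered.append(flits[i])
                next_to_send = i + 1
            else:                           # NACK: resend from this flit
                next_to_send = i
                break
    return delivered, link_log


if __name__ == "__main__":
    # Assumed scenario: flit 1 is rejected on its first attempt only.
    nack_once = lambda i, attempt: not (i == 1 and attempt == 1)
    print(transmit(["a", "b", "c", "d"], nack_once))
```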

Stall/go: requires just two control wires, one going forward and signifying data availability, and one going backward and signaling either that buffers are filled ("STALL") or that buffers are free ("GO").
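A cycle-by-cycle sketch of the two wires follows. It assumes zero wire delay, a receiver that drains one flit every `drain_every` cycles, and hypothetical class and function names; real links would also have to account for flits in flight when STALL is raised.

```python
from collections import deque


class StallGoReceiver:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.buffer: deque = deque()

    def backward_wire(self) -> str:
        """STALL when the buffer is full, GO otherwise."""
        return "STALL" if len(self.buffer) >= self.capacity else "GO"


def run(flits: list, capacity: int, drain_every: int) -> tuple[list, list[str]]:
    """Return (flits consumed by the receiver, backward-wire value per cycle)."""
    receiver = StallGoReceiver(capacity)
    pending = deque(flits)
    consumed, trace = [], []
    cycle = 0
    while pending or receiver.buffer:
        signal = receiver.backward_wire()
        trace.append(signal)
        if pending and signal == "GO":
            # Forward wire asserted: data is available and accepted.
            receiver.buffer.append(pending.popleft())
        if cycle % drain_every == drain_every - 1 and receiver.buffer:
            consumed.append(receiver.buffer.popleft())
        cycle += 1
    return consumed, trace


if __name__ == "__main__":
    print(run([1, 2, 3, 4], capacity=2, drain_every=2))
```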

Credit-based: the transmitter has a "credit" counter, initialized to the number of empty buffer slots in the receiver, and decrements it every time a flit is sent. The credit counter must be updated whenever the receiver consumes or forwards a flit and therefore frees buffer space: a credit value is sent back to the transmitter and added to the current value of the credit counter. The transmitter stalls when the credit value is zero and resumes when its value increases again.
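The counter logic can be sketched in a few lines. The class and method names are assumptions for illustration, and the sketch ignores the delay of the credit return path.

```python
class CreditSender:
    def __init__(self, receiver_slots: int):
        # Initialized to the number of empty buffer slots at the receiver.
        self.credits = receiver_slots

    def can_send(self) -> bool:
        return self.credits > 0

    def send(self, flit):
        """Consume one credit per flit; refuse to send with zero credits."""
        if not self.can_send():
            raise RuntimeError("stalled: no credits available")
        self.credits -= 1
        return flit

    def add_credits(self, count: int) -> None:
        """Called when the receiver reports freed buffer slots."""
        self.credits += count


if __name__ == "__main__":
    sender = CreditSender(receiver_slots=2)
    sender.send("flit0")
    sender.send("flit1")
    print(sender.can_send())   # transmitter is stalled here
    sender.add_credits(1)      # receiver forwarded one flit
    print(sender.can_send())
```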

NI Design: the NI is the logic required to connect the nodes to the NoC. NIs can differ significantly depending on the nature of the node. Using an NI allows IPs and communication infrastructure to be designed independently. One end of an NI is connected to a router using the selected flow control protocol, the other to the node IP. Since most IPs are designed to communicate through a bus, the NI uses a bus interface. However, the NI is not simply a protocol adapter from a processor bus to a router port: ideally, it must offer the processing cores the view of a shared memory system, and the network itself should be transparent.

NI services
Adaptation services: packetization/depacketization, protocol conversion and clock domain crossing. These are the absolute minimum services required of the NI so that data can be sent and received on the NoC.
Transaction reordering services.
Error and flow control services: error detection and/or correction; request retransmission when required.
Route computation services: source routing.
Upper-layer services: cache coherence.

Typical NoC Packet Format
Header: routing and network control information. In the case of distributed routing, the information required is the destination and source addresses; in the case of source routing, the complete routing information is written. In the case of variable packet size, a length field is required.
Payload.
Tail: sequence number; error control fields such as Hamming code or CRC fields.
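The header/payload/tail layout can be sketched as a simple record. Field names and sizes are assumptions, as the slide does not specify them; the CRC here is computed over the payload only, using CRC-32 from Python's standard `zlib` module as a stand-in for whatever code a real NoC would use.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Packet:
    # Header (distributed-routing variant: addresses plus a length field)
    src: int
    dst: int
    length: int
    # Payload
    payload: bytes
    # Tail
    seq: int
    crc: int


def packetize(src: int, dst: int, seq: int, payload: bytes) -> Packet:
    """Wrap a payload with header and tail fields, as an NI might."""
    return Packet(src=src, dst=dst, length=len(payload),
                  payload=payload, seq=seq, crc=zlib.crc32(payload))


def is_intact(packet: Packet) -> bool:
    """Receiver-side check of the length field and the CRC."""
    return (packet.length == len(packet.payload)
            and zlib.crc32(packet.payload) == packet.crc)


if __name__ == "__main__":
    pkt = packetize(src=1, dst=2, seq=0, payload=b"data")
    print(is_intact(pkt))
    pkt.payload = b"dat0"   # simulate corruption on a link
    print(is_intact(pkt))
```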

Source vs Distributed Routing: in source routing, the entire routing path is computed at the source and appended to the packet, so the routers do not make any routing decisions. In distributed routing, the routing path is decided on a hop-by-hop basis at each router, even for deterministic routing algorithms, and the only information that must be present in the packet is the destination address. The advantage of source routing is that it requires simple routers and can easily support irregular architectures. Its disadvantage is that it does not provide adaptiveness and requires more complex NIs and packets.
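The contrast can be sketched on a 2D mesh. The slides do not name a routing algorithm, so this example assumes deterministic XY (dimension-order) routing; port names, coordinates and function names are all hypothetical.

```python
MOVES = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}


def xy_next_port(current: tuple[int, int], dst: tuple[int, int]) -> str:
    """XY routing decision: correct the x offset first, then y."""
    (cx, cy), (dx, dy) = current, dst
    if dx > cx:
        return "E"
    if dx < cx:
        return "W"
    if dy > cy:
        return "N"
    if dy < cy:
        return "S"
    return "LOCAL"


def step(pos: tuple[int, int], port: str) -> tuple[int, int]:
    return (pos[0] + MOVES[port][0], pos[1] + MOVES[port][1])


def source_route(src, dst) -> list[str]:
    """Source routing: the NI computes the full port list up front
    and writes it into the packet header."""
    ports, pos = [], src
    while pos != dst:
        port = xy_next_port(pos, dst)
        ports.append(port)
        pos = step(pos, port)
    return ports


def forward_source_routed(src, ports: list[str]):
    """Routers only read the next port from the header; no decisions."""
    pos = src
    for port in ports:
        pos = step(pos, port)
    return pos


def forward_distributed(src, dst):
    """Distributed routing: each router decides from the destination
    address alone. Returns (final position, hop count)."""
    pos, hops = src, 0
    while True:
        port = xy_next_port(pos, dst)
        if port == "LOCAL":
            return pos, hops
        pos = step(pos, port)
        hops += 1


if __name__ == "__main__":
    header = source_route((0, 0), (2, 1))
    print(header)
    print(forward_source_routed((0, 0), header))
    print(forward_distributed((0, 0), (2, 1)))
```

Both variants follow the same path here; the difference is where the decision logic lives and how much routing information the packet header has to carry.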

Source vs Distributed Routing