CSC457 Seminar, YongKang Zhu, December 6th, 2001: About Network Processor



Outline
1. What is an NP, why we need it, and its features
2. Benchmarks for NP evaluation
3. Several issues in NP design
   (a) Processing unit architecture
   (b) Handling I/O events
   (c) Memory (buffer) organization and management

What is a network processor?
A network processor is a highly programmable processor suited to intelligent, flexible packet processing and traffic management at line speed in networking devices such as routers and switches.

A typical router architecture

Why NPs, and their features?
- Fast growth in transmission technology
- Advanced packet processing functions
- Traditional methods: using an ASIC or an off-the-shelf CPU
- Performance
- Programmability, flexibility
- Design and implementation complexity
- Value proposition

Benchmarks for NP evaluation
Major metrics include:
- Throughput: bps, pps, connections per second, transactions per second
- Latency: time for a packet to pass through the NP
- Jitter: variation in latency
- Loss rate: ratio of lost packets
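As a rough illustration of how the four metrics above relate to raw measurements, the sketch below computes them from a list of per-packet records. The record layout `(size_bytes, t_in, t_out)`, the function name `np_metrics`, and the choice of mean absolute deviation as the jitter definition are all assumptions made for this example; the slides do not prescribe any of them.

```python
from statistics import mean


def np_metrics(records, duration_s):
    """Compute throughput, latency, jitter and loss rate.

    records: list of (size_bytes, t_in, t_out); t_out is None for a lost packet.
    duration_s: length of the measurement window in seconds.
    """
    delivered = [r for r in records if r[2] is not None]
    latencies = [t_out - t_in for _, t_in, t_out in delivered]
    avg_latency = mean(latencies) if latencies else 0.0
    # Assumption: jitter is taken as the mean absolute deviation of latency.
    # Other definitions (e.g. inter-packet delay variation) are also in use.
    jitter = mean(abs(l - avg_latency) for l in latencies) if latencies else 0.0
    return {
        "throughput_bps": 8 * sum(size for size, _, _ in delivered) / duration_s,
        "throughput_pps": len(delivered) / duration_s,
        "latency_s": avg_latency,
        "jitter_s": jitter,
        "loss_rate": 1 - len(delivered) / len(records) if records else 0.0,
    }
```

Calling `np_metrics(records, duration_s)` on a captured trace returns one dictionary with all four figures, so different NP configurations can be compared side by side.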

CommBench - by Mark Franklin
1. Two categories of typical applications:
   - Header processing applications: RTR, FRAG, DRR, TCP
   - Payload processing applications: CAST, REED, ZIP, JPEG
2. Selecting an appropriate input mix to represent different workloads and traffic patterns
3. Design implications (computational complexity)

Importance of selecting input mix
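One way to see why the input mix matters is to weight a per-application cost by each application's share of the traffic. The sketch below does that; the function name `mix_complexity`, the cost unit (assumed here to be instructions per byte), and every number in the dictionaries are hypothetical and are not CommBench measurements.

```python
def mix_complexity(cost_per_app, mix):
    """Weighted average cost for a traffic mix whose shares sum to 1."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("mix shares must sum to 1")
    return sum(share * cost_per_app[app] for app, share in mix.items())


# Hypothetical per-application costs, for illustration only.
cost = {"RTR": 2.0, "FRAG": 1.0, "ZIP": 40.0, "JPEG": 80.0}

# Two hypothetical mixes: one dominated by header processing,
# one dominated by payload processing.
header_heavy = {"RTR": 0.5, "FRAG": 0.4, "ZIP": 0.1, "JPEG": 0.0}
payload_heavy = {"RTR": 0.1, "FRAG": 0.1, "ZIP": 0.4, "JPEG": 0.4}
```

With these made-up numbers, the payload-heavy mix demands several times the compute of the header-heavy mix on the same hardware, which is the kind of sensitivity the choice of input mix is meant to expose.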

Some issues in NP design
- Processing unit architecture
- Fast handling of I/O events
- Memory organization and management

Processing unit architecture
Four architectures reviewed:
1. a superscalar microprocessor (SS)
2. a fine-grained multithreading microprocessor (FGMT)
3. a chip multiprocessor (CMP)
4. a simultaneous multithreading microprocessor (SMT)

Comparison among the four architectures
1. CMP and simultaneous multithreading (SMT) can exploit more instruction-level parallelism and packet-level parallelism
2. However, they introduce other problems, such as how to handle cache coherence and memory consistency efficiently

Handling I/O
- Make equal-sized internal flits
- Higher-level pipeline for packet processing
- Using a coprocessor
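The first point, cutting variable-length packets into equal-sized internal flits, can be sketched as follows. The 64-byte flit size, the zero padding of the last flit, and the function names `to_flits` and `from_flits` are assumptions for illustration; the slides do not give a flit size or padding rule.

```python
def to_flits(packet: bytes, flit_size: int = 64) -> list:
    """Split a packet into equal-sized flits, zero-padding the last one."""
    if flit_size <= 0:
        raise ValueError("flit_size must be positive")
    flits = [packet[i:i + flit_size] for i in range(0, len(packet), flit_size)]
    if flits and len(flits[-1]) < flit_size:
        # Pad so every flit occupies the same number of bytes internally.
        flits[-1] = flits[-1].ljust(flit_size, b"\x00")
    return flits


def from_flits(flits, original_length: int) -> bytes:
    """Reassemble a packet and strip the padding added by to_flits."""
    return b"".join(flits)[:original_length]
```

Because every flit has the same size, the internal data path can move one flit per step regardless of the original packet length; the original length has to be carried alongside the flits so the padding can be removed on the way out.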

Higher (task) level pipeline

Memory organization & management
1. Using novel DRAM architectures:
   - Page mode DRAM
   - Synchronous DRAM
   - Direct Rambus DRAM
2. Using slow DRAM in parallel:
   - Ping-pong buffering
   - ECQF-MMA (earliest critical queue first)

Ping-pong buffering (figures: buffer organization, buffer usage)
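The figures for this slide are not in the transcript, so the following is only a sketch of one common reading of ping-pong buffering: the buffer is split into two banks, and in any time slot the write is steered to whichever bank is not busy serving the read. The class name `PingPongBuffer`, the `slot` method, and the tie-breaking rule for writes when no read occurs are assumptions for this example.

```python
from collections import deque


class PingPongBuffer:
    """FIFO packet buffer built from two banks that are never
    read and written in the same time slot."""

    def __init__(self):
        self.banks = [deque(), deque()]  # each entry is (sequence_no, packet)
        self.next_seq = 0                # sequence number for the next arrival
        self.next_out = 0                # sequence number due to depart next

    def slot(self, arriving=None, depart=False):
        """Run one time slot: at most one read and one write."""
        read_bank = None
        departed = None
        if depart:
            # The oldest packet sits at the head of one of the two banks.
            for i, bank in enumerate(self.banks):
                if bank and bank[0][0] == self.next_out:
                    read_bank = i
                    departed = bank.popleft()[1]
                    self.next_out += 1
                    break
        if arriving is not None:
            if read_bank is not None:
                # Steer the write away from the bank being read.
                write_bank = 1 - read_bank
            else:
                # No read this slot: assumed rule, write to the emptier bank.
                write_bank = 0 if len(self.banks[0]) <= len(self.banks[1]) else 1
            self.banks[write_bank].append((self.next_seq, arriving))
            self.next_seq += 1
        return departed
```

Each bank then needs to support only one access per slot, which is how two slow memories can stand in for one faster one. The trade-off in this sketch is that writes cannot always go to the bank with more free space, so the two banks can fill unevenly.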

ECQF-MMA (earliest critical queue first)
- Uses slow DRAM and fast SRAM to organize the buffer structure
- Q FIFO queues in total
- Memory bus width is b cells
- Memory random access time is 2T
- The size of each SRAM is bounded by Q * (b - 1) cells
- An arbiter selects which cells from which FIFO queue will depart in the future
- Requests to DRAM for replenishing the SRAM FIFOs are sent after accumulating to a certain amount
- Guarantees a maximum latency experienced by each cell
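The selection step can be sketched under simplifying assumptions: the arbiter's future departures are available as a lookahead list of queue ids, and a queue is "critical" at the point where its pending requests exceed the cells it currently holds in SRAM. The function names, the dictionary-based occupancy table, and the omission of all timing details are assumptions for this illustration, not the published algorithm in full.

```python
def sram_bound_cells(Q, b):
    """SRAM size bound quoted on the slide: Q * (b - 1) cells."""
    return Q * (b - 1)


def earliest_critical_queue(occupancy, lookahead):
    """Return the first queue to run out of SRAM cells, or None.

    occupancy: dict mapping queue id -> cells currently held in SRAM.
    lookahead: queue ids in the order the arbiter will request cells.
    """
    remaining = dict(occupancy)  # work on a copy, leave the input untouched
    for q in lookahead:
        remaining[q] -= 1
        if remaining[q] < 0:
            return q  # this queue goes critical before any other
    return None


def replenish(occupancy, lookahead, b):
    """Fetch b cells from DRAM for the earliest critical queue, if any."""
    q = earliest_critical_queue(occupancy, lookahead)
    if q is not None:
        occupancy[q] += b  # one wide DRAM access delivers b cells at once
    return q
```

Fetching `b` cells at a time for the queue that will run dry soonest is what lets the wide, slow DRAM keep the small SRAM from underflowing; for example, `sram_bound_cells(8, 4)` gives 24 cells for 8 queues on a 4-cell-wide bus.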

Intel's IXP1200
- 1 StrongARM core and 6 RISC microengines
- Can manage up to 24 independent threads
- Two interfaces: IX bus and PCI
  - IX bus for connecting MAC ports
  - PCI bus for connecting the master processor
- Register files replicated in each microengine
- On-chip scratch SRAM and I/O buffers
- Two sets of register files in each microengine
  - 128 GPRs and 128 transfer registers
- Instruction set architecture
  - Special field for context switch
  - Special instruction for reading on-chip scratch SRAM

One application of Intel's IXP1200

Conclusions
1. What is an NP, why we need it, and its features
2. Benchmarks
3. Processing unit architectures: CMP or simultaneous multithreading (SMT)
4. Fast handling of I/O: task pipeline, coprocessor
5. Memory architectures
-- only a small part of a huge design space