1 Progress Report (4/14/04) Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell Space Systems, Clearwater, FL.

1 Progress Report (4/14/04) Virtual Prototyping of Advanced Space System Architectures based on RapidIO Sponsor: Honeywell Space Systems, Clearwater, FL Principal Investigator: Dr. Alan D. George OPS Graduate Assistants: David Bueno, Ian Troxel RA Graduate Assistants: Chris Conger, Adam Leko Modeling and Simulation (MS) Group HCS Research Laboratory, ECE Department University of Florida

2 Presentation Outline Project Motivation and Goals Project Task Outline – Literature Review – RIO Component Modeling – GMTI Modeling – Systems Modeling – Test Plan Conclusions Future Collaboration Possibilities

3 Project Motivation and Goals Determine the optimal means by which to develop RapidIO for space systems – Perform RIO switch, board and system tradeoff studies – Identify limitations of space-based RIO design – Determine design feasibility using SBR case study – Provide assistance for Honeywell proposal efforts Lay the groundwork for future Honeywell system prototyping

4 Project Tasks Literature Review – RIO spec, RIO components, SBR, misc. RIO Component Modeling – Layers, endpoints, switches, processors, etc. – GMTI traffic models, memory boards, backplanes RIO System Modeling – Proposed systems models, test plan Simulation Experiments Data Analysis and Report

5 Literature Review: Overview Scope – Encompass major issues surrounding RapidIO-based systems – Goal: find previous work related to issues in this project Overview of results – RapidIO information and specifications – RapidIO implementation spec sheets – General information on space-based radar (GMTI and SAR) – Switch issues Multicasting Load balancing Shared-memory switch issues (buffer management) Deliverable available on Honeywell HCS page: –

6 RapidIO Specifications, Extensions
Protocol Issues, Extensions
Not specified or not supported by RapidIO protocol
Main topics considered thus far include:
– Dynamic load balancing: papers regarding dynamic load balancing schemes in other protocols, e.g. InfiniBand
– RapidFabric Protocol Extensions: end-to-end flow control, support for thousands of nodes; data streaming and traffic management; “Next Generation Physical Layer”
– Multicast algorithms: ACM paper discusses multicasting in switch-based parallel systems; Dr. Sarp Oral’s dissertation; other papers discuss general multicast issues, deadlock avoidance; RapidFabric specs also address multicast
– System-level failure recovery
RapidIO Specs
All information by RapidIO Trade Association
– MM, MP, GSM logical layers
– Common transport layer
– Parallel, serial physical layer
– Error management extensions
Specifications extremely detailed, describe all requirements for each layer and feature
Used as main reference for model design
Motorola technical whitepaper
– Brief but thorough discussion of all specifications above
– Also good reference, much shorter than official specs with many diagrams
– [RIO Spec] techwhitepaper_rev3

7 Literature Review: Switch Issues Multicasting – Switches being considered are store + forward; no deadlock possible – Modify switch internal routing table/Command and Status Registers to support multi-valued destinations – May also use RapidIO’s built-in “Multicast event” if no payload is needed Buffer management – Static thresholds for different traffic classes good, but will require tuning for load – Dynamic thresholds better and require only slightly more logic – Pushout best, but requires lots of additional logic (and silicon) Load balancing – Simple method: static load balancing based on routing tables – Methods proposed for use with InfiniBand may be adapted for RapidIO with extensions to protocol

8 Literature Review: Space-Based Radar (GMTI) Found solid background information on GMTI (Ground Moving Target Indicator) and STAP (Space-Time Adaptive Processing) Several different GMTI algorithm variants – Pre-Doppler – Post-Doppler – Post-Doppler PRI-staggered Different partitioning methods proposed – Parallel pipelined – Staggered iterations Interesting note: Honeywell baseline spec for input data size much larger than those used in existing literature

9 RapidIO Component Modeling: Overview RapidIO packet formats – Message-passing requests and responses – Flow control symbols – Packet control symbols RapidIO endpoint model – Message-passing logical layer Expandable to include I/O logical layer with minimal effort – Common transport layer – Parallel physical layer RapidIO central memory switch model – Common transport layer – Parallel physical layer GMTI traffic models – Memory board model sources/sinks traffic Statistics gathering – RIO request stats (average latency and BW) – RIO response stats (average latency and BW) – Additional statistics components to be developed as necessary

10 RapidIO Component Modeling: Endpoint Model
Key Features
– Message-Passing Logical Layer
– Common Transport Layer
– Parallel Physical Layer: receiver-controlled flow control, transmitter-controlled flow control, error detection and recovery, priority scheme for buffer management
Key Adjustable Parameters
– Packet assembly delay
– Packet disassembly delay
– Clock frequency
– Link width
– Input queue length
– Output queue length
– Four priority thresholds: determine the maximum number of packets that may be in a buffer to still accept a packet of a given priority. Example: if the threshold for priority 0 packets is 4, incoming priority 0 packets will be rejected if there are 5 or more packets currently in the input buffer
– Number of device ID bytes in packet (affects packet size and max number of devices in system)
– Buffer memory copy delay per byte
Note: As there is no “link model,” parameters such as clock frequency and link width are incorporated into the endpoint model.
Figure: High-level Endpoint Model
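
The priority-threshold rule in the example above can be sketched in a few lines of Python. This is only an illustration of the acceptance test as the slide describes it, not the simulation model itself; the class name `InputBuffer`, the `offer` method, and the threshold values for priorities 1 to 3 are hypothetical.

```python
from collections import deque


class InputBuffer:
    """Hypothetical input buffer with one acceptance threshold per priority."""

    def __init__(self, capacity, thresholds):
        self.capacity = capacity      # input queue length, in packets
        self.thresholds = thresholds  # priority -> max occupancy that still accepts
        self.queue = deque()

    def offer(self, priority, packet):
        """Return True if the packet is accepted, False if it is rejected."""
        occupancy = len(self.queue)
        # Reject when the buffer is full, or when it already holds more
        # packets than the threshold for this priority allows.
        if occupancy >= self.capacity or occupancy > self.thresholds[priority]:
            return False
        self.queue.append((priority, packet))
        return True


# Threshold 4 for priority 0 matches the slide; the other values are assumed.
buf = InputBuffer(capacity=8, thresholds={0: 4, 1: 5, 2: 6, 3: 7})
accepted = [buf.offer(0, n) for n in range(6)]
```

With these settings a priority 0 packet is still accepted when 4 packets are queued and rejected once 5 are queued, while a higher-priority packet can still enter at that occupancy.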

11 RapidIO Component Modeling: Endpoint Verification tests 2-node BW/latency results shown to right → 2 test cases: single packet, continuous packet stream Single-packet test – Average latency, BW for all possible packet sizes – Theoretical BW limit: 4 Gbps – Latency determined by: transmit time + packet disassembly time – Error detection and correction Insert packet CRC error, control symbol parity bit error, ack error Observe/verify that link partners correct error Also insert error in control symbol related to error recovery, verify behavior Packet-stream test (256-byte packets) – Around 3.5 Gbps data generation rate, link becomes saturated – Once saturated, average latency increases with time
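
As a rough check on the 4 Gbps figure, the peak rate of a parallel link can be computed from clock frequency and link width, and the single-packet latency from transmit time plus disassembly time. The sketch below assumes the 125 MHz DDR link rate from the baseline configuration together with a 16-bit link width; the width, the function names, and the zero disassembly delay in the usage line are assumptions, not values taken from the slides.

```python
def peak_bandwidth_bps(clock_hz, link_width_bits, ddr=True):
    """Peak raw link rate: one transfer per clock edge when DDR is used."""
    transfers_per_cycle = 2 if ddr else 1
    return clock_hz * transfers_per_cycle * link_width_bits


def single_packet_latency_s(packet_bytes, bandwidth_bps, disassembly_delay_s):
    """Latency = transmit time + packet disassembly time, as on the slide."""
    transmit_time_s = packet_bytes * 8 / bandwidth_bps
    return transmit_time_s + disassembly_delay_s


# Assumed 16-bit link at 125 MHz DDR.
bw = peak_bandwidth_bps(125e6, 16)
# 256 bytes on the wire with a hypothetical zero disassembly delay.
latency = single_packet_latency_s(256, bw, 0.0)
```

Packet headers and control symbols consume part of this raw rate, which is consistent with the link saturating near 3.5 Gbps of generated data rather than at the full 4 Gbps.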

12 RapidIO Component Modeling: Central-Memory Switch Model Key Features – Selectable cut-through or store-and-forward routing – Non-blocking architecture – Routes packets based solely on destination ID (read from a routing table file) as per RIO spec – RIO Common Transport Layer – RIO Parallel Physical Layer
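
Routing purely on destination ID amounts to a table lookup. The sketch below shows the idea with a made-up text format (one `destination_id output_port` pair per line, `#` for comments); the actual routing table file format used by the model is not given in the slides, so the format and both function names are hypothetical.

```python
def parse_routing_table(text):
    """Parse lines of 'dest_id port' into a dict (hypothetical file format)."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        dest_id, port = line.split()
        table[int(dest_id)] = int(port)
    return table


def route(table, dest_id):
    """Select the output port using only the packet's destination ID."""
    return table[dest_id]


example = """
0 1
1 3   # two destinations may share one output port
2 3
"""
table = parse_routing_table(example)
port = route(table, 1)
```

Because the lookup ignores source, priority, and load, any load balancing in this scheme has to be expressed statically in the table contents.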

13 RapidIO Component Modeling: Central-Memory Switch Model Key Adjustable Parameters – Cut-through/store-and-forward behavior – Average central-memory read latency – Average central-memory write latency – Central-memory size – Link width, clock frequency, and other RapidIO physical layer parameters – Static priority threshold scheme based on free memory in switch

14 RapidIO Component Modeling: Switch/Small-System Verification
Verified N-hop latency for N switches (endpoint-to-endpoint latency)
– Cut-through: Latency = AT + XT + N × WL + DT
– Store-and-forward: Latency = AT + XT + N × (WL + XT) + DT
– Symbols: AT = packet assembly delay, XT = transmission time, WL = switch memory write latency, DT = packet disassembly delay, N = number of hops between endpoints
– Passed packets through multiple switches, observed timestamps at various points and compared with expected values
Error correction, flow control verification (Rx/Tx)
– Error tests similar to 2-node, except using switch port as partner
– Flow control verified by using various window sizes, observing link partner chatter
– Adjusted endpoint and switch priority thresholds, verified correct acceptance/denial
All-to-one, one-to-all delivery verification
– Using system shown to right, packets from generator sent to all nodes round-robin
– All nodes configured to receive packets and redirect to single memory sink
– Even with saturated links, each node eventually receives expected packets
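
The two N-hop latency expressions differ only in the per-hop term, which a short Python function makes explicit. The function name and the numbers in the usage lines are illustrative: 72 ns is the baseline switch memory latency quoted later in the deck, 512 ns is the time to send 256 bytes at 4 Gbps, and the 10 ns assembly and disassembly delays are assumed values.

```python
def n_hop_latency(at, xt, wl, dt, n, cut_through):
    """Endpoint-to-endpoint latency through n switches.

    at: packet assembly delay      xt: transmission time
    wl: switch memory write latency dt: packet disassembly delay
    All times must use the same unit.
    """
    # Cut-through forwards while the packet is still arriving, so each hop
    # adds only WL. Store-and-forward retransmits the whole packet per hop.
    per_hop = wl if cut_through else wl + xt
    return at + xt + n * per_hop + dt


# Times in nanoseconds; at and dt are assumed for illustration.
ct = n_hop_latency(at=10, xt=512, wl=72, dt=10, n=2, cut_through=True)
sf = n_hop_latency(at=10, xt=512, wl=72, dt=10, n=2, cut_through=False)
```

The gap between the two modes is N × XT, so it grows with both hop count and packet size.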

15 GMTI Modeling: Algorithm Description Global memory board sends groups of RapidIO packets to processing boards Each processing board has 4 processors GMTI algorithm used is a post-Doppler variant with 4 stages: – Pulse compression – Doppler processing – Weight computation/Beamforming (STAP) – CFAR After processing, packets containing detection information get sent back to main memory

16 GMTI Modeling: Algorithm Flow Overview

17 GMTI Modeling: Processing Board Overview

18 GMTI Modeling: Data Cube Generator Key features – Can control amount of data generated, as well as rate – Evenly spreads out packets over entire CPI Diagram to right shows effect of increasing generation rate while keeping data size constant Blocks represent packets, dashed lines represent CPI Must be careful to not saturate → balance data size/CPI – No endpoint, only generates data Legend Blue oval – signal new CPI, calculate number of packets to create Red square – loop over number of packets Purple square – time delay between each packet Green square – create a packet, fill out necessary fields Pink oval – create last packet, may be smaller size
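
The generator flow in the legend (compute the packet count, loop with a fixed delay, emit a possibly smaller last packet) can be sketched as follows. This is a minimal sketch rather than the model's actual logic: the function name is hypothetical, and it assumes the first packet leaves at the start of the CPI with equal gaps afterwards.

```python
import math


def schedule_cpi(total_bytes, packet_bytes, cpi_s):
    """Return (send_time_s, size_bytes) pairs spread evenly over one CPI."""
    n_packets = math.ceil(total_bytes / packet_bytes)
    gap_s = cpi_s / n_packets  # time delay between consecutive packets
    schedule = []
    for i in range(n_packets):
        remaining = total_bytes - i * packet_bytes
        size = min(packet_bytes, remaining)  # last packet may be smaller
        schedule.append((i * gap_s, size))
    return schedule


# Hypothetical data size; the 256 ms CPI matches the baseline configuration.
plan = schedule_cpi(total_bytes=1000, packet_bytes=256, cpi_s=0.256)
sizes = [size for _, size in plan]
```

Raising the data size or shortening the CPI shrinks the gap between packets, which is the saturation risk the slide warns about.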

19 GMTI Modeling: Global Memory Board Key features – May serve as sensor data source, global memory endpoint (port), or both – One endpoint and one generator per model – Using multiple instances in a system represents: Multiple sensors each generating part of the complete data cube Multiple ports to the same global memory – 3 simple blocks in yellow circle give significant flexibility in controlling data traffic Key adjustable parameters – Ranges, channels, pulses – CPI – Size per element, packet size – Packet group size (message length) – Destination ID(s) to send to – Memory delay per byte – All RapidIO endpoint parameters Legend: BLUE – data cube generator; YELLOW – traffic shaper; RED – RapidIO endpoint

20 GMTI Modeling: Baseline Board/Backplane Models Baseline Processor Board Model – Four processors Compute node ASIC + RIO endpoint – One 8-port switch – One link to each endpoint – Four links available for backplane connection Baseline Backplane Model – Four 8-port switches Minimal number for a symmetric configuration – Two links to each processor board “Wastes” 2 links per board – One link to memory board – Many additional configurations to explore

21 GMTI Modeling: System Model Baseline configuration – 125MHz DDR RIO links – Receiver-controlled flow control – Two-link trunk between each switch – Static, store-and-forward routing – 10kB switch central-memory size – 72ns average read/write latency per packet for switch memory – Baseline GMTI algorithm Pulses = 126 Ranges = Beams = 6 CPI = 256ms

22 Additional Proposed System Model #1 Generic Backplane Model Double links for up to 12 boards Routing tables and switch setup independent of number of boards actually used, or application Two levels of switches – Each 1st-level switch has two links to level 2 – 2 free links per switch, currently forms ring – Each 2nd-level switch is connected to all 1st-level switches, as well as double global memory link (red circles, bottom) Uniform, dual-link access to all other boards – Logical neighbors one less hop with ring N-Board Configuration (N = 1 … 12) Diagram to left shows system with 6 boards inserted All 4 global-memory links are used (bottom), assuming 4 ports/endpoints to memory; input data cube from global memory also System purposefully built to be application independent, reusable – Attempt to maintain performance with increased versatility – GMTI may be pipelined or staggered Easy to simulate applications with different numbers of processors Each board also has two free links, could be used

23 Additional Proposed System Model #2 Legend: blue circles – GMTI algorithm tasks; red circles – global memory ports Custom System Configuration – Pipelined Algorithm – Currently assumes # boards as specified in GMTI spreadsheet Changing # procs/boards may require remaking routing tables, switch layout – For tasks with > 1 board allocated, form star topology among boards for increased intra-task communication If not necessary/worthwhile, use extra board links for double connections to switches All switches have free links for small architectural adjustments – Algorithm will be pipelined – Not all global memory ports must be used; currently have switch support for up to 4 input, 3 output

24 Proposed Experiments Motivation – Maximize performance Minimize the time needed for one CPI of the GMTI algorithm (must be < 256ms for baseline configuration) – Minimize cost/power Number of boards/processors/switches Independent variables – System configuration – Routing tables – Store-and-forward vs. cut-through routing – Flow control method (tx-controlled vs. rx-controlled) – Link rate (125MHz vs. 250 MHz) – Priority thresholds – Endpoint queue lengths – Switch central memory size Initial experiments will examine the effects of varying these one at a time Experiments to follow will seek correlations between the parameters
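
The "one at a time" phase of the test plan corresponds to a simple run list: the baseline plus one run per alternative setting, with every other variable held at its baseline value. The sketch below is only an illustration; the dictionary keys and the function name are hypothetical, and only three of the listed independent variables are included.

```python
def one_factor_at_a_time(baseline, alternatives):
    """Build run configurations that each change a single variable."""
    runs = [dict(baseline)]  # the first run is the unmodified baseline
    for name, values in alternatives.items():
        for value in values:
            if value == baseline[name]:
                continue  # already covered by the baseline run
            run = dict(baseline)
            run[name] = value
            runs.append(run)
    return runs


# Baseline values follow the baseline system model; key names are made up.
baseline = {"routing": "store-and-forward", "flow_control": "rx", "link_mhz": 125}
alternatives = {
    "routing": ["store-and-forward", "cut-through"],
    "flow_control": ["rx", "tx"],
    "link_mhz": [125, 250],
}
runs = one_factor_at_a_time(baseline, alternatives)
```

The follow-on experiments that look for correlations between parameters would instead need combinations of settings, which grows the run count multiplicatively rather than additively.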

25 Additional Features to Explore Multicast and flow control extensions for RIO – Spec to be released this year – Part of RapidFabric extensions Dynamic load balancing – RIO spec technically allows only static load balancing Must accomplish using clever routing tables – Possible to extend the spec to allow some dynamic load balancing Literature search on this topic revealed several promising directions Primary challenge is RIO packet delivery ordering rules – Experiments can be conducted to determine optimal method for dynamic load balancing, if Honeywell’s applications warrant SBR algorithm alternatives – Pipelining of GMTI algorithm – Staggering of GMTI algorithm – SAR (our initial focus directed at GMTI) RapidIO I/O Logical Layer – Remote reads/writes instead of message passing – Can be easily added to the models if deemed important by Honeywell for their applications Figure: RapidIO Architecture Layers Highlighting RapidFabric Extensions. Diagram courtesy of: RapidFabric RapidIO Extensions Whitepaper, March

26 Project Tasks Literature Review – RIO spec, RIO components, SBR, misc. RIO Component Modeling – Layers, endpoints, switches, processors, etc. – GMTI traffic models, memory boards, backplanes RIO System Modeling – Proposed systems models, test plan Simulation Experiments Data Analysis and Report

27 Project Timetable Literature Review [complete, yet ongoing] – RIO spec, RIO components, SBR, misc. RIO Component Modeling [complete] – Layers, endpoints, switches, processors, etc. – GMTI traffic models, memory boards, backplanes RIO System Modeling [in progress] – Proposed systems models, test plan Simulation Experiments [to begin May 10th] Data Analysis and Report [to begin June 20th]

28 Conclusions Completed literature review – Will include future topics as required RIO components and systems are well underway GMTI case study is well underway Experiments identified Foundation for future integrated payload prototyping is well underway

29 Future Collaboration Possibilities Expanding the current project – Jeremy Ramos’s suggestions for ST-8 Examine latency sensitivity in RIO systems with data and control packets Examine RIO’s ability to throttle flow control for buffered pipeline processing with systems that have limited software support (i.e. RC devices) – Include other aspects of UF’s Fast and Accurate Simulation Environment (FASE) Algorithm profiling, system tradeoff analysis New directions – Wireless RC interconnects proposal – ST-8 / Integrated Payload middleware study Green Hills and other RTOS providers (evaluation / collaboration) Monitoring and management of RC components (light version of UF’s CARMA) Algorithm / system prototyping – Other possibilities?