Published by Lester Hines. Modified over 9 years ago.
12/9/04

Virtual Prototyping of Advanced Space System Architectures based on RapidIO
Sponsor: Honeywell DSES Space, Clearwater, FL
Principal Investigator: Dr. Alan D. George
Graduate Research Assistants: David Bueno, Ian Troxel, Chris Conger, Adam Leko
HCS Research Laboratory, ECE Department, University of Florida

Outline
I. Project Overview
II. Background
    I. RapidIO (RIO)
    II. Ground-Moving Target Indicator (GMTI)
    III. Synthetic Aperture Radar (SAR)
III. Partitioning Methods
IV. Modeling Environment and Models
V. Experiments and Results
VI. Conclusions

Project Overview
- Simulative analysis of Space-Based Radar (SBR) systems based on RapidIO interconnection networks
  - RapidIO (RIO) is a high-performance, switched interconnect for embedded systems
    - High system scalability (boards, processors, memory units, sensors, etc.)
    - Far better bisection bandwidth than existing bus-based technologies
- Study optimal method of constructing scalable RIO-based systems for Ground Moving Target Indicator (GMTI) and Synthetic Aperture Radar (SAR)
  - Identify system-level tradeoffs in system designs
- Discrete-event simulation of RapidIO network, processing elements, and GMTI/SAR algorithms
  - Identify limitations of RIO design for SBR
  - Determine effectiveness of various GMTI/SAR algorithm partitionings over RIO network
Image courtesy http://www.afa.org/magazine/aug2002/0802radar.asp

Background: RapidIO
- Three-layered, embedded system interconnect architecture
  - Logical: memory-mapped I/O, message passing, and globally shared memory
  - Transport: controls routing of packet from source to destination
  - Physical: serial and parallel
- Point-to-point, packet-switched interconnect
- Peak single-link throughput ranging from 2 to 64 Gb/s
- Focus on 16-bit parallel LVDS RIO implementation for satellite systems
Image courtesy G. Shippen, “RapidIO Technical Deep Dive 1: Architecture & Protocol,” Motorola Smart Network Developers Forum, 2003.

Background: GMTI
- GMTI used to track moving targets on ground
- Estimated processing requirements range from 40 (aircraft) to 280 (satellite) GFLOPS
- GMTI broken into four stages:
  - Pulse Compression (PC)
  - Doppler Processing (DP)
  - Space-Time Adaptive Processing (STAP)
  - Constant False-Alarm Rate detection (CFAR)
- Incoming data organized as 3-D matrix (data cube)
- Data reorganization (“corner turn”) necessary between stages for processing efficiency
- Size of each cube dictated by Coherent Processing Interval (CPI)

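The corner turn mentioned above can be pictured as re-partitioning data between processors so that each stage holds locally the dimension it processes along. The following is a minimal sketch only, not the project's model: it uses a 2-D matrix instead of a 3-D data cube for brevity, assumes both dimensions divide evenly by the processor count, and the function names `split_rows` and `corner_turn` are hypothetical.

```python
def split_rows(matrix, num_procs):
    """Partition a matrix by rows: processor p gets a contiguous block of rows."""
    rows_per_proc = len(matrix) // num_procs
    return [matrix[p * rows_per_proc:(p + 1) * rows_per_proc]
            for p in range(num_procs)]


def corner_turn(local_blocks):
    """Re-partition row blocks into column blocks via an all-to-all exchange.

    local_blocks[p] holds the rows owned by processor p. In the result,
    result[q] holds the columns owned by processor q, each column stored
    as one contiguous list.
    """
    num_procs = len(local_blocks)
    num_cols = len(local_blocks[0][0])
    cols_per_proc = num_cols // num_procs

    # Step 1: every sender p cuts a distinct piece for every receiver q
    # ("personalized" means each destination gets different data).
    messages = [[None] * num_procs for _ in range(num_procs)]
    for p, block in enumerate(local_blocks):
        for q in range(num_procs):
            lo, hi = q * cols_per_proc, (q + 1) * cols_per_proc
            messages[p][q] = [row[lo:hi] for row in block]

    # Step 2: every receiver q stacks the pieces from all senders in
    # sender order, then transposes so its columns become contiguous.
    result = []
    for q in range(num_procs):
        stacked = [row for p in range(num_procs) for row in messages[p][q]]
        result.append([list(col) for col in zip(*stacked)])
    return result


# Hypothetical 4x4 example distributed over 2 processors.
matrix = [[r * 4 + c for c in range(4)] for r in range(4)]
turned = corner_turn(split_rows(matrix, 2))
```

In this sketch every processor sends a piece to every other processor, which is why the exchange loads the whole network at once.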
GMTI: Parallel Partitioning
- Straightforward partitioning
  - Entire system works in parallel on a single data cube
  - All-to-all personalized communication used to perform corner-turn
  - Result latency must be ≤ 1 CPI
- Staggered partitioning
  - Processors work in small groups, one data cube for each group
  - Incoming data cubes are sent to groups via round-robin distribution
  - Result latency must be ≤ N × CPI, where N = number of processor groups
- Pipelined partitioning
  - Processors work in small groups, each group responsible for one stage of algorithm
  - Corner turns “for free”
  - Result latency must be ≤ N × CPI, where N = number of stages
[Figures: Straightforward Partitioning; Pipelined Partitioning; Staggered Partitioning]

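The latency bounds and the round-robin distribution above can be written down in a few lines. This is a hedged sketch: the `Partitioning` class, both function names, and the 0.25-second CPI are hypothetical values chosen for illustration and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Partitioning:
    name: str
    latency_multiplier: int  # the N in "N x CPI" (1 for straightforward)


def latency_bound(partitioning, cpi_seconds):
    """Largest result latency that still keeps up with the incoming cubes."""
    return partitioning.latency_multiplier * cpi_seconds


def staggered_group_for_cube(cube_index, num_groups):
    """Round-robin distribution: cube i goes to group i mod N."""
    return cube_index % num_groups


# Hypothetical configuration: 0.25 s CPI, 4 processor groups for the
# staggered case, and one group per GMTI stage (PC, DP, STAP, CFAR) for
# the pipelined case.
CPI_SECONDS = 0.25
straightforward = Partitioning("straightforward", 1)
staggered = Partitioning("staggered", 4)
pipelined = Partitioning("pipelined", 4)

bounds = {p.name: latency_bound(p, CPI_SECONDS)
          for p in (straightforward, staggered, pipelined)}
assignment = [staggered_group_for_cube(i, 4) for i in range(6)]
```

The multiplier captures the tradeoff: a staggered or pipelined system may take N times longer per cube because N cubes are in flight at once.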
SAR Algorithm Flow and Partitioning
- SAR composed of 7 sub-tasks
  - Processing of 2-D radar data
  - Each sub-task’s optimal data partitioning varies
  - Processed out of global memory
- Required data gathering and image processing times are extensive
  - 16-second Coherent Processing Interval (CPI)
  - Gigabytes of data, processed in smaller chunks (kB to MB range)
- Long processing time places short deadline on tolerable time to report results
- Data size stays constant throughout algorithm

Model Library Overview
- RIO system modeling library created using Mission Level Designer (MLD), a commercial discrete-event simulation modeling tool
  - C++-based, block-level, hierarchical modeling tool
- Algorithm modeling accomplished via script-based processing
  - All processing nodes read from a global script file to determine when/where to send data, and when/how long to compute
- Our model library includes:
  - RIO central-memory switch
  - Compute node with RIO endpoint
  - SBR traffic source/sink
  - RIO logical message-passing layer
  - Transport and parallel physical layers
[Figure: Model of Compute Node with RIO Endpoint]

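For readers unfamiliar with discrete-event simulation, the core idea is a time-ordered event queue: the simulator jumps from one scheduled event to the next rather than stepping a clock. The sketch below is generic Python and is not MLD's API, which the slides do not show; the `Simulator` class and the delay values are assumptions for illustration.

```python
import heapq


class Simulator:
    """Minimal discrete-event kernel: a clock plus a time-ordered queue."""

    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._seq = 0  # tie-breaker so equal-time events run in schedule order

    def schedule(self, delay, action):
        """Run `action` (a zero-argument callable) `delay` time units from now."""
        heapq.heappush(self._queue, (self.now + delay, self._seq, action))
        self._seq += 1

    def run(self):
        """Pop events in time order, advancing the clock to each one."""
        while self._queue:
            self.now, _, action = heapq.heappop(self._queue)
            action()


log = []
sim = Simulator()


def send_packet():
    log.append((sim.now, "send"))
    # Hypothetical 2.0 time-unit link delay before the packet arrives.
    sim.schedule(2.0, lambda: log.append((sim.now, "receive")))


sim.schedule(1.0, send_packet)
sim.schedule(5.0, lambda: log.append((sim.now, "compute done")))
sim.run()
```

Events may schedule further events while running, which is how a send at one node turns into a receive at another.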
GMTI Processor Board Models
- System contains many processor boards connected via backplane
  - Each processor board contains one RIO switch and four processors
- Processors modeled with three-stage finite state machine
  - Send data
  - Receive data
  - Compute
- Behavior of processors controlled with script files
  - Script generator converts high-level GMTI parameters to script
  - Script is fed into simulations
[Figure: Model of Four-Processor Board]
[Diagram: GMTI & system parameters, Script generator, Processor script (send… receive…), Simulation]

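The script-driven processor model described above might look roughly like the following. This is an illustrative sketch only: the script format, the function names `generate_script` and `run_processor`, and all parameter values are assumptions, and the fixed-rate link used here ignores the network contention that the actual simulation models.

```python
def generate_script(num_cubes, cube_bytes, compute_seconds, source, sink):
    """Expand high-level parameters into a flat list of processor commands."""
    script = []
    for _ in range(num_cubes):
        script.append(("receive", source, cube_bytes))
        script.append(("compute", compute_seconds))
        script.append(("send", sink, cube_bytes))
    return script


def run_processor(script, link_bytes_per_second):
    """Walk the script and return the total time spent in each state.

    Transfers are charged at a fixed link rate (an assumption of this
    sketch); compute commands carry their own duration.
    """
    time_in_state = {"send": 0.0, "receive": 0.0, "compute": 0.0}
    for command in script:
        state = command[0]
        if state == "compute":
            time_in_state[state] += command[1]
        elif state in ("send", "receive"):
            time_in_state[state] += command[2] / link_bytes_per_second
        else:
            raise ValueError(f"unknown script command: {state!r}")
    return time_in_state


# Hypothetical parameters: 2 data cubes of 1000 bytes, 0.5 s compute each,
# and a 1000 B/s link so the arithmetic is easy to follow.
script = generate_script(2, 1000, 0.5, "source", "sink")
totals = run_processor(script, 1000.0)
```

Keeping the algorithm in a script means the same processor model can be reused for different partitionings by regenerating the script alone.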
System Design Constraints
- 16-bit parallel 250 MHz DDR RapidIO link (1 GB/s)
  - Expected basis of RadHard component performance by time RIO and SBR ready to fly in ~2008 to 2010
- Systems composed of processor boards interconnected by RIO backplane
  - 4 processors per board
  - 8 Floating-Point Units (FPUs) per processor
  - One 8-port central-memory switch per board; implies 4 connections to backplane per board

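The 1 GB/s figure and the four backplane connections follow from simple arithmetic, worked through below. The sketch assumes that "DDR" means two transfers per clock cycle and that 1 GB/s is the raw link rate before any protocol overhead; the constant and function names are hypothetical.

```python
LINK_WIDTH_BITS = 16
CLOCK_HZ = 250e6
TRANSFERS_PER_CLOCK = 2  # assumption: DDR transfers on both clock edges


def link_bytes_per_second(width_bits, clock_hz, transfers_per_clock):
    """Raw link rate in bytes per second, ignoring protocol overhead."""
    return width_bits * clock_hz * transfers_per_clock / 8


SWITCH_PORTS = 8
PROCESSORS_PER_BOARD = 4
FPUS_PER_PROCESSOR = 8

link_rate = link_bytes_per_second(LINK_WIDTH_BITS, CLOCK_HZ, TRANSFERS_PER_CLOCK)

# Ports not used by on-board processors are left for the backplane.
backplane_ports = SWITCH_PORTS - PROCESSORS_PER_BOARD

# Raw upper bound on traffic between one board and the backplane.
board_to_backplane_rate = backplane_ports * link_rate

fpus_per_board = PROCESSORS_PER_BOARD * FPUS_PER_PROCESSOR
```

With these numbers each board has as many backplane links as processors, so in this raw-rate view the board's own switch is not the bottleneck for off-board traffic.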
Backplane and System Models
- High bandwidth requirements for GMTI algorithm dictate architecture for SAR systems
  - Expectation that systems must eventually support both SAR and GMTI
- Same backplane efficiently supports four-, five-, six-, and seven-board configurations
  - Can use smaller switches on backplane to conserve power if fewer than seven boards needed
- Three-board system possible using similar configuration with only two backplane switches
[Figures: 7-Board System; 4-Switch Non-blocking Backplane; Backplane-to-Board 0, 1, 2, 3 Connections; Backplane-to-Board 4, 5, 6, and Data Source/GM Connections]

GMTI Performance Results
- Studied RapidIO system designs for space-based GMTI
- Important conclusions:
  - Non-blocking backplane extremely important for GMTI
  - GMTI not sensitive to latency of individual packets
    - Cut-through routing unnecessary in switches
  - RapidIO transmitter- and receiver-controlled flow control perform almost identically
  - Straightforward partitioning method provides lowest latency, but with least efficient use of resources
  - Staggered partitioning (by board) method is very efficient, but with very high latency
  - Pipelined method is a compromise

SAR Performance Results
- Message-passing logical layer involves higher overhead
  - All operations have responses
  - CPI latency is constant as chunk size increases
  - Low levels of contention in network for all cases
- Logical I/O layer is inherently better for current mapping (global-memory based)
  - Approximately ½ of operations are writes, which require no response
  - Contention increases as chunk size increases
- May be possible to use synchronization to reduce contention, and double-buffering for latency hiding
  - Small processing times reduce benefit of double-buffering
  - Smart use of double-buffering only on certain tasks with long computation may maximize benefit of this approach

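The point about small processing times can be seen in a simple timing model of double-buffering, sketched below. It assumes two local buffers, a lock-step schedule in which the fetch of the next chunk overlaps the computation on the current one, and no network contention; the function names and the chunk timings are hypothetical.

```python
def serial_time(chunks):
    """Single buffer: each chunk is fetched, then processed, in sequence."""
    return sum(fetch + compute for fetch, compute in chunks)


def double_buffered_time(chunks):
    """Two buffers: fetching chunk i+1 overlaps computing on chunk i.

    `chunks` is a list of (fetch_seconds, compute_seconds) pairs.
    """
    if not chunks:
        return 0.0
    total = chunks[0][0]  # the first fetch has nothing to hide behind
    for i, (_, compute) in enumerate(chunks):
        next_fetch = chunks[i + 1][0] if i + 1 < len(chunks) else 0.0
        # The step ends when both the compute and the overlapped fetch finish.
        total += max(compute, next_fetch)
    return total


# Hypothetical timings: four chunks with a 1.0 s fetch each.
balanced = [(1.0, 1.0)] * 4        # compute as long as the fetch
short_compute = [(1.0, 0.25)] * 4  # compute much shorter than the fetch

saved_balanced = serial_time(balanced) - double_buffered_time(balanced)
saved_short = serial_time(short_compute) - double_buffered_time(short_compute)
```

In this model the time hidden per step is the smaller of the compute time and the next fetch time, so a short computation leaves little to hide. The second buffer is also what drives the extra local memory that double-buffering requires.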
Conclusions
- RapidIO is an emerging high-performance interconnect technology that promises to be an important enabling technology for future generations of latency- or throughput-bound applications
  - Strong industry support, but very little scholarly research on RapidIO to date
- Powerful SBR/RapidIO simulation and modeling capabilities developed at UF
  - GMTI models
  - SAR models
  - RapidIO “Construction Set”
  - Altia-MLD interface library
- GMTI is extremely network-intensive
  - Various partitioning options, with straightforward partitioning producing lowest latency
  - Staggered partitioning has highest latency but uses network most efficiently
- Network requirements of SAR easily handled by networks designed for GMTI
  - Biggest challenge of SAR is memory requirements
  - Could benefit from further in-depth study on processor/memory architecture issues
- RIO Logical I/O layer found to benefit SAR CPI completion latency via responseless write operations
  - Well-suited to distributed-memory approach
- Double-buffering may improve performance of SAR application
  - Increases local memory requirements by factor of 2 to 3
- Cut-through routing significantly benefits neither GMTI nor SAR in studies thus far
- Experimental testbed for RIO system research is under development
  - First step will be 2-node RIO testbed: Xilinx RIO cores and two small FPGA boards recently acquired
  - Looking into options to build larger and more powerful testbed (comments welcome)

Project Publications in 2004
- RIO Systems Paper at HPEC’04
  - D. Bueno, C. Conger, A. Leko, I. Troxel, and A. George, “Virtual Prototyping and Performance Analysis of RapidIO-based System Architectures for Space-Based Radar,” Proc. of High-Performance Embedded Computing (HPEC) Workshop, MIT Lincoln Lab, Lexington, MA, Sep. 28-30, 2004.
- RIO Networking Paper at LCN’04
  - D. Bueno, A. Leko, C. Conger, I. Troxel, and A. George, “Simulative Analysis of the RapidIO Embedded Interconnect Architecture for Real-Time, Network-Intensive Applications,” Proc. of 29th IEEE Conference on Local Computer Networks (LCN) via the IEEE Workshop on High-Speed Local Networks (HSLN), Tampa, FL, Nov. 16-18, 2004.

Future Work
- Potential follow-ons to current RapidIO work
  - Produce “Day in the life of SBR” simulation (with SAR and GMTI)
  - Develop GMTI-specific Altia GUI (the current Altia simulation is SAR-specific)
  - Develop “Day in the life of SBR” GUI
  - Explore ways to increase the performance of SAR through double-buffering and synchronization
  - Examine the effects of per-node memory size on SAR and GMTI traffic
- Development of RapidIO system testbed
- Couple RIO work with new research in reconfigurable computing (RC) for advanced space systems
- Future projects are in planning stages