GTRI_B-1 ECRB - HPC - 1 Using GPU VSIPL & CUDA to Accelerate RF Clutter Simulation 2010 High Performance Embedded Computing Workshop 23 September 2010.

Presentation transcript:

GTRI_B-1 ECRB - HPC - 1 Using GPU VSIPL & CUDA to Accelerate RF Clutter Simulation 2010 High Performance Embedded Computing Workshop 23 September 2010 Dan Campbell, Mark McCans, Mike Davis, Mike Brinkmann

GTRI_B-2 ECRB - HPC - 2 Outline: RF Clutter Simulation; Validation Approach; GPU VSIPL; Precision Issues; VSIPL Port, Optimization, and Results

GTRI_B-4 ECRB - HPC - 4 Radar Clutter: The radar will observe an echo from the object, as well as a strong return from the ground. Strong returns from the ground, called “clutter”, often limit the performance of radars in air-to-air and air-to-ground operations.

GTRI_B-5 ECRB - HPC - 5 Synthetic Air-to-Air Clutter [Figure: synthetic clutter panels at 7,500 Hz, 10,000 Hz, and 12,500 Hz; labels: MPRF, Look-Down MPRF, RG-HPRF, HPRF] Targets at the same range/Doppler as clutter will be obscured.

GTRI_B-6 ECRB - HPC - 6 RF Clutter Simulation Approach: Subdivide the ground into a number of unresolvable clutter patches and compute the contribution of each.

GTRI_B-7 ECRB - HPC - 7 RF Clutter Simulation: Radar clutter data is the sum of delayed and phase-shifted versions of the radar waveform. [Equation; term labels: Phase Shift, Delayed Signal]
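
The sum described on this slide can be sketched in a few lines of Python. This is an illustration only, not the presenters' code: the function name `clutter_return`, the `(amplitude, range)` patch layout, and the rounding of each delay to a whole sample are all assumptions made for the example.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def clutter_return(waveform, patches, fs, fc):
    """Sum delayed, phase-shifted copies of a complex baseband waveform.

    waveform: list of complex samples of the transmitted waveform.
    patches:  list of (amplitude, range_m) pairs, one per clutter patch
              (hypothetical layout chosen for this sketch).
    fs:       sample rate in Hz.
    fc:       carrier frequency in Hz.
    """
    # Longest two-way delay, in whole samples, sets the output length.
    max_shift = max(int(round(2.0 * r / C * fs)) for _, r in patches)
    out = [0j] * (len(waveform) + max_shift)
    for amp, r in patches:
        tau = 2.0 * r / C                             # two-way delay (s)
        shift = int(round(tau * fs))                  # assumption: whole-sample delay
        phase = cmath.exp(-2j * math.pi * fc * tau)   # carrier phase shift
        for n, s in enumerate(waveform):
            out[shift + n] += amp * phase * s         # accumulate this patch
    return out
```

A real simulator would also need fractional-sample delays and per-patch amplitudes from the radar range equation; both are left out here to keep the structure of the sum visible.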

GTRI_B-8 ECRB - HPC - 8 RF Clutter Simulation: Notional Parameters
Table rows: # of Range Bins, # of Pulses, # of Clutter Patches. Columns: Air-to-Air, SAR Imaging (Air-to-Ground), Our Test.
# of Clutter Patches, Air-to-Air: 6,800 Rng x 96 Az = 6.5 x 10^5
# of Clutter Patches, SAR Imaging (Air-to-Ground): […],500 Rng x 26,812 Az = 3.8 x […]
# of Clutter Patches, Our Test: […] rng x 52 az = 29,432
Computational load depends on radar parameters and collection geometry (e.g., high-resolution scenarios require a large number of independent clutter patches).

GTRI_B-9 ECRB - HPC - 9 RF Clutter Simulation Algorithm
Inputs: Radar parameters (waveform, antenna, etc.); location of platform for each pulse.
Output: Simulated radar data cube (sample voltage for each pulse, each channel, and each range bin).
For each pulse and for each range bin…
For each clutter patch in this range ring…
1. Compute range, azimuth, and elevation from platform to clutter patch.
2. Scale contribution of this clutter patch according to the radar range equation.
3. Accumulate the contribution of this clutter patch to the simulated data cube.
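
The loop structure on this slide might look roughly like the following Python sketch. It is a simplified stand-in rather than the original MATLAB or VSIPL code. It models a single channel, and `antenna_gain` is a hypothetical isotropic placeholder. The `1/R^2` voltage scaling follows from the `1/R^4` power law of the radar range equation, and the two-way phase term is an assumption about how each patch contribution is formed.

```python
import cmath
import math


def antenna_gain(azimuth, elevation):
    """Hypothetical placeholder: isotropic antenna (gain of 1 everywhere)."""
    return 1.0


def simulate_clutter(platform_pos, range_rings, wavelength):
    """Return cube[pulse][range_bin] of complex voltages for one channel.

    platform_pos: list of (x, y, z) platform positions, one per pulse.
    range_rings:  one entry per range bin; each entry is a list of
                  (x, y, z, reflectivity) clutter patches in that ring.
    """
    cube = [[0j for _ in range_rings] for _ in platform_pos]
    for p, (px, py, pz) in enumerate(platform_pos):        # each pulse
        for b, ring in enumerate(range_rings):             # each range bin
            for cx, cy, cz, sigma in ring:                 # each patch in ring
                dx, dy, dz = cx - px, cy - py, cz - pz
                # Step 1: range, azimuth, elevation from platform to patch.
                rng = math.sqrt(dx * dx + dy * dy + dz * dz)
                az = math.atan2(dy, dx)
                el = math.asin(dz / rng)
                # Step 2: range-equation scaling (power ~ 1/R^4, voltage ~ 1/R^2).
                amp = antenna_gain(az, el) * math.sqrt(sigma) / (rng * rng)
                # Step 3: accumulate with the two-way propagation phase.
                cube[p][b] += amp * cmath.exp(-4j * math.pi * rng / wavelength)
    return cube
```

The triple nesting makes the cost proportional to pulses times total clutter patches, which is why the patch counts on the previous slide drive the computational load.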

GTRI_B-11 ECRB - HPC - 11 Validation Needs: Porting MATLAB → C introduces changes: random number generator; double → single precision; implementation of some functions, e.g. transcendentals; reordering of operations; programmer error. Identical output is too costly to require, so acceptance criteria are derived from expected usage needs.

GTRI_B-12 ECRB - HPC - 12 Validation Approach: Modify the sim to capture the RNG stream from MATLAB. Automate a large number of runs for golden data. The accelerated port optionally ingests the RNG stream. Capture port output and compare to golden data.
Acceptance Criteria:
CNR∆ = (CNR_M – CNR_T) / CNR_M < […]
ECR = 20 log10( norm(M(:) - T(:)) / norm(M(:)) ) < -60 dB
ADMSE = mean( |fft2(M(:)) - fft2(T(:))|^2 ) < […]
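
The acceptance metrics can be computed with a short standard-library Python sketch. It assumes that M is the golden MATLAB output and T is the output of the port under test, which the slide implies but does not state. The helper names, the naive DFT, and the use of 2-D lists for the ADMSE inputs are choices made for this example. The CNR values are taken as inputs because the slides do not define how CNR itself is measured.

```python
import cmath
import math


def cnr_delta(cnr_m, cnr_t):
    """Relative CNR difference, written exactly as on the slide."""
    return (cnr_m - cnr_t) / cnr_m


def ecr_db(m, t):
    """20*log10(norm(M - T) / norm(M)) for flattened sample lists."""
    num = math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(m, t)))
    den = math.sqrt(sum(abs(a) ** 2 for a in m))
    if num == 0.0:
        return float("-inf")  # identical outputs
    return 20.0 * math.log10(num / den)


def dft(x):
    """Naive 1-D DFT, adequate for small validation arrays only."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * i * k / n) for k in range(n))
            for i in range(n)]


def dft2(a):
    """2-D DFT of a list of rows: transform rows, then columns."""
    rows = [dft(r) for r in a]
    cols = [dft([rows[i][j] for i in range(len(rows))])
            for j in range(len(rows[0]))]
    return [[cols[j][i] for j in range(len(cols))] for i in range(len(rows))]


def admse(m2d, t2d):
    """Mean squared magnitude of the difference between the two 2-D spectra."""
    fm, ft = dft2(m2d), dft2(t2d)
    errs = [abs(x - y) ** 2 for rm, rt in zip(fm, ft) for x, y in zip(rm, rt)]
    return sum(errs) / len(errs)
```

For data cubes of realistic size an FFT library would replace the naive `dft`; the definitions of the metrics stay the same.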

GTRI_B-14 ECRB - HPC - 14 GPU VSIPL. VSIPL: Industry standard C API for portable dense linear algebra & signal processing. Also C++, Python. Accelerated implementations for many platforms, primarily embedded, coprocessor-based systems. GPU VSIPL: VSIPL implementation that exploits Graphics Processing Units to accelerate VSIPL applications – developed at GTRI.

GTRI_B-16 ECRB - HPC - 16 Original Validation Results: VSIPL versions compared to MATLAB version
Columns: VSIPL Double, VSIPL Single, Threshold
CNR Consistent: Yes
CNR∆: threshold < […]
ECR: VSIPL Double -152 dB; VSIPL Single 2.9 dB; threshold < -60 dB
ADMSE: threshold < […]

GTRI_B-17 ECRB - HPC - 17 Single Precision: Single precision errors are caused by high dynamic range in the platform-to-clutter-patch range calculation: d(platform → clutter) >>> d(clutter patch → clutter patch). Solution: use a far-field approximation technique. Double precision is used to compute a base range; single precision is used for sets of ∆R values. The small number of double precision calculations has a negligible effect on performance.
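
A small standard-library Python sketch shows why the split helps. Near 100 km, adjacent single precision values are 2^-7 m (about 7.8 mm) apart, so a millimeter-scale offset vanishes if the whole range is stored in single precision. The helper names and the 100 km / 1 mm figures are illustrative assumptions, and `struct` is used only to emulate single precision rounding.

```python
import struct


def f32(x):
    """Round a Python float (double) to the nearest IEEE-754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def range_all_single(base_range, delta_r):
    """Whole range held in single precision: small offsets are rounded away."""
    return f32(f32(base_range) + f32(delta_r))


def range_split(base_range, delta_r):
    """Base range kept in double; only the small offset is single precision."""
    return base_range + f32(delta_r)


base = 100_000.0   # illustrative platform-to-scene range in meters
delta = 0.001      # illustrative 1 mm patch-to-patch offset

lost = range_all_single(base, delta)   # offset disappears entirely
kept = range_split(base, delta)        # offset survives to sub-nanometer error
```

In this sketch the expensive per-patch arithmetic would operate on the single precision offsets, with the double precision base range added once per group.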

GTRI_B-18 ECRB - HPC - 18 Far Field Approx. via Taylor Expansion: Range between platform at x and clutter patch at y; linear approximation near x_0; quadratic term. [Equations; term labels: distance from center of scene; unit vector from CPI center to clutter patch; distance travelled in direction orthogonal to “lines” of constant range]
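
The slide's equations are not preserved in this transcript. A standard second-order expansion consistent with the surviving labels is sketched below; the symbols Δx, R_0, and û are notation introduced here and may differ from the slide's.

```latex
\begin{align*}
R(x) &= \lVert y - x \rVert, \qquad
\Delta x = x - x_0, \qquad
R_0 = \lVert y - x_0 \rVert, \qquad
\hat{u} = \frac{y - x_0}{R_0} \\
R(x) &= R_0 \sqrt{1 - \frac{2\,\hat{u}\cdot\Delta x}{R_0}
        + \frac{\lVert \Delta x \rVert^2}{R_0^2}} \\
     &\approx \underbrace{R_0 - \hat{u}\cdot\Delta x}_{\text{linear approximation}}
        + \underbrace{\frac{\lVert \Delta x \rVert^2
        - (\hat{u}\cdot\Delta x)^2}{2 R_0}}_{\text{quadratic term}}
\end{align*}
```

The numerator of the quadratic term is the squared distance travelled orthogonal to û, that is, along the lines of constant range, so the error of the linear approximation grows with the square of that distance and shrinks with range.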

GTRI_B-19 ECRB - HPC - 19 Bounding Error: Approximation Error. Case 1: Air-to-Air: 128 pulses, 20 kHz PRF, 300 m/s velocity, 10 km altitude → error < 50 µm, < 0.06° phase at X band. Case 2: SAR: 10 second dwell, 100 m/s velocity, 10 km altitude → error > λ at X band!!! Linear approximation to range may be appropriate for typical air-to-air scenarios.
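
The two cases can be checked with a few lines of Python. The sketch bounds the neglected quadratic term by d^2 / (2 R), under three assumptions not stated on the slide: the platform moves orthogonal to the line of sight (worst case), the largest displacement d from the CPI center is half the distance flown during the dwell, and the minimum range R equals the 10 km altitude. The 3 cm X-band wavelength is also an assumed round figure.

```python
X_BAND_WAVELENGTH_M = 0.03  # assumption: roughly 10 GHz


def quadratic_range_error(dwell_s, speed_mps, min_range_m):
    """Upper bound on the quadratic term dropped by the linear approximation."""
    d = speed_mps * dwell_s / 2.0          # max displacement from CPI center
    return d * d / (2.0 * min_range_m)


# Case 1: air-to-air, 128 pulses at 20 kHz PRF, 300 m/s, 10 km altitude.
case1 = quadratic_range_error(128 / 20_000.0, 300.0, 10_000.0)

# Case 2: SAR, 10 second dwell, 100 m/s, 10 km altitude.
case2 = quadratic_range_error(10.0, 100.0, 10_000.0)

print(f"case 1 error bound: {case1 * 1e6:.1f} micrometers")
print(f"case 2 error bound: {case2:.1f} meters "
      f"({case2 / X_BAND_WAVELENGTH_M:.0f} wavelengths)")
```

Under these assumptions Case 1 comes out near 46 µm, in line with the slide's bound of 50 µm, while Case 2 reaches 12.5 m, hundreds of wavelengths, which is why the linear approximation is not suitable for the SAR geometry.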

GTRI_B-20 ECRB - HPC - 20 Validation Results: Comparison to original MATLAB version; approximation technique used in each version listed
Columns: MATLAB Single, VSIPL Double, VSIPL Single, Threshold
CNR Consistent: Yes
CNR∆: threshold < […]
ECR: MATLAB Single -101 dB; VSIPL Double -130 dB; VSIPL Single -98 dB; threshold < -60 dB
ADMSE: threshold < […]

GTRI_B-22 ECRB - HPC - 22 VSIPL Port: The MATLAB to VSIPL port was made easier by VSIPL functions that emulate MATLAB operations. The original MATLAB code was very complex, particularly for a radar novice. The first pass of the port was done with almost no attempts at optimization. The GPU transition required some additional changes: single vs. double precision issues; time costs of operations differ; TASP → GPU VSIPL needs a “sample” function.

GTRI_B-23 ECRB - HPC - 23 Optimization Issues: The MATLAB code was written for readability over speed: too many nested loops and operations involving small datasets; many redundant calculations. The original code was very flexible, due to its large user base: most optimizations required removing some generality; assumptions need to be made about the scenario. Abstraction barrier issues: small operations are less costly on the CPU than on the GPU; operation fusion, coarser operations, and leaving small things in C each helped.

GTRI_B-24 ECRB - HPC - 24 HPC Port – Performance: Optimization progression of single precision VSIPL [chart; step labels]: 1. Reduced generality; dynamic → static. 2. Small ops VSIPL → C. 3. Reduced generality; simplified operations. 4. Hoisted invariants; reordered for fusion. 5. Stride consciousness; coarser VSIPL ops; loop fusion.
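
Two of the listed techniques, hoisting invariants and loop fusion, can be illustrated with a generic Python sketch. It is not taken from the port. The function names and the per-patch formula are assumptions chosen to resemble the clutter accumulation. The first version makes three passes over the data with temporaries, much as a chain of separate array-library calls would; the second computes loop-invariant values once and does all the work in a single pass.

```python
import cmath
import math


def accumulate_unfused(ranges, sigma, wavelength):
    """Three separate passes, with invariants recomputed for every element."""
    amps = [math.sqrt(sigma) / (r * r) for r in ranges]              # pass 1
    phases = [cmath.exp(-4j * math.pi / wavelength * r)
              for r in ranges]                                       # pass 2
    return sum(a * p for a, p in zip(amps, phases))                  # pass 3


def accumulate_fused(ranges, sigma, wavelength):
    """One pass, with loop-invariant values hoisted out of the loop."""
    root_sigma = math.sqrt(sigma)        # hoisted: does not depend on r
    k = -4j * math.pi / wavelength       # hoisted: does not depend on r
    total = 0j
    for r in ranges:
        total += root_sigma / (r * r) * cmath.exp(k * r)
    return total
```

On a GPU the fused form typically corresponds to one kernel launch and no intermediate arrays instead of several launches with temporaries, which is one plausible reading of why fusion and coarser operations helped.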

GTRI_B-25 ECRB - HPC - 25 HPC Port – Performance: Performance Timing Results (GTX 480 / Q6600; TASP single core only)
Version | Runtime (s) | Speedup
MATLAB | 162.5 | 1x
TASP VSIPL Double | […] | […]x
TASP VSIPL Single | […] | […]x
GPU VSIPL Single | […] | […]x
CUDA Native | 1.3 | 125x