The HPEC Challenge Benchmark Suite

Slides:



Advertisements
Similar presentations
SSCA #3 Sensor Processing Knowledge Formation and Data I/O Serial v1.0
Advertisements

Distinctive Image Features from Scale-Invariant Keypoints
Doc.: IEEE xxx a Submission May 2005 Zafer Sahinoglu, Mitsubishi Electric Research Labs Slide 1 Project: IEEE P Working Group for Wireless.
Spectral Analysis of Function Composition and Its Implications for Sampling in Direct Volume Visualization Steven Bergner GrUVi-Lab/SFU Torsten Möller.
MIT 2.71/2.710 Optics 10/25/04 wk8-a-1 The spatial frequency domain.
The imaging problem object imaging optics (lenses, etc.) image
Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol Li Fan, Pei Cao and Jussara Almeida University of Wisconsin-Madison Andrei Broder Compaq/DEC.
RESPONSIVE AND MOBILE DESIGN BY: CHRIS PASQUARETTE APRIL, 2013.
Introduction Lesson 1 Microsoft Office 2010 and the Internet
Kasumi Block Cipher Data Encryptors Darshan Gandhi Rushabh Pasad.
MySQL Users Conf MIT Lincoln Laboratory Measuring MySQL Server Performance for Sensor Data Stream Processing Jacob Nikom MIT Lincoln Laboratory.
PML Semiconductor Electronics Division, NIST
Research Challenges in the CarTel Mobile Sensor System Samuel Madden Associate Professor, MIT.
DAQmx下多點(Multi-channels)訊號量測
Database System Concepts and Architecture
IPM THEORY CHALLENGE QUIZ NUMBER 1. Q1 - We are able to place organisations into which of the following categories based on their prime purpose A.Profit.
 The environment is everything that influences or is influenced by an information system and its purpose. It includes any factors that affect the system.
CSCE 643 Computer Vision: Template Matching, Image Pyramids and Denoising Jinxiang Chai.
-1- Barcelona, June 27 th, nd ITFG Workshop THz Signal & Image Processing.
Computer Vision Lecture 7: The Fourier Transform
Remote - DSP Lab for Distance Education
Panoptes: A Scalable Architecture for Video Sensor Networking Applications Wu-chi Feng, Brian Code, Ed Kaiser, Mike Shea, Wu-chang Feng (OGI: The Oregon.
23057page 1 Physics of SAR Summer page 2 Synthetic-Aperture Radar SAR Radar - Transmits its own illumination a "Microwave flashlight" RAdio.
Sampling, Reconstruction, and Elementary Digital Filters R.C. Maher ECEN4002/5002 DSP Laboratory Spring 2002.
Through Wall Radar ECE 480 Fall 2008 Design Day Presentation.
Computer Vision Introduction to Image formats, reading and writing images, and image environments Image filtering.
Application of Digital Signal Processing in Computed tomography (CT)
Use of FOS for Airborne Radar Target Detection of other Aircraft Example PDS Presentation for EEE 455 / 457 Preliminary Design Specification Presentation.
Synthetic-Aperture Radar (SAR) Image Formation Processing
BENCHMARK SUITE RADAR SIGNAL & DATA PROCESSING CERES EPC WORKSHOP
Fernando Ortiz EM Photonics, Inc. Newark, DE
SAR Imaging Radar System A fundamental problem in designing a SAR Image Formation System is finding an optimal estimator as an ideal.
MULTITEMP 2005 – Biloxi, Mississippi, USA, May 16-18, 2005 Remote Sensing Laboratory Dept. of Information and Communication Technology University of Trento.
MIT Lincoln Laboratory Ind. Day 3/10/ For Official Use Only / ITAR exempt from mandatory disclosure under the FOIA, Exemption 3 applies. Not approved.
MIT Lincoln Laboratory HPEC 2004 KASSPER Testbed-1 GES 4/21/2004 A KASSPER Real-Time Embedded Signal Processor Testbed Glenn Schrader Massachusetts Institute.
XYZ 10/2/2015 MIT Lincoln Laboratory Sidecar Network Adapters: Open Architecture for ISR Data Sources Craig McNally, Larry Barbieri, Charles Yee.
Dr A VENGADARAJAN, Sc ‘F’, LRDE
HPEC_NDP-1 MAH 10/11/2015 MIT Lincoln Laboratory Efficient Multidimensional Polynomial Filtering for Nonlinear Digital Predistortion Matthew Herman, Benjamin.
Slide 1 MIT Lincoln Laboratory Toward Mega-Scale Computing with pMatlab Chansup Byun and Jeremy Kepner MIT Lincoln Laboratory Vipin Sachdeva and Kirk E.
GISMO Simulation Study Objective Key instrument and geometry parameters Surface and base DEMs Ice mass reflection and refraction modeling Algorithms used.
Radar Pulse Compression Using the NVIDIA CUDA SDK
Detection of Obscured Targets: Signal Processing James McClellan and Waymond R. Scott, Jr. School of Electrical and Computer Engineering Georgia Institute.
HPEC SMHS 9/24/2008 MIT Lincoln Laboratory Large Multicore FFTs: Approaches to Optimization Sharon Sacco and James Geraci 24 September 2008 This.
Slide-1 LACSI Extreme Computing MITRE ISIMIT Lincoln Laboratory This work is sponsored by the Department of Defense under Army Contract W15P7T-05-C-D001.
HPEC2002_Session1 1 DRM 11/11/2015 MIT Lincoln Laboratory Session 1: Novel Hardware Architectures David R. Martinez 24 September 2002 This work is sponsored.
Haney - 1 HPEC 9/28/2004 MIT Lincoln Laboratory pMatlab Takes the HPCchallenge Ryan Haney, Hahn Kim, Andrew Funk, Jeremy Kepner, Charles Rader, Albert.
Range-wavenumber representation of SAR Signal
MITSUBISHI ELECTRIC RESEARCH LABORATORIES Cambridge, Massachusetts High resolution SAR imaging using random pulse timing Dehong Liu IGARSS’ 2011 Vancouver,
Hardware Benchmark Results for An Ultra-High Performance Architecture for Embedded Defense Signal and Image Processing Applications September 29, 2004.
SAR-ATR-MSTAR TARGET RECOGNITION FOR MULTI-ASPECT SAR IMAGES WITH FUSION STRATEGIES ASWIN KUMAR GUTTA.
MIT Lincoln Laboratory HPEC JML 28 Sep 2004 Mapping Signal Processing Kernels to Tiled Architectures Henry Hoffmann James Lebak [Presenter] Massachusetts.
Sponsored By Abstract 1 Ritamar Siurano – Undergraduate Student Prof. Domingo Rodriguez – Advisor Abigail Fuentes – Graduate StudentProf. Ana B. Ramirez.
Sponsored By Abstract 1 Ritamar Siurano – Undergraduate Student Prof. Domingo Rodriguez – Advisor Abigail Fuentes – Graduate Student Prof. Ana B. Ramirez.
Backprojection and Synthetic Aperture Radar Processing on a HHPC Albert Conti, Ben Cordes, Prof. Miriam Leeser, Prof. Eric Miller
HPEC HN 9/23/2009 MIT Lincoln Laboratory RAPID: A Rapid Prototyping Methodology for Embedded Systems Huy Nguyen, Thomas Anderson, Stuart Chen, Ford.
High Performance Flexible DSP Infrastructure Based on MPI and VSIPL 7th Annual Workshop on High Performance Embedded Computing MIT Lincoln Laboratory
PTCN-HPEC-02-1 AIR 25Sept02 MIT Lincoln Laboratory Resource Management for Digital Signal Processing via Distributed Parallel Computing Albert I. Reuther.
The objective of Nonlinear Equalization (NLEQ) is to extend the dynamic range of RF receivers by reducing in-band nonlinear distortions. This enables detection.
MIT Lincoln Laboratory 1 of 4 MAA 3/8/2016 Development of a Real-Time Parallel UHF SAR Image Processor Matthew Alexander, Michael Vai, Thomas Emberley,
HPEC-1 SMHS 7/7/2016 MIT Lincoln Laboratory Focus 3: Cell Sharon Sacco / MIT Lincoln Laboratory HPEC Workshop 19 September 2007 This work is sponsored.
DISPLACED PHASE CENTER ANTENNA SAR IMAGING BASED ON COMPRESSED SENSING Yueguan Lin 1,2,3, Bingchen Zhang 1,2, Wen Hong 1,2 and Yirong Wu 1,2 1 National.
Atindra Mitra Joe Germann John Nehrbass
Albert I. Reuther & Joel Goodman HPEC Sept 2003
M. Richards1 ,D. Campbell1 (presenter), R. Judd2, J. Lebak3, and R
Session 4: Reconfigurable Computing
Image Pyramids and Applications
Parallel Processing in ROSA II
Matthew Lyon James Montgomery Lucas Scotta 13 October 2010
Targeted Searches using Q Pipeline
Introduction To Wavelets
Presentation transcript:

The HPEC Challenge Benchmark Suite Ryan Haney, Theresa Meuse, Jeremy Kepner and James Lebak Massachusetts Institute of Technology Lincoln Laboratory HPEC 2005 The title of this talk is, “The HPEC Challenge Benchmark Suite.” This work is sponsored by the Defense Advanced Research Projects Agency under Air Force Contract FA8721-05-C-0002. Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the United States Government. 1 1

Acknowledgements Lincoln Laboratory PCA Team Shomo Tech Systems Matthew Alexander Jeanette Baran-Gale Hector Chan Edmund Wong Shomo Tech Systems Marti Bancroft Silicon Graphics Incorporated William Harrod Sponsor Robert Graybill, DARPA PCA and HPCS Programs

HPEC Challenge Benchmark Suite PCA program kernel benchmarks Single-processor operations Drawn from many different DoD applications Represent both “front-end” signal processing and “back-end” knowledge processing HPCS program Synthetic SAR benchmark Multi-processor compact application Representative of a real application workload Designed to be easily scalable and verifiable The Polymorphous Computing Architectures (PCA), and High Productivity Computing Systems (HPCS) programs have combined work to form a new set of benchmarks called the “High Performance Embedded Computing Challenge Benchmarks.” From the PCA program, eight single-processor kernel-level benchmarks were drawn from a survey of DoD applications. These kernels are representative of operations found in “front-end” signal processing and “back-end” knowledge processing. From the HPCS program, a Synthetic SAR multi-processor application was developed. The application was designed to be easily scalable in terms of its parallelism as well as computation sizes, easily verifiable, and representative of workloads found in real-world applications.

Outline Introduction Kernel Level Benchmarks SAR Benchmark Overview System Architecture Computational Components Release Information Summary

Spotlight SAR System Principal performance goal: Throughput Maximize rate of results Overlapped IO and computing Principal performance goal is throughput (rate at which answers are produced by a supercomputer). Overlapping IO and computing is allowed. Intent of the Compact Application: Scalable – operates on a range of systems, from workstation to petascale computer High compute fidelity – representative computations of SAR processing Low physical fidelity – not a full spotlight SAR system (reduces unneeded benchmark complexity) Self-verifying Benchmark is serial (sequential processing), and must be parallelizable. Intent of Compact App: Scalable High Compute Fidelity Self-Verifying

SAR System Architecture Front-End Sensor Processing Scalable Data and Template Generator SAR Image Kernel #1 Data Read and Image Formation Template Insertion Kernel #2 Image Storage SAR Image Templates Raw SAR Files Raw SAR File SAR Image Files Groups of Template Files Template Files File IO Computation Raw SAR Data Files Image Files Groups of Template Files Sub-Image Detection Files SAR Image Files The full system architecture of the SAR system benchmark involves stressful computational and I/O requirements. However, the benchmark can be run in various modes turning on or off I/O or computation. For the purposes of this talk, we’ll be concentrating on the computational components only. Template Files Sub-Image Detection Files Template Files Kernel #3 Image Retrieval SAR Images Kernel #4 Detection Validation HPEC community has traditionally focused on Computation … Detections … but File IO performance is increasingly important Templates Back-End Knowledge Formation

Data Generation and Computational Stages SAR Image Knowledge Formation Files Raw File Template Groups of Kernel #2 Storage Detection Kernel #3 Retrieval Sub-Image Sensor Processing Raw SAR Data Files Raw SAR Templates Image Template Insertion Scalable Data and Template Generator Kernel #1 Formation There are 4 major computational components of the SAR system benchmark. The components are introduced here and discussed in more detail in the subsequent slides. The Scalable Data Generator produces raw SAR data as well as template images that will later be inserted as targets. The raw data is taken into the Image Formation kernel, converted from the temporal to the spatial domain, and interpolated from a polar to a rectangular swath, to form the desired image. The SAR images and target templates are passed to the Template Insertion block where the template targets are pseudorandomly inserted into the SAR image. Next the SAR image with templates is passed to the Detection kernel where, with simple image differencing, thresholding and small correlations, target detections are found and reported. These detections are passed on to a Validation block which requires 100% object recognition with no false positives. <see animation with three left mouse clicks> Validation Detections Kernel #4 Detection SAR Image Templates

SAR Overview Radar captures echo returns from a ‘swath’ on the ground Notional linear FM chirp pulse train, plus two ideally non-overlapping echoes returned from different positions on the swath Summation and scaling of echo returns realizes a challengingly long antenna aperture along the flight path Synthetic Aperture, L Fixed to Broadside . . . The Scalable Synthetic Data Generator produces raw SAR data approximating what would be obtained from a real SAR system. As an airplane flies adjacent to the field of interest or ‘swath,’ pulse trains are transmitted. The echo returns are scaled (to mimic different reflection coefficients at various points on the swath) and time delayed (to mimic different times at which echoes are returned from different points on the swath) and summed. The size of the SAR synthetic aperture is then determined by the distance that the sensor flies while the radar is capturing returns from the ground; this realizes a challengingly long antenna aperture length. To alleviate its coding complexity, the benchmark makes some simplifications of the SAR problem (which are shown in red): Broadside only processing (instead of being able to process different angular looks) Its synthetic aperture is set equal to its swath’s cross-range Range, X = 2X0 delayed transmitted SAR waveform reflection coefficient scale factor, different for each return from the swath Cross-Range, Y = 2Y0 received ‘raw’ SAR

Scalable Synthetic Data Generator Generates synthetic raw SAR complex data Data size is scalable to enable rigorous testing of high performance computing systems User defined scale factor determines the size of images generated Generates ‘templates’ that consist of rotated and pixelated capitalized letters Cross-Range Range Spotlight SAR Returns The raw complex data generated by the Data Generator is scalable. A user defined scale factor determines the size of the ‘swath’ that an echo return is gathered from, determining the size of the images generated. This allows the user to scale data sizes to better stress high performance systems. Along with the raw complex SAR data, target templates are generated that consist of rotated pixelated capitalized letters. The templates will be passed on so that later they can be inserted as random targets into the raw SAR images; these templates are also passed on to the Detection kernel.

Kernel 1 — SAR Image Formation Spatial Frequency Domain Interpolation s(w,ku) f(x,y) F(kx,ky) Interpolation kx = sqrt(4k2 –ku2) ky = ku Matched Filtering Fourier Transform (t,u)B(w,ku) Inverse Fourier Transform (kx,ky) B (x,y) s*0(w,ku) s(t,u) Received Samples Fit a Polar Swath Processed Samples Fit a Rectangular Swath f o kx ky Spotlight SAR Reconstruction Cross-Range, Pixels The SAR image formation step, kernel 1, consists of: fast Fourier transform (FFT) into the frequency domain - to ease processing further down the processing chain pulse compression (also called matched filtering) – to remove the transmitted waveform’s spectral components spatial frequency domain interpolation – to change the swath’s representation from a polar to rectangular coordinate system and two-dimensional inverse fast Fourier transform (IFFT) – to produce an observable spatial-domain image The synthesized SAR image displays a grid pattern of lobes corresponding to what would have been the placement of reflectors on the swath. Range, Pixels

Template Insertion (untimed) Inserts rotated pixelated capital letter templates into each SAR image Non-overlapping locations and rotations Randomly selects 50% Used as ideal detection targets in Kernel 4 If inserted with %100 Templates Image only inserted with %50 random Templates The template insertion step inserts pixelated rotated capital letter templates into the SAR image: these letters are the “targets” that will later be detected. Templates are placed at and non-overlapping locations and rotations (to later forgo image alignment problems), and in the valleys of the SAR lobes. Template magnitude is set to the average power of the original raw SAR data (well below the magnitude of the SAR peaks). [To make them visible, template magnitude is shown here is exaggerated.] Y Pixels Y Pixels X Pixels X Pixels

Thresholded Difference Kernel 4 — Detection Detects targets in SAR images Image difference Threshold Sub-regions Correlate with every template  max is target ID Computationally difficult Many small correlations over random pieces of a large image 100% recognition no false alarms Image A Image Difference Thresholded Difference Sub-region The detection kernel compares two images to detect changes that represent moving targets. Two images are differenced, effectively removing the SAR components. 1. If SAR reflector peaks are not formed (e.g.: under focused or blurred), templates may be irrecoverably buried in the blurred SAR image. 2. If SAR reflector peaks are not well formed (floor-to-peak power ratio, and placement) part/all of a template could be irrecoverably buried under a SAR lobe (so it becomes unidentifiable). Image is threshold, to distinguish newly visible targets from targets that moved out of the picture and thus ceased to be of interest. Image is broken-up into sub-regions; each region is correlated against each template; max correlation decides the target’s ID. Requirement: Throughput cannot be meaningfully measured until 100% recognition and no false alarms is achieved. Image B

Benchmark Summary and Computational Challenges Front-End Sensor Processing Back-End Knowledge Formation Raw SAR Templates Image Template Insertion Scalable Data and Template Generator Kernel #1 Formation Validation Detections Kernel #4 Detection SAR Image Templates SAR Image Formation, has large scale parallel two-dimensional (2D) Inverse Fast Fourier Transform (IFFT); may require a ‘corner turn’ or a ‘gather scatter’ (depending on architecture), with large quantities of data. Polar interpolation is known to be even more computationally intense than IFFT. Streaming image data storage to an I/O device (write) may involve large block data transfers, storing one large image after another (Kernel 2) Random location image sequence retrieval from an I/O device (read) also involving large quantities of data, with stressful memory access patterns, and locality issues (Kernel 3) Kernel 4, detection, involves performing many small correlations on random pieces of large images. Scalable synthetic data generation Pulse compression Polar Interpolation FFT, IFFT (corner turn) Sequential store Non-sequential retrieve Large & small IO Large Images difference & Threshold Many small correlations on random pieces of large image

Outline Introduction Kernel Level Benchmarks SAR Benchmark Release Information Summary

HPEC Challenge Benchmark Release http://www.ll.mit.edu/HPECChallenge/ Future site of documentation and software Initial release is available to PCA, HPCS, and HPEC SI program members through respective program web pages Documentation ANSI C Kernel Benchmarks Single processor MATLAB SAR System Benchmark Complete release will be made available to the public in first quarter of CY06 The HPEC challenge benchmarks will be made available to the public and released on the web site listed in the first quarter of calendar year 2006. In the meantime, a preliminary release of the benchmarks will be made through respective program web pages for PCA, HPCS, and HPEC-SI members. This release will include documentation, source code for the ANSI C Kernel Benchmarks, and a single processor Matlab SAR system benchmark.

Summary The HPEC Challenge is a publicly available suite of benchmarks for the embedded space Representative of a wide variety of DoD applications Benchmarks stress computation, communication and I/O Benchmarks are provided at multiple levels Kernel: small enough to easily understand and optimize Compact application: representative of real workloads Single-processor and multi-processor For more information, see http://www.ll.mit.edu/HPECChallenge/