Sonar Challenge Problem Updated May 23, 2001. System Overview Use existing MFP demo design –Insert adaptive front end (MSEDR) between incoming data stream.


Sonar Challenge Problem Updated May 23, 2001

System Overview
Use existing MFP demo design
– Insert adaptive front end (MSEDR) between the incoming data stream and the existing matched field design; it will adapt the weights used in the existing k-ω beamformer
– Use the PE1 and PE2 subvoxel beamformers as-is as much as possible

Challenge Overview
Form covariance matrix at 3 kHz rate
– Requires 96 GB/sec memory bandwidth!
Inversion of covariance matrix happens every 3-5 seconds
– Don’t do inversion
  Determine K smallest eigenvalues
– K << N, K ~= 10?
– Use Lanczos’ method to find eigenvalues
  A subspace method which hopefully gives a good approximation to R
– Do this part on Pentium
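To make the "K smallest eigenvalues via Lanczos" step concrete, here is a minimal pure-Python sketch. It is an illustration only, not the code planned for the Pentium: it assumes a real symmetric operator (the actual R would be complex Hermitian), uses full reorthogonalization for simplicity, and extracts the Ritz values by Sturm-sequence bisection. All function names (`lanczos`, `count_below`, `smallest_ritz_values`, `laplacian_matvec`) and the test matrix are hypothetical stand-ins.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def lanczos(matvec, n, steps, seed=0):
    """Run up to `steps` Lanczos iterations on a real symmetric operator.

    Returns (alpha, beta): the diagonal and off-diagonal of the small
    tridiagonal matrix T whose eigenvalues (Ritz values) approximate
    eigenvalues of the operator.
    """
    rng = random.Random(seed)
    q = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = math.sqrt(dot(q, q))
    basis = [[v / scale for v in q]]
    alpha, beta = [], []
    for j in range(steps):
        w = matvec(basis[j])
        alpha.append(dot(w, basis[j]))
        # Full reorthogonalization: simple and robust, but costs O(n * j).
        for v in basis:
            c = dot(w, v)
            w = [wi - c * vi for wi, vi in zip(w, v)]
        b = math.sqrt(dot(w, w))
        if j == steps - 1 or b < 1e-12:
            break
        beta.append(b)
        basis.append([wi / b for wi in w])
    return alpha, beta


def count_below(alpha, beta, x):
    """Sturm count: number of eigenvalues of T that are less than x."""
    count, d = 0, 1.0
    for i, a in enumerate(alpha):
        off2 = beta[i - 1] ** 2 if i > 0 else 0.0
        d = a - x - off2 / d
        if d == 0.0:
            d = -1e-300  # nudge off an exact zero pivot
        if d < 0.0:
            count += 1
    return count


def smallest_ritz_values(alpha, beta, k, tol=1e-12):
    """Return the k smallest eigenvalues of T, each found by bisection."""
    m = len(alpha)
    radius = [(abs(beta[i - 1]) if i > 0 else 0.0)
              + (abs(beta[i]) if i < m - 1 else 0.0) for i in range(m)]
    low = min(a - r for a, r in zip(alpha, radius))    # Gershgorin bounds
    high = max(a + r for a, r in zip(alpha, radius))
    values = []
    for index in range(min(k, m)):
        lo, hi = low, high
        for _ in range(200):
            if hi - lo <= tol:
                break
            mid = 0.5 * (lo + hi)
            if count_below(alpha, beta, mid) > index:
                hi = mid
            else:
                lo = mid
        values.append(0.5 * (lo + hi))
    return values


def laplacian_matvec(x):
    """Hypothetical stand-in for R: the tridiagonal (-1, 2, -1) matrix."""
    n = len(x)
    return [2.0 * x[i]
            - (x[i - 1] if i > 0 else 0.0)
            - (x[i + 1] if i < n - 1 else 0.0) for i in range(n)]


n = 12
alpha, beta = lanczos(laplacian_matvec, n, steps=n)
ritz = smallest_ritz_values(alpha, beta, k=3)
exact = [2.0 - 2.0 * math.cos(j * math.pi / (n + 1)) for j in (1, 2, 3)]
print(ritz)
print(exact)
```

Only matrix-vector products with R are needed, which is why no inversion is required. With fewer steps than the matrix dimension the Ritz values are approximations, and how many steps are needed for K ~ 10 on real data is exactly the open question the slides assign to the timing study.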

Forming Covariance Matrix (R)
Forming the covariance matrix (R)
– 16MB memory locations × 3 kHz rate
  Requires 96 GB/sec memory bandwidth
  SV2 has 6 GB/sec memory bandwidth
– Solutions
  Downsample input data
  Multiple boards
– 8 SV2 boards can form R
  10 PEs per SV2
  read/MAC/write is the kernel (2 cycles)
  pack 2 data elements per memory location
  one SV2 can do 10 × 1 kernel per cycle = 1500M kernels/sec
  requirement is 4M × 3 kHz rate = 12000M kernels/sec
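The sizing arithmetic on this slide can be checked with a few lines of Python. The constants are the slide's own figures; the `downsample` parameter and the function name `boards_needed` are assumptions added for illustration.

```python
# All figures below are taken from the slide; nothing here is measured.
PACKED_ENTRIES = 4_000_000            # "4M" kernel invocations per snapshot
SNAPSHOT_RATE_HZ = 3_000              # 3 kHz update rate
KERNELS_PER_BOARD = 1_500_000_000     # 1500M kernels/sec per SV2
REQUIRED_BW_GB_PER_SEC = 96
BOARD_BW_GB_PER_SEC = 6


def boards_needed(required, per_board):
    """Smallest whole number of boards that covers the requirement."""
    return -(-required // per_board)  # ceiling division on integers


def required_kernel_rate(downsample=1):
    """Kernels/sec needed; `downsample` is a hypothetical input-rate divisor."""
    return PACKED_ENTRIES * SNAPSHOT_RATE_HZ // downsample


print(required_kernel_rate())                                          # 12000000000
print(boards_needed(required_kernel_rate(), KERNELS_PER_BOARD))        # 8
print(boards_needed(required_kernel_rate(2), KERNELS_PER_BOARD))       # 4
print(boards_needed(REQUIRED_BW_GB_PER_SEC, BOARD_BW_GB_PER_SEC))      # 16
```

The kernel-rate count reproduces the 8-board figure. The raw bandwidth ratio (96 / 6) would suggest 16 boards, so the 8-board estimate presumably relies on packing two data elements per memory location; that reading is an inference, not something the slide states. With 2X downsampling the same arithmetic gives 4 boards.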

Forming Covariance Matrix (continued)
Bernecky has recently (since the Midway meeting) suggested a way of doing the R matrix computation in 50x50 chunks, avoiding most off-chip memory accesses.
This is an interesting idea that may or may not work better than the previous slide’s approach. It will be investigated.
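The slides do not describe the chunked scheme in detail, so the following is only a generic tiling sketch of the idea, not Bernecky's actual method. It assumes a batch of snapshots is buffered, accumulates one tile of R at a time in a small local array (standing in for on-chip memory), and writes each tile back once. The function names and the tiny tile size are hypothetical.

```python
def accumulate_direct(snapshots):
    """Reference: R += x * x^H for every snapshot, touching all of R each time."""
    n = len(snapshots[0])
    R = [[0j] * n for _ in range(n)]
    for x in snapshots:
        for i in range(n):
            for j in range(n):
                R[i][j] += x[i] * x[j].conjugate()
    return R


def accumulate_tiled(snapshots, tile):
    """Same result, but each tile of R is built locally and written once."""
    n = len(snapshots[0])
    R = [[0j] * n for _ in range(n)]
    for i0 in range(0, n, tile):
        rows = range(i0, min(i0 + tile, n))
        for j0 in range(0, n, tile):
            cols = range(j0, min(j0 + tile, n))
            # Local accumulator: a stand-in for on-chip storage.
            acc = [[0j] * len(cols) for _ in rows]
            for x in snapshots:
                for a, i in enumerate(rows):
                    for b, j in enumerate(cols):
                        acc[a][b] += x[i] * x[j].conjugate()
            # Single write-back per tile instead of one per snapshot.
            for a, i in enumerate(rows):
                for b, j in enumerate(cols):
                    R[i][j] = acc[a][b]
    return R


# Small integer-valued complex snapshots so the comparison is exact.
snapshots = [
    [complex(1, 2), complex(0, -1), complex(3, 0), complex(-2, 1), complex(1, 1)],
    [complex(2, 0), complex(1, 1), complex(-1, 2), complex(0, 3), complex(2, -2)],
    [complex(0, 1), complex(-3, 0), complex(1, -1), complex(2, 2), complex(-1, 0)],
]
direct = accumulate_direct(snapshots)
tiled = accumulate_tiled(snapshots, tile=2)
print(direct == tiled)
```

In this form the off-chip traffic per tile is one write of the tile plus reads of the relevant snapshot elements, rather than a read and write of every R entry on every snapshot. Whether that trade is favorable at a 3 kHz rate depends on how many snapshots can be buffered, which the slides leave open.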

Rest of Algorithm
Normalization may be required with this algorithm
– to be determined by NUWC
– prior work (2 years ago) showed normalization doubled the computation required; that may be the case here
Other things may need to be included depending on results of NUWC Matlab experiments

Statement of 1-Year Goals
Do a lab demonstration of AMFP system
– Use recorded data at NUWC
– In the absence of a full-up lab demo, have enough of the pieces completed (sizing, design, simulation, or execution) to make a case for or against the approach.

Demonstration Architecture
5 SV2 boards
– 4 for creating/updating R (PE0 replacement)
– 1 for subvoxel interpolation (PE1 & PE2 replacement)
External Pentium
Requires ACS API + Myrinet

Tasks
Determine whether subspace methods will work in this problem (NUWC)
Determine time requirements for Lanczos method on a Pentium (Athanas)
Determine effects of 2X downsampling on algorithm performance (NUWC)
End-to-end MATLAB simulation (NUWC)

More Tasks
MSEDR floating point requirements
– Formats, rounding modes, etc. (VTech/BYU)
Determine communications requirements/patterns (all)
Create single board MSEDR FPGA design (BYU)
Expand to 4 boards (VTech)
Port 4K subvoxel beamformer design to SV2 (NUWC)

Yet More Tasks
Integrate Lanczos code (VTech)
System integration/test (NUWC)

Schedule
Determine whether subspace methods will work in this problem (NUWC) (2 months)
Determine time requirements for Lanczos method on a Pentium (Athanas) (2 months)
Determine effects of 2X downsampling on algorithm performance (NUWC) (3 months)
End-to-end MATLAB simulation (NUWC) (3 months)

More Schedule
MSEDR floating point requirements
– Formats, rounding modes, etc. (VTech/BYU) (3 months)
Determine communications requirements/patterns (all) (3 months)
Virtex-II JHDL support (3 months)
Create single board MSEDR FPGA design (BYU) (4 months)
Expand to 4 boards (VTech) (10 months)
Port 4K subvoxel beamformer design to SV2 (NUWC) (10 months)

Yet More Schedule
Integrate Lanczos code (VTech) (11 months)
System integration/test (VTech/NUWC) (12 months)