Vector Multiplication & Color Convolution. Team Members: Vinay Chinta, Sreenivas Patil. EECC-731 VLSI Design Projects, Dr. Ken Hsu.

Goal: A VLSI chip capable of 3*3 matrix multiplication or 3*3 digital convolution, designed to operate at a frequency suitable for real-time video and image processing applications.

Applications: Typically used in applications such as digital copiers, where incoming color data needs to undergo unsharp masking for quality color output. The data is converted into chrominance and luminance channels. The RGB signals captured by the camera are linearly matrixed and processed for color sensitivity correction using 3*3 matrices. After this, a convolution operation is applied to the luminance channel to enhance sharpness.
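As a rough illustration of the sharpening step: unsharp masking adds back a scaled difference between the luminance and a blurred copy of it. With a 3*3 box blur, the whole operation folds into a single 3*3 kernel, which is the kind of kernel a 3*3 convolver can apply. The box blur, the `amount` parameter, and the function names below are assumptions for illustration; the slides do not say which kernel was used.

```python
from fractions import Fraction

def unsharp_kernel(amount):
    """3*3 kernel equivalent to: out = y + amount * (y - box_blur(y))."""
    a = Fraction(amount)
    # The blurred copy contributes -amount/9 at every position of the window.
    kernel = [[-a / 9 for _ in range(3)] for _ in range(3)]
    # The original pixel contributes 1 + amount at the center.
    kernel[1][1] += 1 + a
    return kernel

def apply_kernel(window, kernel):
    """Weighted sum of a 3*3 luminance window with a 3*3 kernel."""
    return sum(window[r][c] * kernel[r][c] for r in range(3) for c in range(3))

# Hypothetical luminance window: one brighter pixel on a darker background.
window = [[10, 10, 10],
          [10, 19, 10],
          [10, 10, 10]]

k = unsharp_kernel(1)
sharpened = apply_kernel(window, k)   # the center pixel is pushed further from its surroundings
```

The kernel coefficients sum to 1, so flat regions pass through unchanged and only local contrast is amplified.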

Multi-functionality: Noise reduction, feature extraction, image enhancement, restoration, and various other operations can be performed by a linear filter (a 3*3 convolver). By providing different kernels or matrices, the chip can be used for all of these image processing operations. The VMCC is a single solution for all these functions.
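A minimal sketch of the "different kernels, same hardware" idea: one weighted-sum routine is reused, and only the 3*3 coefficient set changes. The kernels shown (box sum, Laplacian, Sobel) are common textbook choices picked for illustration, not coefficients taken from the slides, and the kernel is applied without flipping.

```python
def weighted_sum(window, kernel):
    """Sum of the 9 products of a 3*3 pixel window and a 3*3 kernel (kernel not flipped)."""
    return sum(window[r][c] * kernel[r][c] for r in range(3) for c in range(3))

# Illustrative coefficient sets; swapping them changes the operation performed.
KERNELS = {
    "box_sum":   [[1, 1, 1], [1, 1, 1], [1, 1, 1]],      # smoothing (divide by 9 for the mean)
    "laplacian": [[0, 1, 0], [1, -4, 1], [0, 1, 0]],     # enhancement / edge response
    "sobel_x":   [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],   # horizontal-gradient feature
}

# Window containing a vertical edge: dark on the left, bright in the right column.
edge = [[0, 0, 9],
        [0, 0, 9],
        [0, 0, 9]]
# Window with no detail at all.
flat = [[4, 4, 4]] * 3

edge_responses = {name: weighted_sum(edge, k) for name, k in KERNELS.items()}
flat_responses = {name: weighted_sum(flat, k) for name, k in KERNELS.items()}
```

The Laplacian and Sobel kernels give zero on the flat window and a non-zero response on the edge window, while the box sum simply accumulates the pixel values.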

Modes of Operation, Mode 0 (Color Matrixing): Each output is a weighted sum of the three input words R, G, and B, which are the attributes of a single pixel. An output is valid on every clock cycle if data is presented to the chip continuously.

Color Matrixing
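The Mode 0 computation can be sketched as a 3*3 matrix applied to each (R, G, B) triple. The coefficients below are the widely used BT.601 luma/chroma weights, chosen only as a familiar example; the slides do not list the chip's coefficients, and the chip works on 10-bit words rather than the floating-point values used here for readability.

```python
def color_matrix(coeffs, rgb):
    """Return three outputs, each a weighted sum of the R, G, B inputs of one pixel."""
    return [sum(c * x for c, x in zip(row, rgb)) for row in coeffs]

# Assumed example coefficients (BT.601 RGB -> Y, Cb, Cr without offsets).
RGB_TO_YCBCR = [
    [ 0.299,     0.587,     0.114   ],   # luminance
    [-0.168736, -0.331264,  0.5     ],   # blue-difference chrominance
    [ 0.5,      -0.418688, -0.081312],   # red-difference chrominance
]

# One pixel per "clock cycle": white, pure red, black.
pixels = [(255, 255, 255), (255, 0, 0), (0, 0, 0)]
outputs = [color_matrix(RGB_TO_YCBCR, p) for p in pixels]
```

For a white or black pixel the two chrominance outputs are zero (up to rounding), since each chrominance row sums to zero.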

Modes of Operation, Mode 1 (3*3 2-D Convolution): The chip receives image data three pixels at a time in three consecutive pixel times, multiplies the 9 pixel values by the corresponding 9 coefficients stored in the static registers, and computes the sum.

2-D Convolution
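A behavioral sketch of Mode 1, assuming the three pixels arriving together are one column taken from three adjacent image lines, and that the last three columns are held in a 3-stage shift register. The function name, the data layout, and the indexing of `coeffs` are assumptions for illustration; this models the arithmetic only, not the pipeline timing.

```python
from collections import deque

def stream_convolve(rows, coeffs):
    """Model of a 3*3 convolver fed one 3-pixel column per pixel time.

    rows:   three equal-length lists (three adjacent image lines).
    coeffs: 3*3 list; coeffs[r][c] multiplies the pixel in row r, window column c.
    Yields one sum of 9 products per pixel time once three columns have arrived.
    """
    window = deque(maxlen=3)           # stands in for the 3-stage shift register
    for column in zip(*rows):          # three pixels arrive per pixel time
        window.append(column)          # the oldest column drops out automatically
        if len(window) == 3:
            yield sum(coeffs[r][c] * window[c][r]
                      for r in range(3) for c in range(3))

rows = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12]]

all_ones = [[1, 1, 1]] * 3
center_only = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]

sums = list(stream_convolve(rows, all_ones))
centers = list(stream_convolve(rows, center_only))
```

With a center-only kernel the output reproduces the middle line delayed by one column, which is a quick sanity check on the window indexing.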

Functional Block Diagram

Functional Blocks Used in the Project: 10-bit registers; 10 x 10-bit multipliers; 20-bit adders; 10-bit multiplexers; clock divider (÷3); shift register (3 stages, 10 bits).
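A quick arithmetic check on these widths, assuming unsigned operands: the product of two 10-bit words always fits in 20 bits, which matches the 20-bit adder inputs. The same check shows that an unconstrained sum of three or nine such products would need more than 20 bits; the slides do not say how the design handles this (for example by bounding the coefficients or truncating), so the figures below describe the worst case only.

```python
def bits_needed(value):
    """Number of bits required to hold a non-negative integer."""
    return max(1, value.bit_length())

WORD = 10
max_word = (1 << WORD) - 1             # largest unsigned 10-bit value
max_product = max_word * max_word      # largest 10 x 10-bit unsigned product

product_bits = bits_needed(max_product)       # width of one multiplier output
sum3_bits = bits_needed(3 * max_product)      # worst case for a 3-term sum (color matrixing)
sum9_bits = bits_needed(9 * max_product)      # worst case for a 9-term sum (convolution)
```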

Results for the Multiplier: The multiplier is the largest combinational block in the design. Combinational area: (value missing). Non-combinational area: 0. Total cell area: (value missing). Total dynamic power: (value missing) mW. Number of cells: 419. Number of gates in the critical path: 51.

9-Multiplier Design: Latency = 6 clock cycles.

9-Multiplier Design, Precompiled Structure. Hardware used: 10-bit registers, 10 x 10-bit multipliers, 20-bit adders, and 10-bit multiplexers (7).

9-Multiplier Design: Post-compiled structure (with low effort).

9-Multiplier Design: Post-compiled structure (with high effort).

Modified Circuit - Block Diagram

3-Multiplier Design: Latency = 4 clock cycles.

3-Multiplier Design, Precompiled Structure. Hardware used: 10-bit registers, 10 x 10-bit multipliers, 20-bit adders, 10-bit shift registers (3), and ÷3 clock dividers (1).

3-Multiplier Design: Post-compiled structure (with high effort).

Results: Real-time operation, with a latency of six clock cycles in the original design and four clock cycles in the modified design. The operating speed is suitable for real-time NTSC video processing. The MODE input selects either 3*3 matrix multiplication or 2-D convolution. Note: This speed has to be divided by a factor of 3.
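To see why the latency difference matters little for streaming video, here is a simple pipeline model. It assumes a fully pipelined design that accepts one input per clock cycle, as described for Mode 0; the function name and the 720-sample line length are hypothetical, and the divide-by-3 note above is not modeled.

```python
def last_output_cycle(n_inputs, latency):
    """Cycle on which the final result appears, if input k is presented on cycle k
    (counting from 0) and each result emerges `latency` cycles after its input."""
    return (n_inputs - 1) + latency

LINE_LENGTH = 720   # assumed number of samples in one video line

original = last_output_cycle(LINE_LENGTH, latency=6)   # 9-multiplier design
modified = last_output_cycle(LINE_LENGTH, latency=4)   # 3-multiplier design
difference = original - modified
```

Under these assumptions the latency only shifts when results start to appear; the number of results per cycle in steady state is the same for both designs.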

Features: Two operations on a single chip: matrix operations for color processing, and convolution for filtering and enhancement. Real-time operation for NTSC signals.

Further Work: Design of BIST for the chip. Control interface / communicator. Implementation of high-speed multiplication and addition algorithms. Optimization for higher speed.

Questions ???