“Low-Power, Real-Time Object-Recognition Processors for Mobile Vision Systems”, IEEE Micro 2012. Jinwook Oh ; Gyeonghoon Kim ; Injoon Hong ; Junyoung.

Presentation transcript:

“Low-Power, Real-Time Object-Recognition Processors for Mobile Vision Systems”, IEEE Micro
Jinwook Oh; Gyeonghoon Kim; Injoon Hong; Junyoung Park; Seungjin Lee; Joo-Young Kim; Jeong-Ho Woo; Hoi-Jun Yoo
Presenter: Juseong Lee

Outline
– Introduction
– Background
– Main Idea
– Implementation
– Conclusion
– Evaluation
Object Recognition by Juseong Lee

Introduction
(Image source: MBN News)

Introduction
Object recognition system
– Requires real-time operation
   High performance
   Low power in a mobile system
How can it be implemented?
– Find a suitable algorithm
   SIFT algorithm
– Hardware optimization
   Algorithm optimization
   Build a dedicated processor
– Parallel computation
   Multi-threading
   NoC
SIFT: Scale Invariant Feature Transform
NoC: Network on Chip
(Image source: Volvo)

Background Knowledge
What is the SIFT algorithm?
– Scale Invariant Feature Transform
– The most popular candidate for extracting interest points from an object and describing them
– Robust against changes in translation, scaling, and rotation
(Figure: image matching by SIFT)
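The slide only names "image matching by SIFT" and does not describe the matching step. As a rough illustration, the sketch below matches descriptors by nearest neighbour and filters them with Lowe's ratio test, a common choice that is assumed here rather than taken from the slides. The function names, the 0.8 ratio, and the toy 4-dimensional descriptors are all hypothetical; real SIFT descriptors have 128 dimensions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_descriptors(query, database, ratio=0.8):
    """Return (query_index, database_index) pairs that pass the ratio test."""
    matches = []
    for qi, q in enumerate(query):
        # Rank every database descriptor by distance to this query descriptor.
        ranked = sorted((euclidean(q, d), di) for di, d in enumerate(database))
        if len(ranked) < 2:
            continue
        (best, best_idx), (second, _) = ranked[0], ranked[1]
        # Accept only if the best match is clearly better than the runner-up.
        if best < ratio * second:
            matches.append((qi, best_idx))
    return matches

# Toy descriptors (assumption: 4 dimensions instead of SIFT's 128).
database = [[0, 0, 0, 0], [10, 10, 10, 10], [0, 10, 0, 10]]
query = [[1, 0, 0, 0], [5, 5, 5, 5]]
print(match_descriptors(query, database))  # [(0, 0)]
```

The second query descriptor is equally far from every database entry, so it is rejected as ambiguous. Every query is compared against every database entry, which gives a feel for why matching cost grows quickly with the number of keypoints.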

Background Knowledge
What is the problem with SIFT-based object recognition?
– It consumes a lot of power
   Owing to the heavy computation required in descriptor generation and matching
– Today's high-resolution image sensors and tight power budgets
   Make real-time SIFT implementation on a mobile device even harder
   Scarce-resources problem

Main Idea
How can we solve the problem?
– Build an object-recognition processor
   Using an attention-based recognition algorithm
   – For energy efficiency
   A heterogeneous multicore architecture
   – For data and thread parallelism
   Network-on-Chip (NoC) communication
   – For high bandwidth
The processor determines the Regions of Interest (ROI) of the image
– To minimize unnecessary computation
Heterogeneous multicore architecture
– Provides several types of parallelism
– Achieves high throughput
– Low power consumption
The high-bandwidth NoC serves as the communications backbone

Why find the ROI?
An image-processing algorithm run over the whole image pays no regard to throughput
– Image size: 480 x 360
– 172,800 computations!
– Example: edge detection
Objects have features!
– You can select part of the image to reduce computation!
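The 172,800 figure is simply one computation per pixel of a 480 x 360 image. The sketch below reproduces that arithmetic and compares it with processing only a region of interest. The 40-pixel tile size and the 3 x 3 block of selected tiles are hypothetical values chosen for illustration, not numbers from the slides.

```python
WIDTH, HEIGHT = 480, 360   # image size from the slide
TILE = 40                  # hypothetical tile edge, in pixels

def full_image_ops(width, height):
    """One computation per pixel over the whole frame."""
    return width * height

def roi_ops(roi_tiles, tile=TILE):
    """One computation per pixel, restricted to the selected tiles."""
    return len(roi_tiles) * tile * tile

# Hypothetical ROI: a 3 x 3 block of tiles around a single object.
roi_tiles = [(row, col) for row in range(2, 5) for col in range(4, 7)]

full = full_image_ops(WIDTH, HEIGHT)
roi = roi_ops(roi_tiles)
print(full)        # 172800
print(roi)         # 14400
print(full / roi)  # 12.0
```

Under these assumed numbers the ROI covers one twelfth of the frame, so a per-pixel stage such as edge detection does one twelfth of the work.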

Main Idea – BONE-V
(Figure: using the conventional method vs. using the main idea)

Main Idea – Algorithm
Attention-based object recognition

Main Idea – Architecture
– Pixel-level parallelism
– Very long instruction word (VLIW)
– 3-stage task-level pipeline
– 1.5x↓ power consumption
– 5-stage fine-grained pipeline
– 3.45x↑ pipeline throughput

SMT-enabled heterogeneous multicore processor
Throughput-optimized SFEC
– Finds ROI tiles for energy efficiency
– Memory locality with high bandwidth utilization
Latency-optimized FMP
– ROI tiles and the NoC help reduce latency
Power-optimized MLE
– Changes the cores' thread allocation
– and operating voltage and frequency dynamically
BONE-V5:
SFEC: SMT-enabled Feature Extraction Cluster
FMP: Feature Matching Processor
MLE: Machine Learning Engine
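The slide says the MLE adjusts operating voltage and frequency dynamically but gives no policy. As a minimal sketch of the general idea, the code below picks the lowest-power operating point that still meets a frame deadline, using the standard approximation that dynamic power scales with C * V^2 * f. The operating points, cycle counts, and 33 ms deadline are hypothetical, and thread allocation is not modelled. This is not the BONE-V5 policy.

```python
# Hypothetical (voltage in V, frequency in MHz) operating points.
OPERATING_POINTS = [(0.8, 50), (1.0, 100), (1.2, 200)]

def dynamic_power(voltage, freq_mhz, capacitance=1.0):
    """Relative dynamic power, using the C * V^2 * f approximation."""
    return capacitance * voltage ** 2 * freq_mhz

def pick_operating_point(cycles_needed, deadline_ms):
    """Return the lowest-power point that finishes the work before the deadline."""
    # A clock of f MHz delivers f * 1000 cycles per millisecond.
    feasible = [
        (v, f) for v, f in OPERATING_POINTS
        if f * 1000 * deadline_ms >= cycles_needed
    ]
    if not feasible:
        # Nothing meets the deadline: fall back to the fastest point.
        return max(OPERATING_POINTS, key=lambda p: p[1])
    return min(feasible, key=lambda p: dynamic_power(*p))

# Assumed workloads for one 33 ms frame.
print(pick_operating_point(1_000_000, 33))  # (0.8, 50)
print(pick_operating_point(5_000_000, 33))  # (1.2, 200)
```

A light frame runs at the lowest voltage and frequency, while a heavy frame forces the fastest point. Because power grows with the square of the voltage, dropping to the slow point whenever the workload allows is where the saving comes from.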

Implementation

Implementation – Comparison

Implementation – Comparison (continued)

Conclusion
– An energy-efficient system is important for improving performance
– Algorithm and architecture have to be optimized at the same time
– BONE-V multicore processors can be applied to real-time object recognition systems
– Future BONE-V processors will further lower the power consumption

Evaluation
– Table 3 should include results comparing against other recognition processors
– Hardware optimization needs to cover not only the overall algorithm but also particular algorithm blocks
   – CORDIC-based gradient and magnitude computation
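The slide suggests CORDIC for the gradient and magnitude block without giving details. The sketch below shows what vectoring-mode CORDIC computes: the magnitude and orientation of a gradient vector (gx, gy) using only add and shift style steps. It uses floating point for readability, whereas a hardware block would use fixed-point shifts. The iteration count and the function name are assumptions.

```python
import math

ITERATIONS = 16
# Elementary rotation angles atan(2^-i) and the constant gain of the iteration.
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]
GAIN = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(ITERATIONS))

def cordic_magnitude_angle(gx, gy):
    """Return (magnitude, angle in radians) of the gradient vector (gx, gy)."""
    x, y, angle = float(gx), float(gy), 0.0
    # Move the vector into the right half-plane so the iteration converges.
    if x < 0:
        x, y = -x, -y
        angle = math.pi if gy >= 0 else -math.pi
    for i in range(ITERATIONS):
        # Rotate towards the x-axis; in hardware the 2^-i factors are shifts.
        d = -1.0 if y > 0 else 1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        # Undo the applied rotation in the accumulated angle.
        angle -= d * ANGLES[i]
    # x has grown by the constant CORDIC gain; divide it out.
    return x / GAIN, angle

mag, ang = cordic_magnitude_angle(3, 4)
print(round(mag, 3))  # 5.0
print(abs(ang - math.atan2(4, 3)) < 1e-3)  # True
```

Each iteration needs only additions, shifts, and one table lookup, which is why CORDIC is a common candidate for replacing multipliers, square roots, and arctangents in a dedicated block.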

Thanks for listening!