Apr 14, 2003. CPE 631 Project: Performance Analysis and Power Estimation of ARM Processor. Team: Ajayshanker Krishnamurthy, Swathi Tanjore Gurumani, Zexin Pan.

Presentation transcript:

CPE 631 Project: Performance Analysis and Power Estimation of ARM Processor
Team: Ajayshanker Krishnamurthy, Swathi Tanjore Gurumani, Zexin Pan
Project Advisor: Dr. Alexander Milenkovic

Agenda
– Overview
– Tools Used
– Performance Analysis - Results
– Power Estimation - Results
– Conclusion

Overview (flow diagram): MiBench benchmarks are compiled into target binaries (exe). The binaries run on the SimpleScalar simulator, which reports performance metrics, and on the PowerAnalyzer power estimator, which reports performance metrics and power dissipated.

Tools Used: Benchmarks
– Benchmarks: a critical part of the design process, because designs are performance-based
– Embedded benchmarks: embedded is the fastest-growing market segment in the microprocessor industry
– MiBench (University of Michigan): a free, commercially representative embedded benchmark suite
– Set of 35 embedded applications in six categories: Automotive and Industrial Control, Network, Security, Consumer Devices, Office Automation, and Telecommunications
– Security algorithms: Rijndael, Blowfish, SHA, PGP
– The small data set represents a light-weight, useful embedded application
– The large data set provides a more stressful, real-world application

Tools Used: SimpleScalar
– SimpleScalar (originated at the University of Wisconsin)
– Provides an infrastructure for simulation and architectural modeling
– Can model a variety of platforms, from unpipelined processors to detailed microarchitectures
– Suited to the needs of researchers and instructors; meets the critical requirements of performance, flexibility, and detail
– Supports popular instruction sets: Alpha, PowerPC, x86, and ARM
– Baseline simulator models: Sim-safe, Sim-fast, Sim-cache, Sim-profile, Sim-bpred, Sim-fuzz, Sim-outorder

Tools Used: PowerAnalyzer
– PowerAnalyzer: the SimpleScalar-ARM Power Modeling Project
– Joint venture of the University of Michigan and the University of Colorado
– An estimator that allows power/performance trade-offs to be examined
– Tightly coupled with the SimpleScalar toolset for ARM
– Gives power dissipation for each component individually: switching, internal, and leakage
– Can be configured based on two models: analytical, or analytical and empirical

Measurement Methodology
– Configured for the current (SA-110) and next (PXA-250) generation
– Input: the same dataset (>3M) for all algorithms, to achieve a fair comparison and reliable results
– Output: raw data on performance and power consumption, obtained from the PowerAnalyzer report
– Data processing (digesting) and visualizing

Performance Analysis
Configured Sim-outorder to represent the current and next generation of embedded processors.
Intel SA-110 for the current generation:
– 32-bit general-purpose microprocessor
– On-chip data cache (16K), instruction cache (16K), and MMU
– Used in PDAs, smart phones, digital cameras, etc.
Intel PXA-250 for the next generation:
– High-performance Intel XScale core
– On-chip data cache (32K), instruction cache (32K), branch target buffer, and MMU
– Used in multimedia applications

Configuration
Parameter         Current      Next
I Fetch Q size    2            4
Branch Pred.      Not Taken    Bimod
I Issue Width     1            1
Cache dl1         16:32:32     32:32:32
Cache il1         16:32:32     32:32:32
TLB itlb          16:4096:4 (only one value listed)
TLB dtlb          32:4096:4 (only one value listed)
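The dl1 and il1 entries in the configuration appear to follow SimpleScalar's `<nsets>:<bsize>:<assoc>` ordering (number of sets, block size in bytes, associativity). That reading is an assumption, but it is consistent with the 16K and 32K cache sizes quoted for the SA-110 and PXA-250. A minimal Python sketch that checks the arithmetic; the function names are illustrative and not part of any tool:

```python
def parse_cache_config(triple):
    """Split an 'nsets:bsize:assoc' string into three integers.

    Assumes SimpleScalar-style ordering: sets, block size (bytes), associativity.
    """
    nsets, bsize, assoc = (int(field) for field in triple.split(":"))
    return nsets, bsize, assoc


def capacity_bytes(triple):
    """Total capacity = sets x block size x ways."""
    nsets, bsize, assoc = parse_cache_config(triple)
    return nsets * bsize * assoc


# Values taken from the Configuration slide.
for label, triple in [("current dl1/il1", "16:32:32"), ("next dl1/il1", "32:32:32")]:
    print(label, capacity_bytes(triple) // 1024, "KB")
```

Under this reading, the two generations differ only in the number of sets; block size and associativity stay the same.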

Results

Results

Results
– Current generation predictor: Not Taken
– Next generation predictor: Bimod
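To illustrate why the two predictors behave so differently, here is a small Python sketch comparing a static not-taken predictor with a bimodal one. It is a simplification under stated assumptions: a single 2-bit saturating counter stands in for the bimodal table (a real predictor indexes a table of counters by branch address), and the branch trace is hypothetical, not data from the project.

```python
def not_taken_mispredicts(outcomes):
    """Static predictor: always predicts not-taken, so every taken branch is a miss."""
    return sum(1 for taken in outcomes if taken)


def bimodal_mispredicts(outcomes, initial=1):
    """One 2-bit saturating counter (0..3); predicts taken when the counter is >= 2."""
    counter = initial
    misses = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken != taken:
            misses += 1
        # Move toward 3 on taken, toward 0 on not-taken, saturating at both ends.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return misses


# Hypothetical trace: a loop-closing branch taken 9 times, then not taken, run 3 times.
trace = ([True] * 9 + [False]) * 3

print("not-taken misses:", not_taken_mispredicts(trace))
print("bimodal misses:  ", bimodal_mispredicts(trace))
```

On loop-heavy code like this trace, the static predictor misses every taken iteration, while the counter misses only around loop entry and exit.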

Results

Why use power as a performance criterion?
T. Mudge, “Power: A first class design constraint,” Computer, vol. 34, no. 4, April 2001, pp.
– Limiting power consumption is critical, particularly in portable and mobile applications such as cell phones and laptops, because battery life is limited
– One of the major markets for ARM is portable and mobile products

Power Estimation: Measurement Methodology
– ARM simulator and power measurement tool: PowerAnalyzer 1.1 from UMICH
– Configured for the current (SA-110) and next (PXA-250) generation
– Input: the same dataset (>3M) for all algorithms, to achieve a fair comparison and reliable results
– Output: raw data on performance and power consumption, obtained from the PowerAnalyzer report
– Data processing (digesting) and visualizing

Power Estimation: Difficulties using PowerAnalyzer
– The report gives power consumption for every ARM component, but no units!
– Since all these numbers are huge, it is difficult to figure out what they mean

Power Estimation

Power Estimation

Power Estimation

Conclusion
– The performance gain in the next generation of processors is offset by the increase in power consumption: Intel XScale almost doubles the power consumption for about a 10% performance gain over the SA-110
– Next-generation processors with larger caches improve performance
– The bimodal branch predictor greatly reduces the number of mispredictions
– Power consumption depends not only on hardware architecture and system configuration (system clock, etc.), but also heavily on the benchmark and input dataset
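The power/performance trade-off in the conclusion can be made concrete with the relation energy = power × time. The Python sketch below treats "almost doubles" as a 2.0× power ratio and "about 10% performance gain" as a 1.10× speedup in run time; both readings are assumptions for illustration, not measured values from the project.

```python
def energy_ratio(power_ratio, speedup):
    """Relative energy per run.

    Energy = power x time, and run time shrinks by the speedup factor,
    so the energy ratio is power_ratio / speedup.
    """
    return power_ratio / speedup


# Assumed round numbers read off the conclusion slide.
ratio = energy_ratio(power_ratio=2.0, speedup=1.10)
print(f"energy per run: {ratio:.2f}x")
```

With these rough figures, each run would cost roughly 1.8 times the energy, which is why a modest speedup does not compensate for the higher power draw on a battery-limited device.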

Thank You. Questions…