October 6, 2004. Software Technology Forum. The Renaissance of Compiler Development: Compiler optimizations motivated by embedded systems. Tibor Gyimóthy.

Presentation transcript:

The Renaissance of Compiler Development
Compiler optimizations motivated by embedded systems
Tibor Gyimóthy, University of Szeged
Software Technology Forum, October 6, 2004

Open Source Laboratory at the University of Szeged

Open Source Laboratory at the University of Szeged
Beszédes Árpád, Ferenc Rudolf, Gergely Tamás, Gyimóthy Tibor, Jász Judit, Havasi Ferenc, Kiss Ákos, Lóki Gábor, Patrik Kluba, Siket István, Siket Péter, Sógor Zoltán, Tóth Gábor, Vidács László

The Dragon book

Good quality compilers

New challenges: embedded processors

Low energy computation
- System-wide optimizations are required to save energy.
- Software design has a significant impact on the energy consumption of the processor.
- Accurate energy models of the hardware modules are required for power analysis at the system level.

Low energy computation (cont.)
- Instruction-level optimizations include reordering instructions to reduce switching, reducing the number of memory operands, etc.
- The register relabeling technique reorders the register labels of the generated code.
- A sample trace and a power model are used to obtain the new labels (The Pennsylvania State University).
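To make the relabeling idea concrete, here is a minimal sketch, not the Pennsylvania State University method itself. It assumes a toy power model in which energy is proportional to the number of bits that flip between the register fields of consecutive instructions, and a trace that records one register number per instruction. The names `switching_cost` and `relabel` are hypothetical, and the exhaustive search is only practical for a handful of registers.

```python
from itertools import permutations

def switching_cost(trace, labels):
    """Toy power model (an assumption): count the bits that flip between
    the register fields of consecutive instructions in the trace."""
    cost = 0
    for prev, curr in zip(trace, trace[1:]):
        cost += bin(labels[prev] ^ labels[curr]).count("1")
    return cost

def relabel(trace, num_regs):
    """Try every permutation of register labels and keep the cheapest one."""
    regs = range(num_regs)
    best = min(permutations(regs),
               key=lambda p: switching_cost(trace, dict(zip(regs, p))))
    return dict(zip(regs, best))

# Hypothetical sample trace: the register used by each executed instruction.
trace = [0, 3, 0, 3, 1, 2, 1, 2]
identity = {r: r for r in range(4)}
new_labels = relabel(trace, 4)
print(switching_cost(trace, identity), switching_cost(trace, new_labels))
```

The relabeling changes only the register names, so the program's behaviour is unchanged as long as every use of a register is renamed consistently.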

Low energy computation (cont.)
- ARM is the most popular processor for the embedded domain.
- The 32-bit ARM processor also supports the 16-bit Thumb instruction set.
- Using Thumb code reduces I-cache activity (and therefore energy).
- However, the restricted Thumb instruction set may lead to a loss of performance.
- Profile-guided algorithms have been proposed for generating mixed ARM and Thumb code (The University of Arizona).
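A profile-guided choice between the two instruction sets could look roughly like the sketch below. This is a simple threshold heuristic written for illustration, not the University of Arizona algorithm: the hottest functions, which together cover a chosen share of the profiled execution counts, are compiled as ARM for speed, and everything else as Thumb for density. The function name, the `hot_fraction` parameter and the profile data are all hypothetical.

```python
def choose_instruction_sets(profile, hot_fraction=0.9):
    """Map each function to "arm" or "thumb" from its execution count.

    profile: dict of function name -> profiled execution count.
    hot_fraction: share of total executions that should run as ARM code
    (a hypothetical tuning knob).
    """
    total = sum(profile.values())
    choice = {}
    covered = 0
    # Visit functions from hottest to coldest.
    for name, count in sorted(profile.items(), key=lambda kv: kv[1], reverse=True):
        if total > 0 and covered < hot_fraction * total:
            choice[name] = "arm"    # hot: favour performance
            covered += count
        else:
            choice[name] = "thumb"  # cold: favour code density
    return choice

# Hypothetical profile data.
profile = {"fir_filter": 9200, "parse_header": 500, "init": 200, "log_error": 100}
choice = choose_instruction_sets(profile)
print(choice)
```

A real implementation would also have to account for the cost of switching between ARM and Thumb state at call boundaries, which this sketch ignores.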

Low energy computation (cont.)
- System-level, power-optimizing data-flow transformations are applied to multimedia applications (IMEC, Leuven, Belgium).
- The main aim is to reduce the power consumed by data storage and transfers, which is a significant part of the total power budget of the system.
- Performance and code size must be taken into account as well.

Low energy computation (cont.)
- Traditional compiler approaches that focus only on speed are not sufficient for multimedia applications.
- The power cost model is linear in the access frequency, and its dependence on the memory size is given by a polynomial function.
- Code transformations are applied to the original source code (C).
- The method reduces the size of the array signals and the number of accesses to them.
- Very large power savings can be achieved without introducing significant performance penalties.
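The shape of such a cost model can be sketched as follows. Only the structure comes from the slide (linear in access frequency, polynomial in memory size); the function name `memory_power` and the coefficients are hypothetical placeholders that would have to be fitted to a real memory technology.

```python
def memory_power(access_freq, size, coeffs):
    """Estimate memory power: linear in access frequency, polynomial in size.

    coeffs[i] is the (hypothetical) coefficient of size**i.
    """
    per_access = sum(c * size ** i for i, c in enumerate(coeffs))
    return access_freq * per_access

# Hypothetical coefficients: cost per access = 2 + 1 * size.
coeffs = [2, 1]
before = memory_power(1000, 64, coeffs)  # large array, accessed often
after = memory_power(400, 16, coeffs)    # smaller array, fewer accesses
print(before, after)
```

Under a model of this shape, a transformation that shrinks an array and reduces the number of accesses to it lowers both factors of the product, which is why the two effects reinforce each other.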

Low energy computation (cont.)
- Closely related to code size reduction:
  - Executing fewer instructions
  - Accessing external memory less frequently
- Aggressive code size optimization techniques
- Post-link time optimizations

Code size reductions in GCC
- A survey paper in ACM Computing Surveys, 2003
- Magic switches & patches
- Local factoring
- Sequence abstraction
- CSiBE: Code Size Benchmark
- GCC improvement for Symbian

Magic switches & patches
- Function inlining
- Tree-to-RTL extension
- Optimizing large jump tables
- Extending move and compare parallelization
- Crossjumping cleanup

Local factoring
- Code motion techniques
- Strategy:
  - Move identical instructions from basic blocks to their common predecessor or successor
  - Data and register dependences must not be altered
- Implemented:
  - Code hoisting: moving the code to an earlier place in the execution path
  - Code sinking: moving the code to a later place in the execution path
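The two code motions can be illustrated with a small sketch on basic blocks represented as lists of instruction strings. It is a simplification, not the GCC implementation. It assumes that branch terminators are kept outside the lists, that the sibling blocks have no other predecessors or successors, and that moving the instructions does not alter any data or register dependence (for example, a hoisted instruction must not change the branch condition). The function names and the pseudo-assembly are hypothetical.

```python
def common_prefix_len(blocks):
    """Number of leading instructions that are identical in every block."""
    n = 0
    for column in zip(*blocks):
        if len(set(column)) != 1:
            break
        n += 1
    return n

def hoist(pred, succs):
    """Move the common prefix of all successors to the end of pred.
    succs must be a non-empty list of blocks."""
    n = common_prefix_len(succs)
    return pred + succs[0][:n], [s[n:] for s in succs]

def sink(preds, succ):
    """Move the common suffix of all predecessors to the start of succ.
    preds must be a non-empty list of blocks."""
    n = common_prefix_len([p[::-1] for p in preds])
    tail = preds[0][len(preds[0]) - n:]
    return [p[:len(p) - n] for p in preds], tail + succ

# Hoisting: both branches start with the same load.
pred, succs = hoist(["cmp r0, #0"],
                    [["ldr r1, [r2]", "add r3, r1, #1"],
                     ["ldr r1, [r2]", "sub r3, r1, #1"]])

# Sinking: both branches end with the same store.
preds, succ = sink([["add r3, r1, #1", "str r3, [r4]"],
                    ["sub r3, r1, #1", "str r3, [r4]"]],
                   ["mov r0, r3"])
```

In both cases one copy of the shared instruction replaces two, so the code shrinks by one instruction per factored instruction and per extra sibling block.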

Code hoisting

Code sinking

Sequence abstraction
- Works on SESE (single-entry, single-exit) code fragments
- Strategy:
  - Find regions of identical instructions that can be turned into procedures
  - Replace all occurrences with calls to the newly created subroutine
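A toy version of this strategy is sketched below. It works on a flat list of instruction strings and a fixed sequence length, and it assumes that every candidate window is straight-line code (and therefore trivially single-entry, single-exit) and that inserting a call does not clobber any live register. The function name `abstract_sequence`, the `call`/`ret` pseudo-instructions and the sample code are hypothetical.

```python
def abstract_sequence(code, length, name="abstracted_seq"):
    """Replace repeated runs of `length` instructions with calls.

    Returns (new_code, procedure_body), or (code, None) when no sequence
    of that length occurs at least twice without overlapping.
    """
    # Record the start positions of every window of the requested length.
    positions = {}
    for i in range(len(code) - length + 1):
        positions.setdefault(tuple(code[i:i + length]), []).append(i)

    # Pick the window with the most non-overlapping occurrences.
    best, best_starts = None, []
    for seq, starts in positions.items():
        chosen = []
        for s in starts:
            if not chosen or s >= chosen[-1] + length:
                chosen.append(s)
        if len(chosen) > max(1, len(best_starts)):
            best, best_starts = seq, chosen
    if best is None:
        return code, None

    # Rewrite the code, emitting one call per chosen occurrence.
    out, i, starts = [], 0, set(best_starts)
    while i < len(code):
        if i in starts:
            out.append(f"call {name}")
            i += length
        else:
            out.append(code[i])
            i += 1
    return out, list(best) + ["ret"]

seq = ["ldr r0, [r4]", "add r0, r0, #1", "str r0, [r4]"]
code = seq + ["mov r5, #0"] + seq + ["mov r6, #1"] + seq
new_code, proc = abstract_sequence(code, 3)
print(len(code), len(new_code) + len(proc))
```

The transformation only pays off when the instructions saved outweigh the added calls and the return: here three occurrences of a three-instruction sequence shrink 11 instructions to 9, whereas two occurrences would save nothing.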

Sequence abstraction (cont.)

Sequence abstraction
- RTL implementation:
  - Compilation unit: function
  - No procedure needs to be created; only a call to a labelled representative code region is created

CSiBE: GCC Code Size Benchmark
- Introduced in 2003
- A de facto standard code size benchmark for GCC
- Continuous monitoring of the impact of new patches on code size
- New version: compilation time and performance
- More and more GCC developers are using CSiBE in their daily work

CSiBE (cont.)

CSiBE (cont.)

CSiBE (cont.)

Overall tendency of the code size
[Chart; data labels: 1.3%, 4.3%, 5.1%, 7.2%]

GCC improvement for Symbian
- The official Symbian build is based on GCC 2.9, from the year
- GCC was extended to support the Symbian target.
- The modified compiler reduces the code size and also improves performance (5-10%).

Post-link time optimization
- Interprocedural versions of the classical compiler optimization techniques are used for binary rewriting of machine code (whole-system optimization; Squeeze, Debray et al.; ATOM, Univ. of Szeged).
- Techniques:
  - Interprocedural control flow analysis, constant propagation, register liveness analysis
  - Redundant-code elimination
  - Unreachable-code elimination
  - Dead-code elimination
  - Strength reduction
  - Local factoring
  - Procedural abstraction
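As an illustration of one of the listed techniques, unreachable-code elimination at function granularity can be sketched as a reachability walk over the whole-program call graph. This is not how Squeeze or the Szeged tool is implemented. It assumes the call graph is complete; a real binary rewriter must treat indirect calls and address-taken functions conservatively. The function names and the sample graph are hypothetical.

```python
def reachable_functions(call_graph, entry):
    """Return every function reachable from the entry point."""
    seen, stack = set(), [entry]
    while stack:
        func = stack.pop()
        if func in seen:
            continue
        seen.add(func)
        stack.extend(call_graph.get(func, ()))
    return seen

def remove_unreachable(call_graph, entry):
    """Keep only the functions that can be reached from the entry point."""
    keep = reachable_functions(call_graph, entry)
    return {f: callees for f, callees in call_graph.items() if f in keep}

# Hypothetical whole-program call graph: function -> functions it calls.
call_graph = {
    "main": ["init", "run"],
    "init": [],
    "run": ["step"],
    "step": ["run"],          # recursion is handled by the seen set
    "debug_dump": ["step"],   # never called, so it can be dropped
}
linked = remove_unreachable(call_graph, "main")
print(sorted(linked))
```

Because the analysis sees the fully linked program, it can discard library functions that no compilation unit could prove unused on its own.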

Some open issues in GCC development
- There is a need for effective size optimization methods at the Tree-SSA level.
- The sequence abstraction approach can be extended to the unit-at-a-time level.
- The post-link time optimization methods can be integrated into GCC.

Conclusions
- Open source software is widely used in industry
- There is a need for machine-level programmers
- Challenge: many people will use their programs

Effective compilers for embedded processors