
1 Demystifying GCC: Under the Hood of the GNU Compiler Collection
Morgan Deters (mdeters@cs.wustl.edu) and Ron K. Cytron (cytron@cs.wustl.edu)
Distributed Object Computing Laboratory, Washington University, St. Louis, MO
Copyright is held by the author/owner(s). OOPSLA'06, October 22–26, 2006, Portland, Oregon, USA. 2006 ACM 06/0010. Copyright © 2005–2006 Morgan Deters

2 OOPSLA 2006, Portland, Oregon. Morgan Deters and Ron K. Cytron, Demystifying GCC. 22 October 2006.
Tutorial Objectives
- Introduce the internals of GCC 4.1.1
  - Java and C++ front-ends
  - Optimizations
  - Back-end structure
- How to add new or change
  - language front-ends
  - optimizations
  - machine-specific back-ends
- How to debug/improve GCC

3 GCC Big Picture
- What is GCC?
- Why use GCC?
- What does compilation with GCC look like?

4 GCC Big Picture: What is GCC?
A compiler for multiple languages…
- C
- C++
- Java
- Objective-C/C++
- FORTRAN
- Ada

5 GCC Big Picture: What is GCC?
…supporting multiple targets
    arc        arm        avr        bfin
    c4x        cris       crx        fr30
    frv        h8300      i386       ia64
    iq2000     m32c       m32r       m68hc11
    m68k       mcore      mips       mmix
    mn10300    mt         pa         pdp11
    rs6000     s390       sh         sparc
    stormy16   v850       vax        xtensa
These are code generators; variants are also supported (e.g. powerpc is a "variant" of the rs6000 code generator)

6 GCC Big Picture: What GCC is not
GCC is not
- an assembler (see GNU binutils)
- a C library (see glibc)
- a debugger (see gdb)
- an IDE

7 GCC Big Picture: Advantages of using GCC as an R&D platform
- Research is immediately usable by everyone
  - Large development community and user base
  - GCC is a modern, practical compiler: multiple architectures, full standard languages, optimizations, debugging support
- You can meet GCC halfway
  - modular: hack some parts, rely on the others
- Can incorporate bug fixes that come along
  - minor version upgrades (e.g. 3.3.x → 3.4.x): no big deal
  - major version upgrades (e.g. 3.x → 4.x): more of a pain
- Need not maintain code indefinitely (if incorporated)

8 GCC Big Picture: The GCC project and the GPL
- Open-source
  - covered by GNU General Public License (GPL)
- Any changes you make to GCC source code or associated libraries must also be GPLed
- However, compiler and libraries can be used/linked against in non-GPL development
Your improvements to GCC must be open-source, but your customers need not open-source their programs to use your stuff

9 GCC Big Picture: Typical structure of GCC compilation
[Diagram: the gcc/g++/gcj driver runs three tools in sequence: source program → compiler → assembly program → assembler → ELF object → linker]

10 GCC Big Picture: Inside the compiler
[Diagram: stages inside the compiler (C, C++, Java): parser / semantic checker → gimplifier → tree optimizations → expander → RTL passes → target arch instruction selection. The stages before the expander work on trees; the stages after it work on RTL.]

11 GCC Basics
- How do you build GCC?
- How do you navigate the source tree?

12 GCC Basics: Getting Started
Requirements to build GCC
- usual suite of UNIX tools (C compiler, assembler/linker, GNU Make, tar, awk, POSIX shell)
For development
- GNU m4 and GNU autotools (autoconf/automake/libtool)
- gperf
- bison, flex
- autogen, guile, gettext, perl, Texinfo, diffutils, patch, …
Obtaining GCC sources
- gcc.gnu.org or local mirror (see gcc.gnu.org/mirrors.html)
- get gcc-core package, then language add-ons
  - gcc-java requires gcc-g++

13 GCC Basics: Building GCC from sources
Configure it in a separate build directory from sources
- /path/to/source/directory/configure options…
- --prefix=install-location
- --enable-languages=comma-separated-language-list
- --enable-checking turns on sanity checks (especially on intermediate representation)
Build it!
- Environment variables useful when debugging compiler/runtime
    CFLAGS                stage 1 flags (using host C compiler)
    BOOT_CFLAGS           stage 2 and stage 3 flags (using stage 1 GCC)
    CFLAGS_FOR_TARGET     flags for new GCC building target binaries
    CXXFLAGS_FOR_TARGET   flags for new GCC building libstdc++/others
    GCJFLAGS              flags for new GCC building Java runtime
- '-O0 -ggdb3' is recommended when debugging

14 GCC Basics: Building GCC from sources
Build it! continued…
- make bootstrap (to bootstrap) or make (to not)
  - bootstrapping is useful when compiling with a non-GCC host compiler
  - during development, a non-bootstrap build is faster and also better at recompiling just those sources that have changed
- use make's -j option to speed things up on MP/dual core
- make bootstrap-lean: cleans up between stages, uses less disk
- make profiledbootstrap: produces a faster compiler, but needs a GCC host; -j unsupported
Install it!
- make install

15 GCC Basics: Building a cross-compiler
Code generator can be built for any target
- runtime libraries then are built using that code generator
Since GCC outputs assembly, you actually need a full cross development toolchain
- Dan Kegel's crosstool automates a GNU/Linux cross chain for popular configurations:
  - Linux kernel headers
  - GNU binutils
  - glibc
  - gcc
- see kegel.com/crosstool

16 GCC Basics: Getting Around
Other tools recommended when hacking GCC
    GNU Screen   attach/reattach terminal sessions
    etags        navigation to source definitions (emacs)
    ctags        navigation to source definitions (vi)
    c++filt      demangle C++/Java mangled symbols
    readelf      decompose ELF files
    objdump      object file dumper/disassembler
    gdb          GNU debugger

17 GCC Basics: GCC Drivers
gcc, g++, gcj are drivers, not compilers
- They will execute (as appropriate):
  - compiler (cc1, cc1plus, jc1)
  - Java program main entry point generation (jvgenmain)
  - assembler (as)
  - linker (collect2)
Differences between drivers include active #defines, default libraries, other behavior
- but can use any driver for any source language

18 GCC Basics: Most useful driver options for debugging
    -E                   preprocess, don't compile
    -S                   compile, don't assemble
    -H                   verbose header inclusion
    -save-temps          save temporary files
    -print-search-dirs   print search paths
    -v                   verbose (see what the driver does)
    -g                   include debugging symbols
    --help               get command line help
    --version            show full version info
    -dumpversion         show minimal version info

19 GCC Basics: For extra help
    man gcc       basic option assistance
    info gcc      using gcc in-depth; language extensions etc.
    info gccint   internals documentation
Top-level INSTALL directory in distribution provides help on configuring and building GCC

20 GCC Basics: Tour of GCC source
    INSTALL       configuration/installation documentation
    boehm-gc      the Boehm garbage collector
    config        architecture-specific configure fragments
    contrib       contributed scripts
    fastjar       a replacement for the jar tool
    fixincludes   source for a program to fix host header files when they aren't ANSI-compliant
    gcc           the main compiler source
    include       headers used by GCC (libiberty mostly)
    intl          support for languages other than English

21 GCC Basics: Tour of GCC source, cont'd
    libcpp               source for C preprocessing library
    libffi               Foreign Function Interface library (allows function callers and receivers to have different calling conventions)
    libiberty            useful utility routines (symbol tables etc.) used by GCC, and replacement functions for common things not provided by host
    libjava              source for standard Java library
    libmudflap           source for a pointer instrumentation library
    libstdc++-v3         source for standard C++ library
    maintainer-scripts   utility scripts for GCC maintainers
    zlib                 compression library source

22 The GCC Front-End
- Option processing
- Controlling drivers and hooking up front-ends
- The C, C++, and Java front-ends
- The GENERIC high-level intermediate representation
[Diagram: Front-end, Middle-end, Back-end]

23 GCC Front-End: The GCC Front-End
gcc, g++, gcj driver entry point
- main (gcc/gcc.c)
cc1, cc1plus, jc1 share a common entry point
- toplev_main (gcc/toplev.c)
  - actual main in gcc/main.c
    - just calls toplev_main()
    - can be overridden by front-end

24 GCC Front-End: Command-line option processing
In gcc/ directory
    common.opt      option definitions
    opts.{c,h}      common_handle_option()
    c-opts.c        c_common_handle_option()
    c.opt           C compiler option definitions
    java/lang.opt   Java compiler option definitions
    java/lang.c     java_handle_option()
These are cc1, cc1plus, jc1 option handling routines
- drivers just pass on arguments as declared in spec files

25 GCC Front-End: common.opt
Parsed by awk scripts at build time to generate options.c, options.h
Simple format
- Language specifications and option stanzas
Each option stanza contains
1. option name
2. space-separated options list
3. documentation string for --help output

26 GCC Front-End: Properties of command-line options
Available properties for use in .opt option spec files are
    Common            option is available for all front-ends
    Target            option is target-specific
    Joined            argument is mandatory and may be joined
    Separate          argument is mandatory and may be separate
    JoinedOrMissing   optional argument, must be joined if present
    RejectNegative    there is not an associated "no-" option
    UInteger          argument expected is a nonnegative integer
    Undocumented      undocumented; do not include in --help output
    Report            -fverbose-asm should report the state of this option

27 GCC Front-End: Properties of options, cont'd
    Var(var-name)                set var-name to true (or argument) if present
    VarExists                    do not define variable in resulting options.c
    Init(value)                  static initializer for variable
    Mask(name)                   associated with a bit in target_flags bit vector; MASK_name is automatically #defined to the bitmask; TARGET_name is automatically #defined as an expression that is 1 when the option is used, 0 when not
    InverseMask(other, [this])   option is inverse of another option with Mask(other); if this is given, #define TARGET_this
    MaskExists                   don't #define again; use for synonymous options
    Condition(cond)              option permitted iff preprocessor cond is true
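
The effect of the Mask(name) property can be pictured with a small hand-written model. This is only a sketch: FOO and BAR are hypothetical option names, set_option is an illustrative helper that does not exist in GCC, and the macros are written by hand here rather than generated from a .opt file.

```c
#include <assert.h>

/* Toy stand-in for the target_flags bit vector named on the slide. */
static int target_flags = 0;

/* Hand-written equivalents of what Mask(FOO) and Mask(BAR) would provide.
   FOO and BAR are hypothetical option names, not real GCC options. */
#define MASK_FOO (1 << 0)
#define MASK_BAR (1 << 1)
#define TARGET_FOO ((target_flags & MASK_FOO) != 0)
#define TARGET_BAR ((target_flags & MASK_BAR) != 0)

/* Hypothetical helper: set the bit when the option is given,
   clear it when its "no-" form is given. */
static void set_option(int mask, int enabled)
{
    if (enabled)
        target_flags |= mask;
    else
        target_flags &= ~mask;
    /* Only the two bits defined above may ever be set. */
    assert((target_flags & ~(MASK_FOO | MASK_BAR)) == 0);
}
```

After set_option(MASK_FOO, 1), the expression TARGET_FOO evaluates to 1 while TARGET_BAR stays 0, which is the "1 when the option is used, 0 when not" behavior described for TARGET_name.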

28 GCC Front-End: Language-specific options
- gcc/c.opt, gcc/java/lang.opt, gcc/cp/lang.opt
- Special processing in gcc/java/lang.c
- Specify valid language-names as an option

29 In Greater Depth: Adding command-line options

30 GCC Front-End: Controlling the drivers: spec files
    gcc/gcc.c               specs for gcc driver
    gcc/cp/lang-specs.h     additional specs for g++ driver
    gcc/java/lang-specs.h   additional specs for gcj driver
gcc/gcc.c contains documentation on spec language
Use -dumpspecs to see specifications
Example spec (adapted from gcc/gcc.c):
    %{E|M|MM:%(trad_capable_cpp) %(cpp_options) %(cpp_debug_options)}
    %{!E:%{!M:%{!MM:
    %{traditional|ftraditional:
    %eGNU C no longer supports -traditional without -E}
    %{save-temps|traditional-cpp|no-integrated-cpp:%(trad_capable_cpp)
    %(cpp_options) -o %{save-temps:%b.i} %{!save-temps:%g.i} \n
    cc1 -fpreprocessed %{save-temps:%b.i} %{!save-temps:%g.i} %(cc1_options)}
    %{!save-temps:%{!traditional-cpp:%{!no-integrated-cpp:
    cc1 %(cpp_unique_options) %(cc1_options)}}}
    %{!fsyntax-only:%(invoke_as)}}}

31 GCC Front-End: The C front-end
C front-end is in gcc/ directory
- parse entry point c_common_parse_file (c-opts.c)
  - workhorse is c_parse_file (c-parser.c)
    c-common.def     IR codes for C compiler
    c-common.c       functions for C-like front-ends
    c-convert.c      type conversion
    c-cppbuiltin.c   built-in preprocessor #defines
    c-decl.c         declaration handling
    c-dump.c         IR-dumping
    c-errors.c       pedantic warning issuance
    c-format.c       format checking for printf-like functions
    c-gimplify.c     lowering of IR (and documentation)

32 GCC Front-End: The C front-end, cont'd
    c-incpath.c        include path generation for preprocessor
    c-lang.c           language infrastructure, front-end hookups
    c-lex.c            lexical analyzer (manually coded)
    c-objc-common.c    some functions for C and Objective-C
    c-opts.c           option processing, some init stuff
    c-parser.c         parser (based on an old bison parser)
    c-pch.c            precompiled header support
    c-ppoutput.c       preprocessing-only support (-E option)
    c-pragma.c         support for #pragma pack and #pragma weak
    c-pretty-print.c   used to pretty-print expressions in error messages
    c-semantics.c      statement list handling in IR
    c-typeck.c         functions to build IR, type checks
    gccspec.c          driver-specific tasks for gcc driver

33 GCC Front-End: The C++ front-end
In subdirectory gcc/cp/
- same parse entry point as C compiler
    call.c               function/method invocation lookup and handling
    class.c              building (the runtime artifacts of) classes etc.
    cp-gimplify.c        IR lowering
    cp-lang.c            language hooks for C++ front-end
    cp-objcp-common.c    common bits for C++ and Objective-C++
    cvt.c                type conversion
    cxx-pretty-print.c   C++ pretty-printer
    decl.c               declaration and variable handling
    decl2.c              additional declaration and variable handling
    dump.c               IR dumping

34 GCC Front-End: The C++ front-end, cont'd
    error.c         C++ error-reporting callbacks
    except.c        C++ exception-handling support
    expr.c          IR lowering for C++
    friend.c        C++ "friend" support
    init.c          data initializers and constructors
    lex.c           the C++ lexical analyzer
    mangle.c        C++ name mangling
    method.c        method handling; default constructor generation
    name-lookup.c   context-aware name (type, var, namespace) lookup
    optimize.c      constructor/destructor cloning
    parser.c        the C++ parser
    pt.c            parameterized type (template) support
    ptree.c         IR pretty-printing
    repo.c          C++ template repository support

35 GCC Front-End: The C++ front-end, cont'd
    rtti.c        support for run-time type information
    search.c      type search in the presence of multiple inheritance
    semantics.c   semantic checking
    tree.c        C++ front-end specific IR functionality
    typeck.c      functionality dealing with types, conversion
    typeck2.c     types, conversion, type errors
    g++spec.c     driver-specific tasks for g++ driver

36 GCC Front-End: The Java front-end
In subdirectory gcc/java/
- parse entry point java_parse_file (jcf-parse.c)
    boehm.c           per-type bitmask building for Boehm GC
    buffer.{c,h}      expandable buffer data type
    builtins.c        builtin/inline functions for Java (like Math.min())
    check-init.c      checks over IR for uninitialized variables
    class.c           IR building of classes, class-references, vtables, etc.
    constants.c       class file constant pool handling
    decl.c            Java declaration support (misc.)
    except.c          Java exception support
    expr.c            Java expressions (misc.)
    gjavah.c          source for gcjh program
    java-gimplify.c   IR lowering

37 GCC Front-End: The Java front-end, cont'd
    jcf-depend.c    class file dependency tracking
    jcf-dump.c      source for jcf-dump program
    jcf-io.c        class file I/O utility functions
    jcf-parse.c     entry point for compiling Java files
    jcf-path.c      CLASSPATH-sensitive search
    jcf-reader.c    generic, pluggable class file reader
    jcf-write.c     class file writer
    jv-scan.c       source for jv-scan program
    jvgenmain.c     source for jvgenmain program
    jvspec.c        Java option specs
    lang.c          language hooks, options processing
    lex.c           Java lexical analyzer
    mangle.c        symbol-mangling routines
    mangle_name.c   symbol-mangling routines

38 GCC Front-End: The Java front-end, cont'd
    parse-scan.y    minimal, fast parser for syntax checking
    parse.y         Java (source-language) parser
    resource.c      support for --resource option
    typeck.c        routines related to types and type conversion
    verify-glue.c   interface between verifier and compiler
    verify-impl.c   bytecode verifier
    win32-host.c    for Windows; case-sensitive filename matching
    zextract.c      read class files from zip/jar archives
    keyword.gperf   Java keyword specification

39 GCC Front-End: Multiple "front-ends" for Java
common entry point at java_parse_file
- gcc/java/jcf-parse.c
compile .java → .o
- gcc/java/parse.y
compile .class → .o (or .jar → .so)
- gcc/java/expr.c (with gcc/java/jcf-reader.c)
- expand_byte_code, process_jvm_instruction
compile .java → .class (with -C option)
- gcc/java/parse.y with flag_emit_class_files set
- unusual back-end (as if syntax checking only)

40 GCC Front-End: The "treelang" front end: Essential front-end components
- configure fragment (config-lang.in)
- language-specific options (lang.opt)
- filename handling for driver (lang-specs.h)
- treelang-specific tree codes (treelang-tree.def)
- front-end hookups to toplev.c (treetree.c)
  - see gcc/langhooks.h for documentation
- flex scanner (lex.l)
- bison parser (parse.y)
- structural functions (tree1.c)

41 In Greater Depth: Adding a new front-end to GCC

42 GCC Front-End: GENERIC trees
Front-ends are written in C!
We'd like to have…
- tree node base class
  - subclasses for expressions etc.
Instead we have
- union tree_node (gcc/tree.h)
  - each component of the union is a struct

43 GCC Front-End: Structs vs. unions
[Figure: memory layout from low memory to high memory. In a struct, field 1 through field 4 each occupy their own storage, one after another. In a union, field 1 through field 4 all start at the same address.]
Fields of a union overlap in memory; you're on your own for type safety!
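
The difference in layout can be checked directly in C. The following is a minimal sketch; the type and function names (four_struct, four_union, union_fields_overlap) are illustrative and do not come from GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Four int fields laid out one after another. */
struct four_struct { int f1; int f2; int f3; int f4; };

/* Four int fields that all share the same storage. */
union four_union { int f1; int f2; int f3; int f4; };

/* Returns 1 if a value written through one union member
   is visible through another member. */
static int union_fields_overlap(void)
{
    union four_union u;
    u.f1 = 42;
    return u.f4 == 42;
}

/* Returns 1 if struct members keep independent values. */
static int struct_fields_are_separate(void)
{
    struct four_struct s = { 1, 2, 3, 4 };
    s.f1 = 42;
    assert(sizeof(struct four_struct) > sizeof(union four_union));
    return s.f4 == 4;
}
```

The union is only as large as one int, while the struct needs room for all four. Nothing in the language records which union member was written last, which is the type-safety burden the slide refers to.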

44 GCC Front-End: The tree_node union
    typedef union tree_node *tree;
[Figure: union tree_node laid out from low memory to high memory. The members int_cst, type, identifier, field_decl, exp, … overlap, and each begins with the shared "common" part.]
Everything is a tree!

45 GCC Front-End: The tree_node union
The "common" part contains
- code (kind of tree: declaration, expression, etc.)
- chain (for linking trees together)
- type (type of the represented item; also a tree)
- flags
  - side effects
  - addressable
  - access flags (used for other things in non-declarations)
  - 7 language-specific flags

46 GCC Front-End: Macros for accessing tree parts
In the common part
- TREE_*
  - TREE_CODE(tree)
  - TREE_TYPE(tree)
  - TREE_SIDE_EFFECTS(tree)
  - etc.
For specific trees
- type trees: TYPE_*
  - TYPE_FIELDS(tree)   gets a list of fields in the type
  - TYPE_NAME(tree)     gets the type's associated decl
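
The pattern of a union whose members share a common header, read through accessor macros, can be modeled in a few lines. This is a toy model under stated assumptions: every toy_ and TOY_ name is hypothetical, and the real definitions in gcc/tree.h carry many more fields and tree codes.

```c
#include <assert.h>
#include <stddef.h>

/* Two illustrative tree codes; GCC's real list lives in gcc/tree.def. */
enum toy_code { TOY_INTEGER_CST, TOY_PLUS_EXPR };

typedef union toy_node *toy_tree;

/* Shared header: every variant of the union starts with it. */
struct toy_common {
    enum toy_code code;  /* kind of tree */
    toy_tree chain;      /* for linking trees together */
    toy_tree type;       /* type of the represented item, also a tree */
};

struct toy_int_cst { struct toy_common common; long value; };
struct toy_exp     { struct toy_common common; toy_tree operands[2]; };

union toy_node {
    struct toy_common  common;
    struct toy_int_cst int_cst;
    struct toy_exp     exp;
};

/* Accessor macros in the style of TREE_CODE and TREE_TYPE. */
#define TOY_CODE(t)       ((t)->common.code)
#define TOY_TYPE(t)       ((t)->common.type)
#define TOY_INT_VALUE(t)  ((t)->int_cst.value)
#define TOY_OPERAND(t, i) ((t)->exp.operands[(i)])

static void toy_init_int(toy_tree t, long value)
{
    t->int_cst.common.code = TOY_INTEGER_CST;
    t->int_cst.common.chain = NULL;
    t->int_cst.common.type = NULL;
    t->int_cst.value = value;
}

static void toy_init_plus(toy_tree t, toy_tree lhs, toy_tree rhs)
{
    t->exp.common.code = TOY_PLUS_EXPR;
    t->exp.common.chain = NULL;
    t->exp.common.type = NULL;
    t->exp.operands[0] = lhs;
    t->exp.operands[1] = rhs;
}

/* Dispatch on the code stored in the common part. */
static long toy_eval(toy_tree t)
{
    switch (TOY_CODE(t)) {
    case TOY_INTEGER_CST:
        return TOY_INT_VALUE(t);
    case TOY_PLUS_EXPR:
        return toy_eval(TOY_OPERAND(t, 0)) + toy_eval(TOY_OPERAND(t, 1));
    }
    assert(!"unknown tree code");
    return 0;
}
```

Because every variant begins with the same header, TOY_CODE works on any node, and the code it returns tells the caller which of the overlapping members is safe to read.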

47 GCC Front-End: Expression trees
Lots of tree codes used for expressions
- gcc/tree.def defines all standard tree codes
    LT_EXPR            less-than conditional
    TRUTH_ORIF_EXPR    short-circuiting OR conditional
    MODIFY_EXPR        assignment
    NOP_EXPR           type promotion (typically)
    SAVE_EXPR          store in temporary for multiple uses
    ADDR_EXPR          take address of
Front-end extensions to GENERIC permitted
- gcc/c-common.def
- gcc/cp/cp-tree.def (e.g. DYNAMIC_CAST_EXPR)
- gcc/java/java-tree.def (e.g. SYNCHRONIZED_EXPR)

48 GCC Front-End: A few useful front-end functions
    build()        expression tree building: pass tree code, tree type, and (arbitrary number of) operands
    fold()         simple tree restructuring and optimization; mostly useful for constant folding
    gcc_assert()   assertion verification: if it fails it gives an "internal compiler error" report with source file and line number under compilation (as well as source file and line number in compiler code)
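
The idea behind building a tree and then folding its constant subtrees can be sketched with a toy expression type. This is not GCC's build() or fold(): the toy_ names are hypothetical, the node is a plain struct rather than a tree_node, and nodes are never freed (GCC itself relies on a garbage collector).

```c
#include <assert.h>
#include <stdlib.h>

enum toy_code { TOY_INTEGER_CST, TOY_VAR, TOY_PLUS_EXPR, TOY_MULT_EXPR };

struct toy_expr {
    enum toy_code code;
    long value;                  /* meaningful for TOY_INTEGER_CST only */
    struct toy_expr *op0, *op1;  /* meaningful for the binary codes only */
};

static struct toy_expr *toy_new(enum toy_code code)
{
    struct toy_expr *e = malloc(sizeof *e);
    if (e == NULL)
        abort();
    e->code = code;
    e->value = 0;
    e->op0 = e->op1 = NULL;
    return e;
}

static struct toy_expr *toy_build_int(long value)
{
    struct toy_expr *e = toy_new(TOY_INTEGER_CST);
    e->value = value;
    return e;
}

/* Build a binary expression from a code and two operands. */
static struct toy_expr *toy_build2(enum toy_code code,
                                   struct toy_expr *op0, struct toy_expr *op1)
{
    struct toy_expr *e = toy_new(code);
    assert(code == TOY_PLUS_EXPR || code == TOY_MULT_EXPR);
    e->op0 = op0;
    e->op1 = op1;
    return e;
}

/* Fold bottom-up: a binary node whose operands are both constants
   is replaced by a single constant node. */
static struct toy_expr *toy_fold(struct toy_expr *e)
{
    struct toy_expr *a, *b;

    if (e->code == TOY_INTEGER_CST || e->code == TOY_VAR)
        return e;
    a = toy_fold(e->op0);
    b = toy_fold(e->op1);
    if (a->code == TOY_INTEGER_CST && b->code == TOY_INTEGER_CST)
        return toy_build_int(e->code == TOY_PLUS_EXPR ? a->value + b->value
                                                      : a->value * b->value);
    e->op0 = a;
    e->op1 = b;
    return e;
}
```

Folding 2 + 3 * 4 yields a single constant node holding 14, while x + 3 * 4 keeps its addition node and only the multiplication collapses to 12.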

49 GCC Front-End: Code naming conventions
- Preprocessor macros: ALL UPPERCASE
- Variables/functions: all lowercase with underscores
- Predicates end in "_P" or "_p"
- Global flags start with "flag_"
- Global trees (vary somewhat with front-end)
  - null_node (or null_pointer_node)
  - integer_zero_node
  - void_type_node
  - integer_unsigned_type_node (or unsigned_int_type_node)
- Tree accessor macros: FROM_TO (e.g. TYPE_DECL)

50 In Greater Depth: Modifying the front-end

51 GCC Front-End: Gimplification
GENERIC + extensions → GIMPLE
- GIMPLE is a subset of GENERIC
- based on SIMPLE from McGill's McCAT group
GIMPLE is just like GENERIC but
- no language extensions
  - front-end gimplify_expr callback
- 3-address form (with temporary variables)
- control structures lowered to goto
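
What "3-address form" and "lowered to goto" mean can be shown by lowering a small function by hand. The two functions below are an illustration only: the lowered version is written manually in plain C, it is not the output of any GCC dump, and the names sum_original, sum_lowered, t1 and t2 are made up for this sketch.

```c
#include <assert.h>

/* Source form: a nested expression inside a while loop. */
static int sum_original(int a, int b, int c)
{
    int x = 0;
    while (a > 0) {
        x = x + (a * b + c);
        a = a - 1;
    }
    return x;
}

/* Hand-lowered form: each statement has at most one operator,
   intermediate values go through temporaries, and the loop is
   expressed with labels and gotos. */
static int sum_lowered(int a, int b, int c)
{
    int x, t1, t2;

    x = 0;
    goto test;
body:
    t1 = a * b;
    t2 = t1 + c;
    x = x + t2;
    a = a - 1;
test:
    if (a > 0) goto body; else goto done;
done:
    return x;
}

/* Both forms must agree for any inputs. */
static void check_lowering(int a, int b, int c)
{
    assert(sum_original(a, b, c) == sum_lowered(a, b, c));
}
```

Statements in this shape are easy for later passes to analyze, because every operation names its operands and its result explicitly.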

52 GCC Middle-End
- Optimization of trees
- Static Single-Assignment form
- The Register Transfer Language intermediate representation
[Diagram: Front-end, Middle-end, Back-end]

53 GCC Middle-End: The middle-end in context
[Diagram: Front-end, Middle-end, Back-end, with these stages placed along it:]
- Gimplification
- Tree optimizations
- Expansion into RTL
- Register allocation
- RTL passes

54 GCC Middle-End: Optimizations over the tree representation
Managed by pass manager in gcc/passes.c
- init_optimization_passes orders the passes
- passes represented by a tree_opt_pass struct (tree-pass.h), even though it does RTL now too
  - "gate" function: whether or not to run optimization
  - "execute" function: implementation of pass
  - property bitmaps: properties required, destroyed, and created
  - "todo" bitmaps: run internal GC, dump the tree, verify SSA form, etc.

55 GCC Middle-End: Passes and subpasses
- Passes can be used to group subpasses
- all_passes contains all_optimization_passes
  - all_optimization_passes has optimizations in order
- pass_tree_loop contains loop optimizations
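
The gate/execute arrangement and the grouping of subpasses can be modeled with a small pass list. This is a simplified sketch, not the tree_opt_pass struct from tree-pass.h: all toy_ names, gate_if_optimizing and count_run are hypothetical, and the property and "todo" bitmaps are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy pass descriptor: a gate, a body, optional subpasses, and a link
   to the next pass at the same level. */
struct toy_pass {
    const char *name;
    int (*gate)(void);      /* NULL means "always run" */
    void (*execute)(void);  /* NULL for a pass that only groups subpasses */
    struct toy_pass *sub;
    struct toy_pass *next;
};

static int toy_optimize = 1;  /* stands in for an optimization level check */
static int toy_runs = 0;      /* counts executed pass bodies */

static int gate_if_optimizing(void) { return toy_optimize; }
static void count_run(void) { toy_runs++; }

/* Run a list of passes in order. A pass whose gate says no is skipped
   together with all of its subpasses. */
static void toy_execute_pass_list(struct toy_pass *p)
{
    for (; p != NULL; p = p->next) {
        /* A pass either does work itself or groups subpasses. */
        assert(p->execute != NULL || p->sub != NULL);
        if (p->gate != NULL && !p->gate())
            continue;
        if (p->execute != NULL)
            p->execute();
        toy_execute_pass_list(p->sub);
    }
}
```

A grouping pass gated by gate_if_optimizing with two subpasses runs both bodies when toy_optimize is nonzero and neither when it is zero, which mirrors how a container such as pass_tree_loop controls the loop optimizations beneath it.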

56 In Greater Depth: Adding a tree optimization pass

57 GCC Middle-End: Debugging middle-end tree passes
Command-line options for dumping trees:
    -fdump-tree-X           output after pass X
    -fdump-tree-original    output initial tree (before all opts)
    -fdump-tree-optimized   output final GIMPLE (after all opts)
    -fdump-tree-gimple      dump before & after gimplification
    -fdump-tree-inlined     output after function inlining
    -fdump-tree-all         output after each pass
(Make sure you specify an -O level or you might not get anything.)
Passes available for dumping in GCC 4.1.1 (see info page): cfg, vcg, ch, ssa, salias, alias, ccp, storeccp, pre, fre, copyprop, store_copyprop, dce, mudflap, sra, sink, dom, dse, phiopt, forwprop, copyrename, nrv, vect, vrp

58 GCC Middle-End: Debugging middle-end tree passes
Can specify options for tree dumps:
    address   print address of each tree node
    slim      less output; don't dump all scope bodies
    raw       raw tree output (rather than pretty-printed C-like trees)
    details   detailed output (not supported by all passes)
    stats     statistics (not supported by all passes)
    blocks    basic block boundaries
    vops      output virtual operands for each statement
    lineno    output line #s
    uid       output decl's unique ID along with each variable
    all       all except raw, slim, and lineno
e.g.
    -fdump-tree-dse-details   detailed post-DSE output
    -fdump-tree-all-all       (almost) everything

59 GCC Middle-End: Static Single-Assignment (SSA) form
(Pure) functional languages have nice properties for optimization
- single-assignment: one assignment to each variable
- static single-assignment: next best thing
  - each variable assigned at one static location in the program
- makes it clearer where data is produced
  - reduces complexity of many optimization algorithms
  - removes association of variable uses over its lifetime
Cytron et al. Efficiently computing static single assignment form and the control dependence graph. ACM TOPLAS, October 1991.

60 GCC Middle-End: SSA renaming (1)
Step: model control flow
    y = 10;
    /* compute 2^y */
    x = 1;
    while (y > 0) {
        x = x * 2;
        y = y - 1;
    }
[Control-flow graph: the block "y = 10; x = 1" flows into the test "y > 0 ?". On true, control goes to the block "x = x * 2; y = y - 1", which loops back to the test. On false, control goes to EXIT.]

61 GCC Middle-End: SSA renaming (2)
Step: version all variables
    y1 = 10;
    /* compute 2^y */
    x1 = 1;
    while (y1 > 0) {
        x2 = x1 * 2;
        y2 = y1 - 1;
    }
[Control-flow graph: the block "y1 = 10; x1 = 1" flows into the test "y1 > 0 ?". On true, control goes to the block "x2 = x1 * 2; y2 = y1 - 1", which loops back to the test. On false, control goes to EXIT.]

62 GCC Middle-End: SSA renaming (3)
Step: insert "phi" nodes
    y1 = 10;
    /* compute 2^y */
    x1 = 1;
    while (true) {
        x3 = φ(x1, x2);
        y3 = φ(y1, y2);
        if (!(y3 > 0)) break;
        x2 = x3 * 2;
        y2 = y3 - 1;
    }
[Control-flow graph: the block "y1 = 10; x1 = 1" flows into the loop header "x3 = φ(x1, x2); y3 = φ(y1, y2)" followed by the test "y3 > 0 ?". On true, control goes to the block "x2 = x3 * 2; y2 = y3 - 1", which loops back to the header. On false, control goes to EXIT.]
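
A φ node has no direct C equivalent, but its meaning (pick the value that arrived along the edge just taken) can be simulated. The sketch below is an illustration, not compiler output: pow2_plain, pow2_ssa and the from_back_edge flag are names invented here, and the loop exits once y is no longer positive, as in the original source loop.

```c
#include <assert.h>

/* The source loop: computes 2^10. */
static int pow2_plain(void)
{
    int x, y;
    y = 10;
    x = 1;
    while (y > 0) {
        x = x * 2;
        y = y - 1;
    }
    return x;
}

/* The same computation with one assignment site per versioned variable.
   Each phi is modeled by remembering whether the loop header was entered
   from the initial block or from the loop body (the back edge). */
static int pow2_ssa(void)
{
    int y1 = 10;
    int x1 = 1;
    int x2 = 0, y2 = 0;      /* defined in the loop body before any use */
    int x3, y3;
    int from_back_edge = 0;  /* bookkeeping for the simulation only */

    for (;;) {
        x3 = from_back_edge ? x2 : x1;  /* x3 = phi(x1, x2) */
        y3 = from_back_edge ? y2 : y1;  /* y3 = phi(y1, y2) */
        if (!(y3 > 0))
            break;
        x2 = x3 * 2;
        y2 = y3 - 1;
        from_back_edge = 1;
    }
    return x3;
}

/* The renamed program must compute the same result as the original. */
static void check_ssa_equivalence(void)
{
    assert(pow2_plain() == pow2_ssa());
}
```

On the first trip through the header the φ selects x1 and y1; on every later trip it selects x2 and y2, so x3 and y3 always hold the current values of the original x and y.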

63 GCC Middle-End: Into and out of SSA form in GCC
[Diagram: pass_build_ssa → SSA optimizations → pass_del_ssa]
    pass_build_ssa   gcc/tree-into-ssa.c
    pass_del_ssa     gcc/tree-outof-ssa.c

64 GCC Middle-End: Dealing with SSA form in GCC
Given a tree node n with code = PHI_NODE
    PHI_RESULT(n)         get lhs of φ
    PHI_NUM_ARGS(n)       get rhs count
    PHI_ARG_DEF(n, i)     get ssa-name
    PHI_ARG_EDGE(n, i)    get edge
    PHI_ARG_ELT(n, i)     tuple (ssa-name, edge)
Given a tree node n with code = SSA_NAME
    SSA_NAME_DEF_STMT(n)  get defining statement
    SSA_NAME_VERSION(n)   get SSA version #

65 GCC Middle-End: A few useful functions in the middle-end
- walk_use_def_chains(var, func, data)
  start at ssa-name var, calling func at each point up the chain; data is a generic pointer for use by func. See tree-ssa.c and internals docs (info gccint).
- walk_dominator_tree(dom-walk-data, basic-block)
  start at basic-block and walk children in dominator relationship; dom-walk-data provides several callbacks. See domwalk.h and internals docs (info gccint).

66 In Greater Depth: Implementing an optimization from start to finish

67 GCC Middle-End: RTL expansion and optimization
- Expansion performed by pass_expand (gcc/cfgexpand.c)
  - Back-end has a say in this
- As of GCC 4.1.x, RTL passes are carried out by same pass manager that works on trees
- pass_final (at end) outputs assembly

68 GCC Back-End
- Register allocation
- Instruction selection
- Debugger support
[Diagram: Front-end, Middle-end, Back-end]

69 GCC Back-End: Register allocation
RTL pseudo-registers → hard registers
Proceeds in several passes
1. Register class scan (preference registers)
2. Register allocation within basic blocks
3. Register allocation for remaining registers
4. Reload (renumbering, spilling)

70 GCC Back-End: Instruction selection
Machine description (.md) files for target CPU
- define_expand() matches standard names and generates RTL; assists in expansion of GIMPLE
- define_insn() matches RTL templates and generates assembly
Internals documentation has details

71 In Greater Depth: A machine description tour

72 GCC Back-End: Debugger support. Specifying -g to the compiler inserts debugging symbols in the assembly output. DWARF 2 format: embedded within ELF; a tree of debug info entries (compilation unit at the root), each with a linked list of attributes; DWARF 2 manual: ftp.freestandards.org/pub/dwarf/dwarf-2.0.0.pdf. Once assembled, “readelf -w” interprets them.

73 Runtime Issues: Object layout; Virtual method lookup; The Boehm garbage collector; crt stuff

74 GCC Runtime Issues: Simple object layout (C++). class A { public: int x; virtual void myMethod(); virtual void other(); }; class B : public A { public: int y; virtual void myMethod(); virtual void third(); }; Instances of A hold a vtable pointer and x; instances of B hold a vtable pointer, x, and y. The vtable for A lists A::myMethod and A::other; the vtable for B lists B::myMethod, A::other, and B::third.

75 GCC Runtime Issues: Simple object layout (C++), cont’d. Within an instance of B, the vtable pointer and x form subobject A of B, followed by y. Within the vtable for B, the entries B::myMethod and A::other form sub-vtable A of B, followed by B::third.

76 GCC Runtime Issues: Object layout (Java). Instances of B hold a vtable pointer, x (subobject A of B), and y. The vtable for B starts with the class B pointer and the GC descriptor, then sub-vtable Object of B (finalize, hashCode, equals, toString, clone), then myMethod and other (completing sub-vtable A of B), then third.

77 GCC Runtime Issues: But more complicated for C++! First, classes might not have virtual functions! class A { public: int x; void myMethod(); void other(); }; class B : public A { public: int y; void myMethod(); void third(); }; With no virtual functions there is no vtable pointer: instances of A hold only x; instances of B hold x (subobject A of B) and y.

78 GCC Runtime Issues: But more complicated for C++! Second, classes might have multiple bases! class A { public: int x; virtual void one(); }; class B { public: int y; virtual void two(); }; class C : public A, public B { public: int z; virtual void three(); }; Instances of A hold a vtable pointer and x, and the vtable for A lists A::one. Instances of B hold a vtable pointer and y, and the vtable for B lists B::two. How should instances of C and the vtable for C be laid out?

79 GCC Runtime Issues: Object layout for multiple bases. Instances of A and B are laid out as before. Instances of C hold a vtable pointer and x (subobject A of C), then a second vtable pointer and y (subobject B of C), then z. The vtable for C holds A::one, two as-yet-empty slots ahead of B::two, B::two, and C::three. Requires “this pointer-adjustment”.

80 GCC Runtime Issues: Multiple bases, cont’d. Instances of C are laid out as before. In the vtable for C, the first slot ahead of B::two now holds [ offset = -4 ], the adjustment for subobject B of C; the slot after it is still empty. But what about dynamic_cast?!

81 GCC Runtime Issues: Multiple bases, cont’d. Top-level offset: the vtable for C gains a pair of slots ahead of A::one, the first holding [ offset = 0 ], to match the [ offset = -4 ] slot ahead of B::two. The second slot of each pair is still empty.

82 GCC Runtime Issues: Multiple bases, finished*. But what about C++ type info?! The remaining empty slot in each pair holds a pointer to the typeinfo for C: the vtable for C is now [ offset = 0 ], ptr. typeinfo C, A::one and C::three for subobject A of C, then [ offset = -4 ], ptr. typeinfo C, B::two for subobject B of C. (* there are further complications, but we’ll leave it here)

83 GCC Runtime Issues: Java and C++ share object layout. The vtable for (Java) B: [ offset = 0 ], null typeinfo, class B pointer, GC descriptor, finalize, hashCode, equals, toString, clone, myMethod, other, third.

84 GCC Runtime Issues: Virtual method lookup (C++, Java). The vtable for (Java) B is as on the previous slide; an instance of B holds a vtable pointer, x, and y. Now, virtual method invocation is a snap! The compiler knows the method’s offset within the vtable, so it generates an indirect access through the instance pointer and invokes the method through the pointer found in the vtable.

85 GCC Runtime Issues: The Boehm garbage collector. A conservative mark & sweep garbage collector, designed to operate in a hostile environment as a drop-in replacement for malloc. “Conservative” means it cannot distinguish between pointers and non-pointers. Java is considerably less “hostile” than C/C++: a program can’t hide pointers from the compiler. Reference: Boehm, H., Space Efficient Conservative Garbage Collection. In ACM PLDI’91.

86 GCC Runtime Issues: Java and Boehm GC. The Java front-end generates class pointer masks, stows them in the vtable, and computes them in gcc/java/boehm.c. Class too big for a pointer mask? Use a count of reference fields, or use a “mark procedure”. Where to look: boehm-gc/doc contains docs; libjava/prims.cc contains GC-aware allocation routines. Diagram: the start of the vtable for B, showing [ offset = 0 ], null typeinfo, class B pointer, GC descriptor, finalize, …

87 GCC Runtime Issues: crt stuff (“C runtime”). crt1.o, crti.o, crtn.o* are provided by glibc: crt1.o sets up libc before main() is even invoked; crti.o is the prologue for .init and .fini; crtn.o is the epilogue for .init and .fini. crtbegin.o, crtend.o* are provided by GCC: crtbegin.o contributes the frame_dummy() call to .init and calls static data destructors in .fini; crtend.o calls static data constructors in .init; the code is in gcc/crtstuff.c. (* and some variations)

88 In Greater Depth: Language feature with runtime support

89 Wrap-up: Running GCC under GDB; Obtaining development versions of GCC; Reporting bugs in GCC; What’s next for GCC

90 Wrap-up: Running GCC under GDB. Inevitably, hacking a compiler will result in a segfault, an assertion failure, or incorrect code generation. Remember to attach the debugger to the compiler, not the driver: run “gcc -v …” to see the command line of the actual front-end, then use GDB on that front-end.

91 In Greater Depth: Debugging GCC

92 Wrap-up: Obtaining development versions of GCC. All GCC development is in the open: design discussions, change logs, bugs. Subversion (SVN) repository: public read access; for details see gcc.gnu.org/svn.html; clients are available from subversion.tigris.org/

93 Wrap-up: What to do if you find a bug in GCC. Check whether the bug is present in the SVN version. Check whether the bug is already in the bug database (http://gcc.gnu.org/bugzilla/). Collect version information (gcc --version). Guidelines: http://gcc.gnu.org/bugs.html Report it: http://gcc.gnu.org/bugzilla/

94 Thanks! Morgan Deters (mdeters@cs.wustl.edu) and Ron K. Cytron (cytron@cs.wustl.edu), Distributed Object Computing Laboratory, Washington University Dept. of Computer Science & Engineering, St. Louis, MO 63130 USA. Copyright © 2005–2006 Morgan Deters

