1 LLVM IR, File - Praakrit Pradhan

2 Overview The LLVM bitcode file format has essentially two parts: a bitstream container format, and an encoding of LLVM IR into that container.

3 Bitstream format The bitstream format is an abstract encoding of structured data, similar to XML in some ways. Like XML, bitstream files contain tags and nested structures, and you can parse the file without having to understand the tags. Unlike XML, the bitstream format is a binary encoding, and it provides a mechanism for the file to self-describe "abbreviations", which are effectively size optimizations for the content.
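To make "binary encoding" a little more concrete, here is a minimal sketch of the variable-width integer (VBR) scheme that the LLVM bitstream documentation describes as one of its primitives. The slide itself does not cover VBR, and the function names `encode_vbr` and `decode_vbr` are hypothetical helpers, not LLVM APIs. Each chunk carries `width - 1` payload bits, and its top bit says whether another chunk follows.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Split `value` into vbr-<width> chunks, least significant chunk first.
// Each chunk holds (width - 1) payload bits; the top bit of a chunk is set
// when at least one more chunk follows. (Hypothetical helper, not LLVM API.)
std::vector<std::uint32_t> encode_vbr(std::uint64_t value, unsigned width) {
    assert(width >= 2 && width <= 32);
    const std::uint64_t payload_mask = (std::uint64_t{1} << (width - 1)) - 1;
    const std::uint32_t continue_bit = std::uint32_t{1} << (width - 1);
    std::vector<std::uint32_t> chunks;
    do {
        std::uint32_t chunk = static_cast<std::uint32_t>(value & payload_mask);
        value >>= (width - 1);
        if (value != 0) chunk |= continue_bit;  // more chunks to come
        chunks.push_back(chunk);
    } while (value != 0);
    return chunks;
}

// Reassemble the value from chunks produced by encode_vbr.
std::uint64_t decode_vbr(const std::vector<std::uint32_t>& chunks, unsigned width) {
    assert(width >= 2 && width <= 32);
    const std::uint32_t payload_mask = (std::uint32_t{1} << (width - 1)) - 1;
    std::uint64_t value = 0;
    unsigned shift = 0;
    for (std::uint32_t chunk : chunks) {
        value |= static_cast<std::uint64_t>(chunk & payload_mask) << shift;
        shift += width - 1;
    }
    return value;
}
```

With a width of 4, the value 27 becomes the two chunks 1011 and 0011: small values stay small on disk, which is the same spirit as the abbreviation mechanism.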

4 LLVM IR file LLVM IR files may optionally be embedded into a wrapper structure or in a native object file. Both of these mechanisms make it easy to embed extra data along with LLVM IR files.
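A rough sketch of how a tool might tell a raw bitcode file from one that uses the wrapper structure, by looking at the first four bytes. The magic values are the ones given in the LLVM bitcode format documentation ('B' 'C' 0xC0 0xDE for raw bitcode, the little-endian word 0x0B17C0DE for the wrapper); the names `BitcodeKind`, `read_le32` and `classify_bitcode` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical classification result; not an LLVM type.
enum class BitcodeKind { NotBitcode, Raw, Wrapped };

// Read a 32-bit little-endian value starting at `p`.
static std::uint32_t read_le32(const unsigned char* p) {
    return std::uint32_t{p[0]} | (std::uint32_t{p[1]} << 8) |
           (std::uint32_t{p[2]} << 16) | (std::uint32_t{p[3]} << 24);
}

// Inspect only the leading magic number; nothing else is validated.
BitcodeKind classify_bitcode(const unsigned char* data, std::size_t size) {
    if (size < 4) return BitcodeKind::NotBitcode;
    // Raw bitcode starts with the bytes 'B' 'C' 0xC0 0xDE.
    if (data[0] == 'B' && data[1] == 'C' && data[2] == 0xC0 && data[3] == 0xDE)
        return BitcodeKind::Raw;
    // A wrapper header starts with the little-endian word 0x0B17C0DE.
    if (read_le32(data) == 0x0B17C0DEu) return BitcodeKind::Wrapped;
    return BitcodeKind::NotBitcode;
}
```

Bitcode embedded in a native object file is not covered here; that case needs an object-file parser to locate the section first.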

5 LLVM IR Encoding LLVM IR is encoded into a bitstream by defining blocks and records. It uses blocks for things like constant pools, functions, symbol tables, etc. It uses records for things like instructions, global variable descriptors, type descriptions, etc.

6 LLVM IR is defined with the following blocks:
ID 8, MODULE_BLOCK: This is the top-level block that contains the entire module, and describes a variety of per-module information.
ID 9, PARAMATTR_BLOCK: This enumerates the parameter attributes.
ID 10, TYPE_BLOCK: This describes all of the types in the module.
ID 11, CONSTANTS_BLOCK: This describes constants for a module or function.
ID 12, FUNCTION_BLOCK: This describes a function body.
ID 13, TYPE_SYMTAB_BLOCK: This describes the type symbol table.
ID 14, VALUE_SYMTAB_BLOCK: This describes a value symbol table.
ID 15, METADATA_BLOCK: This describes metadata items.
ID 16, METADATA_ATTACHMENT: This contains records associating metadata with function instruction values.
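The block list above maps naturally onto an enumeration. The sketch below uses exactly the IDs from the slide; the enum `BlockID` and the function `block_name` are illustrative names, and other LLVM releases may number or name these blocks differently.

```cpp
#include <cassert>
#include <string>

// Block IDs as listed on the slide (illustrative enum, not LLVM's own header).
enum BlockID : unsigned {
    MODULE_BLOCK = 8,
    PARAMATTR_BLOCK = 9,
    TYPE_BLOCK = 10,
    CONSTANTS_BLOCK = 11,
    FUNCTION_BLOCK = 12,
    TYPE_SYMTAB_BLOCK = 13,
    VALUE_SYMTAB_BLOCK = 14,
    METADATA_BLOCK = 15,
    METADATA_ATTACHMENT = 16
};

// Map a numeric block ID to its name; IDs not in the list yield "UNKNOWN".
std::string block_name(unsigned id) {
    switch (id) {
        case MODULE_BLOCK:        return "MODULE_BLOCK";
        case PARAMATTR_BLOCK:     return "PARAMATTR_BLOCK";
        case TYPE_BLOCK:          return "TYPE_BLOCK";
        case CONSTANTS_BLOCK:     return "CONSTANTS_BLOCK";
        case FUNCTION_BLOCK:      return "FUNCTION_BLOCK";
        case TYPE_SYMTAB_BLOCK:   return "TYPE_SYMTAB_BLOCK";
        case VALUE_SYMTAB_BLOCK:  return "VALUE_SYMTAB_BLOCK";
        case METADATA_BLOCK:      return "METADATA_BLOCK";
        case METADATA_ATTACHMENT: return "METADATA_ATTACHMENT";
        default:                  return "UNKNOWN";
    }
}
```

A reader that walks the bitstream can skip any block whose ID it does not recognise, which is what "parse the file without having to understand the tags" amounts to in practice.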

7 To put it visually Is IR better than assembly? Possibly.

8 The stages:
Frontend: parses the original language and emits LLVM Intermediate Representation (IR) code.
Optimizer: transforms the IR into an optimized, equivalent IR. This stage performs the usual optimizations, such as constant propagation and dead code removal.
Backend: takes IR and produces machine code optimized for a specific CPU.
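To give a feel for what the optimizer stage does, here is a toy constant-folding pass over a tiny expression tree. Constant folding is a simpler relative of the constant propagation mentioned above, and this is not how LLVM implements it; the types `Expr` and `ExprPtr` and the helper functions are invented for the sketch.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// A toy expression node: a constant, an unknown variable, or a binary op.
struct Expr {
    enum Kind { Const, Var, Add, Mul } kind;
    int value = 0;                   // meaningful only when kind == Const
    std::unique_ptr<Expr> lhs, rhs;  // meaningful only for Add and Mul
    explicit Expr(Kind k) : kind(k) {}
};
using ExprPtr = std::unique_ptr<Expr>;

ExprPtr make_const(int v) {
    ExprPtr e(new Expr(Expr::Const));
    e->value = v;
    return e;
}

ExprPtr make_var() { return ExprPtr(new Expr(Expr::Var)); }

ExprPtr make_binary(Expr::Kind k, ExprPtr l, ExprPtr r) {
    ExprPtr e(new Expr(k));
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// Fold bottom-up: an operation whose operands are both constants is
// replaced by a single constant; everything else is left in place.
ExprPtr fold(ExprPtr e) {
    if (e->kind == Expr::Const || e->kind == Expr::Var) return e;
    e->lhs = fold(std::move(e->lhs));
    e->rhs = fold(std::move(e->rhs));
    if (e->lhs->kind == Expr::Const && e->rhs->kind == Expr::Const) {
        const int l = e->lhs->value;
        const int r = e->rhs->value;
        return make_const(e->kind == Expr::Add ? l + r : l * r);
    }
    return e;
}
```

Folding `(2 * 3) + x` leaves an addition of the constant 6 and the variable: the input and output are both "IR", and the output is an equivalent but cheaper program.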

9 IR is the heart of LLVM The crucial part is the IR. It's a common language that sits between the high-level program and the low-level backend. IR can express high-level concepts, yet is specific enough that any backend can produce fast machine code from it.

10 Goals for LLVM IR
Easy to produce, understand and define
Language and target independent
One IR for analysis and optimization
Must be able to support aggressive IPO, loop opts, scalar opts
High- and low-level optimization
Optimize as early as possible

11 Hardware support Expectation?

12 Flowchart of Source code to LLVM IR

13 LLVM IR
In-memory compiler IR (intermediate representation)
Human-readable assembly language: LLVM IR (*.ll, *.s)
LLVM IR is in SSA form (Static Single Assignment form)
Each variable is assigned exactly once
Use-def chains are explicit and each contains a single element
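The "assigned exactly once" rule can be illustrated with a small renaming pass over straight-line code. This is only a sketch under strong assumptions: no branches (so no phi nodes are needed), operands that start with a digit are treated as literals, and the names `Assign` and `to_ssa` are hypothetical. Version 0 of a name stands for a value that comes from outside the snippet.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// One statement of the form: dest = lhs op rhs
struct Assign {
    std::string dest;
    std::string lhs;
    char op;
    std::string rhs;
};

// Rename variables so that every name is assigned exactly once.
// Straight-line code only; real SSA construction also inserts phi nodes.
std::vector<std::string> to_ssa(const std::vector<Assign>& code) {
    std::map<std::string, int> version;  // latest version number per variable
    auto use = [&](const std::string& name) -> std::string {
        // Operands that start with a digit are literals and stay as they are.
        if (!name.empty() && std::isdigit(static_cast<unsigned char>(name[0])))
            return name;
        return name + "." + std::to_string(version[name]);
    };
    std::vector<std::string> out;
    for (const Assign& a : code) {
        // Read the operands first, then create a fresh version of the target.
        const std::string l = use(a.lhs);
        const std::string r = use(a.rhs);
        const std::string d = a.dest + "." + std::to_string(++version[a.dest]);
        out.push_back(d + " = " + l + " " + a.op + " " + r);
    }
    return out;
}
```

Given `x = a + b; x = x * c; y = x + 1`, the second assignment to `x` gets a new name, so every use points at exactly one definition. That is what makes the use-def chains single-element.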

14 Global Variable & Array Representation

15 Function entry & Local Variables

16 Inner Most Loop

17 Let's try writing it... Let's consider a relatively straightforward function that takes three integer parameters and returns an arithmetic combination of them. This is nice and simple: And this is what we need to end up with:
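The code shown on the original slide did not survive in this transcript. A plausible reconstruction, assuming the function is the `mul_add` discussed on the later slides (a multiply, an add, and a return), is sketched below; the textual IR in the comment follows the shape used in the LLVM 2.6 tutorial listed in the references and may differ in detail from what the slide showed.

```cpp
#include <cassert>

// Source-level function: three integer parameters combined arithmetically.
// (Reconstruction; the slide's own listing is missing from the transcript.)
int mul_add(int x, int y, int z) {
    return x * y + z;
}

// Textual LLVM IR with the same meaning, roughly what we want to end up with:
//
//   define i32 @mul_add(i32 %x, i32 %y, i32 %z) {
//   entry:
//     %tmp = mul i32 %x, %y
//     %tmp2 = add i32 %tmp, %z
//     ret i32 %tmp2
//   }
```

Note how the IR already reflects SSA form: `%tmp` and `%tmp2` are each assigned once.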

18 Still trying... Here is what our basic main function will look like: The first segment is pretty simple: it creates an LLVM “module.” In LLVM, a module represents a single unit of code that is to be processed together. Here we’ve declared a makeLLVMModule() function to do the real work of creating the module. The second segment runs the LLVM module verifier on our newly created module. The verifier will print an error message if your LLVM module is malformed in any way. Finally, we instantiate an LLVM PassManager and run the PrintModulePass on our module.

19 Almost there... The first chunk of our module: all this does is instantiate a module and give it a name. This is our function: we pass in the name, return type and argument types of the function. In our case each of them is a 32-bit integer type. We set our calling convention to the C calling convention.

20 Functions and blocks... Let's also give names to the parameters. This isn't strictly necessary either (LLVM will generate names for them if you don't specify them). The IR, being an abstract assembly language, represents control flow using jumps (we call them branches), both conditional and unconditional. The straight-line sequences of code between branches are called basic blocks, or just blocks. So we need to create these blocks:

21 Such blockage We create a new basic block by calling its constructor. We need to tell it its name and the function to which it belongs. We also create an IRBuilder object, which is a convenience for creating instructions and appending them to the end of the block. Instructions can be created through their constructors as well, but the interfaces for that are complicated, so using IRBuilder makes life simpler (this is fine unless we need a lot more control).

22 And finally... Our mul_add function is composed of just three instructions: a multiply, an add, and a return. IRBuilder gives us a simple interface for constructing these instructions and appending them to the "entry" block. Each of the calls to IRBuilder returns a Value* that represents the value yielded by the instruction. You'll also notice that, above, x, y, and z are also Value*'s, so it's clear that instructions operate on Value*'s. All hail IRBuilders? Apparently the command lines above are helpful for compiling and running the code (never tried this, so not sure).
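Since the slide's listing is missing from the transcript, here is a toy model of the builder pattern it describes. It deliberately does not use the LLVM API: `ToyValue`, `ToyBlock`, `ToyBuilder` and `build_mul_add` are invented stand-ins that only mimic the idea that each "create" call appends one instruction to the current block and hands back a value that later instructions can consume.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for Value and BasicBlock; these are NOT LLVM classes.
struct ToyValue {
    std::string name;
};

struct ToyBlock {
    std::string label;
    std::vector<std::string> instructions;
};

// A tiny builder: every create* call appends one textual instruction to the
// end of the block and returns the value that the instruction yields.
class ToyBuilder {
public:
    explicit ToyBuilder(ToyBlock& block) : block_(block) {}

    ToyValue createMul(const ToyValue& a, const ToyValue& b, const std::string& name) {
        return emit("mul", a, b, name);
    }
    ToyValue createAdd(const ToyValue& a, const ToyValue& b, const std::string& name) {
        return emit("add", a, b, name);
    }
    void createRet(const ToyValue& v) {
        block_.instructions.push_back("ret i32 " + v.name);
    }

private:
    ToyValue emit(const char* op, const ToyValue& a, const ToyValue& b,
                  const std::string& name) {
        ToyValue result{"%" + name};
        block_.instructions.push_back(result.name + " = " + op + " i32 " +
                                      a.name + ", " + b.name);
        return result;
    }

    ToyBlock& block_;
};

// Build the body of mul_add: a multiply, an add, and a return.
ToyBlock build_mul_add() {
    ToyBlock entry{"entry", {}};
    ToyBuilder builder(entry);
    ToyValue x{"%x"}, y{"%y"}, z{"%z"};  // the function parameters are values too
    ToyValue tmp = builder.createMul(x, y, "tmp");
    ToyValue tmp2 = builder.createAdd(tmp, z, "tmp2");
    builder.createRet(tmp2);
    return entry;
}
```

The point of the design is visible in `build_mul_add`: parameters and instruction results have the same type, so the output of one call can be fed straight into the next.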

23 Just another quick example :

24 What is IR? IR is a low-level programming language, pretty similar to assembly, according to the AOSA book.

25 Conclusion
Low-level IR
SSA-based
Language-independent
Machine-independent
Allows libraries and portions written in different languages
And basically a better assembly language

26 Thank you

27 References/Links
http://pllab.cs.nthu.edu.tw/cs340402/lectures/lectures_2013/LLVM%20Bitcode%20Introduction.pdf
http://llvm.org/docs/GettingStarted.html
https://idea.popcount.org/2013-07-24-ir-is-better-than-assembly/
http://llvm.org/releases/2.6/docs/tutorial/JITTutorial1.html
http://stackoverflow.com/questions/19453440/confusion-about-extension-of-llvm-ir-file

28 LLVM References
LLVM official website: http://llvm.org/ and http://llvm.org/docs/GettingStarted.html
LLVM IR: http://llvm.org/docs/LangRef.html

29 Unused references/links
http://llvm.org/docs/doxygen/html/IRReader_8cpp_source.html
http://www.ibm.com/developerworks/library/os-createcompilerllvm1/
https://news.ycombinator.com/item?id=6096743

