Operated by Los Alamos National Security, LLC for DOE/NNSA DC Reviewed by Kei Davis SKA – Static Kernel Analysis using LLVM IR Kartik Ramkrishnan and Ben.


1 SKA – Static Kernel Analysis using LLVM IR
Kartik Ramkrishnan and Ben Bergen
Applied Computer Science (CCS-7), Los Alamos National Laboratory
Operated by Los Alamos National Security, LLC for DOE/NNSA
DC Reviewed by Kei Davis

2 What is SKA
• SKA – Static Kernel Analyzer
• SKA is a very useful tool to improve the development process.
• Performs static, architecture-aware analysis of kernels.
• Outputs code metrics during the development process.
• Visualizes the code execution on the specified pipeline.
Operated by Los Alamos National Security, LLC for the U.S. Department of Energy's NNSA

3 SKA-Enhanced Development Cycle

4 Example kernel – saxpy.ll

    define i32 @main(i32 %argc, i8** nocapture %argv) nounwind uwtable readnone {
    entry:
      %a1 = alloca [32 x float], align 4
      %b2 = alloca [32 x float], align 4
      %c3 = alloca [32 x float], align 4
      br label %"3"

    "3":                                    ; preds = %"3", %entry
      %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %"3" ]
      %0 = getelementptr [32 x float]* %a1, i64 0, i64 %indvars.iv
      %1 = load float* %0, align 4
      %2 = getelementptr [32 x float]* %b2, i64 0, i64 %indvars.iv
      %3 = load float* %2, align 4
      %4 = getelementptr [32 x float]* %c3, i64 0, i64 %indvars.iv
      %5 = load float* %4, align 4
      %6 = fmul float %3, %5
      %7 = fadd float %1, %6
      store float %7, float* %4, align 4
      %indvars.iv.next = add i64 %indvars.iv, 1
      %lftr.wideiv = trunc i64 %indvars.iv.next to i32
      %exitcond = icmp eq i32 %lftr.wideiv, 32
      br i1 %exitcond, label %"5", label %"3"

    "5":                                    ; preds = %"3"
      ret i32 0
    }

5 Register allocation support for SKA
• LLVM IR is in SSA (static single assignment) form, which assumes an unlimited number of virtual registers.
• ISAs (instruction set architectures) have a limited number of registers.
• We improve SKA’s fidelity by allocating registers to the IR based on the target ISA.

6 Register Allocation algorithm
• Simple register allocation algorithm.

7 Build Liveness Tables

8 Build Liveness Tables
• SKA takes an LLVM IR module as input and builds a liveness table.
[Figure: partial liveness table for saxpy.ll]

9 Build Liveness Tables
[Code figure, annotated: top-level loop; single-BB liveness calculation; populate liveness table]
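The single-basic-block liveness step can be sketched in Python as a backward scan. This is a minimal illustration, not SKA's code: the `(defined, used)` tuple encoding and the function name `block_liveness` are assumptions made here, and the loop body is transcribed by hand from saxpy.ll. A full analysis would repeat this per block from a top-level loop until the live sets stop changing.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical encoding of one IR instruction: (value it defines, values it uses).
Instr = Tuple[Optional[str], List[str]]


def block_liveness(instrs: List[Instr], live_out: Set[str]) -> List[Set[str]]:
    """Return the set of values live immediately before each instruction."""
    live = set(live_out)
    table: List[Set[str]] = [set() for _ in instrs]
    # Walk the block backwards: a definition ends a live range, a use starts one.
    for i in range(len(instrs) - 1, -1, -1):
        defined, used = instrs[i]
        if defined is not None:
            live.discard(defined)
        live.update(used)
        table[i] = set(live)
    return table


# Floating-point part of the saxpy.ll loop body (address computations omitted).
loop_body: List[Instr] = [
    ("%1", ["%0"]),        # %1 = load float* %0
    ("%3", ["%2"]),        # %3 = load float* %2
    ("%5", ["%4"]),        # %5 = load float* %4
    ("%6", ["%3", "%5"]),  # %6 = fmul float %3, %5
    ("%7", ["%1", "%6"]),  # %7 = fadd float %1, %6
    (None, ["%7", "%4"]),  # store float %7, float* %4
]
live_in = block_liveness(loop_body, live_out=set())
for row, (defined, _) in zip(live_in, loop_body):
    print(defined, sorted(row))
```

Each row of `live_in` plays the role of one row of the liveness table: the values that must be held in registers at that point.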

10 Build Interference Graph

11 Build Interference Graph
• Traverse the liveness table to create the interference graph.
[Figure: partial igraph for saxpy.ll]

12 Build Interference Graph
[Code figure, annotated: top-level loop; populate igraph]
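One way to picture the traversal is the sketch below. It assumes a common simplification, namely that any two values appearing in the same liveness row interfere. The slides do not say which interference rule SKA uses, and the name `build_interference_graph` and the sample table are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set


def build_interference_graph(liveness_table: List[Set[str]]) -> Dict[str, Set[str]]:
    """Connect every pair of values that are live at the same program point."""
    graph: Dict[str, Set[str]] = {}
    for live in liveness_table:
        for value in live:
            graph.setdefault(value, set())
        for a, b in combinations(sorted(live), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hand-written liveness rows for the saxpy loop body (illustrative only).
liveness_table = [
    {"%0", "%2", "%4"},
    {"%1", "%2", "%4"},
    {"%1", "%3", "%4"},
    {"%1", "%3", "%4", "%5"},
    {"%1", "%4", "%6"},
    {"%4", "%7"},
]
igraph = build_interference_graph(liveness_table)
for node in sorted(igraph):
    print(node, "interferes with", sorted(igraph[node]))
```

The degree of a node is the number of other values that need a different register at some point, which is what the simplification step inspects.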

13 Simplify Interference Graph

14 Simplify Interference Graph
• Populate a stack that records whether each register (node) is simple or not.
[Figure: partial node stack for saxpy.ll]

15 Simplify Interference Graph
[Code figure, annotated: populate simple node stack]
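The stack-building step resembles the simplify phase of textbook graph-coloring allocators. The sketch below assumes that a node is "simple" when it has fewer than `k` remaining neighbors, where `k` is the number of available registers, and that the highest-degree node is chosen as a potential spill when no simple node is left. Both rules are assumptions, as are the function name and the toy graph.

```python
from typing import Dict, List, Set, Tuple


def simplify(graph: Dict[str, Set[str]], k: int) -> List[Tuple[str, bool]]:
    """Remove nodes one by one, recording (node, is_simple) on a stack."""
    work = {node: set(adj) for node, adj in graph.items()}  # private copy
    stack: List[Tuple[str, bool]] = []
    while work:
        simple = [n for n in sorted(work) if len(work[n]) < k]
        if simple:
            node, is_simple = simple[0], True
        else:
            # No simple node left: pick a potential spill candidate.
            node = max(sorted(work), key=lambda n: len(work[n]))
            is_simple = False
        stack.append((node, is_simple))
        for neighbor in work.pop(node):
            work[neighbor].discard(node)
    return stack


# Toy graph: a triangle a-b-c plus a node d attached to a.
graph = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
print(simplify(graph, k=2))
print(simplify(graph, k=3))
```

With two registers the triangle cannot be simplified, so one of its nodes is pushed as non-simple. With three registers every node is simple.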

16 Assign ISA Registers to IR

17 Assign ISA Registers to IR
• Assign ISA registers to IR values when there is no true spill.
• We choose between int, float and vector registers.
[Figure: partial register allocation for saxpy.ll]

18 Assign ISA Registers to IR
[Code figure, annotated: assign register if no true spill]
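A possible shape for the assignment step: pop nodes off the stack and give each the first register of its class that no already-colored neighbor holds. A node with no free register is a true spill, marked `None` here. The register names, the class map and the function name are invented for this sketch and do not come from SKA.

```python
from typing import Dict, List, Optional, Set, Tuple


def assign_registers(
    stack: List[Tuple[str, bool]],
    graph: Dict[str, Set[str]],
    reg_class: Dict[str, str],
    reg_files: Dict[str, List[str]],
) -> Dict[str, Optional[str]]:
    """Color nodes in reverse stack order; None marks a true spill."""
    assignment: Dict[str, Optional[str]] = {}
    for node, _was_simple in reversed(stack):
        taken = {
            assignment[nb] for nb in graph[node] if assignment.get(nb) is not None
        }
        free = [r for r in reg_files[reg_class[node]] if r not in taken]
        assignment[node] = free[0] if free else None
    return assignment


# Toy inputs: four int values (triangle a-b-c plus d) and one float value f.
graph = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}, "f": set()}
stack = [("f", True), ("d", True), ("a", False), ("b", True), ("c", True)]
reg_class = {"a": "int", "b": "int", "c": "int", "d": "int", "f": "float"}
reg_files = {"int": ["r0", "r1"], "float": ["f0", "f1"], "vector": ["v0", "v1"]}
allocation = assign_registers(stack, graph, reg_class, reg_files)
print(allocation)
```

Here `a` was pushed as non-simple and does turn out to be a true spill, since both int registers are held by its neighbors when it is popped.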

19 Rewrite IR

20 Rewrite IR
• The live range of %a1 is shown in red. It shrinks after the IR is rewritten.

21 Rewrite IR
[Code figure, annotated: store instruction into stack; load, use and store]
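The rewrite for a spilled value can be pictured as follows: store it to a stack slot right after its definition and reload it into a fresh temporary before each use, so that no single value stays live across the whole region. This is a sketch under assumptions. The tuple encoding, the `.slot` and `.r<N>` naming, and the function name are hypothetical, and SKA's actual rewrite may differ.

```python
from typing import List, Optional, Tuple

# Hypothetical instruction encoding: (defined value, opcode, operands).
Instr = Tuple[Optional[str], str, List[str]]


def rewrite_for_spill(instrs: List[Instr], spilled: str) -> List[Instr]:
    """Insert a store after the definition and a reload before every use."""
    slot = spilled + ".slot"  # made-up name for the stack slot
    out: List[Instr] = []
    reloads = 0
    for defined, opcode, operands in instrs:
        if spilled in operands:
            reloads += 1
            tmp = f"{spilled}.r{reloads}"
            out.append((tmp, "load", [slot]))
            operands = [tmp if op == spilled else op for op in operands]
        out.append((defined, opcode, operands))
        if defined == spilled:
            out.append((None, "store", [spilled, slot]))
    return out


code: List[Instr] = [
    ("%x", "add", ["%p", "%q"]),
    ("%y", "mul", ["%p", "%p"]),
    ("%z", "add", ["%x", "%y"]),
]
for instr in rewrite_for_spill(code, "%x"):
    print(instr)
```

After the rewrite, `%x` is live only between its definition and the store, and each reload temporary is live only up to the instruction that uses it.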

22 Register allocation done!

23 Virtual architecture specification
• Specified in an XML file.
• Specifies logical units, the instructions they process, latencies, issue width …
[Figure: partial architecture example]
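To make the idea concrete, here is an invented XML layout and a small Python reader for it. The element and attribute names (`architecture`, `unit`, `instructions`, `latency`, `issue_width`) and all numbers are assumptions for illustration only. The slides do not show SKA's real schema in text form.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

# Hypothetical specification: NOT SKA's actual file format.
SPEC = """
<architecture name="toy" issue_width="2">
  <unit name="alu" instructions="add icmp trunc" latency="1"/>
  <unit name="fpu" instructions="fadd fmul" latency="4"/>
  <unit name="lsu" instructions="load store getelementptr" latency="3"/>
</architecture>
"""


def parse_spec(text: str) -> Tuple[int, Dict[str, str], Dict[str, int]]:
    """Return (issue width, opcode -> unit name, opcode -> latency)."""
    root = ET.fromstring(text.strip())
    unit_of: Dict[str, str] = {}
    latency: Dict[str, int] = {}
    for unit in root.findall("unit"):
        for opcode in unit.get("instructions", "").split():
            unit_of[opcode] = unit.get("name", "")
            latency[opcode] = int(unit.get("latency", "1"))
    return int(root.get("issue_width", "1")), unit_of, latency


issue_width, unit_of, latency = parse_spec(SPEC)
print(issue_width, unit_of["fmul"], latency["load"])
```

Keeping the machine description in data rather than code is what lets the same analysis run against several target architectures.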

24 Pipeline simulation
[Figure: pipeline simulation of saxpy.ll]
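As a rough illustration of static pipeline simulation, the sketch below models a single-issue, in-order machine: each instruction issues once the previous one has issued and all of its operands are ready, and its result becomes available after a fixed latency. The pipeline model, the latencies and the function name are assumptions. The slides do not describe SKA's model at this level of detail.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical instruction encoding: (defined value, opcode, operands).
Instr = Tuple[Optional[str], str, List[str]]


def simulate_in_order(
    instrs: List[Instr], latency: Dict[str, int]
) -> Tuple[List[Tuple[str, int, int]], int]:
    """Return ((opcode, issue cycle, done cycle) per instruction, total cycles)."""
    ready: Dict[str, int] = {}  # value -> cycle at which it becomes available
    schedule: List[Tuple[str, int, int]] = []
    next_issue = 0
    for defined, opcode, operands in instrs:
        # Values defined outside this block are treated as ready at cycle 0.
        issue = max([next_issue] + [ready.get(op, 0) for op in operands])
        done = issue + latency[opcode]
        if defined is not None:
            ready[defined] = done
        schedule.append((opcode, issue, done))
        next_issue = issue + 1
    total = max((done for _, _, done in schedule), default=0)
    return schedule, total


loop_body: List[Instr] = [
    ("%1", "load", ["%0"]),
    ("%3", "load", ["%2"]),
    ("%5", "load", ["%4"]),
    ("%6", "fmul", ["%3", "%5"]),
    ("%7", "fadd", ["%1", "%6"]),
    (None, "store", ["%7", "%4"]),
]
toy_latency = {"load": 3, "fmul": 4, "fadd": 4, "store": 3}  # made-up values
schedule, total_cycles = simulate_in_order(loop_body, toy_latency)
for row in schedule:
    print(row)
print("total cycles:", total_cycles)
```

Gaps between consecutive issue cycles are the stalls that a pipeline view of the kernel would make visible.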

25 Skaview
[Figure: graphical visualization of saxpy.ll]

26 Code metrics
• SKA outputs useful metrics about the code.
• Primitive statistics include basic performance counters, such as instructions, cycles and stalls.
• Derived statistics are obtained from primitive statistics.
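The relationship between primitive and derived statistics can be shown with a few lines of Python. CPI (cycles per instruction) is the derived metric reported in the results slides. IPC and stall fraction are added here only as further examples, and the class name, function name and counter values are all made up.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PrimitiveStats:
    instructions: int
    cycles: int
    stalls: int


def derived_stats(p: PrimitiveStats) -> Dict[str, float]:
    """Compute ratios from the raw counters."""
    return {
        "cpi": p.cycles / p.instructions,   # cycles per instruction
        "ipc": p.instructions / p.cycles,   # instructions per cycle
        "stall_fraction": p.stalls / p.cycles,
    }


# Illustrative counters, not measured SKA output.
stats = PrimitiveStats(instructions=6, cycles=16, stalls=8)
metrics = derived_stats(stats)
print(metrics)
```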

27 Results for residual.ll
• CPI prediction is better after register allocation.

28 Results for ef_operator.ll
• No change in CPI prediction. Why?

29 Results for KNC (Knights Corner)
• Predicts CPI > 1.0 for KNC for single-threaded workloads.

30 Conclusion
• SKA now supports register allocation.
• Register allocation improves SKA’s fidelity by 5-10% across three architectures for a compute-intensive benchmark.
• Dynamic scheduling and cache models can further improve SKA fidelity.

31 Thank You!
• Questions?

