Published by Frances Wiglesworth. Modified about 1 year ago.

1
A Dictionary Construction Technique for Code Compression Systems with Echo Instructions
Philip Brisk, Jamie Macbeth, Ani Nahapetian, Majid Sarrafzadeh
Embedded and Reconfigurable Systems Lab, Computer Science Department, University of California, Los Angeles
{philip, macbeth, ani, majid}@cs.ucla.edu
LCTES ’05. June 16, 2005. Chicago, IL

2
Outline
- Introduction: Code Compression
- Dictionary Compression
- Dictionary Construction
- Overview of the Algorithm
- Experimental Methodology and Results
- Summary

3
Introduction: Code Compression For Embedded Systems
Why Reduce Program Size?
- Reduces Memory Requirements
- Silicon Cost of Program Storage in on-chip ROMs
- As Embedded Systems Become More Complex, Ever-More Functionality Will Migrate to Software
Costs of Runtime Decompression
- Performance Overhead
- Area of the Decoder Circuitry

4
Dictionary Compression
1. Find Repeated Code Sequences
2. Place Each Sequence Into a Dictionary
3. Replace Each Sequence in the Program with a Codeword that Accesses the Dictionary
[Figure: Program and Dictionary]
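
The three steps of dictionary compression can be illustrated with a deliberately naive sketch. It assumes fixed-length sequences and exact textual matches; the function names, the `("CODEWORD", k)` encoding, and the sample program are hypothetical and are not taken from the presentation.

```python
from collections import Counter

def compress(program, length):
    """Replace repeated windows of `length` instructions with codewords."""
    # Step 1: find repeated sequences by counting every window.
    windows = Counter(tuple(program[i:i + length])
                      for i in range(len(program) - length + 1))
    # Step 2: place each repeated sequence into the dictionary.
    dictionary = [seq for seq, n in windows.items() if n > 1]
    index = {seq: k for k, seq in enumerate(dictionary)}
    # Step 3: scan left to right and emit a codeword on each match.
    out, i = [], 0
    while i < len(program):
        seq = tuple(program[i:i + length])
        if seq in index:
            out.append(("CODEWORD", index[seq]))
            i += length
        else:
            out.append(program[i])
            i += 1
    return out, dictionary

def decompress(compressed, dictionary):
    """Expand every codeword back into its dictionary entry."""
    out = []
    for item in compressed:
        if isinstance(item, tuple) and item[0] == "CODEWORD":
            out.extend(dictionary[item[1]])
        else:
            out.append(item)
    return out

program = ["add", "mul", "sub", "ld", "add", "mul", "sub", "st"]
compressed, dictionary = compress(program, 3)
```

Here the eight-instruction program shrinks to four entries plus one three-instruction dictionary entry. Overlapping windows are counted naively, so this is only a toy model of the idea.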

5
CALD and Echo Instructions
CALD Instructions
- Place each sequence in a dictionary
- All Codewords Point to the Dictionary
Echo Instructions
- Leave one Instance of the Sequence Inline
- All Codewords Point to the Sequence
[Figure: Program with Dictionary (CALD); Program only (Echo)]
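
The difference from a separate dictionary can be sketched as follows: with echo instructions, the codeword refers to a span of the program itself. The `("ECHO", start, length)` encoding is a hypothetical illustration, and the sketch assumes that a referenced span contains no echo instructions of its own.

```python
def expand_echo(program):
    """Expand ("ECHO", start, length) entries into the instructions they reference."""
    out = []
    for item in program:
        if isinstance(item, tuple) and item[0] == "ECHO":
            _, start, length = item
            # The referenced sequence lives inline in the program itself,
            # so no separate dictionary is consulted.
            out.extend(program[start:start + length])
        else:
            out.append(item)
    return out

# One instance of the sequence stays inline; the second is an echo of it.
program = ["add", "mul", "sub", "ld", ("ECHO", 0, 3), "st"]
expanded = expand_echo(program)
```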

6
Compression Algorithms
The Traditional Approach: Compression Performed at Link Time
- Substring Matching [Fraser et al., 1984]
- + Register Renaming [Cooper and McIntosh, 1999] [Debray et al., 2000]
- + Instruction Rescheduling [De Sutter et al., 2002]
Our Approach is Somewhat Different…
- Identify Repeated Isomorphic Patterns that Occur within the Intermediate Representation PRIOR TO Register Allocation [Brisk et al., 2004]

7
Dictionary Construction
  A: R1 ← R2 + R3
  B: R4 ← R5 + R6
  C: R7 ← R1 + R4
- Sequence 1: A B C (DAG 1); Sequence 2: A C (DAG 2)
- DAG 2 is isomorphic to a subgraph of DAG 1
- 2 Schedules Exist for DAG 1: A B C and B A C
- Dictionary 1 (entries A B C and A C): 5 instructions
- Dictionary 2 (entry B A C, which contains A C): 3 instructions
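
The instruction counts 5 and 3 can be reproduced with a small sketch. It assumes that a schedule needs no entry of its own when it appears contiguously inside a longer entry; `dictionary_size` and `is_contiguous` are hypothetical helper names.

```python
def is_contiguous(short, long):
    """True if `short` appears as a contiguous run inside `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))

def dictionary_size(schedules):
    """Count stored instructions, sharing entries contained in a longer one."""
    kept = []
    for s in sorted(schedules, key=len, reverse=True):
        if not any(is_contiguous(s, k) for k in kept):
            kept.append(s)
    return sum(len(k) for k in kept)

size_abc = dictionary_size(["ABC", "AC"])  # DAG 1 scheduled as A B C
size_bac = dictionary_size(["BAC", "AC"])  # DAG 1 scheduled as B A C
```

Scheduling DAG 1 as B A C lets the A C pattern reuse the tail of the same entry, which is where the saving comes from.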

8
Isomorphic Pattern Generation
- Edge Contraction
  - Add an Operation to a Pattern
  - Combine 2 Patterns into a Larger One
- Build a Subgraph Hierarchy (SH)

9-19
[Animation frames repeating the Isomorphic Pattern Generation slide. The figure builds the Subgraph Hierarchy (SH) step by step, adding patterns T_1, T_2, T_3, and T_4.]

20
An SH Grammar
- The SH is also a DAG
- Generate a pattern T_k from sub-patterns T_i and T_j; Contract edge (T_i, T_j)
- Create a Production: T_k → T_i T_j
[Figure: SH with patterns T_1, T_2, T_3, T_4]
  T_2 → x T_1 x
  T_4 → T_3 T_2 x

21
Derivations and Scheduling
[Figure: a DAG over operations a, b, c, d, e, f, g, with sub-patterns G_1 through G_7 and two derivation trees]
Grammar
  G_1 → G_2 G_3
  G_2 → G_4 b
  G_3 → G_5 g
  G_4 → a c
  G_5 → G_6 f
  G_5 → G_7 e
  G_6 → d e
  G_7 → d f
Derivations
  a c b d e f g
  a c b d f e g
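
To see how each derivation of the grammar yields a schedule, the productions on this slide can be expanded mechanically. The sketch below is only an illustration; the dictionary encoding of the grammar and the `derive` function are assumptions, not part of the presentation.

```python
from itertools import product

# Productions from the slide; lowercase letters are operations (terminals).
grammar = {
    "G1": [["G2", "G3"]],
    "G2": [["G4", "b"]],
    "G3": [["G5", "g"]],
    "G4": [["a", "c"]],
    "G5": [["G6", "f"], ["G7", "e"]],
    "G6": [["d", "e"]],
    "G7": [["d", "f"]],
}

def derive(symbol):
    """Return every terminal string (schedule) derivable from `symbol`."""
    if symbol not in grammar:
        return [symbol]  # a terminal derives only itself
    results = []
    for rhs in grammar[symbol]:
        # Combine every choice for each right-hand-side symbol, in order.
        for parts in product(*(derive(s) for s in rhs)):
            results.append("".join(parts))
    return results

schedules = derive("G1")
```

The two results differ only in the production chosen for G5, matching the two derivations listed on the slide.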

22
Compatibility
- T_i, T_j: patterns; S_i, S_j: schedules for T_i, T_j
- Assume T_i is a Subgraph of T_j
- We want T_i and T_j to Share the Same Dictionary Entry
- Then S_i must be a Contiguous Subsequence of S_j
  A: R1 ← R2 + R3
  B: R4 ← R5 + R6
  C: R7 ← R1 + R4
[Figure: schedules A B C, A C, and B A C]
- AC is a Contiguous Subsequence of BAC but not ABC
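
The compatibility condition can be checked by brute force on this small example. The sketch assumes that C depends on A and B, as the register names suggest, and it enumerates schedules by trying every permutation, which is only practical for tiny DAGs; the function name is hypothetical.

```python
from itertools import permutations

def schedules(nodes, edges):
    """All orderings of `nodes` that respect every (pred, succ) edge."""
    result = []
    for order in permutations(nodes):
        pos = {n: i for i, n in enumerate(order)}
        if all(pos[u] < pos[v] for u, v in edges):
            result.append("".join(order))
    return result

dag1 = schedules("ABC", [("A", "C"), ("B", "C")])  # C reads R1 and R4
dag2 = schedules("AC", [("A", "C")])

# Keep the schedule pairs in which S_i is contiguous inside S_j.
compatible = [(s_j, s_i) for s_j in dag1 for s_i in dag2 if s_i in s_j]
```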

23
Convex Cuts in DAGs
- Let G = (V, E) be a DAG
- A Cut is a Partition of V
- A Convex Cut cannot have edges that cross the boundary of a cut in BOTH directions
- SH Construction Ensures Convex Cuts
[Figure panels: DAG; Non-Convex Cut; Convex Cut / Scheduling]
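
The convexity condition translates directly into a check on crossing edges. This is a minimal sketch for a two-way cut; the three-node DAG is a hypothetical example, not the one drawn on the slide.

```python
def is_convex_cut(edges, part):
    """True if edges cross between `part` and the rest in at most one direction."""
    part = set(part)
    outgoing = any(u in part and v not in part for u, v in edges)
    incoming = any(u not in part and v in part for u, v in edges)
    return not (outgoing and incoming)

# Hypothetical DAG: a -> b -> c, plus a shortcut a -> c.
edges = [("a", "b"), ("b", "c"), ("a", "c")]

convex = is_convex_cut(edges, {"a", "b"})      # edges only leave the part
non_convex = is_convex_cut(edges, {"a", "c"})  # a -> b leaves, b -> c re-enters
```

A non-convex part such as {a, c} cannot be scheduled as one contiguous block, because b must run between a and c.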

24
Convex Cuts and Compatibility
  G_1 → G_2 G_3
  G_1 → G_4 G_5
[Figure: DAG G_1 over operations a to g, cut as (G_2, G_3) and as (G_4, G_5). Panels: G_1→(2,3); G_1→(4,5); G_1→(2,3),(4,5), marked CYCLE!]

25
Generalized Compatibility
Given a Set of Productions with G_1 on the LHS…
  G_1 → G_2 G_3
  G_1 → G_4 G_5
  …
  G_1 → G_{2k} G_{2k+1}
How can we Tell if they are Compatible?
Three Criteria Equivalent to Compatibility
1. G_1→(2,3),(4,5),…,(2k,2k+1) is Acyclic
2. G_2 ⊂ G_4 ⊂ … ⊂ G_{2k}
3. G_{2k+1} ⊂ … ⊂ G_5 ⊂ G_3
The Pragmatic Question: If all Productions are NOT Compatible, what is the Largest Compatible Subset?
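
Criteria 2 and 3 can be sketched as a chain test, under the assumption that the relation between the listed patterns is subset inclusion over their operations. The sketch does not implement the acyclicity test of criterion 1; the function names and the example productions are hypothetical.

```python
def is_chain(sets):
    """True if the sets can be ordered so that each is contained in the next."""
    ordered = sorted(sets, key=len)
    return all(a <= b for a, b in zip(ordered, ordered[1:]))

def compatible(productions):
    """productions: list of (left, right) operation sets for the same G_1."""
    lefts = [frozenset(left) for left, _ in productions]
    rights = [frozenset(right) for _, right in productions]
    # Left-hand parts must nest one way, right-hand parts the other way.
    return is_chain(lefts) and is_chain(rights)

# Hypothetical productions over operations a, b, c, d.
nested = compatible([("a", "bcd"), ("ab", "cd")])
crossing = compatible([("ac", "bd"), ("ab", "cd")])
```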

26
The Subset/Subgraph View of Compatibility and Scheduling
[Figure: G_i ⊂ G_j, with remainder G_j − G_i; schedule S_i followed by schedule S_{j-i}]
1. Construct a Schedule S_i for G_i
2. Construct a Schedule S_{j-i} for G_{j-i}
3. Construct a Schedule S_j = S_i S_{j-i} for G_j
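
The three scheduling steps can be sketched with Python's standard `graphlib` module (Python 3.9 or later). The sketch assumes that no edge runs from the remainder back into G_i, so S_i can be placed first; the helper names and the four-node DAG are hypothetical.

```python
from graphlib import TopologicalSorter

def schedule(nodes, edges):
    """Return one topological order of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    preds = {n: set() for n in sorted(nodes)}
    for u, v in edges:
        if u in nodes and v in nodes:
            preds[v].add(u)  # v must come after u
    return list(TopologicalSorter(preds).static_order())

def schedule_with_prefix(g_j, g_i, edges):
    """Build S_j = S_i S_{j-i}, so S_i is a contiguous prefix of S_j."""
    g_i, rest = set(g_i), set(g_j) - set(g_i)
    # Assumption: no edge may run from the remainder back into G_i.
    if any(u in rest and v in g_i for u, v in edges):
        raise ValueError("G_i is not closed under predecessors in G_j")
    return schedule(g_i, edges) + schedule(rest, edges)

# Hypothetical DAG: a -> b -> d and c -> d, with G_i = {a, b}.
edges = [("a", "b"), ("b", "d"), ("c", "d")]
s_j = schedule_with_prefix("abcd", "ab", edges)
```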

27
A Production Compatibility Graph
- Represent the Subgraph Relation as a DAG called the Production Compatibility Graph (PCG)
- Productions G_1 → G_i … and G_1 → G_j … create vertices G_i and G_j
- Add an Edge (G_i, G_j) to the PCG if
  1. G_i ⊂ G_j
  2. There is no G_k such that G_j ⊃ G_k ⊃ G_i
- Any PATH in the PCG Corresponds to a Subset of Patterns that can be Scheduled Contiguously within a Dictionary entry for G_1.
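
The two edge conditions describe the covering relation of strict subset inclusion, which can be sketched directly. The sketch assumes each pattern is represented by its set of operations; `build_pcg` and the example patterns are hypothetical.

```python
def build_pcg(patterns):
    """patterns: dict mapping a name to a frozenset of operations.

    Returns PCG edges (i, j) where pattern i is strictly contained in
    pattern j and no third pattern lies strictly between them.
    """
    edges = []
    for i, gi in patterns.items():
        for j, gj in patterns.items():
            if gi < gj and not any(gi < gk < gj for gk in patterns.values()):
                edges.append((i, j))
    return edges

# Hypothetical left-hand patterns of four productions for the same G_1.
patterns = {
    "G2": frozenset("a"),
    "G4": frozenset("ab"),
    "G6": frozenset("abc"),
    "G8": frozenset("ac"),
}
pcg_edges = build_pcg(patterns)
```

No edge joins G2 to G6 directly, because G4 and G8 both lie between them.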

28
PCG Example
[Figure: DAG G_1 over operations a to g, cut five ways as (G_2, G_3), (G_4, G_5), (G_6, G_7), (G_8, G_9), and (G_10, G_11); the resulting PCG has vertices G_8, G_2, G_4, G_6, and G_10]

29
Algorithm Overview
- Recall that the Subgraph Hierarchy is a DAG
- Process SH Entries in Topological Order
  - All Sub-Patterns Processed Before Each Pattern
- Construct a PCG for each SH Entry
  - Assign Vertex Weights to Each Pattern based on the Number of Sub-Patterns in the Dictionary Entry
  - Find Max Vertex-Weighted Path in the PCG
- Determine the Maximum Gain Pattern in the SH
- Remove the Max Gain Pattern and all Sub-Patterns Selected for its Dictionary Entry
- Repeat until the SH is Empty
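
The path-finding step can be sketched as a standard dynamic program over a DAG in topological order, using the standard `graphlib` module (Python 3.9 or later). The weights and edges below are hypothetical, and the sketch covers only the maximum vertex-weighted path, not the gain computation or the removal loop.

```python
from graphlib import TopologicalSorter

def max_weight_path(weights, edges):
    """Return (total weight, path) of the heaviest vertex-weighted path in a DAG."""
    preds = {v: set() for v in weights}
    for u, v in edges:
        preds[v].add(u)
    best = {}  # vertex -> (total weight, best path ending at that vertex)
    for v in TopologicalSorter(preds).static_order():
        # Extend the heaviest path among predecessors, or start a new one.
        options = [best[u] for u in preds[v]]
        total, path = max(options, key=lambda t: t[0], default=(0, []))
        best[v] = (total + weights[v], path + [v])
    return max(best.values(), key=lambda t: t[0])

# Hypothetical PCG with made-up vertex weights.
weights = {"G2": 1, "G4": 2, "G8": 5, "G6": 1}
edges = [("G2", "G4"), ("G2", "G8"), ("G4", "G6"), ("G8", "G6")]
total, path = max_weight_path(weights, edges)
```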

30
Experimental Framework
Algorithm Built into the Machine SUIF Compiler
1. Consolidate Each Application using link_suif Pass
   - All Unrolled Loops Manually Re-rolled
2. Standard Front End Compilation Script
   - One Round of Constant Folding/DCE
3. Instruction Selection for Alpha Architecture
   - ARM Back End Recently Released…
4. Detect Recurring Isomorphic Patterns in the IR
   - Analysis described in [Brisk et al., 2004]
5. Dictionary Construction as Described Here

31
Experimental Methodology
- Cannot Compare with Substring Matching
  - Many Schedules Exist for Each DAG
  - Substring Matching Assumes Scheduled Code
  - How to Determine the Best Schedule for Each DAG?
  - Our Algorithm Determines a Schedule for the Entire Set of DAGs to Maximize Pattern Overlap
- Naïve Approach: Each Pattern Gets Its Own Dictionary Entry
- Our Approach: Isomorphism/Scheduling

32
Experimental Results
- Applications Taken from MediaBench [Lee et al., 1997]

33
Compilation Time

Benchmark    Total (sec)   Dictionary (sec)   Dictionary (%)
Epic         9.88          0.524              5.30%
G.721        2.71          0.196              7.23%
GSM          33.6          0.821              2.44%
JPEG         362           16.1               4.45%
MPEG2 Dec    32.3          1.31               4.06%
MPEG2 Enc    65.1          1.99               3.06%
Pegwit       32.6          1.10               3.37%
PGP          198           5.64               2.85%
PGP (RSA)    9.06          0.520              5.74%
Rasta        18.1          0.871              4.81%

34
Conclusion
- Algorithm Given for Dictionary Construction
  - What Is Built is Actually an Intermediate Representation of a Dictionary
- Combination of 3 Classically Hard Problems
  - Graph/Subgraph Isomorphism
  - Scheduling
  - Dictionary Construction/Compression
- Future Work: Register Allocation and Assignment
  - Make a Best Effort to Assign Registers So that Isomorphic Patterns have Identical Register Usage

35
References
1. Brisk, P., Nahapetian, A., and Sarrafzadeh, M. Instruction Selection for Compilers that Target Architectures with Echo Instructions. SCOPES, 2004.
2. Fraser, C. W., Myers, E., and Wendt, A. Analyzing and Compressing Assembly Code. Symposium on Compiler Construction, 1984.
3. Cooper, K. D., and McIntosh, N. Enhanced Code Compression for Embedded RISC Processors. PLDI, 1999.
4. De Sutter, B., De Bus, B., and De Bosschere, K. Sifting out the Mud: Low-Level C++ Code Reuse. OOPSLA, 2002.
5. Debray, S., Evans, W., Muth, R., and De Sutter, B. Compiler Techniques for Code Compaction. TOPLAS, 2000.
6. Lee, C., Potkonjak, M., and Mangione-Smith, W. H. MediaBench: A Tool for Evaluating and Synthesizing Multimedia and Communications Systems. MICRO-30, 1997.

36
Questions?
