Seok-Won Seong and Prabhat Mishra, University of Florida. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, April 2008, Vol. 27, No. 4. Rahul.

1 Seok-Won Seong and Prabhat Mishra, University of Florida
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, April 2008, Vol. 27, No. 4
Rahul Sridharan

2
• Motivation
• Background
  ◦ Code compression using bitmasks
• Challenges in bitmask-based approach
• Application-Aware Code Compression
  ◦ Mask Selection
  ◦ Bitmask-aware Dictionary Selection
  ◦ Code Compression Algorithm
• Results
• Conclusion

3
• Bitmask-based code compression
  ◦ Addresses memory constraints in embedded systems while improving power and performance
  ◦ Constrains code size
• Application-aware code compression algorithm
  ◦ Improves compression efficiency without introducing a decompression penalty

4 Background: Code Compression
• Static Encoding (Offline): Application Program (Binary) → Compression Algorithm → Compressed Code (Memory)
• Dynamic Decoding (Online): Compressed Code (Memory) → Decompression Engine → Processor (Fetch and Execute)

5
• Dictionary based
  ◦ Frequency-based dictionary selection
  ◦ Format for uncompressed code (32-bit code): Decision (1 bit) | Uncompressed Data (32 bits)
  ◦ Format for compressed code: Decision (1 bit) | Dictionary Index
• Hamming distance based
  ◦ Remembering mismatches
  ◦ Format for uncompressed code: Decision (1 bit) | Uncompressed Data (32 bits)
  ◦ Format for compressed code: Decision (1 bit) | # of Bit Changes | Location (5 bits) | … | Location (5 bits) | Dictionary Index
• Bit-mask based

6 Bitmask Encoding
• 32-bit instructions
• Format for uncompressed code: Decision (1 bit) | Uncompressed Data (32 bits)
• Format for compressed code: Decision (1 bit) | Number of Masks | Mask Type | Location | Mask Pattern | … | Mask Type | Location | Mask Pattern | Dictionary Index
  ◦ Mask Type: type of the mask, e.g., 2-bit, 4-bit, etc.
  ◦ Location: location to apply the bitmask
  ◦ Mask Pattern: actual mask pattern

7 Code Compression with Bitmasks

Original Program    Compressed Program
0000 0000           0 1 0
1000 0010           0 0 00 11 1
0000 0010           0 0 11 10 0
0100 0010           0 1 1
0100 1110           0 0 10 11 1
0101 0010           0 0 01 01 1
0000 1100           0 0 10 11 0
0100 0010           0 1 1
1100 0000           0 0 00 11 0
0000 0000           0 1 0

Dictionary
Index    Entry
0        0000 0000
1        0100 0010

Fields of a compressed entry, in order:
• First bit: 0 – Compressed, 1 – Not Compressed
• Second bit: 0 – Bit Mask Used, 1 – No Bit Mask Used
• Bit Mask Position and Bit Mask Value (present only when a bit mask is used)
• Dictionary index
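The toy example on slide 7 can be reproduced with a few lines of Python. This is only a sketch under assumptions: 8-bit vectors, a single 2-bit mask applied at aligned (fixed) positions, a two-entry dictionary searched in index order, and first and last vectors equal to 0000 0000 (as implied by their compressed form `0 1 0`). The function name `encode` is hypothetical and not taken from the paper.

```python
WIDTH = 8                   # vector width in the toy example
MASK_BITS = 2               # one 2-bit mask at fixed, aligned positions
SLOTS = WIDTH // MASK_BITS  # four possible mask positions (00..11)


def encode(word, dictionary):
    """Encode one vector as a space-separated bit-field string."""
    # Exact dictionary hit: compressed (0), no bit mask used (1), index.
    for index, entry in enumerate(dictionary):
        if word == entry:
            return f"0 1 {index}"
    # Single-mask hit: the XOR with an entry must fit in one 2-bit slot.
    for index, entry in enumerate(dictionary):
        diff = word ^ entry
        for pos in range(SLOTS):
            shift = WIDTH - MASK_BITS * (pos + 1)
            value = (diff >> shift) & (2 ** MASK_BITS - 1)
            if diff == value << shift:
                # compressed (0), bit mask used (0), position, value, index
                return f"0 0 {pos:02b} {value:02b} {index}"
    # No match: not compressed (1) followed by the raw vector.
    return f"1 {word:08b}"


program = [0b00000000, 0b10000010, 0b00000010, 0b01000010, 0b01001110,
           0b01010010, 0b00001100, 0b01000010, 0b11000000, 0b00000000]
dictionary = [0b00000000, 0b01000010]

compressed = [encode(word, dictionary) for word in program]
for word, code in zip(program, compressed):
    print(f"{word:08b} -> {code}")
```

Decoding reverses the steps: look up the dictionary entry, then XOR the mask value back in at the stored position.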

8 Challenges in Bitmask-based Compression
• Selection of appropriate mask pattern
  ◦ Larger bitmask generates more matches
    ▪ 4-bit mask can handle up to 16 mismatch patterns
    ▪ 8-bit mask can handle up to 256 mismatch patterns
  ◦ Larger bitmask incurs higher cost
    ▪ 4-bit mask costs 7 bits
    ▪ 8-bit mask costs 10 bits
• Efficient Dictionary Selection
  ◦ Frequency-based selection is not always optimal
• Need for efficient masking and dictionary selection schemes to improve compression efficiency
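The costs quoted on slide 8 are consistent with a mask needing a location field plus the mask pattern itself. The sketch below assumes a 32-bit vector and fixed masks aligned to multiples of their width. The sliding variant (any start position allowed) is an assumption added for comparison, and `single_mask_cost` is a hypothetical helper name.

```python
from math import ceil, log2

VECTOR_BITS = 32


def single_mask_cost(mask_bits, sliding=False):
    """Bits to store one mask: location field plus mask pattern."""
    if sliding:
        positions = VECTOR_BITS - mask_bits + 1  # any start offset (assumed)
    else:
        positions = VECTOR_BITS // mask_bits     # aligned slots only
    return ceil(log2(positions)) + mask_bits


for bits in (4, 8):
    print(f"{bits}-bit mask: {2 ** bits} patterns, "
          f"fixed cost {single_mask_cost(bits)} bits, "
          f"sliding cost {single_mask_cost(bits, sliding=True)} bits")
```

For a 4-bit fixed mask this gives 3 location bits plus 4 pattern bits, and for an 8-bit fixed mask 2 location bits plus 8 pattern bits, matching the 7 and 10 bits on the slide.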

9 Frequency vs. Spanning-based Dictionary Selection
• Frequency-based dictionary selection (DS): compression ratio (CR) = 97.5%
• Spanning-based DS: CR = 87.5%

10
• Bitmask Selection
• Bitmask-Aware Dictionary Selection
  ◦ NP-hard (nondeterministic polynomial-time hard) problem
• Code Compression Algorithm
  ◦ Based on the combination of the two approaches

11 Mask Selection
• How many bitmask patterns are needed?
• Which of them are profitable?
• Fixed and sliding bitmask patterns

Mask      Fixed    Sliding
1 bit              X
2 bits    X        X
3 bits             X
4 bits    X        X
5 bits             X
6 bits             X
7 bits             X
8 bits    X        X

Cost in bits, by number of bit changes (rows) and size of mask pattern (columns):

Bit changes    1 bit    2 bits    4 bits    8 bits    16 bits    32 bits
32 bits        165      100       59        42        35         32
16 bits        84       51        30        21        17
8 bits         43       26        15        10
4 bits         22       13        7
2 bits         11       6
1 bit          5
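The bit-change table on slide 11 appears to follow a simple pattern: the number of masks times the cost of one fixed mask, plus a field that counts the masks. This formula is inferred from the numbers and is not stated on the slides, so treat it as a sketch. It assumes a 32-bit vector, fixed masks, and no pattern bits for a 1-bit mask (a 1-bit change is fully described by its location). The names `mask_cost` and `total_cost` are hypothetical.

```python
VECTOR_BITS = 32
SIZES = [1, 2, 4, 8, 16, 32]


def log2_exact(n):
    """log2 of a power of two, using integer arithmetic only."""
    return n.bit_length() - 1


def mask_cost(mask_bits):
    """Location bits plus pattern bits for one fixed mask."""
    location = log2_exact(VECTOR_BITS // mask_bits)
    pattern = 0 if mask_bits == 1 else mask_bits  # inferred special case
    return location + pattern


def total_cost(bit_changes, mask_bits):
    """Bits to cover `bit_changes` changed bits with masks of one size."""
    count = bit_changes // mask_bits
    return count * mask_cost(mask_bits) + log2_exact(count)


# Rebuild the triangular table: rows are bit changes, columns mask sizes.
table = {changes: [total_cost(changes, m) for m in SIZES if m <= changes]
         for changes in reversed(SIZES)}

for changes, row in table.items():
    print(f"{changes:>2} bit changes:", row)
```

Under these assumptions every entry of the slide's table is reproduced, which suggests that larger masks pay off only when the changed bits are clustered.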

12 Mask Selection
• Bits needed to indicate a particular location
  ◦ Size of mask
  ◦ Type of mask
• Number of bitmask patterns needed
  ◦ Up to two mask patterns
    ▪ Minimum cost to store three bitmasks is 27-31 bits for a 32-bit vector
    ▪ Not very profitable
• Which combinations are profitable?
  ◦ Eleven possibilities
    ▪ 1s, 2s, 2f, 3s, 4s, 4f, 5s, 6s, 7s, 8s, 8f (s = sliding, f = fixed)
  ◦ Select one/two from eleven possibilities
    ▪ Number of combinations can be further reduced

13 Comparison of Bitmask Combinations
• Benchmarks are compiled for TI TMS320C6x
• (1s, 4f) and (2f, 2s) provide the best compression

14 Mask Selection: Observations
• Factors of 32 (1, 2, 4 and 8) produce better results
  ◦ Since they can be applied cost-effectively on fixed locations
• 8-bit fixed/sliding is not helpful
  ◦ Probability of more than 4 consecutive changes is low
  ◦ Two smaller masks perform better than a larger one
  ◦ 4-bit sliding does not perform better than 4-bit fixed
• Two bitmasks provide better results than a single one
• Choose two from four bitmasks: (1s, 2f, 2s, 4f)

Mask      Fixed    Sliding
1 bit              X
2 bits    X        X
4 bits    X

15 Dictionary Selection
• Categories: Dynamic, Static
• Frequency: select the most frequently occurring binary patterns
• Spanning: select patterns to ensure uniform coverage of all patterns based on Hamming distance
• Bit Savings: select patterns based on bit savings due to self and mask-matched repetitions
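Of the three methods, frequency-based selection is the simplest to sketch. The example below reuses the 8-bit vectors of the toy program on slide 7 (with the first and last vectors taken as 0000 0000) and picks the two most frequent ones. `frequency_dictionary` is a hypothetical helper name, not something from the paper.

```python
from collections import Counter


def frequency_dictionary(program, size):
    """Pick the `size` most frequent vectors; ties keep first-seen order."""
    return [word for word, _count in Counter(program).most_common(size)]


program = [0b00000000, 0b10000010, 0b00000010, 0b01000010, 0b01001110,
           0b01010010, 0b00001100, 0b01000010, 0b11000000, 0b00000000]

dictionary = frequency_dictionary(program, 2)
print([f"{word:08b}" for word in dictionary])
```

For this program the result is the same two-entry dictionary used on slide 7. Frequency alone ignores how many other vectors an entry could match through a bitmask, which is the gap the bit-savings method targets.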

17 BitSavings-based Dictionary Selection
Nodes (node weight): A(0), B(7), C(7), D(0), E(0), F(7), G(14)
Edge weights shown: 5, 10, 5, 5
Total weight per node (node weight + edge weights):
• A = 0 + 10 = 10
• B = 7 + 15 = 22
• C = 7 + 15 = 22
• D = 0 + 5 = 5
• E = 0 + 15 = 15
• F = 7 + 20 = 27
• G = 14 + 10 = 24
Legend:
• Node weight: number of bits saved due to frequency of the pattern
• Edge weight: number of bits saved due to use of the bitmask-based match
• Total weight: node weight + all edge weights (connected to the node)

18 BitSavings-based Dictionary Selection
Nodes (node weight): A(0), B(7), D(0), G(14)
Edge weights shown: 5, 5, 10
Total weight per node (node weight + edge weights):
• A = 0 + 10 = 10
• B = 7 + 15 = 22
• D = 0 + 5 = 5
• G = 14 + 10 = 24
Legend:
• Node weight: number of bits saved due to frequency of the pattern
• Edge weight: number of bits saved due to use of the bitmask-based match
• Total weight: node weight + all edge weights (connected to the node)
Continues until the dictionary is full or the graph is empty
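The selection loop illustrated on slides 17 and 18 can be sketched as a greedy pass over a weighted graph. Two things here are assumptions. First, the transcript does not preserve which nodes each edge connects, so the edge list below is just one assignment that reproduces the slide's totals. Second, the sketch removes the chosen node together with all of its neighbors, which matches C and E disappearing after F is picked, though the paper may apply a finer removal rule. Function names are hypothetical.

```python
def total_weights(nodes, edges):
    """Node weight plus the weights of all edges touching the node."""
    totals = dict(nodes)
    for (u, v), weight in edges.items():
        totals[u] += weight
        totals[v] += weight
    return totals


def select_dictionary(nodes, edges, size):
    """Greedily pick the highest-total node, then drop it and its neighbors."""
    nodes, edges = dict(nodes), dict(edges)
    chosen = []
    while nodes and len(chosen) < size:
        totals = total_weights(nodes, edges)
        best = max(totals, key=totals.get)
        chosen.append(best)
        # The chosen node and every node sharing an edge with it.
        removed = {best} | {n for edge in edges if best in edge for n in edge}
        nodes = {n: w for n, w in nodes.items() if n not in removed}
        edges = {e: w for e, w in edges.items() if not set(e) & removed}
    return chosen


# Node weights from slide 17; the edge layout is an assumed reconstruction.
nodes = {"A": 0, "B": 7, "C": 7, "D": 0, "E": 0, "F": 7, "G": 14}
edges = {("F", "C"): 10, ("F", "E"): 10, ("C", "E"): 5,
         ("A", "B"): 10, ("B", "G"): 5, ("D", "G"): 5}

print(total_weights(nodes, edges))
print(select_dictionary(nodes, edges, size=2))
```

With this layout F (total 27) is chosen first, and G (total 24) is chosen from the remaining nodes A, B, D and G, in line with the two slides.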

19 Application-Aware Code Compression

20 Experiments
• Experimental Setup
  ◦ Benchmarks: TI and MediaBench
  ◦ Architectures: SPARC, TI TMS320C6x, MIPS
• Results
  ◦ BCC: Bitmask-based code compression
    ▪ Customized encodings for different architectures
    ▪ Effects of dictionary size selection
    ▪ Comparison with existing techniques
  ◦ ACC: Application-aware code compression
    ▪ Bitmask selection
    ▪ Dictionary selection

21 Compression Ratio for adpcm_en
• Encoding 1 (one 8-bit mask)
• Encoding 2 (two 4-bit masks)
• Encoding 3 (4-bit and 8-bit masks)
• Encoding 2 outperforms the others

22 Comparison with Other Techniques
• The bitmask approach outperforms other dictionary-based techniques by 15%
• Higher decompression bandwidth than existing compression techniques
• Smaller compression ratio is better

23 Comparison of Dictionary Selection Methods
• BitSavings approach outperforms both frequency- and spanning-based techniques

24 Compression Ratio Comparison
• BCC (Bitmask-based Code Compression) generates a 15-20% improvement over other techniques
• ACC (Application-aware Code Compression) outperforms BCC by another 5-10%

25 ???

