LZMA. Speaker: Roman Gareev

LZMA (Lempel-Ziv-Markov chain algorithm); 7z (7-Zip); LZMA SDK 2 /31

LZMA: LZ77 (Sliding Window); Deflate: Zip and Gzip; Range Encoding 5 /31

Jacob Ziv, Abraham Lempel. LZ77 (Sliding Window) 6 /31

In computer science and information theory, data compression (also called source coding or bit-rate reduction) means encoding information using fewer bits than the original representation. Compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy, so no information is lost. Lossy compression reduces bits by identifying marginally important information and removing it.

Compression is useful because it reduces the consumption of resources such as storage space or transmission capacity. Compressed data must be decompressed before use, and this extra processing has a computational or other cost. For instance, a compression scheme for video may require expensive hardware to decompress the video fast enough for it to be viewed while it is being decompressed, and decompressing the whole video before watching it may be inconvenient or require additional storage.

The design of a data compression scheme involves trade-offs among several factors, including the degree of compression, the amount of distortion introduced (when lossy compression is used), and the computational resources required to compress and decompress the data. 7 /31

LZ77 example: encoded text, text for …, and the emitted token 8 /31

Encoded text: "… sir sid eastman easily t"      Text for …: "eases sea sick seals …"  ⇒ (16, 3, "e")
Encoded text: "… sir sid eastman easily tease"  Text for …: "s sea sick seals …"

Encoded text   Text for …            Token
""             "sir sid eastman "    ⇒ (0,0,"s")
"s"            "ir sid eastman e"    ⇒ (0,0,"i")
"si"           "r sid eastman ea"    ⇒ (0,0,"r")
"sir"          " sid eastman eas"    ⇒ (0,0," ")
"sir "         "sid eastman easi"    ⇒ (4,2,"d")
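The token sequence on this slide can be reproduced with a small greedy encoder. The following is a minimal sketch, not the LZMA SDK code: the function names `lz77_encode` and `lz77_decode` and the window and look-ahead sizes are assumptions chosen for illustration, and ties between equally long matches are broken in favor of the farthest one.

```python
def lz77_encode(text, window=32, lookahead=16):
    """Greedy LZ77 sketch: emit (offset, length, next_symbol) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_off, best_len = 0, 0
        start = max(0, pos - window)
        # Keep one symbol in reserve: every token ends with a literal.
        max_len = min(lookahead, len(text) - pos - 1)
        for cand in range(start, pos):
            length = 0
            while length < max_len and text[cand + length] == text[pos + length]:
                length += 1
            if length > best_len:
                best_off, best_len = pos - cand, length
        tokens.append((best_off, best_len, text[pos + best_len]))
        pos += best_len + 1
    return tokens


def lz77_decode(tokens):
    """Rebuild the text by copying `length` symbols from `offset` back."""
    out = []
    for offset, length, symbol in tokens:
        for _ in range(length):
            out.append(out[-offset])  # symbol-by-symbol, so overlaps work
        out.append(symbol)
    return "".join(out)


sample = "sir sid eastman easily teases sea sick seals"
sample_tokens = lz77_encode(sample)
print(sample_tokens[:5])
```

The first five tokens match the table above; later tokens depend on the tie-breaking rule and the buffer sizes, so they need not match the slide exactly.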

Huffman Coding 9 /31

Symbol probabilities: a1 0.4, a2 0.2, a3 0.2, a4 0.1, a5 0.1
Tree nodes, built up from the leaves a5, a4, a3, a2, a1: a45, a345, a2345, a12345
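The tree for these five symbols can be built with a priority queue. This is a minimal sketch under stated assumptions: `huffman_codes` is a hypothetical helper name, and the probabilities are given as integer weights in units of 0.1 to avoid floating-point ties. Depending on how ties are broken, the individual code lengths can differ from the tree on the slide, but the average code length is the same.

```python
import heapq
from itertools import count


def huffman_codes(weights):
    """Return {symbol: bit string} for a dict of positive weights."""
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(w, next(tie), {sym: ""}) for sym, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w0, _, low = heapq.heappop(heap)   # two lightest subtrees
        w1, _, high = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (w0 + w1, next(tie), merged))
    return heap[0][2]


# Probabilities from the slide, scaled by 10.
weights = {"a1": 4, "a2": 2, "a3": 2, "a4": 1, "a5": 1}
codes = huffman_codes(weights)
average_bits = sum(weights[s] * len(codes[s]) for s in weights) / 10
print(codes, average_bits)
```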

Phillip Katz Deflate: Zip and Gzip 10 /31

(offset, length, symbol) ⇒ (offset, length) 11 /31

Encoded text: … old...she needs..then…there…    Text for …: the new...

Modes: Normal, High-compression, Fast

Compression modes:
No compression
Compression with fixed-size tables
Compression with individual tables built for the current data
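The three compression modes can be tried out with Python's `zlib` module, which wraps the zlib implementation of Deflate. This is a rough sketch, assuming that level 0 yields uncompressed (stored) blocks, that the `Z_FIXED` strategy forces the predefined tables, and that the default strategy lets the encoder build tables for the data; `deflate` is a hypothetical helper name, and the sample text just repeats words from the slide.

```python
import zlib


def deflate(data, level, strategy=zlib.Z_DEFAULT_STRATEGY):
    """Compress `data` with the given zlib level and strategy."""
    comp = zlib.compressobj(level=level, strategy=strategy)
    return comp.compress(data) + comp.flush()


sample = b"old she needs then there the new " * 200

stored = deflate(sample, 0)                # no compression
fixed = deflate(sample, 9, zlib.Z_FIXED)   # fixed tables
dynamic = deflate(sample, 9)               # tables built for this data

for name, blob in (("stored", stored), ("fixed", fixed), ("dynamic", dynamic)):
    print(name, len(blob), "bytes from", len(sample))
```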

“range encoding” “Handbook of Data Compression” David Salomon, Giovanni Motta 12 /31

Match finders: "hash-chain", "binary-tree" 13 /31

Encoded text: … …    Text for …: the new...

LZMA index

bt2   Binary Tree with 2-byte hashing
bt3   Binary Tree with 3-byte hashing
bt4   Binary Tree with 4-byte hashing
hc4   Hash Chain with 4-byte hashing

Hash-chain 14 /31
[Figure: hash-chain diagram]
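The idea behind a hash-chain match finder can be sketched as follows. This is a simplified illustration and not the LZMA SDK implementation: `hash_chain_matches`, `min_len` and `max_chain` are hypothetical names, and a Python dict keyed by the next four bytes stands in for the hash table. `head` stores the most recent position for each key, and `prev` links each position to the previous one with the same key.

```python
def hash_chain_matches(data, min_len=4, max_chain=16):
    """Return (position, distance, length) for the best earlier match."""
    head = {}   # 4-byte key -> most recent position starting with it
    prev = {}   # position -> previous position with the same key
    results = []
    for pos in range(len(data) - min_len + 1):
        key = data[pos:pos + min_len]
        cand = head.get(key)
        best_dist, best_len = 0, 0
        steps = 0
        # Walk the chain of earlier positions, newest first.
        while cand is not None and steps < max_chain:
            length = 0
            while pos + length < len(data) and data[cand + length] == data[pos + length]:
                length += 1
            if length > best_len:
                best_dist, best_len = pos - cand, length
            cand = prev.get(cand)
            steps += 1
        if best_len >= min_len:
            results.append((pos, best_dist, best_len))
        prev[pos] = head.get(key)   # link into the chain
        head[key] = pos
    return results


found = hash_chain_matches(b"abcdXabcdYabcdX")
print(found)
```

Limiting the walk with `max_chain` trades match quality for speed, which is the usual reason a hash chain is faster but weaker than a binary tree.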

Binary-tree 15 /31

Input: …abm…abcd2…abcx…abcd1…aby…
[Figure: binary tree built step by step from the strings abm…, abcd2…, abcx…, abcd1…, aby…]

Example LZMA implementation (the Java implementation from the LZMA SDK) 16 /31

LZMA SDK, 2004. ANSI-C/C++/C#/Java. Files: lzma.exe, lzma.txt, 7zFormat.txt, history.txt 17 /31

Main characteristics of the LZMA SDK 18 /31

Variable dictionary size
Estimated compression speed: about 2 MB/s on a 2 GHz CPU
Estimated decompression speed: … MB/s on a 2 GHz Core 2 or AMD Athlon, … MB/s on a 200 MHz RISC
Low memory requirements for decompression (16 KB + dictionary size)
Multithreading support

Main options of the LZMA SDK 19 /31

-a{N}        Compression mode: 0, 1, 2 = fast, normal, max
-d{N}
-si
-so
-mf{MF_ID}

MF_ID   Memory      Description
bt2     d * … MB    Binary Tree with 2-byte hashing
bt3     d * … MB    Binary Tree with 3-byte hashing
bt4     d * … MB    Binary Tree with 4-byte hashing
hc4     d * … MB    Hash Chain with 4-byte hashing
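Comparable settings (dictionary size, compression mode, match finder) can be tried from Python's standard `lzma` module. Note the assumptions: that module wraps liblzma from XZ Utils, not the LZMA SDK, so this is an analogy to the `lzma.exe` options rather than the same tool, and `xz_compress` is a hypothetical helper name.

```python
import lzma


def xz_compress(data, mf, mode, dict_size=1 << 20):
    """Compress with an explicit match finder, mode and dictionary size."""
    filters = [{
        "id": lzma.FILTER_LZMA2,
        "dict_size": dict_size,   # dictionary size in bytes
        "mf": mf,                 # match finder, e.g. lzma.MF_HC4 or lzma.MF_BT4
        "mode": mode,             # lzma.MODE_FAST or lzma.MODE_NORMAL
    }]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


payload = b"sir sid eastman easily teases sea sick seals " * 500

fast = xz_compress(payload, lzma.MF_HC4, lzma.MODE_FAST)
normal = xz_compress(payload, lzma.MF_BT4, lzma.MODE_NORMAL)
print(len(payload), len(fast), len(normal))
```

Pairing the hash chain with the fast mode and the binary tree with the normal mode mirrors the speed versus ratio trade-off between the match finders in the table.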

20 /31 Lasse Collin

Comparison of Gzip, Bzip2 and LZMA 21 /31

Test system: AMD mobile Athlon XP …, … MB RAM, Linux …
Software: gzip 1.3.3, bzip2 …, LZMA SDK 4.17 (lzmash)

What was looked at:
file size after compression
decompression time
memory required for decompression
a common format that everyone knows

Tar archive: OpenOffice.org 1.1.4 (Linux), 203 MB 22 /31

Compressed size / uncompressed size * 100%

Level   gzip     bzip2    lzmash
1       40.6%    35.8%    31.7%
2       39.9%    34.9%    29.2%
3       39.3%    34.5%    28.0%
4       38.2%    34.3%    27.4%
5       37.5%    34.2%    26.7%
6       37.2%    34.1%    26.4%
7       37.1%    34.1%    26.1%
8       37.1%    34.0%    25.7%
9       37.0%    34.0%    25.4%
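The ratio in this table, compressed size / uncompressed size * 100%, is easy to measure with Python's standard `gzip`, `bz2` and `lzma` modules. This is a sketch only: those modules wrap zlib, libbzip2 and liblzma rather than the exact tools and versions benchmarked here, the liblzma presets do not correspond to lzmash levels, and `ratio_percent` and `compare` are hypothetical helper names.

```python
import bz2
import gzip
import lzma


def ratio_percent(compressed, original):
    """Compressed size / uncompressed size * 100%."""
    return 100.0 * len(compressed) / len(original)


def compare(data, levels):
    """Return rows of (level, gzip %, bzip2 %, lzma %)."""
    rows = []
    for level in levels:
        rows.append((
            level,
            ratio_percent(gzip.compress(data, compresslevel=level), data),
            ratio_percent(bz2.compress(data, compresslevel=level), data),
            ratio_percent(lzma.compress(data, preset=level), data),
        ))
    return rows


demo = b"sir sid eastman easily teases sea sick seals\n" * 2000
demo_rows = compare(demo, levels=(1, 3))
for level, gz, bz, xz in demo_rows:
    print(f"{level}  {gz:5.1f}%  {bz:5.1f}%  {xz:5.1f}%")
```

Only low levels are used in the demo because high liblzma presets allocate much more memory during compression.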

Tar archive: OpenOffice.org 1.1.4 (Linux), 203 MB 23 /31

Compression time

Level   gzip     bzip2     lzmash
1       11.5s    1m 26s    0m 58s
2       12.0s    1m 40s    2m 7s
3       13.7s    1m 54s    4m 58s
4       15.1s    2m 5s     5m 26s
5       18.4s    2m 11s    6m 47s
6       24.5s    2m 18s    7m 30s
7       29.4s    2m 25s    8m 24s
8       45.5s    2m 32s    10m 59s
9       66.9s    2m 37s    12m 20s

Decompression time

Level   gzip    bzip2    lzmash
1       3.3s    16.5s    11.3s
2       3.3s    24.2s    10.5s
3       3.3s    29.2s    10.5s
4       3.3s    32.1s    10.4s
5       3.2s    34.2s    10.2s
6       3.2s    35.4s    10.2s
7       3.2s    36.5s    10.1s
8       3.2s    37.5s    10.0s
9       3.1s    38.2s    10.0s

Tar archive: the Linux kernel source, 199 MB 24 /31

Compressed size / uncompressed size * 100%

Level   gzip     bzip2    lzmash
1       27.8%    21.1%    …
2       26.5%    19.7%    18.7%
3       25.7%    19.1%    16.7%
4       23.9%    18.7%    16.1%
5       22.9%    18.4%    15.6%
6       22.6%    18.2%    15.2%
7       22.5%    18.1%    14.8%
8       22.4%    17.9%    14.5%
9       22.4%    17.8%    14.3%

Tar archive: the Linux kernel source, 199 MB 25 /31

Compression time

Level   gzip     bzip2     lzmash
1       8.3s     1m 9s     0m 45s
2       8.7s     1m 22s    1m 45s
3       9.8s     1m 34s    5m 10s
4       11.1s    1m 45s    5m 43s
5       13.8s    1m 57s    7m 39s
6       17.8s    2m 2s     8m 23s
7       20.7s    2m 11s    9m 11s
8       29.7s    2m 21s    11m 34s
9       40.9s    2m 26s    12m 31s

Decompression time

Level   gzip    bzip2    lzmash
1       2.8s    12.8s    7.7s
2       2.7s    19.4s    6.9s
3       2.6s    23.8s    6.4s
4       2.5s    26.4s    6.3s
5       2.5s    28.3s    6.3s
6       2.4s    29.6s    6.2s
7       2.4s    30.6s    6.2s
8       2.4s    31.3s    6.1s
9       2.4s    32.1s    6.1s

XMMS binary package (Slackware 10.1), 5.2 MB 26 /31

Compressed size / uncompressed size * 100%

Level   gzip     bzip2    lzmash
1       39.3%    32.8%    26.0%
2       38.4%    29.3%    20.7%
3       37.7%    28.0%    18.8%
4       36.9%    27.0%    18.3%
5       36.2%    26.6%    18.0%
6       36.0%    26.1%    17.9%
7       35.9%    26.0%    17.9%
8       35.9%    25.7%    17.8%
9       35.8%    25.2%    17.8%

XMMS binary package (Slackware 10.1), 5.2 MB 27 /31

Compression time

Level   gzip    bzip2    lzmash
1       0.3s    2.4s     1.4s
2       0.3s    2.9s     2.7s
3       0.4s    3.2s     6.2s
4       0.4s    3.3s     6.6s
5       0.5s    4.6s     8.2s
6       0.7s    5.6s     8.5s
7       0.8s    4.7s     8.6s
8       1.1s    4.9s     10.5s
9       1.8s    5.1s     10.5s

Decompression time

Level   gzip    bzip2    lzmash
1       0.1s    0.4s     0.3s
2       0.1s    0.6s     0.2s
3       0.1s    0.7s     0.2s
4       0.1s    0.8s     0.2s
5       0.1s    0.9s     0.2s
6       0.1s    0.9s     0.2s
7       0.1s    0.9s     0.2s
8       0.1s    1.0s     0.2s
9       0.1s    1.0s     0.2s

XMMS source tarball, 15.2 MB 28 /31

Compressed size / uncompressed size * 100%

Level   gzip     bzip2    lzmash
1       29.5%    23.2%    21.2%
2       28.6%    19.9%    13.3%
3       27.9%    18.3%    12.0%
4       26.4%    17.2%    11.3%
5       25.7%    16.7%    10.8%
6       25.4%    16.2%    10.3%
7       25.3%    15.7%    9.7%
8       25.3%    15.4%    9.6%
9       25.3%    15.1%    9.6%

XMMS source tarball, 15.2 MB 29 /31

Compression time

Level   gzip    bzip2    lzmash
1       0.7s    6.1s     3.5s
2       0.7s    7.3s     6.0s
3       0.8s    8.5s     19.0s
4       0.9s    9.9s     19.9s
5       1.1s    11.2s    28.9s
6       1.4s    11.0s    30.1s
7       1.7s    12.5s    30.9s
8       2.5s    15.9s    41.7s
9       2.9s    17.5s    41.7s

Decompression time

Level   gzip    bzip2    lzmash
1       0.2s    1.0s     0.6s
2       0.2s    1.5s     0.4s
3       0.2s    1.9s     0.4s
4       0.2s    2.1s     0.4s
5       0.2s    2.3s     0.4s
6       0.2s    2.5s     0.4s
7       0.2s    2.6s     0.4s
8       0.2s    2.7s     0.4s
9       0.2s    2.8s     0.4s

Memory requirements 30 /31

RAM usage on compression

Level   gzip     bzip2    lzmash
1       <1 MB    2 MB     …
2       <1 MB    2 MB     12 MB
3       <1 MB    3 MB     12 MB
4       <1 MB    4 MB     16 MB
5       <1 MB    5 MB     26 MB
6       <1 MB    5 MB     45 MB
7       <1 MB    6 MB     83 MB
8       <1 MB    7 MB     159 MB
9       <1 MB    7 MB     311 MB

RAM usage on decompression

Level   gzip     bzip2    lzmash
1       <1 MB    1 MB     …
2       <1 MB    2 MB     …
3       <1 MB    2 MB     1 MB
4       <1 MB    2 MB     …
5       <1 MB    3 MB     …
6       <1 MB    3 MB     5 MB
7       <1 MB    3 MB     9 MB
8       <1 MB    4 MB     17 MB
9       <1 MB    4 MB     33 MB

Thank you for your attention! 31 /31