Published by Brayan Curt. Modified over 2 years ago.

1
NATO UNCLASSIFIED
NATO Consultation, Command and Control Agency
COMMUNICATIONS & INFORMATION SYSTEMS
Decreasing “Bit Pollution” through “Sequence Reduction”
Dr. Davras Yavuz

2
You will find this presentation and the accompanying paper at from where both can be viewed and/or downloaded (the four other NC3A presentations can also be found at the above URL)

3
Terminology
- “Sequence Reduction”: originates with Peribit (~2000). The founder's Ph.D. on genome mapping (Biomedical Informatics, Stanford University) uses the term “Molecular Sequence Reduction” (MSR)
- “Bit Pollution”: link/network pollution caused by the repetition of redundant digital sequences over transmission media (especially significant for mobile/deployed networks and links)
- Other related terms: WAN optimizer, Application Accelerator/Optimizer or Application Controller-Optimizer, Performance Enhancement Proxies (PEP), WAN Expanders, latency (= delay) removers/compensators/mitigators, etc.
- New & dynamic field: many terms will continue to appear and coalesce; some will catch on, others will disappear

4
Terminology
- “Next Generation Compression”, “Bit Pollution Reduction”, “Sequence Reduction” (the latter from Peribit/Dr. Amit Singh)
- WAN Expander (WX), WAN Optimizer, WAN Optimization Controller (WOC) (Juniper/Peribit)
- Application Accelerator/Optimizer/Controller-Optimizer
- Latency Remover/Optimizer (“Latency” can be replaced by “Delay”), especially for networks with SATCOM links
- In general: use of a-priori knowledge of the data comms protocols required by the application to optimize the data input/output
- Combinations of the above
- Unfortunately, all present implementations are “proprietary”
- Unrealistic to expect “standards” soon: the technology is too new and lucrative

5
Why “Bit Pollution”?
- Most of us deal daily with various electronic files/information
  - Taking MS Office as an example: Word, PPT, Excel, Project, HTML, Access, ... files
  - ... and/or many other electronic files, databases, forms, etc.
- On many occasions we make small changes and send them back and/or forward to others
- Repetitive traffic over communication links can, in general, be classified broadly into 3 categories:
  1) Application & protocol overheads
  2) Commonly used words, phrases, strings, objects (logos, images, audio clips, etc.)
  3) Process flows (database updates/views, forms, templates, etc. going back & forth)

6
SEQUENCE REDUCTION: Next Generation Compression - Examples
- 256 Kbps satellite link
- 20 Mbyte PPT file (48 slides) sent for the 1st time: ~12 minutes (700 secs)
- 6 of the slides modified, file size change < 0.5 Mbytes
  - Modified file sent 6 hours later, time taken: ~8 secs
  - Same modified file sent 24 hours later: ~18 secs
  - Sent 7 days later: ~24 secs
  - Original file sent 7 days later: ~14 secs
- Similar results for Word and Excel files and web pages
- Less, but still significant, improvement for PDF files
- Smallest improvement for zipped files (reduction by ~2.5 to 3)
- The amount of “new” files in between repetitions and the SR RAM/HD capacities have a strong effect on the duration of repeat transmissions (dynamic library updates)
- Above results based on Peribit SRs: German MOD, Syracuse University “Real World” Labs (Network Computing, Nov 2004) and NC3A
- GE MOD results based on operational traffic, others on test traffic
- Ref [6] of paper: “Record for throughput was ~60Mbps through a T1. It came about when copying 1.5GB file twice!”
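The quoted timings can be converted into effective throughput with a few lines of arithmetic. The sketch below is only illustrative: it assumes decimal units (1 Mbyte = 10^6 bytes, 256 Kbps = 256,000 bit/s) and treats every transfer as roughly 20 Mbytes. The function and variable names are hypothetical and do not come from the presentation.

```python
LINK_BPS = 256_000        # 256 Kbps satellite link (decimal kbit assumed)
FILE_BYTES = 20 * 10**6   # ~20 Mbyte PPT file (decimal Mbyte assumed)


def raw_transfer_seconds(size_bytes, link_bps):
    """Time to push every bit of the file over the link, ignoring overheads."""
    return size_bytes * 8 / link_bps


def effective_bps(size_bytes, observed_seconds):
    """Apparent throughput seen by the user for an observed transfer time."""
    return size_bytes * 8 / observed_seconds


# Observed durations quoted on the slide, in seconds.
observed = {
    "first transmission": 700,
    "modified file, 6 hours later": 8,
    "modified file, 24 hours later": 18,
    "modified file, 7 days later": 24,
    "original file, 7 days later": 14,
}

ideal = raw_transfer_seconds(FILE_BYTES, LINK_BPS)
print(f"unassisted transfer at line rate: {ideal:.0f} s")
for label, seconds in observed.items():
    rate = effective_bps(FILE_BYTES, seconds)
    print(f"{label:32s} {seconds:4d} s  {rate / 1000:8.0f} kbit/s  "
          f"x{rate / LINK_BPS:5.1f} of line rate")
```

Under these assumptions the unassisted transfer takes about 625 s, close to the ~700 s quoted for the first transmission, while the 8 s repeat corresponds to an effective rate of roughly 20 Mbit/s over the same 256 Kbps link.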

7
Mobile/Tactical Comms Divergence
- Fixed communications: WANs with all users/nodes fixed
  - Fiber-optic/photonic revolution: essentially unlimited capacity is now possible/available if/when a cable can be installed
- Mobile comms: networks with mobile/deployable users
  - No technological revolution similar to the photonic one is foreseen
  - Radio propagation will be the limiting factor
    - Mainstay will be radio: tactical LOS tens/hundreds of Kbps, BLOS (rough terrain, long distances) a few Kbps
    - Star-wars scenarios: moving laser beams ???
  - LEO satellites will provide some 100s of Kbps, at a cost
- Divergence will continue
- Another factor: input into the five senses is ~100 Shannon/entropy bps; for transmission, redundancy x 10 = 1 Kbps
- Therefore: we must treat mobile/tactical comms differently

8
Deployable, Mobile, On-the-Move Communications
- At least one end of a link moving/deployed
- Networks which have nodes/users moving/deployed
- Such links/networks are essential for survivability and rapid reaction
- Will be taking on increasingly more critical tasks
- Present approach: use applications developed for fixed links/networks for deployed/mobile units
- Must consider the very different characteristics of such networks when choosing applications
- Can we measure “information”, so that we can determine the performance of links/networks in terms of “information” transported, not just bits/bytes?

9
Can we measure “information”? Yes, we can!
- Shannon defined the concept of “Entropy”, a logarithmic measure, in the 1940s (while working on cryptography); it has stood the test of time
- The first suggestion of a log measure was by Hartley (base 10), but Shannon used the idea to develop a complete “theory of information & communication”
- Shannon preferred log base 2 and called the “unit” bits
- Base e is also sometimes used (nats)
- The smaller the probability of occurrence of an event, the higher the “information delivered” when it occurs

10
C. E. Shannon (BSTJ 1948): discrete, countable symbol sets {S_i} and {R_j}

11
Entropy
- Entropy (H) in the case of two possibilities/events/symbols
- Probability of one = p, of the other q = 1 - p
- H = -(p log p + q log q)
- [Figure: H versus p plotted]
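The two-symbol entropy formula is easy to evaluate numerically. The following is a minimal sketch, assuming base-2 logarithms so that H comes out in bits; the function name `binary_entropy` is an illustrative choice, not something defined in the presentation.

```python
import math


def binary_entropy(p):
    """H = -(p log2 p + q log2 q) with q = 1 - p, in bits."""
    if p in (0.0, 1.0):
        # A certain outcome carries no information (the limit of x log x is 0).
        return 0.0
    q = 1.0 - p
    return -(p * math.log2(p) + q * math.log2(q))


# Tabulate H over a few probabilities; the curve is symmetric about p = 0.5.
for p in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(f"p = {p:4.2f}   H = {binary_entropy(p):.3f} bits")
```

The maximum of 1 bit occurs at p = 0.5, where the two outcomes are equally likely, and H falls to zero as either outcome becomes certain.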

12
Let us take a “Natural Language”, English, as an example
- English has 26 letters (characters)
- Space as a delimiter
- TOTAL: 27 characters (symbols)
- One could include punctuation, special characters, etc.; for example, we could use the full 256 ASCII symbol set. The methodology is the same
- Extension to other natural languages is readily made
- Extension to images is also possible (same methodology)

13
Structure of a “Natural Language” - English
- Defined by many characteristics: grammar, semantics, etymology, usage, ..., historical developments, ...
- Until the early 70s there was substantial belief that “Natural Languages” and “computer programming languages” (finite automata instructions) had similarities
- Noam Chomsky's work (Professor at MIT) completely destroyed those expectations
- Natural Languages can be studied through probabilistic (Markov) models
- Shannon's approach (1940s, no computers): Bell Labs staff flipped through many pages of books to get the probabilities
- He was actually working on cryptography and made important contributions in that area also

14
Various Markov model examples belong here; they are skipped for continuity and may be found at the end

15
Zipf’s Law: “Principle of Least Effort”
- George Kingsley Zipf, Professor of Linguistics, Harvard (1902 – 1950)
- If the “words” in a language are ordered (“ranked”) from the most frequently used down, the probability $P_n$ of the n-th word in this list is $P_n \approx 0.1 / n$
- Implies a maximum vocabulary size (in words), since $\sum 1/n$ is not finite when summed from $n = 1$ to $\infty$
- For details of the above see DY, IEEE Transactions on Information Theory, September 1974
- Many other applications of “Zipf’s Law”; if interested, just make a Google/Internet search
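Because the probabilities 0.1/n must add up to 1, the ranking has to stop at a finite rank. The sketch below finds that rank numerically by accumulating the series until it reaches 1. The constant 0.1 is taken from the slide; the function name `zipf_vocabulary_limit` is hypothetical.

```python
def zipf_vocabulary_limit(c=0.1):
    """Smallest rank N at which the cumulative probability sum(c / n) reaches 1."""
    total = 0.0
    n = 0
    while total < 1.0:
        n += 1
        total += c / n
    return n


limit = zipf_vocabulary_limit()
print(f"cumulative probability reaches 1 at rank {limit}")
```

With c = 0.1 the cutoff lands a little above twelve thousand words. A smaller constant pushes the limit out very quickly, since the harmonic sum grows only logarithmically.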

16
Zipf’s Law (Principle of Least Effort) From “Symbols, Signals & Noise” J. R. Pierce ~ million words, various texts

17
Entropy bits/character - English
Amazingly, it turns out to be about the same for most “Natural Languages” for which the analysis has been done (Arabic, French, German, Hebrew, Latin, Spanish, Turkish, ...). These languages also follow Zipf’s Law.

18
Entropy of Natural Languages
- Between 1 & 2 bits per letter/character
- 1.5 bits per letter is commonly used
- English has ~4.5 letters per word on average
- 4.5 x 1.5 = 6.75, or ~7 bits per word on average
- Normal speech words per second
- Hence information per second ~5 bits

19
Extension to Images
- Same concept and definitions
- Letters replaced by pixels/groups of pixels, etc.
- Words could be analogous to sets of pixels, objects
- The numbers are much larger
- E.g. a 400 x 600 = 240,000 pixel image, with each pixel capable of taking on one of 16 brightness levels: 16^240,000 possible images
- Assume all these images are equally likely (*): the probability of one of these images is 1/16^240,000 and the information provided by that image is log2 16^240,000 = 960,000 bits
- A real image contains much less “information”: adjacent/nearby pixels are not independent of each other
- Movies: frame to frame only small/incremental changes
- (*) The “equally likely” assumption is clearly not realistic
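The image example is plain arithmetic once the equally-likely assumption is made: each pixel contributes log2(levels) bits. A minimal sketch, using the 400 x 600 pixel, 16-level image from the slide; the helper name `image_information_bits` is illustrative only.

```python
import math

WIDTH, HEIGHT, LEVELS = 400, 600, 16   # image size and brightness levels from the slide


def image_information_bits(width, height, levels):
    """Information in one image if all levels**(width*height) images are equally likely."""
    return width * height * math.log2(levels)


pixels = WIDTH * HEIGHT
bits = image_information_bits(WIDTH, HEIGHT, LEVELS)
# Number of decimal digits in levels**pixels, computed with logarithms so the
# huge integer never has to be written out.
decimal_digits = math.floor(pixels * math.log10(LEVELS)) + 1

print(f"{pixels} pixels, {bits:.0f} bits per equally likely image")
print(f"number of possible images has {decimal_digits} decimal digits")
```

This figure is an upper bound: real images carry far less, because neighbouring pixels are strongly correlated.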

20
Speech Coding
- ~5 b/s is the irreducible information content; multiply by 10 to introduce redundancy. We should therefore be able to communicate speech “information” at ~50 bps
- Examples of speech coding we use: bps, bps PC bps CVSD, 2400 bps LPC, MELP 1200, 600 bps MELP
- All of the above are “waveform” codecs; they will also convey “non-measurable” (intangible) information
- Speech codec technology (recognition at the transmitter and synthesis at the receiver) could conceivably go lower than 600 bps, but would not contain the intangible component!

21
A QUICK REFRESHER ON CONVENTIONAL COMPRESSION (may be found at the end)

22
SEQUENCE REDUCTION: Next Generation Compression
- Dictionary based: implements a learning algorithm
  - Dynamically learns the “language” of the communications traffic and translates it into “short-hand”
  - Continuously updates/improves its “knowledge” of the link “language”
  - Frequent patterns move up in the dictionary; infrequent patterns move down and can eventually age out
- No fixed packet or window boundaries
  - Unlike e.g. LZ, which generally uses a 2048 byte window
  - Once a pattern is learned and put in the dictionary, it will be compressed wherever it appears
- Data compression is based on previously seen data
- Performance improves with time as “learning” increases
  - Very quickly at first (10 – 20 minutes) and then slowly
  - When a new application comes in, SR adapts to its “language”
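The general idea of a persistent, learned dictionary shared by both ends of a link can be illustrated with a toy model. This is emphatically not the Peribit/MSR algorithm, which is proprietary and not described in the slides. As simplifying assumptions, the sketch cuts traffic into fixed 16-byte chunks (the real products are said to match patterns of any size), uses least-recently-used aging, and invents every name (`Endpoint`, `CHUNK`, `encode`, `decode`).

```python
from collections import OrderedDict

CHUNK = 16  # fixed chunk size: a simplification made for this sketch only


class Endpoint:
    """One side of the link. Sender and receiver apply identical dictionary
    updates in the same order, so their dictionaries stay synchronized."""

    def __init__(self, max_entries=4096):
        self.max_entries = max_entries
        self.by_chunk = OrderedDict()   # chunk -> token, least recently used first
        self.by_token = {}              # token -> chunk
        self.next_token = 0

    def _learn(self, chunk):
        if len(self.by_chunk) >= self.max_entries:
            # Age out the least recently used pattern.
            _, old_token = self.by_chunk.popitem(last=False)
            del self.by_token[old_token]
        self.by_chunk[chunk] = self.next_token
        self.by_token[self.next_token] = chunk
        self.next_token += 1

    def encode(self, data):
        out = []
        for i in range(0, len(data), CHUNK):
            chunk = data[i:i + CHUNK]
            if chunk in self.by_chunk:
                out.append(("ref", self.by_chunk[chunk]))   # short token
                self.by_chunk.move_to_end(chunk)            # pattern "moves up"
            else:
                out.append(("raw", chunk))                  # sent in full once
                self._learn(chunk)
        return out

    def decode(self, items):
        parts = []
        for kind, value in items:
            if kind == "ref":
                chunk = self.by_token[value]
                self.by_chunk.move_to_end(chunk)
            else:
                chunk = value
                self._learn(chunk)
            parts.append(chunk)
        return b"".join(parts)


sender, receiver = Endpoint(), Endpoint()

# A 1600-byte "document" made of 100 distinct 16-byte chunks.
document = b"".join(i.to_bytes(4, "big") for i in range(400))
first = sender.encode(document)
restored_first = receiver.decode(first)

# Modify one chunk and send again: only the changed chunk travels in full.
edited = document[:160] + b"X" * 16 + document[176:]
second = sender.encode(edited)
restored_second = receiver.decode(second)

raw_first = sum(1 for kind, _ in first if kind == "raw")
raw_second = sum(1 for kind, _ in second if kind == "raw")
print(f"first send: {raw_first} raw chunks, second send: {raw_second} raw chunk(s)")
```

Unlike a sliding-window compressor, the dictionary here survives between transmissions, which is why a repeat of mostly unchanged data collapses to a stream of short references.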

23
Relative positioning of statistical and substitutional compression algorithms (from Peribit, A. P. Singh) MOLECULAR SEQUENCE REDUCTION

24
“Molecular Sequence Reduction”

25
MSR – Technology
- Origins in DNA pattern matching
- Real time, high speed, low latency
- Continuously learns and updates dictionary
- Transparently operates on all traffic (optimized for IP)
- Eliminates patterns of any size, anywhere in stream
- Patent-pending technology

26
MSR – Molecular Sequence Reduction: “Next-gen dictionary-based compression”

27
Government/Military use examples
- Many thousands of units in use in USA (mostly corporate but also government agencies)
- GE MOD using Peribit SRs (for ~2 years)
  - INMARSAT
  - German Navy WAN (encrypted)
  - Links to GE Navy ships in/around South Africa
  - Satellite links to GE units in Afghanistan
  - Plans for some 64 Kbps landlines
  - GE MOD total: 300+ units
- Also other nations ..., some with initial trials

28
Reduction rates observed (reduced by % amount given): GE Armed Forces Results
Traffic type by version (V 3.0, V 4.02, V 5.0):
- HTTP: 30 %, 40 %, 46 %
- MAIL: 61 %, 67 %
- NetBios: 59 %, 62 %
- CIFS: 92 %
- FTP: 69 %, 73 %
- TELNET: 65 %, 69 %
- 93 %

29
From German MOD

30
Startup behavior example (from German MOD)

31
From German MOD

32
From German MOD

33
From Peribit.com (not GE MOD data)

34
Peribit (screen capture), NC3A WAN (NL – BE): effective WAN capacity increased by 2.80; data reduction by %. Traces shown: no data compression & no reduction; with data compression & reduction

36
Peribit Sequence Reducers

37
NC3A TEST RESULT SUMMARY: Expand Model 4800 “WAN Link Accelerators”
kbps satellite link, multiplexed TCP/IP
Legend: link with SCPS-TP acceleration; link with application accelerator & IP data compressor; un-accelerated link

38
NC3A TEST RESULT SUMMARY
kbps satellite link, multiplexed TCP/IP
Legend: link with SCPS-TP acceleration; link with application accelerator & IP data compressor; un-accelerated link

39
512 Kbps satellite link, 10 multiplexed TCP/IP sessions
Legend: link with SCPS-TP acceleration; link with application accelerator & IP data compressor; un-accelerated link

40
Packeteer

41
Industry
- New area, but many companies and an increasing number of them:
  - Peribit.com (now Juniper Networks)
  - Expand.com (Expand Networks)
  - Packeteer.com
  - Riverbed.com
  - Silver-peak.com
  - ...
- National authorities (e.g. USA & GE) are also working with industry to incorporate SR/WX technology into national crypto devices

42
SEQUENCE REDUCTION: Next Generation Compression - Summary (1)
- WANs will form the backbone of Network Enabled Operation
- This technology provides significant improvements in capacity
- Dictionary based: implements a learning algorithm
  - Dynamically learns the “language” of the communications traffic and translates it into “short-hand”
  - Continuously updates/improves its “knowledge” of the link “language”
  - Frequent patterns move up in the dictionary; infrequent patterns move down and can eventually age out
- No fixed packet or window boundaries
  - Unlike conventional compression, which operates over 1-2 Kbytes
  - Once a pattern is learned and put in the dictionary, it will be compressed wherever it appears
- Data compression is based on previously seen data
- Performance improves with time as “learning” increases
  - Very quickly at first (10 – 20 minutes) and then slowly
  - When a new application comes in, SR adapts to its “language”

43
SEQUENCE REDUCTION: Next Generation Compression - Summary (2)
- Significant advantages for WANs where capacity is an issue (i.e. deployed/mobile/tactical)
- Removes redundant/repetitive transmissions
- Packet-flow acceleration (latency removal) can be easily added
- Quality of Service & Policy Based Multipath can also be implemented
- Does not impact security implementations (cryptos between SRs)
However:
- Presently available from a few sources, each with its “proprietary” technology

44
Conclusions
- Shannon Information Theory provides tools for measuring “information” as “Entropy”
  - Has formed the basis for most of the coding and data transmission/detection results since the 1950s
  - The DNA / genome mapping process has also apparently benefited from it
  - In the 90s the estimate for the human genome was years; it took 2-3 years with the computational developments of the late 90s
- A new form of compression, “Sequence Reduction”, provides significant reductions by reducing redundancies in transmitted data
  - Will provide important advantages for mobile/deployable/moving WAN link applications

45
Questions / Comments
This presentation & associated paper can be found at

46
NC3A
NC3A Brussels
Visiting address: Bâtiment Z, Avenue du Bourget 140, B-1110 Brussels
Telephone +32 (0) Fax +32 (0)
Postal address: NATO C3 Agency, Boulevard Leopold III, B-1110 Brussels - Belgium
NC3A The Hague
Visiting address: Oude Waalsdorperweg AK The Hague
Telephone +31 (0) Fax +31 (0)
Postal address: NATO C3 Agency, P.O. Box CD The Hague, The Netherlands

47
Markov model examples

48
AZEWRTZYNSADXESYJRQY_WGECIJJ_OB
_KRBQPOZB_YMBUAWVLBTQCNIKFMP_K
MVUUGBSAXHLHSIE_MAULEXJ_NATSKI
Zeroth approximation to English (zero memory)
[Zero-order Markov: equally likely letters, 27 numbers]
All logs base 2
Entropy $= \sum_{i=1}^{27} p_i \log (1/p_i) = \log 27 = 4.75$ bits/letter (or symbol)

49
AI_NGAE__ITF__NR_ASAEV_OIE_BAINTHHHYRO O_POER_SETRYGAIETRWCO__ EHDUARU_ EU_C_FT_NSREM_DIY_EESE_ F_O_SRIS_R __UNNASHOR_CIE_AT_XEOIT_UTKLOOUL_E
First approximation to English (zero memory)
[Zero-order Markov: letter probabilities, 27 numbers]
Entropy $= \sum_{i=1}^{27} p_i \log (1/p_i) \approx 4$ bits/letter

50
URTESHETHING_AD_E AT_FOULE_ ITHALIORT_WACT_D_STE_MINTSAN_OLI NS__TWID_OULY_TE_THIGHE_CO_YS_TH _HR_ UPAVIDE_PAD_CTAVED_QUES_E
Second approximation to English (memory)
[First-order Markov: e.g. prob(a|a), prob(b|a), prob(c|a), ..., 27 x 27 = 729 numbers, some zero]
Entropy $= \sum_{i,k} p_{i,k} \log (1/p_{i|k})$, summed over the 729 (= 27 x 27) pairs, $\approx 3.3$ bits/letter
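The zero-memory and first-order entropy formulas can be estimated from any text sample by counting letters and letter pairs. The sketch below is a hedged illustration: it uses the 27-symbol alphabet from the slides (26 letters plus `_` for space), base-2 logarithms, and hypothetical helper names. A short sample gives only a rough estimate, so no particular numerical result is implied.

```python
import math
from collections import Counter


def normalize(text):
    """Map text onto the 27-symbol alphabet: a-z plus '_' for everything else."""
    symbols = []
    for ch in text.lower():
        sym = ch if "a" <= ch <= "z" else "_"
        if sym == "_" and symbols and symbols[-1] == "_":
            continue  # collapse runs of separators into one
        symbols.append(sym)
    return "".join(symbols)


def zero_memory_entropy(symbols):
    """sum_i p_i log2(1/p_i), with p_i estimated from letter counts."""
    counts = Counter(symbols)
    n = len(symbols)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def first_order_entropy(symbols):
    """sum_{i,k} p_{i,k} log2(1/p_{i|k}), with k the preceding symbol."""
    pair_counts = Counter(zip(symbols, symbols[1:]))
    prev_counts = Counter(symbols[:-1])
    total_pairs = len(symbols) - 1
    h = 0.0
    for (prev, _cur), c in pair_counts.items():
        p_joint = c / total_pairs
        p_cond = c / prev_counts[prev]
        h += p_joint * math.log2(1.0 / p_cond)
    return h


sample = normalize(
    "Repetitive traffic over communication links can in general be "
    "classified broadly into three categories"
)
print(f"zero-memory estimate: {zero_memory_entropy(sample):.2f} bits/letter")
print(f"first-order estimate: {first_order_entropy(sample):.2f} bits/letter")
```

With equally likely symbols the zero-memory formula reduces to log2(27), the 4.75 bits/letter of the zeroth approximation; reliable first-order figures need a far larger corpus than this one sentence.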

51
IANKS _CAN_OU_ANG_RLER_THATTED _OF_TO_SHOR_OF_TO_HAVEMEM_A_I_ MAND_AND_BUT_WHISSITABLY_THERV EREER_EIGHTS_TAKILLIS_TA_KIND_AL
Third approximation to English (memory)
[Second-order Markov: e.g. prob(a|aa), prob(a|ab), prob(a|ac), ..., prob(z|zy), prob(z|zz); 27 x 27 x 27 = 19683 numbers, ~75% zero] (Shannon calls these “di-gram” probabilities)
Entropy: ~3 bits/letter

52
JOU_MOUPLAS_DE_MONNERNAISSAI NS_DEME_US_VREH_BRETU_DE_TOU CHEUR_DIMMERE_LLES_MAR_ELAME _RE_A_VER_IL_DOUVENTS_SO_FUITE Third approximation to French N. Abramson “Information Theory & Coding”

53
ET_LIGERCUM_SITECI_LIBEMUS_AC ERELEN_TE_VICAESCERUM_PE_NON _SUM_MINUS_UTERNE_UT_IN_ARION _POPOMIN_SE_INQUENEQUE_IRA Third approximation to ???? N. Abramson “Information Theory & Coding”

54
WE COULD CONTINUE THIS WITH CONDITIONAL PROBABILITIES GIVEN TRIPLETS (tri-grams), QUADRUPLETS (tetra-grams), ..., n-grams, etc. (i.e. m-th ORDER MARKOV SOURCES, m ≥ 3)
HOWEVER, THIS BECOMES IMPRACTICAL AS THE NUMBER OF JOINT PROBABILITIES BECOMES TOO LARGE, SO SHANNON JUMPED TO MARKOV SOURCES WITH WORDS AS SYMBOLS: the symbol set is no longer 27 characters, but thousands of words. However, the m = 1, 2 Markov model with words gives much better results than n-gram analysis as “n” is increased

55
REPRESENTING AND SPEEDILY IS AN GOOD APT OR COME CAN DIFFERENT NATURAL HERE HE THE A IN CAME THE TO OF TO EXPERT GRAY COME TO FURNISHES THE LINE MESSAGE HAD BE THESE … Fourth approximation to English [Zero order Markov with words : e.g. Probability of words, zero memory] (Shannon 1948) Entropy = ~ 2.2 bits / letter (using Zipf’s Law)

56
THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN… Fifth approximation to English (memory) [First order Markov with words : e.g. Probability (word i | word j )] (Shannon 1948)

57
BIR ANLATTIKLARINA GŰLMECE YAZDI YAPITLARININ ŞARAP BİÇİMLERİ BELA GÖRŰNŰMŰ GİBİ GİBİ AMA BİR ETMEK YOK TUTULDU GELEN GİDEN GİDEN YER KALMADI KALMADI... Fifth approximation to Turkish (memory) [First order Markov with words : e.g. Probability (word i | word j )]

58
A QUICK REFRESHER ON CONVENTIONAL COMPRESSION

59
Conventional Compression
- Lossy compression: the output is not necessarily a copy of the input. Most audio, image and video compression algorithms are “lossy”; our ears and eyes have resolution thresholds
- Loss-less compression: data integrity is essential in digital data communications, so network compression must be “loss-less”
- Two basic approaches:
  - Statistical compression algorithms
  - Substitutional compression algorithms

60
Statistical compression: the probabilities of characters in the input data are calculated (or given), and frequently occurring characters are encoded into fewer bits [e.g. Huffman code, Morse code]
- Static coding: once the coding is determined in accordance with the probabilities of occurrence, it does not change
- Dynamic coding: the coding changes with “context”. For example, the occurrence of “q” in English increases the probability of occurrence of “u” to 1; similarly, the occurrence of “th” significantly increases the probability of occurrence of “e”, etc.
- As the amount of “historical context” information increases, “dynamic coding” techniques can approach the “Shannon limit”; however, computational requirements increase exponentially, making them impractical for real-time/on-line applications
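Static statistical coding can be illustrated with a small Huffman coder, one of the examples named above. This is a minimal sketch: symbol probabilities are taken from the input itself, and the function name `huffman_code` and the sample string are illustrative assumptions.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Build a static prefix code: frequent symbols get shorter bit strings."""
    counts = Counter(text)
    if len(counts) == 1:
        # Degenerate input with a single distinct symbol.
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie-breaker, {symbol: code so far}).
    heap = [(count, index, {symbol: ""})
            for index, (symbol, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, codes1 = heapq.heappop(heap)   # two least probable subtrees
        w2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


message = "aaaaaaaabbbc"
code = huffman_code(message)
encoded = "".join(code[ch] for ch in message)
print(code)
print(f"{len(message) * 8} bits as 8-bit characters, {len(encoded)} bits Huffman coded")
```

The code table is fixed once it is built, which is what makes this “static”; a dynamic coder would instead adjust its probabilities according to the preceding context.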

61
Substitutional compression: identifies repeated strings of characters (the longer the better) and replaces them with reference identifiers or tokens (the shorter the better). At the receiver the tokens are de-referenced and the reverse substitution is performed
- Essentially a form of “pattern recognition” and classification
- Pattern detection/recognition is generally much faster than the computations needed for dynamic coding algorithms
- Most network compression techniques in use today use substitutional compression
- Compression techniques can also be combined, for example substitution-based compression followed by static coding, etc.

62
- “Substitution”-based compression is the basis of almost all network compression implementations
- Principle of all: replace repeated patterns with shorter tokens
- Different techniques for detecting/encoding repeated patterns
- Two basic approaches:
  - Lempel-Ziv (LZ) “stateless” window compression, e.g. v.42bis, fax compression, LZS (STAC)
  - Predictor compression: tries to predict the next input byte. The matching algorithm looks for the most recent match of any pattern rather than the best and longest match: higher speed, but it misses many significant pattern repetitions and therefore gives lower data reduction (not much used)

63
Lempel-Ziv (LZ) “stateless” window compression
- Published in 1977 (hence LZ77)
- Basis of ~all loss-less data compression implementations today
- Repeated “strings” are replaced by “pointers” to the previous location where the string occurred
- A buffer or “window” is required for the “historical” information to be available for reference: typically 1000 – 2000 bytes (mostly 2048 bytes)
- All previous data outside the buffer/window is lost or “forgotten”, hence the name “stateless” or memory-less
- Can find and compress only patterns that are repeated within the window; repetitions separated by more than the window size are ignored
- Poor scalability: for compression efficiency a large window size is required, but this increases pattern search computation significantly
- Good for “file compression” type applications
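The window limitation can be demonstrated with a deliberately naive sliding-window coder in the LZ77 style. This is a teaching sketch under stated assumptions, not any product's implementation: the brute-force search, the token format and the names `lz77_compress` and `lz77_decompress` are all invented for the illustration, with the 2048-byte window taken from the slide.

```python
def lz77_compress(data, window=2048, min_match=3, max_match=255):
    """Greedy sliding-window coder: emits ('lit', byte) or ('copy', distance, length)."""
    out = []
    i = 0
    while i < len(data):
        start = max(0, i - window)          # anything older is "forgotten"
        best_len, best_dist = 0, 0
        for j in range(start, i):           # brute-force search of the window
            length = 0
            while (length < max_match and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_match:
            out.append(("copy", best_dist, best_len))
            i += best_len
        else:
            out.append(("lit", data[i]))
            i += 1
    return out


def lz77_decompress(tokens):
    buf = bytearray()
    for token in tokens:
        if token[0] == "lit":
            buf.append(token[1])
        else:
            _, dist, length = token
            for _ in range(length):         # byte by byte, so overlapping copies work
                buf.append(buf[-dist])
    return bytes(buf)


pattern = b"NATO-HEADER-0123"
filler = bytes((7 * i + 3) % 251 for i in range(3000))

near = pattern + b"x" * 100 + pattern       # repeat lies inside the window
far = pattern + filler + pattern            # repeat lies 3016 bytes back

near_tokens = lz77_compress(near)
far_tokens = lz77_compress(far)
print("near repeat, last token:", near_tokens[-1])
print("far repeat, last token: ", far_tokens[-1])
```

In the first case the second copy of the pattern is replaced by a single pointer; in the second it lies beyond the 2048-byte window and has to be sent again as literals, which is the gap that a persistent dictionary is meant to close.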

65
Nov 1978, University of Pennsylvania, Museum Hall: banquet in honor of Claude E. Shannon receiving the H. Pender award (Prof. F. Haber & DY)
