End-to-End Data Outline Presentation Formatting Data Compression.

1 End-to-End Data
Outline
–Presentation Formatting
–Data Compression

2 Problem
Making sure that the sender and receiver see the same data is often called the presentation format problem.
The efficiency of the encoding involves error detection/correction and data compression.

3 Presentation Formatting
The transformation of data from the representation used by the application into a form suitable for transmission over a network is called presentation formatting.
The sending program encodes data into a message, and the receiving application decodes the message back into data.
Encoding is sometimes called argument marshalling, and decoding is called unmarshalling.

4 Presentation Formatting
Data types we consider
–integers
–floats
–strings
–arrays
–structs
Types of data we do not consider
–images
–video
–multimedia documents
[Figure: Application data -> Presentation encoding -> Message -> Presentation decoding -> Application data]

5 Difficulties
Representation of base types
–floating point: IEEE 754 versus non-standard
–integer: big-endian versus little-endian (e.g., 34,677,374)
Compiler layout of structures
[Figure: byte order for the integer 34,677,374, listed from low address to high address]
Big-endian: (2)(17)(34)(126) = 00000010 00010001 00100010 01111110
Little-endian: (126)(34)(17)(2) = 01111110 00100010 00010001 00000010
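The byte-order difference can be checked directly. The following is a minimal Python sketch using the integer from the slide; the variable names are illustrative only.

```python
# Byte order of the 32-bit integer 34,677,374 (0x0211227E).
value = 34_677_374

# Bytes listed from low address to high address.
big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

print(list(big))     # [2, 17, 34, 126]
print(list(little))  # [126, 34, 17, 2]

# Reading the bytes with the wrong byte order yields a different integer,
# which is why the two ends must agree on a representation.
misread = int.from_bytes(big, byteorder="little")
print(misread == value)  # False
```

The same four bytes are present in both cases; only their order in memory differs.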

6 Taxonomy
Data types
–base types (e.g., ints, floats); must convert
–flat types (e.g., structures, arrays); must pack
–complex types (e.g., pointers); must linearize
Conversion strategy
–canonical intermediate form
–receiver-makes-right (an N x N solution)
[Figure: the marshaller linearizes an application data structure]

7 Taxonomy (cont)
Tagged versus untagged data
[Figure: a tagged integer: type = INT, len = 4, value = 417892]
Stubs
–compiled
–interpreted
[Figure: a stub compiler reads the interface descriptor for procedure P and produces the client and server stub code; a call to P hands its arguments to the client stub, which marshals them into an RPC message; the server stub unmarshals the arguments and invokes P]
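To make the tagged/untagged distinction concrete, here is a rough Python sketch. The one-byte type code `TYPE_INT`, the one-byte length field, and the function names are all assumptions made for illustration; they do not correspond to any particular standard.

```python
import struct

TYPE_INT = 1  # hypothetical type code, chosen only for this example


def encode_tagged_int(value: int) -> bytes:
    """Tagged: 1-byte type, 1-byte length, then the 4-byte big-endian value."""
    return struct.pack("!BBi", TYPE_INT, 4, value)


def encode_untagged_int(value: int) -> bytes:
    """Untagged: only the 4-byte value; the receiver must already know the type."""
    return struct.pack("!i", value)


def decode_tagged(buf: bytes) -> int:
    """Use the tag to decide how to interpret the bytes that follow."""
    type_code, length = struct.unpack_from("!BB", buf, 0)
    payload = buf[2:2 + length]
    if type_code == TYPE_INT and length == 4:
        return struct.unpack("!i", payload)[0]
    raise ValueError(f"unsupported tag: type={type_code}, len={length}")


tagged = encode_tagged_int(417892)
untagged = encode_untagged_int(417892)
print(len(tagged), len(untagged))  # 6 4
print(decode_tagged(tagged))       # 417892
```

The tagged form costs extra bytes per item but is self-describing; the untagged form relies on both stubs having been generated from the same interface description.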

8 eXternal Data Representation (XDR)
Defined by Sun for use with SunRPC
C type system (without function pointers)
Canonical intermediate form
Untagged (except array length)
Compiled stubs

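9
    #include <rpc/rpc.h>   /* XDR, bool_t, xdr_int(), xdr_string(), xdr_array() */

    #define MAXNAME 256
    #define MAXLIST 100

    struct item {
        int  count;
        char name[MAXNAME];
        int  list[MAXLIST];
    };

    /* Encode or decode an item, depending on the direction set in xdrs. */
    bool_t
    xdr_item(XDR *xdrs, struct item *ptr)
    {
        /* xdr_string() and xdr_array() take the address of a pointer, */
        /* so point at the fixed-size arrays inside the struct. */
        char *name = ptr->name;
        char *list = (char *)ptr->list;

        return (xdr_int(xdrs, &ptr->count) &&
                xdr_string(xdrs, &name, MAXNAME - 1) &&
                xdr_array(xdrs, &list, (u_int *)&ptr->count, MAXLIST,
                          sizeof(int), (xdrproc_t)xdr_int));
    }

[Figure: XDR encoding of an example item: Count = 3; Name = length 7 followed by "JOHNSON"; List = length 3 followed by its three integer elements]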
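The wire format produced for such an item can be sketched by hand in Python. This assumes the standard XDR rules (4-byte big-endian integers, strings and variable-length arrays prefixed with a 4-byte length, strings padded with zero bytes to a multiple of 4). The helper names and the three list values are illustrative, not taken from a library.

```python
import struct


def xdr_int(value: int) -> bytes:
    """A signed integer is 4 bytes, big-endian."""
    return struct.pack(">i", value)


def xdr_string(data: bytes) -> bytes:
    """A string is its length, its bytes, then zero padding to a 4-byte boundary."""
    padding = (-len(data)) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def xdr_int_array(values: list[int]) -> bytes:
    """A variable-length array is its element count followed by each element."""
    return struct.pack(">I", len(values)) + b"".join(xdr_int(v) for v in values)


def encode_item(count: int, name: bytes, values: list[int]) -> bytes:
    """Mirror the field order of struct item: count, name, list."""
    return xdr_int(count) + xdr_string(name) + xdr_int_array(values)


# Illustrative values: count 3, the name JOHNSON, and three integers.
message = encode_item(3, b"JOHNSON", [497, 8321, 265])
print(len(message))    # 32
print(message[:16])    # count, name length, then "JOHNSON" plus one pad byte
```

Apart from the string and array lengths, nothing in the message says what type each field is, which is what "untagged" means here.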

10 Abstract Syntax Notation One (ASN.1)
An ISO standard
Essentially the C type system
Canonical intermediate form
Tagged
Compiled or interpreted stubs
BER: Basic Encoding Rules (tag, length, value)
[Figure: a BER-encoded value as a (type, length, value) triple; the value field may itself contain nested (type, length, value) triples]
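As a small illustration of the (tag, length, value) idea, the sketch below encodes an integer the way BER does for the universal INTEGER type: tag byte 0x02, a length, then the two's-complement value in as few bytes as possible. It handles only the short length form (under 128 content bytes), and the function names are made up for this example.

```python
INTEGER_TAG = 0x02  # BER universal tag for INTEGER


def ber_encode_integer(value: int) -> bytes:
    """Encode an integer as tag, length, value (short-form length only)."""
    # Bits needed for two's complement: magnitude bits plus one sign bit.
    magnitude = value if value >= 0 else ~value
    length = (magnitude.bit_length() + 1 + 7) // 8
    content = value.to_bytes(length, byteorder="big", signed=True)
    if len(content) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([INTEGER_TAG, len(content)]) + content


def ber_decode_integer(buf: bytes) -> int:
    """Decode a short-form BER INTEGER produced by ber_encode_integer."""
    tag, length = buf[0], buf[1]
    if tag != INTEGER_TAG:
        raise ValueError(f"expected INTEGER tag, got {tag:#04x}")
    return int.from_bytes(buf[2:2 + length], byteorder="big", signed=True)


encoded = ber_encode_integer(417892)
print(encoded.hex())                # 0203066064
print(ber_decode_integer(encoded))  # 417892
```

Because every item carries its own tag and length, a receiver can walk through a message without knowing its structure in advance.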

11 Network Data Representation (NDR)
Defined by DCE
Essentially the C type system
Receiver-makes-right (architecture tag)
Individual data items untagged
Compiled stubs from IDL
4-byte architecture tag
–IntegerRep: 0 = big-endian, 1 = little-endian
–CharRep: 0 = ASCII, 1 = EBCDIC
–FloatRep: 0 = IEEE 754, 1 = VAX, 2 = Cray, 3 = IBM
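Receiver-makes-right can be sketched as follows: the sender writes integers in its own byte order and announces that order in a header, and the receiver converts only if it has to. The header here is a simplifying assumption (the first byte carries IntegerRep and the remaining three bytes are zero); it is not the actual DCE bit layout, and `send` and `receive` are hypothetical names.

```python
import struct

BIG_ENDIAN, LITTLE_ENDIAN = 0, 1  # IntegerRep values listed on the slide


def send(values: list[int], integer_rep: int) -> bytes:
    """Sender encodes in its native order and says so in a 4-byte tag."""
    order = ">" if integer_rep == BIG_ENDIAN else "<"
    tag = bytes([integer_rep, 0, 0, 0])  # simplified, hypothetical tag layout
    return tag + struct.pack(f"{order}{len(values)}i", *values)


def receive(message: bytes) -> list[int]:
    """Receiver reads the tag and interprets the body accordingly."""
    integer_rep = message[0]
    order = ">" if integer_rep == BIG_ENDIAN else "<"
    body = message[4:]
    return list(struct.unpack(f"{order}{len(body) // 4}i", body))


data = [34_677_374, -5, 417892]
from_big = send(data, BIG_ENDIAN)
from_little = send(data, LITTLE_ENDIAN)

# The bytes on the wire differ, but the receiver recovers the same values.
print(from_big[4:] != from_little[4:])                    # True
print(receive(from_big) == receive(from_little) == data)  # True
```

When both machines share a representation, no conversion happens at all, which is the appeal of this strategy over a canonical intermediate form.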

12 Compression Overview
Encoding and Compression
–Huffman codes
Lossless
–data received = data sent
–used for executables, text files, numeric data
Lossy
–data received != data sent
–used for images, video, audio
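The two properties can be demonstrated in a few lines of Python. The lossless half uses the standard `zlib` module as a stand-in for any lossless compressor; the lossy half is a toy quantizer that keeps only the top four bits of each sample. The input data is made up for the example.

```python
import zlib

text = b"AAABBCDDDD" * 100  # illustrative input

# Lossless: decompressing returns exactly the bytes that were compressed.
compressed = zlib.compress(text)
restored = zlib.decompress(compressed)
print(restored == text, len(text), len(compressed))

# Lossy (toy example): keep only the top 4 bits of each 8-bit sample.
samples = [3, 18, 77, 200, 255]              # hypothetical 8-bit samples
quantized = [s >> 4 for s in samples]        # what would be transmitted
reconstructed = [q << 4 for q in quantized]  # what the receiver rebuilds
print(reconstructed)                         # [0, 16, 64, 192, 240]
```

The reconstructed samples are close to the originals but not equal to them, which is acceptable for images, video, and audio but not for executables or text.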

13 Huffman Codes
Huffman coding [1952] can be used as a reasonable approximation to the theoretical limit.
1. Write down the symbols and their probabilities: A = .50, B = .30, C = .15, D = .05. They are the terminal nodes.
2. Find and mark the two smallest unmarked nodes. Add a node with arcs to the nodes marked.
3. Set the probability of the new node to the sum of the marked nodes.
4. Repeat steps 2 and 3 until all nodes have been marked except one, the root.
5. The encoding is found by tracing the path from the root to the symbol, with left = 0, right = 1.

14 Huffman Codes
              ()
             /  \
            /    \1
           /      \
         0/        ()
         /        /  \
        /       0/    \1
       /        /      \
      /        /        ()
     /        /        /  \
    /        /       0/    \1
   /        /        /      \
(A)      (B)      (C)        (D)
.5       .3       .15        .05
0        10       110        111
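The construction can be sketched in Python with a priority queue. This is one possible implementation, not the only one: the function name is made up, integer weights (percentages) stand in for the probabilities to avoid floating-point ties, and the choice of which child gets 0 or 1 depends on tie-breaking, so the bit patterns may be a mirror image of the tree above while the code lengths are the same.

```python
import heapq
from itertools import count


def huffman_codes(weights: dict[str, int]) -> dict[str, str]:
    """Build a Huffman code by repeatedly merging the two smallest nodes."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(w, next(tiebreak), sym) for sym, w in weights.items()]
    heapq.heapify(heap)

    while len(heap) > 1:
        w1, _, smallest = heapq.heappop(heap)
        w2, _, second = heapq.heappop(heap)
        # New internal node: (left child, right child), weight = sum of both.
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (second, smallest)))

    _, _, root = heap[0]
    codes: dict[str, str] = {}

    def walk(node, prefix: str) -> None:
        if isinstance(node, tuple):      # internal node
            walk(node[0], prefix + "0")  # left = 0
            walk(node[1], prefix + "1")  # right = 1
        else:                            # leaf: a symbol
            codes[node] = prefix or "0"

    walk(root, "")
    return codes


weights = {"A": 50, "B": 30, "C": 15, "D": 5}  # probabilities as percentages
codes = huffman_codes(weights)
lengths = {sym: len(code) for sym, code in codes.items()}
average_bits = sum(weights[s] * lengths[s] for s in weights) / 100
print(lengths)       # {'B': 2, 'C': 3, 'D': 3, 'A': 1}
print(average_bits)  # 1.7
```

With these probabilities the average is 1.7 bits per symbol, compared with 2 bits for a fixed-length code over four symbols.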

15 Lossless Algorithms
Run Length Encoding (RLE)
–Replace consecutive occurrences of a given symbol with only one copy of the symbol, plus a count of how many times that symbol occurs.
–example: AAABBCDDDD is encoded as 3A2B1C4D
–good for scanned text (8-to-1 compression ratio)
–can increase size for data with variation (e.g., some images)
Differential Pulse Code Modulation (DPCM)
–First output a reference symbol; then, for each symbol in the data, output the difference between that symbol and the reference symbol.
–example: AAABBCDDDD is encoded as A0001123333
–change the reference symbol if the delta becomes too large
–works better than RLE for many digital images (1.5-to-1)
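Both schemes are short enough to sketch in Python. The function names are illustrative. The DPCM sketch keeps a single reference symbol (it does not change the reference when the delta grows) and assumes non-empty input.

```python
from itertools import groupby


def rle_encode(text: str) -> str:
    """Replace each run of a symbol with its count followed by the symbol."""
    return "".join(f"{len(list(run))}{symbol}" for symbol, run in groupby(text))


def dpcm_encode(text: str) -> tuple[str, list[int]]:
    """Return the reference symbol and each symbol's difference from it."""
    reference = text[0]
    return reference, [ord(ch) - ord(reference) for ch in text]


def dpcm_decode(reference: str, deltas: list[int]) -> str:
    """Rebuild the symbols by adding each difference back to the reference."""
    return "".join(chr(ord(reference) + d) for d in deltas)


data = "AAABBCDDDD"
print(rle_encode(data))               # 3A2B1C4D

reference, deltas = dpcm_encode(data)
print(reference + "".join(map(str, deltas)))  # A0001123333
print(dpcm_decode(reference, deltas) == data)  # True

# RLE can expand data that has no runs.
print(rle_encode("ABCD"))             # 1A1B1C1D
```

The last line shows the drawback noted above: with no repeated symbols, the RLE output is twice as long as the input.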

16 Dictionary-Based Methods
Build a dictionary of variable-length strings of common terms
Transmit an index into the dictionary for each term
–For example, replace 'compression' with 9293; 'compression' is the 9293rd entry in /usr/share/dict/words.
Lempel-Ziv (LZ)
–the compress command is the best-known example
–commonly achieves a 2-to-1 ratio on text
Variation of LZ used to compress GIF images
–first reduce 24-bit color to 8-bit color
–treat common sequences of pixels as terms in the dictionary
–not uncommon to achieve 10-to-1 compression (x 3)
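One member of the Lempel-Ziv family, LZW, builds its dictionary adaptively while scanning the input, so no dictionary has to be sent separately. The sketch below is a simplified illustration: it emits a list of integer codes rather than packed bits, assumes every character has a code point below 256, and uses made-up function names and sample text.

```python
def lzw_compress(text: str) -> list[int]:
    """Emit dictionary indices, adding each new string to the dictionary."""
    dictionary = {chr(i): i for i in range(256)}  # start with single characters
    current = ""
    output = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                   # keep extending the match
        else:
            output.append(dictionary[current])    # emit longest known string
            dictionary[candidate] = len(dictionary)
            current = ch
    if current:
        output.append(dictionary[current])
    return output


def lzw_decompress(codes: list[int]) -> str:
    """Rebuild the same dictionary on the fly while decoding."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    previous = dictionary[codes[0]]
    pieces = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == len(dictionary):             # code defined by this very step
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid code: {code}")
        pieces.append(entry)
        dictionary[len(dictionary)] = previous + entry[0]
        previous = entry
    return "".join(pieces)


sample = "compression of compressible data compresses well " * 4
codes = lzw_compress(sample)
print(len(sample), len(codes))          # fewer codes than input characters
print(lzw_decompress(codes) == sample)  # True
```

Repeated substrings are replaced by a single index once they have entered the dictionary, which is why repetitive text compresses well.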

17 Image Compression
JPEG (Joint Photographic Experts Group) is an ISO/IEC group of experts that develops and maintains standards for a suite of compression algorithms for computer image files.
JPEG is also a term for any graphic image file produced by using a JPEG standard.
A JPEG file is created by choosing from a range of compression qualities (actually, from one of a suite of compression algorithms).
Lossy still-image compression

18 MPEG
The Moving Picture Experts Group (MPEG) develops standards for digital video and digital audio compression.
Lossy compression of video
First approximation: JPEG on each frame
Also remove inter-frame redundancy

