Data.


Presentation on theme: "Data."— Presentation transcript:

1 Data

2 Characteristics: Location, Data type, Structure, Size

3 Characteristics: Location (where it is stored)
Global/static: location set at compile time. Automatic variables: location set at runtime (offset from stack position).
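
A minimal C sketch of the two storage locations described above. The function names are hypothetical, chosen only for illustration; the point is that a static variable keeps one fixed location for the whole run, while an automatic variable gets a fresh slot in each call's stack frame.

```c
#include <assert.h>

/* Static storage: a single copy at a location fixed before the program
   runs, so the value persists between calls. */
int count_with_static(void) {
    static int calls = 0;
    return ++calls;
}

/* Automatic storage: a new copy in each call's stack frame, so the
   value starts from zero on every call. */
int count_with_automatic(void) {
    int calls = 0;
    return ++calls;
}

/* Calling each function twice shows the difference. */
void storage_demo(void) {
    assert(count_with_static() == 1);
    assert(count_with_static() == 2);    /* remembered the first call */
    assert(count_with_automatic() == 1);
    assert(count_with_automatic() == 1); /* no memory of earlier calls */
}
```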

4 Characteristics: Data type
Defines acceptable values and their interpretation: char, integer, float, Boolean, pointer/reference.

5 Characteristics: Structure (contents and layout)
Primitive, Array, Record.

6 Characteristics: Size
Memory consumed. Fixed/Variable. Bounds: Lower, Upper.

7 Primitives: Integers
Usually a range of integer values corresponding to the number of bits used to store the value. C: char, short, int, long, long long. But some languages support arbitrary-sized integers. Java: BigInteger. Python: int (long in Python 2).

8 Primitives: Integers
C supports signed and unsigned integers. Java supports only signed integers (apart from the 16-bit char). This limitation is at the JVM level, so languages like Clojure and Scala have the same restriction.

9 Primitives: Integers
Most hardware represents signed integers using twos-complement notation: any integer whose most significant bit is a one is considered negative. To invert the sign of a number, invert all of the bits and add one. For example, in eight bits, 11101111b is -17; inverting gives 00010000b, and 00010000b + 1 = 00010001b, which is 17. The advantage is that the same adder logic can be used for all integers.
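
The invert-and-add-one rule can be sketched in C on raw 8-bit patterns. The width of eight bits and the helper names (negate8, as_signed8, add8) are assumptions made for this illustration, not something the slides specify.

```c
#include <assert.h>
#include <stdint.h>

/* Negate an 8-bit twos-complement pattern: invert every bit, add one.
   The cast keeps only the low eight bits of the result. */
uint8_t negate8(uint8_t bits) {
    return (uint8_t)(~bits + 1u);
}

/* Interpret an 8-bit pattern as a signed value: a set most significant
   bit means the pattern stands for bits - 256. */
int as_signed8(uint8_t bits) {
    return (bits & 0x80u) ? (int)bits - 256 : (int)bits;
}

/* Plain unsigned addition, truncated to eight bits. The same operation
   gives the right answer when the patterns are read as signed values. */
uint8_t add8(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}

void twos_complement_demo(void) {
    assert(as_signed8(0xEF) == -17);           /* 11101111b */
    assert(negate8(0xEF) == 0x11);             /* 00010001b, which is 17 */
    assert(as_signed8(add8(0xEF, 0x14)) == 3); /* -17 + 20 */
}
```

Because add8 never looks at the sign, it mirrors the claim that one adder circuit can serve both signed and unsigned integers.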

10 Primitives: Floating Point
A means of approximating real numbers in a fixed amount of space. IEEE 754 defines 16-, 32-, 64-, and 128-bit floats.

11 Primitives: Floating Point
Sign bit, significand, exponent. For normalized values, the significand is interpreted as having an assumed one as its most significant bit. The granularity is a function of the magnitude of the value.

12 Primitives: Floating Point
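Take a 16-bit float as an example. For numbers from 256 up to 512, representable values are spaced 0.25 apart, and the spacing doubles at each higher power of two. Consider 300: its nearest neighbours are 299.75 and 300.25.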
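
The granularity claim can be checked with a small calculation. This is a sketch under stated assumptions: the IEEE 754 16-bit format stores 10 significand bits, the input is positive and inside the format's normal range, and half_spacing is a hypothetical helper name. It computes the spacing arithmetically in double precision rather than using a real 16-bit float type.

```c
#include <assert.h>

/* Gap between adjacent 16-bit float values near x. Assumes x is
   positive and in the format's normal range. With 10 stored significand
   bits the gap is 2^-10 for x in [1, 2) and doubles with each doubling
   of x. */
double half_spacing(double x) {
    assert(x > 0.0);
    double spacing = 1.0 / 1024.0;  /* 2^-10 */
    while (x >= 2.0) { x /= 2.0; spacing *= 2.0; }
    while (x < 1.0)  { x *= 2.0; spacing /= 2.0; }
    return spacing;
}
```

For 300 this returns 0.25, so a value such as 300.1 cannot be stored and would have to round to a multiple of 0.25.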

13 Primitives: Decimal
IBM mainframes provided efficient operations. PL/I and COBOL provided primitives. Binary-Coded Decimal: either eight bits or four bits per digit.

14 Primitives: Decimal
Advantage: accurate representation of money. Binary floating-point representations can't represent 0.1 exactly.
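
A small C sketch of why this matters for money. C has no built-in decimal type, so the second function uses whole cents in an integer as a stand-in for a decimal representation; both function names are hypothetical.

```c
#include <assert.h>

/* Add one dime (0.10) `count` times using binary floating point. */
double add_dimes_double(int count) {
    assert(count >= 0);
    double total = 0.0;
    for (int i = 0; i < count; i++) total += 0.1;
    return total;
}

/* The same sum held as a whole number of cents, which is exact. */
long add_dimes_cents(int count) {
    assert(count >= 0);
    long total = 0;
    for (int i = 0; i < count; i++) total += 10;
    return total;
}
```

On a typical platform with IEEE 754 doubles, ten dimes added with add_dimes_double do not compare equal to 1.0, while add_dimes_cents(10) is exactly 100.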

15 Primitives: Character
Historically, an unsigned 8-bit value. Some character sets and protocols only supported seven-bit characters, for example SMTP (Simple Mail Transfer Protocol).

16 Primitives: Character
Eight-bit characters are not adequate for representing all of the world's character sets. Unicode provides code points for many languages. It is better to think of a Unicode encoding as applying to the entire string. The most popular, UTF-8, uses a different number of bytes per code point depending on its value: English (7-bit ASCII), one byte; Korean, three bytes.
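
The variable width of UTF-8 follows directly from the code point's value. A minimal sketch, with utf8_length as a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes UTF-8 uses for one code point, or 0 if the value is
   not a valid Unicode scalar value. */
int utf8_length(uint32_t code_point) {
    if (code_point < 0x80) return 1;       /* 7-bit ASCII */
    if (code_point < 0x800) return 2;
    if (code_point >= 0xD800 && code_point <= 0xDFFF) return 0; /* surrogates */
    if (code_point < 0x10000) return 3;    /* includes Hangul syllables */
    if (code_point <= 0x10FFFF) return 4;
    return 0;
}

void utf8_demo(void) {
    assert(utf8_length(0x41) == 1);    /* 'A' */
    assert(utf8_length(0xD55C) == 3);  /* a Hangul syllable */
    assert(utf8_length(0x1F600) == 4); /* an emoji */
}
```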

17 Primitives: Integer Subranges
Allows you to specify the minimum and maximum value of an integer. Pascal provided this: Type T = 0..51; From a type theory perspective, these are a bit problematic. We usually like to think of integer types as closed under addition, but the sum of two variables of type T should be stored in a bigger type: Type TT = 0..102; In general, these types require runtime checks to be maintained. This is a really simple version of a dependent type (about which we may say more later).
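
C has no subrange types, so the runtime check a Pascal compiler would generate has to be written by hand. The sketch below is one way to emulate it; T_MIN, T_MAX, in_range_t, and add_t are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum { T_MIN = 0, T_MAX = 51 };  /* mirrors Type T = 0..51 */

bool in_range_t(int value) {
    return value >= T_MIN && value <= T_MAX;
}

/* Add two T values. The sum can reach 102, outside T, so the result is
   stored and success reported only if it still fits in T. */
bool add_t(int a, int b, int *sum) {
    assert(in_range_t(a) && in_range_t(b));
    int wide = a + b;
    if (!in_range_t(wide)) return false;
    *sum = wide;
    return true;
}
```

Here add_t(20, 30, &s) succeeds with s set to 50, while add_t(30, 30, &s) fails and leaves s unchanged, which shows that T is not closed under addition.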

18 Primitives: Enumerations
A version of integer subranges that names each of the available values: enum workdays { Monday, Tuesday, Wednesday, Thursday, Friday }; They are implemented as an integer "under the hood". They are particularly useful for C's switch statement.
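
A short sketch of the enumeration driving a switch statement. The function days_after is a hypothetical example, not part of the slides.

```c
#include <assert.h>

enum workdays { Monday, Tuesday, Wednesday, Thursday, Friday };

/* Working days left in the week after `day`, or -1 for a value that is
   not one of the named constants. */
int days_after(enum workdays day) {
    switch (day) {
    case Monday:    return 4;
    case Tuesday:   return 3;
    case Wednesday: return 2;
    case Thursday:  return 1;
    case Friday:    return 0;
    }
    return -1;
}

void enum_demo(void) {
    assert(Monday == 0 && Friday == 4);  /* plain integers underneath */
    assert(days_after(Wednesday) == 2);
}
```

Many compilers can warn when a switch over an enum misses one of its constants, which is part of what makes the pairing useful.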

19 Complex Data: Strings, Arrays, Associative Arrays, Records, Unions

20 Strings
An ordered collection of characters. Options:
Mutable? C, C++: yes. Java, Python: no.
Size stored as metadata? C: no. C++: yes (std::string). Java: yes.
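
Both options can be seen in a few lines of C. The struct counted_string and its helper make_counted are hypothetical, shown only to contrast with the standard NUL-terminated form, which stores no length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A C string carries no length; strlen walks to the terminating NUL. */
size_t c_length(const char *text) {
    return strlen(text);
}

/* Hypothetical counted string: the size travels with the data. */
struct counted_string {
    size_t length;
    const char *bytes;
};

struct counted_string make_counted(const char *text) {
    struct counted_string s = { strlen(text), text };
    return s;
}

void string_demo(void) {
    char word[] = "data";            /* four characters plus the NUL */
    assert(sizeof word == 5);
    word[0] = 'D';                   /* mutable: changed in place */
    assert(strcmp(word, "Data") == 0);
    assert(c_length(word) == 4);
    assert(make_counted(word).length == 4);
}
```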

21 Strings in C

22 Strings in Java

23 Arrays
Collection of one or more data elements. Dimensions may be fixed or dynamic. Fixed dimensions may be known at compile time or at runtime. In C99, a function may declare an array whose size is computed from the function's parameters.
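
A minimal sketch of the C99 feature mentioned above, a variable-length array. The function sum_of_squares is a hypothetical example. Note that variable-length arrays became an optional feature in C11, so this assumes a compiler that supports them.

```c
#include <assert.h>

/* Sum of the first n squares, using an array whose size is taken from
   the parameter n when the function is called. */
long sum_of_squares(int n) {
    assert(n > 0);
    long squares[n];  /* size fixed at runtime, storage is automatic */
    for (int i = 0; i < n; i++) squares[i] = (long)(i + 1) * (i + 1);
    long total = 0;
    for (int i = 0; i < n; i++) total += squares[i];
    return total;
}
```

The array's dimension is fixed once the function is entered; it cannot grow afterwards, which distinguishes it from a dynamic array.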

24 Dynamic Arrays
Grow as needed. Two implementations:
Contiguous memory with resize: C++ std::vector.
Segmented: C++ std::deque.
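
The contiguous-with-resize strategy can be sketched in C. This is an illustration of the general idea only, not how any particular std::vector is implemented; the struct int_array, its functions, the starting capacity of 4, and the doubling growth factor are all assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct int_array {
    int *items;       /* one contiguous block */
    size_t size;      /* elements in use */
    size_t capacity;  /* elements the block can hold */
};

/* Append a value, moving to a block twice as large when full.
   Returns false if memory could not be obtained. */
bool int_array_push(struct int_array *a, int value) {
    assert(a != NULL);
    if (a->size == a->capacity) {
        size_t new_capacity = a->capacity ? a->capacity * 2 : 4;
        int *moved = realloc(a->items, new_capacity * sizeof *moved);
        if (moved == NULL) return false;
        a->items = moved;
        a->capacity = new_capacity;
    }
    a->items[a->size++] = value;
    return true;
}

void int_array_free(struct int_array *a) {
    assert(a != NULL);
    free(a->items);
    a->items = NULL;
    a->size = a->capacity = 0;
}
```

Doubling keeps the number of reallocations small: appending 100 values to an empty array triggers only six resizes in this sketch.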

25 Dynamic Arrays: std::vector

26 Dynamic Arrays: std::vector

27 Dynamic Arrays: std::vector

28 Dynamic Arrays: std::vector

29 Dynamic Arrays: std::vector

30 Dynamic Arrays: std::deque

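31 Multi-Dimensional Arrays
Guaranteed rectangular (solid). In C, int a[2][3] is laid out in memory row by row: 0,0 0,1 0,2 1,0 1,1 1,2. But in Fortran, the same array would be laid out column by column (shown with zero-based indices for comparison): 0,0 1,0 0,1 1,1 0,2 1,2.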
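
The two layouts differ only in the formula that maps an index pair to a position in memory. A sketch with hypothetical helper names, using zero-based indices for both orders:

```c
#include <assert.h>
#include <stddef.h>

/* Position of element (i, j) when rows are stored one after another,
   as C does. */
size_t row_major_offset(size_t i, size_t j, size_t rows, size_t cols) {
    assert(i < rows && j < cols);
    return i * cols + j;
}

/* Position of element (i, j) when columns are stored one after
   another, as Fortran does. */
size_t column_major_offset(size_t i, size_t j, size_t rows, size_t cols) {
    assert(i < rows && j < cols);
    return j * rows + i;
}
```

For a 2 by 3 array, element (1, 0) sits at position 3 in row-major order but at position 1 in column-major order.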

32 Arrays of Arrays

33 Arrays of Arrays

34 Arrays of Arrays

35 Associative Array
Also called key-value pairs. Any object can be used as a key, provided it supports what the implementation needs (an ordering or a hash).

36 Associative Array
C++ provides two versions.
std::map: requires that keys provide a < (less-than) operator; typically implemented with a red-black tree.
std::unordered_map: implemented with a hash table.

37 Record
A data structure composed of a fixed number of elements, each of which is at a known offset from the beginning of the structure, and which may be of different data types.

38 Record
In C, these are structs: struct data { char a; int b; short c; float d; double e; };

39 Record
In C, these are structs: struct data { char a; int b; short c; float d; double e; }; How much memory does this consume?

40 Record
In C, these are structs: struct data { char a; int b; short c; float d; double e; }; How much memory does this consume? Nominally: 1 + 4 + 2 + 4 + 8 = 19 bytes (with typical sizes for these types).

41 Record
In C, these are structs: struct data { char a; int b; short c; float d; double e; }; How much memory does this consume? But most architectures perform better on values that are aligned in memory according to their size.

42 Record
In C, these are structs: struct data { char a; int b; short c; float d; double e; }; How much memory does this consume? 24 bytes on a typical platform, once alignment padding is included!
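
The gap between the nominal and actual size can be measured with sizeof and offsetof. A sketch, with nominal_size, padding_bytes, and layout_demo as hypothetical helper names; the exact numbers depend on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>

struct data {
    char a;
    int b;
    short c;
    float d;
    double e;
};

/* Sum of the member sizes with no padding (the "nominal" size). */
size_t nominal_size(void) {
    return sizeof(char) + sizeof(int) + sizeof(short)
         + sizeof(float) + sizeof(double);
}

/* Bytes the compiler added to keep the members aligned. */
size_t padding_bytes(void) {
    return sizeof(struct data) - nominal_size();
}

void layout_demo(void) {
    assert(offsetof(struct data, a) == 0);   /* first member starts at 0 */
    assert(offsetof(struct data, b) >= sizeof(char));
    assert(sizeof(struct data) >= nominal_size());
}
```

On a common 64-bit platform the members land at offsets 0, 4, 8, 12, and 16, giving a size of 24 and 5 bytes of padding: three after a and two after c.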

43 Union types
In C, this is like a structure, but its elements overlap in memory: union U { float floatVal; int intVal; char charVal; };

44 Union types
In C, this is like a structure, but its elements overlap in memory: union U { float floatVal; int intVal; char charVal; }; Only consumes four bytes (size of largest member).

45 Union types
In C, this is like a structure, but its elements overlap in memory: union U { float floatVal; int intVal; char charVal; }; Only consumes four bytes (size of largest member). Declared as union U u; elements are accessed as u.floatVal or u.intVal.

46 Union types
In C, this is like a structure, but its elements overlap in memory: union U { float floatVal; int intVal; char charVal; }; Only consumes four bytes (size of largest member). Declared as union U u; elements are accessed as u.floatVal or u.intVal. No type checking is done! You can write as an integer and read as a float!
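
Writing one member and reading another looks like this in practice. The sketch assumes int and float are the same size and that float uses the IEEE 754 32-bit format, which holds on common platforms but is not guaranteed by C; int_bits_as_float is a hypothetical helper name.

```c
#include <assert.h>

union U {
    float floatVal;
    int intVal;
    char charVal;
};

/* Store a bit pattern through intVal and read the same bytes back
   through floatVal. Nothing checks that the pattern is a sensible
   float. */
float int_bits_as_float(int bits) {
    assert(sizeof(int) == sizeof(float));  /* assumption of this sketch */
    union U u;
    u.intVal = bits;
    return u.floatVal;
}
```

Under those assumptions the pattern 0x3F800000 reads back as 1.0f, because that is how IEEE 754 encodes one.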

47 Algebraic Data Types
Available in languages like ML, Haskell, etc. Based on building data types out of the operators + and *. A record of name, age, and favorite color would be String * integer * color.

48 Algebraic Data Types
They are useful for building structures without resorting to the use of null pointers. Consider a binary tree: data BinTree: | leaf | node(value :: Number, left :: BinTree, right :: BinTree) end

49 Algebraic Data Types
data BinTree: | leaf | node(value :: Number, left :: BinTree, right :: BinTree) end
Every element in the binary tree must be either a node or a leaf. The only way to access the value and left/right fields of a node version of a BinTree is through a test that ensures that it actually is a node rather than a leaf.

50 Algebraic Data Types
data BinTree: | leaf | node(value :: Number, left :: BinTree, right :: BinTree) end
Every element in the binary tree must be either a node or a leaf. The only way to access the value and left/right fields of a node version of a BinTree is through a test that ensures that it actually is a node rather than a leaf. No null-pointer exceptions! Ever!
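
For contrast, here is roughly how the same BinTree could be approximated in C with a tag field. This is only a sketch: the names tree_kind, bin_tree, and count_nodes are hypothetical, and unlike a language with algebraic data types, C does not force the tag to be tested before the fields are read, so the safety rests on convention.

```c
#include <assert.h>
#include <stddef.h>

enum tree_kind { LEAF, NODE };

struct bin_tree {
    enum tree_kind kind;           /* which variant this value is */
    double value;                  /* meaningful only when kind == NODE */
    const struct bin_tree *left;   /* meaningful only when kind == NODE */
    const struct bin_tree *right;  /* meaningful only when kind == NODE */
};

/* Count the nodes. The fields are read only after the kind test, which
   is the check an ADT language would enforce automatically. */
size_t count_nodes(const struct bin_tree *tree) {
    assert(tree != NULL);
    if (tree->kind == LEAF) return 0;
    return 1 + count_nodes(tree->left) + count_nodes(tree->right);
}
```

A leaf is represented by a real value with kind set to LEAF rather than by a null pointer, which is the part of the idea that carries over to C.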

