Languages and Compilers (SProg og Oversættere). Bent Thomsen, Department of Computer Science, Aalborg University. With acknowledgement to Elsa Gunter, whose slides this lecture is based on.

Presentation transcript:

1 Languages and Compilers (SProg og Oversættere) Bent Thomsen Department of Computer Science Aalborg University With acknowledgement to Elsa Gunter, whose slides this lecture is based on.

2 Type Checking When is op(arg1,…,argn) allowed? Type checking ensures that operations are applied to the right number of arguments of the right types. –The right type may mean the same type as was specified, or may mean that a predefined implicit coercion will be applied. Type checking is also used to resolve overloaded operations.

3 Type Checking Type checking may be done statically at compile time or dynamically at run time. Untyped languages (e.g. LISP, Prolog) do only dynamic type checking. Typed languages can do most type checking statically.

4 Dynamic Type Checking Performed at run-time before each operation is applied Types of variables and operations left unspecified until run-time – Same variable may be used at different types

5 Static Type Checking Performed after parsing, before code generation Type of every variable and signature of every operator must be known at compile time

6 Static Type Checking Can eliminate need to store type information in data object if no dynamic type checking is needed Catches many programming errors at earliest point

7 Strongly Typed Language When no application of an operator to arguments can lead to a run-time type error, language is strongly typed Depends on definition of “type”

8 Strongly Typed Language C is “strongly typed”, but type coercions may cause unexpected (undesirable) effects, and there are no array bounds checks (in fact, no runtime checks at all). SML is “strongly typed” but must still do dynamic array bounds checks and arithmetic overflow checks.

9 How to Handle Type Mismatches Type checking to refuse them Apply implicit function to change type of data –Coerce int into real –Coerce char into int

10 Conversion Between Types: Explicit: all conversions between different types must be specified Implicit: some conversions between different types implied by language definition – Implicit conversions called coercions

11 Coercion Examples Example in Pascal: var A: real; B: integer; A := B; –Implicit coercion: an automatic conversion from one type to another (here, from integer to real).

12 Coercions Versus Conversions When A has type real and B has type int, many languages allow the coercion implicit in A := B. In the other direction (A of type int, B of type real), often no coercion is allowed and an explicit conversion must be used: –A := round(B); goes to the integer nearest B –A := trunc(B); deletes the fractional part of B
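
The two directions can be sketched in Python rather than Pascal. The variable names below are made up for the illustration, and Python's built-in `round` rounds exact halves to the nearest even integer, which may differ from the Pascal `round` the slide has in mind.

```python
import math

# Widening direction: the int operand is implicitly converted to float.
b_int = 7
a_real = b_int + 0.5             # 7 becomes 7.0 before the addition

# Narrowing direction: the programmer must pick a conversion explicitly.
b_real = 2.7
nearest = round(b_real)          # go to the integer nearest b_real
truncated = math.trunc(b_real)   # delete the fractional part of b_real

print(a_real, nearest, truncated)
```

The point of the sketch is only that narrowing forces a choice between two different results, which is why many languages refuse to make it implicitly.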

13 Type Equality (aka Type Compatibility) When are two types “the same”? Name equivalence: two types equal only if they have the same name –Simple but restrictive –Usually loosened to allow two types to be equal when one is defined with the name of the other (declaration equivalence)

14 Type Equality Structure equivalence: Two types are equivalent if the underlying data structures for each type are the same –Problem: how far to go – are two records with the same number of fields of same type, but different labels equivalent?
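
A minimal sketch of how the two notions of equality from slides 13 and 14 might be checked, assuming record types are represented as a name plus an ordered tuple of (label, type name) pairs. `RecordType`, `name_equivalent`, `structurally_equivalent` and the `compare_labels` switch are hypothetical names introduced for the example; the switch models the open question of whether labels count.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RecordType:
    name: str
    fields: tuple  # ((label, type_name), ...) in declaration order

def name_equivalent(t1, t2):
    # Two types are equal only if they have the same name.
    return t1.name == t2.name

def structurally_equivalent(t1, t2, compare_labels=True):
    # Two types are equal if their underlying structure matches.
    if len(t1.fields) != len(t2.fields):
        return False
    for (label1, type1), (label2, type2) in zip(t1.fields, t2.fields):
        if type1 != type2:
            return False
        if compare_labels and label1 != label2:
            return False
    return True

point = RecordType("point", (("x", "real"), ("y", "real")))
complex_ = RecordType("complex", (("re", "real"), ("im", "real")))

print(name_equivalent(point, complex_))                                # different names
print(structurally_equivalent(point, complex_, compare_labels=False))  # same shape
print(structurally_equivalent(point, complex_, compare_labels=True))   # labels differ
```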

15 Elementary Data Types Data objects contain single data value with no components Standard elementary types include: integers, reals, characters, booleans, enumerations, pointers (references in SML)

16 Specification of Elementary Data Types Basic attributes of type usually used by compiler and then discarded Some partial type information may occur in data object Values usually match with hardware types: 8 bits, 16 bits, 32 bits, 64 bits Operations: primitive operations with hardware support, and user-defined operations built from primitive ones

17 Integers – Specification Range of integers from some fixed minint to some fixed maxint, typically $-2^{31}$ through $2^{31} - 1$ or $-2^{30}$ through $2^{30} - 1$. Standard collection of operators: +, -, *, /, mod, ~ (negation). Standard relational operations: =, <, >, <=, >=, =/=.

18 Integers - Implementation Implementation: –Binary representation in 2’s complement arithmetic –Three different standard representations [Figure: a word laid out as sign bit S (0 for +, 1 for -) followed by the binary integer data]
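
As a rough illustration of the fixed range and the 2's complement encoding, the following sketch computes minint and maxint for a given word size and converts between signed values and bit patterns. The function names are invented for the example.

```python
def int_range(bits):
    """Return (minint, maxint) for a 2's complement word of the given size."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1

def to_twos_complement(value, bits):
    """Encode a signed integer as an unsigned bit pattern of the given size."""
    low, high = int_range(bits)
    if not low <= value <= high:
        raise OverflowError(f"{value} does not fit in {bits} bits")
    return value & ((1 << bits) - 1)

def from_twos_complement(pattern, bits):
    """Decode an unsigned bit pattern back into a signed integer."""
    sign_bit = 1 << (bits - 1)
    return (pattern ^ sign_bit) - sign_bit

print(int_range(32))
print(format(to_twos_complement(-5, 8), "08b"))
print(from_twos_complement(0b11111011, 8))
```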

19 Integers - Implementation First kind: [Figure: a word laid out as sign bit S (0 for +, 1 for -) followed by the binary integer data]

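20 Integers – Implementation Second kind and third kind: [Figure: both add a type descriptor T; one layout stores T in the same word as sign bit S and the data, the other stores T with an address that points to a separate word holding S and the data]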

21 Integer Numeric Data [Figure: integer bit layout; recoverable labels: positive values, sign bit]

22 Subranges Example (Ada): A:integer range Subtype of integers (implicit coercion into integer)

23 Subranges Data may require fewer bits than the integer type –Data in the example above require only 4 bits. Range checking usually requires some run-time information and dynamic type checking.
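
A small sketch of both points: how many bits a subrange needs and what a run-time range check looks like. The bounds 1..10 are an assumption chosen to match the 4 bits mentioned on the slide, and `bits_needed` and `Subrange` are hypothetical names.

```python
def bits_needed(low, high):
    """Bits needed to store a value of low..high as an offset from low."""
    return max(1, (high - low).bit_length())

class Subrange:
    """A subrange of the integers with a dynamic range check."""

    def __init__(self, low, high):
        self.low = low
        self.high = high

    def check(self, value):
        # This is the run-time check a compiler would insert on assignment.
        if not self.low <= value <= self.high:
            raise ValueError(f"{value} is outside {self.low}..{self.high}")
        return value

small = Subrange(1, 10)          # assumed bounds, not taken from the slide
print(bits_needed(small.low, small.high))
print(small.check(7))
```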

24 IEEE Floating Point Format IEEE standard 754 specifies both a 32-bit and a 64-bit format. At least one is supported by most hardware. Numbers consist of three fields: –S (sign), E (exponent), M (mantissa) [Figure: a word laid out as the fields S, E, M]

25 Floating Point Numbers: Theory Every non-zero number may be uniquely written as $(-1)^S \cdot 2^e \cdot m$, where $1 \le m < 2$ and $S$ is either 0 or 1.

26 Floating Point Numbers: Theory Every non-zero number may be uniquely written as $(-1)^S \cdot 2^{E - \text{bias}} \cdot (1 + M/2^N)$, where $0 \le M < 2^N$ (so $0 \le M/2^N < 1$). N is the number of bits for M (23 or 52). Bias is 127 for the 32-bit format and 1023 for the 64-bit format.

27 IEEE Floating Point Format (32 Bits) S: a one-bit sign field. 0 is positive. E: an exponent in excess-127 notation. Values (8 bits) range from 0 to 255, corresponding to exponents of 2 that range from -127 to 128.

28 IEEE Floating Point Format (32 Bits) M: a mantissa of 23 bits. Since the first bit of the mantissa in a normalized number is always 1, it can be omitted and inserted automatically by the hardware, yielding an extra 24th bit of precision.

29 Exponent Bias With 8 bits (256 values), 127 is added to the true exponent to get E: –If E = 127 then 127 - 127 = 0 is the true exponent –If E = 129 then 129 - 127 = 2 is the true exponent –If E = 120 then 120 - 127 = -7 is the true exponent

30 Floating Point Number Range In 32-bit format, the exponent has 8 bits, giving a range from -127 to 128 for the exponent. This gives a number range from $2^{-127}$ to $2^{128}$ (about $10^{-38}$ to $10^{38}$), roughly speaking.

31 Floating Point Number Range In 64-bit format, the exponent is extended to 11 bits, giving a range from -1023 to 1024 for the exponent. This gives a number range from $2^{-1023}$ to $2^{1024}$ (about $10^{-308}$ to $10^{308}$), roughly speaking.

32 Decoding IEEE format Given E and M, the value of the representation is:
  Parameters            Value
  E = 255 and M ≠ 0     An invalid number
  E = 255 and M = 0     ∞
  0 < E < 255           $2^{E-127} \cdot (1 + M/2^{23})$
  E = 0 and M ≠ 0       $2^{-126} \cdot (M/2^{23})$
  E = 0 and M = 0       0
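
The decoding cases for the 32-bit format can be sketched as a function over a raw bit pattern. `decode_ieee32` is a name invented here, and the sign handling is an addition because the table only gives magnitudes.

```python
def decode_ieee32(bits):
    """Decode a 32-bit IEEE 754 pattern (given as an int) into a Python float."""
    s = bits >> 31                # 1-bit sign field
    e = (bits >> 23) & 0xFF       # 8-bit exponent field, excess-127
    m = bits & 0x7FFFFF           # 23-bit mantissa field
    sign = -1.0 if s else 1.0

    if e == 255:
        # All-ones exponent: invalid number if M != 0, infinity if M == 0.
        return float("nan") if m else sign * float("inf")
    if e == 0:
        # Zero exponent: no hidden leading 1 (this also yields 0 when M == 0).
        return sign * 2.0 ** -126 * (m / 2 ** 23)
    # Normalized number with the hidden leading 1 restored.
    return sign * 2.0 ** (e - 127) * (1 + m / 2 ** 23)

print(decode_ieee32(0x3F800000))
print(decode_ieee32(0xC0A00000))
```

Python floats are 64-bit, so every 32-bit value is represented exactly in the result.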

33 Example Floating Point Numbers
  +1 = $2^0 \cdot 1$ = $2^{127-127} \cdot (1 + 0.0)$ …
  +1.5 = $2^0 \cdot 1.5$ = $2^{127-127} \cdot (1 + 2^{22}/2^{23})$ …
  -5 = $-2^2 \cdot 1.25$ = $-2^{129-127} \cdot (1 + 2^{21}/2^{23})$ …
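
To see the S, E and M fields of these three values, one possible check uses Python's standard `struct` module to obtain the 32-bit pattern. `ieee32_fields` is a helper name made up for this sketch.

```python
import struct

def ieee32_fields(x):
    """Return the (S, E, M) fields of x encoded as a 32-bit IEEE 754 float."""
    # Pack as big-endian single precision, then reread the bytes as an unsigned int.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

for value in (1.0, 1.5, -5.0):
    s, e, m = ieee32_fields(value)
    print(f"{value:5}: S={s} E={e:08b} M={m:023b}")
```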

34 Other Numeric Data Short integers (C): typically 16 bit, 8 bit. Long integers (C): typically 64 bit. Boolean or logical: 1 bit with value true or false (often stored as bytes). Byte: 8 bits.

35 Other Numeric Data Character: single 8-bit byte characters. ASCII is a 7-bit, 128-character code. Unicode is a 16-bit character code in Java (a Java char is one 16-bit UTF-16 code unit). In C, a char variable is simply 8-bit integer numeric data.

36 Enumerations Motivation: a type for case analysis over a small number of symbolic values. Example (Ada): type DAYS is (Mon, Tues, Wed, Thu, Fri, Sat, Sun); Implementation: Mon → 0; … Sun → 6. Treated as an ordered type (Mon < Wed). In C, always implicitly coerced to integers.
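
A rough Python counterpart of the Ada example, using the standard `enum.IntEnum`, which behaves like the C case where members are also integers. The class name `Days` and the helper `is_weekend` are illustrative choices.

```python
from enum import IntEnum

class Days(IntEnum):
    # Same numbering as the implementation on the slide: Mon -> 0 ... Sun -> 6.
    MON = 0
    TUES = 1
    WED = 2
    THU = 3
    FRI = 4
    SAT = 5
    SUN = 6

def is_weekend(day):
    """Case analysis over a small set of symbolic values."""
    return day in (Days.SAT, Days.SUN)

print(Days.MON < Days.WED)   # ordered type
print(int(Days.SUN))         # underlying integer
print(is_weekend(Days.SAT))
```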

37 Pointers A pointer type is a type in which the range of values consists of memory addresses and a special value, nil (or null) Use of pointers to create arbitrary data structures

38 Pointer Data Each pointer can point to an object of another data structure –Its l-value is its address; its r-value is the address of another object. Accessing the r-value of the r-value of a pointer is called dereferencing.

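39 Pointer Aliasing A := B –Numeric assignment: [Figure: A and B are separate cells holding the same value] –Pointer assignment: [Figure: A and B both point to the same object]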
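
The difference between the two assignments can be imitated in Python, where a one-element list stands in for a heap cell. This is only an analogy, since Python has references rather than raw pointers, and the variable names are invented.

```python
# Numeric assignment copies the value: the two names stay independent.
b_num = 5
a_num = b_num
b_num = b_num + 1      # a_num is unaffected

# "Pointer" assignment copies the reference: both names share one object.
b_ptr = [5]            # one-element list standing in for a heap cell
a_ptr = b_ptr
b_ptr[0] = 6           # the update is visible through a_ptr as well

print(a_num, a_ptr[0])
```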

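40 Problems with Pointers Dangling pointer: [Figure: A and B point to the same object; after Delete A, B still points to the freed storage] Garbage (lost heap-dynamic variables): [Figure: a heap object that neither A nor B points to any longer]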

41 Ways to Create Dangling Pointers
  int *A, *B;
  A = new int;
  *A = 5;
  B = A;
  delete A;
  /* B still points to the address of the object A pointed to, which has been returned to the heap */

42 Ways to Create Dangling Pointers
  int *A;
  int *sub() { int B; B = 5; return &B; }
  int main() { A = sub(); ... }
  /* A has been assigned the address of an object that is out of scope */

43 SML references An alternative to allowing pointers directly References in SML can be typed … but they introduce some abnormalities

44 SML imperative constructs SML reference cells –Different types for location and contents:
  x : int        non-assignable integer value
  y : int ref    location whose contents must be integer
  !y             the contents of location y
  ref x          expression creating new cell initialized to x
–SML assignment operator := is applied to a memory cell and new contents –Examples:
  y := x+3       place value of x+3 in cell y; requires x:int
  y := !y + 3    add 3 to contents of y and store in location y

45 SML examples
  Create cell and change contents: val x = ref "Bob"; x := "Bill";
  Create cell and increment: val y = ref 0; y := !y + 1;
  While loop: val i = ref 0; while !i < 10 do i := !i + 1; !i;
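
For readers who do not know SML, the three examples can be mimicked in Python with an explicit cell class. `Ref`, `get` and `set` are hypothetical names standing in for `ref`, `!` and `:=`; unlike SML, nothing here enforces the type of the contents.

```python
class Ref:
    """A mutable cell, loosely mirroring an SML ref value."""

    def __init__(self, contents):
        self.contents = contents

    def get(self):                 # plays the role of !y
        return self.contents

    def set(self, new_contents):   # plays the role of y := ...
        self.contents = new_contents

# Create cell and change contents.
x = Ref("Bob")
x.set("Bill")

# Create cell and increment.
y = Ref(0)
y.set(y.get() + 1)

# While loop.
i = Ref(0)
while i.get() < 10:
    i.set(i.get() + 1)

print(x.get(), y.get(), i.get())
```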

46 Composite Data Types Composite data types are sets of data objects built from data objects of other types. Elements are called data structures. Some are created by users, e.g. an array of integers. Some are created internally by the compiler, e.g. a symbol table or a subroutine activation record.

47 Specification of Structured Data Types Number of components –Fixed or varying over the life of the data structure: arrays and records have a fixed number; lists have a variable number –If the number of components is variable, is there a maximum possible number?

48 Specification of Structured Data Types Type of each component –Homogeneous: all components have same type Arrays –Heterogeneous: components have varying types Records (also lists in some languages, but not SML)

49 Specification of Structured Data Types Method of accessing components –Array subscripting –Record labels –SML datatype pattern matching

50 Operations on Data Structures Creation and deletion of structures Whole-structure operations –Assigning to variable –Iterating a function over the structure –Computing its length or size

51 Operations on Data Structures Component selection operations –Direct access (aka random selection) Takes constant time –Sequential selection Usually proportional to some dimension of the structure (like the number of components) –May allow component update, or may only allow access to value

52 Operations on Data Structures Component insertion and deletion –Applies to structures with variable number of components –Causes major effects on possible data layouts Example seen in the layouts for strings

53 General Layout of Data Structures Descriptor –Contains type information and other attributes of data structure –May only exist in symbol table at compile time, or may be a direct part of data object, or split between two –Usually several words long

54 General Layout of Data Structures Layout of component data –Sequential: arrays and records Uses least storage for structure if number of components fixed Least flexible for overall storage management

55 General Layout of Data Structures Layout of component data –Linked: lists, trees Uses more space per structure since each component must also have a pointer to it Maximum flexibility for overall storage management, put pieces where they fit

56 Strings Character string is a data object composed of a sequence of characters Main kinds: –Fixed declared length –Variable length with declared maximum length –Unbounded length

57 String operations String concatenation Length of string Substring selection by position Lexicographical ordering (based on underlying codes such as ASCII) Substring by pattern matching

58 String Interface Can be implemented as primitive type (as in SML or Java) or an array of characters (as in C and C++) If primitive, operations are built in If array of characters, string operations provided through a library

59 String Implementations Fixed declared length (aka static length) –Packed array padded with blanks [Figure: descriptor (String, Length=12, pointer to data) and data “All aboard” followed by two blank pad characters]

60 String Implementations May need a runtime descriptor for type and length if substring operations include runtime checks. Update pads with blanks or truncates as necessary.
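
The update rule for fixed-length strings is short enough to sketch directly. `assign_fixed` is an invented name, and the length 12 follows the "All aboard" figure.

```python
def assign_fixed(text, length):
    """Store text in a fixed-length string: truncate if too long, pad with blanks if too short."""
    return text[:length].ljust(length)

padded = assign_fixed("All aboard", 12)
cut = assign_fixed("All aboard the train", 12)
print(repr(padded), repr(cut))
```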

61 String Implementations Variable length with declared maximum (aka limited dynamic length) –Packed array with runtime descriptor [Figure: descriptor (String, Max Length=12, Cur Length=10, pointer to data) and data “All aboard”]
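
A minimal sketch of the limited dynamic length representation, assuming ASCII text and truncation on overflow (a language could equally raise an error). `BoundedString` and its attribute names are made up to mirror the descriptor fields in the figure.

```python
class BoundedString:
    """Variable-length string with a declared maximum length."""

    def __init__(self, max_length, text=""):
        self.max_length = max_length          # descriptor: Max Length
        self.cur_length = 0                   # descriptor: Cur Length
        self.data = bytearray(max_length)     # packed storage, allocated once
        self.assign(text)

    def assign(self, text):
        encoded = text.encode("ascii")[: self.max_length]  # truncate if too long
        self.cur_length = len(encoded)
        self.data[: self.cur_length] = encoded             # storage size never changes

    def value(self):
        return self.data[: self.cur_length].decode("ascii")

s = BoundedString(12, "All aboard")
print(s.max_length, s.cur_length, s.value())
```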

62 String Implementations Descriptor may occur as initial block of data object for array

63 String Implementations Unbounded length (aka dynamic length) –Two standard implementations –First: linked list [Figure: descriptor (String, Curr Length=10, pointer to data) pointing to a chain of linked blocks that together hold “All aboard”]

64 String Implementations Unbounded length –Second implementation: null terminated contiguous array –Must reallocate and copy when string grows [Figure: descriptor (String, pointer to data) pointing to the contiguous characters “All aboard”]

65 Arrays Ordered sequence of a fixed number of objects, all of the same type. Indexed by an integer, subrange, or enumeration type, called the subscript. Multidimensional arrays have one subscript per dimension. L-value for an array element is given by an accessing formula.

66 Type Checking Arrays Basic type – array Number of dimensions Type of components Type of subscript Range of subscript (must be done at runtime, if at all)

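67 Array Layout Assume one dimension [Figure: descriptor for a 1-dim array with fields Virtual Origin (VO), Lower Bound (LB), Upper Bound (UB), Comp type, Comp size (E), followed by the data A[LB], A[LB+1], up to A[UB]; α marks the address of A[LB] and VO marks where A[0] would be]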

68 Array Component Access Component access through subscripting, both for lookup (r-value) and for update (l-value). Component access should take constant time (i.e. looking up the 5th element takes the same time as looking up the 100th element).

69 Array Access Function L-value of A[i] = $VO + (E \cdot i) = \alpha + (E \cdot (i - LB))$. Computed at compile time: $VO = \alpha - (E \cdot LB)$. More complicated for multiple dimensions.
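
The access function can be tried out with concrete numbers. In this sketch `alpha` stands for the address of the first element A[LB]; the base address 1000, the bounds 1..10 and the component size 4 are arbitrary assumptions, and the bounds check is optional in the sense of slide 66.

```python
def virtual_origin(alpha, lower_bound, elem_size):
    """VO = alpha - E * LB, computable once at compile time."""
    return alpha - elem_size * lower_bound

def element_address(vo, elem_size, index, lower_bound, upper_bound):
    """L-value of A[index] = VO + E * index, with a range check on the subscript."""
    if not lower_bound <= index <= upper_bound:
        raise IndexError(f"subscript {index} outside {lower_bound}..{upper_bound}")
    return vo + elem_size * index

ALPHA, LB, UB, E = 1000, 1, 10, 4      # assumed example values
VO = virtual_origin(ALPHA, LB, E)

print(VO)
print(element_address(VO, E, 1, LB, UB))
print(element_address(VO, E, 10, LB, UB))
```

Each lookup costs one multiplication and one addition regardless of the subscript, which is the constant-time property slide 68 asks for.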

70 Records Ordered sequence of fixed number of objects of differing types Indexed by fixed identifiers called labels or fields L-value for record element given by more complex accessing formula than for arrays

71 Typical Record Layout [Figure: descriptor (Record type, Num. of components, then for each component 1 to n its label, type, and location) and data R.1, R.2, up to R.n; the location of component 1 is α]

72 Type Checking Record Basic type – record Number, name (label) of components Possibly order of labels –If order matters, labels must be unique –If order doesn’t matter, layout must give a canonical ordering Type of components per label

73 Record Layout Most of the descriptor exists only at compile time. Access function: Comp i location given by L-value of R.i = $\alpha + \sum_{j=1}^{i-1} (\text{size of } R.j)$
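
The record access function amounts to a running sum of component sizes. The sketch below computes each component's offset from the start of the record, assuming no alignment padding (real compilers usually add some); `field_offsets` and the example field list are hypothetical.

```python
from itertools import accumulate

def field_offsets(fields):
    """Map each label to its offset: the sum of the sizes of all earlier components.

    fields is a list of (label, size_in_bytes) pairs in declaration order.
    """
    sizes = [size for _, size in fields]
    offsets = [0] + list(accumulate(sizes))[:-1]
    return {label: offset for (label, _), offset in zip(fields, offsets)}

# Assumed example record: a 10-byte name, a 4-byte integer, a 1-byte boolean.
layout = field_offsets([("name", 10), ("population", 4), ("capital", 1)])
print(layout)
```

Because the offsets depend only on the declaration, a compiler can fold them into the generated code, which is why most of the descriptor need not exist at run time.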

74 Lists Ordered collection of variable number of elements –Many languages (LISP, Scheme, Prolog) allow heterogeneous list –SML has only homogeneous lists

75 Lists Layout: linked series of cells (called cons cells) with descriptor, data and pointers –Data in first cell of list called head of list –R-value of pointer in first cell called tail of list

76 Lists Sequential access of data by following pointers –Access is linear in position in list: it takes twice as long to look up the 10th element as to look up the 5th element.

77 Lists Adding a new element to a list is done only at the head, called consing. Creates a new cell with the element to be added and a pointer to the old list (i.e. creates a new list).
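
Cons cells, consing and sequential access can be sketched in a few lines. `Cons`, `cons` and `nth` are names chosen for the example, with `None` standing in for the empty list and 0-based positions.

```python
class Cons:
    """One list cell: the data (head) and a pointer to the rest of the list (tail)."""

    def __init__(self, head, tail):
        self.head = head
        self.tail = tail      # another Cons cell, or None for the empty list

def cons(element, lst):
    """Add element at the head; the old list is shared, not copied."""
    return Cons(element, lst)

def nth(lst, n):
    """Follow n tail pointers, so the cost grows linearly with n."""
    cell = lst
    for _ in range(n):
        cell = cell.tail
    return cell.head

old = cons(2.5, cons("a", None))
new = cons(1, old)            # a new list whose tail is the old one

print(nth(new, 0), nth(new, 1), nth(new, 2))
```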

78 List Layout Example: [1,2.5,’a’] [Figure: list descriptor pointing to a chain of three cons cells holding int 1, real 2.5, and char ‘a’]

79 List Layout Example: [[1,2.5],[’a’]] [Figure: outer list whose two cells each point to an inner list, the first holding int 1 and real 2.5, the second holding char ‘a’]

80 Union Types Set-wise the (discriminated) union of the component types Interchangeable with variant records as primitive type construct Elements chosen from one of component types

81 Union Types Problem: if int occurs as two different components of union type, can we tell which component an int is for?

82 Union Types Two kinds of union types: –Free union: the answer is no –Discriminated union: the answer is yes. If each component is tagged so that separate occurrences of the same type can be told apart, the union is discriminated; otherwise it is free.
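
A small sketch of why the tag matters, assuming a union with two int components. The tags "width" and "height" and the names `Tagged` and `describe` are invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tagged:
    """A value of a discriminated union: the tag records which component it belongs to."""
    tag: str
    value: int

# Free union: both components are plain ints, so nothing distinguishes them.
free_a = 3
free_b = 3

# Discriminated union: the same int carries its component tag.
disc_a = Tagged("width", 3)
disc_b = Tagged("height", 3)

def describe(u):
    if u.tag == "width":
        return f"{u.value} wide"
    if u.tag == "height":
        return f"{u.value} high"
    raise ValueError(f"unknown tag {u.tag!r}")

print(free_a == free_b)
print(disc_a == disc_b)
print(describe(disc_a), describe(disc_b))
```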

83 Union Layout [Figure: descriptor (Union type, Component type, Component tag, Component location) and a data area of length L holding the actual data followed by unused space] No tag if free union. L is the fixed length of the biggest component.

84 Combining Data Structures Possible to have any of the above structures as components of others Since lists are of variable size, but arrays must store fixed size element, how to store lists in an array?

85 Combining Data Structures Answer: cons cells have uniform size, store just the leading cons cell

86 Example: Data in 4-element array of lists [Figure: a 4-element array whose slots point to lists of cons cells; recoverable cell contents: int 5, int 6, int 3, int 1, int 7, int 2]

87 Type summary Static type checking takes place after syntax check and before code generation. Some type checking can be necessary at run time. Types vs. syntax. Simply typed values and composite values. User-defined types. Equivalence on types.