CSE 452: Programming Languages

Uploaded: 2017-07-29T20:56:57+00:00
Duration: PTM37S31

CSE 452: Programming Languages
Subprograms

Outline
Subprograms
  Parameter passing
  Type checking
  Using multidimensional arrays as parameters
  Using subprograms as parameters
  Overloaded subprograms
  Generic subprograms
  Implementation

Parameter Passing
  Pass-by-value
  Pass-by-result
  Pass-by-value-result
  Pass-by-reference
  Pass-by-name

Parameter Passing in PL
Fortran
  Always uses the inout-mode model of parameter passing
  Before Fortran 77, mostly used pass-by-reference
  Later implementations mostly use pass-by-value-result
C
  Mostly pass-by-value
  Pass-by-reference is achieved using pointers as parameters

    int p[] = { 1, 2, 3 };
    void change(int *q) {
        q[0] = 4;
    }
    int main(void) {
        change(p);  /* p[0] == 4 after calling the change function */
        return 0;
    }

C
  Pass-by-reference: the value of the pointer is copied to the called function and nothing is copied back

    #include <stdio.h>
    void swap(int *p, int *q) {
        int *temp;
        temp = p;   /* exchanges only the local copies of the pointers */
        p = q;
        q = temp;
    }
    int main(void) {
        int p[] = {1, 2, 3};
        int q[] = {4, 5, 6};
        int i;
        swap(p, q);
        for (i = 0; i < 3; i++)     /* contents are unchanged */
            printf("%d %d\n", p[i], q[i]);
        return 0;
    }

  On return, p and q still refer to the same arrays as before the swap call.
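The swap above exchanges only the callee's copies of the two pointers. As a contrast, here is a minimal sketch of a version that does change the caller's data, assuming the goal is to exchange two int values; the name swap_values is hypothetical and not from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Exchanges the two ints that p and q point to. The pointers are still
   passed by value; only the cells they point to are modified. */
void swap_values(int *p, int *q) {
    assert(p != NULL && q != NULL);
    int temp = *p;
    *p = *q;
    *q = temp;
}
```

Calling swap_values(&a, &b) with a = 1 and b = 2 leaves a = 2 and b = 1, because the callee writes through the copied addresses instead of reassigning them.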

C++
  Includes a special pointer type called a reference type

    void GetData(double &Num1, const int &Num2) {
        int temp;
        for (int i = 0; i < Num2; i++) {
            cout << "Enter a number: ";
            cin >> temp;
            if (temp > Num1) {
                Num1 = temp;
                return;
            }
        }
    }

  Num1 and Num2 are passed by reference
  The const modifier prevents a function from changing the values of reference parameters
  Reference parameters are implicitly dereferenced
  Dereferencing resolves the pointer, that is, it gets at the value stored at the address the pointer holds. If Num1 and Num2 were pointers, *Num1 would refer to the data value at the address pointed to by Num1.
  Why do we need a constant reference parameter? A const reference avoids copying the argument, and the compiler enforces that the function body does not modify the passed object.

Ada
  Reserved words: in, out, in out (in is the default mode)

    procedure temp(A : in out Integer; B : in Integer; C : in Integer)

  out mode parameters can be assigned but not referenced
  in mode parameters can be referenced but not assigned
  in out mode parameters can be both referenced and assigned
Fortran
  Semantic modes are declared using the Intent attribute

    Subroutine temp(A, B, C)
      Integer, Intent(Inout) :: A
      Integer, Intent(In) :: B
      Integer, Intent(Out) :: C

Perl
  Actual parameters are implicitly placed in a predefined array, @_

    sub foo {
        local ($i, $a, $b) = (0, 0, 1);
        for ($i = 0; $i <= $#_; $i++) {
            $a = $a + $_[$i];
            $b = $b * $_[$i];
        }
        return ($a, $b);
    }
    …
    ($a, $b) = foo(1, 2, 3);

  $#_ is the index of the last element of @_, so it indicates the length of the parameter list
  The actual params (1, 2, 3) are implicitly in @_
  @_ is always used to contain the actual params in the subprogram

Type Checking
  ANSI C: users can choose whether parameters should be type-checked

    #include <stdio.h>
    double count1(x)
        double x;               /* old-style definition: avoids type checking */
    { return x * 2; }           /* may generate nonsense: count1(y) */
    double count2(double x)     /* prototype method */
    { return x * 2; }           /* can coerce actual params: count2(y) */
    main() {
        double x = 30.0;
        int y = 30;
        printf("count1 : %f\n", count1(x));
        printf("count2 : %f\n", count2(x));
        printf("count1 : %f\n", count1(y));
        printf("count2 : %f\n", count2(y));
    }

  Output:
    count1 :
    count2 :
    count1 :

Implementing Parameter Passing
Memory contents
  Code: program code
  Data: global and static data
  Heap: dynamically allocated variables
  Stack: local data
Most subprograms are static or stack-dynamic
Stack-dynamic: space for local variables and the referencing environment is allocated when the subprogram executes. The referencing environment goes away when the subprogram terminates.

Pass by Value
  Values are copied into stack locations
  Stack locations serve as storage for the corresponding formal parameters
Pass by Result
  Implemented as the opposite of pass-by-value
  Values assigned to actual parameters are placed in the stack, where they can be retrieved by the calling program unit upon termination of the called subprogram
Pass by Value-Result
  The stack location for each parameter is initialized by the call and then copied back to the actual parameter upon termination of the called subprogram
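C has neither pass-by-result nor pass-by-value-result, but the copy-in and copy-out steps can be written out by hand. The following is a rough sketch under that assumption; both function names are hypothetical. It shows how the two models differ when the same variable is passed for both parameters.

```c
#include <assert.h>
#include <stddef.h>

/* Pass-by-reference: both formals alias the caller's variables directly. */
void bump_both_ref(int *a, int *b) {
    assert(a != NULL && b != NULL);
    *a += 1;
    *b += 1;
}

/* Pass-by-value-result, simulated: copy in, work on local copies, copy out. */
void bump_both_value_result(int *a_actual, int *b_actual) {
    assert(a_actual != NULL && b_actual != NULL);
    int a = *a_actual;      /* copy in at the call */
    int b = *b_actual;
    a += 1;
    b += 1;
    *a_actual = a;          /* copy back at termination */
    *b_actual = b;
}
```

With x initially 0, bump_both_ref(&x, &x) leaves x at 2, while bump_both_value_result(&x, &x) leaves x at 1: each local copy starts from 0 and the second copy-out overwrites the first.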

Pass by Reference
  Regardless of the type of the parameter, put its address in the stack
  For literals, the address of the literal is put in the stack
  For expressions, the compiler must build code to evaluate the expression before the transfer of control to the called subprogram
    The address of the memory cell in which that code places the result of its evaluation is then put in the stack
  The compiler must make sure to prevent the called subprogram from changing parameters that are literals or expressions
  Access to formal parameters is by indirect addressing from the stack location of the address

Example: the main program calls sub(w, x, y, z), where the subprogram header is sub(a, b, c, d)
  w is passed by value (val)
  x is passed by result (res)
  y is passed by value-result (val-res)
  z is passed by reference (ref)

Pass by Name
  Implemented with run-time resident code segments or subprograms that evaluate the address of the parameter
  These are called for each reference to the formal parameter
  Very expensive compared to pass by reference or value-result
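The "code segment called for each reference to the formal" can be imitated in C with a function pointer plus an explicit environment. This is only a sketch: C has no pass-by-name, and the names thunk, struct env, elem_thunk, and sum_by_name are all hypothetical. It follows the classic Jensen's device pattern, where the actual parameter v[i] is re-evaluated every time the formal is referenced.

```c
#include <assert.h>
#include <stddef.h>

/* Environment shared by the caller and the thunk. */
struct env {
    int i;              /* loop variable that the actual parameter mentions */
    const double *v;    /* array that the actual parameter indexes */
};

/* A thunk: code that re-evaluates the actual parameter on demand. */
typedef double (*thunk)(struct env *e);

/* Thunk for the actual parameter expression v[i]. */
double elem_thunk(struct env *e) {
    return e->v[e->i];
}

/* The formal "term" is evaluated once per reference, so changing i
   changes what the formal denotes. */
double sum_by_name(struct env *e, int lo, int hi, thunk term) {
    assert(e != NULL && term != NULL);
    double total = 0.0;
    for (e->i = lo; e->i <= hi; e->i++)
        total += term(e);
    return total;
}
```

Every reference costs a function call, which illustrates why the slide describes pass-by-name as expensive compared with pass-by-reference.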

Multidimensional Arrays as Parameters
C
  Uses row-major order for matrices
    address(mat[i, j]) = address(mat[0, 0]) + i*num_columns + j   (in units of the element size)
  Must specify num_columns but not num_rows

    void fun(int matrix[][10]) { … }
    void main() {
        int mat[5][10];
        fun(mat);
        …
    }

  Does not allow programmers to write a function that accepts different numbers of columns
  Alternative: use pointers
  C: row-major order (the matrix is stored as a sequence of rows)
  Fortran: column-major order (the matrix is stored as a sequence of columns)
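The "use pointers" alternative can be sketched as follows: the matrix is passed as a pointer to its first element and the row-major formula is applied by hand, so the column count becomes an ordinary run-time parameter. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Element offset of mat[i][j] in a row-major matrix with num_columns columns. */
size_t row_major_offset(size_t i, size_t j, size_t num_columns) {
    assert(j < num_columns);
    return i * num_columns + j;
}

/* Sums a rows x cols matrix stored contiguously in row-major order.
   Works for any column count because the indexing is done explicitly. */
int sum_matrix(const int *mat, size_t rows, size_t cols) {
    assert(mat != NULL);
    int sum = 0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            sum += mat[row_major_offset(i, j, cols)];
    return sum;
}
```

For a 2 x 3 matrix holding 1 through 6, sum_matrix returns 21, and the element in row 1, column 2 sits at offset 1*3 + 2 = 5.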

Ada

    type Mat_Type is array (Integer range <>, Integer range <>) of Float;
    Mat1 : Mat_Type(1..100, 1..20);
    function Sumer(Mat : in Mat_Type) return Float is
      Sum : Float := 0.0;
    begin
      for Row in Mat'range(1) loop
        for Col in Mat'range(2) loop
          Sum := Sum + Mat(Row, Col);
        end loop;
      end loop;
      return Sum;
    end Sumer;

  No need to specify the size of the array
  Use the range attribute to obtain the size of the array

Fortran
  Array parameters must be declared after the header

    Subroutine Sub(Matrix, Rows, Cols, Result)
      Integer, Intent(In) :: Rows, Cols
      Real, Dimension(Rows, Cols), Intent(In) :: Matrix
      Real, Intent(Out) :: Result
      …
    End Subroutine Sub

Subprogram Names as Parameters
Issues:
Are parameter types checked?
  Early Pascal and FORTRAN 77 do not; later versions of Pascal and Fortran 90 do
  Ada does not allow subprogram parameters
  Java does not allow method names to be passed as parameters
  C and C++ pass pointers to functions; parameters can be type checked
What is the correct referencing environment for a subprogram that was sent as a parameter?
  Shallow binding: the environment of the call statement that enacts the passed subprogram
  Deep binding: the environment of the definition of the passed subprogram
  Ad hoc binding: the environment of the call statement that passed the subprogram as an actual parameter (has never been used)
Shallow binding: the nonlocal referencing environment of a procedure instance is the referencing environment in force at the time the procedure is invoked. Original LISP works this way by default.
Deep binding: the nonlocal referencing environment of a procedure instance is the referencing environment in force at the time the procedure's declaration is elaborated. For procedures passed as parameters, this environment is the same as would be extant if the procedure were actually called at the point where it was passed as an argument. When the procedure is passed as an argument, this referencing environment is passed as well. When the procedure is eventually invoked (by calling it using the corresponding formal parameter), this saved referencing environment is restored. LISP funarg and procedures in Algol and Pascal work this way.

Subprogram Names as Parameters

    function sub1() {
      var x;
      function sub2() {
        alert(x);
      };
      function sub3() {
        var x;        // each subprogram needs its own x for the three bindings to differ
        x = 3;
        sub4(sub2);
      };
      function sub4(subx) {
        var x;
        x = 4;
        subx();
      };
      x = 1;
      sub3();
    };

Shallow binding: the referencing environment of sub2 is that of sub4
Deep binding: the referencing environment of sub2 is that of sub1
Ad hoc binding: the referencing environment of sub2 is that of sub3
Call trace:
  Main: invokes sub3
  sub3: calls sub4(sub2). Deep binding: ref env of sub2 is sub1 (based on the declaration)
  sub4: calls subx, which is sub2. Ad hoc binding: ref env of sub2 is sub3
  sub2: alert(x). Shallow binding: ref env is that of the caller, sub4
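The three binding rules can be mimicked in C by passing the environment explicitly, here reduced to a pointer to whichever x the passed subprogram should see. This is a simulation under stated assumptions, not how any of the languages above implement binding; it assumes that sub1, sub3, and sub4 each have their own local x, and the enum and the extra parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum binding { SHALLOW, DEEP, AD_HOC };

/* sub2 reads x through whichever environment the binding rule selected. */
static int sub2(const int *env_x) {
    return *env_x;
}

static int sub4(int (*subx)(const int *), const int *passed_env,
                enum binding rule) {
    assert(subx != NULL && passed_env != NULL);
    int x = 4;
    /* Shallow binding: use the environment of the call that enacts subx
       (sub4). Otherwise use the environment that travelled with subx. */
    return subx(rule == SHALLOW ? &x : passed_env);
}

static int sub3(const int *sub1_x, enum binding rule) {
    int x = 3;
    /* Deep binding: pass along the defining environment (sub1).
       Ad hoc binding: pass the environment of this call statement (sub3). */
    return sub4(sub2, rule == DEEP ? sub1_x : &x, rule);
}

/* Returns the value that sub2 would display under the given rule. */
int sub1(enum binding rule) {
    int x = 1;
    return sub3(&x, rule);
}
```

Under these assumptions sub1(SHALLOW) yields 4, sub1(DEEP) yields 1, and sub1(AD_HOC) yields 3, matching the three environments listed for sub2.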

Overloaded Subprograms
A subprogram that has the same name as another subprogram in the same referencing environment
Every version of the overloaded subprogram must have a unique protocol
  It must differ from the others in the number, order, or types of its parameters, or in its return type (if it is a function)
C++, Java, Ada, and C# include predefined overloaded subprograms, e.g., overloaded constructors in C++
Overloaded subprograms with default parameters can lead to ambiguous subprogram calls

    void foo(float b = 0.0);
    void foo();
    …
    foo();  /* call is ambiguous; leads to a compilation error */

Generic (Polymorphic) Subprograms
Polymorphism: increases reusability of software
Types:
  Ad hoc polymorphism = overloaded subprograms
  Parametric polymorphism
    Provided by a subprogram that takes a generic parameter that is used in a type expression
    Ada and C++ provide compile-time parametric polymorphism

Generic Subprograms Example:

    generic
      type Index_Type is (<>);
      type Element_Type is private;
      type Vector is array (Integer range <>) of Element_Type;
    procedure Generic_Sort(List : in out Vector);

    procedure Generic_Sort(List : in out Vector) is
      Temp : Element_Type;
    begin
      for Top in List'First .. Index_Type'Pred(List'Last) loop
        for Bottom in Index_Type'Succ(Top) .. List'Last loop
          if List(Top) > List(Bottom) then
            Temp := List(Top);
            List(Top) := List(Bottom);
            List(Bottom) := Temp;
          end if;
        end loop; -- for Bottom ...
      end loop; -- for Top ...
    end Generic_Sort;

Example instantiation:

    procedure Integer_Sort is new Generic_Sort(
      Index_Type => Integer,
      Element_Type => Integer,
      Vector => Int_Array);

Generic Subprograms

    template <class Type>
    void generic_sort(Type list[], int len) {
      int top, bottom;
      Type temp;
      // bounds corrected so that the last element takes part in the sort
      for (top = 0; top < len - 1; top++)
        for (bottom = top + 1; bottom < len; bottom++) {
          if (list[top] > list[bottom]) {
            temp = list[top];
            list[top] = list[bottom];
            list[bottom] = temp;
          } //** end of if
        } //** end of for (bottom ...
    } //** end of generic_sort

    float flt_list[100];
    ...
    generic_sort(flt_list, 100);  // Implicit instantiation

The elements of list are treated generically

Implementing Subprograms
The subprogram call and return operations are together called subprogram linkage
Implementation of subprograms must be based on the semantics of subprogram linkage
Implementation:
  Simple subprograms: no recursion, use only static local variables
  Subprograms with stack-dynamic variables
  Nested subprograms

Simple Subprograms
Simple subprograms are not nested and all local variables are static
  Example: early versions of Fortran
Call semantics require the following actions:
  Save the execution status of the current program unit
  Carry out the parameter passing process
  Pass the return address to the callee
  Transfer control to the callee
Return semantics require the following actions:
  If pass-by-value-result or out-mode parameters are used, move the values of those parameters to the corresponding actual parameters
  If the subprogram is a function, move the return value of the function to a place accessible to the caller
  Restore the execution status of the caller
  Transfer control back to the caller

Simple Subprograms
Required storage:
  Status information of the caller
  Parameters
  Return address
  Functional value (if it is a function)
A subprogram consists of 2 parts:
  Subprogram code
  Subprogram data
The format, or layout, of the noncode part of an executing subprogram is called an activation record
An activation record instance (ARI) is a concrete example of an activation record (the collection of data for a particular subprogram activation)

Code and Activation record of a program with simple subprograms
The activation record instance for a simple subprogram has a fixed size, so it can be statically allocated
Since simple subprograms do not support recursion, there can be only one active version of a given subprogram
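Why a single statically allocated activation record rules out recursion can be illustrated in C by forcing a "local" to be static. This is a sketch under that assumption, with hypothetical function names: the recursive call overwrites the one shared copy of the variable.

```c
#include <assert.h>

/* Factorial whose working variable is static, mimicking a single
   statically allocated activation record. */
int factorial_static(int n) {
    static int saved_n;                         /* one copy for all activations */
    saved_n = n;
    if (saved_n <= 1)
        return 1;
    int rest = factorial_static(saved_n - 1);   /* overwrites saved_n */
    return saved_n * rest;                      /* saved_n is now 1 */
}

/* The same computation with an ordinary stack-dynamic parameter. */
int factorial_stack(int n) {
    assert(n >= 0);
    return (n <= 1) ? 1 : n * factorial_stack(n - 1);
}
```

Here factorial_static(3) returns 1 instead of 6, because every activation shares the same saved_n, while factorial_stack(3) returns 6.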

Subprograms with Stack-Dynamic Variables
The compiler must generate code to cause implicit allocation and deallocation of local variables
Activation record instance on the run-time stack (top of the stack first):
  Local variables
  Parameters
  Dynamic link
  Return address
Dynamic link: points to the top of the activation record instance of the caller (information to determine the referencing environment)
Return address: pointer to the code segment of the caller and an offset address of the instruction following the call


    void sub(float total, int part) {
      int list[4];
      float sum;
      …
    }

Activation record instance for sub (top of the stack first):
  Local variable sum
  Local variable list[3]
  Local variable list[2]
  Local variable list[1]
  Local variable list[0]
  Parameter part
  Parameter total
  Dynamic link
  Return address

Example: without Recursion

    void A(int X) {
      int Y;
      …            /* position 2 */
      C(Y);
    }
    void B(float R) {
      int S, T;
      A(S);        /* position 1 is in B */
    }
    void C(int Q) {
                   /* position 3 */
    }
    void main() {
      float P;
      B(P);
    }

main calls B, B calls A, and A calls C
The collection of dynamic links present in the stack at any given time is called the dynamic chain

Recursion adds possibility of multiple simultaneous activations of a subprogram Each activation requires its own copy of formal parameters and dynamically allocated local variables, along with return address

Subprograms with Recursion

    int factorial(int n) {
      …
      if (n <= 1)
        return 1;
      else
        return n * factorial(n - 1);
    }
    void main() {
      int value;
      value = factorial(3);
    }
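To make the "multiple simultaneous activations" point concrete, the following sketch instruments factorial with a counter of the activations that are live at the same time. The counters and the accessor max_depth_seen are hypothetical additions for illustration.

```c
#include <assert.h>

static int live_activations = 0;       /* activations currently on the stack */
static int max_live_activations = 0;   /* largest value seen so far */

int factorial(int n) {
    assert(n >= 0);
    live_activations++;                /* a new activation record instance exists */
    if (live_activations > max_live_activations)
        max_live_activations = live_activations;
    int result = (n <= 1) ? 1 : n * factorial(n - 1);
    live_activations--;                /* this instance is about to be popped */
    return result;
}

int max_depth_seen(void) {
    return max_live_activations;
}
```

After factorial(3), max_depth_seen() reports 3: the activations for n = 3, n = 2, and n = 1 all exist on the stack at once, each with its own copy of n.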


Nested Subprograms
Support for static scoping
  Implemented using a static link (also called a static scope pointer), which points to the bottom of the activation record instance of the static parent
Static scope: the scope of a variable is determined statically (based on the structure of the program)
Dynamic scope: the scope of a variable depends on the calling sequence and execution order, so it can only be determined at run time

Nested Subprograms
Static chain: the chain of static links that connects certain ARIs in the stack; it links all static ancestors of the executing subprogram
Static_depth: an integer associated with a static scope that indicates how deeply it is nested in the outermost scope
Chain_offset: the difference between the static_depth of the procedure containing the reference to variable x and the static_depth of the procedure containing the declaration of x

    procedure A is
      procedure B is
        procedure C is
          …
        end; -- of C
      end; -- of B
    end; -- of A

Static_depths of A, B, and C are 0, 1, and 2, respectively
If procedure C references a variable declared in A, the chain_offset of that reference is 2
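Following a static chain amounts to a short pointer walk. The sketch below is a simplified model, not a real run-time layout: struct ari, MAX_LOCALS, and resolve are hypothetical, and locals are indexed directly by local_offset rather than being laid out after the links and return address.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOCALS 8

/* A simplified activation record instance (ARI). */
struct ari {
    struct ari *static_link;    /* ARI of the static parent, NULL at the top */
    int locals[MAX_LOCALS];     /* indexed by local_offset in this sketch */
};

/* Resolves a (chain_offset, local_offset) pair, starting from the ARI of
   the currently executing subprogram. */
int *resolve(struct ari *current, int chain_offset, int local_offset) {
    assert(local_offset >= 0 && local_offset < MAX_LOCALS);
    while (chain_offset-- > 0) {
        assert(current != NULL);
        current = current->static_link;     /* one link per nesting level */
    }
    assert(current != NULL);
    return &current->locals[local_offset];
}
```

With ARIs for A, B, and C linked as in the nesting above, resolve(&c, 2, k) reaches slot k of A by following two static links, which is the chain_offset of 2 computed on the slide.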

Nested Subprograms

    program MAIN_2;
      var X : integer;
      procedure BIGSUB;
        var A, B, C : integer;
        procedure SUB1;
          var A, D : integer;
          begin { SUB1 }
          A := B + C;    { position 1 }
          end; { SUB1 }
        procedure SUB2(X : integer);
          var B, E : integer;
          procedure SUB3;
            var C, E : integer;
            begin { SUB3 }
            SUB1;
            E := B + A;  { position 2 }
            end; { SUB3 }
          begin { SUB2 }
          SUB3;
          A := D + E;    { position 3 }
          end; { SUB2 }
        begin { BIGSUB }
        SUB2(7);
        end; { BIGSUB }
      begin
      BIGSUB;
      end. { MAIN_2 }

Calling sequence:
  MAIN_2 calls BIGSUB
  BIGSUB calls SUB2
  SUB2 calls SUB3
  SUB3 calls SUB1

Example
Each reference in the MAIN_2 program above is represented as a pair (chain_offset, local_offset)
  Chain_offset: number of links to the correct ARI
  Local_offset: offset from the beginning of the activation record of the local scope
References to A:
  1: (0, 3) (local)
  2: (2, 3) (two levels away)
  3: (1, 3) (one level away)

Nested Subprograms
References as (chain_offset, local_offset) pairs
At position 1 in SUB1:
  A - (0, 3)
  B - (1, 4)
  C - (1, 5)
At position 2 in SUB3:
  E - (0, 4)
  B - (1, 4)
  A - (2, 3)
At position 3 in SUB2:
  A - (1, 3)
  D - an error (the ARI for SUB1 has been removed)
  E - (0, 5)

Nested Subprograms
Drawbacks
  A nonlocal reference is slow if the number of scopes between the reference and the declaration of the referenced variable is large
  Time-critical code is difficult to write, because the costs of nonlocal references are hard to estimate
Displays
  Alternative to static chains
  Store static links in a single array called the display, instead of storing them in the activation records
  Accesses to nonlocals require exactly two steps for every access, regardless of the number of scope levels
    The link to the correct activation record is found using a statically computed value called the display_offset
    The local_offset within the activation record instance is then applied
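A display can be modelled as an array indexed by static depth. The sketch below assumes the common scheme in which entering a subprogram at depth d saves the old display[d] and installs the new ARI, and leaving it restores the saved entry; all names are hypothetical and the ARI is reduced to an array of locals.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 8
#define MAX_LOCALS 8

struct ari {
    int locals[MAX_LOCALS];     /* indexed by local_offset in this sketch */
};

/* display[d] points to the ARI of the most recent activation at depth d. */
static struct ari *display[MAX_DEPTH];

/* On entry: install the new ARI and return the entry it replaced. */
struct ari *display_enter(int depth, struct ari *frame) {
    assert(depth >= 0 && depth < MAX_DEPTH && frame != NULL);
    struct ari *saved = display[depth];
    display[depth] = frame;
    return saved;
}

/* On exit: put back the entry that was saved by display_enter. */
void display_exit(int depth, struct ari *saved) {
    assert(depth >= 0 && depth < MAX_DEPTH);
    display[depth] = saved;
}

/* Two steps regardless of nesting distance: index the display with
   display_offset, then apply local_offset inside that ARI. */
int *display_resolve(int display_offset, int local_offset) {
    assert(display_offset >= 0 && display_offset < MAX_DEPTH);
    assert(local_offset >= 0 && local_offset < MAX_LOCALS);
    assert(display[display_offset] != NULL);
    return &display[display_offset]->locals[local_offset];
}
```

Unlike the static chain, the cost of display_resolve does not grow with the number of scope levels between the reference and the declaration.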

Blocks
User-specified local scope for variables

    {
      int temp;
      temp = list[upper];
      list[upper] = list[lower];
      list[lower] = temp;
    }

Blocks can be implemented using a static chain
  Blocks are treated as parameterless subprograms that are always called from the same place in the program
  Every block has an activation record
  An instance is created every time a block is executed
Alternative implementation
  The amount of space can be allocated statically
  Offsets of all block variables can be statically computed, so block variables can be addressed exactly as if they were local variables

Blocks

    void main() {
      int x, y, z;
      while (…) {
        int a, b, c;
        …
        while (…) {
          int d, e;
          …
        }
      }
      while (…) {
        int f, g;
        …
      }
      …
    }

Activation record instance for main (top of the stack first); variables of sibling blocks occupy the same locations:
  e
  d
  c
  b and g
  a and f
  z
  y
  x

Subprogram Implementation
Activation record on the stack
  Parameters
  Return address
  Local variables
  Static link
  Dynamic link

    int factorial(int n) {
      …
      if (n <= 1)
        return 1;
      else
        return n * factorial(n - 1);
    }
    void main() {
      int value;
      value = factorial(3);
    }

Subprogram Implementation
Bad design of subprogram implementation may result in network security problems
Buffer overflow attack
  A type of vulnerability used by hackers to compromise the integrity of a system
  The problem is due to
    Lack of safety features in language design
    Bad coding by programmers

Buffer overflow attack
The effectiveness of the buffer overflow attack has been common knowledge in software circles since the 1980s
The Internet Worm used it in November 1988 to gain unauthorized access to many networks and systems nationwide
Still used today by hacking tools to gain "root" access to otherwise protected computers
The fix is a very simple change in the way we write array accesses; unfortunately, once code that has this vulnerability is deployed in the field, it is nearly impossible to stop a buffer overflow attack

Overview of Buffer Overflow Attacks
The buffer overflow attack exploits a common problem in many programs. Several high-level programming languages, such as C, do no "boundary checking", i.e., they do not check whether the data being copied fits in the destination.

    void myFunction(char *str) {
      char bufferB[16];
      strcpy(bufferB, str);
    }
    void main() {
      char bufferA[256];
      myFunction(bufferA);
    }

In the code above, main() passes a 256-byte array to myFunction(), and myFunction() copies it into a 16-byte array!
Since there is no check on whether bufferB is big enough, the extra data overwrites other unknown space in memory
This vulnerability is the basis of buffer overflow attacks
How is it used to harm a system? It modifies the system stack

Stack content

Before the call, the stack holds the activation record of main:
  bufferA

During the call to myFunction (top of the stack first):
  bufferB
  OS data
  str
  Dynamic link
  Return address to main
  bufferA

strcpy writes past the end of bufferB, so the neighbouring region is now contaminated with data from str
This may overwrite the return address!

Stack content
If the content of str is carefully selected, we can point the return address to a piece of code we have written
When the system returns from the function call, it will begin executing the malicious code
Stack after the attack:
  Malicious code
  New address (in place of the return address)
  bufferA

A Possible Solution

    #include <string.h>

    /* len is the size in bytes of the buffer that str points to */
    void myFunction(char *str, int len) {
      char bufferB[16];
      if (len <= 16)
        strcpy(bufferB, str);
    }
    void main() {
      char bufferA[256];
      myFunction(bufferA, 256);
    }
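The check above relies on the caller reporting the right length. A slightly more defensive sketch, assuming str is a NUL-terminated string, measures the string itself and refuses to copy when it does not fit. The name copy_checked is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies src into dst only if it fits, including the terminating '\0'.
   Returns 1 on success; returns 0 and leaves dst empty if src is too long. */
int copy_checked(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t len = strlen(src);
    if (len >= dst_size) {      /* no room for the string plus its terminator */
        dst[0] = '\0';
        return 0;
    }
    memcpy(dst, src, len + 1);
    return 1;
}
```

With a 16-byte destination, a 5-character string is copied, while a 16-character string is rejected because its terminator would land one byte past the end of the buffer.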

Buffer Overflow Attack
