An Introduction to C Prepared for UCSD summer 2009 class.



Overview
- Low-level programming language
- Language of choice when speed and efficiency are at a premium
- A compiled language
- Arguments are passed by value; pass by reference is done by passing pointers
- Syntax is very similar to R (in fact, R closely adopted C syntax)

When to use C over R
WEAKNESSES
- Not a statistics environment
- Limited graphical support (i.e. no plot() commands)
- Pointless for small problems, or when you don't need to optimize speed
STRENGTHS
- Custom estimators that take a long time to converge
- Working with huge data sets

Hello.c --- Our first C program

    /* A standard program to print out a greeting */
    #include <stdio.h>

    int main(void){
        printf("\nHello world");
        return(0);
    }

    > gcc -o hello hello.c
    > hello

- Note the commenting conventions, '//' vs. '/* */'. // is just like # in R; /* */ is for long comments.
- #include plays the same role as the library() command in R. Here it loads stdio.h, which declares printf().
- Every executable starts from a function called "main", which must always return a number of type int.
- Functions are always defined by the thing they output (int), their name (main), and their input (void in this case, which is nothing).
- gcc is the GNU compiler, -o tells it what to output to (hello, or hello.exe on Windows), and the final argument is the source code file.
- In C, you must explicitly use the return() command, not just output '0'.
- Brackets: Note that, like R, C uses different kinds of brackets. Round brackets '()' are for functions, curly brackets '{}' are for blocks of code such as function bodies and control flow, and square brackets '[]' are for indexes. This is identical to R.
- All statements end with semicolons. Unlike R, a line break does not end a statement, so a long statement can span multiple lines.

Working with Vectors
- A container of objects, in some order
- Can be a list of characters, integers, doubles, or structs
- In R, these are just like the sequences you get from functions like seq() and rep()
- Static vs. dynamic allocation: when to do what
- What is a memory leak?

Vector.c --- Static Allocation of Vector

    #include <stdio.h>

    int main(void){
        double numlist[4];   /* room for exactly 4 doubles */

        numlist[0] = 3;
        numlist[1] = 4;
        numlist[2] = 5;
        numlist[3] = 10;

        printf("\n numlist[0] == %f", numlist[0]);
        printf("\n numlist[1] == %f", numlist[1]);
        printf("\n numlist[2] == %f", numlist[2]);
        printf("\n numlist[3] == %f", numlist[3]);
        printf("\n\nWhy will numlist[4] give an error?");
        return(0);
    }

- First, notice that you declare your variables up front. This is considered good form for compiled procedural language programming.
- To declare a variable, give the type (double here) and a name (numlist here). If it is a vector and you are statically allocating it, add square brackets and the size.
- Doubles are basically the same thing as "numeric" in R.
- Values are assigned to the container in order. Notice that indexing starts at 0. This is different from R, so be very careful here.
- Printing the results works just like sprintf() in R. %f shows a floating point number, which is what a double is.
- numlist[4] is past the end of the vector. C does not check indexes, so reading it is undefined behavior: you may get garbage or a crash rather than a clear error message.

Vector.c --- Dynamic Allocation of Vector

    #include <stdio.h>
    #include <stdlib.h>   /* malloc(), free() */

    int main(void){
        double *numlist;

        /* reserve 10 slots, each the size of a double */
        numlist = (double *) malloc(10*sizeof(double));
        if(numlist == NULL){ return(1); }   /* out of memory */

        numlist[0] = 3;
        numlist[1] = 4;
        numlist[2] = 5;
        numlist[3] = 10;

        printf("\n numlist[0] == %f", numlist[0]);
        printf("\n numlist[1] == %f", numlist[1]);
        printf("\n numlist[2] == %f", numlist[2]);
        printf("\n numlist[3] == %f", numlist[3]);
        printf("\n\nWhy will numlist[10] give an error?");

        free(numlist);
        return(0);
    }

- To dynamically allocate, declare the variable as a pointer with a * (more on this later). At this point it has no space reserved; it is just a variable that can hold the address of a double.
- malloc() reserves space and returns a pointer to it. First, the (double *) cast makes the intended type explicit; C does not require it, but C++ does. Second, remember that you must reserve space equal to the number of slots (10 slots here) x the amount of space for each slot (sizeof(double) in this case).
- Only the first 4 of the 10 slots are assigned here. The other 6 hold garbage until you write to them, and numlist[10] is past the end of the reserved space.
- calloc() and realloc() are similar functions: calloc() initializes everything to 0, and realloc() resizes arrays.
- Always free the space afterwards. Memory that is reserved but never freed is a memory leak.

Working with Arrays
- Arrays are basically matrices, though they can span more than two dimensions
- Can think of them as "vectors of vectors"
- Often these are stored as vectors that need to be unrolled (we will see this later)
- Dynamic allocation is still supported

Array.c --- Static Allocation of Array

    #include <stdio.h>

    int main(void){
        double randMat[3][3];   /* 3 rows, 3 columns */

        randMat[0][0] = 1;
        randMat[1][2] = 3;

        printf("\nrandMat[0][0] = %f", randMat[0][0]);
        printf("\nrandMat[1][2] = %f", randMat[1][2]);
        return(0);
    }

- To create an array, declare the type, then the size of each dimension in its own square brackets, like this.
- The only thing to note here is that the notation is different from R: not randMat[1,2], but randMat[1][2].
- Nothing needs to be freed here, because the array was not allocated with malloc().

Array.c --- Dynamic Allocation of Array

    #include <stdio.h>
    #include <stdlib.h>

    int main(void){
        double **randMat;

        /* first reserve the vector of 3 row pointers ... */
        randMat = (double **) malloc(3*sizeof(double *));
        /* ... then reserve each row of 3 doubles */
        randMat[0] = (double *) malloc(3*sizeof(double));
        randMat[1] = (double *) malloc(3*sizeof(double));
        randMat[2] = (double *) malloc(3*sizeof(double));

        randMat[0][0] = 1;
        randMat[1][2] = 3;

        printf("\nrandMat[0][0] = %f", randMat[0][0]);
        printf("\nrandMat[1][2] = %f", randMat[1][2]);

        /* free the rows first, then the vector that held them */
        free(randMat[0]);
        free(randMat[1]);
        free(randMat[2]);
        free(randMat);
        return(0);
    }

- Dynamic allocation is done as a pointer of pointers. Essentially, a 3x3 matrix is represented as 3 vectors inside a vector. The outer vector of pointers must be allocated before any row can be stored in it.
- Obviously you can do this allocation in a for() loop, but I haven't gotten there yet.
- This technique is particularly effective for sparse matrices.
- When freeing, you have to free each row individually and then the outer vector. This is very important.

Control Flow
- for(), if(), and while() are the ones we are concerned about
- Use {} brackets
- "Controls flow" in the sense that the program may not simply do the next command on the line
- if(): used to condition execution of a statement
- while(): used to loop over a chunk of code for as long as a condition holds
- for(): used to loop over a chunk of code in some order

Control.c --- Control Flow demonstration

    #include <stdio.h>

    int main(void){
        int i, a=4, b=10;
        int temp[10];

        if(b>a){
            printf("\nb>a is true");
        }
        if(b<a){
            printf("\nb<a is true");
        }

        /* runs while a and b differ; a=a+1 eventually makes them equal */
        while(a != b){
            printf("\na equals %i", a);
            a=a+1;
        }

        /* fill the vector, then print it */
        for(i=0;i<10;i++){
            temp[i] = i;
        }
        for(i=0;i<10;i++){
            printf("\nElement %i in temp = %i", i, temp[i]);
        }
        return(0);
    }

- if() statements evaluate booleans (i.e. true/false statements), which in C are just integers: 0 is false and anything else is true. Comparison operators include >, <, >=, <=, == and !=. Note: '==' (comparison) is not the same as '=' (assignment).
- while() statements have the same syntax as if(), but an important point to note here is that the condition has to change at some point (a=a+1 here). Otherwise you will get an infinite loop.
- for() loops typically loop around vectors like this, doing something to each element. Note that 3 things are present in the syntax. First, a counter is initialized to a start value (i=0). Next, a condition is set, and the loop runs while the condition is met (i<10). Finally, a piece of code that increments/decrements the counter at the end of each loop is included (i++).

Functions
- Just like functions in R, think of these as black boxes
- Usually output something after getting some (multiple) inputs
- In fact, with pointers you can have multiple outputs (we will see this soon)
- Use '()' brackets, and match arguments to the call
- One question to think about: what is the computer actually doing here?

Function.c --- Temperature Converter

    #include <stdio.h>

    /* prototype: tells the compiler about convert() before it is used */
    double convert(double fahrenheit);

    int main(void){
        printf("\n\t30 degrees fahrenheit == %f celcius",convert(30));
        printf("\n\t20 degrees fahrenheit == %f celcius",convert(20));
        return(0);
    }

    double convert(double fahrenheit){
        double celcius;
        celcius = (fahrenheit - 32)/1.8;
        return(celcius);
    }

- Put a header (a prototype) for each function at the top of the code. For trivial programs this won't matter, but it will matter a lot for larger programs.
- The last block defines the function. The first double declares the output type. 'convert' is the name. "double fahrenheit" defines the input and its type. This line is equivalent to convert <- function(fahrenheit) in R, except that R declares neither the input type nor the output type.
- Just like in R: define a quantity like "celcius", do some stuff with it, then call return() on it. Remember that the output has to be the same type you declared!
- Call the function with the function name, and arguments in '()' brackets.

Pointers
- Consider the following line in R, where each matrix is NxN and huge: superMat = superMat %*% anotherMat
- What is/could be happening here? Why might this be stupid? What is the alternative?
- Pass by Reference vs. Pass by Value (convert() was done pass by value)
- Two key operators: '*' (dereference, the value stored at an address) and '&' (address-of, the address of a variable)
- Matrices are special because of this: an array passed to a function is passed as a pointer to its first element, not as a copy

Pointer.c --- Temp Converter with Pointer

    #include <stdio.h>

    void convert(double *celcius);

    int main(void){
        double temperature = 80;
        printf("\n\tBefore conversion == %f fahrenheit",temperature);
        convert(&temperature);   /* pass the address, not the value */
        printf("\n\tAfter conversion == %f celcius\n\n",temperature);
        return(0);
    }

    void convert(double *celcius){
        double temp;
        temp = *celcius;           /* read the value at the address */
        temp = (temp - 32)/1.8;
        *celcius = temp;           /* write the result back to that address */
    }

- temp = *celcius takes the value at the address held in celcius (* is the dereferencing operator) and stores it in the variable temp.
- At the end, *celcius = temp dereferences celcius again and stores the new temperature at that address. Hence, the original variable "temperature" has been modified.
- Note that the address of temperature (&temperature) is being passed to convert(). This means temperature is being passed by reference, so it can be modified by the function it is passed to.

LAPACK/BLAS
- Often you will want to apply linear algebra routines to matrices
- While you can write functions to do the calculations manually, this is definitely not advised
- LAPACK/BLAS standardizes the functions, and runs much more efficiently
- General rule: you need to read the documentation very carefully, and test with small examples
- Let's work through a manual calculation of OLS using dgesv() and dgemm()
- You really need to have the documentation of the functions to understand this example as I walk through it. See handouts.
- For other functions, just Google them!

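ols.c --- Manual OLS with BLAS functions

    #include <stdio.h>

    /* Fortran routines from BLAS/LAPACK: every argument is passed by address */
    void dgemm_(const char *transa, const char *transb, const int *m, const int *n,
                const int *k, const double *alpha, const double *a, const int *lda,
                const double *b, const int *ldb, const double *beta, double *c,
                const int *ldc);
    void dgesv_(const int *n, const int *nrhs, double *a, const int *lda,
                int *ipiv, double *b, const int *ldb, int *info);

    int main(void){
        int i,info, ipiv[2];
        char trans = 't', notrans ='n';
        double alpha = 1.0, beta=0.0;
        int ncol=2;
        int nrow=5;
        int one=1;
        double XprimeX[4];
        double X[10] = {1,1,1,1,1,0.3,-0.2,0.4,-0.5,0.3};   /* 5x2, column by column */
        double Y[5] = {0.7,-0.5,0.9,-1.1,0.7};
        double XXinv[4] = {1,0,0,1};                        /* starts as the identity */
        double XXinvX[10];
        double coef[2];

        printf("\n\nX = ");
        for(i=0;i<5;i++) printf("\n%f %f", X[i],X[i+5]);
        printf("\n\nY = ");
        for(i=0;i<5;i++) printf("\n%f", Y[i]);

- Everything here should be pretty straightforward. We just define a few variables, planning to solve for the beta hats in OLS given X and Y.
- The only new thing here so far is that X is a 5x2 matrix, but I am storing it as a 10x1 vector, column by column. This is very common when working with LAPACK/BLAS.
- The trailing underscore in dgemm_ and dgesv_ is how gcc-style toolchains usually expose Fortran routines to C; the program must be linked against the libraries (for example with -llapack -lblas).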

ols.c --- continued

        //compute X'X (2x2)
        dgemm_(&trans,&notrans,&ncol,&ncol,&nrow,&alpha,X,&nrow,X,&nrow,&beta,XprimeX,&ncol);
        printf("\n\nX'X = ");
        for(i=0;i<2;i++) printf("\n%f %f",XprimeX[i], XprimeX[i+2]);

        //compute (X'X)^-1 by solving (X'X) Z = I; XXinv holds I on entry and Z on exit,
        //and XprimeX is overwritten with its LU factors
        dgesv_(&ncol,&ncol,XprimeX,&ncol,ipiv,XXinv,&ncol,&info);
        printf("\n\n(X'X)^-1 = ");
        for(i=0;i<2;i++) printf("\n%f %f",XXinv[i], XXinv[i+2]);

        //compute (X'X)^-1 X' (2x5)
        dgemm_(&notrans,&trans,&ncol,&nrow,&ncol,&alpha,XXinv,&ncol,X,&nrow,&beta,XXinvX,&ncol);

        //compute (X'X)^-1 X'Y (2x1), the OLS coefficients
        dgemm_(&notrans,&notrans,&ncol,&one,&nrow,&alpha,XXinvX,&ncol,Y,&nrow,&beta,coef,&nrow);
        printf("\n\nB0 = %f", coef[0]);
        printf("\nB1 = %f\n\n", coef[1]);
        return(0);
    }

Data Input and Output
- Up to now, we have just manually created data. What if we want to read data from a file?
- Core idea: create a file pointer, open the file with permissions, and then read from or write to it
- Even better: have some error checking on the reading
- In many cases: you will start with a large memory space that you read data into, which you will then need to resize
- Common mistake here: incorrect casting of what you are reading
- Two programs are attached here. We first generate some random data in a file. Then, we read the data in another program.

writedata.c --- Writes random numbers into file

    #include <stdio.h>    /* FILE, fopen(), fprintf(), fclose() */
    #include <stdlib.h>   /* rand(), RAND_MAX */

    int main(void){
        FILE *fp;
        double data[10];
        int i = 0;

        /* rand()/(RAND_MAX+1) is at least 0 and strictly less than 1 */
        for(i=0;i<10;i++){
            data[i] = ( (double)rand() / ((double)(RAND_MAX)+(double)(1)));
        }

        fp = fopen("data.txt","w");
        if(fp == NULL){ return(1); }   /* could not open the file */
        for(i=0;i<10;i++){
            fprintf(fp, "%2.3f \n", data[i]);
        }
        fclose(fp);
        return(0);
    }

- You will need some C libraries here for the file and random number functions.
- Declare a file pointer first. At that point the file has not yet been opened.
- rand() generates a random integer from 0 to RAND_MAX, so the line in the first loop generates a random number from 0 up to (but not including) 1.
- fopen() opens a file and attaches it to the file pointer you created, here with the "w" permission to write to it.
- fprintf() works just like printf(), except you have to specify the file pointer it is printing to.
- Always close your file pointers after you are done with them!!

readdata.c --- Reads random numbers from file

    #include <stdio.h>
    #include <stdlib.h>   /* malloc(), realloc(), free(), exit() */
    #include <string.h>   /* strerror() */
    #include <errno.h>    /* errno */

    int main(void) {
        int MAXVOTES = 10000;
        FILE *fp;
        double *numlist;
        int i = 0;

        numlist = (double *) malloc (MAXVOTES*sizeof(double));
        if(numlist == NULL) { exit(1); }

        if((fp = fopen("data.txt","r"))==NULL) {
            printf("\nUnable to open file DATA.TXT: %s\n", strerror(errno));
            exit(1);
        }
        else {
            /* read until no more numbers are found or the container is full */
            while (i < MAXVOTES && fscanf(fp,"%lf", &numlist[i]) == 1) {
                i++;
            }
            fclose(fp);
            numlist = (double *) realloc(numlist, i* sizeof(double));
            printf("\nAllocation OK, %i votes allocated.\n", i);
        }
        free(numlist);
        return(0);
    }

- Open the required libraries again, including the ones for error handling.
- The fopen() check is very typical error recovery. Try opening the file with read permissions. If it fails, print an error message.
- Usually you will allocate a ton of space before reading data because you don't know how much space you need. Declaring that space up front is a good idea, like MAXVOTES.
- fscanf() is the reverse of fprintf(): it reads an observation from a data file, with the same syntax. It returns the number of items it read, so the while() loop keeps going as long as it gets 1 number back.
- Use "%lf" to read into a double. "%f" reads a float, and the mismatch silently corrupts the data.
- Do not loop on !feof(fp): feof() only becomes true after a read has already failed at the end of the file, so that loop counts one entry too many.
- Notice I used 'i' to count the number of entries, then I resized the array. If you only have 10 entries, you don't need a container that can contain 10,000!