An Introduction to R: Logic & Basics
The R language
- R is driven from the command line.
- It can be executed within a terminal, or within Emacs using ESS (Emacs Speaks Statistics).
Data structures
- Vector (~ 1-dimensional array)
> vect = c(4,6,7,10)
> vect2 = c(vect,vect)
- Matrix (~ 2-dimensional array)
> mat = matrix(0, 2, 2) # matrix of size 2x2, filled with 0
> mat = matrix(vect, 2, 2) # matrix of size 2x2, filled with the values of vect
- Array (~ n-dimensional)
> arr = array(0, dim=c(10,10,10)) # cube of 10x10x10
These data structures shape the logic of R: the language is designed to make them easy to use.
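As a small sketch of how these structures relate (the variable name `arr` is chosen here for illustration), the following shows that a matrix built from a vector is filled column by column, and how to query sizes:

```r
vect <- c(4, 6, 7, 10)

# A matrix built from a vector is filled column by column
mat <- matrix(vect, 2, 2)
mat[1, 2]      # 7: first row, second column
dim(mat)       # 2 2

# An n-dimensional array; dim gives the size along each dimension
arr <- array(0, dim = c(10, 10, 10))
length(arr)    # 1000 elements in total
```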
A few tricks on Vectors
- The indices can be a vector! Indices begin at 1!
> vect.A = c(4,6,7,10)
> vect.B = c(1,4)
> vect.A[1]
4
> vect.A[vect.B] # equivalent to vect.A[c(1,4)]
- The which() function is the most useful
> which(vect.A == 7)
3
> which(vect.A > 6)
3 4
- The ':' and 'seq' functions generate sequences of numbers; 'gl' generates factor levels
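A minimal sketch of the sequence generators and of index-based selection mentioned on this slide. The values are illustrative; note that a logical vector can be used directly as an index, which gives the same result as going through which():

```r
vect.A <- c(4, 6, 7, 10)

1:4                    # 1 2 3 4
seq(2, 10, by = 2)     # 2 4 6 8 10
gl(2, 3)               # factor with values 1 1 1 2 2 2 and levels 1, 2

# Select the values greater than 6, by position or by logical test
by.position <- vect.A[which(vect.A > 6)]   # 7 10
by.logical  <- vect.A[vect.A > 6]          # 7 10
```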
A few tricks on Matrices (1)
- The indices can still be given as a vector!
> MY.matrix[i,j] # gives the element at row i, column j
> MY.matrix[i,] # gives row i
> MY.matrix[c(1,2,3),] # gives the first 3 rows as a matrix
> MY.matrix[c(1,2,3), c(1,2,3)] # gives a sub-matrix
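To make the indexing concrete, here is a sketch on a small matrix whose contents are made up for illustration:

```r
# Hypothetical 4x3 matrix, filled column by column with 1..12
MY.matrix <- matrix(1:12, nrow = 4, ncol = 3)

MY.matrix[2, 3]                       # 10: row 2, column 3
MY.matrix[2, ]                        # 2 6 10: row 2, returned as a vector
first.rows <- MY.matrix[c(1, 2, 3), ] # first 3 rows, still a matrix (3x3)
sub <- MY.matrix[c(1, 2), c(1, 2)]    # 2x2 sub-matrix

# A single row loses its matrix shape unless drop = FALSE is given
one.row <- MY.matrix[2, , drop = FALSE]   # 1x3 matrix
```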
A few tricks on Matrices (2)
- The which() function is still very useful
# Prints extreme values of a matrix
> MY.matrix[which(MY.matrix > cutoff)]
Here, the values greater than cutoff are printed.
- An example: I have a file of the following type:
pdb NB_chains NB_identical_int NB_homologous_int NB_different_int
> data = read.table(file = '~/elevy/... ')
> identical.1 = which(data[,3] == 1)
> dimers = which(data[,2] == 2)
> homodimers = intersect(identical.1,dimers)
> data[homodimers,] # prints all the homodimer lines!
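The homodimer example can be tried without the original file. The sketch below builds a small table in memory with the same column layout; every value in it, including the pdb identifiers, is a made-up placeholder:

```r
# Placeholder data with the same columns as the file described above
data <- data.frame(
  pdb               = c("pdbA", "pdbB", "pdbC", "pdbD"),
  NB_chains         = c(2, 2, 4, 2),
  NB_identical_int  = c(1, 0, 1, 1),
  NB_homologous_int = c(0, 1, 0, 0),
  NB_different_int  = c(0, 0, 2, 0)
)

identical.1 <- which(data[, 3] == 1)          # rows 1 3 4
dimers      <- which(data[, 2] == 2)          # rows 1 2 4
homodimers  <- intersect(identical.1, dimers) # rows 1 4
data[homodimers, ]

# The same selection with a single logical test on named columns
same <- data[data$NB_chains == 2 & data$NB_identical_int == 1, ]
```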
More tricks
- How many numbers in a matrix are equal to 5?
- How many numbers are in common between 2 matrices?
- How to replace all the 4s by -4 in any data structure?
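One possible set of answers to these three questions, sketched on two small made-up matrices. "In common" is interpreted here as the number of distinct values present in both, which is an assumption about what the question intends:

```r
m1 <- matrix(c(5, 4, 5, 1, 4, 9), nrow = 2)
m2 <- matrix(c(5, 7, 9, 9, 2, 3), nrow = 2)

# How many entries are equal to 5? TRUE counts as 1 in sum()
n.five <- sum(m1 == 5)                    # 2

# How many distinct values appear in both matrices?
n.common <- length(intersect(m1, m2))     # 2 (the values 5 and 9)

# Replace every 4 by -4; the same idiom works on vectors and arrays
m1[m1 == 4] <- -4
```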
Some useful functions (1)
Play with data structures
- min, max, which.min, which.max
- == combined with sum
- mean, sd
- intersect
- cor combined with hclust / heatmap
Type
- Cast operator: as.type (for example as.numeric, as.character)
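A short sketch of these functions on made-up values. The last part turns correlations into distances with 1 - cor before clustering, which is one common choice and not necessarily the one intended on the slide:

```r
x <- c(3, 8, 1, 8, 5)

min(x)           # 1
max(x)           # 8
which.min(x)     # 3: position of the minimum
which.max(x)     # 2: position of the first maximum
sum(x == 8)      # 2: number of entries equal to 8
mean(x)          # 5
sd(x)            # about 3.08

# Casting between types
as.numeric("3.5") + 1   # 4.5
as.character(10)        # "10"

# Correlation between the columns of a matrix, then hierarchical clustering
m  <- matrix(c(1, 2, 3, 4,  2, 4, 6, 9,  4, 3, 2, 1), ncol = 3)
cc <- cor(m)                     # 3x3 correlation matrix
hc <- hclust(as.dist(1 - cc))    # assumption: 1 - correlation as distance
```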
Some useful functions (2)
For text printing
> print('Hello')
> print(paste('one', i, 'two', j, sep=' '))
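A runnable version of the printing example, with i and j given arbitrary values. paste0 and cat are shown as related options:

```r
i <- 1
j <- 2

msg <- paste("one", i, "two", j, sep = " ")   # "one 1 two 2"
print(msg)

# paste0 concatenates without any separator
fname <- paste0("file", i, ".txt")            # "file1.txt"

# cat prints the raw text, without quotes or the leading [1]
cat(msg, "\n")
```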
Some useful functions (3)
For statistical analysis
- rand / random do not exist! There are distribution-specific functions instead:
runif(n) Uniform distribution (equivalent to rand); n is the number of values to draw
rnorm(n) Gaussian (normal) distribution
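A brief sketch of drawing random numbers. The seed and the distribution parameters below are arbitrary choices for illustration:

```r
set.seed(42)    # fix the seed so that the draws are reproducible

u <- runif(5)                          # 5 values, uniform on [0, 1]
g <- rnorm(1000, mean = 10, sd = 2)    # 1000 values from a Gaussian

mean(g)    # close to 10
sd(g)      # close to 2

# Other bounds for the uniform distribution
v <- runif(3, min = -1, max = 1)
```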
Some useful functions (4)
Useful graphical functions
- plot: 2D, look at demo(graphics)
- image: 2D, look at demo(image)
- heatmap: clustering + image & tree
- par(): stores most of the graphical parameters used to customize the display
- persp: 3D, look at demo(persp)
Find help & examples: help.start() or help(function) or ?function
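A minimal sketch combining par(), plot and image. It uses the volcano dataset that ships with R; the panel layout and titles are arbitrary choices:

```r
# Save the current graphical parameters and split the display in two panels
old.par <- par(mfrow = c(1, 2))

x <- seq(0, 2 * pi, length.out = 50)
plot(x, sin(x), type = "l", main = "plot")   # 2D line plot

image(volcano, main = "image")               # matrix shown as a colour map

# Restore the previous graphical parameters
par(old.par)
```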
Some remarks
- Underscores were not allowed in variable names in old versions of R (before 1.9.0); current versions accept them, but the dot is still commonly used instead. The « dot » doesn't mean « method call » as in object-oriented languages!
- There is another « vector-like » data structure: list, which can store arbitrary objects rather than only numbers.
- There is another « matrix-like » data structure: data.frame, a table whose rows and columns can be named and whose columns can hold different types.
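To illustrate the two extra data structures, here is a small sketch; all names and values in it are invented for the example:

```r
# A list can mix objects of different kinds
l <- list(name = "dimer", sizes = c(2, 4), m = matrix(0, 2, 2))
l$sizes        # 2 4
l[["name"]]    # "dimer"

# A data.frame has named columns, possibly of different types
df <- data.frame(chains = c(2, 4), label = c("a", "b"))
rownames(df) <- c("first", "second")

df["second", "chains"]   # 4: access by row and column name
df$label                 # "a" "b"
```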
Last remarks
- You can run scripts in BATCH mode, for example:
$ R --vanilla < script.r
- To quit R, type q(). The () are very important: if you omit them, the source code of the function is printed! (This is true for any function.)
- Don't hesitate to ask questions