Introduction to R:Joseph Powell Overall Aims Introduce programming concepts relevant to MX Demonstrate the strengths (and weaknesses) of R.

1 Introduction to R: Joseph Powell
Overall Aims
Introduce programming concepts relevant to MX
Demonstrate the strengths (and weaknesses) of R

2 Introduction to R: Joseph Powell
Books
The R Book – Crawley (2007)
Introductions to statistics using R
–Cohen Y. and Cohen J. Y. (2008). Statistics and Data with R.
–Crawley M. (2005). Statistics: An Introduction using R.
–Dalgaard P. (2002). Introductory Statistics with R.
–Maindonald J. & Braun J. (2003). Data Analysis and Graphics Using R: An Example-based Approach.
Books on biological topics
–Paradis E. (2006). Analysis of Phylogenetics and Evolution with R.
–Broman K. W. & Sen S. (2009). A Guide to QTL Mapping with R/qtl.
–Bolker B.M. (2008). Ecological Models and Data in R.
Books on statistical topics
–Aitkin M. et al. (2009). Statistical Modelling in R.
–Faraway J. (2009). Linear Models with R.
–Albert J. (2009). Bayesian Computation with R.
–Bivand R.S. et al. (2009). Applied Spatial Data Analysis with R.
–Cowpertwait P.S.P. & Metcalfe A.V. (2009). Introductory Time Series with R.
Books on R specifics and R programming
–Spector P. (2008). Data Manipulation with R.
–Murrell P. (2006). R Graphics.
–Chambers J. M. (2008). Software for Data Analysis: Programming with R.

3 Introduction to R: Joseph Powell
Websites
–Cran R: http://www.r-project.org/
–R cookbook: http://www.r-cookbook.com/
–R graphics: http://addictedtor.free.fr/graphiques/
–R wiki: http://wiki.r-project.org/
–Mailing lists: http://www.r-project.org/mail.html
–R seek: http://www.rseek.org/
Websites on statistical topics
–R genetics: http://rgenetics.org/trac/rgalaxy
–Bioconductor: http://www.bioconductor.org/

4 Introduction to R: Joseph Powell
The console
Load up R
Console window appears, with a command prompt
Everything in the R console can be partitioned into two fundamental operations:
–Input variables
> x <- 2
–Output variables
> x
[1] 2

5 Introduction to R: Joseph Powell
Objects
Names
–Case sensitive, no spaces
–Must begin with a letter, but can also contain numbers, full stops (.) and underscores (_)
–Try to give your objects meaningful names
> My_f4vourite.langua6e_evR <- "R"
x, y and My_f4v… are objects that we have created
> ls()            # lists all the objects we have created
> rm(y)           # deletes y (forever)
> rm(list=ls())   # deletes everything (..forever)
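The naming rules above can be tried out with a short sketch. The object names and values here are illustrative assumptions, not taken from the slides.

```r
lamb_weight  <- 22.9   # letters and an underscore: a valid name
lamb.weight2 <- 27.5   # dots and digits are allowed after the first letter
Lamb_weight  <- 30.4   # a different object: names are case sensitive

exists("lamb_weight")  # is there an object with this name?
exists("Lamb_weight")

rm(Lamb_weight)        # remove one object by name
exists("Lamb_weight")  # the capitalised object is gone, the others remain
```

Removing objects one at a time by name is usually safer than rm(list=ls()), which clears the whole workspace.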

6 Introduction to R: Joseph Powell
Workspace 1
Everything shown in this list of objects comprises our 'workspace'
> ls()
[1] "My_f4vourite.langua6e_evR" "x" "y"
> save.image(file="myworkspace.RData")
> rm(list=ls())
> ls()
character(0)
> load(file = "myworkspace.RData")
> ls()
[1] "My_f4vourite.langua6e_evR" "x" "y"
Objects are internal to R
–Does not behave like a file structure on the computer
–Can't be read or interpreted outside R (?)

7 Introduction to R: Joseph Powell
Workspace 2
You can select which objects to save
> save(y, x, file = "two_objects.RData")
Different computer folders can be accessed
> dir()                       # lists the files in the current working directory
> setwd("~/work_directory")   # sets R's focus to a different computer folder
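A minimal sketch of saving selected objects and restoring them. It assumes a temporary file from tempfile() rather than a real project path, and the values of x and y are made up for illustration.

```r
x <- 2
y <- c(1, 4, 9)

# tempfile() gives a throwaway path; replace it with your own file name
workspace_file <- tempfile(fileext = ".RData")
save(x, y, file = workspace_file)   # save only the named objects

rm(x, y)                            # remove them from the workspace
restored <- load(workspace_file)    # load() returns the names it restored
print(restored)
print(y)
```

Keeping the value returned by load() is a simple way to see exactly which objects a file has put back into the workspace.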

8 Introduction to R: Joseph Powell
Built-in functions
Native functions make R succinct
A diverse range is available, from graphics to data manipulation to statistical algorithms
They are highly optimised, so use them where available instead of writing your own
Function structure:
> function_name(argument1, argument2, …)
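As a rough illustration of why built-in functions are preferred, the sketch below computes the same total with sum() and with a hand-written loop. The weights vector is a hypothetical example.

```r
weights <- c(22.9, 27.5, 25.5, 25.6)   # hypothetical values

total_builtin <- sum(weights)          # built-in function: one call

total_loop <- 0                        # hand-written equivalent
for (w in weights) {
  total_loop <- total_loop + w
}

all.equal(total_builtin, total_loop)   # both approaches agree
round(mean(weights), digits = 1)       # function calls can be nested
```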

9 Introduction to R: Joseph Powell
Missing values
NA is a "reserved" word in R
It is a single element (length 1) that indicates a missing value
A helpful alternative to coding missing values (e.g. -99)
> my_array <- c(NA,100,120,120,120,130,NA)
> sum(my_array)
[1] NA
> sum(my_array,na.rm=T)   # most functions allow you to explicitly state how to handle NA
[1] 590
> table(my_array)         # HOWEVER the default action varies from function to function
my_array
100 120 130
  1   3   1
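A short sketch of some other ways to handle NA, using the same my_array. The useNA argument of table() is part of base R; the variable names n_missing, mean_obs and counts are only illustrative.

```r
my_array <- c(NA, 100, 120, 120, 120, 130, NA)

is.na(my_array)                            # TRUE where a value is missing
n_missing <- sum(is.na(my_array))          # count the missing values
mean_obs  <- mean(my_array, na.rm = TRUE)  # mean of the observed values only

# table() drops NA by default; useNA = "ifany" reports them as well
counts <- table(my_array, useNA = "ifany")

print(n_missing)
print(mean_obs)
print(counts)
```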

10 Introduction to R: Joseph Powell
R help pages
Each function has its own unique syntax
–Default arguments
–Data structure requirements
–Output options
> ?seq          # brings up help page of seq() function
> ??"sequence"  # searches for all related functions
Note
> seq(from = 2, to = 100, by = 2)
is clearer than
> seq(2,100,2)
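The point about named arguments can be checked directly. This sketch compares the two calls from the slide and adds args(), a base R function that prints a function's arguments and defaults in the console.

```r
by_name     <- seq(from = 2, to = 100, by = 2)
by_position <- seq(2, 100, 2)
identical(by_name, by_position)   # same result, but the first call documents itself

# Named arguments can be given in any order
reordered <- seq(by = 2, to = 100, from = 2)
identical(by_name, reordered)

# args() shows a function's arguments without opening the help page
args(seq.default)
```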

11 Introduction to R: Joseph Powell
Basic Scripting
Note pad / text editor
–Within the R GUI
–Open with: File > New Script, or Ctrl+N
–Layout as tile is useful: Windows > Tile

12 Introduction to R: Joseph Powell
Basic Scripting
Note pad / text editor
–Useful for keeping all work together
–Scripts can be saved
–Can be used to save a "program"
–Add # comments
–Check individual bits of code
–Ctrl+R runs the whole current line, or the selected code

13 Introduction to R: Joseph Powell
Basic Scripting
Brackets
–( ) functions
–[ ] subsets
–{ } processes
Subsets
–Take a subset of an object
–Objects have either 1 x n, or m x n dimensions
–For m x n objects the index is [rows, columns]
A 1 x n object:
> x
[1] 2 5 6 2 6 77 55
> x[5]
[1] 6
An m x n object:
> x
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> x[3,4]
[1] 12
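A small sketch of the subsetting rules, rebuilding the vector and the matrix shown on the slide. The names x_vec and x_mat are assumptions introduced so that both objects can exist at the same time.

```r
x_vec <- c(2, 5, 6, 2, 6, 77, 55)
x_mat <- matrix(1:12, nrow = 3)   # filled column by column, as on the slide

x_vec[5]           # fifth element
x_mat[3, 4]        # row 3, column 4
x_mat[2, ]         # leave the column index empty to take a whole row
x_mat[, 1:2]       # first two columns, still a matrix
x_vec[x_vec > 5]   # a logical condition can also be used as the index
```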

14 Introduction to R: Joseph Powell
Basic Scripting
Data input
–Direct input into the console: scan()
–Reading in data: read.table() / read.csv()
–"name.txt"
–"c:\\temp\\name.txt"
–file.choose()
–list.files()
–dir()
> y <- scan()
1: 3
2: 4
3: 12
4: 3
5: 5
6: 2
7: 14
8:
Read 7 items
> dir()
[1] "temp.csv" "temp2.csv" "name.txt"
> y <- read.table("name.txt", header=T, sep="\t")
>
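A self-contained sketch of read.table(). Because the slide's name.txt is not available, the example first writes a small tab-delimited file to a temporary location; the column names and values are hypothetical.

```r
# Create a small tab-delimited file to read back (hypothetical data)
input_file <- tempfile(fileext = ".txt")
writeLines(c("id\tweight", "1\t22.9", "2\t27.5", "3\t25.5"), input_file)

# header = TRUE uses the first line as column names; sep = "\t" splits on tabs
y <- read.table(input_file, header = TRUE, sep = "\t")
str(y)   # check the column names and types after reading
```

Running str() straight after reading is a quick way to confirm that numbers were read as numbers and not as text.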

15 Introduction to R: Joseph Powell
Basic Scripting
Data output
–Direct output from the console to a file: sink()
–Writing out data: write.table() / write.csv()
–"name.txt"
–"c:\\temp\\name.txt"
> sink("sink_tmp.txt")   # console output now goes to the file
> i <- 1:10
> outer(i, i, "*")
> sink()                 # console output returns to the screen
> dir()
[1] "temp.csv" "temp2.csv" "name.txt"
> write.table(y, file="name.txt", col.names=T, sep="\t")   # the object to write comes first
>
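A minimal sketch of both output routes, assuming a small made-up data frame y and temporary files instead of real paths.

```r
y <- data.frame(id = 1:3, weight = c(22.9, 27.5, 25.5))  # hypothetical data

# write.table() takes the object first, then the destination file
out_file <- tempfile(fileext = ".txt")
write.table(y, file = out_file, sep = "\t", row.names = FALSE, quote = FALSE)

# sink() diverts printed console output to a file until sink() is called again
sink_file <- tempfile(fileext = ".txt")
sink(sink_file)
print(summary(y$weight))
sink()

readLines(out_file)    # header line followed by three data lines
readLines(sink_file)   # the printed summary
```

write.table() stores the data themselves in a form that can be read back, whereas sink() only captures what would have been printed on screen.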

16 Introduction to R: Joseph Powell
Basic Scripting
Adding rows and columns
–Allows objects to be joined, either to an existing object or to make a new object
–cbind() – adds columns together
–rbind() – adds rows together
> y1
     [,1] [,2] [,3]
[1,]    1    3 12.5
[2,]    1    2 13.8
[3,]    1    5 15.3
[4,]    1    4 16.8
> y2
      [,1]
[1,] 0.349
[2,] 0.745
[3,] 0.684
[4,] 0.964
> y3 <- cbind(y1, y2)
> y3
     [,1] [,2] [,3]  [,4]
[1,]    1    3 12.5 0.349
[2,]    1    2 13.8 0.745
[3,]    1    5 15.3 0.684
[4,]    1    4 16.8 0.964
> y3 <- rbind(y1, y2[1:3])
> y3
      [,1]  [,2]   [,3]
[1,] 1.000 3.000 12.500
[2,] 1.000 2.000 13.800
[3,] 1.000 5.000 15.300
[4,] 1.000 4.000 16.800
[5,] 0.349 0.745  0.684
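The joins above can be reproduced with the sketch below, which rebuilds y1 and y2 from the values on the slide. The result names wide and tall are illustrative.

```r
y1 <- cbind(rep(1, 4), c(3, 2, 5, 4), c(12.5, 13.8, 15.3, 16.8))
y2 <- matrix(c(0.349, 0.745, 0.684, 0.964), ncol = 1)

wide <- cbind(y1, y2)        # y2 becomes a fourth column
tall <- rbind(y1, y2[1:3])   # the first three values of y2 become a fifth row

dim(wide)   # rows, columns after cbind()
dim(tall)   # rows, columns after rbind()
```

cbind() needs the same number of rows in each piece, and rbind() the same number of columns, which is why only the first three values of y2 are used for the extra row.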

17 Introduction to R: Joseph Powell
Basic Scripting
for loops
–loop through a set of commands a given number of times
–very useful, but are not optimal for memory
> dim(y)
[1] 10 10
> for(i in 1:nrow(y)) { y_mean <- mean(y[i, 1:10]) }   # y_mean is overwritten on every pass
> y_mean                                               # so only the mean of the last row is kept
[1] 0.1974492
> out <- array(0, c(nrow(y), 1))                       # create an object to hold one result per row
> for(i in 1:nrow(y)) { out[i] <- mean(y[i, ]) }       # store each row mean in its own element
> out
            [,1]
 [1,] -0.3110800
 [2,] -0.2000344
 [3,]  0.2019573
 [4,]  0.2859823
 [5,]  0.1932523
 [6,]  0.2759323
 [7,] -0.2571102
 [8,] -0.1037983
 [9,]  0.3522018
[10,]  0.1974492
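A sketch comparing the loop with a built-in alternative. The slide's y is not available, so a random 10 x 10 matrix stands in for it; the seed and the data are assumptions and will not reproduce the numbers above.

```r
set.seed(1)                          # arbitrary seed so the sketch is repeatable
y <- matrix(rnorm(100), nrow = 10)   # stand-in for the 10 x 10 object on the slide

out <- numeric(nrow(y))              # create the result vector before the loop
for (i in seq_len(nrow(y))) {
  out[i] <- mean(y[i, ])             # one mean per row
}

vectorised <- rowMeans(y)            # same result without an explicit loop
all.equal(out, vectorised)
```

seq_len(nrow(y)) is used instead of 1:nrow(y) because it behaves sensibly when an object has zero rows.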

18 Introduction to R: Joseph Powell
Data Manipulation
Check data
–dim()
–mydata[1:10, 1:10]
–str()
–summary()
–head()
–tail()
–table()
–etc…
> mydata <- read.table("mydata.txt", header=T, sep="\t")
> dim(mydata)
[1]  642 1470
> mydata[1:10, 1:10]
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    2    2    1    2    1    2    0    1    0     1
 [2,]    0    0    2    2    0    0    1    2    1     2
 [3,]    0    2    2    2    1    1    0    0    2     1
 [4,]    2    0    2    2    2    0    1    2    0     1
 [5,]    2    0    0    2    0    1    1    0    2     0
 [6,]    2    1    2    1    1    0    2    2    1     1
 [7,]    1    1    2    2    1    2    2    2    0     1
 [8,]    0    1    0    0    0    1    1    1    1     1
 [9,]    0    0    1    2    1    2    2    0    0     1
[10,]    1    0    1    1    2    0    1    0    0     1

19 Introduction to R: Joseph Powell
Data Manipulation
Reordering
–If you have a data.frame or matrix (numbers or letters)
–Use: order()
–index <- order(old[,1], decreasing=T)
> dim(lamb)
[1] 1600    5
> head(lamb)
  Field   Weight sire dam sex
1     A 22.92368    1   1   F
2     A 27.52896    1   1   F
3     A 25.52592    1   1   M
4     A 25.56016    1   1   M
5     A 24.53296    1   2   F
6     A 22.03344    1   2   F
> lamb <- lamb[order(lamb$sex, decreasing=F), ]
> head(lamb)
   Field   Weight sire dam sex
1      A 22.92368    1   1   F
2      A 27.52896    1   1   F
5      A 24.53296    1   2   F
6      A 22.03344    1   2   F
9      A 30.37944    2   1   F
10     A 25.93680    2   1   F

20 Introduction to R: Joseph Powell
Data Manipulation
Reordering
–order()
> lamb <- lamb[order(lamb$sex, decreasing=F), ]
Expanded way
> rows <- order(lamb$sex, decreasing=F)
> lamb <- lamb[rows, ]
or, inspecting the index first
> index <- order(lamb$sex, decreasing=F)
> head(index)
[1]  1  2  5  6  9 10
> lamb <- lamb[index, ]
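A small sketch of order() on a made-up miniature of the lamb data. Sorting on a second key (heaviest first within each sex) is an extension of the slide's example, shown here only to illustrate that order() accepts several keys.

```r
lamb <- data.frame(                      # hypothetical miniature of the lamb data
  Field  = c("A", "A", "B", "B"),
  Weight = c(22.9, 27.5, 25.5, 24.5),
  sex    = c("F", "M", "F", "M")
)

index  <- order(lamb$sex, -lamb$Weight)  # sex ascending, then heaviest first
sorted <- lamb[index, ]                  # reorder the rows, keep all columns

print(index)
print(sorted)
```

order() returns row positions, not sorted values, which is why the result is used inside the square brackets.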

21 Introduction to R: Joseph Powell
Data Manipulation
Replacing
–index
–which()
> class(lamb)
[1] "matrix"
> head(lamb)
  Field   Weight sire dam sex
1     A 22.92368    1   1   F
2     A 27.52896    1   1   F
3     B 25.52592    1   1   M
With a logical index:
> index <- lamb[,1]=="A"
> head(index)
[1] TRUE TRUE FALSE TRUE FALSE
> lamb[index, 1] <- "C"
> head(lamb)
  Field   Weight sire dam sex
1     C 22.92368    1   1   F
2     C 27.52896    1   1   F
3     B 25.52592    1   1   M
With which(), which returns the matching row numbers:
> index <- which(lamb[,1]=="A")
> head(index)
1 2 4 6 7 10
> lamb[index, 1] <- "C"
Put it together
> lamb[which(lamb[,1]=="A"), 1] <- "C"
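The difference between a logical index and which() shows up when the data contain missing values. This sketch uses a short made-up vector named field rather than the lamb matrix.

```r
field <- c("A", "A", "B", "A", NA)      # hypothetical values, including one NA

logical_index  <- field == "A"          # the comparison with NA gives NA
position_index <- which(field == "A")   # which() keeps only the TRUE positions

print(logical_index)
print(position_index)

field[position_index] <- "C"            # replace by position
print(field)
```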

22 Introduction to R: Joseph Powell
Data Manipulation
Replacing
> class(lamb)
[1] "matrix"
> head(lamb)
  Field   Weight sire dam sex
1     A 22.92368    1   1   F
2     A 27.52896    1   1   F
3     B 25.52592    1   1   M
> index <- lamb[,2] <= 22.000
> table(index)
index
FALSE  TRUE
 1553    47
> lamb[index, 2] <- NA   # NA without quotes marks a true missing value
> which(lamb[,2] >= 20.0 & lamb[,2] <= 21.0)
214 363 496 842 921 983 1103 1126
> which(lamb[,1]=="A" & lamb[,2] >= 20.0 & lamb[,2] <= 21.0)
214 363 496
> new_lamb <- lamb[which(lamb[,1]=="A" & lamb[,2] >= 20.0 & lamb[,2] <= 21.0), ]
> new_lamb
    Field Weight sire dam sex
214     A  20.46   27   2   F
363     A  20.08   46   1   M
496     A  20.41   62   2   M
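A compact sketch of conditional replacement and filtering, assuming a four-row made-up data frame in place of the 1600-row lamb object. The thresholds are chosen for the example only.

```r
lamb <- data.frame(                   # hypothetical values
  Field  = c("A", "A", "B", "A"),
  Weight = c(20.5, 27.5, 20.8, 19.0)
)

lamb$Weight[lamb$Weight < 20] <- NA   # NA without quotes keeps the column numeric

# Combine conditions with &; which() drops rows where the test is NA
keep     <- which(lamb$Field == "A" & lamb$Weight >= 20 & lamb$Weight <= 21)
new_lamb <- lamb[keep, ]

print(new_lamb)
is.numeric(lamb$Weight)
```

Assigning the string "NA" instead of NA would turn a numeric column into text, so later comparisons such as >= 20 would no longer behave as numeric tests.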

23 Introduction to R: Joseph Powell
Graphics with R: Overview
1. Why graphics?
2. Why graphics in R?
3. The R graphics systems (did you really expect just one?)
4. Graphics basics and examples
5. Customisation of a graphic
6. Overview of different systems and packages

24 Introduction to R: Joseph Powell
plot(x, y, …)
> ?Formaldehyde
> head(Formaldehyde)
  carb optden
1  0.1  0.086
2  0.3  0.269
3  0.5  0.446
4  0.6  0.538
5  0.7  0.626
6  0.9  0.782
> plot(Formaldehyde)
> ?par
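A minimal sketch of customising the basic plot through arguments to plot(). The title, colour and symbol choices are illustrative assumptions; the axis labels follow the variable descriptions on the Formaldehyde help page. pdf(file = NULL) is used so the example runs without opening a window or writing a file, and can be dropped when working interactively.

```r
pdf(file = NULL)   # assumption: draw to a null device; omit this in an interactive session

plot(optden ~ carb, data = Formaldehyde,
     type = "b",                       # draw both points and connecting lines
     pch  = 19, col = "blue",          # solid circles in blue
     xlab = "Carbohydrate (ml)",
     ylab = "Optical density",
     main = "Formaldehyde data")

dev.off()          # close the device opened above
```

Most of these settings are documented under ?par, which is why the slide points there next.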
