# R Lecture 6: Naomi Altman, Department of Statistics


Odds and Ends: speeding things up (a bit), errors in functions and loops, source and sink, attach and detach, resampling, garbage collection.

Speeding Things Up. (I am assuming that R behaves like Splus.) Try to minimize the number of expressions evaluated. R is an "interpreted" language, not a "compiled" language. This means that each time R encounters an expression, it has to interpret it at run time rather than executing machine code prepared in advance. If you write a loop, then R has to evaluate the expressions within the loop again on every iteration. Looking at the code for apply, it is itself built around a loop, so apply does not get around this.
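As a rough illustration of "fewer expressions evaluated", the sketch below computes the same sum of squares twice: once with a loop that evaluates its body for every element, and once with a single vectorized call. The data and the variable names (x, total_loop, total_vec) are made up for the example.

```r
# Hypothetical data: 10,000 uniform random numbers.
set.seed(1)
x <- runif(10000)

# Loop version: the body is evaluated once per element of x.
total_loop <- 0
for (v in x) {
  total_loop <- total_loop + v^2
}

# Vectorized version: one expression, with the looping done inside sum().
total_vec <- sum(x^2)
```

Both versions should agree up to floating-point rounding; wrapping each in system.time() is one way to compare them on your own machine.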

Speeding Things Up: create space for new objects. Every time you change the dimensions of an object, R reallocates space. So my loop:

    intercepts=NULL
    for (i in 1:1000){
      y=line+rnorm(25,0,3)
      lmout=lm(y~x)
      intercepts=c(intercepts,getIntercept(lmout))
    }

would be better as:

    intercepts=rep(0,1000)
    for (i in 1:1000){
      y=line+rnorm(25,0,3)
      lmout=lm(y~x)
      intercepts[i]=getIntercept(lmout)
    }
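The loop above depends on objects (x, line, getIntercept) defined elsewhere, so here is a self-contained sketch of the same comparison. The values chosen for x and line, the smaller number of replicates, and the use of coef() in place of getIntercept are all assumptions made for the example.

```r
# Assumed setup: 25 design points and a hypothetical true line.
x <- 1:25
line <- 2 + 0.5 * x
n_sim <- 200

# Growing the result vector: c() reallocates it on every pass.
set.seed(1)
grown <- NULL
for (i in seq_len(n_sim)) {
  y <- line + rnorm(25, 0, 3)
  grown <- c(grown, coef(lm(y ~ x))[1])
}

# Preallocating: the vector is created once and filled in place.
set.seed(1)
prealloc <- numeric(n_sim)
for (i in seq_len(n_sim)) {
  y <- line + rnorm(25, 0, 3)
  prealloc[i] <- coef(lm(y ~ x))[1]
}
```

With the same seed, the two vectors hold the same intercepts; only the way the storage is managed differs.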

Errors in Functions and Loops. If an error occurs somewhere within a function or loop, R aborts the whole process and destroys the frame. Nothing will change in your .RData workspace. There is no such thing as "partial execution"; if even one mistake occurs, the whole process is aborted. If you want to "trap" errors so that your routine can recover, wrap the call in "try": instead of stopping, it returns an object of class "try-error" that your code can test as a flag.
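A minimal sketch of trapping an error inside a loop with try. The inputs are invented so that the second one fails; the loop records NA for it and carries on.

```r
# Hypothetical inputs: the second element makes sqrt() fail.
inputs <- list(4, "oops", 9)
results <- numeric(length(inputs))

for (i in seq_along(inputs)) {
  # silent = TRUE suppresses the printed error message.
  out <- try(sqrt(inputs[[i]]), silent = TRUE)
  if (inherits(out, "try-error")) {
    results[i] <- NA      # flag the failure and keep going
  } else {
    results[i] <- out
  }
}
```

Without try, the error at the second element would abort the loop and the third value would never be computed.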

"source" and "sink". source("myfile.txt") runs the commands in the text file "myfile.txt" until they are completed. sink("myOutput.txt") sends everything that would be written to the screen to the text file "myOutput.txt" until sink() is typed in.
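The two commands can be tried together without touching any real files. The sketch below writes a two-line script to a temporary file, diverts output to a second temporary file, and then reads that file back. The file contents and variable names are assumptions for the example.

```r
# Write a small script to a temporary file (stands in for "myfile.txt").
script <- tempfile(fileext = ".R")
writeLines(c("a <- 2 + 3", "print(a)"), script)

# Divert screen output to a temporary file (stands in for "myOutput.txt").
out_file <- tempfile(fileext = ".txt")
sink(out_file)
source(script)   # runs the commands; print(a) goes to out_file
sink()           # restore output to the screen

captured <- readLines(out_file)
```

After this runs, captured holds the text that print(a) would otherwise have shown on the screen.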

Attach and Detach. "attach" and "detach" expand and contract the search path for R. You can attach either a dataframe or a .RData workspace. Any new assignments are stored in the current workspace, not in the attached object. When you want to stop searching the dataframe or workspace, you can detach it.
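A short sketch of what attach and detach do to the search path, using a made-up dataframe named df. It also shows that an assignment made while df is attached creates a new object in the workspace and leaves df itself untouched.

```r
# Hypothetical dataframe for the example.
df <- data.frame(height = c(150, 160, 170))

attach(df)                       # df is now on the search path
m <- mean(height)                # 'height' is found inside df
height <- height / 100           # new workspace object; df is unchanged
attached <- "df" %in% search()   # TRUE while df is attached
detach(df)                       # remove df from the search path
still_attached <- "df" %in% search()
```

Calling rm(height) afterwards removes the workspace copy if it is no longer wanted.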

Resampling. Many statistical methods require sampling from the data already collected. Examples include the bootstrap and permutation methods. sample(x, size, replace = FALSE, prob = NULL) allows you to sample with or without replacement from x, with unequal probabilities if you want.

Resampling. sample(x,length(x),replace=F) creates a random permutation of the data.
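As one way a random permutation gets used, here is a sketch of a permutation test for a difference in group means. The data are simulated, and the group labels, the 999 permutations, and the two-sided p-value formula are choices made for this example rather than anything prescribed in the lecture.

```r
# Simulated data: two groups of 10, with group B shifted upward.
set.seed(42)
group <- rep(c("A", "B"), each = 10)
value <- c(rnorm(10, mean = 0), rnorm(10, mean = 1))

# Observed difference in means.
obs <- mean(value[group == "B"]) - mean(value[group == "A"])

# Permutation distribution: shuffle the values, keep the labels fixed.
perm <- numeric(999)
for (i in seq_along(perm)) {
  shuffled <- sample(value, length(value), replace = FALSE)
  perm[i] <- mean(shuffled[group == "B"]) - mean(shuffled[group == "A"])
}

# Two-sided p-value, counting the observed statistic as one permutation.
p_value <- (sum(abs(perm) >= abs(obs)) + 1) / (length(perm) + 1)
```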

Session: Resampling

    a=c("H","T")
    sample(a,10,replace=T)
    sample(a,10,replace=T,prob=c(.2,.8))
    sample(1:5,5,replace=T)
    sample(1:5,5,replace=F)

    #bootstrap distribution of the intercept
    attach(myData)
    intercept=rep(0,1000)
    l=length(weight)
    for (i in 1:1000){
      samp=sample(1:l,l,replace=T)
      intercept[i]=coefficients(lm(weight[samp]~ hip[samp]+waist[samp]))[1]
    }
    hist(intercept)
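The session above needs the course dataset myData, which is not included here. The sketch below runs the same kind of bootstrap on simulated data that stands in for it. The column values, the 500 replicates, and the use of the data argument of lm instead of attach are all assumptions made for the example.

```r
# Simulated stand-in for myData (hypothetical values).
set.seed(123)
n <- 40
myData <- data.frame(hip = rnorm(n, 100, 8), waist = rnorm(n, 85, 10))
myData$weight <- 5 + 0.4 * myData$hip + 0.3 * myData$waist + rnorm(n, 0, 4)

# Bootstrap: resample rows with replacement and refit each time.
B <- 500
intercept <- numeric(B)
for (i in seq_len(B)) {
  rows <- sample(seq_len(n), n, replace = TRUE)
  fit <- lm(weight ~ hip + waist, data = myData[rows, ])
  intercept[i] <- coef(fit)[1]
}

# Spread of the bootstrap distribution of the intercept.
boot_se <- sd(intercept)
```

hist(intercept) then shows the bootstrap distribution, as in the session.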

Handling Exceptions in Functions

    # John Hughes
    # Stat 597C Homework 5
    getIntercept = function(mod)
    {
      if (inherits(mod, "lm"))                      # If mod is of the right type
      {
        if (names(mod$coef)[1] == "(Intercept)")    # and it has an intercept,
          mod$coef[1]                               # return the intercept.
        else
          stop("model has no intercept")            # Otherwise, bark.
      }
      else
        stop("instance of class \"lm\" required")   # Do likewise here.
    }
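To see both error branches fire, the sketch below defines a compact version of getIntercept (restructured for the example, so not the homework code verbatim) and calls it on a model with an intercept, a model without one, and a non-model. tryCatch is used here to capture the error messages, and the data are simulated.

```r
# Compact variant of getIntercept, rewritten for this example.
getIntercept <- function(mod) {
  if (!inherits(mod, "lm")) stop("instance of class \"lm\" required")
  if (names(coef(mod))[1] != "(Intercept)") stop("model has no intercept")
  coef(mod)[1]
}

# Simulated data for the example.
set.seed(7)
x <- 1:10
y <- 3 + 2 * x + rnorm(10)

with_int <- getIntercept(lm(y ~ x))

# Capture the message instead of letting the error abort the script.
no_int <- tryCatch(getIntercept(lm(y ~ x - 1)),
                   error = function(e) conditionMessage(e))
wrong_type <- tryCatch(getIntercept("not a model"),
                       error = function(e) conditionMessage(e))
```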

Invisible Functions

    methods(plot)
    plot.density
    plot.hclust
    getAnywhere(plot.hclust)

Functions that Return Invisible Output

    x=rnorm(100)
    hist(x)
    y=hist(x)
    y
    hist.default
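You can write functions with the same behavior yourself by wrapping the return value in invisible(). The function name and its contents below are invented for the example. withVisible() reports whether a value would have been printed automatically.

```r
# Hypothetical function: returns a list without printing it.
summarize_quietly <- function(v) {
  out <- list(mean = mean(v), n = length(v))
  invisible(out)
}

summarize_quietly(c(1, 2, 3))          # prints nothing at the prompt
res <- summarize_quietly(c(1, 2, 3))   # the value is still returned

# FALSE means the result would not be auto-printed.
vis <- withVisible(summarize_quietly(c(1, 2, 3)))$visible
```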

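Garbage Collection. R does not always release memory after loading large data objects. Releasing memory not currently in use is called garbage collection. Remove objects you no longer need with rm() first, since only memory that nothing refers to can be reclaimed; then gc() will "do the trick". So will quitting and restarting R.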
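A minimal sketch of the sequence: create a large object, remove it, and call gc(). The object name and size are arbitrary. gc() returns a small matrix of memory usage, which is handy for a before-and-after comparison.

```r
# Allocate a large vector (about 8 MB of doubles).
big <- numeric(1e6)

# Drop the only reference to it, then ask R to collect garbage.
rm(big)
mem <- gc()

# mem is a matrix with rows "Ncells" and "Vcells" describing memory in use.
row_labels <- rownames(mem)
```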
