Essential R Xuhua Xia

2 R language
Help
Basic utilities
Data types and data structures
File I/O
Data manipulation/transformation
Statistical analysis
Graphic functions
Installation of additional packages

3 R Help
Commands to get help
–help(funName), help(package=packageName), …
–?..., ??...
–vignette()
–args(funName)
–RSiteSearch(sTopic)
R-related web sites:

4 Utility commands
getwd()
setwd(sSubDir), e.g., setwd("c:/users/xxia")
save.image()
history(), history(100), history(Inf)
x <- .Last.value
search()
library(), library(MASS), detach(package:MASS)
install.packages(), e.g., install.packages("nortest"), install.packages("outliers")
head(x), tail(x)
data()
attach(dataName)
chooseCRANmirror()
Sys.getenv("R_HOME")
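
A minimal sketch of a few of these utility commands, written so that it can run non-interactively. The temporary directory and the use of the built-in mtcars data set are illustrative assumptions, not part of the slide. history() and .Last.value are only meaningful in an interactive session, so they appear as comments only.

```r
# Working directory: remember it, move to a temporary folder, then restore it.
old_wd <- getwd()
setwd(tempdir())          # assumption: any existing directory would do
setwd(old_wd)

# Attach a package, confirm it is on the search path, then detach it.
# MASS is a recommended package; the guard covers installations without it.
mass_attached <- NA
if (requireNamespace("MASS", quietly = TRUE)) {
  library(MASS)
  mass_attached <- "package:MASS" %in% search()
  detach(package:MASS)
}

# Built-in data sets: data() lists them; head() and tail() show the ends.
first_rows <- head(mtcars, 3)
last_rows  <- tail(mtcars, 3)
print(first_rows)

# Location of the R installation
r_home <- Sys.getenv("R_HOME")

# Interactive-only commands (not run here):
#   history(100)        # show the last 100 commands
#   x <- .Last.value    # capture the value of the last evaluated expression
```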

5 Data types and data structures
Data types and data structures
–Integer, Numeric, Logical (Boolean: TRUE, FALSE)
–Factor
–Character
–Sequence, e.g., 1:9, 2:(n-1), seq(from=1, to=100, by=2), seq(from=1, to=100, length.out=10)
–Vector (one-dimensional array):
  myVec <- c(1,1,2,3,4,4,5,6), or c(TRUE, FALSE, FALSE, TRUE)
  myVec <- c("my name","your name")
  myVec[2], myVec[1:4], myVec[c(1,3,4)], myVec[-3], myVec[-(1:3)]
  v[v > mean(v)], v[!is.na(v) & !is.null(v)]
  names(v) <- c("Ottawa", "Toronto", "Kingston")
–Matrix: a vector with dimensions, A <- 1:6, dim(A) <- c(2,3), matrix(vec,2,3)
–List, Data frame:
  mydf <- data.frame()
  mydf <- edit(mydf)
  mydf <- data.frame(label=c("Low", "Mid", "Hi"), lb=c(1,2,3), ub=c(4,5,6))
R commands to check data types and data structures
–class(x)
–mode(x)
Data manipulation is better done with Excel or the like.
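
The indexing and construction rules above can be sketched as follows. The vector v with a missing value is an illustrative assumption added to show the effect of filtering with !is.na(); the other objects follow the slide.

```r
# Sequences
s1 <- 1:9
s2 <- seq(from = 1, to = 100, by = 2)           # 1, 3, ..., 99
s3 <- seq(from = 1, to = 100, length.out = 10)  # 10 evenly spaced values

# Vector indexing
myVec <- c(1, 1, 2, 3, 4, 4, 5, 6)
myVec[2]            # second element
myVec[c(1, 3, 4)]   # elements 1, 3 and 4
myVec[-(1:3)]       # everything except the first three elements
above_mean <- myVec[myVec > mean(myVec)]  # logical indexing

# Named vector with a missing value (hypothetical data)
v <- c(10, NA, 30)
names(v) <- c("Ottawa", "Toronto", "Kingston")
v_clean <- v[!is.na(v)]   # drop missing values, names are kept

# Matrix: a vector with a dim attribute (filled column by column)
A <- 1:6
dim(A) <- c(2, 3)
B <- matrix(1:6, 2, 3)

# Data frame
mydf <- data.frame(label = c("Low", "Mid", "Hi"),
                   lb = c(1, 2, 3), ub = c(4, 5, 6))

# Checking types and structures
class(myVec); mode(myVec)
class(A)
class(mydf); mode(mydf)   # a data frame is stored as a list
```

Note that class() reports how an object behaves (for example "data.frame"), while mode() reports how it is stored (for example "list").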

6 File I/O
Input
–scan(), e.g., scan(sFileName, skip=3, comment.char="#")
–read.table(sFileName, header=TRUE), read.fwf(sFileName, widths=c(3,5,…)), read.csv()
–readLines(), e.g., readLines(sFileName, NumLines), readLines(sFileName, -1)
–load("myData.RData")
Output
–writeLines(): writeLines(sText, sFile, sep="\n")
–write.table(): write.table(DFr, sFileName)
–cat(sText1, sText2, …, file="filename")
–sink("filename") … sink()
–print()
–save(myData, file="myData.RData")
Related
–cat(), print()
–paste0()
–ls(), rm(), rm(list=ls(all=TRUE))
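
A minimal round-trip sketch of the input and output functions listed above. All file names are temporary files created with tempfile(), and the file contents are made up for illustration.

```r
# Write and read plain text lines
sFile <- tempfile(fileext = ".txt")
sText <- c("header 1", "header 2", "header 3", "1 2 3", "# a comment", "4 5 6")
writeLines(sText, sFile, sep = "\n")
first_two <- readLines(sFile, 2)    # first two lines only
all_lines <- readLines(sFile, -1)   # -1 reads every line

# scan(): skip the three header lines and ignore text after '#'
x <- scan(sFile, skip = 3, comment.char = "#")

# write.table() / read.table() round trip for a data frame
DFr <- data.frame(label = c("Low", "Mid", "Hi"),
                  lb = c(1, 2, 3), ub = c(4, 5, 6))
sTable <- tempfile(fileext = ".txt")
write.table(DFr, sTable)
DFr2 <- read.table(sTable, header = TRUE)

# cat() writes text to a file; paste0() concatenates without a separator
sCat <- tempfile(fileext = ".txt")
cat(paste0("n=", nrow(DFr)), "done", file = sCat, sep = "\n")

# sink() diverts console output to a file until sink() is called again
sSink <- tempfile(fileext = ".txt")
sink(sSink)
print(DFr)
sink()
sink_lines <- readLines(sSink)

# save() / load() round trip in R's binary format
myData <- DFr
sRData <- tempfile(fileext = ".RData")
save(myData, file = sRData)
rm(myData)
load(sRData)   # restores the object under its original name, myData
```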

7 Descriptive statistics
mean(vector, na.rm=TRUE), median(), sd(), var()
SE, CV, skewness, kurtosis, 95%CL, … (no base R functions; computed from the above or taken from add-on packages)
Graphic:
–hist(x, breaks=n)
–plot(density(x))
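
Base R has no built-in functions for SE, CV, skewness or kurtosis, so one possible sketch computes them from mean() and sd(). The data are made up, and the skewness and kurtosis formulas used here are the simple moment-based versions, which is an assumption: add-on packages may apply bias corrections and give slightly different values.

```r
x <- c(2.1, 3.4, 1.9, 5.6, 4.2, NA, 3.8)   # hypothetical sample with one NA

n  <- sum(!is.na(x))             # number of non-missing observations
m  <- mean(x, na.rm = TRUE)
md <- median(x, na.rm = TRUE)
s  <- sd(x, na.rm = TRUE)
v  <- var(x, na.rm = TRUE)

se <- s / sqrt(n)                # standard error of the mean
cv <- s / m                      # coefficient of variation

# Moment-based skewness and excess kurtosis (assumed estimator)
d    <- x[!is.na(x)] - m
skew <- mean(d^3) / mean(d^2)^1.5
kurt <- mean(d^4) / mean(d^2)^2 - 3

# 95% confidence limits of the mean, based on the t distribution
ci <- m + c(-1, 1) * qt(0.975, df = n - 1) * se

print(c(mean = m, median = md, sd = s, var = v, se = se, cv = cv))
print(ci)
```

The confidence limits computed this way should agree with t.test(x)$conf.int.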

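8 Distribution
normal: rnorm, dnorm, pnorm, qnorm
t: rt, dt, pt, qt
(prefixes: r = random deviates, d = density, p = cumulative probability, q = quantile)
ad.test(y) in nortest package: Anderson-Darling test for normality
grubbs.test(y) in outliers package: Grubbs test for outliers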
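
A short sketch of the four function families for the normal and t distributions. The sample y is simulated for illustration. The nortest and outliers packages may not be installed, so the two tests are guarded with requireNamespace() and skipped when the package is missing.

```r
set.seed(1)                          # make the simulated sample reproducible
y <- rnorm(100, mean = 10, sd = 2)   # r: random deviates

d0 <- dnorm(0)                # d: density at 0, equal to 1/sqrt(2*pi)
p  <- pnorm(1.96)             # p: P(Z <= 1.96), about 0.975
q  <- qnorm(0.975)            # q: quantile, about 1.96
tq <- qt(0.975, df = 10)      # t quantile with 10 df, about 2.228
tp <- pt(tq, df = 10)         # pt() inverts qt(), giving back 0.975

# Normality and outlier tests from add-on packages (run only if installed)
if (requireNamespace("nortest", quietly = TRUE)) {
  print(nortest::ad.test(y))
}
if (requireNamespace("outliers", quietly = TRUE)) {
  print(outliers::grubbs.test(y))
}
```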

9 Graphics
par(mfrow=c(2,2)): set a canvas for four graphs
plot(x, y, xlab="", ylab="", type="l"): type="l" draws lines; the default type ("p") is a scatterplot
histogram:
–hist(inVec, xlab="")
–hist(inVec, xlab="", freq=FALSE): y is density instead of frequency
–curve(dnorm(x, mean(inVec), sd(inVec)), add=TRUE, col="blue"): overlay the expected normal curve on the histogram
qqnorm(y), qqline(y, col="blue")
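
The four-panel layout can be sketched as below, using simulated data. The call pdf(NULL) is an assumption made so that the sketch runs without opening a window or writing a file; in an interactive session it and the final dev.off() can be dropped. curve() evaluates its expression over a variable named x, so the normal density is written in terms of x with the mean and standard deviation estimated from inVec.

```r
set.seed(42)
inVec <- rnorm(200, mean = 50, sd = 10)   # hypothetical data

pdf(NULL)                  # null graphics device (assumption, see lead-in)
par(mfrow = c(2, 2))       # 2 x 2 canvas for four graphs

# Panel 1: line plot (type = "l"); the default type "p" would draw points
plot(seq_along(inVec), inVec, xlab = "index", ylab = "value", type = "l")

# Panel 2: histogram of frequencies
hist(inVec, xlab = "value")

# Panel 3: histogram of densities with the expected normal curve overlaid
h <- hist(inVec, xlab = "value", freq = FALSE)
curve(dnorm(x, mean = mean(inVec), sd = sd(inVec)), add = TRUE, col = "blue")

# Panel 4: normal Q-Q plot with reference line
qqnorm(inVec)
qqline(inVec, col = "blue")

invisible(dev.off())       # close the device
```

With freq=FALSE the bar areas sum to 1, which is what puts the histogram on the same scale as the density curve.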

