1
Data Analysis Using R: 3. Graphical Analyses
Tuan V. Nguyen Garvan Institute of Medical Research, Sydney, Australia
2
Overview: data, bar chart, histogram, strip chart, box plot, scatter plot
3
Data
Body composition data measured by dual-energy X-ray absorptiometry
43 men and women, aged between 11 and 28
Variable names: id, age, sex, dur, weight, height, lm (lean mass), pclm (percent lean mass), fm (fat mass), pcfm (percent fat mass), bmc (bone mineral content)
4
Reading data into R
setwd("c:/works/stats")
bc <- read.table("comp.txt", header=TRUE)
attach(bc)
names(bc)
[1] "id" "age" "sex" "dur" "weight" "height" "lm" "pclm"
[9] "fm" "pcfm" "bmc"
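As a minimal sketch of the same import step that runs without the original comp.txt, the block below writes a tiny made-up file to a temporary location and reads it back. The file contents and the names tmp, bc_demo, mean_age and sex_counts are assumptions for illustration only, not the study data. It also shows $ and with() as alternatives to attach(), which avoid name clashes between data columns and other objects.

```r
# Hypothetical stand-in for comp.txt: a tiny whitespace-delimited file
# written to a temporary location so the example runs anywhere.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id age sex weight lm",
             "1 12 M 40.5 30.1",
             "2 25 F 55.0 38.2",
             "3 18 M 62.3 48.7"), tmp)

bc_demo <- read.table(tmp, header = TRUE)  # header = TRUE: first row holds names
str(bc_demo)                               # check column types after import

# Alternative to attach(): refer to columns explicitly with $ ...
mean_age <- mean(bc_demo$age)

# ... or evaluate an expression inside the data frame with with()
sex_counts <- with(bc_demo, table(sex))
sex_counts
```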
5
View data
bc
id age sex dur weight height lm pclm fm pcfm bmc
...
6
Counting: barplot
freq <- table(sex)
barplot(freq)
barplot(freq, horiz=TRUE, main="Sex distribution")
7
Counting by group: barplot
agegroup <- cut(age, 3)
agesex <- table(sex, agegroup)
barplot(agesex)
8
Counting by group: barplot
agegroup <- cut(age, 3)
agesex <- table(sex, agegroup)
barplot(agesex, xlab="Age group")
barplot(agesex, beside=TRUE, xlab="Age group")
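To make the grouping step easier to follow, here is a small sketch of what cut() and table() return before anything is plotted. The vectors age_demo and sex_demo are made-up values, not the study data, and the labelled cut points in agegroup_named are an illustrative choice.

```r
age_demo <- c(11, 13, 15, 17, 20, 22, 24, 28)          # made-up ages
sex_demo <- c("M", "F", "M", "F", "M", "F", "M", "F")  # made-up labels

# cut(x, 3) splits the range of x into three equal-width intervals
agegroup_demo <- cut(age_demo, 3)
levels(agegroup_demo)

# Two-way table of counts: rows are sex, columns are age groups
agesex_demo <- table(sex_demo, agegroup_demo)
agesex_demo

# Explicit cut points with readable labels (intervals are right-closed
# by default, so 16 falls in the first group and 22 in the second)
agegroup_named <- cut(age_demo, breaks = c(10, 16, 22, 28),
                      labels = c("11-16", "17-22", "23-28"))
table(agegroup_named)
```

Passing agesex_demo to barplot() then gives one bar per column, stacked by row unless beside=TRUE is set.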
9
Distribution of data: Histogram
par(mfrow=c(2,2))
hist(age)
hist(age, breaks=20)
hist(age, breaks=40)
hist(age, breaks=50)
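A short sketch of how the breaks argument behaves, using simulated ages (x_demo is an assumption, not the study data). When breaks is a single number, hist() treats it as a suggestion and picks "pretty" break points, so the number of bins can differ from the value requested. A vector of break points gives exact control.

```r
set.seed(1)
x_demo <- runif(43, min = 11, max = 28)   # simulated ages, for illustration

# plot = FALSE returns the histogram object without drawing it
h_default <- hist(x_demo, plot = FALSE)
h_20      <- hist(x_demo, breaks = 20, plot = FALSE)

length(h_default$counts)   # number of bins chosen by default
length(h_20$counts)        # may differ from 20: the value is only a suggestion

# Exact bin edges: one bin per year of age
h_exact <- hist(x_demo, breaks = seq(11, 28, by = 1), plot = FALSE)
length(h_exact$counts)
```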
10
Distribution of data: Histogram
par(mfrow=c(2,2))
hist(age)
hist(weight)
hist(lm)
hist(fm)
11
Distribution of data: plot(density)
hist(lm, main="Distribution of lean mass")
plot(density(lm), main="Distribution of lean mass")
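The histogram and the density estimate can also be drawn on the same axes. The sketch below assumes simulated values (lm_demo is not the study data) and uses freq = FALSE so the histogram is on the density scale and the curve is comparable.

```r
set.seed(6)
lm_demo <- rnorm(43, mean = 45, sd = 8)   # simulated lean mass values

# freq = FALSE plots densities instead of counts
hist(lm_demo, freq = FALSE, main = "Distribution of lean mass")

d <- density(lm_demo)   # kernel density estimate (Gaussian kernel by default)
lines(d)                # overlay the curve on the histogram

d$bw                    # bandwidth chosen automatically
```

A larger or smaller bandwidth can be requested through the adjust argument of density(), for example density(lm_demo, adjust = 2) for a smoother curve.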
12
Normal distribution? qqnorm
qqnorm(lm)
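A Q-Q plot is easier to read with a reference line, and a formal test can complement the visual check. This is a sketch on simulated data (lm_demo is an assumption, not the study data). The Shapiro-Wilk test is one common choice here and is not part of the original slides.

```r
set.seed(2)
lm_demo <- rnorm(43, mean = 45, sd = 8)   # simulated lean mass values

qqnorm(lm_demo)
qqline(lm_demo)   # line through the first and third quartiles

# Shapiro-Wilk test of normality; a small p-value suggests non-normality
sw <- shapiro.test(lm_demo)
sw$p.value
```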
13
Continuity of data: stripchart
stripchart(lm, xlab="Lean mass (kg)")
14
Summary of continuous data: boxplot
boxplot(fm)
boxplot(lm)
LM and FM summaries: Min., 1st Qu., Median, Mean, 3rd Qu., Max.
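To show how the numbers behind a box plot relate to the drawing, here is a sketch with made-up values (fm_demo is not the study data). boxplot() with plot = FALSE returns the statistics it would draw.

```r
fm_demo <- c(5, 8, 9, 10, 11, 12, 13, 14, 15, 30)  # made-up, one extreme value

summary(fm_demo)   # Min., 1st Qu., Median, Mean, 3rd Qu., Max.

b <- boxplot(fm_demo, plot = FALSE)
b$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
b$out     # points beyond 1.5 box lengths from the hinges, drawn individually
```

The hinges reported in b$stats can differ slightly from the quartiles printed by summary(), because the two use different conventions.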
15
Summary of data by group: boxplot
Fat mass by sex: boxplot(fm ~ sex)
Lean mass by sex: boxplot(lm ~ sex)
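The formula interface splits the response by the grouping variable. As a sketch with made-up values (fm_demo and sex_demo are assumptions), the same split can be checked numerically with tapply().

```r
fm_demo  <- c(10, 12, 14, 20, 22, 24)          # made-up fat mass values
sex_demo <- c("M", "M", "M", "F", "F", "F")    # made-up group labels

# One box per level of the grouping variable
boxplot(fm_demo ~ sex_demo, xlab = "Sex", ylab = "Fat mass")

# Group medians, matching the thick line inside each box
med <- tapply(fm_demo, sex_demo, median)
med
```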
16
Analysis of association: scatter plot
plot(lm ~ age)
plot(lm ~ age, pch=16)
17
Analysis of association: scatter plot
line <- lm(lm ~ age)
plot(lm ~ age, pch=16)
abline(line)
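A sketch of the same fit-then-draw pattern on simulated data, with the fitted coefficients inspected before the line is drawn. The names age_demo, lean_demo and fit, and the simulated relationship, are assumptions for illustration. The response is called lean_demo here to avoid confusing it with the lm() function.

```r
set.seed(3)
age_demo  <- runif(43, 11, 28)
lean_demo <- 20 + 1.2 * age_demo + rnorm(43, sd = 4)   # simulated relationship

fit <- lm(lean_demo ~ age_demo)   # simple linear regression
coef(fit)                         # intercept and slope used by abline()

plot(lean_demo ~ age_demo, pch = 16)
abline(fit, col = "red")          # abline() accepts the fitted model directly

# In simple linear regression, R-squared equals the squared correlation
r <- cor(age_demo, lean_demo)
all.equal(r^2, summary(fit)$r.squared)
```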
18
Analysis of association by group: scatter plot
plot(lm ~ age, pch=ifelse(sex=="M", "M", "F"), xlab="Age", ylab="Kg")
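Groups can also be distinguished by colour with a legend instead of plotting letters. This is a sketch on simulated data. All names and values below are assumptions, not the study data.

```r
set.seed(4)
age_demo  <- runif(20, 11, 28)
sex_demo  <- factor(rep(c("F", "M"), each = 10))
lean_demo <- 25 + age_demo + ifelse(sex_demo == "M", 8, 0) + rnorm(20, sd = 3)

# Map each factor level to a colour: level 1 ("F") red, level 2 ("M") blue
cols <- c("red", "blue")[as.integer(sex_demo)]

plot(lean_demo ~ age_demo, col = cols, pch = 16, xlab = "Age", ylab = "Kg")
legend("topleft", legend = levels(sex_demo), col = c("red", "blue"), pch = 16)
```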
19
Analysis of multiple associations
data <- data.frame(age, weight, lm, fm, bmc)
pairs(data)
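A scatter-plot matrix pairs naturally with a correlation matrix of the same variables. The sketch below uses a simulated data frame (demo and its columns are assumptions, not the study data).

```r
set.seed(5)
n <- 43
age_demo <- runif(n, 11, 28)
demo <- data.frame(age    = age_demo,
                   weight = 30 + 1.5 * age_demo + rnorm(n, sd = 6),
                   lm     = 20 + 1.2 * age_demo + rnorm(n, sd = 4))

pairs(demo)                # one scatter plot per pair of columns

cm <- round(cor(demo), 2)  # pairwise Pearson correlations
cm
```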
20
Analysis of multiple associations: a fancier graph
matrix.cor <- function(x, y, digits=2, prefix="", cex.cor, ...){
  # draw in unit coordinates; restore the panel's coordinates on exit
  usr <- par("usr"); on.exit(par(usr = usr))
  par(usr = c(0, 1, 0, 1))
  r <- abs(cor(x, y))
  txt <- format(c(r, 0.123456789), digits=digits)[1]
  txt <- paste(prefix, txt, sep="")
  cex <- if(missing(cex.cor)) 0.8/strwidth(txt) else cex.cor
  test <- cor.test(x, y)
  # borrowed from printCoefmat
  Signif <- symnum(test$p.value, corr = FALSE, na = FALSE,
                   cutpoints = c(0, 0.001, 0.01, 0.05, 0.1, 1),
                   symbols = c("***", "**", "*", ".", " "))
  text(0.5, 0.5, txt, cex = cex * r)    # text size scales with |r|
  text(.8, .8, Signif, cex=cex, col=2)  # significance stars in red
}
pairs(data, lower.panel=panel.smooth, upper.panel=matrix.cor)
21
Results
22
Summary
R is a powerful environment for graphical analysis
First step in data analysis: graphical analysis
Look for: distributions, differences, associations