Programming in R: Describing multivariate data


1 Programming in R Describing multivariate data

2 In this session I will explain:
–How to describe two or more categorical variables with tables and stacked bar charts
–How to use scatterplots to summarize two numeric variables

3 Two categorical variables

Contingency table with counts

Count of Tattoo   Tattoo
Sex               No        Yes       Grand Total
Female            119       18        137
Male              55        13        68
Grand Total       174       31        205

Contingency table with row percents

Count of Tattoo   Tattoo
Sex               No        Yes       Grand Total
Female            86.86%    13.14%    100.00%
Male              80.88%    19.12%    100.00%
Grand Total       84.88%    15.12%    100.00%

4 Two categorical variables Contingency tables in R do not look as polished as the ones on the previous slide.
–The function table() creates the counts.
–The function prop.table() turns the counts into proportions (multiply by 100 for percentages).
–The function margin.table() calculates the row or column totals.
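
As a minimal sketch of these three functions, the example below rebuilds the Sex by Tattoo table from slide 3. The data frame name `penn` and its column names are assumptions made for illustration; only the cell counts come from the slide.

```r
# Hypothetical data frame rebuilt from the counts on slide 3
# (the name `penn` and the column names are assumptions).
penn <- data.frame(
  Sex    = rep(c("Female", "Female", "Male", "Male"), times = c(119, 18, 55, 13)),
  Tattoo = rep(c("No", "Yes", "No", "Yes"), times = c(119, 18, 55, 13))
)

# Contingency table with counts: rows are Sex, columns are Tattoo
counts <- table(penn$Sex, penn$Tattoo, dnn = c("Sex", "Tattoo"))
print(counts)

# Row percents: margin = 1 divides each cell by its row total
row_pct <- round(100 * prop.table(counts, margin = 1), 2)
print(row_pct)

# Row totals (margin 1) and column totals (margin 2)
row_totals <- margin.table(counts, 1)
col_totals <- margin.table(counts, 2)
print(row_totals)
print(col_totals)

# addmargins() appends the totals to the table itself
print(addmargins(counts))
```

Passing `margin = 2` to prop.table() would give column percents instead, and omitting `margin` gives each cell as a share of the grand total.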

5 Two categorical variables There is a package called prettyR. It contains a function called xtab() which produces better-looking output. The basic call is
–xtab(y ~ x, dataframe)
–where x and y are the variables you want to relate, in model notation.
–Usually x is the independent variable
–and y is the dependent variable.
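
A hedged sketch of such a call is shown below. It assumes the prettyR package is installed and that the data frame (here the hypothetical `penn`, rebuilt from the counts on slide 3) has columns named Sex and Tattoo; the call is skipped when the package is missing.

```r
# Hypothetical data frame rebuilt from the counts on slide 3
# (the name `penn` and the column names are assumptions).
penn <- data.frame(
  Sex    = rep(c("Female", "Female", "Male", "Male"), times = c(119, 18, 55, 13)),
  Tattoo = rep(c("No", "Yes", "No", "Yes"), times = c(119, 18, 55, 13)),
  stringsAsFactors = TRUE
)

# Following the slide's y ~ x convention: Tattoo is treated as the
# dependent variable and Sex as the independent variable.
if (requireNamespace("prettyR", quietly = TRUE)) {
  prettyR::xtab(Tattoo ~ Sex, penn)
} else {
  message("prettyR is not installed; try install.packages('prettyR')")
}
```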

6 Two categorical variables This is a basic bar chart produced in R using the function barplot().
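
The chart itself is not reproduced in this transcript. One way such a chart might be drawn, assuming the counts from slide 3 and illustrative titles and labels, is:

```r
# Counts from slide 3, entered directly as a matrix (filled column by column)
counts <- matrix(
  c(119, 55, 18, 13),
  nrow = 2,
  dimnames = list(Sex = c("Female", "Male"), Tattoo = c("No", "Yes"))
)

# barplot() draws one bar per column and stacks the rows within it,
# so the table is transposed to get one bar per Sex, stacked by Tattoo.
mids <- barplot(
  t(counts),
  legend.text = TRUE,
  args.legend = list(title = "Tattoo"),
  main = "Tattoo by Sex",
  xlab = "Sex",
  ylab = "Count"
)

# beside = TRUE places the bars side by side instead of stacking them
barplot(t(counts), beside = TRUE, legend.text = TRUE,
        xlab = "Sex", ylab = "Count")
```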

7 Two categorical variables Nicer graphics using ggplot2 package.

8 Two categorical variables Another graphic from ggplot2 using slightly different options.
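
The transcript does not record which ggplot2 options the two charts used. As a sketch under that caveat, the example below draws a stacked bar chart of counts and then a variant with `position = "fill"`, which rescales every bar to proportions. The data frame `penn` is hypothetical, rebuilt from the counts on slide 3, and the plots are skipped if ggplot2 is not installed.

```r
# Hypothetical data frame rebuilt from the counts on slide 3
penn <- data.frame(
  Sex    = rep(c("Female", "Female", "Male", "Male"), times = c(119, 18, 55, 13)),
  Tattoo = rep(c("No", "Yes", "No", "Yes"), times = c(119, 18, 55, 13))
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Stacked bar chart of counts: one bar per Sex, coloured by Tattoo
  p_stack <- ggplot(penn, aes(x = Sex, fill = Tattoo)) +
    geom_bar() +
    labs(y = "Count")

  # Same data with each bar rescaled to sum to 1 (row proportions)
  p_fill <- ggplot(penn, aes(x = Sex, fill = Tattoo)) +
    geom_bar(position = "fill") +
    labs(y = "Proportion")

  print(p_stack)
  print(p_fill)
}
```

Using `position = "dodge"` instead would place the Tattoo groups side by side.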

9 Three categorical variables

Sex: (All)

Count of Tattoo   Anypierces
Tattoo            No        Yes       Grand Total
No                31.03%    68.97%    100.00%
Yes               12.90%    87.10%    100.00%
Grand Total       28.29%    71.71%    100.00%

Sex: Female

Count of Tattoo   Anypierces
Tattoo            No        Yes       Grand Total
No                5.04%     94.96%    100.00%
Yes               0.00%     100.00%   100.00%
Grand Total       4.38%     95.62%    100.00%

Sex: Male

Count of Tattoo   Anypierces
Tattoo            No        Yes       Grand Total
No                87.27%    12.73%    100.00%
Yes               30.77%    69.23%    100.00%
Grand Total       76.47%    23.53%    100.00%

10 Three categorical variables R can create counts and percentages for three or more variables using the functions:
–table()
–prop.table()
–margin.table()
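
A sketch of the three-variable case follows. The cell counts are not given in the slides; they are back-calculated here from the row percents on slide 9 and the totals on slide 3, so treat them, along with the names `cells` and `penn`, as assumptions.

```r
# Cell counts back-calculated from the percentages on slide 9 (assumption).
# expand.grid() varies the first variable fastest.
cells <- expand.grid(
  Anypierces = c("No", "Yes"),
  Tattoo     = c("No", "Yes"),
  Sex        = c("Female", "Male"),
  stringsAsFactors = FALSE
)
cells$n <- c(6, 113, 0, 18, 48, 7, 4, 9)

# Expand to one row per respondent
penn <- cells[rep(seq_len(nrow(cells)), times = cells$n),
              c("Sex", "Tattoo", "Anypierces")]

# Three-way table of counts: Tattoo x Anypierces, one layer per Sex
three_way <- table(penn$Tattoo, penn$Anypierces, penn$Sex,
                   dnn = c("Tattoo", "Anypierces", "Sex"))

# Row percents within each Sex: fix Tattoo (1) and Sex (3)
row_pct <- round(100 * prop.table(three_way, margin = c(1, 3)), 2)
print(ftable(row_pct, row.vars = c("Sex", "Tattoo")))

# Collapse over Sex to get the "(All)" table of counts
all_sexes <- margin.table(three_way, c(1, 2))
print(all_sexes)
```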

11 Three categorical variables Uses the package ggplot2. This is a stacked bar chart that is also grouped by gender.
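
The original chart is not included in the transcript. A chart of this kind might be sketched with facet_wrap(), which draws one panel per gender; the counts below are back-calculated from the percentages on slide 9 and are an assumption, as is the choice of Tattoo for the x-axis.

```r
# Cell counts back-calculated from slide 9 (assumption)
cells <- expand.grid(
  Anypierces = c("No", "Yes"),
  Tattoo     = c("No", "Yes"),
  Sex        = c("Female", "Male"),
  stringsAsFactors = FALSE
)
cells$n <- c(6, 113, 0, 18, 48, 7, 4, 9)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # geom_col() uses the precomputed counts in `n`;
  # bars are stacked by Anypierces and split into panels by Sex.
  p <- ggplot(cells, aes(x = Tattoo, y = n, fill = Anypierces)) +
    geom_col() +
    facet_wrap(~ Sex) +
    labs(y = "Count")

  print(p)
}
```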

12 Two numeric variables A simple scatter plot using the plot() function available in R.
plot(Penn$Height, Penn$HtChoice, main="Actual Height versus Preferred Height", xlab="Actual ", ylab="Preferred ")

13 One numeric and one categorical variable There are many different ways to “group” by a variable and summarize a second variable, including aggregate() and tapply().
> tapply(Penn$Height, Penn$Sex, mean)
–The first argument is the variable to summarize.
–The second is the “group by” variable.
–The third is the function to apply.
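
To make the two approaches concrete, here is a small sketch on made-up data. The six heights below are invented for illustration and are not taken from the Penn data set.

```r
# Invented example data (not the real Penn data set)
penn <- data.frame(
  Sex    = c("Female", "Female", "Female", "Male", "Male", "Male"),
  Height = c(62, 65, 68, 69, 72, 75)
)

# tapply(): variable to summarize, grouping variable, function to apply.
# Returns a named vector with one element per group.
by_sex <- tapply(penn$Height, penn$Sex, mean)
print(by_sex)

# aggregate() with a formula does the same job but returns a data frame
agg <- aggregate(Height ~ Sex, data = penn, FUN = mean)
print(agg)
```

tapply() is convenient for a quick look at one summary, while the data frame returned by aggregate() is easier to merge or plot afterwards.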

14 Histograms Histograms created using the hist() function and subsetting the data.
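
The histograms are not reproduced in the transcript. One way to subset and plot is sketched below on simulated heights; the data, group sizes, and labels are assumptions chosen only so that the example runs.

```r
# Simulated data for illustration only (not the real Penn data set)
set.seed(1)
penn <- data.frame(
  Sex    = rep(c("Female", "Male"), times = c(40, 30)),
  Height = c(rnorm(40, mean = 65, sd = 3), rnorm(30, mean = 70, sd = 3))
)

# Two equivalent ways to subset: logical indexing and subset()
female_height <- penn$Height[penn$Sex == "Female"]
male_height   <- subset(penn, Sex == "Male")$Height

# Draw the two histograms side by side, then restore the plot layout
old_par <- par(mfrow = c(1, 2))
h_female <- hist(female_height, main = "Female", xlab = "Height")
h_male   <- hist(male_height, main = "Male", xlab = "Height")
par(old_par)
```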

