R objects


1 R objects
• All R entities exist as objects
• They can all be operated on as data
• We will cover: vectors, factors, lists, data frames, tables, indexing, R packages and datasets

2 Vectors
• Think of a vector as the equivalent of a single column of numbers in a spreadsheet
• You can create a vector using the c( ) function (concatenate) as follows: x <- c( )
• For example, x <- c(1,2,4,8) creates a column of the numbers 1, 2, 4, 8

3 Vectors
Other ways of creating columns of numbers (vectors):
• The seq function
    seq(1,10,1) gives 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
    seq(1,4,0.5) gives 1, 1.5, 2, 2.5, 3, 3.5, 4
• The x:y operator
    1:10 gives 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
    2 * 1:10 gives 2, 4, 6, 8, 10, 12, 14, 16, 18, 20
• The rep function
    rep(2,4) gives 2, 2, 2, 2
• For help on these functions, type ?seq or ?rep
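The three constructors above can be tried side by side. This is a minimal sketch; the variable names are made up for illustration.

```r
# Three equivalent ways to build 1, 2, ..., 10
a <- seq(1, 10, 1)         # from, to, step given by position
b <- seq(1, 10, by = 1)    # same call with the step named
d <- 1:10                  # colon operator

# The colon operator is evaluated before the multiplication
evens <- 2 * 1:10          # 2, 4, ..., 20

# rep repeats a single value, or a whole vector
twos <- rep(2, 4)                     # 2 2 2 2
pattern <- rep(c(1, 2), times = 3)    # 1 2 1 2 1 2

print(evens)
print(pattern)
```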

4 Indexing
Referencing (indexing) specific 'cells' in a column:
Example: if x is the vector 1, 2, 5 then x[1] = 1, x[2] = 2, x[3] = 5, and
    x[1:2] = 1, 2    first two listed items in x
    x[2:3] = 2, 5    2nd and 3rd listed items in x
    x[x>2] = 5       use of the '>' and '<' characters
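The x[x>2] form works because the comparison itself returns a vector of TRUE/FALSE values, which R then uses as the index. A short sketch of that two-step view, with the intermediate names keep, big and pos chosen for illustration:

```r
x <- c(1, 2, 5)

first_two <- x[1:2]   # 1 2

# Step 1: the comparison gives one TRUE/FALSE per element
keep <- x > 2         # FALSE FALSE TRUE

# Step 2: indexing with a logical vector keeps the TRUE positions
big <- x[keep]        # 5

# which() turns the logical vector into positions instead
pos <- which(x > 2)   # 3

print(keep)
print(big)
```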

5 Performing simple operations on vectors
• In R, when you carry out simple operations (+ - * /) on vectors that have the same number of entries, R performs the operation entry by entry
• If the vectors don't have the same number of entries, R cycles through (recycles) the shorter vector until it matches the length of the longer one, and gives a warning if the longer length is not a multiple of the shorter
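A small sketch of both cases, entry-by-entry arithmetic and recycling. The vectors a, b, short and odd are made-up values for illustration.

```r
a <- c(1, 2, 3, 4)
b <- c(10, 20, 30, 40)

# Same length: the operation is applied entry by entry
total <- a + b       # 11 22 33 44
prod_ab <- a * b     # 10 40 90 160

# Shorter vector is recycled as 1 2 1 2
short <- c(1, 2)
recycled <- a + short   # 2 4 4 6

# Lengths 4 and 3: R still recycles (1 2 3 1) but would warn,
# so the warning is silenced here to keep the output clean
odd <- c(1, 2, 3)
uneven <- suppressWarnings(a + odd)   # 2 4 6 5

print(recycled)
print(uneven)
```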

6 Performing simple operations on vectors Example:

7 Performing simple operations on vectors Examples:

8 Performing simple operations on vectors Example:

9 Performing simple operations on vectors Vectors (columns of numbers) can be assigned by putting together other vectors, for example:
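One way this can look in practice, with made-up vectors named first and second:

```r
first <- c(1, 2, 3)
second <- c(10, 20)

# c() accepts whole vectors as well as single numbers
combined <- c(first, second)         # 1 2 3 10 20
padded <- c(0, first, 99, second)    # 0 1 2 3 99 10 20

print(combined)
print(length(padded))
```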

10 Functions
• R functions take arguments (information that you put into the function, between the brackets) and can perform a range of tasks
• In the case of the 'help' function, the task is to display information from the R documentation files
• A comprehensive list of R functions can be obtained from the R reference manual, under the Help menu
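As an illustration of passing arguments, the sketch below calls two base R functions, round and seq, first by position and then by name. The choice of functions is only an example.

```r
# Arguments matched by position: value first, then number of digits
r1 <- round(3.14159, 2)

# The same call with the second argument named
r2 <- round(3.14159, digits = 2)

# Named arguments can be given in any order
s <- seq(to = 10, from = 1, by = 3)   # 1 4 7 10

print(r1)
print(s)

# help(round) or ?round would display the documentation page
# when run in an interactive session
```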

11 Simple statistic functions
R comes with some useful functions:
    sqrt( )    square root
    mean( )    arithmetic mean
    hist( )    calculating and plotting histograms
R also comes with pre-loaded datasets, which we'll discuss later.

12 Basic statistic functions on vectors
> X1 <- c(1.1, 4.3, 5, 2, 1, 4, 9.5)
> sum(X1)        # sum = 26.9
> mean(X1)       # mean = 3.842857
> median(X1)     # median = 4
> var(X1)        # variance = 8.762857
> sd(X1)         # standard deviation = 2.960212
> summary(X1)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.000   1.550   4.000   3.843   4.650   9.500
> quantile(X1)
   0%   25%   50%   75%  100%
 1.00  1.55  4.00  4.65  9.50

13 Mixing vectors and scalars
• R has the very convenient feature of having operators that work with vectors
• It is even possible to mix vectors and scalars
• For example:
> X1 <- c(1.1, 4.3, 5, 2, 1, 4, 9.5)
> X1 + 1
[1]  2.1  5.3  6.0  3.0  2.0  5.0 10.5
> X1 * 2
[1]  2.2  8.6 10.0  4.0  2.0  8.0 19.0

14 Vectors to record data
> x = c(45,43,46,48,51,46,50,47,46,45)
> length(x)
[1] 10
> x = c(x,48,49,51,50,49)        # append values to x
> length(x)
[1] 15
> x[16] = 41                     # add to a specified index
> length(x)
[1] 16
> mean(x)
[1] 47.1875
> x[17:20] = c(40,38,35,40)      # add to many specified indices
> length(x)
[1] 20
> mean(x)
[1] 45.4

15 Factors
• A factor is a vector that encodes information about the group to which a particular observation belongs
• Categorical data are often used to classify observations into groups; the distinct groups are called the levels of the factor
• Making a factor is easy, using the factor function

16 Factors – smoking survey example
A survey asks people if they smoke or not. The data is: Yes, No, No, Yes, Yes
> x=c("Yes","No","No","Yes","Yes")
> x              # print out values in x
[1] "Yes" "No"  "No"  "Yes" "Yes"
> factor(x)      # print out values in factor(x)
[1] Yes No  No  Yes Yes
Levels: No Yes   # notice levels are printed
Notice the difference in how R treats factors in this example: the values are printed without quotes and the levels are listed

17 Factors – student height example
Suppose the recorded heights of South African and British students are as follows:
    heights <- c(1.7,1.95,1.63,1.54,1.29)
You make a new vector, fac_heights, to record the nationality that each observation pertains to:
    fac_heights <- factor(c("GB", "SA", "GB", "GB", "SA"))
This is useful when testing for differences between groups
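A possible first step toward comparing the groups is to summarise the heights per level of the factor. The sketch below uses tapply and split from base R; this is one way to do it, not necessarily the approach the slides had in mind.

```r
heights <- c(1.7, 1.95, 1.63, 1.54, 1.29)
fac_heights <- factor(c("GB", "SA", "GB", "GB", "SA"))

# Apply mean() to the heights within each level of the factor
group_means <- tapply(heights, fac_heights, mean)
print(group_means)

# split() returns the raw values per group as a list
by_group <- split(heights, fac_heights)
print(by_group$SA)   # the two SA observations: 1.95 1.29
```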

18 Factors – gender survey example
Consider a survey that has data on 691 females and 692 males
> gender <- c(rep("female",691), rep("male",692))   # create vector
> gender <- factor(gender)                           # change vector to factor
• Once stored as a factor, the space required for storage is reduced
• The values "female" and "male" are the levels of the factor
> levels(gender)   # assumes gender is a factor
[1] "female" "male"
• Internally, the factor 'gender' is stored as 691 1's followed by 692 2's, together with a lookup table that maps 1 to "female" and 2 to "male"
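The internal integer codes can be inspected directly with as.integer. The sketch below also prints object sizes for the factor and the equivalent character vector; the exact byte counts depend on the platform and R version, so no figures are quoted here.

```r
gender <- factor(c(rep("female", 691), rep("male", 692)))

# The underlying integer codes: 1 for the first level, 2 for the second
codes <- as.integer(gender)
print(table(codes))

# The levels act as the lookup table from code to label
labels_back <- levels(gender)[codes]
print(labels_back[c(1, 692)])

# Compare storage of the factor with the plain character vector
print(object.size(gender))
print(object.size(as.character(gender)))
```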

19 Lists
A set of objects (e.g. vectors) can be combined under a single name as a list (similar to a spreadsheet in Excel)
Example:
    x <- c(1, 7, 8, 9, 10)
    y <- c("red", "yellow", "blue", "green")
    example_list <- list(size = x, colour = y)
Note: a vector can consist of characters (i.e. letters/words) instead of numbers, but never a mix of numbers AND characters; if you combine the two, R converts the numbers to character strings
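Once the list exists, its components can be pulled out by name or by position. A short sketch using the same example_list, followed by the number/character mixing rule from the note (the vector mixed is made up for illustration):

```r
x <- c(1, 7, 8, 9, 10)
y <- c("red", "yellow", "blue", "green")
example_list <- list(size = x, colour = y)

sizes <- example_list$size            # component by name
colours <- example_list[["colour"]]   # same idea with double brackets
second_size <- example_list[[1]][2]   # 2nd entry of the 1st component: 7

print(names(example_list))
print(length(example_list))           # number of components, not of values

# Mixing numbers and characters in one vector coerces everything to character
mixed <- c(1, "red")
print(class(mixed))
```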

20 Data frames
The function data.frame( ) creates a data frame:
• This is a special kind of list, in which the entries in a specific position in the elements of the list correspond to one another
• Each element of the list has the same length
• It is a rectangular table, with rows and columns

21 Data frames
Example 1:
• Simple data frames can be created
• Enter the following information at the prompt line:
    h <- c(150, 170, 168, 179, 130)
    w <- c(65, 70, 72, 80, 51)
    patient_data <- data.frame(weight=w, height=h)
• Type in patient_data to see what has just been created

22 Access of elements in data frames
• Individual elements can be accessed using a pair of square brackets "[ ]" and by specifying their index, or name
• Here are some ways to access a cell, row or column:
    patient_data$height       accesses a column
    patient_data[, i]         accesses the i-th column
    patient_data[i, ]         accesses the i-th row
    patient_data$height[i]    accesses the i-th cell of the height column
    patient_data[i, j]        accesses the cell in the i-th row and j-th column
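The access forms listed above can be checked against the patient_data frame from Example 1. In this sketch i = 3 and j = 2 are arbitrary choices for illustration.

```r
h <- c(150, 170, 168, 179, 130)
w <- c(65, 70, 72, 80, 51)
patient_data <- data.frame(weight = w, height = h)

col_by_name <- patient_data$height    # whole height column
col_by_pos <- patient_data[, 2]       # 2nd column, also height
row3 <- patient_data[3, ]             # 3rd row, returned as a one-row data frame
cell_a <- patient_data$height[3]      # 3rd cell of the height column
cell_b <- patient_data[3, 2]          # row 3, column 2: the same cell
cell_c <- patient_data[3, "height"]   # columns can also be named inside [ ]

print(row3)
print(cell_b)
```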

23 Data frames
• More complex tables can be created
• Data within each column must have the same type (e.g., number, text), but different columns may have different types, like a spreadsheet, as in the example:
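A hypothetical table with mixed column types might be built like this. The data frame patients and all of its values are invented for illustration and do not come from the slides.

```r
# Each column has one type, but the types differ between columns
patients <- data.frame(
  name = c("Ann", "Ben", "Cara"),       # text
  age = c(34, 51, 28),                  # numbers
  smoker = c(TRUE, FALSE, FALSE),       # logical values
  stringsAsFactors = FALSE              # keep text as character, not factor
)

# str() shows the type of every column
str(patients)

# The class of each column, collected into a named vector
col_types <- sapply(patients, class)
print(col_types)
```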

24 Data frames
Accessing specific cells, or data:
Note: "$" is a shortcut for selecting a column by name; a minus sign "-" in front of an index means "not", i.e. everything except that row or column.
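A minimal sketch of the "$" shortcut and of negative indices, reusing the patient_data values from the earlier example. The logical filter on height is an extra illustration with an arbitrary cut-off of 160.

```r
patient_data <- data.frame(
  weight = c(65, 70, 72, 80, 51),
  height = c(150, 170, 168, 179, 130)
)

# "$" is shorthand for extracting a named column
weights <- patient_data$weight

# A negative index drops the given row or column
no_first_row <- patient_data[-1, ]              # all rows except the first
no_weight <- patient_data[, -1, drop = FALSE]   # all columns except the first

# Rows can also be selected with a logical condition
tall <- patient_data[patient_data$height > 160, ]

print(no_first_row)
print(tall)
```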

25 Tables
• We often view categorical data with tables
• The table function builds a table of counts from the data
• Its simplest usage is table(x), where x is a categorical variable

26 Tables
Example: smoking survey
A survey asks people if they smoke or not. The data is: Yes, No, No, Yes, Yes
> x=c("Yes","No","No","Yes","Yes")
> table(x)
x
 No Yes
  2   3
The table command simply adds up the frequency of each unique value of the data
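The result of table can be stored and used like a named vector of counts. The sketch below extracts a single count and converts the counts to proportions with prop.table, a base R function not covered in the slides.

```r
x <- c("Yes", "No", "No", "Yes", "Yes")
counts <- table(x)

# Pull out one count by its label
n_yes <- unname(counts["Yes"])

# Convert counts to proportions of the total
props <- prop.table(counts)

print(counts)
print(props)
```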

27 R packages and datasets
• View a list of installed R packages: library()
• Access datasets with the data function:
    data( )               provides a list of all the datasets
    data(Titanic)         loads the Titanic dataset
    summary(Titanic)      provides summary information about the Titanic dataset
    attributes(Titanic)   provides more information
    Titanic               typing the dataset name displays the data
• List all datasets in a package, e.g., data(package='datasets')

28 Working through some examples
• List preloaded datasets in R: data( )
• Display the "women" dataset: women
Now let's access specific data:
• Access data from each column: women$height or women[,1]; women$weight or women[,2]
• Access data from individual rows: women[1, ] or women[10, ] etc.
• Try it

29 Working through some examples
Now that you can access sample data, let's work with it:
• Get the mean weight and height of the women in our example
• Remember the help function: help(mean)
• Also, R can show an example: example(mean)
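One possible solution to the exercise, using the built-in women dataset. colMeans is a base R shortcut that is not introduced in the slides; it is included only as an alternative.

```r
# Make the built-in dataset available (it has columns height and weight)
data(women)

# Mean of each column, one at a time
mean_height <- mean(women$height)
mean_weight <- mean(women$weight)

print(mean_height)
print(mean_weight)

# Alternative: means of all numeric columns in one call
print(colMeans(women))
```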

30 Common useful functions
    print()     # prints a single R object
    cat()       # prints multiple objects, one after the other
    length()    # number of elements in a vector, or of a list
    mean()
    median()
    range()
    unique()    # gives the vector of distinct values
    sort()      # sort elements into order
    order()     # x[order(x)] orders elements of x
    rev()       # reverse the order of vector elements
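A quick tour of several of these functions on one small vector. The values in x are made up for illustration.

```r
x <- c(45, 43, 46, 48, 43)

sorted <- sort(x)          # 43 43 45 46 48
idx <- order(x)            # positions that would sort x: 2 5 1 3 4
via_order <- x[idx]        # same result as sort(x)
descending <- rev(sorted)  # 48 46 45 43 43
distinct <- unique(x)      # 45 43 46 48
lo_hi <- range(x)          # smallest and largest value: 43 48

print(sorted)

# cat() prints several objects one after the other
cat("n =", length(x), "mean =", mean(x), "\n")
```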

