R Programming Yang, Yufei. Normal distribution.

Slides:



Advertisements
Similar presentations
Introduction to R Brody Sandel. Topics Approaching your analysis Basic structure of R Basic programming Plotting Spatial data.
Advertisements

R for Macroecology Aarhus University, Spring 2011.
Training on R For 3 rd and 4 th Year Honours Students, Dept. of Statistics, RU Empowered by Higher Education Quality Enhancement Project (HEQEP) Department.
Refresh- Caitlin Collins, Thibaut Jombart MRC Centre for Outbreak Analysis and Modelling Imperial College London Genetic data analysis using
Introduction to Matlab
Basics of Using R Xiao He 1. AGENDA 1.What is R? 2.Basic operations 3.Different types of data objects 4.Importing data 5.Basic data manipulation 2.
Computing for Data Analysis R statistics programming environment Ming Ni 11/14/2014.
Introduction to GTECH 201 Session 13. What is R? Statistics package A GNU project based on the S language Statistical environment Graphics package Programming.
SHOU Haochang ( 寿昊畅 ) Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health July 11th, 2011 Nanjing University, China *Thanks to.
Introduction to Exploratory Descriptive Data Analysis in S-Plus Jagdish S. Gangolly State University of New York at Albany.
C++ for Engineers and Scientists Third Edition
How to Use the R Programming Language for Statistical Analyses Part I: An Introduction to R Jennifer Urbano Blackford, Ph.D. Department of Psychiatry Kennedy.
Introduction to R: The Basics Rosales de Veliz L., David S.L., McElhiney D., Price E., & Brooks G. Contributions from Ragan. M., Terzi. F., & Smith. E.
Statistical Software An introduction to Statistics Using R Instructed by Jinzhu Jia.
3. Functions and Arguments. Writing in R is like writing in English Jump three times forward Action Modifiers.
MATLAB and SimulinkLecture 11 To days Outline  Introduction  MATLAB Desktop  Basic Features  Branching Statements  Loops  Script file / Commando.
INTRO TO PROGRAMMING Chapter 2. M-files While commands can be entered directly to the command window, MATLAB also allows you to put commands in text files.
Basic R Programming for Life Science Undergraduate Students Introductory Workshop (Session 1) 1.
732A44 Programming in R.  Self-studies of the course book  2 Lectures (1 in the beginning, 1 in the end)  Labs (computer). Compulsory submission of.
Hands-on Introduction to R. Outline R : A powerful Platform for Statistical Analysis Why bother learning R ? Data, data, data, I cannot make bricks without.
REVIEW 2 Exam History of Computers 1. CPU stands for _______________________. a. Counter productive units b. Central processing unit c. Copper.
Arko Barman with modification by C.F. Eick COSC 4335 Data Mining Spring 2015.
Sébastien Lê Agrocampus Rennes A very short introduction to “R” The “Rcmdr” package and its environment.
A First Book of ANSI C Fourth Edition
1 Lab of COMP 406 Teaching Assistant: Pei-Yuan Zhou Contact: Lab 1: 12 Sep., 2014 Introduction of Matlab (I)
Session 3: More features of R and the Central Limit Theorem Class web site: Statistics for Microarray Data Analysis.
Data Objects in R Vector1 dimensionAll elements have the same data types Data types: numeric, character logic, factor Matrix2 dimensions Array2 or more.
Piotr Wolski Introduction to R. Topics What is R? Sample session How to install R? Minimum you have to know to work in R Data objects in R and how to.
Vectors and Matrices In MATLAB a vector can be defined as row vector or as a column vector. A vector of length n can be visualized as matrix of size 1xn.
Using the R software R is an open source comprehensive statistical package, more and more used around the world. R project web site:
R-Studio and Revolution Analytics have built additional functionality on top of base R.
Introduction to R. Why use R Its FREE!!! And powerful, fairly widely used, lots of online posts about it Uses S -> an object oriented programing language.
C++ for Engineers and Scientists Second Edition Chapter 11 Arrays.
R packages/libraries Data input/output Rachel Carroll Department of Public Health Sciences, MUSC Computing for Research I, Spring 2014.
Introduction to Programming in R Department of Statistical Sciences and Operations Research Computation Seminar Series Speaker: Edward Boone
What does C store? >>A = [1 2 3] >>B = [1 1] >>[C,D]=meshgrid(A,B) c) a) d) b)
Chapter 5 Reading and Manipulating SAS ® Data Sets and Creating Detailed Reports Xiaogang Su Department of Statistics University of Central Florida.
Programming Logic and Design Fourth Edition, Comprehensive Chapter 8 Arrays.
Introduction to R Introductions What is R? RStudio Layout Summary Statistics Your First R Graph 17 September 2014 Sherubtse Training.
Lecture 26: Reusable Methods: Enviable Sloth. Creating Function M-files User defined functions are stored as M- files To use them, they must be in the.
STAT 534: Statistical Computing Hari Narayanan
Introduction to MATLAB 1.Basic functions 2.Vectors, matrices, and arithmetic 3.Flow Constructs (Loops, If, etc) 4.Create M-files 5.Plotting.
Chapter 8 Arrays. A First Book of ANSI C, Fourth Edition2 Introduction Atomic variable: variable whose value cannot be further subdivided into a built-in.
Introduction to Exploratory Descriptive Data Analysis in S-Plus Jagdish S. Gangolly State University of New York at Albany.
1 An Introduction to R © 2009 Dan Nettleton. 2 Preliminaries Throughout these slides, red text indicates text that is typed at the R prompt or text that.
R objects  All R entities exist as objects  They can all be operated on as data  We will cover:  Vectors  Factors  Lists  Data frames  Tables 
Data & Graphing vectors data frames importing data contingency tables barplots 18 September 2014 Sherubtse Training.
Actor Heights 1)Create Vectors of Actor Names, Heights, Date of Birth, Gender 2) Combine the 4 Vectors into a DataFrame.
Visual C# 2005 Using Arrays. Visual C# Objectives Declare an array and assign values to array elements Initialize an array Use subscripts to access.
Arrays What is an array… –A data structure that holds a set of homogenous elements (of the same type) –Associate a set of numbers with a single variable.
PHP Tutorial. What is PHP PHP is a server scripting language, and a powerful tool for making dynamic and interactive Web pages.
Lecture 11 Introduction to R and Accessing USGS Data from Web Services Jeffery S. Horsburgh Hydroinformatics Fall 2013 This work was funded by National.
Descriptive Statistics using R. Summary Commands An essential starting point with any set of data is to get an overview of what you are dealing with You.
Introduction to R and Data Science Tools in the Microsoft Stack Jamey Johnston.
Pinellas County Schools
Introduction to R Chris Free. Introduction to R Free! Superior (if not comparable) to commercial alternatives Available on all platforms Not just for.
Review > x[-c(1,4,6)] > Y[1:3,2:8] > island.data fishData$weight[1] > fishData[fishData$weight < 20 & fishData$condition.
16BIT IITR Data Collection Module If you have not already done so, download and install R from download.
Vectors and DataFrames. Character Vector: b
Working with data in R 2 Fish 552: Lecture 3. Recommended Reading An Introduction to R (R Development Core Team) –
Introduction to R and Data Science Tools in the Microsoft Stack Jamey Johnston.
Programming in R Intro, data and programming structures
Introduction to R Samal Dharmarathna.
Introduction Osborn.
Basics of R, Ch Functions Help Managing your Objects
Vectors and Matrices In MATLAB a vector can be defined as row vector or as a column vector. A vector of length n can be visualized as matrix of size 1xn.
Programming For Big Data
R Course 1st Lecture.
Data analysis with R and the tidyverse
> Introduction to Nelson Rios, Tulane University
Presentation transcript:

R Programming Yang, Yufei

Normal distribution

How to get R R is freely available from the Comprehensive R Archive Network (CRAN) at project.org. project.org Choosing a mirror which locates in China (e.g., Select the version for your operating system.

R studio Is a powerful and productive user interface for R. It’s free and open source It works great on Windows, Mac, and Linux.

Holding your nose and closing your eyes, could you distinguish Cola from Sprite.

Data type 1.0 means wrong judgement; 1 means right 1,0,0,1,1numeric 2.Record the judgement directly “right”, ”wrong”, ”wrong”, “right”, “right”character 3.Whether it is a right judgement? TRUE, FALSE,FALSE, TRUE, TRUElogical

Vector One-dimensional array Containing only one data type e.g., – 1,0,0,1,1 – “right”, ”wrong”, ”wrong”, “right”, “right” – TRUE, FALSE,FALSE, TRUE, TRUE

Creating data c() a.num <- c(1,0,0,1,1) a.cha <- c(“right”, ”wrong”, ”wrong”, “right”, “right”) a.log <- c(TRUE, FALSE,FALSE, TRUE, TRUE) > print(a.num) [1] > mode(a.num) [1] "numeric" > length(a.num) [1] 5

Creating data colon operator seq() rep() > c <- 1:3 > print(c) [1] > d <- seq(1, 3, by=1) > print(d) [1] > e <- rep(c,times=3) > print(e) [1] > c(1:5,seq(11,13,1)) [1]

logical vector > a.num <- 0:4 > a.cha <- c("a","b","c","d","e") > a.log 2 > print(a.log) [1] FALSE FALSE FALSE TRUE TRUE > a.cha[a.log] [1] "d" "e"

Referring element Using a numeric vector of positions within brackets > b <- c(1, 2, 5, 3, 6, -2, 4) > b[2] [1] 2 > b[2:3] [1] 2 5 > b[c(2,3,5)] [1] 2 5 6

Missing value(NA, NaN) If the value of a variable is unknown, the value is NA. Numeric calculations whose result is undefined produce the value NaN. > print(0/0) [1] NaN > a.miss <- c(25,NA,NaN,11) > is.na(a.miss) [1] FALSE TRUE TRUE FALSE > is.nan(a.miss) [1] FALSE FALSE TRUE FALSE

Arithmetic operator OperatorDescriptionSample +Addition -Subtraction *Multiplication /Division ** or ^Exponentiation(x ^y equal to x y ) %Modulus5%2 is 1 %/%Integer division5%/%2 is 2

> a<-1:3 > a*2 [1] > print(1:3*2) > print(1:(3*2)) [1] [1]

Statistic functions FunctionsDescriptionSample mean()Meanmean(c(1,2,3,6)) returns 3. median()Medianmedian(c(1,2,3,4)) returns 2.5 sd()Standard deviation sd(c(1,2,3,4)) returns 1.29 var()Variancevar(c(1,2,3,4)) returns 1.67 sum()Sumsum(c(1,2,3,4)) returns 10 max()Maxmin(c(1,2,3,4)) returns 1 min()Minmax(c(1,2,3,4)) returns 4

Exercise 1. Creat the vector below using “seq”, “rep” and “:” quiz1 <- c(1,2,3,6,8,10,2,2,2) 2. Counting how many elements in quiz2 is larger than 13. Pick the elements with same index in quiz1. quiz2 <- 11:19 3. Calculating the average difference between elements with same index. Relevant commands and operators – c, seq, rep, :, length, mean, sum, -

Object object is anything that can be assigned to a variable, which could be constants, data structures, functions or graphs). ls() rm() rm(list=ls())

Factor A special type of vector with levels attribute Representing categorical data > a.fac <- factor(c("right", "wrong", "wrong", "right", "right"),levels=c("right","wrong")) > print(a.fac) [1] right wrong wrong right right Levels: right wrong > table(a.fac) a.fac right wrong 3 2

Factor > mode(a.fac) [1] "numeric" > unclass(a.fac) [1] attr(,"levels") [1] "right" "wrong" > b.fac <- factor(c(1,0,0,1,1),levels=c(1,0),labels=c("right","wrong")) > print(b.fac) [1] right wrong wrong right right Levels: right wrong

Exercise quiz <- c(1,2,3,6,8,10,2,2,2) How many elements are equal to 2? Relevant commands and operators factor, table

Matrix Maybe a girl couldn’t distinguish the two drinks, but a boy could  Two-dimensional array  Containing only one data type

Creating data matrix() > a <- c(1,0,1,1,0,0,1,0,1,0) > a.matrix <- matrix(a, nrow=5, ncol=2, dimnames=list(c("first","second","third","four th","fifth"), c("boy","girl"))) > print(a.matrix) boygirl first 1 0 second 0 1 third 1 0 fourth 1 1 fifth 0 0

Creating data dim() > a <- 1:6 > dim(a)<-c(2,3) > a [,1] [,2] [,3] [1,] [2,] 2 4 6

Creating data cbind() rbind() > a <- 1:3 > b <- 11:13 > cbind(a,b) a b [1,] 1 11 [2,] 2 12 [3,] 3 13 > rbind(a,b) [,1] [,2] [,3] a b

Referring element > a.matrix[1,1] [1] 1 > a.matrix[1,] boy girl 1 0 > a.matrix[,1] first second third fourth fifth > a.matrix[1:2,1:2] boy girl first 1 0 second 0 1

Exercise 1. Creating a matrix below: 2. Exchange the rows and columns

Data frame Several people are tested, we want to summary the results. namegenderrighttotal Siyufemale410 Shaohuanfemale511 Xiaomale611 Xiaoqingfemale712 character numeric

Creating data data.frame() > a.data <- data.frame (name=c("Siyu","Shaohuan","Xiao","Xiaoqing"), gender=c("female","female","male","female"), right=4:7, total=rep(10,4)) > print(a.data) name gender righttotal 1Siyu female410 2Shaohuan female511 3Xiao male 611 4Xiaoqing female712 > length(a.data) [1] 4

Referring element > a.data["name"] name 1 Siyu 2 Shaohuan 3 Xiao 4 Xiaoqing > a.data$name [1] Siyu Shaohuan Xiao Xiaoqing Levels: Shaohuan Siyu Xiao Xiaoqing >as.character(a.data$name) [1] "Siyu" "Shaohuan" "Xiao" "Xiaoqing"

Referring element > a.data[1:2,c("name","gender")] name gender 1 Siyu female 2 Shaohuan female > a.data[a.data$gender=="female" & a.data$right>5,] name gender right total 4 Xiaoqing female 7 11

logical operator R commandmeaning >larger than <less than >=larger than or equal to <=less than or equal to ==equal to !not &and |or

Editing data frame > a.data$prefer <- c("Cola","Cola","Sprite","Sprite") > print(a.data) name gender right total prefer 1 Siyu female 4 10 Cola 2 Shaohuan female 5 11 Cola 3 Xiao male 6 11 Sprite 4 Xiaoqing female 7 12 Sprite

stringsAsFactors option > out<-vector(length=2,mode="character") > mydata <- data.frame(a=1:4,name=c("a","b","c","d")) > mydata_FALSE <- data.frame(a=1:4, name=c("a","b","c","d"), stringsAsFactors=F) > out[1] <- mydata[1,2] > out[2] <- as.character(mydata[1,2]) > out[3] <- mydata_FALSE[1,2] > print(out) [1] "1" "a" "a" logical: should character vectors be converted to factors?

with() > with(a.data,{ + a.test 5] + b.test 5] + print(a.test) + print(b.test) + }) [1] Xiaoqing Levels: Shaohuan Siyu Xiao Xiaoqing [1] Xiaoqing Levels: Shaohuan Siyu Xiao Xiaoqing > print(a.test) Error in print(a.test) : object 'a.test' not found > print(b.test) [1] Xiaoqing Levels: Shaohuan Siyu Xiao Xiaoqing

Exercise Produce a data frame named cola to store the value Calculate the proportion of right judgement and add this to the data frame How many females have a more than 50% frequence to give the right judgement. namegenderrighttotal Siyufemale410 Shaohuanfemale511 Xiaomale611 Xiaoqingfemale712

data structure Vector Factor Matrix Data frame Array List

Setting working directory R needs to read or write some files, you should told it a directory. setwd() getwd() > getwd() [1] "C:/Users/dell/Documents" > setwd("C:/Users/dell/Documents")

Data output write.table() “prints its required argument x (after converting it to a data frame if it is not one nor a matrix) to a file ” write.table(x, file = "", append = FALSE, quote = FALSE, sep = “\t", row.names = FALSE, col.names = TRUE) getwd()

Exercise Write the data frame cola to a file named “cola_sprite” Try the options

Data input read.table() “Reads a file in table format and creates a data frame from it.” read.table(file, header = FALSE, sep = "", quote = "\"'", as.is = !stringsAsFactors, skip = 0, comment.char = "#")

Exercise Add “#” to the begin of the first column, second line. Read the data from “cola_sprite” to a new data frame Try the options

Packages install.packages(): install the package library() : load the package install.packages("effects") library("effects")

How to get help Built-in help system – e.g., “?matrix” Google Mail list Forums

Conditional execution if (cond) statement if (cond) statement1 else statement2 > a <- 1:2 > for(i in 1:length(a)){ + if(a[i]>1){ + print(paste(a[i],"is larger than 1",sep=" ")) + } + else{ + print(paste(a[i],"is not larger than 1",sep=" ")) + } [1] "1 is not larger than 1" [1] "2 is larger than 1"

Loop for (var in seq) statement > a <- 1:4 > for (i in 1:length(a)){ + print("hello world") + } [1] "hello world"

> a.vec <- vector(mode="numeric",length=5) > print(a.vec) [1]

Exercise Using “for” to loop the cola data frame, using if to print the lines where right is larger than 5 times.

Data converting a.num <- 0:4 a.cha <- c("a","b","c","d") a.log <- c(T,F,F,T) as.logical(a.num) as.character(a.num) as.logical(a.cha) as.numeric(a.cha) as.logical(a.log) as.numeric(a.log)