Writing functions in R Some handy advice for creating your own functions.

Presentation transcript:

Writing functions in R Some handy advice for creating your own functions

A quick review of R
R is a statistical software package and an object-oriented programming language.
Terms to remember:
- Vectors, matrices, and data frames
- Indices
- Functions

Warm up
Download the data for lab 3.
In RStudio, go to Workspace → Import Dataset → From Text File.
- Make sure to select the header option.
- If you're not using RStudio, the code is:
    data_lab_3 <- read.csv("~/documents/classes/Psych 1950/mood.csv")
- Here ~ stands for your home directory; change the rest of the path to wherever you saved the file.

Warming up a little more
Use the help() function to read about the read.csv() function.
How could we use it to read in a file with no header?
- read.csv("filename", header=FALSE)
We can also use R to read in SPSS files, but for now we'll stick with read.csv().
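To see what header=FALSE does without needing the lab file, here is a minimal sketch. The temporary file and its three rows are made up for illustration; they are not the lab data.

```r
# Write a tiny header-less CSV to a temporary file (made-up values).
tmp <- tempfile(fileext = ".csv")
writeLines(c("1,4.5", "2,3.0", "3,NA"), tmp)

# header = FALSE tells read.csv() that the first line is data, not column names.
no_header <- read.csv(tmp, header = FALSE)

# With no header, R supplies default column names V1, V2, ...
print(names(no_header))
print(nrow(no_header))
```

With header=TRUE the first row would have been used up as column names, leaving only two rows of data.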

Last page of warm-up (I promise!)
Find the standard deviation ( sd() ) of the second column ( puDay2call1 ) of your dataframe.
Uh-oh! That output isn't helpful: sd() returns NA when the column contains missing values.
Add the following argument to the standard deviation function:
- na.rm=TRUE
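A small sketch of the same behaviour on a made-up vector (the name scores and its values are hypothetical stand-ins for the lab column):

```r
# A short vector with one missing value (hypothetical data).
scores <- c(3, 5, NA, 7)

# Without na.rm, the missing value propagates and the result is NA.
print(sd(scores))

# With na.rm = TRUE, the NA is dropped before the calculation.
print(sd(scores, na.rm = TRUE))   # 2
```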

A slight modification
Suppose that we want to calculate the standard deviation using the population formula (dividing by n rather than n - 1).
Check the help file for sd(). Is there a way to do that?
Nope! We'll need to make our own...

Making a function
Let's start with something easier: we'll make our own mean() function.
What should it do?
- We'll pass* it a vector of numbers as an argument*
- It should return* the mean
*programming jargon

The function syntax
    getMean <- function(arguments){
      commands go here
    }
The name of the function is getMean() (function names are usually verbs).
The arguments are the values and instructions we give to the function.
The body, between the curly braces, is where the work happens.

Iteration 1
    getMean <- function(x){
      return(sum(x)/length(x))
    }
Try this on the second column.
How can we handle NAs in the function, assuming we ALWAYS want to remove them?

Iteration 2
    getMean <- function(x){
      return(sum(x, na.rm=TRUE)/length(x))
    }
Now try this one, and compare your results to R's built-in mean() function.
Why aren't the values the same?
- Hint: what's the length of a vector that contains NAs?

Iteration 3
    getMean <- function(x){
      return(sum(x, na.rm=TRUE)/length(na.omit(x)))
    }
Another R function saves the day! Thanks, R!
Compare your results to the built-in function.
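The three iterations can be compared side by side on a small made-up vector. The names getMean1, getMean2 and getMean3 are used here only so that all three versions can exist at once; the slides call each of them getMean.

```r
# Iteration 1: no NA handling at all.
getMean1 <- function(x) {
  return(sum(x) / length(x))
}

# Iteration 2: NAs are dropped from the sum but still counted in the length.
getMean2 <- function(x) {
  return(sum(x, na.rm = TRUE) / length(x))
}

# Iteration 3: NAs are dropped from both the sum and the count.
getMean3 <- function(x) {
  return(sum(x, na.rm = TRUE) / length(na.omit(x)))
}

x <- c(1, NA, 3)                  # hypothetical data with one missing value

print(getMean1(x))                # NA
print(getMean2(x))                # 4/3: the divisor is 3, not 2
print(getMean3(x))                # 2
print(mean(x, na.rm = TRUE))      # 2, the built-in answer
```

Only the third version agrees with mean(x, na.rm=TRUE), because length(x) still counts the NA.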

Another way to do it
We've been leaning heavily on the sum() function.
Sometimes, though, we need to tell R to repeat an operation a number of times.
To do that, we use a construct called a for loop.
- There are other loops as well, but we'll stick with the for loop.

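The anatomy of a for loop
    getFactorial <- function(number){
      j <- 1                             #start the running product at 1
      for (index in seq_len(number)){    #seq_len() also copes with number = 0
        j <- j*index                     #multiply by each index in turn
      }
      return(j)
    }
What will this function do?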
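One way to check your answer is to run the function on a few small inputs and compare against R's built-in factorial(). The sketch below restates the slide's function, using seq_len() so that an input of 0 skips the loop entirely.

```r
getFactorial <- function(number) {
  j <- 1
  for (index in seq_len(number)) {  # 1, 2, ..., number; empty when number is 0
    j <- j * index                  # running product
  }
  return(j)
}

print(getFactorial(5))   # 120
print(getFactorial(0))   # 1
print(factorial(5))      # 120, the built-in version
```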

One more concept
Sometimes, we need a function to make a decision. Here, we use conditionals:
    if (condition){      #if the condition is true
      something          #do this
    } else {             #if it's false
      something else     #do this instead
    }

Some examples
    if (!is.na(x)){      #if x isn't an NA
      print(x)           #write x. If it is, nothing
    }                    #will happen
    if (x<=4){           #if x is less than or equal to 4
      print(x-1)
    }
    if (x==5){           #if x is exactly 5
      print("Five")
    }
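The pieces above can be combined into one function with an if / else if / else chain. This is only an illustrative sketch: describeValue is a hypothetical name, and it assumes x is a single value, since if() expects a condition of length one.

```r
# Classify a single number; describeValue is a made-up name for illustration.
describeValue <- function(x) {
  if (is.na(x)) {            # check for NA first, because NA <= 4 is itself NA
    return("missing")
  } else if (x <= 4) {       # less than or equal to 4
    return("four or less")
  } else {                   # everything else
    return("more than four")
  }
}

print(describeValue(NA))
print(describeValue(3))
print(describeValue(5))
```

The NA check comes first because a comparison such as NA <= 4 evaluates to NA, and if() stops with an error when its condition is NA.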

Looping to get the mean
    getMean_3 <- function(x){
      total <- 0                   #named total and count so that we don't
      count <- 0                   #mask the built-in sum() and length()
      for (i in seq_along(x)){
        if (!is.na(x[i])){         #exclude NAs
          total <- total+x[i]      #keep a running tally of the sum
          count <- count+1         #and of the number of non-missing values
        }
      }
      return(total/count)          #this is the mean
    }
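A quick way to test the looping version is to compare it with mean(x, na.rm=TRUE) on a small vector. The sketch below restates the function with its braces balanced, and uses the names total and count for the running tallies; the test vector is made up.

```r
getMean_3 <- function(x) {
  total <- 0
  count <- 0
  for (i in seq_along(x)) {
    if (!is.na(x[i])) {          # skip missing values
      total <- total + x[i]      # running sum of the non-missing values
      count <- count + 1         # how many non-missing values we have seen
    }
  }
  return(total / count)          # return only after the loop has finished
}

x <- c(2, NA, 4, 9)              # hypothetical data
print(getMean_3(x))              # 5
print(mean(x, na.rm = TRUE))     # 5
```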

Adding some complexity
It's your turn now: write two functions to compute the sum of squared deviations from the mean of a vector.
- In one version, use the sum() function.
- In the other, use a for loop.
Try to allow your functions to work with a vector that includes some NAs.

Remember
The sum of squares of a set of numbers is the sum, over all i, of (x_i - mean(x))^2.
Now make R do it for you!
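For checking your own answer, here is one possible sketch of the sum() version. The name getSS is hypothetical, and dropping NAs with na.omit() at the start is just one of several reasonable choices.

```r
# Sum of squared deviations from the mean, ignoring NAs (getSS is a made-up name).
getSS <- function(x) {
  x <- x[!is.na(x)]                # keep only the non-missing values
  deviations <- x - mean(x)        # x_i - mean(x) for every element
  return(sum(deviations^2))        # square each deviation, then add them up
}

x <- c(2, 4, 6, NA)                # hypothetical data: mean of 2, 4, 6 is 4
print(getSS(x))                    # (-2)^2 + 0^2 + 2^2 = 8
```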

Last of all
Make a new function that finds the (population) standard deviation of the vector.
Find the sum of squares, divide by the number of observations, and take the square root.
Test your function to make sure it's working.
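One way to test your function is to compare it with the built-in sd(), which divides by n - 1: multiplying sd(x) by sqrt((n - 1)/n) should give the population value. The sketch below is one possible implementation, not the only one; getPopSD is a hypothetical name and the data are made up.

```r
# Population standard deviation, ignoring NAs (getPopSD is a made-up name).
getPopSD <- function(x) {
  x <- x[!is.na(x)]                    # drop missing values
  n <- length(x)                       # number of observations
  ss <- sum((x - mean(x))^2)           # sum of squared deviations
  return(sqrt(ss / n))                 # divide by n, then take the square root
}

x <- c(2, 4, 4, 4, 5, 5, 7, 9)         # mean 5, sum of squares 32
n <- length(x)

print(getPopSD(x))                     # sqrt(32 / 8) = 2
print(sd(x) * sqrt((n - 1) / n))       # same value, derived from the built-in sd()
```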