Introduction to R Part 2

Working Directory The working directory is the folder where R reads and saves files by default. What is the current working directory? – Type in getwd() – You’ll see the path for your directory Note: I’m using a Mac.

Working Directory How to set the working directory: – setwd("PATH") If you aren’t really familiar or comfortable with file paths, here’s a trick:

Working Directory Now you can pick the folder you are interested in saving your files to. Once you do that, the bottom right window will show you that folder.

Working Directory Why is all this important? – You can use getwd and setwd in saved R scripts to point the analyses to specific files. – Basically, you can set it to import a file from a specific spot and use that over and over, rather than importing the file each time you open R.
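
A minimal sketch of how getwd() and setwd() might look in a saved script. It uses tempdir() purely as a stand-in folder so it runs anywhere, and the file name example.csv is made up for illustration; in a real script you would use your own project path.

```r
# Remember where we started so we can switch back afterwards
old_wd <- getwd()

# tempdir() is only a stand-in here; replace it with your own folder path
setwd(tempdir())

# Files are now written to (and read from) the new working directory
write.csv(data.frame(id = 1:3), "example.csv", row.names = FALSE)
file_found <- file.exists("example.csv")

# Switch back to the original working directory
setwd(old_wd)
```

Because the script sets the folder itself, the file name alone is enough to find the file every time the script runs.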

Packages Packages are add-ons to R that allow you to do different types of analyses, rather than code them yourself. R comes with many pre-programmed functions – lovingly called base R. At the top of the help window, you can tell which package a function is included in.

Packages Packages are checked/monitored by the CRAN people. – That means there’s some oversight to them. – Many other types of functions can be downloaded from GitHub. Use at your own risk.

Packages Note: each time R updates, the packages sometimes come with it, sometimes they don’t. – If you are looking for a specific package, and it doesn’t want to install the normal way (next couple slides), but you know it exists → google it and get the TAR files. – You can install them from the TAR files.

Packages How to install: – Console: install.packages("NAME OF PACKAGE") – Let’s try it! install.packages("car") Note: you have to be connected to the internet for packages to install.

Packages How to install: – Through RStudio – Click on packages, click on install. – Note: this window shows everything you have installed. If you click on a package name, its help file will open (or click the check box to load the package).

Packages Start typing the name of the package – a drop down will appear with all the options.

Packages Now it’s installed! Awesome! That doesn’t mean that it loads every time. – Imagine this: if SPSS had a function that knew how to do regression, but it didn’t load every time. – Annoying! – But this saves computing power by not loading unless you need it. – You will run something without turning on the right package. It’s cool – all the cool kids do it.

Packages Packages are also called libraries. You can load them two ways: – In the console: library(car) (look, no quotes this time). – In the packages window by clicking on the check box.

Packages I suggest adding the code to your script to load the packages you need to save yourself the headache of trying to remember which ones were important.
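
One way the top of a script could look, as a sketch only. It assumes the car package is the one you need; the check with requireNamespace() is an addition here so the script gives a readable message instead of an error when the package has not been installed yet.

```r
# Put this at the top of your script
# install.packages("car")   # run once; needs an internet connection

if (requireNamespace("car", quietly = TRUE)) {
  library(car)              # load car for this session
} else {
  message("car is not installed yet; run install.packages(\"car\") first")
}

# TRUE if car is now attached, FALSE otherwise
car_loaded <- "package:car" %in% search()
```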

Working with Files Data files (like the airquality dataset) come with base R. – You don’t technically have to load them, but you can get them to appear in the environment window by: – data(SET NAME)

Working with Files If you want to see what’s available, type data() Use the help(DATA SET NAME) or ?DATA SET NAME to see what is included/is part of the data set.

Working with Files Data files are not nearly as visual as Excel or SPSS. But RStudio can give you somewhat of a visual. – Type View(airquality) to get a visual (note V is capital) – Or click on it in environment window
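
A short sketch of looking at a built-in data set from the console, using airquality. View() is left commented out because it opens a viewer window and works best inside RStudio.

```r
data(airquality)        # makes the data set show up in the environment window

dim(airquality)         # number of rows and columns
names(airquality)       # column names
head(airquality, 3)     # first three rows

# View(airquality)      # spreadsheet-style viewer (note the capital V)
```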

Working with Files You can import all types of files, including SPSS files. – I find .csv easiest but that’s me. – You can do .txt with any separator (comma, space, tab)

Working with Files Import from RStudio – Pick your file and click open

Working with Files

This process is the same as: – real_words <- read.csv("FILE NAME") – The read.csv function has a lot more settings, but this process makes it easy to start working with files.

Working with Files You can also use the read.table function – which reads more than just csv files and allows you more flexibility in how you import the files.
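
A self-contained sketch comparing the two import functions. The file and its contents (the word and freq columns) are made up for illustration and written to a temporary file so the example runs without any data of your own.

```r
# Write a small example file so the sketch is self-contained
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(word = c("cat", "dog"), freq = c(10, 7)),
          csv_path, row.names = FALSE)

# read.csv assumes a header row and comma separators
real_words <- read.csv(csv_path)

# read.table needs those settings spelled out; changing sep is how you
# would read space- or tab-separated .txt files instead
real_words2 <- read.table(csv_path, header = TRUE, sep = ",")
```

Both calls return a data frame; for a tab-separated file you would pass sep = "\t" to read.table.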

Working with Files Importing SPSS files. – You need the memisc package. as.data.set(spss.system.file(SPSS DATA), use.value.labels=TRUE, to.data.frame=TRUE)

Working with Files All of these options will import your data set as a data frame.

Working with Files Clear the workspace – You don’t have to do this, but it helps if you want to start over. Click on clear in the environment window. – rm() and remove() functions do the same thing, but you have to type the object names.
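
A small sketch of clearing objects from the console. The object names are made up; the line that clears everything is commented out so it cannot wipe your workspace by accident.

```r
temp_scores <- 1:10
temp_label <- "hello"

rm(temp_scores)          # remove one object by typing its name

# rm(list = ls())        # remove everything, like the clear button
```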

Functions Functions are pre-written code to help you run analyses (so you don’t have to do the math yourself!). – So there are functions for the mean, variance, z-tests, ANOVA, regression, etc.

Functions How to get help on a function (or anything really) – ?function/name/thing – Try ?lm

Functions More help on functions: – help(function) – same as ?function – args(function) – tells you all the arguments that the function takes

Functions What do you mean arguments? Functions have a couple of parts – The name of the function – like lm, mean, var – The arguments – the pieces inside the () that the function uses to run (some are required, others have default values).

Functions Get help on functions: – example(function) – Gives you an example of the function in action.
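
A quick sketch of the help tools, using sd() as the example function. The help and example calls are commented out because they open the help pane or print a lot of output; run them interactively.

```r
# ?sd                    # opens the help page (same as help(sd))
# example(sd)            # runs the examples from the help page

args(sd)                 # prints the arguments the function takes

# The same information as a character vector
arg_names <- names(formals(sd))
arg_names
```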

Functions Let’s write a very simple function to square a number. You do have to save functions by setting them equal to a name. – pizza = function(x) { x^2 } – You can make more complex functions, adding more to the (x) part like (x,y,z). – The variables can be named anything, they just have to match in () and within the {}.

Functions pizza = function(x) { x^2 } – Here x is called the formal argument – it is the placeholder you use where you define the function. pizza(2) – Here 2 is the actual argument – it is the value you supply when you call the function and use it.
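
The pizza function written out and called, plus a two-argument version. pizza_power is a made-up name for illustration. Either = or <- works for saving a function; <- is used here.

```r
# x is the formal argument: a placeholder used inside the {}
pizza <- function(x) {
  x^2
}

pizza(2)                 # 2 is the actual argument; returns 4

# Hypothetical two-argument version: names in () match names in {}
pizza_power <- function(x, y) {
  x^y
}

pizza_power(2, 3)        # returns 8
```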

Functions Example functions: – table() – summary() – cov() – cor() – mean() – var() – sd() – scale() – recode()** In car package – relevel() – lower2full() ** Specific to lavaan

Table Function The table function gives you a frequency table of the values in a vector/column. table(OBJECT NAME)
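
For example, a frequency table of the Month column in airquality shows how many rows there are for each month. The name month_counts is just for illustration.

```r
# Count how many rows fall in each month (months are coded 5 to 9)
month_counts <- table(airquality$Month)
month_counts

# Pull out a single count by its label
month_counts[["5"]]
```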

Summary Function The summary function has several uses: – On a vector/data frame, it will give you basic statistics on that information – On a statistical analysis, it will give you the summary output (aka the boxes you are used to looking at in SPSS).

Summary Function
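
A sketch of both uses with the airquality data. The regression of Ozone on Temp is only there to have a statistical analysis to summarize; it is not a recommended model.

```r
# On a vector or data frame: basic descriptive statistics
summary(airquality$Temp)
summary(airquality)

# On a statistical analysis: the full output table
fit <- lm(Ozone ~ Temp, data = airquality)
fit_summary <- summary(fit)
fit_summary
```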

Descriptives Basic Descriptives – cov() – covariance table – cor() – correlation table – mean() – average – var() – variance – sd() – standard deviation

Descriptives Try taking the average of airquality$Ozone: mean(airquality$Ozone) – Darn! The answer is NA. – Stupid NAs! We’ve talked about how to deal with NAs globally, but here’s how they are handled in functions (generally)

Descriptives Try this line instead: – mean(airquality$Ozone, na.rm=TRUE) – na.rm = remove NAs? – The default is FALSE (lame). So you can subset the data or use that argument to tell it to ignore NAs.
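
The two options side by side, as a sketch. The variable names are made up for illustration.

```r
# Default behaviour: any NA in the column makes the result NA
mean_with_na <- mean(airquality$Ozone)

# How many values are missing?
n_missing <- sum(is.na(airquality$Ozone))

# Option 1: tell the function to drop NAs
mean_no_na <- mean(airquality$Ozone, na.rm = TRUE)

# Option 2: subset the data yourself, keeping only non-missing values
mean_subset <- mean(airquality$Ozone[!is.na(airquality$Ozone)])
```

Both options give the same answer; na.rm is just less typing.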

Descriptives Help / args are your friend**. – The var() function has na.rm – cov() and cor() do not; they use the use argument instead. **when they are actually helpful that is.

Descriptives Try: – cor(airquality, use = "pairwise.complete.obs")
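
A sketch comparing the default with the pairwise option. The variable names are made up for illustration.

```r
# Default: any pair involving a column with NAs comes back as NA
cor_default <- cor(airquality)

# Pairwise: each correlation uses all rows where both columns are present
cor_pairwise <- cor(airquality, use = "pairwise.complete.obs")

round(cor_pairwise, 2)
```

Note that with pairwise deletion each correlation can be based on a different number of rows.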

Rescoring Functions scale() will mean center or z-score your column. – scale(VARIABLE) gives z-scores – scale(VARIABLE, scale=FALSE) gives mean-centered scores
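
A sketch using the Temp column of airquality, which has no missing values. The variable names are made up for illustration.

```r
# z-scores: subtract the mean, then divide by the standard deviation
temp_z <- scale(airquality$Temp)

# Mean centered: subtract the mean only
temp_centered <- scale(airquality$Temp, scale = FALSE)

# scale() returns a one-column matrix; as.numeric() gives a plain vector
temp_z_vec <- as.numeric(temp_z)

mean(temp_z_vec)         # essentially zero
sd(temp_z_vec)           # essentially one
```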

Rescoring Functions recode() – in the car package, will allow you to reverse code/change the coding of a column – recode(COLUMN/VECTOR, "something=something")

Rescoring Functions Not quite rescoring, but super handy is – relevel() – Which allows you to change the reference group for dummy coded (factor) variables – relevel(FACTOR, ref="GROUP NAME")
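
A small sketch of relevel() on a made-up factor. By default R orders factor levels alphabetically, so the first level (the reference group) is often not the one you want.

```r
# Made-up example factor
group <- factor(c("low", "mid", "high", "low"))
levels(group)            # alphabetical order, so "high" comes first

# Make "low" the reference group instead
group2 <- relevel(group, ref = "low")
levels(group2)           # "low" is now first
```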

Lavaan Package We will use the lower2full function to build covariance matrices to run for SEM. – However, that function is deprecated. – So, use lav_matrix_lower2full(VECTOR OF NUMBERS)
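
To show the idea without needing lavaan installed, here is a base R sketch of what building a full matrix from its lower triangle involves. lower_to_full is a hypothetical helper written for illustration, not lavaan's code, and it assumes the numbers are typed row by row with the diagonal included, which is my understanding of what lav_matrix_lower2full expects by default. The values are made up.

```r
# Hypothetical base R helper, written only to illustrate the idea
lower_to_full <- function(x) {
  # Matrix size implied by the length of the vector
  n <- (sqrt(8 * length(x) + 1) - 1) / 2
  stopifnot(n == round(n))

  full <- matrix(0, n, n)
  # Filling the upper triangle column by column is the same as
  # filling the lower triangle row by row
  full[upper.tri(full, diag = TRUE)] <- x
  # Mirror the upper triangle into the lower triangle
  full[lower.tri(full)] <- t(full)[lower.tri(full)]
  full
}

# Made-up values, typed the way a lower triangle is printed
cov_vector <- c(1.0,
                0.5, 1.0,
                0.3, 0.2, 1.0)
cov_matrix <- lower_to_full(cov_vector)
cov_matrix

# With lavaan installed, the call from the slide would be:
# lavaan::lav_matrix_lower2full(cov_vector)
```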