Piotr Wolski Introduction to R. Topics What is R? Sample session How to install R? Minimum you have to know to work in R Data objects in R and how to.


Piotr Wolski Introduction to R

Topics
What is R?
Sample session
How to install R?
Minimum you have to know to work in R
Data objects in R and how to manipulate them
Exercises

What is R?
A programming "environment", in fact a programming language
Operated through the command line, no point and click
Rather relaxed approach to the term GUI: the R GUI is in fact an interface to the command line
Object-oriented
Free, open-source software
Cross-platform (Windows, Linux, Mac)
Scriptable, and thus good for analysing large datasets
Good with matrices and multidimensional arrays
Excellent graphics capabilities
Supported by a large user network (you can always ask for help online, or search through mailing list archives)
Contributed packages provide a multitude of procedures

Where does R come from?
R started in the early 1990s as a project by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, intended to provide a statistical environment for their teaching lab. The lab had Macintosh computers, for which no suitable commercial environment was available. R is based on an earlier statistical programming language called S.

Installing R
Download from CRAN (Comprehensive R Archive Network) project.org/ …follow instructions
This will install the R engine, the GUI and the base packages
Extra packages/libraries can be downloaded and installed from within R (easy), or from the CRAN website (not so easy)

R GUI and environment
The R GUI offers some administrative options, but all analyses are done through the command line or scripts
The working directory is where data are stored
The working directory depends on where you invoke R from, but it can be changed during a session
An R session begins when you actually start R
Data generated during the session are held in a workspace, which can be saved into a file
Only one workspace per session
You can import data from other workspace files into the current workspace
You cannot see data (objects) unless you ask to see them

R command line
You have to type
Basic syntax:
> command [enter]
Two "types" of commands:
> function() [enter]
Runs a function
> object [enter]
Returns the object (prints the object's contents to the screen)
Since a function in R is also an object:
> function [enter]
will display the function, but won't execute it!
Up and down arrows recall the previous/next command
There is also a history, a list of all issued commands, accessed from the menu
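
The difference between running a function and printing it can be tried with any base function. The sketch below uses sort() purely as an illustration; the object names are made up for this example.

```r
# Calling a function: the parentheses make R execute it.
x <- c(3, 1, 2)
sorted <- sort(x)     # runs sort() and stores the result
sorted                # typing an object name prints its contents

# Without parentheses the function is just another object.
# Typing `sort` alone would print its definition instead of running it.
is.function(sort)     # TRUE
class(sort)           # "function"
```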

Creation of objects
By assignment
"<-" is used to indicate assignment
> x<-c(1,2,3,4,5,6,7)
> x<-c(1:7)
> x<-b (assigns the contents of object b)
> x<-"b" (assigns the character string "b")
> x<- -2
> x<-read.table("data.txt")
Special case: empty vector:
> x<-c()
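
A minimal sketch of the assignments listed on this slide, using separate object names so each result can be inspected. The value given to b is hypothetical, and read.table() is left out because it needs a data file.

```r
x <- c(1, 2, 3, 4, 5, 6, 7)   # numeric vector built with c()
y <- c(1:7)                   # same values via the colon operator
b <- 10                       # hypothetical existing object
z <- b                        # copies the contents of object b
s <- "b"                      # a one-element character vector
neg <- -2                     # the space keeps "<-" and the minus sign apart
empty <- c()                  # c() with no arguments returns NULL

all(x == y)                   # TRUE
length(empty)                 # 0
```

Note that x< -2 (with the space in the wrong place) is a comparison, not an assignment, which is why the slide writes x<- -2.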

Naming convention
Names of objects:
must start with a letter (A-Z or a-z)
can contain letters, digits (0-9) and periods "."
are case-sensitive: mydata is different from MyData
Names of objects are not written in quotation marks
> "myData" is a one-element vector, and that element is a string
> myData is an object, and it can be a vector, array etc.
Balance between length and meaning:
X or tmin
B or tmin.clim
Climatological.mean.of.monthly.minimum.temperature or tmin.clim
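
The naming rules can be checked directly at the prompt. This is a small sketch; the values stored in tmin.clim are invented for illustration.

```r
mydata <- 1
MyData <- 2                        # a different object: names are case-sensitive
tmin.clim <- c(11.5, 12.1, 14.0)   # hypothetical values

"tmin.clim"                        # a character string, not the object
tmin.clim                          # the object itself

mydata == MyData                   # FALSE
is.character("tmin.clim")          # TRUE
is.numeric(tmin.clim)              # TRUE
```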

Managing workspace
During an R session, all objects are stored in a temporary working memory, or workspace
List objects in the workspace:
> ls()
Remove objects from the workspace:
> rm(object)
> rm(list=c("object1","object2"))
> rm(list=ls())
Objects that you want to access later must be saved in a workspace file, either from the menu bar or from the command line:
> save(x,file="MyData.Rdata")
To save all the objects:
> save.image("myData.Rdata")
A previously saved workspace can be loaded with:
> load("MyData.Rdata")
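
A save, remove, load round trip might look like the sketch below. Writing into tempdir() is an assumption made here so the example does not touch your own files; in practice you would save into your working directory.

```r
x <- c(1, 2, 3)
y <- "temporary"

ls()                                   # list objects in the workspace

# tempdir() is used only to keep this sketch away from real project files
f <- file.path(tempdir(), "MyData.Rdata")
save(x, file = f)                      # save selected objects to a file

rm(x, y)                               # remove both objects from the workspace

load(f)                                # restores x under its original name
x                                      # y was never saved, so it stays gone
```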

Managing working directory
All interaction with permanent data storage (reading files and workspaces, saving them) takes place within the working directory, unless you specify the path explicitly:
> load("/data/projects/MyData.Rdata")
> load("c:\\data\\projects\\MyData.Rdata")
(inside an R string a backslash must be doubled; forward slashes also work on Windows)
The working directory can be checked with:
> getwd()
It can be changed with:
> setwd("/new/working/directory")
> setwd("c:\\new\\working\\directory")
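
A short sketch of checking and changing the working directory. tempdir() stands in for a real project directory, and the path components passed to file.path() are placeholders.

```r
old <- getwd()                     # remember where we started

setwd(tempdir())                   # stand-in for a real project directory
getwd()                            # now reports the new location

# file.path() joins components with "/", which R accepts on every platform,
# so no backslash escaping is needed
p <- file.path("data", "projects", "MyData.Rdata")
p                                  # "data/projects/MyData.Rdata"

setwd(old)                         # restore the original working directory
```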

How to get help?
Within R:
> help.start()
Will start the manual/help/tutorial in a web browser
To display help on a given function use:
> help(function) or > ?function
e.g. help on the function mean():
> help(mean) or > ?mean
To search the help database for a string and return all functions that contain it:
> ??string

Other sources:
CRAN website: Manuals, FAQ, Contributed documents (a mine!)
Rseek (R-specific search engine)

R object types Vector Array (with special case: matrix) Data frame List Factor Function

Vector
A sequence of values (one-dimensional)
Only one mode (numeric, character, complex, or logical) is allowed within a vector
Can be created using the concatenate function c():
> x<-c(1,2,5,2,1)
> y<-c("may","june","july","august","september")
A vector has a length:
> length(x)
Logical vectors:
> b<-c(TRUE,TRUE,FALSE,FALSE,TRUE)
> b
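
The "only one mode" rule is easy to see in practice. In this sketch the vector named mixed is an added illustration; the other vectors come from the slide.

```r
x <- c(1, 2, 5, 2, 1)
y <- c("may", "june", "july", "august", "september")
b <- c(TRUE, TRUE, FALSE, FALSE, TRUE)

length(x)     # 5
mode(x)       # "numeric"
mode(y)       # "character"
mode(b)       # "logical"

# Mixing modes does not fail: R coerces every element to one common mode
mixed <- c(1, "a", TRUE)
mode(mixed)   # "character"
mixed         # "1" "a" "TRUE"
```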

Working with vectors
Select only one element:
> x[2]
Select a range of elements:
> x[1:3]
Select all but one element:
> x[-3]
Slicing: including only part of the object using an index vector:
> x[c(1,2,5)]
Select elements based on a logical condition:
> x[x>3]
> x[y=="july"]
Reversing a vector:
> x[10:1] (for a vector with 10 elements)
> x[length(x):1] (for a vector of any length)
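
The indexing forms above, applied to the vectors from the previous slide. The vector v is introduced here only because x reads the same in both directions, which would hide the effect of reversing.

```r
x <- c(1, 2, 5, 2, 1)
y <- c("may", "june", "july", "august", "september")

x[2]              # 2
x[1:3]            # 1 2 5
x[-3]             # 1 2 2 1
x[c(1, 2, 5)]     # 1 2 1
x[x > 3]          # 5
x[y == "july"]    # 5  (the element of x at the position where y is "july")

v <- c(10, 20, 30)
v[length(v):1]    # 30 20 10
rev(v)            # same result using the built-in rev()
```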

Working with vectors
Create a sequence of numbers:
> seq(10)
> seq(1, 10)
> seq(1, 100, 5)
Repeating a whole vector:
> rep(seq(3), 10)
Repeating each element of a vector in turn:
> rep(seq(3), each=10)
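
A quick look at what these calls return. Shorter repeat counts than on the slide are used so the output fits on one line.

```r
s1 <- seq(10)                 # 1 2 3 ... 10
s2 <- seq(1, 10)              # the same sequence, with start and end given
s3 <- seq(1, 100, 5)          # 1 6 11 ... 96 (step of 5)
length(s3)                    # 20

r1 <- rep(seq(3), 2)          # 1 2 3 1 2 3   (whole vector repeated)
r2 <- rep(seq(3), each = 2)   # 1 1 2 2 3 3   (each element repeated in turn)
```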

R magic - vector arithmetic Arithmetic operations are performed on EACH value of vector
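
The slide's own example did not survive in the transcript, so the following is only an assumed illustration of an operation being applied to each value of a vector.

```r
x <- c(1, 2, 3, 4)

x * 2        # 2 4 6 8
x + 10       # 11 12 13 14
x^2          # 1 4 9 16
sqrt(x)      # square root of every element
```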

R magic - vector arithmetic Vector operations are performed element by element
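
As an assumed illustration (the original slide example is missing from the transcript), two vectors of equal length are combined position by position:

```r
a <- c(1, 2, 3)
b <- c(10, 20, 30)

a + b     # 11 22 33
a * b     # 10 40 90
b / a     # 10 10 10
a > 1     # FALSE TRUE TRUE  (comparisons are element-wise too)
```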

R magic - vector arithmetic R recycles vector elements
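
Recycling means the shorter vector is reused from its start until the longer one is exhausted. The vectors below are invented for this sketch, since the slide's example is not in the transcript.

```r
long <- c(1, 2, 3, 4, 5, 6)
short <- c(10, 100)

long * short     # 10 200 30 400 50 600  (short is reused three times)

# If the longer length is not a multiple of the shorter one, R still computes
# the result but issues a warning; it is suppressed here to keep the output clean
uneven <- suppressWarnings(c(1, 2, 3) + c(10, 20))
uneven           # 11 22 13
```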

Vector functions Basic statistics
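
The list of functions shown on this slide is not in the transcript. The sketch below assumes it covered the usual base R summary functions; the data values are made up.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

mean(x)             # 5
median(x)           # 4.5
min(x)              # 2
max(x)              # 9
sum(x)              # 40
var(x)              # sample variance, 32/7
sd(x)               # sample standard deviation, sqrt(32/7)
range(x)            # 2 9
quantile(x, 0.05)   # 5th percentile
```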

Exercise 1:
Create a sequence from 1 to 100.
Create the following sequence: 99, 96, 93, …, 0
Create a sequence of values 1, 2, …, 12, repeated 10 times
Create a vector of 30 numbers (just use a "random number generator" in your head ;-)
Calculate its mean, standard deviation, variance, minimum, maximum and the sum of all values
Calculate the median and the 5th percentile
Calculate the minimum value of the first half of the vector (i.e. of the first 15 values), and of the second half (i.e. of the last 15 values)
Select every second value from that vector, and calculate their mean value