R – a brief introduction Johannes Freudenberg Cincinnati Children’s Hospital Medical Center



Overview
– History of R
– Getting started
– R as a calculator
– Data types
– Missing values
– Subsetting
– Importing/Exporting data
– Plotting and Summarizing data
– Resources

History of R
– The statistical programming language S has been developed at Bell Labs since 1976 (at the same time as UNIX)
– Intended to interactively support research and data analysis projects
– Exclusively licensed to Insightful (“S-Plus”)
– R: an open-source platform similar to S, developed by R. Gentleman and R. Ihaka (University of Auckland, NZ) during the 1990s
– Since 1997: international “R-core” development team
– Updated versions are released every couple of months

What R is and what it is not
R is
– a programming language
– a statistical package
– an interpreter
– Open Source
R is not
– a database
– a collection of “black boxes”
– a spreadsheet software package
– commercially supported

Getting started
To obtain and install R on your computer
1) Go to http://cran.r-project.org/mirrors.html to choose a mirror near you
2) Click on your favorite operating system (Linux, Mac, or Windows)
3) Download and install the “base”
To install additional packages
1) Start R on your computer
2) Choose the appropriate item from the “Packages” menu

R as a calculator
R can be used as a calculator:
> 5 + (6 + 7) * pi^2
[1] 133.3049
> log(exp(1))
[1] 1
> log(1000, 10)
[1] 3
> sin(pi/3)^2 + cos(pi/3)^2
[1] 1
> Sin(pi/3)^2 + cos(pi/3)^2
Error: couldn't find function "Sin"
The last call fails because R is case-sensitive: sin exists, Sin does not.
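
A few more lines in the same spirit, as a minimal sketch rather than part of the original slides. It shows that log() takes the base as an optional second argument and that function names are case-sensitive. The variable names (natural, base10, has_Sin, ...) are made up for illustration.

```r
# log() defaults to the natural logarithm; the second argument sets the base.
natural <- log(exp(2))
base10  <- log(1000, base = 10)
base2   <- log2(8)

# R is case-sensitive: exists() checks for a name without raising an error.
has_sin <- exists("sin")
has_Sin <- exists("Sin")

print(c(natural, base10, base2))
print(c(has_sin, has_Sin))
```

Checking with exists() first is one way to avoid the "couldn't find function" error when unsure about the spelling of a name.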

Basic (atomic) data types
Logical
> x <- T; y <- F
> x; y
[1] TRUE
[1] FALSE
Numerical
> a <- 5; b <- sqrt(2)
> a; b
[1] 5
[1] 1.414214
Character
> a <- "1"; b <- 1
> a; b
[1] "1"
[1] 1
> a <- "character"
> b <- "a"; c <- a
> a; b; c
[1] "character"
[1] "a"
[1] "character"
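
To see which type a value has, class() can be used. The following is a small sketch, not taken from the slides; the variable names are illustrative only. It also shows what happens when types are mixed inside c().

```r
# TRUE and FALSE are reserved words; T and F are ordinary variables
# that happen to hold TRUE and FALSE, so the long forms are safer.
x <- TRUE
a <- 5
s <- "1"

print(class(x))   # logical
print(class(a))   # numeric
print(class(s))   # character

# A character string that looks like a number must be converted explicitly.
s_plus_one <- as.numeric(s) + 1

# c() holds a single type, so mixed input is coerced to the most general one.
mixed <- c(1, "a", TRUE)
print(mixed)
print(class(mixed))
```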

Vectors, Matrices, Arrays
Vector
– Ordered collection of data of the same data type
– Examples: last names of all students in this class; mean intensities of all genes on an oligonucleotide microarray
– In R, a single number is a vector of length 1
Matrix
– Rectangular table of data of the same type
– Example: mean intensities of all genes measured during a microarray experiment
Array
– Higher-dimensional matrix

Vectors
Vector: Ordered collection of data of the same data type
> x <- c(5.2, 1.7, 6.3)
> log(x)
[1] 1.6486586 0.5306283 1.8405496
> y <- 1:5
> z <- seq(1, 1.4, by = 0.1)
> y + z
[1] 2.0 3.1 4.2 5.3 6.4
> length(y)
[1] 5
> mean(y + z)
[1] 4.2
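
Arithmetic on vectors works element by element, and a shorter vector is recycled to match a longer one. The sketch below reuses x, y and z from the slide; the names doubled, total and recycled are illustrative additions.

```r
x <- c(5.2, 1.7, 6.3)
y <- 1:5
z <- seq(1, 1.4, by = 0.1)

# Element-wise arithmetic: every element of x is multiplied by 2.
doubled <- x * 2

# Functions such as sum() reduce a vector to a single number.
total <- sum(y + z)

# Recycling: the length-2 vector is repeated three times to match length 6.
recycled <- 1:6 + c(10, 20)

print(doubled)
print(total)
print(recycled)
```

If the longer length is not a multiple of the shorter one, R still recycles but issues a warning.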

Matrices
Matrix: Rectangular table of data of the same type
> m <- matrix(1:12, 4, byrow = T); m
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[4,]   10   11   12
> y <- -1:2
> m.new <- m + y
> t(m.new)
     [,1] [,2] [,3] [,4]
[1,]    0    4    8   12
[2,]    1    5    9   13
[3,]    2    6   10   14
> dim(m)
[1] 4 3
> dim(t(m.new))
[1] 3 4
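
Why does m + y work although m has 12 elements and y only 4? A matrix is stored column by column, so y is recycled down each column and y[i] ends up added to row i. The sketch below makes that explicit; the name same is an illustrative addition.

```r
m <- matrix(1:12, nrow = 4, byrow = TRUE)
y <- -1:2

# y is recycled down the columns of m.
m.new <- m + y

# The same result, with the recycling written out:
# matrix() fills column by column unless byrow = TRUE is given.
same <- m + matrix(y, nrow = 4, ncol = 3)

print(identical(m.new, same))
print(m.new[, 1])      # first column of m (1, 4, 7, 10) plus y
print(dim(t(m.new)))   # transposing swaps rows and columns
```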

Missing values
R is designed to handle statistical data and is therefore well suited to dealing with missing values
Numbers that are “not available”
> x <- c(1, 2, 3, NA)
> x + 3
[1]  4  5  6 NA
“Not a number”
> log(c(0, 1, 2))
[1]      -Inf 0.0000000 0.6931472
> 0/0
[1] NaN
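
Because NA propagates through calculations, summaries of data with missing values usually need extra care. A minimal sketch using standard base R functions; the names plain_mean, clean_mean and observed are illustrative.

```r
x <- c(1, 2, 3, NA)

# Any calculation involving NA yields NA ...
plain_mean <- mean(x)

# ... unless missing values are dropped explicitly.
clean_mean <- mean(x, na.rm = TRUE)

# is.na() locates missing values; negate it to keep the observed ones.
observed <- x[!is.na(x)]

print(plain_mean)
print(clean_mean)
print(observed)

# NaN is also treated as missing by is.na(); -Inf is a number, but not finite.
print(is.na(0/0))
print(is.finite(log(0)))
```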

Subsetting
It is often necessary to extract a subset of a vector or matrix. R offers a couple of neat ways to do that:
> x <- c("a", "b", "c", "d", "e", "f", "g", "h")
> x[1]
> x[3:5]
> x[-(3:5)]
> x[c(T, F, T, F, T, F, T, F)]
> x[x <= "d"]
> m[,2]
> m[3,]
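
The subsetting calls above, annotated with what each one selects. This is a sketch that assumes m is the 4 x 3 matrix from the Matrices slide.

```r
x <- c("a", "b", "c", "d", "e", "f", "g", "h")
m <- matrix(1:12, 4, byrow = TRUE)   # assumed: the matrix from the earlier slide

first    <- x[1]          # by position: the first element
middle   <- x[3:5]        # a range of positions
dropped  <- x[-(3:5)]     # negative indices remove elements
every2nd <- x[c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)]  # logical mask
upto_d   <- x[x <= "d"]   # a condition produces the logical mask
col2     <- m[, 2]        # empty row index: all rows of column 2
row3     <- m[3, ]        # empty column index: all columns of row 3

print(dropped)
print(every2nd)
print(col2)
print(row3)
```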

Other Objects and Data Types
– Functions
– Factors
– Lists
– Data frames
We’ll talk about them later in the course

Importing/Exporting Data
Importing data
– R can import data from other applications
– Packages are available to import microarray data, Excel spreadsheets, etc.
– The easiest way is to import tab-delimited files
> my.data <- read.table("file", sep = "\t") *)
> SimpleData <- read.table(file = "…", header = TRUE, quote = "", sep = "\t", comment.char = "")
Exporting data
– R can also export data in various formats
– Tab delimited is the most common
> write.table(x, "filename") *)
*) Make sure to include the path or to first change the working directory
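
A round trip through a tab-delimited file can be sketched as follows. The data frame is made up for illustration (the column names C1 and W1 echo the slides, gene is hypothetical), and tempfile() is used so that no path needs to be assumed. Note that write.table() separates fields with a space unless sep = "\t" is given.

```r
# Toy data, invented for this example.
df <- data.frame(gene = c("g1", "g2", "g3"),
                 C1   = c(10.5, 20.1, 30.7),
                 W1   = c(11.2, 19.8, 29.9),
                 stringsAsFactors = FALSE)

# Export as tab-delimited text without quotes or row names.
path <- tempfile(fileext = ".txt")
write.table(df, file = path, sep = "\t", quote = FALSE, row.names = FALSE)

# Import again with matching settings.
back <- read.table(path, header = TRUE, sep = "\t", quote = "",
                   comment.char = "", stringsAsFactors = FALSE)
print(back)

# Relative file names are resolved against the working directory.
print(getwd())
```

With a real file, either pass the full path or call setwd() first, as the footnote on the slide says.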

Analyzing/Summarizing data
First, let’s take a look…
> SimpleData[1:10,]
Mean, Variance, Standard deviation, etc.
> mean(SimpleData[,3])
> mean(log(SimpleData[,3]))
> var(SimpleData[,4])
> sd(SimpleData[,3])
> cor(SimpleData[,3:4])
> colMeans(SimpleData[3:14])
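
SimpleData is not included here, so the sketch below runs the same summary functions on a tiny invented data frame (toy, with made-up values) whose results are easy to check by hand.

```r
# Invented stand-in for SimpleData: columns 3 and 4 hold the measurements.
toy <- data.frame(id   = 1:5,
                  name = letters[1:5],
                  C1   = c(2, 4, 4, 4, 6),
                  W1   = c(1, 2, 3, 4, 5),
                  stringsAsFactors = FALSE)

m_c1  <- mean(toy[, 3])       # arithmetic mean of column 3
v_w1  <- var(toy[, 4])        # sample variance (divides by n - 1)
s_c1  <- sd(toy[, 3])         # standard deviation = sqrt(variance)
r     <- cor(toy[, 3:4])      # 2 x 2 correlation matrix
means <- colMeans(toy[3:4])   # one mean per selected column

print(m_c1)
print(v_w1)
print(s_c1)
print(r)
print(means)
```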

Plotting
Scatter plot
> plot(log(SimpleData[,"C1"]), log(SimpleData[,"W1"]), xlab = "channel 1", ylab = "channel 2")
Histogram
> hist(log(SimpleData[,7]))
> hist(log(SimpleData[,7]), nclass = 50, main = "Histogram of W3 (on log scale)")
Boxplot
> boxplot(log(SimpleData[,3:14]))
> boxplot(log(SimpleData[,3:14]), outline = F, boxwex = 0.5, col = 3, main = "Boxplot of SimpleData")
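
The same three plot types can be tried without SimpleData. The sketch below uses simulated log-normal values (an assumption, chosen only so that a log scale makes sense) and writes the plots to a temporary PDF file so that it also runs in a non-interactive session.

```r
set.seed(42)
# Simulated stand-in for two intensity columns; not real microarray data.
toy <- data.frame(C1 = rlnorm(200, meanlog = 6),
                  W1 = rlnorm(200, meanlog = 6))

out <- tempfile(fileext = ".pdf")
pdf(out)   # send all following plots to the PDF file

# Scatter plot of the two channels on the log scale
plot(log(toy[, "C1"]), log(toy[, "W1"]),
     xlab = "channel 1", ylab = "channel 2")

# Histogram with a finer binning and a title
hist(log(toy[, "C1"]), nclass = 50,
     main = "Histogram of C1 (on log scale)")

# One box per column, outliers hidden
boxplot(log(toy), outline = FALSE, boxwex = 0.5, col = 3,
        main = "Boxplot of toy data")

invisible(dev.off())   # close the device so the file is written
print(out)
```

In an interactive session the pdf() and dev.off() calls can be left out; the plots then appear in a graphics window.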

Getting help… and quitting
Getting information about a specific command
> help(rnorm)
> ?rnorm
Finding functions related to a key word
> help.search("boxplot")
Starting the R installation help pages
> help.start()
Quitting R
> q()

Resources
Books
– Assigned text book
– For an extended list visit project.org/doc/bib/R-publications.html
Mailing lists
– R-help (project.org/mail.html)
– Bioconductor (ailList.html)
– However, first read the posting guide/general instructions and search the archives
Online documentation
– R Project documentation: Manuals, FAQs, …
– Bioconductor documentation: Vignettes, Short Courses, …
– Google
Personal communication
– me:
– Ask other R users

References
– H Chen: R-Programming. programming.ppt
– WN Venables and DM Smith: An Introduction to R
– http://cm.bell-labs.com/cm/ms/departments/sia/S/history.html