Second Annual Cytomics Workshop April, 2017


1 Introduction to R
Second Annual Cytomics Workshop April, 2017

2 Outline
Background
Bioconductor
Motivating examples
Starting R, entering commands
How to get help
R fundamentals: Sequences and Repeats, Characters and Numbers, Vectors and Matrices, Data Frames and Lists
Importing data from spreadsheets
Briefly emphasize that R is an excellent tool for data (statistical) analysis: it offers a powerful array of analysis tools; for flow, it can help eliminate human bias; it can automate repetitive analysis; and it can operate on very large data sets.

3 R
R is an integrated suite of software facilities for data manipulation, simulation, calculation and graphical display. It handles and analyzes data very effectively, and it contains a suite of operators for calculations on arrays and matrices. In addition, it has the graphical capabilities for very sophisticated graphs and data displays. It is an elegant, object-oriented programming language.
Started by Robert Gentleman and Ross Ihaka (hence “R”) in 1995 as a free, independent, open-source implementation of the S programming language (now part of Spotfire).
Currently maintained by the R Core development team, an international group of hard-working volunteer developers.

4

5 Bioconductor
Bioconductor “is an open source and open development software project to provide tools for the analysis and comprehension of genomic data.”
Goals:
To provide widespread access to a broad range of powerful statistical and graphical methods for the analysis of genomic data.
To provide a common software platform that enables the rapid development and deployment of extensible, scalable, and interoperable software.
To further scientific understanding by producing high-quality documentation and reproducible research.
To train researchers on computational and statistical methods for the analysis of genomic data.

6

7 Flow Cytometry in Bioconductor
About 40 packages specific to flow cytometry are available in Bioconductor. What’s so different about flow cytometry anyway?

8 A motivating example
I’ve just collected data from a T cell stimulation experiment in a 96-well plate format. I need to gate the data on CD3/CD4. How consistent are the distributions, so that I can establish one set of gates for the whole plate and be confident that the results are valid for all of the wells?

9 A motivating example

10 Another motivating example
I’m concerned that drawing gates to analyze my data introduces unintended bias. Additionally, since I have multiple data files, drawing multiple gates is time consuming. Can I use R to compute gates and then apply these same objective gating criteria to multiple data files?

11 Another motivating example
Automated gating of rare events

12 A third example
I often drain my tubes since I’m trying to acquire as many events as I can from a limited sample for a rare event assay. I’m concerned that the disruption of flow near the beginning and end of the acquisition (and sometimes in the middle due to minor clogs) may introduce an “artificial phenotype”. Is there some way to automatically detect and edit out portions of a file that aren’t consistent with the rest?
Cleaning data when the tube runs out

13 A third example Cleaning data when the tube runs out

14 Back to the basics
R is a command-line driven program. The prompt is: >
You type a command (shown in blue), and R executes the command and gives the answer (shown in black).
R follows exactly the directions you give it, even if these are not the directions you mean to give it! You must be very precise since R is case sensitive and has many syntactical requirements. However, once you learn these simple rules, it is an extremely fast and dynamic tool to analyze data. And don’t worry, there are many help tools, which we will explore later…
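A minimal sketch of a first console session, to make the point about precision concrete. The variable names (events, Events, root, has_Sqrt) are made up for illustration, and the last check assumes nothing called Sqrt has been defined in your session.

```r
# Type an expression; R evaluates it and prints the answer
2 + 3

# R is case sensitive: 'events' and 'Events' are two different variables
events <- 10000
Events <- 500
events == Events

# Function names are case sensitive too: sqrt() exists, Sqrt() normally does not
root <- sqrt(16)
has_Sqrt <- exists("Sqrt")
has_Sqrt
```

Typing Sqrt(16) at the prompt would stop with an error saying the function could not be found, which is R following your directions exactly.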

15 Simple example: enter a set of measurements
Use the function c() to combine terms together. Create a variable named mfi. Put the result of c() into mfi using the assignment operator <- (you can also use =). The [1] at the start of the printed output is the index of the first element on that line, which shows that the result is a vector. Everything that happens in R is a function call [e.g. c()].
Emphasize the create-a-variable step as crucial: in R you are constantly defining and redefining variables, so it’s a good idea to keep track of the variables you assign!
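The slide's code is not in the transcript, so here is a sketch of what entering a set of measurements might look like. The five MFI values and the name mfi_copy are invented for illustration.

```r
# Combine five hypothetical MFI readings into one vector and store it in 'mfi'
mfi <- c(1520, 1480, 1610, 1395, 1550)

# Typing the name prints the contents; the leading [1] is the index of the
# first element shown on that line
mfi

# '=' also assigns at the top level, but '<-' is the conventional form
mfi_copy = mfi

# Functions operate on the whole vector at once
mean(mfi)
length(mfi)
```

Assigning to mfi again silently replaces the old contents, which is why it pays to keep track of the variables you have defined.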

16 RStudio
RStudio is an Integrated Development Environment (IDE) for R.

17 RStudio Console

18 RStudio Editor

19 RStudio Env, History

20 RStudio Your best friend

21 RStudio lower right pane

22 RStudio lower right pane

23 RStudio lower right pane

24 RStudio lower right pane

25 RStudio help

26 RStudio help

27 RStudio help

28 RStudio help

29 Package Vignette – really good help!
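The help slides are screenshots, so this is a sketch of the same help tools as typed commands, using only base R. The names readers and available are illustrative. The interactive() guard is there only so the help pages open when you are at the prompt and are skipped when the code is run as a script.

```r
# Help pages open in a viewer, so only ask for them in an interactive session
if (interactive()) {
  help("read.csv")                     # same as typing ?read.csv
  help.search("standard deviation")    # same as ??"standard deviation"
}

# Find functions by name when you only remember part of it
readers <- apropos("^read\\.")
readers

# Show the arguments a function accepts
args(read.csv)

# List the vignettes of installed packages; vignette("name") opens one
available <- vignette()
head(available$results[, c("Package", "Item")])
```

In RStudio the same pages appear in the Help tab of the lower right pane.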

30 basic data structures

31 Sequences and Repeats
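The slide content here is an image, so this is a sketch of the usual sequence and repeat tools: the colon operator, seq() and rep(). The 96-well plate labels tie back to the motivating example; the names rows, cols and wells are illustrative.

```r
# The colon operator builds a sequence of consecutive integers
1:5

# seq() gives control over the step size or the number of points
seq(0, 100, by = 25)
seq(0, 1, length.out = 5)

# rep() repeats a whole vector (times) or each element in turn (each)
rep(c("stim", "unstim"), times = 3)
rep(c("stim", "unstim"), each = 3)

# Together they can label a 96-well plate: rows A-H, columns 1-12
rows  <- rep(LETTERS[1:8], each = 12)
cols  <- rep(1:12, times = 8)
wells <- paste0(rows, cols)
head(wells, 13)
```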

32 Characters and Numbers
Characters and character strings are enclosed in straight double quotes ("") or single quotes (''). Special numbers: NA (“Not Available”), Inf (“Infinity”), NaN (“Not a Number”).
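A short sketch of how strings and the special values behave. The sample name and event counts are invented for illustration.

```r
# Double and single quotes produce the same character string
sample_name <- "PBMC donor 1"
same_name   <- 'PBMC donor 1'
identical(sample_name, same_name)

# Inf and NaN arise from arithmetic
pos_inf <- 1 / 0    # Inf
not_num <- 0 / 0    # NaN

# NA marks a missing measurement and propagates through calculations
counts <- c(1200, NA, 950)
mean(counts)                  # NA, because one value is missing
mean(counts, na.rm = TRUE)    # drop the missing value first
is.na(counts)
```

Most summary functions take na.rm = TRUE, which is usually what you want when a well or tube failed to acquire.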

33 Factors
Factors capture categorical data (variables that take on discrete, often descriptive, values). We’ll see more about factors when we talk about data frames …
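A minimal sketch of a factor, assuming a hypothetical treatment label for five tubes.

```r
# A factor stores each distinct label once (its levels) plus an integer code
treatment <- factor(c("stim", "unstim", "stim", "unstim", "stim"))
treatment

# Levels are sorted alphabetically unless you say otherwise
levels(treatment)

# Count how many observations fall into each category
table(treatment)

# The underlying integer codes index into the levels
as.integer(treatment)

# Set the level order explicitly, e.g. to put the control first
treatment_ordered <- factor(treatment, levels = c("unstim", "stim"))
levels(treatment_ordered)
```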

34 Vectors and Matrices

35 Vectors and Matrices
The subset operator for vectors and matrices is [ ]. A subset operator extracts selected elements of an object, chosen by position, by name, or by a logical condition.
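A sketch of [ ] in use on a vector and a matrix. The values are invented; the matrix is filled column by column, which is R's default.

```r
mfi <- c(1520, 1480, 1610, 1395, 1550)

mfi[2]             # a single element by position
mfi[c(1, 3)]       # several positions at once
mfi[-1]            # everything except the first element
mfi[mfi > 1500]    # elements that satisfy a condition

# For a matrix, give [row, column]; leave one side blank to take all of it
m <- matrix(1:6, nrow = 2)
m
m[2, 3]    # row 2, column 3
m[1, ]     # all of row 1
m[, 2]     # all of column 2
```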

36 Vectors and Matrices
You can extend the length of a vector by assigning to an index past its end … but not a matrix.
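A sketch of the difference, using made-up values. The tryCatch() wrapper is only there to capture the error message so the example runs to the end; at the prompt you would simply see the error.

```r
# Assigning past the end of a vector grows it; skipped slots become NA
v <- c(1, 2, 3)
v[5] <- 10
v

# The same trick on a matrix fails: row 3 does not exist in a 2 x 2 matrix
m <- matrix(1:4, nrow = 2)
msg <- tryCatch({
  m[3, 1] <- 0L
  "no error"
}, error = function(e) conditionMessage(e))
msg
```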

37 Vectors and Matrices
However, all's not lost if you want to extend either the columns, with cbind(), … or the rows, with rbind().
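Base R's cbind() and rbind() are the standard way to add columns or rows to a matrix; a minimal sketch with made-up values follows. Both return a new matrix rather than changing m in place.

```r
m <- matrix(1:4, nrow = 2)

# Add a column: the new vector needs one value per row
wider <- cbind(m, c(5L, 6L))
wider
dim(wider)

# Add a row: the new vector needs one value per column
taller <- rbind(m, c(7L, 8L))
taller
dim(taller)
```

To keep the result, assign it back, for example m <- cbind(m, c(5L, 6L)).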

38 Data Frames
A Data Frame is like a matrix, except that the data type in each column need not be the same (data polymorphism). Often, a Data Frame is created from an Excel spreadsheet: in Excel, use Save As… to export a tab-delimited text file (or a CSV file), then read that file into R with the function read.table() or read.csv().
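A sketch of a small data frame built by hand, to show columns of different types side by side. The column names and values are hypothetical.

```r
# Each column has its own type: character, logical and numeric
plate <- data.frame(
  well       = c("A1", "A2", "A3"),
  stimulated = c(TRUE, FALSE, TRUE),
  cd4_pct    = c(42.1, 38.7, 45.3),
  stringsAsFactors = FALSE
)
plate

# Check the type of every column
sapply(plate, class)

# Pull out one column with $, or filter rows with [ ]
plate$cd4_pct
stim_only <- plate[plate$stimulated, ]
stim_only
```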

39 Data Frames from spreadsheets

40 Data Frames from spreadsheets

41 Data Frames from spreadsheets
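These slides are screenshots, so here is a self-contained sketch of the import step. It writes a small tab-delimited file to a temporary location to stand in for the file you would export from Excel; the column names and values are invented.

```r
# Stand-in for a file exported from Excel as tab-delimited text
path <- tempfile(fileext = ".txt")
writeLines(c("well\ttreatment\tcd4_pct",
             "A1\tstim\t42.1",
             "A2\tunstim\t38.7",
             "A3\tstim\t45.3"), path)

# header = TRUE uses the first line as column names; sep = "\t" means tabs
plate <- read.table(path, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
plate
str(plate)

# read.delim() is a shortcut with these defaults; read.csv() is the
# equivalent for comma-separated files
plate2 <- read.delim(path, stringsAsFactors = FALSE)

unlink(path)  # remove the temporary file
```

For your own data, replace path with the location of the exported file, for example one chosen with file.choose().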

42 Lists
Lists are to Vectors as Data Frames are to Matrices: a list can hold elements of different types and lengths.
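A minimal sketch of a list describing one hypothetical tube. All names and values are invented for illustration.

```r
# A list can mix a string, a character vector, an integer and a logical
sample_info <- list(
  id       = "tube_01",
  markers  = c("CD3", "CD4"),
  n_events = 25000L,
  gated    = FALSE
)

# $ and [[ ]] return the element itself
sample_info$markers
sample_info[["n_events"]]

# Single brackets return a shorter list, not the element
id_only <- sample_info["id"]
class(id_only)

# Add a new element by assigning to a new name
sample_info$panel <- "T cell"
names(sample_info)
```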

