Using R Harry R. Erwin, PhD School of Computing and Technology University of Sunderland.



Resources: Crawley, MJ (2005) Statistics: An Introduction Using R. Wiley. Gonick, L., and Woollcott Smith (1993) The Cartoon Guide to Statistics. HarperResource (for fun).

Lecture Outline: R as a statistical calculator; creating data; graphing and plotting; statistical distributions; dataframes; summarising data.

Using R: We will work through a few examples of statistical calculations and creating data. Examples: y<-c(3,7,9,11); z<-scan(); a<-1:6; b<-seq(0.5,0.0,-0.1). rep(value,count) creates a vector containing value repeated count times. gl(upTo,repeats) generates factor data with levels 1 to upTo, each repeated repeats times.
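The data-creation calls on this slide can be tried as a short script. This is a minimal sketch: the variable names r and f and the values passed to rep() and gl() are chosen here for illustration, and scan() is left out because it waits for keyboard input.

```r
# Ways of creating vectors, mirroring the slide.
y <- c(3, 7, 9, 11)            # combine explicit values
a <- 1:6                       # integer sequence 1, 2, ..., 6
b <- seq(0.5, 0.0, by = -0.1)  # descending sequence in steps of 0.1
r <- rep(9, 3)                 # illustrative: the value 9 repeated three times

# gl(n, k) builds a factor with n levels, each repeated k times.
f <- gl(3, 2)                  # illustrative: levels 1 to 3, two of each

print(y)
print(a)
print(b)
print(r)
print(f)
```

Passing by = explicitly to seq() makes the step size easier to read than relying on argument position.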

Graphics: Examples of plot(). Use ?par for help on graphics parameters.
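The slide's own plot() examples are not in the transcript, so the following is only a sketch with made-up data. It shows plot() drawing points and a line, and par() changing one graphics parameter (mfrow, the panel layout) and then restoring it.

```r
# Hypothetical data, for illustration only.
x <- 1:10
y <- x^2

# par() returns the previous settings, so they can be restored later.
old_par <- par(mfrow = c(1, 2))   # two panels side by side

plot(x, y, main = "Points", xlab = "x", ylab = "x squared")
plot(x, y, type = "l", col = "blue", main = "Line")

par(old_par)                      # restore the previous layout
```

When run non-interactively, R sends the plots to a file rather than a window.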

Working with Dataframes: R works with data in dataframes, objects with rows and columns. Each row is an observation or a measurement. Each column contains the values of a variable. Variable types include numbers, text (factors), dates, or logical variables. Columns have names. Rows have row.names.

Reading a Dataframe: worms<-read.table("worms.txt", header = TRUE, row.names = 1); attach(worms); names(worms). If the row.names or the names are unsuitable, you can assign new values to them. Typing worms prints the dataframe; summary(worms) summarises each column.
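Since worms.txt is not included here, the sketch below feeds read.table() a small invented table through its text argument instead of a file. The field names and values are hypothetical stand-ins (only Area and Slope appear in the slides), but the header and row.names arguments are used as on the slide.

```r
# Hypothetical stand-in for worms.txt; the first column supplies the row names.
txt <- "Field Area Slope Vegetation
Meadow1 3.6 11 Grassland
Meadow2 5.1 2 Arable
Orchard 1.9 0 Orchard
Copse 0.8 3 Scrub"

worms <- read.table(text = txt, header = TRUE, row.names = 1)

names(worms)       # column (variable) names
row.names(worms)   # row labels taken from the first column
summary(worms)     # one summary per column

# Names can be replaced if the ones read from the file are unsuitable.
names(worms) <- c("area", "slope", "vegetation")
worms
```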

Selecting Rows or Columns: worms[,1:3] selects all the rows and columns 1-3. worms[5:15,] selects the middle rows. worms[Area>3 & Slope < 3,] selects rows by logical tests. To sort a dataframe, you have to designate the columns to be sorted and the column to base the sort on: worms[order(worms[,1]),1:6]. Example of a reverse sort.
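The selections above can be tried on a small dataframe. This sketch uses invented values and a hypothetical Damp column. It writes worms$Area rather than Area so that it works without attach(), and it uses decreasing = TRUE as one possible way to get a reverse sort.

```r
# Hypothetical data; Area and Slope are the column names used on the slide.
worms <- data.frame(
  Area  = c(3.6, 5.1, 2.8, 2.4, 3.8),
  Slope = c(11, 2, 3, 5, 0),
  Damp  = c(FALSE, FALSE, FALSE, TRUE, TRUE),
  row.names = c("A", "B", "C", "D", "E")
)

worms[, 1:2]                                   # all rows, columns 1 to 2
worms[2:4, ]                                   # rows 2 to 4, all columns
selected <- worms[worms$Area > 3 & worms$Slope < 3, ]  # rows passing both tests

# Sort the whole dataframe on its first column (smallest Area first).
by_area <- worms[order(worms[, 1]), ]

# Reverse sort: largest Area first.
by_area_desc <- worms[order(worms$Area, decreasing = TRUE), ]

print(selected)
print(by_area)
print(by_area_desc)
```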

Vectors: y<-c(5,7,7,8,2,5,6,6,7,5,8,3,4); z<-13:1. Try mean, var, range, max, min, summary, IQR, fivenum. Subscripting examples: y[3]; y[3:7]; y[c(3,5,6,9)]; y[-1]; y[-length(y)]; y[y>6]; z[y>6]; y[y%%3!=0].
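The subscripting examples can be run as written, with comments on what each one selects. Note that the modulo operator in R is %%, so the last example keeps the values that are not divisible by 3. The names big_y, z_where_big and not_mult3 are introduced here only to hold the results.

```r
y <- c(5, 7, 7, 8, 2, 5, 6, 6, 7, 5, 8, 3, 4)
z <- 13:1

# Summary statistics suggested on the slide.
mean(y); var(y); range(y); max(y); min(y)
summary(y); IQR(y); fivenum(y)

y[3]                 # the third element
y[3:7]               # elements 3 to 7
y[c(3, 5, 6, 9)]     # elements at the listed positions
y[-1]                # everything except the first element
y[-length(y)]        # everything except the last element

big_y       <- y[y > 6]         # values of y greater than 6
z_where_big <- z[y > 6]         # values of z at the positions where y > 6
not_mult3   <- y[y %% 3 != 0]   # values of y not divisible by 3

print(big_y)
print(z_where_big)
print(not_mult3)
```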

Vector Operations: * is element-wise vector multiplication. If the vectors are not the same length, the shorter vector is repeated as needed. To join vectors, use the c() function (see ?c). Subscripting can be based on a number, a vector, or a logical test. To drop an element, subscript with a minus sign in front. Vectors can be combined into matrices with cbind() and rbind().
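A short sketch of these operations, using invented vectors a and b. The lengths are chosen so that the shorter vector divides evenly into the longer one; when it does not, R still recycles but issues a warning.

```r
a <- c(1, 2, 3, 4, 5, 6)
b <- c(10, 100)

# Element-wise multiplication; b is recycled as 10, 100, 10, 100, 10, 100.
prod_ab <- a * b

joined  <- c(a, b)        # join two vectors end to end
dropped <- a[-2]          # drop the second element

cols <- cbind(a, a^2)     # bind as columns: a 6 x 2 matrix
rows <- rbind(a, a^2)     # bind as rows:    a 2 x 6 matrix

print(prod_ab)
print(joined)
print(dropped)
print(cols)
print(rows)
```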

Arrays, etc.: Arrays are like vectors or dataframes with multiple dimensions. Lists can be used to combine data of different types: val <- list(varname=value,…). Although vectors are subscripted using [], list elements are extracted with [[]]. Factors are special: citizen <- factor(c("US","US","UK")). Examples from book.
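The book's examples are not reproduced in the transcript, so this sketch uses invented contents for the array and the list. The factor is the one from the slide.

```r
# A three-dimensional array: 2 rows, 3 columns, 4 layers.
arr <- array(1:24, dim = c(2, 3, 4))
last_cell <- arr[2, 3, 4]          # one subscript per dimension

# A list can hold elements of different types (contents are illustrative).
val <- list(name = "worms", count = 20L, slopes = c(11, 2, 3))
val[["slopes"]]                    # [[ ]] extracts the element itself
val$count                          # $ is a shorthand for [[ ]] with a name
val["slopes"]                      # [ ] returns a list of length one

# Factors store categories as integer codes plus a set of levels.
citizen <- factor(c("US", "US", "UK"))
levels(citizen)                    # the distinct categories, sorted
as.integer(citizen)                # the underlying codes
table(citizen)                     # counts per level
```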

Sorting and Ordering: Never sort a dataframe column on its own, because the other columns are not reordered with it. So don’t use sort(). Instead use order(), which leaves the dataframe unmodified. It returns a vector of subscripts, not values; use that vector to subscript the dataframe and show its rows in the new order.
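The difference between sort() and order() shows up clearly on a tiny example. The dataframe df and its columns are invented for illustration.

```r
# Hypothetical dataframe: each name belongs with its score.
df <- data.frame(name = c("c", "a", "b"), score = c(30, 10, 20))

# sort() returns the sorted values only; the names are left behind.
sorted_scores <- sort(df$score)

# order() returns the row positions that would put score in ascending order.
idx <- order(df$score)

# Subscripting with idx reorders whole rows, so names stay with their scores.
sorted_df <- df[idx, ]

print(sorted_scores)
print(idx)
print(sorted_df)
print(df)            # the original dataframe is unchanged
```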

Table: Suppose vals is a collection of vectors. table(vals) reports the count of each unique value. tapply takes three arguments: the variable or dataframe to be summarised, the variable by which the summary is classified, and the function to apply. Examples.
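The slide's examples are not in the transcript. As a sketch with invented data, table() counts the values of a single vector, and tapply() computes a mean of growth within each level of diet. The names growth and diet are hypothetical.

```r
# Counting unique values.
vals <- c("a", "b", "a", "c", "a", "b")
counts <- table(vals)

# tapply(variable to summarise, classifying variable, function to apply)
growth <- c(10, 12, 20, 22, 30, 34)
diet   <- factor(c("low", "low", "mid", "mid", "high", "high"))
means  <- tapply(growth, diet, mean)

print(counts)
print(means)
```

The result of tapply() is labelled by the levels of the classifying factor, which R orders alphabetically unless told otherwise.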

Data Manipulation: To convert a continuous variable into a categorical variable, use cut(vals,levels), where levels is the number of intervals. You can also specify the break points. split() can be used to generate a list of vectors on the basis of the levels of a factor. Example.
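A sketch of both forms of cut() followed by split(), using invented heights. The break points and labels are assumptions chosen for illustration.

```r
# Hypothetical measurements.
heights <- c(150, 162, 171, 185, 199)

# Give a number of intervals and let R choose equal-width break points.
auto_bands <- cut(heights, 3)

# Or specify the break points (and, optionally, labels) yourself.
bands <- cut(heights,
             breaks = c(140, 160, 180, 200),
             labels = c("short", "medium", "tall"))

# split() returns a list with one vector per factor level.
by_band <- split(heights, bands)

print(auto_bands)
print(bands)
print(by_band)
```

By default each interval is open on the left and closed on the right, so a value equal to a break point falls in the lower interval.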

Saving your Work: history(Inf); savehistory("filename"); save(list=ls(), file = "filename"). Tidying up: rm(var) removes any temporary variables; detach(dataframes) detaches any attached dataframes. rm(list=ls()) will clean up everything.
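A minimal sketch of the save-and-tidy cycle, writing to a temporary file rather than a real filename. It saves two named objects instead of everything returned by ls(), and it leaves out the history functions, which are intended for interactive sessions.

```r
y <- c(3, 7, 9, 11)
z <- 13:1

# Save the chosen objects to a file (a temporary path is used here).
path <- tempfile(fileext = ".RData")
save(y, z, file = path)

# Tidy up: remove the objects from the workspace.
rm(y, z)
gone <- !exists("y", inherits = FALSE) && !exists("z", inherits = FALSE)

# Restore them later with load(), which recreates the objects by name.
load(path)

print(gone)
print(y)
print(z)
```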

Conclusions: There are other tools and languages: Minitab, SAS, spreadsheets. Use what you’re comfortable with. But professional statisticians use R.