Introduction to R. Damaris Zurell, Dynamic Macroecology, Swiss Federal Research Institute WSL


1 Introduction to R. Damaris Zurell, Dynamic Macroecology, Swiss Federal Research Institute WSL, damaris.zurell@wsl.ch, http://www.r-project.org/

2 R is a tool …
Data manipulation; data modelling (y ~ x); data visualisation

3 R is a tool …
Data manipulation: integrating different data sources; aggregating, disaggregating and transforming data ...
Data modelling (y ~ x): statistical modelling; numeric simulations
Data visualisation: visualising models; making your own graphics

4 R is an environment The R environment: „more than an incremental accretion of very specific and inflexible tools“ „fully planned and coherent system“

5 R history
First, there was S: developed in 1976 by John Chambers at Bell Laboratories (AT&T) as a programming language for statistics, stochastic simulation, and graphical display
1988: commercial implementation as S-PLUS (Insightful Corp.)
1992: Ross Ihaka and Robert Gentleman start the free implementation R under the GNU General Public License, mainly for teaching purposes

6 R history
1997: founding of the R Development Core Team (abbreviated: R Core Team), today about 20 members from academia and industry
1998: founding of the Comprehensive R Archive Network (CRAN), today >4000 additional packages
2000: first version completely compatible with S: R-1.0.0

7 R pros and cons
Pros:
Open source, runs on many operating systems
"At the pulse of science": new methods are implemented in R by scientists/developers and made available as packages
Publication-ready graphics
Excellent for simulations, programming, computer-intensive analyses, and automation
Best option for statistical computing
Active user community: help from the R Core Team, the R-Help mailing list, fast bug-fixing

8 R pros and cons
Pros:
Open source, runs on many operating systems
"At the pulse of science": new methods are implemented in R by scientists/developers and made available as packages
Publication-ready graphics
Excellent for simulations, programming, computer-intensive analyses, and automation
Best option for statistical computing
Active user community: help from the R Core Team, the R-Help mailing list, fast bug-fixing
Cons:
No fancy graphical user interface; bulky
Steep learning curve for newcomers, high beginner's frustration
Easy to make mistakes
Computation on big data sets is limited by RAM
"Many roads lead to Rome": there are usually several ways to do the same thing

9 R is …
An interpreted programming language: commands are executed immediately
Data types: empty values, numerical, logical, character
Data structures/object types: scalar, vector, matrix, array, data frame, list
During one session, all objects are stored in your workspace
Built-in and self-defined functions
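A minimal sketch of the data types and object types listed on this slide. All object names and values are made up for illustration only.

```r
# Basic data types (values are arbitrary examples)
n <- 3.14      # numerical
b <- TRUE      # logical
s <- "oak"     # character
e <- NA        # missing value (NULL is the empty object)

# Data structures / object types
v  <- c(10, 13, 6)                     # vector (a scalar is a vector of length 1)
m  <- matrix(1:6, nrow = 2)            # matrix with 2 rows and 3 columns
a  <- array(1:24, dim = c(2, 3, 4))    # three-dimensional array
df <- data.frame(id = 1:3, mass = v)   # data frame: columns may differ in type
l  <- list(numbers = v, label = s)     # list: components of arbitrary type

class(df)   # object type of df
dim(a)      # dimensions of the array
ls()        # names of all objects currently stored in the workspace
```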

10 R is plain

11 Command line language: This is the prompt: > All commands follow after the prompt

12 R is a great calculator
Simple algebra
> 2+2
[1] 4
Assign your results to a variable (R is case sensitive)
> x <- 2+2   # assignment operator "<-"
> x^2
[1] 16
Vector-based calculations
> mass <- c(10,13,6)   # 3 masses
> acceleration <- c(2.2,1.7,3.1)
> (force <- mass * acceleration)
[1] 22.0 22.1 18.6

13 R is a great calculator
Simple statistics
> (x <- sample(1:20,10))
[1]  4 15 12 14 18  3  9 20 19 16
> mean(x)
[1] 13
> sd(x)
[1] 5.981453
Set operations: union, intersect, setdiff
Advanced statistics
> pbinom(40,100,0.5)   # coin toss: is the coin unbiased?
[1] 0.02844397
> (pshare <- pbirthday(18,366,coincident=2))
[1] 0.3461382
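The slide only names the set operations. A short sketch of how they behave, using two made-up vectors:

```r
# Two arbitrary example vectors
a <- c(1, 3, 5, 7)
b <- c(5, 7, 9)

u <- union(a, b)       # elements in a or b:      1 3 5 7 9
i <- intersect(a, b)   # elements in both:        5 7
d <- setdiff(a, b)     # elements in a but not b: 1 3
```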

14 R is a numeric simulator
Built-in functions for common probability distributions
e.g. simulate 10 000 pseudo-random numbers from 100 coin tosses: how often do you get heads?
> heads <- rbinom(10000,100,0.5)
> hist(heads)

15 R Probability distributions
Functions:
d (density): probability density function
p (probability): cumulative distribution function
q: calculate quantiles
r: draw random numbers
Examples:
Normal: dnorm, pnorm, qnorm, rnorm
Binomial: dbinom, pbinom, …
Poisson: dpois, …

16 R Probability distributions
> ?distributions
Function      Distribution
_beta()       Beta
_binom()      Binomial
_cauchy()     Cauchy
_chisq()      χ²
_exp()        Exponential
_f()          F
_gamma()      Gamma
_geom()       Geometric
_hyper()      Hypergeometric
_logis()      Logistic
_lnorm()      Lognormal
_multinom()   Multinomial
_nbinom()     Negative binomial
_norm()       Normal
_pois()       Poisson
_signrank()   Wilcoxon signed rank statistic (one-sample case)
_t()          t
_unif()       Uniform
_weibull()    Weibull
_wilcox()     Wilcoxon rank sum statistic (two-sample case)
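As a possible follow-up to the histogram, the simulated tosses can be compared with the exact binomial probability from the previous slide. The seed and the variable names `simulated` and `exact` are assumptions added here for reproducibility and are not part of the original slides.

```r
set.seed(123)                      # arbitrary seed, only for reproducibility
heads <- rbinom(10000, 100, 0.5)   # 10 000 experiments of 100 coin tosses each

simulated <- mean(heads <= 40)     # share of experiments with at most 40 heads
exact     <- pbinom(40, 100, 0.5)  # exact probability P(X <= 40)

c(simulated = simulated, exact = exact)   # the two values should be close
```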
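The four prefixes can be tried out on the normal distribution. This is only a small sketch; the seed and the object name `draws` are arbitrary choices.

```r
dnorm(0)        # density of the standard normal at 0:  0.3989423
pnorm(1.96)     # cumulative probability P(X <= 1.96):  0.9750021
qnorm(0.975)    # quantile for probability 0.975:       1.959964

set.seed(1)             # arbitrary seed, only for reproducibility
draws <- rnorm(5)       # five pseudo-random draws from N(0, 1)

# p and q are inverse to each other
pnorm(qnorm(0.975))     # 0.975
```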

17 R accepts all kinds of data sources
Files (text, binary, data sets from other statistics programs)
> example <- read.csv("example.csv",header=T)
> example2 <- read.table("example2.txt",header=T)
Clipboard
> cohesion <- read.table(file="clipboard",sep="\t",header=T)
Database
> library(RODBC)
> mdbConnect <- odbcConnectAccess("GPDDdist")
> sqlTables(mdbConnect)
Web
> con <- url('http://anywebsite.com/test.txt')
> example3 <- read.table(con, header=T)
R objects (binary)
> load("example.RData")

18 R writes to all kinds of data sources
To files
> write.csv(example,"example.csv")
> write.table(example,"example2.txt",row.names=F)
To the clipboard
> write.table(CORMAT,file="clipboard",sep="\t",col.names=NA)
To databases
> channel <- odbcConnect("test")
> sqlSave(channel, USArrests, rownames = "state", addPK=TRUE)
> close(channel)
To R objects (the file name must be passed as file=)
> save(example3,file="example.RData")
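A self-contained round trip through a CSV file and an .RData file might look as follows. The data frame contents are invented, and temporary files are used so that nothing in the working directory is overwritten.

```r
# Invented example data
example <- data.frame(species = c("a", "b", "c"),
                      mass    = c(10.5, 13, 6.2))

# Text file: write, then read back
csv_file <- tempfile(fileext = ".csv")
write.csv(example, csv_file, row.names = FALSE)   # omit the row-name column
example_csv <- read.csv(csv_file, header = TRUE)

# Binary R object: save, remove from the workspace, restore
rdata_file <- tempfile(fileext = ".RData")
save(example, file = rdata_file)   # file= must be named
rm(example)
load(rdata_file)                   # restores the object under its old name

all.equal(example, example_csv)    # both copies hold the same data
```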

19 R visualising
Many graphic functions are generic: they respond "intelligently" to different object types
> plot(iris)
> with(iris, plot(Petal.Length,Petal.Width, pch=as.numeric(Species)))

20 R visualising Many graphic functions are generic – they respond „intelligently“ to different object types > boxplot(iris) > boxplot(Petal.Length~Species, data=iris,ylab="Petal.Length")
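What "generic" means can be sketched with a small self-made generic function. The function `describe` is hypothetical and exists only for this illustration; `plot` and `boxplot` select their methods by the same dispatch mechanism.

```r
# A hypothetical generic: the method is chosen by the class of x
describe <- function(x, ...) UseMethod("describe")

describe.numeric <- function(x, ...) {
  sprintf("numeric, mean = %.2f", mean(x))
}

describe.factor <- function(x, ...) {
  sprintf("factor with %d levels", nlevels(x))
}

describe(iris$Petal.Length)   # dispatches to describe.numeric
describe(iris$Species)        # dispatches to describe.factor

class(iris)                   # "data.frame": the class plot(iris) reacts to
```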

21 R visualising R Graph Gallery: http://gallery.r-enthusiasts.com/thumbs.php

22 R statistical modelling
Linear model
> fm <- lm(y ~ x, data=dummy)
> summary(fm)
Call:
lm(formula = y ~ x, data = dummy)
Residuals:
    Min      1Q  Median      3Q     Max
-4.3400 -1.7353 -0.2107  1.4644  4.8445
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.9150     1.2155   1.575    0.133
x             0.8581     0.1015   8.457  1.1e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.617 on 18 degrees of freedom
Multiple R-squared: 0.7989, Adjusted R-squared: 0.7877
F-statistic: 71.52 on 1 and 18 DF, p-value: 1.102e-07
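The slide does not show how `dummy` was created. The sketch below assumes a simulated data frame with 20 observations (which matches the 18 residual degrees of freedom) and noise that grows with x. Because the data are random, the estimates will differ from the output above.

```r
set.seed(42)                         # arbitrary seed, only for reproducibility
x <- 1:20
w <- 1 + sqrt(x) / 2                 # assumed: standard deviation grows with x
dummy <- data.frame(x = x, y = x + rnorm(20) * w)

fm <- lm(y ~ x, data = dummy)        # linear model y = a + b * x

coef(fm)          # estimated intercept and slope
confint(fm)       # 95 % confidence intervals for both coefficients
df.residual(fm)   # 20 observations - 2 parameters = 18
```

Calling `plot(dummy); abline(fm)` would draw the data with the fitted line.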

23 R statistical modelling And much more... Dormann & Kühn (2009): Angewandte Statistik für die biologischen Wissenschaften.

24 R geostatistical analyses variograms, Kriging etc. www.mathworks.de Hengl 2009

25 R Bayesian statistics Link to BUGS http://www.lanl.gov/bayesian/

26 R as programming language
> hi.there <- function() {
+ cat("Hello World!\n")
+ }
> hi.there()
Hello World!

27 R as programming language
> hi.there <- function() {
+ cat("Hello World!\n")
+ }
> hi.there()
Hello World!
Build your own functions to keep your code tidy
Build "new" functions (and write packages)
Dynamic models …
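As one possible example of a self-written function for a dynamic model, the sketch below iterates discrete logistic population growth. The function name, its arguments and the parameter values are all assumptions chosen for illustration.

```r
# Hypothetical function: discrete logistic growth
# N0 = initial population size, r = growth rate, K = carrying capacity
logistic_growth <- function(N0, r, K, steps) {
  N <- numeric(steps + 1)   # pre-allocate the result vector
  N[1] <- N0
  for (t in seq_len(steps)) {
    # growth slows down as N approaches K
    N[t + 1] <- N[t] + r * N[t] * (1 - N[t] / K)
  }
  N
}

pop <- logistic_growth(N0 = 10, r = 0.5, K = 100, steps = 50)
round(range(pop), 2)   # population starts at 10 and levels off near K
```

A call such as `plot(0:50, pop, type = "l")` would show the typical S-shaped curve.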

28 R extensions Integrate other source codes Batch processing Call from terminal
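Batch processing could be sketched with a small script that reads its arguments from the command line. The file name `simulate.R` and the meaning of its argument are hypothetical.

```r
# Contents of a hypothetical script file, e.g. simulate.R
args <- commandArgs(trailingOnly = TRUE)   # arguments given after the script name

# First argument: number of experiments (assumed default: 100)
n <- if (length(args) >= 1) as.integer(args[1]) else 100L

set.seed(1)                                # arbitrary seed, only for reproducibility
heads <- rbinom(n, 100, 0.5)               # n experiments of 100 coin tosses each
cat("Mean number of heads:", mean(heads), "\n")
```

From a terminal, such a script could be run with `Rscript simulate.R 10000`, or without arguments via `R CMD BATCH simulate.R`.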

29 R GUIs http://www.rcommander.com/ http://rstudio.org/

30 R Literature http://www.r-project.org/ – Manuals – „Contributed Documentation“

