Presentation on theme: "Introduction to R Tara Jensen National Center for Atmospheric Research Boulder, Colorado USA"— Presentation transcript:
Introduction to R Tara Jensen National Center for Atmospheric Research Boulder, Colorado USA email@example.com
R Exercises: Find sample data and R scripts at ftp://ftp.ncmrwf.gov.in/pub/outgoing/raghu/6WVMW/Tutorial/Day1/R-tutorial. Download them to a directory on your computer. Start R. Open intro2R.2014wmo.R.
What is R? A statistical programming and graphics language. Developed in part from the S programming language from Bell Labs (John Chambers). Created to: allow rapid development of methods for use on different types of data; require only small amounts of system resources.
Why R? R is arguably the dominant language in the statistical research community. R is open source and free. Runs on most operating systems. Nearly 2,400 contributed packages. Packages and applications in nearly every field of science, business and economics. See R News, the R Journal and the Journal of Statistical Software (www.jstatsoft.org). More than 100 books with accompanying code. Very large, active user base. Many default parameters are chosen, but users retain complete control.
Why not R? NCL, IDL, Matlab, SAS, … are all viable alternatives to R. If you are part of an active community of researchers using another language, do likewise. R may be limited by memory; for verification of large gridded datasets, consider using Model Evaluation Tools (MET). R does not produce a compiled executable, so it may not be desirable to some operational centers.
The R Community: Developers: R Core Group (20 members); only 2 have left since 1997. Major updates in April/October (freeze dates, beta versions, bug tracking, ...). Mailing lists: help list with ~150 messages/day, archived and searchable (http://www.r-project.org/mail.html). 5 international conferences, 2 US, 1 China.
Everything about R is at www.r-project.org: Source code. Binary compilations (Windows, Mac OS, Linux). Documentation (main documents, plus numerous contributed ones, some in foreign languages). Newsletter (replaced by the R Journal). Mailing lists (several search engines). Packages on every topic imaginable. Wiki with examples. Reference list of books using R (more than 100). Task Manager.
Use R with scripts: In Linux, Emacs Speaks Statistics provides syntax-based object name completion, keystroke shortcuts and command history; use Alt-x R to invoke R with Xemacs. In Windows, use the editor, which has added GUI features: Ctrl-R sends a line or highlighted section into R; install packages through the GUI; save graphics by point and click. Mac OS: similar to Windows, with the advantage of system calls.
R Coding principles: Make verification code transparent and easy to read. Comment and document liberally. Archive your code. Share your code. Label and save your data. Share your data.
Packages in R: Contributed by people worldwide. Allow scientists and statisticians to push their ideas. Apply and extend R capabilities to meet the needs of specific communities. Accompany many statistical textbooks. Accompany applied articles (Adrian Raftery, Doug Nychka, Tilman Gneiting, Barbara Casati, Matt Briggs).
R Packages: Each step can be done from the Packages menu (Windows or Mac GUI) or with a function call (Linux, or any R prompt). A mirror must be selected: Packages -> Set CRAN mirror, or chooseCRANmirror(). Packages must be installed before they can be called: Packages -> Install Package(s), or install.packages(c("package1", "package2", "package3")). Packages must be loaded (called into use): Packages -> Load Package(s), or library("package1"), library("package2"), etc. Base packages are installed by default. To see what packages are installed: Packages -> Load Package(s), or installed.packages() (use installed.packages(.Library, priority = "base") to list only the base packages). To remove packages: remove.packages(c("package1", "package2"), lib = file.path("path to library")).
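The install-then-load sequence above can be sketched as a small helper. This is only an illustration: the function name load_or_install is hypothetical, the mirror URL is an assumption (any CRAN mirror would do), and the example deliberately uses packages that ship with R so that nothing is downloaded.

```r
# Hypothetical helper (not part of R): load a package, installing it first
# only if it is not already available.
load_or_install <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)   # needs an internet connection
  }
  # character.only = TRUE lets library() take the name from a string variable
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# 'stats' and 'utils' ship with R, so no download is triggered here.
for (p in c("stats", "utils")) load_or_install(p)

# List the installed packages whose priority is "base".
base_pkgs <- rownames(installed.packages(priority = "base"))
print(base_pkgs)
```

For a contributed package, the same call (for example with "verification") would attempt a download from the chosen mirror on first use.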
A sample of useful packages: verification; fields (spatial stats); RadioSonde; extRemes; BMA (Bayesian Model Averaging); ensembleBMA; circular; RSQLite; SpatialVx; Rgis, spatstat (GIS); ncdf (support for netCDF files); rgdal (support for grib1 files); rNOMADS (support for grib2 files archived by NCEP); RColorBrewer; randomForest.
Very useful functions in R: q() exits R; you will then be asked whether you would like to save your workspace. ls() shows the objects in your workspace. rm() removes an object. system() calls a system command from R. help(package or function) brings up a help page; ?(package or function) does the same. read.fwf reads fixed-width format data. read.table reads a text file with delimiters.
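A minimal sketch of a few of these functions in use. The file contents, column names (station, fcst, obs) and numbers are made up for illustration and are not taken from the tutorial data.

```r
# Write a small whitespace-delimited text file to a temporary location.
tmp <- tempfile(fileext = ".txt")
writeLines(c("station fcst obs",
             "A 12.1 11.8",
             "B 15.3 16.0",
             "C  9.7 10.2"), tmp)

# read.table: header = TRUE takes the column names from the first line.
dat <- read.table(tmp, header = TRUE)
print(dat)

# read.fwf: the same records without a header, read by column width
# (1 character for the station, then two 5-character numeric fields).
fw <- tempfile(fileext = ".txt")
writeLines(c("A 12.1 11.8",
             "B 15.3 16.0",
             "C  9.7 10.2"), fw)
dat2 <- read.fwf(fw, widths = c(1, 5, 5),
                 col.names = c("station", "fcst", "obs"))
print(dat2)

# ls() and rm(): create an object, list the workspace, then remove the object.
err <- dat$fcst - dat$obs
print(ls())
rm(err)
print("err" %in% ls())

# help(read.table)   # or ?read.table; opens the help page in an interactive session
```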
More useful functions: aggregate applies a function to groups of data subset by categories. apply applies functions across dimensions of arrays and is an efficient way to avoid explicit loops. %in% returns a logical vector showing which elements of A are in B (e.g., A %in% B). table creates contingency table counts. boot applies a bootstrap function correctly. par controls everything in a graph. pairs, the most underutilized plot, draws a matrix of scatterplots (e.g., 4 columns in a 4x4 plot layout). xyplot (in the lattice package) offers slightly more advanced graphics techniques.
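As a rough illustration of aggregate, apply, %in% and table, the sketch below works on a made-up set of forecast/observation pairs. The data frame, its column names and the event threshold (value > 0) are all assumptions chosen for the example.

```r
# Made-up forecast/observation pairs, purely for illustration.
verif <- data.frame(
  station = c("A", "A", "B", "B", "C", "C"),
  fcst    = c(1.0, 0.0, 3.5, 2.0, 0.0, 4.0),
  obs     = c(0.5, 0.0, 2.5, 0.0, 1.0, 5.0)
)

# aggregate: mean error (forecast minus observation) for each station.
verif$err <- verif$fcst - verif$obs
mean_err <- aggregate(err ~ station, data = verif, FUN = mean)
print(mean_err)

# apply: column means of a numeric matrix without writing a loop
# (MARGIN = 2 means "work column by column").
m <- as.matrix(verif[, c("fcst", "obs")])
col_means <- apply(m, 2, mean)
print(col_means)

# %in%: logical vector marking the rows whose station is in a chosen subset.
keep <- verif$station %in% c("A", "C")
print(keep)

# table: 2x2 contingency table of forecast events against observed events.
ct <- table(forecast = verif$fcst > 0, observed = verif$obs > 0)
print(ct)
```

With these numbers the contingency table holds 3 hits, 1 false alarm, 1 miss and 1 correct negative, which is the starting point for most categorical verification scores.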
R Exercises: Find sample data and R scripts at ftp://ftp.ncmrwf.gov.in/pub/outgoing/raghu/6WVMW/Tutorial/Day1/R-tutorial. Download them to a directory on your computer. Start R: on Windows or Mac, click on the R icon on your desktop; on Linux, type R at the command line. Open intro2R.2014wmo.R: on Windows or Mac, select File -> Open Script -> intro2R.2014wmo.R; on Linux, open it in another window using your favorite editor.