I❤R
Kin Wong (Sam)

Game Plan
Intro R
Import SPSS file
Descriptive Statistics
Inferential Statistics
Graphs
Q&A

Intro R

R
Small, fast, and open source (Windows, Linux, and Mac).
Write your own package or improve existing packages.
Free packages for download (5000+): from forensics to finance, there is a package right for you.
Disadvantages: command driven, and debugging.

R

Exercise
print(): use print() to print your name.
? is your best friend: use ? for help.
    ?print
Calculate:
    888*888

Enter data: c()
Use c() to enter data into R.
Try: store 1, 2, 3, 4, and 5 in the data variable.
    data = c(1,2,3,4,5)
Type data to display your numbers.
    data

Import CSV in R
Store your file address in the dataset variable.
    dataset = "D:/accidents.csv"
Warning: write paths with "/" instead of "\" (a single backslash is an escape character in R strings).
Load the CSV file into the data variable:
    data = read.table(dataset, header=TRUE, sep=",")
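
A minimal sketch of the same import, assuming a small made-up file written to a temporary path in place of D:/accidents.csv. The column names id, severity, and injured are hypothetical. It uses read.csv(), a wrapper around read.table() whose defaults include header=TRUE and sep=",".

```r
# Write a tiny example file to a temporary location (stand-in for D:/accidents.csv)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,severity,injured",
             "1,minor,0",
             "2,major,3",
             "3,minor,1"), csv_path)

# read.csv() already assumes a header row and comma separators
accidents <- read.csv(csv_path)

str(accidents)    # check the column names and types that R detected
nrow(accidents)   # number of rows read
```

Checking str() right after an import is a quick way to spot columns that were read with the wrong type.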

Import SAV in R SAV = SPSS File

tcltk (Select a File with GUI)
library() loads the tcltk package into memory.
    library(tcltk)
R opens a file selection window:
    dataset <- tclvalue(tkgetOpenFile(filetypes="{{All files} *}"))
Check the dataset file location:
    dataset

tcltk (Successful)

Import SAV in R
Install the foreign package to import SPSS files.
    install.packages(c("foreign"), repos=" project.org" )
Load the foreign package.
    library(foreign)
No error message = the command is correct.

Import SAV in R
Copy & Paste:
    data=read.spss(dataset, use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
The read.spss() function imports the SPSS file.
dataset is your SPSS file location.
to.data.frame=TRUE imports the data as a data frame (a spreadsheet-like table).

Attach data
The attach() function mounts your data.
If you do not mount the data, you need to identify your variables with data$.
Try:
    attach(data)
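
A short sketch of what the data$ prefix looks like in practice, using a made-up data frame (survey, gender, and score are hypothetical names). It also shows with(), a base R alternative to attach() that is swapped in here for illustration; the slides themselves use attach().

```r
# Hypothetical stand-in for an imported data set
survey <- data.frame(gender = c("boy", "girl", "girl", "boy"),
                     score  = c(70, 85, 90, 65))

# Without attach(): name the data frame, then the variable
mean(survey$score)

# with() looks up variable names inside survey for this one expression only
with(survey, mean(score))
```

Both calls return the same value; with() simply avoids leaving the data mounted afterwards.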

Show all Variables
The ls() function lists all variable names.
Try:
    ls(data)

R Code (Load SPSS file)
    library(tcltk)
    dataset <- tclvalue(tkgetOpenFile(filetypes="{{All files} *}"))
    library(foreign)
    data=read.spss(dataset, use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
    attach(data)
    ls(data)

Descriptive Statistics
Fill the empty parentheses with your variable.

Frequency
Table: table( )
Total frequency: length( )
Missing: length(which(is.na( )))
Valid: length( )-length(which(is.na( )))
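
A small worked sketch of these counts on a made-up vector with two missing values (answers is a hypothetical name). sum(is.na()) is used as a shorter equivalent of length(which(is.na())).

```r
# Hypothetical responses; NA marks a missing answer
answers <- c("yes", "no", "yes", NA, "yes", NA)

table(answers)                    # frequency table; NA is left out by default
table(answers, useNA = "ifany")   # same table with a separate count for NA

n_total   <- length(answers)        # all entries, missing or not
n_missing <- sum(is.na(answers))    # same result as length(which(is.na(answers)))
n_valid   <- n_total - n_missing    # entries that are not missing
c(total = n_total, missing = n_missing, valid = n_valid)
```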

Percentile
Quartiles: quantile( )
Percentiles: quantile( , c(0,.50,1))
c() lets you request as many percentiles as you want, each given as a proportion from 0 to 1.
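
A brief sketch of quantile() on made-up numbers (x is a hypothetical variable). It assumes R's default interpolation method; the na.rm=TRUE argument is shown because quantile() stops with an error when the data contain NA and na.rm is left at FALSE.

```r
# Nine hypothetical scores, already sorted for readability
x <- c(2, 4, 4, 5, 7, 9, 10, 12, 15)

quantile(x)                    # quartiles: 0%, 25%, 50%, 75%, 100%
quantile(x, c(0.10, 0.90))     # 10th and 90th percentiles

# With missing values, ask quantile() to drop them first
x_with_na <- c(x, NA)
quantile(x_with_na, 0.5, na.rm = TRUE)
```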

Central Tendency
Mean: mean( )
Median: median( )
Mode: names(sort(-table( )))[1]
Sum: sum( )
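
Base R has no function that returns the statistical mode, which is why the slide builds it from table(). A minimal sketch on made-up data (x, freq, and mode_value are hypothetical names); this variant returns every value tied for the highest count, as character strings.

```r
x <- c(3, 5, 5, 2, 5, 3, 8)

mean(x)     # arithmetic mean
median(x)   # middle value after sorting
sum(x)      # total

# Mode: tabulate, then keep the value(s) with the highest count
freq <- table(x)
mode_value <- names(freq)[as.vector(freq) == max(freq)]
mode_value
```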

Dispersion
Range = Max - Min: range( )[2]-range( )[1]
Variance: var( )
Standard deviation: sd( )
Standard error: sd( )/sqrt(length( )-length(which(is.na( ))))
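
A sketch of the same dispersion measures on a made-up vector that contains a missing value (x is hypothetical). It assumes you want missing values dropped, so na.rm=TRUE is passed throughout; without it, range(), var(), and sd() return NA when the data contain NA.

```r
x <- c(4, 8, 6, NA, 2)

spread <- diff(range(x, na.rm = TRUE))   # max minus min
v      <- var(x, na.rm = TRUE)           # sample variance (divides by n - 1)
s      <- sd(x, na.rm = TRUE)            # sample standard deviation

n_valid <- sum(!is.na(x))                # number of non-missing values
se      <- s / sqrt(n_valid)             # standard error of the mean
c(range = spread, variance = v, sd = s, se = se)
```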

Distribution
Install the e1071 package, which provides the skewness and kurtosis functions.
    install.packages(c("e1071"), repos=" )
Load the e1071 package in order to use the skewness and kurtosis functions.
    library(e1071)

Distribution
Skewness: skewness( )
Kurtosis: kurtosis( )
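
To show what these two numbers measure, here is a base R sketch of the plain moment-based definitions (moment_skewness and excess_kurtosis are hypothetical helper names, not e1071 functions). The e1071 functions offer several variants through their type argument, so their output can differ slightly from this sketch on small samples.

```r
# Skewness: third central moment divided by the 1.5 power of the second
moment_skewness <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Excess kurtosis: fourth central moment over squared second moment, minus 3
excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

symmetric    <- c(1, 2, 3, 4, 5)
right_skewed <- c(1, 2, 2, 3, 3, 3, 4, 10)

moment_skewness(symmetric)      # zero for perfectly symmetric data
moment_skewness(right_skewed)   # positive: long tail to the right
excess_kurtosis(symmetric)      # negative: flatter than a normal curve
```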

Compare Mean
In tapply(), the first argument is the dependent variable and the second argument is the independent (grouping) variable.
Copy & Paste (Compare Mean):
    tapply( , , mean)
Note: you can change mean to other R functions.
Copy & Paste (Compare Range):
    tapply( , , range)
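
A minimal sketch of tapply() with both arguments filled in, using made-up data (score and gender are hypothetical variables).

```r
score  <- c(70, 85, 90, 65, 80, 75)
gender <- c("boy", "girl", "girl", "boy", "girl", "boy")

# Mean of score within each level of gender
group_means <- tapply(score, gender, mean)
group_means

# Swap in another function, e.g. the minimum and maximum per group
tapply(score, gender, range)
```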

Inferential Statistics

One-sample t-test
    t.test( , mu=0)
mu=0 means that the hypothesized population mean is 0. You can change 0 to your desired population mean.
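
A short sketch of a one-sample t-test on made-up measurements, testing against a hypothesized mean of 50 (x and res are hypothetical names). Storing the result lets you pull out individual pieces instead of reading them off the printout.

```r
x <- c(52, 48, 55, 50, 47, 53, 51, 49)

res <- t.test(x, mu = 50)
res              # full printout

res$estimate     # sample mean
res$statistic    # t value
res$parameter    # degrees of freedom (n - 1)
res$p.value      # two-sided p-value
res$conf.int     # 95% confidence interval for the mean
```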

Pair sample t-test t.test(, ,paired=T) is the first variable  is the second variable paired=T means that this is a pair sample t-test.

Independent-samples t-test
Install the car package to run Levene's test.
    install.packages(c("car"), repos=" project.org" )
Load the car package.
    library(car)

Independent-samples t-test
The first argument is the dependent variable; the second argument is the independent variable.
Levene's test:
    leveneTest( , , 'mean')
'mean' uses the original Levene's test.

Independent-samples t-test
Set values for the independent-samples t-test:
    Test1=  =='boy'
    Test2=  =='girl'
Test1 holds the independent variable's boy value.
Test2 holds the independent variable's girl value.
You can change boy/girl to your values.

Independent-samples t-test
Set groups (data is the data frame loaded earlier; dataset only holds the file path):
    Group1=data[Test1,]$
    Group2=data[Test2,]$
Run the independent-samples t-test with equal variances assumed:
    t.test(Group1,Group2,var.equal=T)
Run the independent-samples t-test with equal variances not assumed:
    t.test(Group1,Group2,var.equal=F)
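
A sketch of the whole independent-samples workflow on a made-up data frame (students, gender, and score are hypothetical names). It follows the same steps: split the dependent variable by group, then run the test with and without the equal-variance assumption. The last line shows the formula interface of t.test(), which does the split for you.

```r
students <- data.frame(
  gender = c("boy", "boy", "boy", "boy", "girl", "girl", "girl", "girl"),
  score  = c(70, 65, 75, 72, 85, 90, 80, 79)
)

# Split the dependent variable by the two values of the grouping variable
group1 <- students$score[students$gender == "boy"]
group2 <- students$score[students$gender == "girl"]

pooled <- t.test(group1, group2, var.equal = TRUE)    # equal variances assumed
welch  <- t.test(group1, group2, var.equal = FALSE)   # Welch test, adjusted df

pooled$parameter   # n1 + n2 - 2 degrees of freedom
welch$parameter    # never larger than the pooled df

# Same pooled test written as dependent ~ independent
by_formula <- t.test(score ~ gender, data = students, var.equal = TRUE)
```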

ANOVA
The dependent variable comes first (left of ~ in a formula); the independent variable comes second (right of ~).
Levene's test:
    leveneTest( , , 'mean')
ANOVA table (equal variance assumed):
    summary(aov( ~ ))

ANOVA
One-way table (equal variance not assumed):
    oneway.test( ~ )
Post-hoc test: Tukey
    posthoc( , , 'Tukey')
Post-hoc test: Games-Howell
    posthoc( , , 'Games-Howell')
Note: posthoc() is not a base R function; the slides do not say where it comes from.
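
A sketch of a one-way ANOVA on made-up data using only functions that ship with R (plants, yield, and group are hypothetical names). TukeyHSD() from the stats package is swapped in here for the Tukey post-hoc step; base R has no Games-Howell function, so that variant is not shown.

```r
plants <- data.frame(
  yield = c(20, 22, 19, 21,  25, 27, 26, 24,  30, 29, 31, 32),
  group = factor(rep(c("A", "B", "C"), each = 4))
)

# Classic ANOVA table (equal variances assumed)
fit <- aov(yield ~ group, data = plants)
summary(fit)

# Welch one-way test (equal variances not assumed)
oneway.test(yield ~ group, data = plants)

# Tukey post-hoc comparisons of every pair of groups
tuk <- TukeyHSD(fit)
tuk$group   # columns: diff, lwr, upr, p adj
```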

Correlation
Install the Hmisc package to generate a correlation table.
    install.packages(c("Hmisc"), repos=" project.org" )
Load the Hmisc package.
    library(Hmisc)

Correlation
Supply the two variables, y and x, as the first two arguments.
Correlation table:
    rcorr( , , type='pearson')
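
For a single pair of variables, base R can produce the same Pearson coefficient and its p-value without an extra package. A minimal sketch on made-up scores (math and science are hypothetical variables); cor.test() is swapped in here as an alternative to rcorr().

```r
math    <- c(50, 60, 65, 70, 80, 85)
science <- c(55, 58, 70, 68, 82, 88)

cor(math, science)   # Pearson correlation coefficient only

ct <- cor.test(math, science, method = "pearson")
ct$estimate    # same coefficient
ct$parameter   # degrees of freedom (n - 2)
ct$p.value     # test of correlation = 0
```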

Linear Regression
The variable left of ~ is the dependent variable; the variable right of ~ is the independent variable.
Linear regression:
    summary(lm( ~ ))
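
A sketch of a simple linear regression on made-up data (hours and score are hypothetical variables), showing how to read the fitted intercept and slope and how to predict a new value.

```r
hours <- c(1, 2, 3, 4, 5)
score <- c(52, 55, 55, 59, 60)

fit <- lm(score ~ hours)   # dependent ~ independent
summary(fit)               # coefficients, R-squared, p-values

coef(fit)                  # intercept and slope

# Predicted score for someone who studies 6 hours
predict(fit, newdata = data.frame(hours = 6))
```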

Crosstab
Install the gmodels package to generate a crosstab table.
    install.packages(c("gmodels"), repos=" project.org" )
Load the gmodels package.
    library(gmodels)

Crosstab
The first argument is the row variable; the second argument is the column variable.
Crosstab table:
    CrossTable( , , expected=TRUE, prop.chisq=TRUE)
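
The same kind of table can be sketched with base R alone, which helps show what the crosstab contains: counts, row percentages, and a chi-square test with expected counts. The data are made up (gender and urban are hypothetical variables), and table() plus chisq.test() are swapped in for CrossTable().

```r
gender <- rep(c("boy", "girl"), times = c(50, 50))
urban  <- c(rep(c("rural", "urban"), times = c(30, 20)),   # the 50 boys
            rep(c("rural", "urban"), times = c(15, 35)))   # the 50 girls

tab <- table(gender, urban)   # rows = gender, columns = urban
addmargins(tab)               # counts with row and column totals
prop.table(tab, 1)            # row proportions

res <- chisq.test(tab)        # chi-square test of independence
res$expected                  # expected counts under independence
res$p.value
```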

R Graphs

Game Plan
R Graphs, then ggplot2:
1) Bar Chart
2) Histogram
3) Boxplot
4) Scatter plot

without ggplot2

Bar Chart
Simple Bar Plot
Simple Horizontal Bar Plot
Stacked Bar Plot
Grouped Bar Plot

Bar Chart - Simple Bar Plot

Copy & Paste:
    counts <- table(gender)
    barplot(counts, main="Gender", xlab="Frequency", col=c("skyblue","pink"))
barplot() needs counts, so tabulate the variable with table() first.
main is the header (plot title).
xlab is the footer (x-axis label).
col lets you define the color for value 1, value 2, and so on.

Bar Chart - Simple Horizontal Bar Plot

Copy & Paste:
    counts <- table(gender)
    barplot(counts, main="Gender", xlab="Frequency", col=c("skyblue","pink"), horiz=TRUE)
When you add horiz=TRUE, your bar chart rotates so that the bars run horizontally.

Bar Chart - Stacked Bar Plot

Copy & Paste:
    counts <- table(gender, urban)
    barplot(counts, main="Gender & Geography", xlab="Frequency of Gender", col=c("skyblue","pink"), legend = rownames(counts))

Bar Chart - Grouped Bar Plot

Copy & Paste:
    counts <- table(gender, urban)
    barplot(counts, main="Gender & Geography", xlab="Number of Gender", col=c("skyblue","pink"), legend = rownames(counts), beside=TRUE)

Histogram

Copy & Paste:
    hist(achmat10, col="red", xlab="Math Achievement Score", main="Math Achievement Score 2010", breaks=9)
breaks tells R roughly how many bars to produce; R treats the number as a suggestion.

Histogram w/ Normal Curve

Copy & Paste:
    x <- achmat10
    h <- hist(x, breaks=50, col="red", xlab="Math Achievement Score", main="Math Achievement Score 2010")
    xfit <- seq(min(x), max(x), length=40)
    yfit <- dnorm(xfit, mean=mean(x), sd=sd(x))
    yfit <- yfit*diff(h$mids[1:2])*length(x)
    lines(xfit, yfit, col="blue", lwd=2)
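
The key step in this code is the rescaling of yfit: dnorm() returns a density whose area is 1, while the histogram shows counts, so the curve is multiplied by the bin width and the number of observations. A sketch on simulated data, since achmat10 is not available here (x is generated with rnorm() purely for illustration):

```r
set.seed(1)
x <- rnorm(200, mean = 50, sd = 10)   # simulated stand-in for achmat10

h <- hist(x, breaks = 20, col = "red",
          xlab = "Score", main = "Histogram with normal curve")

bin_width <- diff(h$mids[1:2])        # distance between two bar midpoints

# Normal density evaluated across the data range, then rescaled to counts
xfit <- seq(min(x), max(x), length.out = 100)
yfit <- dnorm(xfit, mean = mean(x), sd = sd(x)) * bin_width * length(x)

lines(xfit, yfit, col = "blue", lwd = 2)
```

Without the rescaling, the curve would sit almost flat along the bottom of the plot because density values are far smaller than counts.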

Boxplot

Copy & Paste:
    boxplot(achmat10, main="Math Achievement Score", ylab="Math Score")

Multi-Boxplot

Copy & Paste:
    boxplot(achmat10~gender, main="Math Score & Gender", ylab="Math Score", xlab="Gender", col=c("skyblue","pink"))
achmat10 is the dependent variable.
gender is the independent variable.

Scatter plot

Copy and Paste:
    plot(achmat10, achsci12, main="Math & Science Scatterplot", xlab="Math Score", ylab="Science Score", pch=1)

Scatter plot w/ Regression line

Copy and Paste:
    abline(lm(achsci12~achmat10), col="red")
Adds a regression line to the plot. The formula is y ~ x, so the variable on the vertical axis (achsci12) comes first.
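
A self-contained sketch of a scatter plot with a fitted line, on made-up scores (math and science are hypothetical stand-ins for the achievement variables). The point to check is that the formula given to lm() matches the axes: the variable plotted on the vertical axis goes left of ~.

```r
math    <- c(50, 60, 65, 70, 80, 85)
science <- c(55, 58, 70, 68, 82, 88)

# x = math, y = science
plot(math, science, main = "Math & Science Scatterplot",
     xlab = "Math Score", ylab = "Science Score", pch = 1)

# Regress the y-axis variable on the x-axis variable, then draw the line
fit <- lm(science ~ math)
abline(fit, col = "red")

coef(fit)   # intercept and slope of the drawn line
```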

ggplot2 Quick & High Quality Graphs

ggplot2
qplot(): quick development of high-quality graphs; limited room for refinement.
ggplot(): slower graph development (more lines of code); very elegant.

Import ggplot2 in R
Install the ggplot2 package.
    install.packages(c("ggplot2"), repos=" )
Load the ggplot2 package into memory.
    library(ggplot2)

Bar Chart

Copy and Paste:
    qplot(factor(gender), geom="bar", fill=gender, xlab="Gender", ylab="Frequency", main="Gender")

Histogram

Copy and Paste:
    a=qplot(achmat10, xlab="Math Score", ylab="Frequency", main="Math Achievement Score 2010", binwidth = 1)
    a+geom_histogram(colour = "black", fill = "red", binwidth = 1)

Boxplot

Copy and Paste:
    a=qplot(factor(gender), achmat10, geom = "boxplot", ylab="Math Score", xlab="Gender", main="Math Achievement Score 2010")
    a + geom_boxplot(aes(fill = factor(gender)))

Scatter plot

Copy and Paste:
    a=qplot(achmat10, achsci10)
    a+geom_smooth(method=lm, se=FALSE)

Scatter plot

Copy and Paste:
    a=qplot(achmat10, achsci10, color=gender)
    a+geom_smooth(method=lm, se=FALSE)

Source
R Graphs: statmethods.net
ggplot2: Cookbook for R

Question & Answer
Kin Wong (Sam)