

R graphics
- R has several graphics packages
- The plotting functions are quick and easy to use
- We will cover:
  - Bar charts (frequency, proportion)
  - Pie charts
  - Histograms
  - Box plots
  - Scatter plots
- Explore further on your own: R help, demo(graphics)

Bar charts
- A bar chart draws one bar per category, with a height proportional to the count in the table
- The height can show the frequency or the proportion; the graph looks the same, but the y-axis scale differs
- Use scan() to read in the data from a file or from the keyboard
- Try ?scan for more information
- Usage is simple: type in the data; scan() stops reading when you enter a blank line
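For a script that cannot wait for keyboard input, scan() can also read from a character string through its text argument. The sketch below assumes a short, made-up vector of coded responses; feed_demo and its values are illustrative only and are not the survey data from the slides.

```r
# Hypothetical coded responses (1 = grass, 2 = shrubs, 3 = trees, 4 = fruit).
# These values are invented for illustration; they are NOT the slide data.
feed_demo <- scan(text = "3 4 1 1 3 4 3 3 1 3 2 1", quiet = TRUE)

length(feed_demo)   # number of items read
table(feed_demo)    # counts per category
```

Interactive use is the same call without text: type the values, then press Enter on an empty line to finish.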

Bar charts
Example:
- Suppose a group of 25 animals is surveyed for feeding preference. The categories are (1) grass, (2) shrubs, (3) trees and (4) fruit. The raw data is [values not preserved in the transcript]
- Let's make a barplot of both frequencies and proportions…

Bar chart - frequency
Example: Feeding preference
> feed = scan()
1: [data values not preserved]
Read 25 items
> barplot(table(feed))
[Figure: bar chart; y-axis: Frequency]
Note: barplot(feed) is not correct. Use table() to summarize the data into counts, then pass the result to barplot() to draw the bar chart of frequencies.
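The difference between barplot(feed) and barplot(table(feed)) is easier to see on a small vector. This is a minimal sketch with invented values (feed_demo is hypothetical, not the survey data): table() collapses the raw codes to one count per category, which is what barplot() should receive.

```r
# Hypothetical coded responses; invented for illustration only.
feed_demo <- c(3, 4, 1, 1, 3, 4, 3, 3, 1, 3, 2, 1)

counts <- table(feed_demo)   # one count per category
counts

# One bar per category, height = frequency
barplot(counts, xlab = "Feeding preference (coded)", ylab = "Frequency")

# barplot(feed_demo) would instead draw one bar per animal, with the
# category code as its height, which is not a frequency chart.
```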

Bar chart - proportion
Example cont…
> barplot(table(feed)/length(feed)) # divide by n for proportion
> table(feed)/length(feed)
feed
[output values not preserved]
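As a cross-check on dividing by n, base R's prop.table() gives the same proportions when the data has no missing values. The values below are again made up for illustration.

```r
# Hypothetical coded responses; invented for illustration only.
feed_demo <- c(3, 4, 1, 1, 3, 4, 3, 3, 1, 3, 2, 1)

props_manual <- table(feed_demo) / length(feed_demo)  # the slide's approach
props_helper <- prop.table(table(feed_demo))          # equivalent helper

props_helper
barplot(props_helper, xlab = "Feeding preference (coded)", ylab = "Proportion")
```

The bars have the same relative heights as the frequency chart; only the y-axis scale changes.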

Pie charts
- The same data can be studied with pie charts, using the pie() function
- The following simple examples illustrate its usage, which is similar to barplot() but with some added features
- We use names() to assign names to the categories
- We add colour to the pie chart by setting the col argument
- The help page (?pie) gives some examples of getting different colours automatically

Pie charts
> feed.counts = table(feed) # store the table result
> pie(feed.counts) # first pie -- kind of dull
> names(feed.counts) = c("grass","shrubs","trees","fruit") # give names
> pie(feed.counts) # prints out names
> pie(feed.counts,col=c("purple","green2","cyan","white")) # with colour
[Figures: boring pie, named pie, coloured pie]
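One way to get colours automatically is to generate a palette with as many entries as there are slices. The sketch below uses rainbow() from base R and invented counts; feed_counts is a hypothetical stand-in for the slide's feed.counts.

```r
# Hypothetical coded responses; invented for illustration only.
feed_demo <- c(3, 4, 1, 1, 3, 4, 3, 3, 1, 3, 2, 1)

feed_counts <- table(feed_demo)
names(feed_counts) <- c("grass", "shrubs", "trees", "fruit")  # label the slices

# One colour per slice, generated rather than typed by hand
pie(feed_counts,
    col  = rainbow(length(feed_counts)),
    main = "Feeding preference (illustrative data)")
```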

Histograms
- A histogram is similar to a bar chart, but the bars touch
- The height can show frequencies or proportions (densities)
- In the latter case the areas sum to 1, a property you should be familiar with, since you've already studied probability distributions
- In either case the area is proportional to probability

Histograms
- To draw a histogram, use the hist() function
- A nice addition to the histogram is to mark the individual data points using the rug() command
- As you will see in the next example, it draws tick marks just above the x-axis. If the data is discrete and has ties, rug(jitter(x)) adds a little jitter to the x values to separate the ties

Histograms
Example: Suppose a lecturer recorded the number of hours that 15 students spent studying for their exams during one week
Enter the data:
> a=scan()
1: [data values not preserved]
Read 15 items

Histograms
Draw a histogram:
> hist(a) # frequencies
> hist(a,probability=TRUE) # proportions (or probabilities)
> rug(jitter(a)) # add tick marks
NULL
[Figures: histogram of frequencies (default); preferred histogram of proportions (total area = 1). Note the different y-axis]
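The "total area = 1" property can be checked numerically from the object that hist() returns. This is a sketch on made-up study hours (the hours vector is hypothetical, since the original data is not preserved): each bar's area is its density times its bin width.

```r
# Hypothetical study hours for 15 students; invented for illustration only.
hours <- c(2, 4, 4, 5, 6, 6, 6, 7, 8, 8, 9, 10, 12, 12, 15)

h <- hist(hours, probability = TRUE,
          main = "Study hours (illustrative data)", xlab = "Hours")
rug(jitter(hours))   # tick marks above the x-axis, jittered to separate ties

# Area of each bar = density * bin width; the areas should sum to 1
sum(h$density * diff(h$breaks))
```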

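Histograms
- The basic histogram has a predefined set of break points for the bins
- You can, however, specify the number of breaks or the break points themselves
- Use: hist(a,breaks=3) or hist(a,3)
- A single number is only a suggestion; R rounds the break points to convenient values
- Try it….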
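To see the difference between suggesting a number of breaks and fixing the break points, compare the breaks component of the two results. The data and the chosen break points below are assumptions for illustration.

```r
# Hypothetical study hours; invented for illustration only.
hours <- c(2, 4, 4, 5, 6, 6, 6, 7, 8, 8, 9, 10, 12, 12, 15)

# A single number: R treats it as a suggestion and picks "pretty" break points
h_suggest <- hist(hours, breaks = 3, plot = FALSE)
h_suggest$breaks

# A vector: these exact break points are used (they must span the data)
h_exact <- hist(hours, breaks = c(0, 5, 10, 15), plot = FALSE)
h_exact$breaks
h_exact$counts
```

plot = FALSE returns the bin information without drawing; drop it to see the histograms.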

Boxplots
- The boxplot summarizes data succinctly, quickly showing whether the data is symmetric or has suspected outliers
- Typical boxplot:
[Figure: boxplot labelled with lower extreme, lower hinge/quartile, median, upper hinge/quartile, upper extreme, and whiskers]

Boxplots
- To showcase possible outliers, a convention is adopted to limit the whiskers to a length of 1.5 times the box length; any values beyond that are plotted as individual points
[Figure: boxplot labelled with min, max, and outliers]
- Thus, the boxplot lets us check quickly for asymmetry (the shape looks unbalanced) and outliers (data points beyond the whiskers)
- In the example we see a skewed distribution with a long tail
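The numbers behind a boxplot can be inspected with boxplot.stats() from base R. The sketch below uses a small invented vector w with one deliberately extreme value, so the 1.5 times box length rule has something to flag.

```r
# Hypothetical data with one deliberately extreme value (30)
w <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)

s <- boxplot.stats(w, coef = 1.5)  # coef = 1.5 is the default whisker rule
s$stats  # lower whisker end, lower hinge, median, upper hinge, upper whisker end
s$out    # values beyond the whiskers, drawn as individual points

boxplot(w, horizontal = TRUE, main = "Boxplot with one outlier (illustrative)")
```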

Boxplots
- To draw boxplots, use the boxplot() function
- As sample data, let's get R to produce random numbers with a normal distribution:
> z = rnorm(100) # generate random numbers
> z # list numbers in z
- Because the numbers are generated at random, different numbers are produced each time you execute this command
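If you want the same random numbers on every run, for example to reproduce a plot, you can fix the random seed first with set.seed(). The seed value 42 below is arbitrary.

```r
set.seed(42)        # arbitrary seed, chosen only for illustration
z1 <- rnorm(100)

set.seed(42)        # resetting the seed replays the same sequence
z2 <- rnorm(100)

identical(z1, z2)   # the two draws match exactly
```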

Boxplots
- Now draw a boxplot of the dataset (z, in this case)….
- Use the boxplot command, in conjunction with various arguments
- You must give the dataset name, but you can also label and orient the plot
- The notch argument is useful for putting a notch on the boxplot at the median
> boxplot(z,main="Horizontal z boxplot",horizontal=TRUE)
> boxplot(z,main="Vertical z boxplot") # vertical is the default; boxplot() has no vertical argument
> boxplot(z,notch=TRUE)
- What do you get when you try it?
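A self-contained version of these calls, with a fixed seed so the picture is repeatable, might look like the following. The seed and the titles are illustrative choices.

```r
set.seed(1)                 # arbitrary seed so the example is repeatable
z <- rnorm(100)

boxplot(z, main = "Horizontal z boxplot", horizontal = TRUE)
boxplot(z, main = "Vertical z boxplot")            # vertical is the default
boxplot(z, main = "Notched z boxplot", notch = TRUE)

# The same summary without drawing anything
b <- boxplot(z, plot = FALSE)
b$stats   # whisker ends, hinges and median
b$out     # any points beyond the whiskers
```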

Boxplots
A side-by-side boxplot to compare two treatments
Data: experimental and control
> x = c(5, 5, 5, 13, 7, 11, 11, 9, 8, 9)
> y = c(11, 8, 4, 5, 9, 5, 10, 5, 4, 10)
> boxplot(x,y)
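By default the two boxes are labelled 1 and 2. The names argument of boxplot() labels them instead. The sketch below assumes, from the order on the slide, that x is the experimental group and y the control group.

```r
x <- c(5, 5, 5, 13, 7, 11, 11, 9, 8, 9)   # assumed: experimental group
y <- c(11, 8, 4, 5, 9, 5, 10, 5, 4, 10)   # assumed: control group

boxplot(x, y,
        names = c("experimental", "control"),
        main  = "Two treatments side by side")

# Five-number summaries, one column per group
boxplot(x, y, plot = FALSE)$stats
```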

Plotting
- The functions plot(), points(), lines(), text(), mtext(), axis(), identify(), legend() etc. form a suite that plots points, lines, and text, gives fine control over axis ticks and labels, and adds a legend as specified
- Change the default parameter settings
  - permanently (until changed again), using the par() function
  - only for the duration of the function call, e.g.,
> plot(x, y, pch="+") # produces scatterplot using a + sign
- Time is limited, but you should be aware of the power of R, and explore these options further
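The two ways of changing a setting can be compared directly. par() returns the previous values of whatever it changes, so a common pattern is to save them and restore them afterwards. The vectors are the x and y from the boxplot example; the colour is an arbitrary choice.

```r
x <- c(5, 5, 5, 13, 7, 11, 11, 9, 8, 9)
y <- c(11, 8, 4, 5, 9, 5, 10, 5, 4, 10)

# Changed through par(): affects every later plot on this device
old_par <- par(pch = "+", col = "blue")   # returns the previous settings
plot(x, y)
par(old_par)                              # restore the earlier defaults

# Changed in the call: affects this plot only
plot(x, y, pch = "+")
```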

Scatter plots
- The plot() function draws a scatter plot
- Additional descriptions of the plot can be included
- Using the data from the previous example, draw some scatter plots….
> plot(x)
> plot(x,y)
> plot(y,x) # swap the axes
> plot(x,pch=c(2,4)) # plotting character
> plot(x,col=c(2,4)) # adds colour
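"Additional descriptions" usually means a title and axis labels, set with the main, xlab and ylab arguments. In this sketch the label text is an assumption (x taken as the experimental group and y as the control group).

```r
x <- c(5, 5, 5, 13, 7, 11, 11, 9, 8, 9)
y <- c(11, 8, 4, 5, 9, 5, 10, 5, 4, 10)

plot(x, y,
     main = "Control against experimental (illustrative labels)",
     xlab = "experimental",   # assumed meaning of x
     ylab = "control",        # assumed meaning of y
     pch  = 16,               # filled circle
     col  = "blue")
```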

Linear regression
- Linear regression is the name of a procedure that fits a straight line to the data
- Remember the equation of the line: y = b_0 + b_1 x
- After plot(x, y) has drawn the points, abline(lm(y ~ x)) finds the values of b_0 and b_1 and adds the fitted line to the graph
- The lm() function fits a linear model
- The funny syntax y ~ x tells R to model the y variable as a linear function of x
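Put together, the steps could look like the sketch below, which reuses the x and y vectors from the boxplot example. Storing the fit in a variable (fit is just an illustrative name) makes the estimated b_0 and b_1 available through coef().

```r
x <- c(5, 5, 5, 13, 7, 11, 11, 9, 8, 9)
y <- c(11, 8, 4, 5, 9, 5, 10, 5, 4, 10)

plot(x, y)            # draw the points first
fit <- lm(y ~ x)      # fit y = b_0 + b_1 x by least squares
abline(fit)           # add the fitted line to the existing plot

coef(fit)             # "(Intercept)" is b_0, "x" is b_1
```

abline() only adds to an existing plot, so it has to come after plot().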