Data analysis: Explore GAP Toolkit 5 Training in basic drug abuse data management and analysis Training session 9.



Objectives
To define a standard set of descriptive statistics used to analyse continuous variables
To examine the Explore facility in SPSS
To introduce the analysis of a continuous variable according to values of a categorical variable, an example of bivariate analysis
To introduce further SPSS Help options
To reinforce the use of SPSS syntax

SPSS Descriptive Statistics
Analyse/Descriptive Statistics/Frequencies
Analyse/Descriptive Statistics/Explore
Analyse/Descriptive Statistics/Descriptives

Exercise: continuous variable
Generate a set of standard summary statistics for the continuous variable Age

Explore: Age

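Explore: Descriptive Statistics
Descriptives: AGE

Statistic                                       Value
Mean                                            [missing]
95% Confidence Interval for Mean, Lower Bound   31.16
95% Confidence Interval for Mean, Upper Bound   [missing]
5% Trimmed Mean                                 31.31
Median                                          31.00
Variance                                        [missing]
Std. Deviation                                  [missing]
Minimum                                         1
Maximum                                         77
Range                                           76
Interquartile Range                             20.00
Skewness                                        [missing]
Kurtosis                                        [missing]

Std. Error values (Mean, Skewness, Kurtosis): [missing]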

Exercise: Help
What’s This?
Results Coach
Case Studies

Measures of central tendency
Most commonly:
–Mode
–Median
–Mean
5 per cent trimmed mean

The mode
The mode is the most frequently occurring value in a dataset
Suitable for nominal data and above
Example:
–The mode of the variable "first most frequently used drug" is Alcohol, with 717 cases, approximately 46 per cent of valid responses
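As a rough illustration of finding a mode and its share of valid responses, the sketch below counts categories in a small list. The responses are hypothetical and are not the survey data from the example; `None` stands in for a missing answer.

```python
from collections import Counter

# Hypothetical responses for "first most frequently used drug" (not the survey data).
responses = ["Alcohol", "Cannabis", "Alcohol", "Cocaine", "Alcohol", "Cannabis", None]

# Drop missing values so the percentage is of valid responses only.
valid = [r for r in responses if r is not None]

counts = Counter(valid)
mode_value, mode_count = counts.most_common(1)[0]  # most frequent category and its count
valid_percent = 100 * mode_count / len(valid)

print(mode_value, mode_count, round(valid_percent, 1))
```

Because the mode only requires counting, it works for nominal categories such as drug names, where a median or mean would be meaningless.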

Bimodal
Describes a distribution in which two categories have a large number of cases
Example:
–The distribution of Employment is bimodal: employment and unemployment have a similar number of cases, and more cases than the other categories

The median
The median is the middle value when the data are ordered from low to high
Half the data values lie below the median and half above
The data have to be ordered, so the median is not suitable for nominal data, but it is suitable for ordinal levels of measurement and above

Example: median
Seizures of opium in Germany, 1994-1998 (kilograms)
Source: United Nations (2000). World Drug Report 2000 (United Nations publication, Sales No. GV.E ).
Table columns: Year, Seizure

Sort the seizure data in ascending order
The middle value is the median; the median annual seizure of opium in Germany between 1994 and 1998 was 42 kilograms
Ranked table columns: Year, Seizure
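The sort-then-take-the-middle procedure can be sketched as follows. The five values are hypothetical stand-ins, not the seizure figures from the World Drug Report table, so the result differs from the 42 kilograms quoted above.

```python
import statistics

# Hypothetical annual seizures in kilograms (not the figures from the source table).
seizures = [12, 95, 7, 30, 18]

ranked = sorted(seizures)           # ascending order
middle = ranked[len(ranked) // 2]   # middle value; valid as written for an odd number of cases

# Cross-check against the standard library.
assert middle == statistics.median(seizures)
print(ranked, middle)
```

With an even number of cases there is no single middle value; `statistics.median` then returns the average of the two central values.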

The mean
Add the values in the data set and divide by the number of values
The mean is only truly applicable to interval and ratio data, as it involves adding the values
It is sometimes applied to ordinal data, or to ordinal scales constructed from a number of Likert scales, but this requires the assumption that the difference between adjacent values in the scale is the same, e.g. the difference between 1 and 2 is the same as between 5 and 6

Example: mean
Seizures of opium in Germany, 1994-1998
Sample size = 5
Mean = sum of the five annual seizures / 5 = 84.8
Table columns: Year, Seizure
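The same arithmetic, sketched on hypothetical values rather than the figures from the source table, also shows how a single large value pulls the mean above the median.

```python
import statistics

# Hypothetical annual seizures in kilograms (not the figures from the source table).
seizures = [12, 95, 7, 30, 18]

total = sum(seizures)        # add the values
n = len(seizures)            # number of values (sample size)
mean = total / n             # divide the sum by the sample size

median = statistics.median(seizures)
print(total, n, mean, median)
```

Here the one large value (95) drags the mean well above the median, the same pattern as in the slide's example, where the mean (84.8) is roughly double the median (42).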

The 5 per cent trimmed mean
The 5 per cent trimmed mean is the mean calculated on the data set with the top 5 per cent and bottom 5 per cent of values removed
It is an estimator that is more resistant to outliers than the mean
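A minimal sketch of a trimmed mean follows. It assumes a simple rule: drop a whole number of cases, floor(5 per cent of n), from each end. Statistical packages may handle fractional cases differently, so results need not match SPSS exactly. The function name `trimmed_mean` and the ages are hypothetical.

```python
import math

def trimmed_mean(values, proportion=0.05):
    """Mean after dropping floor(proportion * n) values from each end (simplified rule)."""
    ranked = sorted(values)
    k = math.floor(proportion * len(ranked))   # cases to drop at each end
    kept = ranked[k:len(ranked) - k]
    return sum(kept) / len(kept)

# Hypothetical ages: 19 plausible values plus one implausible outlier (99).
ages = [21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
        31, 32, 33, 34, 35, 36, 37, 38, 39, 99]

ordinary_mean = sum(ages) / len(ages)
print(ordinary_mean, trimmed_mean(ages))
```

With 20 cases, one value is removed from each end, so the outlier no longer affects the result and the trimmed mean sits closer to the bulk of the data than the ordinary mean does.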

95 per cent confidence interval for the mean
An indication of the expected error (precision) when estimating the population mean with the sample mean
In repeated sampling, the interval calculated around the sample mean will contain the population mean about 95 times out of 100
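One way to sketch the calculation is mean ± z × standard error, using the normal approximation. This is an assumption made to keep the example within the Python standard library; statistical packages typically use the t distribution instead, which gives a slightly wider interval for small samples. The function name and the ages are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(values, level=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(values)
    mean = statistics.mean(values)
    std_error = statistics.stdev(values) / math.sqrt(n)    # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95 per cent
    return mean - z * std_error, mean + z * std_error

# Hypothetical ages, not the training dataset.
ages = [19, 23, 25, 28, 31, 31, 34, 38, 42, 47]
lower, upper = mean_confidence_interval(ages)
print(f"mean = {statistics.mean(ages):.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The interval is symmetric around the sample mean and narrows as the sample size grows, because the standard error shrinks with the square root of n.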

Measures of dispersion
The range
The inter-quartile range
The variance
The standard deviation

The range
A measure of the spread of the data
Range = maximum – minimum

Quartiles
1st quartile: 25 per cent of the values lie below the value of the 1st quartile and 75 per cent above
2nd quartile: the median: 50 per cent of the values lie below and 50 per cent above
3rd quartile: 75 per cent of the values lie below and 25 per cent above

Inter-quartile range
IQR = 3rd quartile – 1st quartile
The inter-quartile range measures the spread or range of the middle 50 per cent of the data
Requires ordinal level of measurement or above
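The range, the quartiles and the inter-quartile range can be sketched together as below. The ages are hypothetical, and `statistics.quantiles` is used with its default interpolation method; software packages use several slightly different quartile definitions, so values may differ a little from SPSS output on the same data.

```python
import statistics

# Hypothetical ages with no missing values (not the training dataset).
ages = [18, 22, 25, 31, 38, 44, 60]

data_range = max(ages) - min(ages)             # range = maximum - minimum
q1, q2, q3 = statistics.quantiles(ages, n=4)   # three cut points: 1st, 2nd, 3rd quartile
iqr = q3 - q1                                  # spread of the middle 50 per cent

print(data_range, q1, q2, q3, iqr)
```

The 2nd quartile equals the median, and the inter-quartile range is less affected by the extreme value 60 than the range is.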

Variance
The average squared difference from the mean
Measured in units squared
Requires interval or ratio levels of measurement

Standard deviation
The square root of the variance
Returns the units to those of the original variable

Example: standard deviation and variance
Seizures of opium in Germany, 1994-1998
Table columns: Year, Seizure, Deviations, Squared deviations

Count                5
Mean                 84.8
Variance             10230
Standard deviation   101
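The deviations table can be reproduced step by step as in the sketch below. The values are hypothetical, not the seizure figures from the source table. Both divisors are shown because the slide's "average squared difference" does not say which is meant: dividing by n gives the population variance, while statistical packages such as SPSS generally report the sample variance, which divides by n - 1.

```python
import math
import statistics

# Hypothetical annual seizures in kilograms (not the figures from the source table).
seizures = [12, 95, 7, 30, 18]

mean = sum(seizures) / len(seizures)
deviations = [x - mean for x in seizures]          # difference of each value from the mean
squared_deviations = [d ** 2 for d in deviations]  # squaring removes the sign
total = sum(squared_deviations)

population_variance = total / len(seizures)        # divide by n
sample_variance = total / (len(seizures) - 1)      # divide by n - 1
sample_sd = math.sqrt(sample_variance)             # back in the original units (kilograms)

# Cross-check against the standard library.
assert math.isclose(sample_variance, statistics.variance(seizures))
assert math.isclose(population_variance, statistics.pvariance(seizures))
print(round(sample_variance, 2), round(sample_sd, 2))
```

The deviations always sum to zero, which is why they are squared before averaging.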

Distribution or shape of the data
The normal distribution
Skewness:
–Positive or right-hand skewed
–Negative or left-hand skewed
Kurtosis:
–Platykurtic
–Mesokurtic
–Leptokurtic

The normal distribution
Symmetrical data: the mean, the median and the mode coincide
[Figure: symmetrical curve of f(X) against X, with the mean, median and mode marked at the same point]

Right-hand skew (+)
Right-hand skew: the extreme large values drag the mean towards them
[Figure: curve of f(X) against X with a long right tail; from left to right: mode, median, mean]

Left-hand skew (-)
Left-hand skew: the extreme small values drag the mean towards them
[Figure: curve of f(X) against X with a long left tail; from left to right: mean, median, mode]
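The direction of skew can be checked numerically, as in this sketch. It uses a plain moment-based skewness (the average cubed standardized deviation); statistical packages usually apply a small-sample adjustment, so their reported values will differ somewhat. The function name and both datasets are hypothetical.

```python
import statistics

def skewness(values):
    """Moment-based skewness: the average cubed standardized deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)   # population standard deviation
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

# Hypothetical data: one extreme large value gives a right-hand skew.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
# Mirror image of the same data gives a left-hand skew.
left_skewed = [-v for v in right_skewed]

for data in (right_skewed, left_skewed):
    print(round(skewness(data), 2), statistics.mean(data), statistics.median(data))
```

In the right-skewed data the mean lies above the median and the skewness is positive; in the mirrored data both relationships reverse.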

Bivariate analysis
Continuous Dependent Variable
Categorical Independent Variable

Explore

Explore: Options button

Explore: Plots button

Explore: Statistics button

Descriptives: AGE by Gender

Statistic                                       Male        Female
Mean                                            [missing]   [missing]
95% Confidence Interval for Mean, Lower Bound   30.76       31.84
95% Confidence Interval for Mean, Upper Bound   [missing]   [missing]
5% Trimmed Mean                                 31.03       32.77
Median                                          30.00       33.00
Variance                                        [missing]   [missing]
Std. Deviation                                  [missing]   [missing]
Minimum                                         1           14
Maximum                                         70          77
Range                                           69          63
Interquartile Range                             19.00       23.00
Skewness                                        [missing]   [missing]
Kurtosis                                        [missing]   [missing]

Std. Error values (Mean, Skewness, Kurtosis): [missing]

[Figure panels: Male, Female]

Boxplot of Age vs Gender
[Figure: boxplots of Age for each Gender, with labels marking the median, the inter-quartile range and an outlier]

Syntax: Explore
EXAMINE VARIABLES=age BY gender
  /ID=id
  /PLOT BOXPLOT HISTOGRAM
  /COMPARE GROUP
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.
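For readers without SPSS to hand, the idea behind this bivariate analysis (summarizing a continuous variable separately for each category of another variable) can be sketched in Python. This is only a loose analogue of the command above, not a reimplementation: the records are hypothetical, only a few statistics are computed, and cases with a missing age or gender are simply dropped.

```python
import statistics
from collections import defaultdict

# Hypothetical records (id, gender, age); not the training dataset.
records = [
    (1, "Male", 24), (2, "Female", 31), (3, "Male", 30),
    (4, "Female", 45), (5, "Male", 36), (6, "Female", None),
]

# Group ages by gender, dropping cases with a missing value on either variable.
ages_by_gender = defaultdict(list)
for _id, gender, age in records:
    if gender is not None and age is not None:
        ages_by_gender[gender].append(age)

# A few of the descriptive statistics, computed separately for each group.
summary = {
    gender: {
        "n": len(ages),
        "mean": statistics.mean(ages),
        "median": statistics.median(ages),
        "minimum": min(ages),
        "maximum": max(ages),
        "range": max(ages) - min(ages),
    }
    for gender, ages in ages_by_gender.items()
}

for gender, stats in summary.items():
    print(gender, stats)
```

Comparing the per-group summaries side by side is the tabular counterpart of comparing the boxplots for each group.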

Summary
Measures of central tendency
Measures of variation
Quantiles
Measures of shape
Bivariate analysis for a categorical independent variable and continuous dependent variable
Histograms
Boxplots