SADC Course in Statistics Numerical summaries for quantitative data Module I3 Sessions 4 and 5.



Learning objectives. Students should be able to: (1) explain why it is important to summarise the variability of a dataset; (2) provide from first principles, and explain the role of, the common summary statistics for average and spread for a simple dataset; (3) visualise a dataset to estimate the standard deviation from a graph of the data; (4) visualise a dataset to construct a histogram or boxplot, given a numerical summary; (5) explain the formulae for the variance, standard deviation and mean deviation.

Contents. Activity 1: PowerPoint presentation, to stress the importance of understanding summary statistics. Activity 2: Practical 1, calculate averages and measures of variation. Activity 3: Practical 2, interpret and explain averages and measures of variation. Activity 4: Review of key points and concepts.

Why variation is SO important. From D. S. Moore, in Statistics: A Guide to the Unknown, 4th Edition: Variation is everywhere. Individuals vary. Repeated measurements on the same individual vary. The science of statistics provides tools for dealing with variation. Give examples of the two statements in blue: time of arrival at a lecture, blood pressure, reaction times, penalty taking in football.

Look at the wide range of situations! Record some examples on the board or flip chart. How many people said the same thing? How many areas of application can be considered?

CAST and summary statistics: CAST will be used extensively in one of the practicals.

DFID and climate – was this area mentioned? Reducing the vulnerability of the poor to current climate variability is the starting point for adaptation to climate change. Climatic variability is a fundamental driver of poverty in poor countries. The climate is changing and it is highly likely that it will worsen poverty and hinder efforts to achieve the Millennium Development Goals. The poor cannot cope with current climatic variation in many parts of the world, but this issue is often ignored in poverty assessments or national development planning. Responses to existing climatic variability should be mainstreamed into national development plans and processes. Current responses by individuals and governments to the impacts of climate variability can be used as the basis for adaptation to the increasing climate variability that will be associated with longer-term climate change.

So: to practise statistics you must be able to summarise sets of data, including giving a measure of average, and particularly to summarise the variability. The simple summaries of variability are easy: the extremes (maximum and minimum) and the range, and the quartiles. But the most used measure of variation is called the standard deviation. You can calculate it easily in Excel, but you must understand it and be able to interpret it. That is what you need to learn from these sessions.
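As a rough illustration of the summaries listed above, here is a minimal Python sketch. The data values are invented for illustration and are not from the slides. Note that different packages use slightly different rules for quartiles, so the quartile values may not match Excel or CAST exactly.

```python
import statistics

# Hypothetical data, invented for illustration (not from the slides).
data = [12, 15, 17, 20, 22, 25, 29]

# Simple summaries of variability: the extremes and the range.
minimum = min(data)
maximum = max(data)
data_range = maximum - minimum

# Quartiles: with n=4, statistics.quantiles returns the three cut points.
q1, median, q3 = statistics.quantiles(data, n=4)

# Average and the most used measure of variation.
mean = statistics.mean(data)
sd = statistics.stdev(data)  # sample standard deviation (divisor n - 1)

print(f"min = {minimum}, max = {maximum}, range = {data_range}")
print(f"lower quartile = {q1}, median = {median}, upper quartile = {q3}")
print(f"mean = {mean}, s.d. = {sd:.2f}")
```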

Activity 2: Practical 1. Trivial data sets are analysed by hand, for understanding, and using Excel, to explain the formulae so that you can also use them. This includes the coefficient of variation (cv), which provides a good initial test of your understanding. The cv is useful, but also overused; we ask you to explain when it should NOT be used.
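The "by hand" calculation can be mirrored step by step in code. The sketch below assumes the usual textbook definitions (sample variance with divisor n - 1, mean deviation as the average absolute deviation from the mean, divisor n); the practical itself may use slightly different conventions. The dataset is hypothetical.

```python
import math

# Trivial hypothetical dataset, chosen so the arithmetic is easy to check by hand.
data = [2, 4, 6, 8, 10]
n = len(data)

# Step 1: the mean.
mean = sum(data) / n  # 6.0

# Step 2: deviations of each value from the mean.
deviations = [x - mean for x in data]  # -4, -2, 0, 2, 4

# Step 3: variance = sum of squared deviations / (n - 1).
variance = sum(d ** 2 for d in deviations) / (n - 1)  # 40 / 4 = 10.0

# Step 4: standard deviation = square root of the variance.
sd = math.sqrt(variance)

# Mean deviation = average of the absolute deviations (divisor n assumed here).
mean_deviation = sum(abs(d) for d in deviations) / n  # 12 / 5 = 2.4

print(f"mean = {mean}, variance = {variance}, s.d. = {sd:.2f}")
print(f"mean deviation = {mean_deviation}")
```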

Activity 3: Using CAST for help. You work in pairs, learning from CAST and then taking on a teacher's role. You need to understand a topic well to be able to explain it to someone else. CAST also gives exercises: to estimate the variability from a histogram or boxplot, and to draw the histogram or boxplot, given the summary values. You also try these tasks, with your partner to help – or hinder!

Discussion. From Practical 1: suppose marks in a test are 12, 15, … so the mean = 20 and the s.d. = 8. Students are all given 15 marks bonus for attending. They all attended, so all get the extra 15. What is the mean and what is the standard deviation now?
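One way to explore this question is to try it on a small set of marks. The marks below are hypothetical (the full list from the practical is not shown here), so their s.d. is not the 8 quoted on the slide; the point is only to compare the summaries before and after the bonus.

```python
import statistics

# Hypothetical marks, invented for illustration; only their mean (20) matches the slide.
marks = [12, 15, 20, 25, 28]
bonus = 15

# Every student gets the same bonus.
marks_with_bonus = [m + bonus for m in marks]

mean_before = statistics.mean(marks)
mean_after = statistics.mean(marks_with_bonus)
sd_before = statistics.stdev(marks)
sd_after = statistics.stdev(marks_with_bonus)

print(f"mean: {mean_before} -> {mean_after}")
print(f"s.d.: {sd_before:.3f} -> {sd_after:.3f}")
```

Adding the same constant to every value moves each value and the mean by the same amount, so the deviations from the mean do not change.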

A possible problem with Excel. Software should give the right answer. We show that Excel standard functions did not, though SSC-Stat is OK. Give the mean and standard deviation of: mean = 3, s.d. = 1.58. What is the mean and s.d. if we add 10? mean = ???, s.d. = ???

A possible problem with Excel. Software should give the right answer. We show that Excel standard functions did not, though SSC-Stat is OK. Give the mean and standard deviation of: mean = 3, s.d. = 1.58. What is the mean and s.d. if we add 10? mean = 13, s.d. = 1.58 again. * Check you are absolutely clear that this is true. And if you add 100, the s.d. = ??? And if you add 1000, the s.d. = ???

Standard deviation in Excel 2000: same as previous slide. Ooops!

This problem with Excel: it was fixed in Excel 2003, but it should make you worry that other answers might still be wrong. We return to this point in Session 13. For now, the key idea is your understanding of the measures of variation.
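The slides do not say why the older Excel functions went wrong. A commonly cited explanation, assumed here, is the one-pass "calculator" formula for the variance, which subtracts two very large and nearly equal numbers and so loses accuracy when the data have a large mean. The Python sketch below is not Excel's code; it only contrasts that formula with the two-pass formula based on deviations from the mean. The values 1 to 5 are an assumption that happens to give mean 3 and s.d. 1.58, and the offset of 1e9 is chosen to trigger the effect in double precision rather than taken from the slides.

```python
def variance_one_pass(xs):
    """'Calculator' formula: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    return (total_sq - total * total / n) / (n - 1)


def variance_two_pass(xs):
    """Deviation formula: sum of (x - mean)^2 / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Assumed illustrative data: mean 3, variance 2.5, s.d. about 1.58.
base = [1.0, 2.0, 3.0, 4.0, 5.0]

# Adding a constant should leave the variance (and so the s.d.) at 2.5.
for offset in (0.0, 10.0, 1e9):
    shifted = [x + offset for x in base]
    print(offset, variance_one_pass(shifted), variance_two_pass(shifted))
```

With the large offset the one-pass result is no longer 2.5 and can be zero, badly wrong, or even negative, which is why the sketch prints variances rather than taking a square root. The two-pass version still gives 2.5.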

The coefficient of variation (cv). It is popular in some areas of application, and easy to misuse. It is given by cv = 100 * s.d./mean. When should it NOT be used? 1. When the s.d. should not be used. When is that? 2. When it is not sensible to divide by the mean. When is that?
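The formula cv = 100 * s.d./mean is easy to try out. The sketch below uses hypothetical values and a helper named cv, which is introduced here only for illustration. It shows one property that may help when thinking about question 2: adding a constant to every value leaves the s.d. unchanged but changes the mean, so the cv changes. It is offered as a hint, not as the expected answer to the questions.

```python
import statistics


def cv(xs):
    """Coefficient of variation as a percentage: 100 * s.d. / mean."""
    return 100 * statistics.stdev(xs) / statistics.mean(xs)


# Hypothetical values, invented for illustration.
values = [2, 4, 6, 8, 10]

# The same values measured from a different zero point (every value + 100).
shifted = [x + 100 for x in values]

print(f"cv of original values: {cv(values):.1f}%")
print(f"cv of shifted values:  {cv(shifted):.1f}%")
```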

Training – how did it go? Did you get good marks as trainers? What suggestions did you have for improvements?

Exercises – how did you do?

My reasoning was as follows: in the figure, everything is between 100 and 300, a spread of 200. Most data (not quite all) are within 2 * s.d. of the mean, so 4 * s.d. is a little less than 200 and the s.d. must be less than 50. So I said 45!
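The rule of thumb in this reasoning can be written as a one-line calculation. The function name rough_sd_upper_bound is hypothetical, and the sketch assumes the data are roughly symmetric so that "within 2 * s.d. of the mean" covers most of the spread seen in the graph.

```python
def rough_sd_upper_bound(low, high, sds_covered=4):
    """Rough upper bound on the s.d.: visible spread divided by about 4."""
    return (high - low) / sds_covered


# Values read from the figure described above: everything lies between 100 and 300.
bound = rough_sd_upper_bound(100, 300)
print(bound)  # 50.0, so a guess a little below this (such as 45) is reasonable
```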

Learning objectives. Are you now able to: (1) explain why it is important to summarise the variability of a dataset; (2) provide from first principles, and explain the role of, the common summary statistics for average and spread for a simple dataset; (3) visualise a dataset to estimate the standard deviation from a graph of the data; (4) visualise a dataset to construct a histogram or boxplot, given a numerical summary; (5) explain the formulae for the variance, standard deviation and mean deviation?

Now that you know about the common summary statistics, the next sessions put them to use.