# SADC Course in Statistics: Exploratory Data Analysis for Single Variables (Module B2, Session 12)

## Learning objectives

Students should be able to:

- Explain the importance of exploring data at the start of the analysis
- Use two new tools for exploration: dot plots and stem and leaf plots
  - Construct simple and jittered dot plots
  - Draw a stem and leaf plot
- Use training resources more effectively
  - CAST as a training resource
  - Excel as a training and analysis tool
- With charts and graphs
  - Explain the difference between exploratory and presentation graphs

## Stages in processing the data

1. Entry and checking the data
2. Organising the data for analysis
3. Exploring the data
4. Analysis
5. Reporting

The middle three stages are iterative and can be repeated:

- Some exploration can be done before the organising
- Continue to explore through the analysis

## In this session

- Two new tools are introduced
  - Dot plots
  - Stem and leaf plots
- They are used to process numeric data
  - So far we have concentrated on categorical data
  - Now we start to redress the balance
- In the next session we apply these tools

## Jittered dot plots in CAST and Excel

[Figure: jittered dot plots of the rainfall data, one produced in CAST and one in Excel]

Rainfall data: 608, 746, 767, …, 1395, 1425, 1482
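The idea behind simple and jittered dot plots can be sketched in a few lines of Python. This is only an illustration, not how CAST or Excel draws its plots: the function names `stacked_dot_plot` and `jittered_dot_plot` and the `height` and `seed` parameters are hypothetical, and the list holds only the six rainfall values shown on the slide because the rest are elided. The sketch assumes that a simple dot plot stacks repeated values, and that a jittered plot gives each value a random vertical position.

```python
import random


def stacked_dot_plot(values):
    """Simple dot plot: repeated values are stacked at heights 1, 2, 3, ..."""
    counts = {}
    points = []
    for x in values:
        counts[x] = counts.get(x, 0) + 1
        points.append((x, counts[x]))
    return points


def jittered_dot_plot(values, height=1.0, seed=None):
    """Jittered dot plot: each value keeps its x position and gets a random y.

    The random vertical position only separates the dots so they do not
    overprint; it carries no information about the data.
    """
    rng = random.Random(seed)
    return [(x, rng.uniform(0.0, height)) for x in values]


# Only the values visible on the slide; the elided middle values are not guessed.
rainfall_shown = [608, 746, 767, 1395, 1425, 1482]

simple = stacked_dot_plot(rainfall_shown)
jittered = jittered_dot_plot(rainfall_shown, seed=1)

for (x, y_simple), (_, y_jitter) in zip(simple, jittered):
    print(f"{x:5d}  simple y = {y_simple}  jittered y = {y_jitter:.3f}")
```

The resulting (x, y) pairs could be plotted as an ordinary scatter chart. Only the horizontal position of each dot is meaningful.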

## Stem and leaf plots: survey yields

[Figure: two stem and leaf plots of the yields, one with a single stem and one with a split stem]

- Stem: tens digit
- Leaves: units digit
- Decimal part: truncated

Yields: 19.1, 24.3, 24.7, …, 59.3, 61.4, 62.1
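The rule on this slide (stem is the tens digit, leaf is the units digit, decimal part truncated) can be sketched as a short Python function. The name `stem_and_leaf` and its `split` parameter are hypothetical, and the sketch assumes a non-empty list of non-negative values. For the split stem it assumes leaves 0 to 4 go on the first line of each stem and leaves 5 to 9 on the second. Only the six yields shown on the slide are used.

```python
from collections import defaultdict


def stem_and_leaf(values, split=False):
    """Return the lines of a stem and leaf plot.

    Stem = tens digit, leaf = units digit, decimal part truncated.
    With split=True each stem gets two lines: leaves 0-4, then leaves 5-9.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        whole = int(v)                  # truncate the decimal part
        stem, leaf = divmod(whole, 10)  # tens digit and units digit
        half = 1 if (split and leaf >= 5) else 0
        rows[(stem, half)].append(leaf)

    low = min(stem for stem, _ in rows)
    high = max(stem for stem, _ in rows)
    halves = (0, 1) if split else (0,)

    lines = []
    for stem in range(low, high + 1):   # keep empty stems so gaps stay visible
        for half in halves:
            leaves = "".join(str(leaf) for leaf in rows.get((stem, half), []))
            lines.append(f"{stem} | {leaves}".rstrip())
    return lines


# Only the values visible on the slide; the elided middle values are not guessed.
yields_shown = [19.1, 24.3, 24.7, 59.3, 61.4, 62.1]

single = stem_and_leaf(yields_shown)
split = stem_and_leaf(yields_shown, split=True)
print("\n".join(single))
```

Stems with no leaves are still printed, so a gap in the data shows up as an empty row rather than disappearing.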

## Exploratory and presentation graphs

- Dot plots and stem and leaf plots show all the data
  - to help with exploration
  - to look for oddities
  - and to prepare for the analysis
- They are for data exploration
  - The graphs have to be effective, not pretty
- Bar charts and pie charts show summaries
  - to present results to others
  - in reports and presentations
- They are for presentation

## Practical

- Activity 2: uses CAST for dot plots
- Activity 3: uses Excel to produce dot plots
- Activity 4: uses both for stem and leaf plots

It also prepares for the future:

- Efficient use of CAST
- Effective use of Excel

## Learning to use resources fully

- CAST is a new type of resource
- You may have to take some time to learn how to use it fully, so that you gain the maximum from it
- When you do, it can help during this training and also afterwards, because it supports self-study
- It is part of an effort to change learning towards a voyage of discovery

## Using CAST fully?

- Follow the instructions to take advantage of the dynamic elements
- Also think about why the action is useful
- You also saw this earlier, in the tutorial introduction

## Did you puzzle, or just click?

- Did you follow these instructions to scan down the list and look for the pattern?
- Or did you take the easy way out and just click?

## So, making full use of CAST

- Interact, and read the text as well
- Important points are in white

[Screenshot callouts: "Instructions"; "Instructions and statistics"]

## Using Excel effectively

- Dot plots are not on Excel's menus
- Dot plots are not in Excel's help
- But you decided to do dot plots in Excel!
- You therefore need to understand them better
  - so that you can construct them yourself
  - and this understanding is good anyway
  - and helps with effective data analysis
- It is an example of you controlling the software and not being limited by it
  - That applies to all software

## Jittered dot plots in CAST and Excel

[Figure: the CAST and Excel jittered dot plots, side by side]

- Why are the vertical heights different in the two cases?
- Do you ALL know?

## Excel for analysis and training

- Excel is not designed as a training resource
  - Unlike CAST, which is designed solely for training
- Excel is there to support data organisation and analysis
- But here we have also used it for training
  - with dot plots
  - and stem and leaf plots
  - neither of which is in the Excel menus

## Summary

- Dot plots and stem and leaf plots are simple tools for looking at the actual data in a concise way
- It is important to look at the data itself before starting on the actual analysis, so that any patterns or oddities can be identified and the necessary steps taken to deal with them
- When dealing with large sets of data, computers are needed to do the exploration. However, the importance of this work should be stressed right at the data entry stage, and it could even become part of the data checking procedures

The next session will extend and apply the tools from this session to real data

