Promoting Good Statistical Practices


Promoting Good Statistical Practices Roger Stern - SSC, Reading WMO/FAO training workshop - November 2005

Univ of Botswana, 2005
Contents
- Understanding the present situation:
  - The need for (basic) training in statistics
  - Past training in statistics
  - Developments in statistical computing
  - And in statistical analyses
- Possibilities for the future
  - Resources
    - statistical software (freely available in Africa)
    - materials to promote good statistical practices
    - training materials
  - Spatial analysis
- In conclusion

These are exciting times: let’s look forwards, not backwards. Start with some stories relevant to Africa. It is a question of attitude: Rockefeller 2000. Research projects in African universities, done by MSc students, are held back by the students’ poor statistical knowledge. Many attempts to change this, but a difficult problem. I visited many of the universities to meet the students, the staff and the statisticians. One result from the students was: please don’t start reforms at postgraduate level, that is already too late. Change the undergraduate teaching. Easy to see the difficulties: large classes, no computers, no money for demonstrators, etc. Now: BUCS, open consultancy, so funds for MSc students to demonstrate; lab of 60 (second-hand) computers, all with internet; new teaching, all descriptive in year 2. Show some of this. Show CAST for climatic: electronic textbook! Excel: needed, because jobs. UK story: led to SSC-Stat (see later). Climatic: why I am here; Instat (see later). Statistics for government: Kenya Polytechnic (changed their teaching); Univ of Makerere, statistical institute (harder to change).

Training in statistics
- It is difficult to practice good statistics unless we have had appropriate training
- For example, seasonal forecasting uses PCA
- Spatial methods mentioned in this workshop include:
  - Kriging and co-kriging
  - PCA and clustering
- Yet many staff find more basic concepts difficult:
  - Percentiles and return periods (show CAST as preview)
  - Standard errors, etc.
- So they have to accept (advanced) methods in an unquestioning way
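The link between percentiles and return periods can be shown with a few lines of code. This is a minimal sketch, not taken from CAST or Instat: the function names and the ten annual rainfall totals (in mm) are invented for illustration, and the percentile uses simple linear interpolation between ordered values, which is only one of several conventions.

```python
def empirical_percentile(values, p):
    """Percentile p (0 to 100) by linear interpolation between ordered values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    position = (len(ordered) - 1) * p / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def return_period(values, threshold):
    """Average number of years between exceedances of threshold.

    Returns None when the threshold is never exceeded in the record.
    """
    exceedances = sum(1 for v in values if v > threshold)
    if exceedances == 0:
        return None
    return len(values) / exceedances


# Invented annual rainfall totals (mm) for ten years, for illustration only.
annual_totals = [520, 610, 480, 700, 655, 590, 830, 560, 615, 745]

p80 = empirical_percentile(annual_totals, 80)
print("80th percentile (mm):", round(p80, 1))
print("Return period of exceeding it (years):", return_period(annual_totals, p80))
```

With these invented totals the 80th percentile is about 709 mm, a level exceeded in 2 of the 10 years, so its return period is 5 years. The same arithmetic underlies the "20% (1 year in 5)" phrasing used later in the talk.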

Past training in statistics
- Training for (non-statistician) users in the past has been problematical
  - consequently they fear statistics, and hence also statisticians
- Similarly, insufficient soft-skills training for statisticians
  - consequently they sometimes lack communication skills and marketing skills
  - and are often side-lined in important development and research projects
  - just like Met staff, perhaps?

Common training problems for non-statisticians
- Training is dominated by analysis, with little on data management or on design
- A recipe-book approach is used
  - hence, e.g., overuse of irrelevant significance tests
  - little understanding of principles
- Training emphasises hand computation
  - for understanding (which they don’t get!)
  - but not needed later
  - and little experience of computers for statistical work
- Presentation is too mathematical, not conceptual
- AND often taught by someone who has little interest in the student’s main subject areas

RESULT!
- Users with near-universal dislike of statistics and statisticians?
- Strong demand for relevant in-service training in statistics
- Most of these past weaknesses in training are the same for statisticians
  - who can be too pedantic and inflexible in their advice
  - and are then feared and ignored, where possible, by potential clients
- We see later how this can now easily change
  - for both statisticians
  - and for others who need to generate and use statistics

Advances in statistical computing
History
- 1960s: SAS and SPSS started
  - A long way back in computer terms
- By early 1980s
  - Statistics packages well established
  - Micro-computers appeared, too small for these packages
  - So lots of other statistics packages that made the same mistakes as SAS and SPSS a generation earlier
- It is easy to write statistical software, but difficult to write good software

Statistics packages: THEN
- In the 1990s
  - Standard statistics packages dominant again (compare other types of software)
  - With some additions, e.g. Stata
- All command-driven
  - So you had to learn the language (for SPSS, or SAS)
  - So people and training courses used just one package
  - Data transfer between packages was difficult
- Training courses often confused learning the package with learning statistics
  - cf. data management: learning concepts or learning Access

A big advance…
- Windows appeared
- & EXCEL ruled the world
  - for better
  - for worse!

Statistics packages: NOW
- All common packages are in Windows
  - Very similar interface
  - Like other Windows software
  - So very easy to learn
  - And to add to Excel, so you can still keep your “security blanket”
- And easy to add another package
  - hence not so critical what package is used for statistics training
  - Data transfer has also become easy
- Hardly need a training course for the software
  - so can concentrate on training in statistics again!

Advances in statistical analysis
- The “estuary model”
  - ever-increasing unity to the methods
  - this makes training much easier
  - if we build a solid foundation
  - special methods are then seen as such

Start in 1960s
- In the mountains there were little streams
  - Regression and analysis of variance
  - These were for normally distributed data
- In another valley
  - parameter estimation was for other distributions, like Poisson and binomial
- And leading to another valley
  - the chi-square analysis for categorical data

Then
- In the late 1960s
  - Chi-square tests joined with other ways of looking at multidimensional contingency tables to become log-linear models
- In the early 1970s
  - log-linear models joined probit analysis into the general stream of generalized linear models
  - that also included ANOVA and regression
  - for normal and non-normal data

And finally for us here
- In the 1980s
  - REML started
  - and is for data at multiple levels
- By the 1990s it had joined the mainstream
  - and included powerful methods for spatial modelling
- So now the same modelling ideas are used for a wide range of problems
  - Making both training and analysis simpler and more coherent
  - as long as the trainers know
- BUT some are still up in the mountains!

So where are we now?
- Statistical software has developed, and so have users’ computing skills
- Statistical methods have developed and are easier to use
- And the resources to bring the two together are now being made available and are becoming accessible throughout
- We describe some of these resources
  - First generally
  - And then look briefly at methods for spatial modelling

Software includes:
- SSC-Stat
  - add-in for Excel
  - to encourage good use
  - with a tutorial guide and guides for good tables and good graphs
  - for example, it provides boxplots
- Instat+
  - first simple statistics package for ‘Excel-lers’
  - supports good teaching of statistics
  - stepping stone to other statistics packages
  - tutorial guide, introductory guide and climatic guide, now updated for Instat Version 3
  - for example, for data summary or training
- Genstat
  - One of the major statistics packages (like SPSS, Systat)
  - For modern statistical modelling, like GLMs and REML
  - And good facilities for spatial modelling

Excel Add-In
- A simple add-in developed by SSC, Reading University
- To make EXCEL more effective for simple data processing
- Can be installed like other add-ins to ordinary EXCEL

SSC-Stat
- Descriptive statistics using SSC-Stat
- e.g. parallel box plots
- To show outliers, means and medians for a variate, split by different factors

INSTAT+
- Simple statistics package, developed by SSC, Reading
- To facilitate training in good statistical practice
- Also provides a painless way of preparing for a move to a major statistical package
- Special facilities for dealing with (daily) climatic data

Instat
- A graph from Instat
- Showing climatic summaries
- A useful aid to exploring statistics concepts

Instat for training
- Instat using graphs to explore distributions
- emphasizing the need to understand the underlying structure

Genstat
- Specially for agricultural applications
- And now with added climatic features
- Like extremes, and circular plots
- Plus a climatic guide

Modern Statistical Methods
- Dialogue from GenStat
- Showing an easy way to handle more general statistical models

Resources for good statistical practice
- Good practice guides
  - Mini-guides for statistical sceptics
  - designed originally to promote good statistical practice in DFID projects
  - covering design, data management, analysis and presentation
  - a book is now available
- And so much more:
  - Participatory (QQA) stuff, important for Met services
    - Now a book is available, based on Malawi’s “starter pack”
  - Data management, where Met services can support other groups

Links (for Slide 23)
- Good Practice Guides: can be viewed by adding a link to the Instat CD, or live through the SSC web site
- The Green Book: add the link to the Green Book CD

QQA
- Bringing together quantitative and qualitative data in a meaningful way
- Based on work in Malawi

Training resources include
- Statistical games to help teach statistics
  - Reading and BUCS
  - For example PADDY, the rice survey game
- Materials for distance learning
  - Now CAST in general
  - But can now be adapted for African needs
  - With support from the Rockefeller Foundation

Interesting ways of learning
- Training software
- Statistics concepts through CAST

Interesting ways of learning
- Statistical games
- Simulating a survey
- based on a real crop cutting survey in Sri Lanka

And in climatology
- Providing the basic statistical skills
  - Now through a facilitated e-learning course
  - Tested in 2005, and provided from 2006
  - For staff in HQ and (hopefully) in outstation offices
    - Because decentralisation is important
- Using a specially adapted version of CAST
  - That can be provided to African Services
  - You have seen this earlier
- Also software (Instat) plus Genstat
  - Each with their special climatic guide

Spatial ideas
- More to spatial analysis than just maps
- Remember the data: when will you map?
  - Daily: many “layers”
  - Annually (e.g. date of start of the season)
  - Averages: take care of different years at different stations
- Example where a map does not give the full answer
  - Southern Zambia: risky for maize
  - Suggest strategy: say farmers overall have 20% (1 year in 5) risk of replanting
  - How much seed should be stocked?
  - Map: very simple, 20% everywhere. Does it answer the question?
  - Need spatial correlations. Why?
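One way to see why the 20% map does not settle the seed question is a small simulation. The sketch below is an illustration under invented assumptions, not the speaker's analysis: 40 hypothetical stations each have a 20% replanting risk, and a made-up parameter `follow_common` controls how often a station simply shares the regional outcome for the year. Both scenarios would produce the same "20% everywhere" map.

```python
import random


def simulate_replant_fraction(n_stations, risk, follow_common, n_years, seed=0):
    """Simulate the yearly fraction of stations that must replant.

    Every station has the same marginal replanting risk. follow_common
    (hypothetical, between 0 and 1) is the chance that a station shares
    the regional outcome for the year instead of having its own.
    """
    rng = random.Random(seed)
    fractions = []
    for _ in range(n_years):
        regional_bad_year = rng.random() < risk
        replanting = 0
        for _ in range(n_stations):
            if rng.random() < follow_common:
                bad = regional_bad_year
            else:
                bad = rng.random() < risk
            replanting += int(bad)
        fractions.append(replanting / n_stations)
    return fractions


def quantile(values, q):
    """Simple empirical quantile: the value at relative position q when sorted."""
    ordered = sorted(values)
    index = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[index]


independent = simulate_replant_fraction(40, 0.2, 0.0, 5000, seed=1)
correlated = simulate_replant_fraction(40, 0.2, 1.0, 5000, seed=1)

for label, fractions in (("independent", independent),
                         ("fully correlated", correlated)):
    mean = sum(fractions) / len(fractions)
    print(label, "mean:", round(mean, 3),
          "95th percentile:", quantile(fractions, 0.95))
```

In both scenarios the long-run average fraction replanting is close to 20%. When stations fail independently, the fraction replanting in any one year stays fairly near 20%, so a modest stock covers most years. When they fail together, most years need no seed and a bad year needs seed for everyone. The stock required to cover, say, 95% of years therefore depends on the spatial correlation, which the map alone does not show.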

GIS and mapping
- Many problems can be mapped effectively
- Then much “spatial analysis” is descriptive statistics
  - Selection of subsets
  - Transformations to provide new layers
  - Logical calculations
  - Etc.
- This is non-controversial
- Simple smoothing to provide contours is the same
  - As long as the spatial “averaging” (e.g. splines, inverse distance) is recognised as such
- But kriging, etc. is moving into inferential ideas
- And statistical packages could also be used for such operations
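Inverse distance weighting shows why simple smoothing counts as descriptive averaging. The sketch below is a generic textbook form of the method, not the routine of any particular GIS or statistics package; the function name, the station coordinates and the values are invented for illustration.

```python
import math


def idw_estimate(stations, target, power=2.0):
    """Inverse-distance-weighted average of station values at a target point.

    stations: list of ((x, y), value) pairs; target: (x, y).
    The result is a weighted mean of the observed values, so it always lies
    between the smallest and largest observation and carries no standard error.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value  # target coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Two invented stations, 2 distance units apart.
stations = [((0.0, 0.0), 10.0), ((2.0, 0.0), 20.0)]
print(idw_estimate(stations, (1.0, 0.0)))  # midway between the stations
print(idw_estimate(stations, (0.5, 0.0)))  # closer to the first station
```

Midway between the two stations the weights are equal, so the estimate is the plain average of 10 and 20. Nearer the first station the estimate moves towards 10. Nothing in the calculation models how the values vary in space, which is the step that kriging adds and the reason it moves into inferential ideas.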

Spatial statistics with statistical software
- Many statistical packages, e.g. Genstat
- Provide some facilities for spatial analysis
- For example kriging
- And REML, for the future

Demonstration
- Show two examples of Genstat
  - First is a simple contour plot
    - Shows the value of a log file of commands
  - Second is an example of kriging
    - Shows more facilities in fitting and plotting
- Other facilities include
  - Co-kriging
  - REML for “proper” spatial modelling
    - Within which kriging is a special case
- More “research” and case studies are needed

In conclusion
- The time is right:
  - Statistics has changed
  - Training methods can change
  - The resources are here
- And in Africa:
  - Evidence-based decision making is (more) encouraged
  - Met Services are key organisations
  - Because climatic data are needed in so many applications
- Challenge: How will you proceed?

Thank you