The following lecture has been approved for University Undergraduate Students This lecture may contain information, ideas, concepts and discursive anecdotes.


1 The following lecture has been approved for University Undergraduate Students. This lecture may contain information, ideas, concepts and discursive anecdotes that may be thought provoking and challenging. It is not intended for the content or delivery to cause offence. Any issues raised in the lecture may require the viewer to engage in further thought, insight, reflection or critical evaluation.

2 Background to Statistics: Distributions, Data collection, Data presentation. Dr. Craig Jackson, Senior Lecturer in Health Psychology, Trauma & Critical Care, Faculty of Health & Community Care, UCE Birmingham. craig.jackson@uce.ac.uk

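3 The Monty Hall Problem. Three doors (1, 2, 3). Problem: stick with the initial choice or choose another door? Is the chance of winning 33% or 50%? Solution: probability says that you stand a better chance of finding the cash if you SWAP (swapping wins 2 times in 3, sticking wins 1 time in 3).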

4 The Monty Hall Problem (Marilyn vos Savant)
              Door 1   Door 2   Door 3
Never swap    WIN      LOSE     LOSE
Always swap   LOSE     WIN      WIN
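The win/lose table above can be checked by listing every equally likely combination of prize door and first pick. This is a minimal sketch, not part of the original lecture; the function name `win_probability` is made up for illustration, and it assumes the host always opens an empty door that the contestant did not pick.

```python
from fractions import Fraction
from itertools import product

DOORS = (1, 2, 3)


def win_probability(swap: bool) -> Fraction:
    """Exact chance of winning, enumerating every (prize door, first pick) pair."""
    cases = list(product(DOORS, DOORS))  # 9 equally likely combinations
    wins = 0
    for prize, pick in cases:
        if swap:
            # Assumption: the host opens an empty door other than the pick,
            # so swapping wins exactly when the first pick was wrong.
            wins += pick != prize
        else:
            # Sticking wins only when the first pick was right.
            wins += pick == prize
    return Fraction(wins, len(cases))


print("Never swap: ", win_probability(swap=False))
print("Always swap:", win_probability(swap=True))
```

Sticking wins in 3 of the 9 cases and swapping in 6 of the 9, which matches the one WIN against two WINs in the table.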

5 Dispersion
Range    Spread of data                   50-112 mmHg
Mean     Arithmetic average               82 mmHg
Median   Location                         82 mmHg
Mode     Frequency                        82 mmHg
SD       Spread of data about the mean    ± 10 mmHg
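The summary measures on this slide can be computed with Python's standard `statistics` module. The readings below are invented for illustration (chosen so that the range, mean, median and mode agree with the slide); they are not the lecture's data, and their SD is not meant to reproduce the slide's ± 10 mmHg.

```python
import statistics

# Hypothetical blood pressure readings in mmHg (illustrative only).
bp = [50, 70, 75, 82, 82, 82, 85, 90, 92, 112]

bp_range = (min(bp), max(bp))   # spread of data: lowest and highest value
mean = statistics.mean(bp)      # arithmetic average
median = statistics.median(bp)  # middle value (location)
mode = statistics.mode(bp)      # most frequent value
sd = statistics.stdev(bp)       # sample SD: spread of data about the mean

print(f"Range  {bp_range[0]}-{bp_range[1]} mmHg")
print(f"Mean   {mean} mmHg")
print(f"Median {median} mmHg")
print(f"Mode   {mode} mmHg")
print(f"SD     ± {sd:.1f} mmHg")
```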

6 Types of Data / Variables
Continuous: BP, Height, Weight, Age
Discrete: Children, No. colds in last 12 months, Age last birthday
Ordinal: Grade of condition, Positions (1st, 2nd, 3rd), “Better - Same - Worse”, Height groups, Age groups
Nominal: Sex, Hair colour, Blood group, Eye colour

7 Conversion & Re-classification. Easier to summarise Ordinal / Nominal data. Cut-off points (who decides this?) allow Continuous variables to be changed into Nominal variables: BP > 90 mmHg = Hypertensive; BP ≤ 90 mmHg = Normotensive. Easier clinical decisions. Categorisation reduces quality of data. Statistical tests may be more sensational. Good for summaries; bad for analyses.
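A short sketch of the cut-off idea, using the 90 mmHg threshold from this slide. The function name `classify_bp` and the readings are hypothetical. It shows what is gained (a simple label) and what is lost (two quite different readings end up with the same label).

```python
CUT_OFF_MMHG = 90  # cut-off point quoted on the slide


def classify_bp(bp_mmhg: float) -> str:
    """Turn a continuous BP reading into a two-level nominal variable."""
    return "Hypertensive" if bp_mmhg > CUT_OFF_MMHG else "Normotensive"


# Hypothetical readings in mmHg (illustrative only).
readings = [78, 89, 90, 91, 104]
labels = [classify_bp(r) for r in readings]

for reading, label in zip(readings, labels):
    print(f"{reading:>3} mmHg -> {label}")

# 91 and 104 now share one label, and 90 and 91 are split apart:
# easier to summarise, but the detail in the original data is gone.
```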

8 Histograms and Bar-Charts. Distinction is often lost. Histograms: the distribution of a continuous variable; no gaps between the bars. Bar-Charts: distribution of discrete / categorical data; spaces between the bars.

9 Types of statistics / analyses. DESCRIPTIVE STATISTICS: describing phenomena. Frequencies: how many… Basic measurements: meters, seconds, cm³, IQ. INFERENTIAL STATISTICS: inferences about phenomena. Hypothesis testing: proving or disproving theories. Confidence intervals: if sample relates to the larger population. Correlation: associations between phenomena. Significance testing: e.g. diet and health.

10 Types of Data. QUALITATIVE: data expressed by type; data that has been described. QUANTITATIVE: data classified by numeric value; data that has been measured or counted. QUALITATIVE and QUANTITATIVE data are not mutually exclusive. Use of the two data types in research is ok.

11 Categorical Data. NOMINAL DATA (values have no “real” meaning): the values that the data may have do not have a specific order; values act as labels with no real meaning, e.g. hair colour: brown = 1, blond = 2, black = 100. ORDINAL DATA (values have “real” meaning): values with some kind of ordering; data that has been measured or counted, e.g. social class: upper = 1, middle = 2, working = 3; e.g. glioblastoma tumor grade: 1 2 3 4 5; e.g. position in a race: 1st 2nd 3rd.

12 Quantitative Data. DISCRETE: distinct or separate parts, with no finite detail, e.g. children in family. CONTINUOUS: between any two values there would be a third, e.g. between meters there are centimetres. INTERVAL: equal intervals between values and an arbitrary zero on the scale, e.g. temperature gradient. RATIO: equal intervals between values and an absolute zero, e.g. body mass index.

13 Quantitative Data. COUNTS: number of items having a particular shared characteristic. PROPORTIONS: number of items with a particular characteristic divided by the total number in the population. PERCENTAGES: a proportion multiplied by 100; represents “parts per hundred”. RATIO: alternative to proportions; number with the characteristic divided by the number without. RATES: a variant of the proportion method, expressed as counts per 1000.
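The five summaries defined on this slide differ only in the denominator and the multiplier. As an illustrative sketch, the counts below are the Exposed column of the category table on slide 23 (147 unwell, 50 healthy); the variable names are made up for the example.

```python
# Counts taken from the Exposed column of the slide 23 category table.
unwell = 147   # COUNT: items with the characteristic
healthy = 50   # COUNT: items without the characteristic
total = unwell + healthy

proportion = unwell / total      # with characteristic / total population
percentage = proportion * 100    # "parts per hundred"
ratio = unwell / healthy         # with characteristic / without
rate_per_1000 = proportion * 1000  # proportion expressed per 1000

print(f"Count       {unwell} of {total}")
print(f"Proportion  {proportion:.3f}")
print(f"Percentage  {percentage:.1f}%")
print(f"Ratio       {ratio:.2f} unwell per healthy")
print(f"Rate        {rate_per_1000:.0f} per 1000")
```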

14 Terminology - Variables INDEPENDENT - Working hours, exposure, worker attitudes, policies - Chemical exposure in workplace DEPENDENT - Symptomatology, productivity, accident rates, attitudes, health - Performance on neuropsychological test CONTROLLED - Working hours, temperatures, exposure, diet, class, income - Ambient noise and temperature in testing room

15 Levels of Variables: Temperature. White Hot Red Hot Cold; “Dangerous”, “Unpleasant”, “Uncomfortable”, “Tolerable”, “Comfortable”, “Cold”; 80 °C, 60 °C, 40 °C, 20 °C, 10 °C; Unsafe, Safe.

16 Distributions. [Figure: % of population by height, 5’6” to 6’4”.] Sir Francis Galton (1822-1911). Alumnus of Birmingham University. 9 books and > 200 papers (fingerprints, correlation of calculus, twins, neuropsychology, blood transfusions, travel in undeveloped countries, criminality and meteorology). Deeply concerned with improving standards of measurement.

17 Quincunx machine (1877). Balls dropped through a succession of metal pins ….. a normal distribution of balls. The balls do not have a normal distribution here. Why?

18 Normal & Non-normal distributions. The distribution derived from the quincunx is not perfect. It was only made from 18 balls.

19 Galton’s quincunx machine ran with hundreds of balls, giving a more “perfect” shaped normal distribution. Obvious implications for the size of samples of populations used. The more lead shot runs through the quincunx machine, the smoother the distribution in the long run.....
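The effect of the number of balls can be tried out with a small simulation. This is only a sketch under simple assumptions that the lecture does not state: each pin sends a ball left or right with equal chance, and there are 10 rows of pins. The names `quincunx` and `show` are invented for the example.

```python
import random
from collections import Counter


def quincunx(n_balls: int, n_rows: int = 10, seed: int = 1) -> Counter:
    """Drop n_balls through n_rows of pins; return how many land in each slot."""
    rng = random.Random(seed)
    slots = Counter()
    for _ in range(n_balls):
        # At each pin the ball goes left (0) or right (1) with equal chance;
        # the final slot is the number of moves to the right.
        slot = sum(rng.randint(0, 1) for _ in range(n_rows))
        slots[slot] += 1
    return slots


def show(slots: Counter, n_rows: int = 10, width: int = 50) -> None:
    """Print a sideways text histogram, scaled to the fullest slot."""
    peak = max(slots.values())
    for slot in range(n_rows + 1):
        print(f"{slot:2d} " + "#" * round(width * slots[slot] / peak))


print("18 balls")
show(quincunx(18))
print("5000 balls")
show(quincunx(5000))
```

With 18 balls the histogram is typically lumpy; with thousands it tends towards the smooth bell shape described on the slide.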

20 Normal & Non-normal distributions. A SAMPLE OF VISUAL ABILITIES IN THE UK (SIMPLIFIED DATA). [Figure: frequency against visual ability, from very poor through average to very good.] Recruiting participants in the “RNIB” magazine would yield? Recruiting participants in “ornithology” magazine would yield? Recruiting participants in a GP surgery would yield? Bigger samples are best (usually).

21 Presentation of data: why use tables and graphs. FIRST PRINCIPLES OF DATA PRESENTATION: enhance understanding; clarity; avoidance of misunderstanding. WHY USE TABLES? More accurate than graphs; more concise than graphs. WHY USE GRAPHS? Provide good general overview; allow reader to visualise the concept.

22 Presentation of data: Table of means
             Exposed          Controls         T      P
             n=197            n=178
Age (yrs)    45.5 (± 9.4)     48.9 (± 7.3)     2.19   0.07
I.Q          105 (± 10.8)     99 (± 8.7)       1.78   0.12
Speed (ms)   115.1 (± 13.4)   94.7 (± 12.4)    3.76   0.04

23 Presentation of data: Category tables
          Exposed   Controls   Total
Healthy   50        150        200
Unwell    147       28         175
Total     197       178        375
Chi square (test of association) shows: Chi square = 7.2, P = 0.02

24 Graphical displays. Bar charts: use for comparing data and counts of data. Histograms: use for comparing data and to show spread of data. Pie charts: use for counts of data. Scatterplot: use to show spread of data. Box plot. Confidence intervals.

25 Bar charts. [Figure: mean GHQ scores for exposure groups; GHQ score by job type.]

26 Graphical display components: title of graph; y-axis (ordinate); x-axis (abscissa); scale; data display area; legend / key.

27 Graphical displays: some real data (www.imdb.com). Movie-goers’ ratings for National Lampoon’s European Vacation (1985). [Figure: votes by viewer rating.] What does the distribution of votes indicate? What other info is needed?

28 Graphical displays: some real data (www.imdb.com). Movie-goers’ ratings for The Empire Strikes Back (1980). [Figure: votes by viewer rating.] What does the distribution of votes indicate? What other info is needed?

29 Movie data summary (www.imdb.com)
Movie-goers’ ratings (%)
Rating    10     9      8      7      6      5      4      3      2      1
Lampoon   3.5    3.1    8      13.8   15.9   14.7   14.6   10.3   9.1    7.1
Empire    39.4   20.2   17.5   10.9   4.9    2.8    1.3    0.7    0.7    1.8
Movie-goers’ ratings (votes)
Rating    10     9      8      7      6      5      4      3      2      1
Lampoon   31     27     70     121    140    129    128    90     80     62
Empire    6197   3182   2749   1710   766    435    201    109    113    279
Both tables represent the same data..… Does either of them convey the general trend?
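The percentage table can be derived from the vote counts, which matters here because the two films have very different numbers of votes. A minimal sketch using the counts from this slide; the helper name `to_percentages` is made up for the example.

```python
ratings = list(range(10, 0, -1))  # ratings 10 down to 1, as on the slide

# Vote counts per rating, copied from the slide.
votes = {
    "Lampoon": [31, 27, 70, 121, 140, 129, 128, 90, 80, 62],
    "Empire": [6197, 3182, 2749, 1710, 766, 435, 201, 109, 113, 279],
}


def to_percentages(counts: list[int]) -> list[float]:
    """Express each count as a percentage of the film's total votes."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]


for film, counts in votes.items():
    print(f"{film}: {sum(counts)} votes in total")
    for rating, pct in zip(ratings, to_percentages(counts)):
        print(f"  rating {rating:2d}: {pct:5.1f}%")
```

Putting both films on a percentage scale is what makes a back to back comparison possible, since the raw counts for Empire dwarf those for Lampoon.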

30 Movie data summary: back to back comparison (www.imdb.com). [Figure: bar chart of votes by viewer rating, Lampoon and Empire.] What is wrong with this bar chart? How could it be improved?

31 Movie data summary: back to back comparison (www.imdb.com). [Figure: % of votes by viewer rating, Lampoon and Empire.] Can this be improved?

32 Movie data summary: back to back comparison (www.imdb.com). [Figure: % of votes by viewer rating, Lampoon and Empire.] Over-complicated and messy.

33 Clarity vs accuracy
Score       1    2    3    4    5
Frequency   30   46   65   78   90
Tables: determined numbers; word processed; less space. Figures: overview at a glance; little “processing power”; showing trends.

34 Importance of Sample Size. Forgotten in many studies. Little consideration given. Appropriate size needed to confirm / refute hypotheses. Small samples are far too small to detect anything but the grossest difference: non-significant results are reported and Type 2 errors occur. Too large a sample is an unnecessary waste of (clinical) resources, with ethical considerations: waste of patient time, inconvenience, discomfort. Make assessment of optimal sample size before starting investigation.
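One common way to assess sample size in advance, when two group means are to be compared, is the normal-approximation formula n = 2 × ((z for alpha + z for power) × SD / difference)² per group. The lecture does not say which method it has in mind, so this is an assumption; the function name and the example numbers (SD of 10 mmHg, differences of 5 and 10 mmHg) are illustrative only.

```python
import math
from statistics import NormalDist


def n_per_group(sd: float, difference: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants needed per group to compare two means.

    Normal-approximation formula for a two-sided test; an illustrative
    assumption, not a method taken from the lecture.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # guards against Type 1 error
    z_power = NormalDist().inv_cdf(power)          # guards against Type 2 error
    n = 2 * ((z_alpha + z_power) * sd / difference) ** 2
    return math.ceil(n)


# Hypothetical planning values: SD of 10 mmHg in both groups.
print("Detect a 10 mmHg difference:", n_per_group(sd=10, difference=10), "per group")
print("Detect a  5 mmHg difference:", n_per_group(sd=10, difference=5), "per group")
```

Halving the difference to be detected roughly quadruples the sample needed, which is why small studies can only pick up the grossest differences.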

35 Quantitative Data Summary. What data is needed to answer the larger-scale research question? Combination of quantitative and qualitative? Cleaning, re-scoring, re-scaling, or re-formatting. Measurement of both IVs and DVs is complex but can be simplified. Binary measurement makes analysis easier but less meaningful. Binary data needs clear parameters, e.g. exposed vs controls. Collecting good quality data at source is vital.

36 Quantitative Data Summary. Continuous & Discrete data can also be converted into Binary data. Normal distribution of participants / data points desirable. Means: age, height, weight, BMI, IQ, attitudes. Frequencies / Classifications: job type, sick vs. healthy, dead vs. alive. Means must be followed by Standard Deviation (SD or ±). Presentation of data must enhance understanding or be redundant.

37 Further Reading Altman DG. Designing Research. In: Altman DG (ed.) Practical Statistics For Medical Research. Chapman and Hall, London 1991; 74-106. Bland M. The design of experiments. In: Bland M. (ed.) An introduction to medical statistics. Oxford Medical Publications, Oxford 1995; 5-25. Daly LE, Bourke GJ. Epidemiological and clinical research methods. In: Daly LE, Bourke GJ. (eds.) Interpretation and uses of medical statistics. Blackwell Science Ltd, Oxford 2000; 143-201. Gao Smith F, Smith J. (eds.) Key Topics in Clinical Research. BIOS scientific Publications, Oxford 2002. Jackson CA. Planning Health and Safety Research Projects. Croner Health and Safety at Work Special Report 2002; 62: 1-16. Jackson CA. Analyzing Statistical Data in Occupational Health Research. Management of Health Risks Special Report, 81 Croner Publications, Surrey, June 2003

