Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine.


1 Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

2 Statistics is the science of conducting studies to collect, organize, summarize, analyze, present, interpret and draw conclusions from data. Data: any values (observations or measurements) that have been collected.

3 What Is Statistics?
1. Collecting data, e.g., sample, survey, observe, simulate
2. Characterizing data, e.g., organize/classify, count, summarize
3. Presenting data, e.g., tables, charts, statements
4. Interpreting results, e.g., infer, conclude, specify confidence
Why? Data analysis and decision-making

4 Biostatistics: (1) Statistics arising out of the biological sciences, particularly from the fields of medicine and public health. (2) The methods used in dealing with statistics in the fields of medicine, biology and public health, for planning and conducting investigations in these branches and analyzing the data that arise from them.

5
• Summarize data
• Generalize from a sample to a population
• Compare groups
• Test for relationships
• Analyze multiple variables and their relationship to a given outcome

6
• Assess time to an event as an endpoint
• Report the characteristics of a test
• Combine results of several studies
• Assess costs and consequences of treatment
• Report decision analyses and clinical practice guidelines

7 BASIC CONCEPTS
Data: a set of values of one or more variables recorded on one or more observational units (singular: datum)
Categories of data
1. Primary data: observation, questionnaire, interviews and survey
2. Secondary data: census, medical records, registry
Sources of data
1. Routinely kept records
2. Surveys (census)
3. Experiments
4. External sources

8 Datasets and Data Tables
Dataset: data for a set of variables collected on a group of persons.
Data table: a dataset organized into a table, with one column for each variable and one row for each person.

9 Typical Data Table
OBS  AGE  BMI   FFNUM  TEMP (°F)  GENDER  EXERCISE LEVEL  QUESTION
1    26   23.2  0      61.0       0       1               1
2    30   30.2  9      65.5       1       3               2
3    32   28.9  17     59.6       1       3               4
4    37   22.4  1      68.4       1       2               3
5    33   25.5  7      64.5       0       3               5
6    29   22.3  1      70.2       0       2               2
7    32   23.0  0      67.3       0       1               1
8    33   26.3  1      72.8       0       3               1
9    32   22.2  3      71.5       0       1               4
10   33   29.1  5      63.2       1       1               4
11   26   20.8  2      69.1       0       1               3
12   34   20.9  4      73.6       0       2               3
13   31   36.3  1      66.3       0       2               5
14   31   36.4  0      66.9       1       1               5
15   27   28.6  2      70.2       1       2               2
16   36   27.5  2      68.5       1       3               3
17   35   25.6  143    67.8       1       3               4
18   31   21.2  11     70.7       1       1               2
19   36   22.7  8      69.8       0       2               1
20   33   28.1  3      67.8       0       2               1

10 Definitions for Variables
AGE: Age in years
BMI: Body mass index, weight/height² in kg/m²
FFNUM: The average number of times eating "fast food" in a week
TEMP: High temperature for the day
GENDER: 1 - Female, 0 - Male
EXERCISE LEVEL: 1 - Low, 2 - Medium, 3 - High
QUESTION: What is your satisfaction rating for this Biostatistics session? 1 - Very satisfied, 2 - Somewhat satisfied, 3 - Neutral, 4 - Somewhat dissatisfied, 5 - Dissatisfied
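As a minimal sketch of how the coded columns can be read back into labels, the code books below are taken from the definitions on this slide. The three rows are the first three observations as recovered from the flattened data table, so treat the individual values as an assumption; the `decode` helper is hypothetical.

```python
# Code books copied from the slide's variable definitions.
GENDER = {0: "Male", 1: "Female"}
EXERCISE_LEVEL = {1: "Low", 2: "Medium", 3: "High"}
QUESTION = {
    1: "Very satisfied",
    2: "Somewhat satisfied",
    3: "Neutral",
    4: "Somewhat dissatisfied",
    5: "Dissatisfied",
}

# Assumed reading of observations 1-3:
# (OBS, AGE, BMI, FFNUM, TEMP, GENDER, EXERCISE LEVEL, QUESTION)
rows = [
    (1, 26, 23.2, 0, 61.0, 0, 1, 1),
    (2, 30, 30.2, 9, 65.5, 1, 3, 2),
    (3, 32, 28.9, 17, 59.6, 1, 3, 4),
]


def decode(row):
    """Replace the numeric codes of one row with their labels."""
    obs, age, bmi, ffnum, temp, gender, exercise, question = row
    return {
        "OBS": obs,
        "AGE": age,
        "BMI": bmi,
        "FFNUM": ffnum,
        "TEMP": temp,
        "GENDER": GENDER[gender],
        "EXERCISE LEVEL": EXERCISE_LEVEL[exercise],
        "QUESTION": QUESTION[question],
    }


decoded = [decode(row) for row in rows]
for record in decoded:
    print(record)
```

The numeric codes for GENDER, EXERCISE LEVEL and QUESTION are only labels: averaging them would not be meaningful, which is why decoding them before summarizing can help avoid mistakes.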

11 Types of variables and data
When gathering data, we collect data from individual cases on particular variables. A variable is a unit of data collection whose value can vary. Variables can be classified into types according to the level of mathematical scaling that can be carried out on the data. There are four types of data or levels of measurement:
1. Nominal
2. Ordinal
3. Interval
4. Ratio

12 Scales of Measurement

13 Terminology
• Categorical variables
• Quantity variables
• Nominal variables
• Ordinal variables
• Binary data
• Discrete and continuous data
• Interval and ratio variables
• Qualitative and quantitative traits/characteristics of data

14 Categorical data
• The objects being studied are grouped into categories based on some qualitative trait.
• The resulting data are merely labels or categories.

15 Examples of categorical data
• Eye color: blue, brown, black, green, etc.
• Smoking status: smoker, non-smoker
• Attitudes towards the death penalty: strongly disagree, disagree, neutral, agree, strongly agree

16 Categorical data divides into nominal data and ordinal data.

17 Nominal data
• A type of categorical data in which objects fall into unordered categories.
• Studies measuring nominal data must ensure that the categories are mutually exclusive and that the system of measurement is exhaustive.
• Variables that have only two responses, e.g., Yes or No, are known as dichotomies.

18 Examples of Nominal data
• Type of car: BMW, Mercedes, Lexus, Toyota, etc.
• Ethnicity: White British, Afro-Caribbean, Asian, Arab, Chinese, other, etc.
• Smoking status: smoker, non-smoker

19 Binary Data
• A type of categorical data in which there are only two categories.
Examples:
• Smoking status: smoker, non-smoker
• Attendance: present, absent
• Result of an exam: pass, fail
• Status of student: undergraduate, postgraduate

20 Ordinal data
Ordinal data is data made up of categories that can be rank ordered. As with nominal data, the distance between categories cannot be calculated, but the categories can be ranked above or below each other.

21 Examples of Ordinal Data:
• Grades in exam: A+, A, B+, B, C+, C, D+, D, and Fail
• Degree of illness: none, mild, moderate, acute, chronic
• Opinion of students about stats classes: very unhappy, unhappy, neutral, happy, ecstatic!

22 Examples: Nominal data (binary) & Ordinal data
What is your gender? (please tick) Male / Female
Did you enjoy the teaching session? (please tick) Yes / No
What is your level of satisfaction with the new curriculum at the medical school? (please tick) Very satisfied / Somewhat satisfied / Neutral / Somewhat dissatisfied / Very dissatisfied

23 QUANTITATIVE DATA
• The objects being studied are 'measured' based on some quantitative trait.
• The resulting data are a set of numbers.
Examples:
• Pulse rate
• Height
• Age
• Exam marks
• Time to complete a statistics test
• Number of cigarettes smoked

24 Quantitative data divides into discrete data and continuous data.

25 Discrete Data: only certain values are possible (there are gaps between the possible values). Implies counting.
Continuous Data: theoretically, any value within an interval is possible, given a fine enough measuring device. Implies measuring.

26 Discrete data -- gaps between possible values (example: number of children)
Continuous data -- theoretically, no gaps between possible values (example: Hb)

27 Examples of Discrete Data:
• Number of children in a family
• Number of students passing a stats exam
• Number of crimes reported to the police
• Number of bicycles sold in a day
Generally, discrete data are counts. We would not expect to find 2.2 children in a family, 88.5 students passing an exam, 127.2 crimes being reported to the police, or half a bicycle being sold in one day.

28 Examples of Continuous Data:
• Age (in years)
• Height (in cm)
• Weight (in kg)
• Systolic BP, Hb, etc.
Generally, continuous data come from measurements.

29 Variables
Category: nominal, ordinal
Quantity: discrete (counting), continuous (measuring)

30 Interval variables
Examples:
• Fahrenheit temperature scale: zero is arbitrary, so 40 degrees is not twice as hot as 20 degrees.
• IQ tests: there is no such thing as zero IQ, and an IQ of 120 is not twice as intelligent as an IQ of 60.
• Question: can we assume that attitudinal data represent real, quantifiable measured categories (i.e., that 'very happy' is twice as happy as plain 'happy', or that 'very unhappy' means no happiness at all)? Statisticians are not in agreement on this.
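The Fahrenheit point can be checked numerically. The sketch below is an illustration added here, not part of the original slides: it converts both temperatures to kelvin, a scale with a true zero, and compares the ratios. The function name `fahrenheit_to_kelvin` is a hypothetical helper.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert degrees Fahrenheit to kelvin (a scale with an absolute zero)."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15


# Ratio taken directly on the interval scale: looks like "twice as hot".
naive_ratio = 40.0 / 20.0

# Ratio on a scale with a meaningful zero: only a few percent apart.
kelvin_ratio = fahrenheit_to_kelvin(40.0) / fahrenheit_to_kelvin(20.0)

# Differences, by contrast, are meaningful on an interval scale.
difference = 40.0 - 20.0

print(f"naive ratio:  {naive_ratio:.2f}")
print(f"kelvin ratio: {kelvin_ratio:.3f}")
print(f"difference:   {difference:.1f} degrees F")
```

On an interval scale the difference of 20 degrees is interpretable, while the ratio of 2 is an artifact of where zero happens to sit.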

31 Ratio variables
• Can be discrete or continuous data.
• The distance between any two adjacent units of measurement (intervals) is the same and there is a meaningful zero point.
Examples:
• Income: someone earning SR20,000 earns twice as much as someone who earns SR10,000.
• Height
• Weight
• Age

32 Hierarchical data order
These levels of measurement can be placed in hierarchical order.

33 Hierarchical data order
Nominal data is the least complex and gives a simple measure of whether objects are the same or different. Ordinal data maintains the principles of nominal data but adds a measure of order to what is being observed. Interval data builds on ordinal data by allowing us to measure the distance between observations. Ratio data adds an absolute zero to interval data.

34 QUANTITATIVE DATA converted to QUALITATIVE DATA
• Weight (in kg): underweight, normal and overweight
• Height (in cm): short, medium and tall
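A small sketch of this conversion for height follows. The slide gives no cut-off values, so the thresholds `short_below` and `tall_from` and the sample heights are purely hypothetical placeholders chosen for illustration.

```python
def height_category(height_cm, short_below=160.0, tall_from=175.0):
    """Map a height in cm to an ordinal label.

    The default cut-offs are hypothetical; a real study would
    define them from its own protocol or reference standards.
    """
    if height_cm < short_below:
        return "short"
    if height_cm < tall_from:
        return "medium"
    return "tall"


heights_cm = [152.0, 168.5, 181.0]  # made-up measurements
categories = [height_category(h) for h in heights_cm]
print(list(zip(heights_cm, categories)))
```

The conversion discards information: once heights are stored only as short, medium or tall, the original measurements cannot be recovered.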

35 Clinimetrics: a science in which qualities are converted to meaningful quantities by using a scoring system. Examples: (1) Apgar score: based on appearance, pulse, grimace, activity and respiration; used for neonatal prognosis. (2) Smoking index: number of cigarettes, duration, filter or not, whether pipe, cigar, etc. (3) APACHE (Acute Physiology and Chronic Health Evaluation) score: quantifies the severity of a patient's condition.
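To illustrate how a scoring system turns qualities into a quantity, here is a minimal sketch of an Apgar-style total. It assumes the standard convention that each of the five signs is rated 0, 1 or 2 and the ratings are summed to a total between 0 and 10. The function name and the example ratings are hypothetical, and this is not a clinical tool.

```python
APGAR_SIGNS = ("appearance", "pulse", "grimace", "activity", "respiration")


def apgar_total(scores):
    """Sum the five sign ratings after checking each is 0, 1 or 2."""
    for sign in APGAR_SIGNS:
        value = scores[sign]
        if value not in (0, 1, 2):
            raise ValueError(f"{sign} must be 0, 1 or 2, got {value!r}")
    return sum(scores[sign] for sign in APGAR_SIGNS)


# Made-up ratings for one newborn.
example = {
    "appearance": 1,
    "pulse": 2,
    "grimace": 2,
    "activity": 1,
    "respiration": 2,
}
total = apgar_total(example)
print(f"Apgar total: {total}")
```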

36 Data types: important?
Why do we need to know what type of data we are dealing with? The data type or level of measurement influences the type of statistical analysis techniques that can be used when analysing data.

37 Frequency Distributions
What is a frequency distribution? A frequency distribution is an organization of raw data in tabular form, using classes (or intervals) and frequencies.
What is a frequency count? The frequency or the frequency count for a data value is the number of times the value occurs in the data set.

38 Frequency Distributions
• data distribution: pattern of variability
• the center of a distribution
• the ranges
• the shapes
• simple frequency distributions
• grouped & ungrouped frequency distributions

39 Categorical or Qualitative Frequency Distributions
What is a categorical frequency distribution? A categorical frequency distribution represents data that can be placed in specific categories, such as gender, blood group, hair color, etc.

40 Categorical or Qualitative Frequency Distributions -- Example
Example: The blood types of 25 blood donors are given below. Summarize the data using a frequency distribution.
AB B A O B O B O A O O B O A O B O B B B B O B B B A O AB AB O A O AB AB O A B AB O A A B AB O A

41 Categorical Frequency Distribution for the Blood Types -- Example Continued
Note: The classes for the distribution are the blood types.

42 Quantitative Frequency Distributions -- Ungrouped
What is an ungrouped frequency distribution? An ungrouped frequency distribution simply lists the data values with the corresponding frequency counts with which each value occurs.

43 Quantitative Frequency Distributions -- Ungrouped -- Example
Example: The at-rest pulse rates for 16 athletes at a meet were 57, 57, 56, 57, 58, 56, 54, 64, 53, 54, 54, 55, 57, 55, 60, and 58. Summarize the information with an ungrouped frequency distribution.

44 Quantitative Frequency Distributions – Ungrouped -- Example Continued Note: The (ungrouped) classes are the observed values themselves.
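The table from this slide did not survive in the transcript, but it can be rebuilt from the 16 pulse rates given in the example. The following is a minimal sketch using Python's standard `collections.Counter`; the variable names are illustrative.

```python
from collections import Counter

# At-rest pulse rates for the 16 athletes, as listed in the example.
pulse_rates = [57, 57, 56, 57, 58, 56, 54, 64,
               53, 54, 54, 55, 57, 55, 60, 58]

# Count how many times each observed value occurs.
frequency = Counter(pulse_rates)

# One row per observed value, in ascending order.
table = sorted(frequency.items())

print("Pulse rate  Frequency")
for value, f in table:
    print(f"{value:>10}  {f:>9}")
print(f"{'Total':>10}  {sum(frequency.values()):>9}")
```

Each class is an observed value, so values that never occur (for example 59) simply do not appear in the table.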

45 Example of a simple frequency distribution (ungrouped)
Data: 5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1
Value  f
9      3
8      2
7      2
6      1
5      4
4      4
3      3
2      3
1      3
Σf = 25

46 Relative Frequency Distribution Note: The relative frequency for a class is obtained by computing f/n.

47 Example of a simple frequency distribution
Data: 5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1
Value  f  rel f
9      3  .12
8      2  .08
7      2  .08
6      1  .04
5      4  .16
4      4  .16
3      3  .12
2      3  .12
1      3  .12
Σf = 25, Σ rel f = 1.0
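The f/n rule can be sketched in a few lines. The data are the 25 values from this example; the variable names are illustrative.

```python
from collections import Counter

data = [5, 7, 8, 1, 5, 9, 3, 4, 2, 2, 3, 4, 9,
        7, 1, 4, 5, 6, 8, 9, 4, 3, 5, 2, 1]

n = len(data)
frequency = Counter(data)

# Relative frequency of each value: its count divided by n.
relative = {value: f / n for value, f in frequency.items()}

print("Value  f  rel f")
for value in sorted(frequency, reverse=True):
    print(f"{value:>5}  {frequency[value]}  {relative[value]:.2f}")
print(f"sum of rel f = {sum(relative.values()):.1f}")
```

Because every observation falls in exactly one class, the relative frequencies add up to 1.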

48 Cumulative Frequency and Cumulative Relative Frequency Note: Table with relative and cumulative relative frequencies.

49 Example of a simple frequency distribution (ungrouped)
Data: 5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1
Value  f  cf  rel f  rel cf
9      3  3   .12    .12
8      2  5   .08    .20
7      2  7   .08    .28
6      1  8   .04    .32
5      4  12  .16    .48
4      4  16  .16    .64
3      3  19  .12    .76
2      3  22  .12    .88
1      3  25  .12    1.0
Σf = 25, Σ rel f = 1.0
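The cumulative columns are running totals of the frequency column. The sketch below uses `itertools.accumulate` and follows the slide's layout, accumulating from the highest value (9) down to the lowest (1). That ordering is a presentation choice; accumulating from the lowest value upward is equally common.

```python
from collections import Counter
from itertools import accumulate

data = [5, 7, 8, 1, 5, 9, 3, 4, 2, 2, 3, 4, 9,
        7, 1, 4, 5, 6, 8, 9, 4, 3, 5, 2, 1]

n = len(data)
frequency = Counter(data)

# Rows ordered from 9 down to 1, as on the slide.
values = sorted(frequency, reverse=True)
f = [frequency[v] for v in values]

# Cumulative frequency: running total of f down the rows.
cf = list(accumulate(f))

# Cumulative relative frequency: cf divided by n.
rel_cf = [c / n for c in cf]

print("Value  f  cf  rel f  rel cf")
for v, fi, ci, rci in zip(values, f, cf, rel_cf):
    print(f"{v:>5}  {fi}  {ci:>2}  {fi / n:.2f}   {rci:.2f}")
```

The last cumulative frequency always equals n and the last cumulative relative frequency always equals 1, which makes a quick check on the arithmetic.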

50 Quantitative Frequency Distributions -- Grouped
What is a grouped frequency distribution? A grouped frequency distribution is obtained by constructing classes (or intervals) for the data, and then listing the corresponding number of values (frequency counts) in each interval.

51 Tabulate the hemoglobin values of 30 adult male patients listed below
Patient No  Hb (g/dl)    Patient No  Hb (g/dl)    Patient No  Hb (g/dl)
1           12.0         11          11.2         21          14.9
2           11.9         12          13.6         22          12.2
3           11.5         13          10.8         23          12.2
4           14.2         14          12.3         24          11.4
5           12.3         15          12.3         25          10.7
6           13.0         16          15.7         26          12.5
7           10.5         17          12.6         27          11.8
8           12.8         18          9.1          28          15.1
9           13.2         19          12.9         29          13.4
10          11.2         20          14.6         30          13.1

52 Frequency distribution of 30 adult male patients by Hb
Hb (g/dl)    No. of patients
9.0-9.9      1
10.0-10.9    3
11.0-11.9    6
12.0-12.9    10
13.0-13.9    5
14.0-14.9    3
15.0-15.9    2
Total        30
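The grouping step can be sketched as follows, assuming the 30 Hb values read from the preceding patient listing are correct. Since each class spans one whole g/dl (9.0 to 9.9, 10.0 to 10.9, and so on), the integer part of a value identifies its class; that shortcut is specific to this class width.

```python
import math
from collections import Counter

# Hb values (g/dl) for patients 1-30, as read from the patient listing.
hb = [12.0, 11.9, 11.5, 14.2, 12.3, 13.0, 10.5, 12.8, 13.2, 11.2,
      11.2, 13.6, 10.8, 12.3, 12.3, 15.7, 12.6, 9.1, 12.9, 14.6,
      14.9, 12.2, 12.2, 11.4, 10.7, 12.5, 11.8, 15.1, 13.4, 13.1]

# Each class is one g/dl wide, so floor(value) is the class's lower limit.
counts = Counter(math.floor(value) for value in hb)

# Frequencies for the classes 9.0-9.9 up to 15.0-15.9.
grouped = [counts[lower] for lower in range(9, 16)]

print("Hb (g/dl)    No. of patients")
for lower, f in zip(range(9, 16), grouped):
    print(f"{lower}.0-{lower}.9".ljust(13) + str(f))
print("Total".ljust(13) + str(sum(grouped)))
```

For classes of a different width, the class index would be computed from the lower limit and the width rather than from the integer part alone.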

53 DIAGRAMS/GRAPHS
Categorical data
--- Bar diagram (one or two groups)
--- Pie diagram
Continuous data
--- Histogram
--- Frequency polygon (curve)
--- Stem-and-leaf plot
--- Box-and-whisker plot

54 Commonly Used Graphs
Histogram: height of bars proportional to frequency; width proportional to the class width (set by the class boundaries)
Bar chart: height proportional to frequency; width not really significant
Frequency polygon: plot points, then connect with straight lines

55 Graphic Presentation of Data the histogram (quantitative data) the bar graph (qualitative data) the frequency polygon (quantitative data)

56 Two-dimensional graphs: Basic Set-Up

57 Histograms

58 Frequency Polygons

59 Example data
68 63 42 27 30 36 28 32
79 27 22 28 24 25 44 65
43 25 74 51 36 42 28 31
28 25 45 12 57 51 12 32
49 38 42 27 31 50 38 21
16 24 64 47 23 22 43 27
49 28 23 19 11 52 46 31
30 43 49 12

60 Stem and leaf plot
Stem-and-leaf of Age, N = 60, Leaf Unit = 1.0
Count  Stem  Leaves
6      1     122269
19     2     1223344555777788888
11     3     00111226688
13     4     2223334567999
5      5     01127
4      6     3458
2      7     49
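The plot above can be reproduced with a short sketch. It assumes the example data on the previous slide are 60 two-digit ages that ran together in the transcript; split that way, they reproduce every row of the plot. The tens digit is the stem and the units digit is the leaf.

```python
from collections import defaultdict

# The 60 ages, split into two-digit values (an assumed reading of the data).
ages = [68, 63, 42, 27, 30, 36, 28, 32,
        79, 27, 22, 28, 24, 25, 44, 65,
        43, 25, 74, 51, 36, 42, 28, 31,
        28, 25, 45, 12, 57, 51, 12, 32,
        49, 38, 42, 27, 31, 50, 38, 21,
        16, 24, 64, 47, 23, 22, 43, 27,
        49, 28, 23, 19, 11, 52, 46, 31,
        30, 43, 49, 12]

# Sorting first means the leaves arrive in ascending order within each stem.
stems = defaultdict(list)
for age in sorted(ages):
    stem, leaf = divmod(age, 10)
    stems[stem].append(leaf)

# One line per stem: count, stem, then the leaves written side by side.
lines = []
for stem, leaves in sorted(stems.items()):
    leaf_text = "".join(str(leaf) for leaf in leaves)
    lines.append(f"{len(leaves):>3} {stem} {leaf_text}")

for line in lines:
    print(line)
```

Unlike a histogram, the stem-and-leaf plot keeps every original value readable while still showing the shape of the distribution.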

61 Bar Graphs
Example: the distribution of risk factors among cases with cardiovascular diseases
• The height of each bar indicates frequency
• Frequency on the Y axis and categories of the variable on the X axis
• The bars should be of equal width and should not touch each other

62 HIV cases enrolment in USA by gender Bar chart

63 HIV cases enrollment in USA by gender Stacked bar chart

64 Grouped Bar Graph

65 Pie diagram: depicts the percentage represented by each alternative as a slice of a circular pie; the larger the slice, the greater the percentage.

66 Pie Chart
Example: the prevalence of different degrees of hypertension in the population
• Circular diagram; the total is 100%
• Divided into segments, each representing a category
• Decide adjacent category
• The size of each slice is proportional to the amount in its category

67

68 General rules for designing graphs
• A graph should have a self-explanatory legend
• A graph should help the reader to understand the data
• Axes labeled, units of measurement indicated
• Scales are important: start with zero (otherwise use a // break)
• Avoid graphs with a three-dimensional impression; they may be misleading (the reader visualizes them less easily)

69 Thank You

