Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Introduction to biostatistics By Dr. S. Shaffi Ahamed Asst. Professor Dept. of Family & Community Medicine KKUH.

Similar presentations


Presentation on theme: "1 Introduction to biostatistics By Dr. S. Shaffi Ahamed Asst. Professor Dept. of Family & Community Medicine KKUH."— Presentation transcript:

1 1 Introduction to biostatistics By Dr. S. Shaffi Ahamed Asst. Professor Dept. of Family & Community Medicine KKUH

2 2 This session covers: Background and need to know Biostatistics Definition of Statistics and Biostatistics Types of data Frequency distribution of a data Graphical representation of a data

3 3 What Is Statistics? Why? 1. Collecting Data e.g., Sample, Survey, Observe, Simulate 2. Characterizing Data e.g., Organize/Classify, Count, Summarize 3. Presenting Data e.g., Tables, Charts, Statements 4. Interpreting Results e.g. Infer, Conclude, Specify Confidence Data Analysis Decision- Making © 1984-1994 T/Maker Co.

4 4 Statistics is the science of conducting studies to collect, organize, summarize, analyze, present, interpret and draw conclusions from data. Any values (observations or measurements) that have been collected

5 5 Basis

6 6 Dynamic nature of the U n i v e r s e the very continuous change in Nature brings - uncertainty and - variability in each and every sphere of the Universe

7 7 We by no mean can control or over-power the factor of uncertainty but capable of measuring it in terms of Probability

8 8 Sources of Medical Uncertainties 1. Intrinsic due to biological, environmental and sampling factors 2. Natural variation among methods, observers, instruments etc. 3. Errors in measurement or assessment or errors in knowledge 4. Incomplete knowledge

9 9 Biostatistics is the science that helps in managing medical uncertainties

10 10 “BIOSTATISICS ” (1) Statistics arising out of biological sciences, particularly from the fields of Medicine and public health. (2) The methods used in dealing with statistics in the fields of medicine, biology and public health for planning, conducting and analyzing data which arise in investigations of these branches.

11 11 CLINICAL MEDICINE Documentation of medical history of diseases. Planning and conduct of clinical studies. Evaluating the merits of different procedures. In providing methods for definition of “normal” and “abnormal”.

12 12 PREVENTIVE MEDICINE To provide the magnitude of any health problem in the community. To find out the basic factors underlying the ill-health. To evaluate the health programs which was introduced in the community (success/failure). To introduce and promote health legislation.

13 13 Role of Biostatics in Health Planning and Evaluation In carrying out a valid and reliable health situation analysis, including in proper summarization and interpretation of data. In proper evaluation of the achievements and failures of a health programs.

14 14 Role of Biostatistics in Medical Research In developing a research design that can minimize the impact of uncertainties In assessing reliability and validity of tools and instruments to collect the information In proper analysis of data

15 15 BASIC CONCEPTS Data : Set of values of one or more variables recorded on one or more observational units (singular: Datum) Categories of data 1. Primary data: observation, questionnaire, record form, interviews, survey, 2. Secondary data: census, medical record,registry Sources of data 1. Routinely kept records 2. Surveys (census) 3. Experiments 4. External source

16 16 Variables and Types of Data Variables and Types of Data To gain knowledge about seemingly haphazard events, statisticians collect information for variables, which describe the event. Variables whose values are determined by chance are called random variables Variables is a characteristic or attribute that can assume different values. is also a characteristics of interest, one that can be expressed as a number that possessed by each item under study. The value of this characteristics is likely to change or vary from one item in the data set to the next.

17 17 Variables can be classified By how they are categorized, counted or measured - Level of measurements of data As Quantitative and Qualitative

18 18 Nomenclature Nominal variable: Variable consists of named categories with no implied order among them. Has cancer or not Received treatment or did not Is alive or dead Is coded (male = 1, female = 2) but has no quantitative value.

19 19 Nomenclature (cont.) Ordinal variable: Variable consists of ordered categories and differences between categories are not equal. Patient status (Improved / Same / Worse) Diagnosis (Stage I / Stage II / Stage III) Evaluation (Satisfied / neutral / Dissatisfied) The coding now has meaning: Improved = 2, Same = 1, worse = 0 However, distance between values is not a constant.

20 20 Other Nomenclature (cont.) Interval variable: Variable has equal distances between values but the zero point is arbitrary. IQ 70 to 80 same as IQ 90 to 100. IQ scale could convert 100 to 500 and have same meaning. IQ of 100 is not twice as smart as IQ of 50. Temperature (37º C -- 36º C; 38º C-- 37º C are equal) and No implication of ratio (30º C is not twice as hot as 15º C)

21 21 Other Nomenclature (cont.) Ratio variable: Variable has equal intervals between values and a meaningful zero point. Height, Weight 220 pounds is twice as heavy as 110 pounds. Even when converted to kilos, the ratio stays the same (100 kilos is twice as heavy as 50 kilos).

22 22 Scales of Measurement QualitativeQualitativeQuantitativeQuantitative NumericalNumerical NumericalNumerical NonnumericalNonnumerical DataData NominalNominalOrdinalOrdinalNominalNominalOrdinalOrdinalIntervalIntervalRatioRatio

23 23 Level of Measurements of Data Examples

24 24 Discrete data -- Gaps between possible values Continuous data -- Theoretically, no gaps between possible values Number of Children Hb

25 25 CONTINUOUS DATA QUALITATIVE DATA wt. (in Kg.) : under wt, normal & over wt. Ht. (in cm.): short, medium & tall

26 26 Table 1 Distribution of blunt injured patients according to hospital length of stay

27 27 CLINIMETRICS A science called clinimetrics in which qualities are converted to meaningful quantities by using the scoring system. Examples: (1) Apgar score based on appearance, pulse, grimace, activity and respiration is used for neonatal prognosis. (2) Smoking Index: no. of cigarettes, duration, filter or not, whether pipe, cigar etc., (3) APACHE( Acute Physiology and Chronic Health Evaluation) score: to quantify the severity of condition of a patient

28

29

30 30 INVESTIGATION

31 31 An overview of descriptive statistics and statistical inference Statistical Inference Descriptive Statistics No Yes

32 32 Descriptive & Inferential Statistics Descriptive & Inferential Statistics Inferential statistics consists of generalizing from samples to populations, performing estimations hypothesis testing, determining relationships among variables, and making predictions. Used when we want to draw a conclusion for the data obtain from the sample Used to describe, infer, estimate, approximate the characteristics of the target population Descriptive statistics consists of the collection, organization, classification, summarization, and presentation of data obtain from the sample. Used to describe the characteristics of the sample Used to determine whether the sample represent the target population by comparing sample statistic and population parameter

33 33 Frequency Distributions “ A Picture is Worth a Thousand Words ”

34 34 Frequency Distributions data distribution – pattern of variability. the center of a distribution the ranges the shapes simple frequency distributions grouped frequency distributions

35 35 Simple Frequency Distribution The number of times that score occurs Make a table with highest score at top and decreasing for every possible whole number N (total number of scores) always equals the sum of the frequency  f = N

36 36 Example of a simple frequency distribution 5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1 f 93 82 72 61 54 44 33 23 13  f = 25

37 37 Relative Frequency Distribution Proportion of the total N Divide the frequency of each score by N Rel. f = f/N Sum of relative frequencies should equal 1.0 Gives us a frame of reference

38 38 Example of a simple frequency distribution 5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1 f rel f 93.12 82.08 72.08 61.04 54.16 44.16 33.12 23.12 13.12  f = 25  rel f = 1.0

39 39 Cumulative Frequency Distributions cf = cumulative frequency: number of scores at or below a particular score A score’s standing relative to other scores Count from lower scores and add the simple frequencies for all scores below that score

40 40 Example of a simple frequency distribution 5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1 f rel f cf 93.12 3 82.08 5 72.08 7 61.04 8 54.16 12 44.16 16 33.12 19 23.12 22 13.12 25  f = 25  rel f = 1.0

41 41 Patien t No Hb (g/dl) Patien t No Hb (g/dl) Patien t No Hb (g/dl) 112.01111.22114.9 211.91213.62212.2 311.51310.82312.2 414.21412.32411.4 512.31512.32510.7 613.01615.72612.5 710.51712.62711.8 812.8189.12815.1 913.21912.92913.4 1011.22014.63013.1 Tabulate the hemoglobin values of 30 adult male patients listed below

42 42 Steps for making a table Step1 Find Minimum (9.1) & Maximum (15.7) Step2 Calculate difference 15.7 – 9.1 = 6.6 Step3 Decide the number and width of the classes (7 c.l) 9.0 -9.9, 10.0-10.9,---- Step4 Prepare dummy table – Hb (g/dl), Tally mark, No. patients

43 43 Hb (g/dl)Tall marksNo. patients 9.0 – 9.9 10.0 – 10.9 11.0 – 11.9 12.0 – 12.9 13.0 – 13.9 14.0 – 14.9 15.0 – 15.9 Total Hb (g/dl) Tall marksNo. patients 9.0 – 9.9 10.0 – 10.9 11.0 – 11.9 12.0 – 12.9 13.0 – 13.9 14.0 – 14.9 15.0 – 15.9 l lll llll 1 llll lll ll 1 3 6 10 5 3 2 Total-30 DUMMY TABLE Tall Marks TABLE

44 44 Hb (g/dl)No. of patients 9.0 – 9.9 10.0 – 10.9 11.0 – 11.9 12.0 – 12.9 13.0 – 13.9 14.0 – 14.9 15.0 – 15.9 1 3 6 10 5 3 2 Total30 Table Frequency distribution of 30 adult male patients by Hb

45 45 Table Frequency distribution of adult patients by Hb and gender: Hb and gender: Hb (g/dl) GenderTotal MaleFemale <9.0 9.0 – 9.9 10.0 – 10.9 11.0 – 11.9 12.0 – 12.9 13.0 – 13.9 14.0 – 14.9 15.0 – 15.9 0 1 3 6 10 5 3 2 2358642023586420 4 8 14 16 9 5 2 Total30 60

46 46 Elements of a Table Ideal table should have Number Title Column headings Foot-notes Number – Table number for identification in a report Title,place - Describe the body of the table, variables, Time period (What, how classified, where and when) Column - Variable name, No., Percentages (%), etc., Heading Foot-note(s) - to describe some column/row headings, special cells, source, etc.,

47 47 DIAGRAMS/GRAPHS Qualitative data (Nominal & Ordinal) --- Bar charts (one or two groups) Quantitative data (discrete & continuous) --- Histogram --- Frequency polygon (curve) --- Stem-and –leaf plot --- Box-and-whisker plot

48 48 Example data 6863422730362832 7927222824254465 4325745136422831 2825451257511232 4938422731503821 1624644723224327 4928231911524631 30434912

49 49 Histogram Figure 1 Histogram of ages of 60 subjects

50 50 Polygon

51 51 Cumulative Frequency Polygon Cumulative counts can be converted to percents. Shows number cases up to & including all within the interval. % # Common in vital statistics 50 30

52 52 Example data 6863422730362832 7927222824254465 4325745136422831 2825451257511232 4938422731503821 1624644723224327 4928231911524631 30434912

53 53 Stem and leaf plot Stem-and-leaf of Age N = 60 Leaf Unit = 1.0 6 1 122269 19 2 1223344555777788888 (11) 3 00111226688 13 4 2223334567999 5 5 01127 4 6 3458 2 7 49

54 54 Box plot

55 55 Descriptive statistics report: Boxplot - minimum score - maximum score - lower quartile - upper quartile - median - mean - the skew of the distribution: positive skew: mean > median & high-score whisker is longer negative skew: mean < median & low-score whisker is longer

56 56 Application of a box and Whisker diagram

57 57 The prevalence of different degree of Hypertension in the population Pie Chart Circular diagram – total -100% Divided into segments each representing a category Decide adjacent category The amount for each category is proportional to slice of the pie

58 58 Bar Graphs The distribution of risk factor among cases with Cardio vascular Diseases Heights of the bar indicates frequency Frequency in the Y axis and categories of variable in the X axis The bars should be of equal width and no touching the other bars

59 59 HIV cases enrolment in USA by gender Bar chart

60 60 HIV cases Enrollment in USA by gender Stocked bar chart

61 61 Graphic Presentation of Data the histogram (quantitative data) the bar graph (qualitative data) the frequency polygon (quantitative data)

62 62

63 63 General rules for designing graphs A graph should have a self-explanatory legend A graph should help reader to understand data Axis labeled, units of measurement indicated Scales important. Start with zero (otherwise // break) Avoid graphs with three-dimensional impression, it may be misleading (reader visualize less easily

64 64 Any Questions


Download ppt "1 Introduction to biostatistics By Dr. S. Shaffi Ahamed Asst. Professor Dept. of Family & Community Medicine KKUH."

Similar presentations


Ads by Google