Presentation is loading. Please wait.

Presentation is loading. Please wait.

BIOSTATISTICS Asst. Prof. NIRUWAN TURNBULL, PhD Faculty of Public Health Mahasarakham University 1.

Similar presentations

Presentation on theme: "BIOSTATISTICS Asst. Prof. NIRUWAN TURNBULL, PhD Faculty of Public Health Mahasarakham University 1."— Presentation transcript:

1 BIOSTATISTICS Asst. Prof. NIRUWAN TURNBULL, PhD Faculty of Public Health Mahasarakham University 1

2 Biostatistics “ Bio-Sadistics ” 2

3 In God we trust. All others must bring data. 3

4 “Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.” H.G. Wells 4

5 5

6  Statistical ideas can be intimidating and difficult  Thus:  Statistical results are often “skipped-over” when reading scientific literature  Data is often mis-interpreted 6

7 “On average, my class is doing well. Half of my students think that 2+2=3, the other half thinks that 2+2=5.” 7

8  A Bar Chart is a map of the locations of the nearest taverns  A p-value is the result of a urinalysis  Martingale residuals are the droppings of a rare bird  A t-test is a blinded taste test between black and green tea 8

9  Spends Saturday nights in the library  Favorite Movie: Revenge of the Nerds  Doesn’t play golf  A necessary evil  SMART! 9

10 “A statistician is a person that is good with numbers but that lacks the personality to become an accountant.” 10

11  The science of collecting, monitoring, analyzing, summarizing, and interpreting data  This includes study design 11

12  Statistics applied to biological (life) problems, including:  Public health  Medicine  Ecological and environmental  Much more statistics than biology, however biostatisticians must learn the biology as well 12

13  Identify and develop treatments for disease and estimate their effects.  Identify risk factors for diseases.  Design, monitor, analyze, interpret, and report results of clinical studies.  Develop statistical methodologies to address questions arising from medical/public health data. 13

14  Combines rigors of mathematics with uncertainties of the real world.  Can make contribution to advancement of science, statistics, medicine, and public health.  Can study diseases/health problems in which you may have an interest (cancers, HIV, reproductive health, …). 14

15  Much of life is composed of a systematic component (i.e., signal) and a random component (i.e., error or noise)  Example:  Smoking is associated with lung cancer.  Yet not everyone that smokes, gets lung cancer, and not everyone that gets lung cancer, smokes  Yet we know that there is an association (a systematic component) 15

16  Our challenge  Identify the systematic component (separate it from the random component), estimate it, and perhaps make inferences with it 16

17  Population  A group of individuals that we would like to know something about  Parameter  A characteristic of the population in which we have a particular interest  Often denoted with Greek letters ( μ, σ, ρ )  Examples:  The proportion of the population that would respond to a certain drug  The association between a risk factor and a disease in a population 17

18  คือ ค่าต่างๆ ที่รวบรวมมาจากประชากรหรือ คำนวณได้จากประชากร ใช้อักษรกรีกเป็น สัญลักษณ์ ได้แก่   แทนค่า ค่าเฉลี่ยเลขคณิต   แทนค่า ส่วนเบี่ยงเบนมาตรฐาน   2 แทนค่า ความแปรปรวน   แทนค่า สัมประสิทธิ์สหสัมพันธ์

19  คือค่าต่างๆ ที่รวบรวมมาจากกลุ่มตัวอย่างหรือ คำนวณได้จากกลุ่มตัวอย่าง ใช้ตัว ภาษาอังกฤษเป็นสัญลักษณ์ ได้แก่  แทนค่า ค่าเฉลี่ยเลขคณิต  s แทนค่า ส่วนเบี่ยงเบนมาตรฐาน  s 2 แทนค่า ความแปรปรวน  r แทนค่า สัมประสิทธิ์สหสัมพันธ์

20  Sample  A subset of a population (hopefully representative)  Statistic  A characteristic of the sample  Examples:  The observed proportion of the sample that responds to treatment  The observed association between a risk factor and a disease in this sample 20

21  Studying populations is too expensive and time- consuming, and thus impractical  If a sample is representative of the population, then by observing the sample we can learn something about the population  And thus by looking at the characteristics of the sample (statistics), we may learn something about the characteristics of the population (parameters) 21

22 Populations and Samples 22 Sample / Statistics x, s, s 2 Population Parameters μ, σ, σ 2

23  Two steps  Descriptive Statistics  Describe the sample  Inferential Statistics  Make inferences about the population using what is observed in the sample  Primarily performed in two ways:  Hypothesis testing  Estimation 23

24  Samples are random  If we had chosen a different sample, then we would obtain different statistics (sampling variation or random variation)  However, note that we are trying to estimate the same (constant) population parameters 24

25  Describe the Sample  Begin one variable at a time  Describe important variables in your analyses (e.g., endpoints, demographics, confounders, etc.) 25

26  Several types of data  Nominal  Ordinal  Discrete  Continuous  Time-to-event with censoring  The type of data influences the analysis methods to be employed 26

27 ScaleCharacteristic Question Examples Nominal Is A different than B?Marital status Eye color Gender Religious affiliation Race Ordinal Is A bigger than B?Stage of disease Severity of pain Level of satisfaction Interval By how many units do A and B differ? Temperature SAT score Ratio How many times bigger than B is A? Distance Length Time until death Weight Height 27

28  Mutually exclusive unordered categories  Examples  Sex (male, female)  Race/ethnicity (white, black, latino, asian, native american, etc.)  Site  Can summarize in:  Tables – using counts and percentages  Bar chart/graph 28

29  Ordered Categories  Examples  Adverse events  Mild, moderate, severe, life-threatening, death  Income  Low, medium, high 29

30  Often only integer numbers are possible  If there are many different discrete values, then discrete data is often treated as continuous  Examples: CD4 count, HIV viral load  If there are very few discrete values, then discrete data is often treated as ordinal 30

31  Any value on the continuum is possible (even fractions or decimals)  Examples:  Height  Weight  Many “discrete” variables are often treated as continuous  Examples: CD4 count, viral load 31

32  Time to an event (continuous variable)  The event does not have to be survival  Concept of “Censoring”  If we follow a person until the event, then the survival time is clear  If we follow someone for a length of time but the event does not occur, the the time is censored (but we still have partial information; namely that the event did not occur during the follow up period)  Examples: time to progression (cancer), time to response, time to relapse, time to death 32

33  Think of data as a rectangular matrix of rows and columns  Simplest structure  Rows represent the “experimental unit” (e.g., person)  Each row is an independent observation  Columns represent “variables” measured on the experimental unit  More complex structures  Multiple rows per person (e.g., multiple timepoints) 33

34 34

35  It is ALWAYS a good idea to summarize your data (at least for important variables)  You become familiar with the data and the characteristics of the sample that you are studying  You can also identify problems with data collection or errors in the data (data management issues)  Range checks for illogical values 35

36  Some visual ways to summarize data (one variable at a time):  Tables  Graphs  Bar charts  Histograms  Box plots 36

37  Summarizes a variable with counts and percentages  The variable is categorical (e.g., nominal or ordinal) 37

38 Cause of Death FrequencyPercent Motor Vehicle48 Drowning14 House Fire12 Homicide77 Other19 Total100 38

39  Note that you can take a continuous variable and create categories with it  How do you create categories for a continuous variable?  Choose cutoffs that are biologically meaningful  Natural breaks in the data  Precedent from prior research 39

40  How to choose the categories?  Talk to physician about risk categories  May look at National Cholesterol Education Program (NCEP) guidelines and categories 40

41 41

42  Bar Graphs  Nominal data  No order to horizontal axis  Histograms  Continuous or ordinal data on horizontal axis  Box Plots  Continuous data 42

43  The width of the plot has no meaning  25% of the data <13  25% of the data 13-16  25% of the data 16-18  25% of the data >18 43

44 1. Guide-to-Biostatistics.pdf Guide-to-Biostatistics.pdf 2. 3. hti/library/lecture_notes/env_health_science_student s/ln_biostat_hss_final.pdf hti/library/lecture_notes/env_health_science_student s/ln_biostat_hss_final.pdf 4. hdpropedeutics/Skripten/Medical%20Biostatistics%20 I%20version%202009-10.pdf hdpropedeutics/Skripten/Medical%20Biostatistics%20 I%20version%202009-10.pdf 5. method.pdf method.pdf 44

45 - Read through Biostatistics book from, then - We need four groups to solve out these problems, then present it, what are they talking about? 1. Statistical Methods for Assessing Biomarkers (P.81-109) 2. Power and Sample Size Considerations in Molecular Biology (P.111-130) 3. Multiple Tests for Genetic Effects in Association Studies (P. 143-168) 4. Statistical Considerations in Assessing Molecular Markers for Cancer Prognosis and Treatment Efficacy (P. 169-189) 45

Download ppt "BIOSTATISTICS Asst. Prof. NIRUWAN TURNBULL, PhD Faculty of Public Health Mahasarakham University 1."

Similar presentations

Ads by Google