ADVANCED STATISTICS FOR MEDICAL STUDIES Mwarumba Mwavita, Ph.D. School of Educational Studies Research Evaluation Measurement and Statistics (REMS) Oklahoma State University
Statistics A set of methods and rules for organizing, summarizing, and interpreting information. Two categories of statistical procedures are used to organize and interpret data.
Descriptive and Inferential Statistics Descriptive statistics are statistical procedures that are used to summarize, organize, and simplify data. Inferential statistics are techniques used to study samples and make generalizations about the populations from which they were selected.
Descriptive Statistics A descriptive measure computed from the data of a sample is called a statistic. A descriptive measure computed from the data of a population is called a parameter.
Central Tendency A statistical measure that identifies a single score as representative of an entire distribution. The goal of central tendency is to find the single score that is most typical or most representative of the entire group. Mean – commonly referred to as the average. Mode – the most frequent score in a distribution. Median – the middle value in a distribution.
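As a small illustration of the three measures above, the sketch below uses Python's standard statistics module. The scores are hypothetical and were invented only to show that the mean, median, and mode of one distribution can differ.

```python
import statistics

# Hypothetical sample of scores, invented for illustration only.
scores = [2, 3, 3, 3, 4, 5, 6, 7, 12]

mean_score = statistics.mean(scores)      # arithmetic average of all scores
median_score = statistics.median(scores)  # middle value of the sorted scores
mode_score = statistics.mode(scores)      # most frequent score

print("mean:", mean_score, "median:", median_score, "mode:", mode_score)
```

In this made-up sample the single large score (12) pulls the mean above the median, which is one reason the median is often preferred for skewed data.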
Variability Range - highest score minus lowest score. Semi-interquartile range - (Q3 – Q1)/2. Standard Deviation - the standard (typical) distance of scores from the mean. Variance - the mean of the squared deviations from the mean. Coefficient of Variation (CV) - useful for comparing two or more data sets with different units of measurement because it is expressed as a percentage (CV = SD/mean x 100%). Confidence Interval (CI) - a range of values around the point estimate that indicates the precision of that estimate.
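The measures of variability listed above can be sketched with the standard library as follows. The data are hypothetical. Two assumptions are worth noting: the quartiles come from statistics.quantiles with its default method (textbooks use several slightly different quartile rules), and the CV here uses the sample standard deviation (n - 1 in the denominator).

```python
import statistics

# Hypothetical measurements, invented for illustration only.
data = [4, 8, 6, 5, 3, 7, 9, 6]

data_range = max(data) - min(data)               # highest score minus lowest score

q1, q2, q3 = statistics.quantiles(data, n=4)     # quartile cut points (default method)
semi_interquartile_range = (q3 - q1) / 2

pop_variance = statistics.pvariance(data)        # mean of the squared deviations
sample_variance = statistics.variance(data)      # squared deviations divided by n - 1
sample_sd = statistics.stdev(data)               # square root of the sample variance

cv_percent = sample_sd / statistics.mean(data) * 100   # CV = SD / mean x 100%

print(data_range, semi_interquartile_range, pop_variance, sample_sd, round(cv_percent, 1))
```

The population variance matches the slide's definition (mean of the squared deviations); the sample variance is the version normally used when the data are a sample from a larger population.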
Normal distribution A bell-shaped distribution. It is symmetrical about the mean.
Terms IV – Independent variable (treatment) DV – Dependent variable (outcome)
Z-Test Used in hypothesis testing when a sample mean is used to test a hypothesis about an unknown population, generally a population that has received treatment. Note: the parameters (mean and standard deviation) of the population that did not receive treatment are known.
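A minimal sketch of the z-test arithmetic, assuming the untreated population's mean and standard deviation are known as the slide requires. Every number below is hypothetical; the two-tailed p-value is taken from the standard normal distribution via statistics.NormalDist.

```python
from math import sqrt
from statistics import NormalDist

# All numbers are hypothetical, chosen only to illustrate the arithmetic.
population_mean = 80.0   # known mean of the untreated population
population_sd = 12.0     # known standard deviation of the untreated population
sample_mean = 84.0       # mean of the treated sample
n = 36                   # number of individuals in the sample

# Standard error of the mean, then the z statistic.
standard_error = population_sd / sqrt(n)
z = (sample_mean - population_mean) / standard_error

# Two-tailed p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, p = {p_value:.4f}")
```

With these made-up values the standard error is 2 and z is 2.0, which falls just beyond the two-tailed critical value of 1.96 for alpha = .05.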
T-test The t statistic is used to test hypotheses about µ when the value of the population standard deviation is not known. It uses a t-distribution, whose shape depends on the degrees of freedom (the number of scores in a sample that are free to vary; n – 1 for a single sample). Sample size therefore determines which t-distribution is used.
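The one-sample t statistic can be sketched as below. The sample and the hypothesized mean are hypothetical. Because the standard library has no t-distribution, the sketch compares t with a critical value copied from a standard t table (2.306 for df = 8, two-tailed alpha = .05) rather than computing a p-value.

```python
from math import sqrt
import statistics

# Hypothetical sample and null-hypothesis mean, invented for illustration.
sample = [10, 10, 12, 14, 14, 14, 16, 18, 18]
mu_0 = 11.0

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)        # estimates the unknown population SD (n - 1)

estimated_standard_error = sample_sd / sqrt(n)
t = (sample_mean - mu_0) / estimated_standard_error
df = n - 1                                  # degrees of freedom for one sample

critical_t = 2.306                          # t table value: df = 8, two-tailed alpha = .05
reject_null = abs(t) > critical_t

print(f"t({df}) = {t:.2f}, reject H0: {reject_null}")
```

The only difference from the z-test is that the sample standard deviation replaces the unknown population value, which is why the reference distribution changes with the degrees of freedom.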
Independent and Dependent t-test The independent t-test uses two separate samples, one for each treatment condition (rule of thumb: at least 10 subjects per group). The dependent t-test is also referred to as a repeated-measures t-test: a single sample of individuals is measured more than once on the same dependent variable.
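To make the contrast concrete, here is a rough sketch of both t statistics. The function names and the data are hypothetical, the independent version assumes equal population variances (pooled variance), and the groups are kept smaller than the 10-per-group rule of thumb only to keep the arithmetic readable.

```python
from math import sqrt
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and df for two separate samples."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, n_a + n_b - 2

def dependent_t(first_measure, second_measure):
    """t statistic and df for one sample measured twice (difference scores)."""
    differences = [a - b for a, b in zip(first_measure, second_measure)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / sqrt(n)
    t = statistics.mean(differences) / standard_error
    return t, n - 1

# Hypothetical data: two separate groups, then one group measured twice.
treatment = [5, 7, 6, 8, 9]
control = [3, 5, 4, 6, 7]
t_ind, df_ind = independent_t(treatment, control)

before = [8, 7, 9, 6, 10]
after = [6, 6, 6, 5, 7]
t_dep, df_dep = dependent_t(before, after)

print(f"independent: t({df_ind}) = {t_ind:.2f}")
print(f"dependent:   t({df_dep}) = {t_dep:.2f}")
```

The dependent test works on the difference scores within each person, so it has fewer degrees of freedom but removes between-person variability from the error term.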
ANOVA (Analysis of Variance) ANOVA is a hypothesis-testing procedure used to (1) test hypotheses about population variances and (2) evaluate mean differences between two or more treatments (or populations). It uses variances to determine whether the means are significantly different.
ANOVA (Analysis of Variance) 1. Single factor (one way) - one treatment under different levels. 2. Factorial designs - involve more than one factor (treatment). 3. Repeated measures - assess a measurement on the same participants under different conditions or at different times.
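For the single-factor (one-way) case, the idea that ANOVA uses variances to compare means can be sketched as follows. The group labels and scores are hypothetical; the F-ratio is the between-treatments variance divided by the within-treatments variance.

```python
from statistics import mean

# Hypothetical scores for three levels of one treatment, invented for illustration.
groups = {
    "low dose": [1, 2, 3],
    "medium dose": [2, 3, 4],
    "high dose": [6, 7, 8],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)

# Variability of the group means around the grand mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
# Variability of the scores around their own group mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

With these made-up scores F(2, 6) = 21, well above the tabled critical value of 5.14 for alpha = .05, so the group means would be judged significantly different. Factorial and repeated-measures designs partition the variability further and are not shown here.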
Correlation and Regression Analysis Correlation analyses mathematically identify and describe relationships between variables. Regression analysis attempts to predict or estimate the value of a response variable from the known values of one or more explanatory variables.
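A brief sketch of both ideas for the simplest case: the Pearson correlation and a least-squares line with one explanatory variable. The paired data and variable names are hypothetical, and the slide does not specify which correlation or regression method is intended.

```python
from math import sqrt

# Hypothetical paired observations, invented for illustration.
x = [1, 2, 3, 4, 5]   # explanatory variable
y = [2, 4, 5, 4, 5]   # response variable

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products of the deviations from the means.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)

# Correlation: strength and direction of the linear relationship.
r = s_xy / sqrt(s_xx * s_yy)

# Regression: least-squares line used to predict y from x.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
predicted_y_at_6 = intercept + slope * 6

print(f"r = {r:.3f}, y = {intercept:.1f} + {slope:.1f}x")
```

Correlation only describes the relationship; the regression line is what allows a predicted response for a new value of the explanatory variable.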
Factor Analysis Exploratory factor analysis – used when the researcher does not know how many factors are necessary to explain the inter-relationships among a set of characteristics, indicators, or items (reduction). Confirmatory factor analysis – assesses the extent to which the hypothesized organization of a set of identified factors fits the data.
Survival Analysis Survival/failure analysis is a family of techniques dealing with the time it takes for something to happen: a cure, a failure, a relapse, a death, and so on. There are two major varieties of the technique. The first is life tables, which describe the course of survival of one or more groups of cases. The second encompasses a set of regression techniques in which the DV is survival time.
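One common way to describe the course of survival for a group is the Kaplan-Meier (product-limit) estimate. The slide does not name this method, so it is used here as an assumption, and the function name and follow-up data are hypothetical. Cases that leave the study before the event occurs are treated as censored.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time an event occurred.

    times  -- follow-up time for each case
    events -- True if the event (e.g. relapse) was observed, False if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        n_events = sum(1 for ti, e in zip(times, events) if ti == t and e)
        n_leaving = sum(1 for ti in times if ti == t)   # events plus censored cases
        if n_events > 0:
            # Multiply by the proportion of at-risk cases surviving past time t.
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve

# Hypothetical follow-up times in months; False marks a censored case.
times = [2, 3, 3, 5, 8, 8]
events = [True, True, False, True, False, True]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"month {t}: estimated survival = {s:.3f}")
```

The regression-based techniques mentioned on the slide, in which survival time is the DV, need more machinery than fits in a short sketch and are not shown.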
Nonparametric techniques Usually do not state hypotheses in terms of a specific parameter. They make very few assumptions about the population distribution (distribution-free tests). Suited for data measured on ordinal and nominal scales. Not as sensitive as parametric tests; more likely to fail to detect a real difference between two treatments.
Types of nonparametric tests Chi-square test for goodness of fit – tests how well the obtained sample proportions fit the population proportions specified by the null hypothesis. Chi-square test for independence – tests whether or not there is a relationship between two variables.
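Both chi-square tests compare observed frequencies with the frequencies expected under the null hypothesis. The sketch below computes the statistic for each; the counts, the null proportions, and the helper name chi_square are hypothetical. The statistic would then be compared with a tabled critical value for its degrees of freedom (3.84 for df = 1 and 5.99 for df = 2 at alpha = .05).

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Goodness of fit: do the sample proportions match the null-hypothesis proportions?
observed = [30, 50, 20]                  # hypothetical counts in three categories
null_proportions = [0.25, 0.50, 0.25]    # hypothetical proportions stated by H0
n = sum(observed)
expected = [p * n for p in null_proportions]
gof_statistic = chi_square(observed, expected)
gof_df = len(observed) - 1

# Independence: is there a relationship between the row and column variables?
table = [[30, 10],                       # hypothetical 2 x 2 table of counts
         [20, 40]]
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
total = sum(row_totals)
expected_table = [[r * c / total for c in col_totals] for r in row_totals]
ind_statistic = sum(
    chi_square(obs_row, exp_row) for obs_row, exp_row in zip(table, expected_table)
)
ind_df = (len(table) - 1) * (len(table[0]) - 1)

print(f"goodness of fit: chi-square({gof_df}) = {gof_statistic:.2f}")
print(f"independence:    chi-square({ind_df}) = {ind_statistic:.2f}")
```

In the independence test the expected count for each cell is (row total x column total) / grand total, which is what the counts would look like if the two variables were unrelated.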
More Terms Type I error – rejecting a true null hypothesis (concluding that the treatment has an effect when in fact the treatment has no effect). The alpha level for a hypothesis test is the probability that the test will lead to a Type I error.
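The link between alpha and Type I errors can be checked with a small simulation. This is only a sketch under assumed conditions: every sample is drawn from a population in which the null hypothesis is true, a two-tailed z-test with alpha = .05 is applied each time, and all population values are hypothetical.

```python
import random
from math import sqrt

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population in which the null hypothesis is true (no treatment effect).
mu_0, sigma, n = 50.0, 10.0, 25
critical_z = 1.96    # two-tailed critical value for alpha = .05
trials = 5000

rejections = 0
for _ in range(trials):
    sample = [random.gauss(mu_0, sigma) for _ in range(n)]
    sample_mean = sum(sample) / n
    z = (sample_mean - mu_0) / (sigma / sqrt(n))
    if abs(z) > critical_z:
        rejections += 1   # the null is true, so every rejection is a Type I error

type_i_rate = rejections / trials
print(f"proportion of Type I errors: {type_i_rate:.3f}")
```

Because the null hypothesis is true in every trial, the proportion of rejections should land close to .05, the chosen alpha level.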
Scenario 1 Alcohol appears to be involved in a variety of birth defects, including low birth weight and retarded growth. A researcher would like to investigate the effect of prenatal alcohol on birth weight. How will the researcher do this? D.V. I.V. Participants
Scenario 2 A researcher would like to know whether room temperature affects eating behavior. Design I.V. D.V. Others Participants
Scenario 3 A patient recently visited her physician complaining of backache. The physician is aware of a new technique of disc replacement. The physician would like to test the technique but does not want to use it on the patient. What would you advise the physician to do in this case?
Scenario 4 You notice that students from a nearby elementary school that you have attended suffer from the common cold, a disease that has been at the school for a while. How does this school compare to an elementary school across town? How would you go about investigating this problem?
Scenario 5 Suppose you are interested in finding out how a new treatment for osteoporosis among women will work. Design IV DV Others
Scenario 6 Using scenario 5, how can we make it a two-way ANOVA? How could we make it a repeated-measures ANOVA?
Scenario 7 Diabetes has been on the increase among American adolescents. A researcher is interested in determining the factors that contribute to the rise of diabetes among adolescents.
Scenario 8 A physician is interested in finding out the factors that contribute to lung cancer. How would you design this study?
Scenario 9 How would you investigate factors that contribute to high blood pressure among people?
Summary Problem Design issues Variables Participants Sample size