HIM 3200 Midterm Review Dr. Burton. Mid-term review Types of data Normal distribution Variance Standard deviation and z scores 2 X 2 table Hypothesis.


1 HIM 3200 Midterm Review Dr. Burton

2 Mid-term review Types of data Normal distribution Variance Standard deviation and z scores 2 X 2 table Hypothesis testing (H0, HA) t-test Pearson r/Linear regression Chi square

3 Measurements Frequency –Incidence: The frequency of new occurrences of disease, injury, or death in the study population during the time being examined. –Prevalence: The number of persons in a defined population that had a specified disease or condition. –Point prevalence (at a particular point in time). –Period prevalence (the sum of the point prevalence at the beginning of the interval plus the incidence during the interval).

4 Measurements Frequency –Incidence –Prevalence Risk –“The proportion of persons who are unaffected at the beginning of a study period but who undergo the risk event during the study period.”

5 Risk event: –Death –Disease –Injury Cohort: –Persons at risk for the event.

6 Measurements Frequency –Incidence –Prevalence Risk –“The proportion of persons who are unaffected at the beginning of a study period but who undergo the risk event during the study period.” Rates –“The frequency of events that occur in a defined time period, divided by the average population at risk.”

7 Rates Rate = (Numerator / Denominator) x Constant multiplier. The constant multiplier is usually 100, 1000, 10,000 or 100,000. Types of rates: Incidence rates (e.g. per 1000) Prevalence rates (proportional, e.g. 20%) Incidence density (frequency of new events per person-time)
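The rate formula on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original deck: the function name `rate` and the case counts are hypothetical, chosen only to show how the constant multiplier changes the scale of the result.

```python
def rate(numerator: float, denominator: float, multiplier: int = 1000) -> float:
    """Rate = (numerator / denominator) x constant multiplier."""
    if denominator <= 0:
        raise ValueError("denominator (population at risk) must be positive")
    # Multiply first so that whole-number inputs stay exact where possible.
    return numerator * multiplier / denominator


# Hypothetical numbers, for illustration only:
# 25 new cases among 5,000 persons at risk, expressed per 1,000.
incidence_per_1000 = rate(25, 5000, multiplier=1000)

# 200 existing cases in a population of 1,000, expressed as a percentage.
prevalence_percent = rate(200, 1000, multiplier=100)

print(f"Incidence rate: {incidence_per_1000} per 1,000")
print(f"Prevalence: {prevalence_percent}%")
```

The same function covers incidence and prevalence rates; only the numerator, denominator, and multiplier change.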

8 Equations for the most commonly used population data. –(Mortality) Table 1 – 10 p.18 Osborn text –(Morbidity) Table 1 – 11 p. 21 Osborn text

9 Differential and nondifferential error Bias is a differential error –A nonrandom, systematic, or consistent error in which the values tend to be inaccurate in a particular direction. Nondifferential errors are random errors.

10 Bias Three most problematic forms of bias in medicine: –1. Selection (Sampling) Bias: The following are biases that distort results because of the selection process. Admission rate (Berkson’s) bias –Distortions in risk ratios occur as a result of different hospital admission rates among cases with the risk factor, cases without the risk factor, and controls with the risk factor, causing greatly different risk-factor probabilities to interfere with the outcome of interest. Nonresponse bias –e.g. noncompliance of people who have scheduled interviews in their home. Lead time bias –A time differential between diagnosis and treatment among sample subjects may result in erroneous attribution of higher survival rates to superior treatment rather than early detection.

11 Bias Three most problematic forms of bias in medicine: –1. Selection (Sampling) Bias Admission rate (Berkson’s) bias Nonresponse bias Lead time bias –2. Information (misclassification) Bias Recall bias –Differentials in memory capabilities of sample subjects Interview bias –“Blinding” of interviewers to diseased and control subjects is often difficult. Unacceptability bias –Patients reply with “desirable” answers

12 Bias Three most problematic forms of bias in medicine: –1. Selection (Sampling) Bias Admission rate (Berkson’s) bias Nonresponse bias Lead time bias –2. Information (misclassification) Bias Recall bias Interview bias Unacceptability bias –3. Confounding A confounding variable has a relationship with both the dependent and independent variables that masks or potentiates the effect of the variable on the study.

13 Neyman bias “late look bias” if it results in selecting fewer individuals with severe disease because they died before detection. “length bias” in screening programs which tend to select less aggressive cases for treatment.

14 2 X 2 table comparing the test results of two observers
                      Observer No. 1
Observer No. 2    Positive    Negative    Total
Positive          a           b           a + b
Negative          c           d           c + d
Total             a + c       b + d       a + b + c + d

15
              Disease +   Disease −   Total
Test +        A           B           A + B
Test −        C           D           C + D
Total         A + C       B + D       A + B + C + D
Sensitivity = A/(A + C)
Specificity = D/(B + D)
False-positive rate = B/(B + D)
False-negative rate = C/(A + C)
Positive predictive value = A/(A + B)
Negative predictive value = D/(C + D)
Accuracy = (A + D)/(A + B + C + D)
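The formulas on this slide translate directly into code. The sketch below is illustrative only: the function name `diagnostic_metrics` and the cell counts are hypothetical, and it assumes the usual layout in which A and C are the diseased subjects (test positive and test negative) and B and D are the non-diseased subjects.

```python
def diagnostic_metrics(a: int, b: int, c: int, d: int) -> dict:
    """Compute the slide's 2 x 2 table measures.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "false_positive_rate": b / (b + d),
        "false_negative_rate": c / (a + c),
        "positive_predictive_value": a / (a + b),
        "negative_predictive_value": d / (c + d),
        "accuracy": (a + d) / (a + b + c + d),
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(a=80, b=30, c=20, d=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and the false-negative rate sum to 1, as do specificity and the false-positive rate, because each pair shares a denominator.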

16 Types of Variation Nominal variables Dichotomous (Binary) variables Ordinal (Ranked) variables Continuous (Dimensional) variables Ratio variables Risks and Proportions as variables

17 Nominal A O B AB Social Security Number 123 45 6789 312 65 8432 555 44 7777

18 Types of Variation Nominal variables Dichotomous (Binary) variables Ordinal (Ranked) variables Continuous (Dimensional) variables Ratio variables Risks and Proportions as variables

19 Dichotomous (Binary) variables WNL Not WNL Accept Reject Normal Abnormal

20 Types of Variation Nominal variables Dichotomous (Binary) variables Ordinal (Ranked) variables Continuous (Dimensional) variables Ratio variables Risks and Proportions as variables

21 Ordinal (Ranked) variables Strongly agree, agree, neutral, disagree, strongly disagree

22 Types of Variation Nominal variables Dichotomous (Binary) variables Discrete variables Ordinal (Ranked) variables Continuous (Dimensional) variables Ratio variables Risks and Proportions as variables

23 Continuous (Dimensional) variables Height Blood Pressure Weight Temperature 32° F

24 Types of Variation Nominal variables Dichotomous (Binary) variables Discrete variables Ordinal (Ranked) variables Continuous (Dimensional) variables Ratio variables Risks and Proportions as variables

25 Ratio variables A continuous scale that has a true zero point

26 Measures of Central Tendency Mode: the value with the highest number of observations in a data set. Median: the middle observation when data have been arranged from highest to lowest. Mean (arithmetic): the average value of all observed values. Mean = x̄ = Σ(x_i) / N, where Σ = sum, x_i = observed values, N = total number of observations.

27 Raw data and results of cholesterol levels in 26 subjects (p. 115)
Number of observations (N): 26
Initial HDL values: 31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57, 58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90 mg/dl
Highest value: 90 mg/dl
Lowest value: 31 mg/dl
Mode: 47, 48, 58, 60 mg/dl
Median: (57 + 58)/2 = 57.5 mg/dl
Sum of the values, Σ(x_i): 1496 mg/dl
Mean, x̄: 1496/26 = 57.5 mg/dl

28 Percentiles (quantiles) The median is the 50th percentile. The 75th percentile is the point where 75% of observations lie below and 25% are above (3rd quartile, Q3). The 25th percentile is the point where 25% of observations lie below and 75% are above (1st quartile, Q1). Interquartile range = Q3 – Q1

29 Raw data and results of cholesterol levels in 26 subjects (p. 115)
Number of observations (N): 26
Initial HDL values: 31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57, 58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90 mg/dl
Highest value: 90 mg/dl
Lowest value: 31 mg/dl
Mode: 47, 48, 58, 60 mg/dl
Median: (57 + 58)/2 = 57.5 mg/dl
Sum of the values, Σ(x_i): 1496 mg/dl
Mean, x̄: 1496/26 = 57.5 mg/dl
Interquartile range: 64 – 48 = 16 mg/dl

30 Measures of dispersion based on the mean
Mean deviation = Σ(|x_i – x̄|) / N
Variance = s² = Σ(x_i – x̄)² / (N – 1)
Standard deviation = s = √[Σ(x_i – x̄)² / (N – 1)]
Degrees of freedom = N – 1

31 Raw data and results of cholesterol levels in 26 subjects (p. 115)
Number of observations (N): 26
Initial HDL values: 31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57, 58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90 mg/dl
Highest value: 90 mg/dl
Lowest value: 31 mg/dl
Mode: 47, 48, 58, 60 mg/dl
Median: (57 + 58)/2 = 57.5 mg/dl
Sum of the values, Σ(x_i): 1496 mg/dl
Mean, x̄: 1496/26 = 57.5 mg/dl
Interquartile range: 64 – 48 = 16 mg/dl
Sum of squares (TSS): 4,298.46 (mg/dl)²
Variance, s²: 4,298.46/25 = 171.94 (mg/dl)²
Standard deviation, s: √171.94 = 13.1 mg/dl
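The summary statistics on this slide can be checked with Python's standard library. This is a sketch under one stated assumption: the value list below is the 26-value set that reproduces the slide's stated N, sum, modes, and sum of squares; the variable names are illustrative.

```python
import statistics

# HDL values in mg/dl. Assumption: this is the 26-value list consistent
# with the stated N = 26 and sum = 1496 mg/dl.
hdl = [31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57,
       58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90]

n = len(hdl)
total = sum(hdl)
mean = total / n
median = statistics.median(hdl)      # average of the two middle values
modes = statistics.multimode(hdl)    # every value tied for most frequent

# Total sum of squares: squared deviations from the mean.
tss = sum((x - mean) ** 2 for x in hdl)
variance = tss / (n - 1)             # N - 1 degrees of freedom
std_dev = variance ** 0.5

print(f"N = {n}, sum = {total}, mean = {mean:.1f}, median = {median}")
print(f"modes = {modes}")
print(f"TSS = {tss:.2f}, variance = {variance:.2f}, SD = {std_dev:.1f}")
```

Dividing by N – 1 rather than N gives the sample variance, matching the degrees-of-freedom convention on the dispersion slide. Quartiles are left out because software packages use differing interpolation rules and may not reproduce the slide's 64 – 48 exactly.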

32 Theoretical normal (Gaussian) distribution μ stands for the mean in a theoretical distribution. σ stands for the standard deviation in a theoretical population.

33 [Figure: Theoretical normal distribution with standard deviations. Horizontal axis runs from μ – 3σ to μ + 3σ, with the corresponding z scores from –3 to +3.]

34 Three Common Areas Under the Curve Three Normal distributions with different areas

35 Process of Testing Hypotheses Tests are designed to determine the probability that a finding represents a true deviation from what is expected. This chapter focuses on the justification for and interpretation of the p value, which is designed to minimize type I error. Science is based on the following principles: –Previous experience serves as the basis for developing hypotheses; –Hypotheses serve as the basis for developing predictions; –Predictions must be subjected to experimental or observational testing.

36 Hypothesis testing
                    Truth
Decision        H0 True         H0 False
Accept H0       Correct         Type II Error
Reject H0       Type I Error    Correct
Alpha error (Type I): rejecting the null H0 when it is true
Beta error (Type II): accepting the null H0 when it is false

37 The power of a test (the probability that a test detects differences that actually exist) can be determined by using the formula 1 – beta (1 – β). A power of 80% is usually considered acceptable.

38 Hypothesis Testing
1. State the question in terms of: H0: there is no difference or relationship (null); Ha: there is a difference or relationship (alternative)
2. Decide on an appropriate research design and statistic
3. Select the significance (alpha) level and “N”
4. Collect data
5. Analyze and perform calculations to get the P-value
6. Draw and state conclusions by comparing alpha with the P-value
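The final two steps (obtaining a P-value and comparing it with alpha) can be sketched as follows. This is an illustration, not the deck's own procedure: it assumes a test statistic that follows the standard normal distribution, and the function names and the z value of 2.1 are hypothetical.

```python
import math


def two_tailed_p(z: float) -> float:
    """Two-tailed P-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def decide(p_value: float, alpha: float = 0.05) -> str:
    """Step 6: compare the P-value with the chosen alpha level."""
    return "reject H0" if p_value < alpha else "do not reject H0"


z = 2.1  # hypothetical test statistic, for illustration only
p = two_tailed_p(z)
print(f"z = {z}, P = {p:.4f}, decision at alpha = 0.05: {decide(p)}")
```

Alpha is fixed in step 3, before the data are collected, so the decision rule does not depend on the observed result.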

39 [Figure: Theoretical normal distribution with standard deviations. Horizontal axis runs from μ – 3σ to μ + 3σ, with the corresponding z scores from –3 to +3.]
Probability     z = 1     z = 2     z = 3
Upper tail      .1587     .0228     .0013
Two-tailed      .3173     .0455     .0027
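The tail probabilities for z = 1, 2, and 3 can be reproduced from the standard normal distribution using only the Python standard library. A minimal sketch; the function names are illustrative, and the z-score example reuses the mean (57.5 mg/dl) and standard deviation (13.1 mg/dl) from the HDL data on slide 31.

```python
import math


def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def two_tailed(z: float) -> float:
    """P(|Z| > |z|) for a standard normal variable."""
    return math.erfc(abs(z) / math.sqrt(2))


for z in (1, 2, 3):
    print(f"z = {z}: upper tail = {upper_tail(z):.4f}, "
          f"two-tailed = {two_tailed(z):.4f}")

# z score of the highest HDL value (90 mg/dl) from slide 31.
print(f"z for 90 mg/dl: {z_score(90, 57.5, 13.1):.2f}")
```

One minus each two-tailed value gives the familiar central areas of roughly 68%, 95%, and 99.7% within one, two, and three standard deviations of the mean.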

40 When is a specific test used? Student’s t-test: to compare the means of two small (n < 30) independent samples. Paired t-test: to compare the means of two paired samples (e.g. before and after). F-test: to compare means of three or more samples or groups. Chi-square test: to compare two or more independent proportions. Correlation coefficient: measures the strength of the association between two variables. Regression analysis: provides an equation that estimates the change in a dependent variable (y) per unit change in an independent variable (x).

