© 2006 Dr Rotimi Adigun. Recommended text  High Yield Biostatistics, Epidemiology and Public Health,Michael Glaser, Fourth edition.  Pre-Test Preventive.

Presentation transcript:

© 2006 Dr Rotimi Adigun

Recommended texts
• High Yield Biostatistics, Epidemiology and Public Health, Michael Glaser, fourth edition.
• Pre-Test Preventive Medicine and Public Health, Sylvie Ratelle (Chapter 1), for practice questions.

I. Descriptive statistics
• Populations, samples and elements
• Probability
• Types of data
• Frequency distribution
• Measures of central tendency
• Measures of variability
• Z-scores
II. Inferential statistics
• Statistics and parameters
• Estimating the mean
• T-scores
• Hypothesis testing
• Steps of hypothesis testing
• Z-tests
• The meaning of statistical significance
• Type I and II errors
• Power
• Differences between groups

III. Correlation and predictive techniques
• Correlation
• Survival analysis
IV. Research methods
• Sampling techniques
• Assessing the evidence
• Hierarchy of evidence
• Systematic review
• Clinical decision making

Descriptive Statistics: Populations, Samples, Elements, Types of Data, Measures of Central Tendency, Measures of Variability, Z-scores.

Descriptive statistics (DSs)
• A way to summarize data from a sample or a population
• DSs illustrate the shape, central tendency, and variability of a set of data
• The shape of data has to do with the frequencies of the values of observations

Population
• A population is the group from which a sample is drawn
• e.g., automobile crash victims in an emergency room, or student scores on Block I exams at Windor
• In research, it is usually not practical to include all members of a population
• Thus, a sample (a subset of a population) is taken

Elements
• A single observation, such as one student's score, is an element, denoted by X
• The number of elements in a population is denoted by N
• The number of elements in a sample is denoted by n

Data
• Measurements or observations of a variable
Variable
• A characteristic that is observed or manipulated
• Can take on different values

Independent variables
• Precede dependent variables in time
• Are often manipulated by the researcher
• The treatment or intervention that is used in a study
Dependent variables
• What is measured as an outcome in a study
• Values depend on the independent variable

Independent and dependent variables: example. If you study the effect of different drugs on colon cancer, the drugs, their dosage, and their timing are the independent variables; the effect of the different drugs, dosages, and timings on the cancer is the dependent variable.

• Suppose we compare the effect of stimulants on memory and retention.
• The independent variables would be the different stimulants and dosages. The dependent variables would be the different outcomes (improved retention, decreased retention, or no effect).

• Mathematically, on a graph or in an equation where Y depends on the value of X, Y is a function of X, written Y = f(X)
• The independent variable is usually plotted on the X axis, while the dependent variable is plotted on the Y axis

Parameters
• Summary data from a population
Statistics
• Summary data from a sample

• Central tendency describes the location of the middle of the data
• Variability (a.k.a. dispersion) is the extent to which values are spread above and below the middle values
• DSs can be distinguished from inferential statistics
• DSs are not capable of testing hypotheses

Mean (a.k.a. average)
• The most commonly used DS
• To calculate the mean, add all values of a series of numbers and then divide by the total number of elements

• Mean of a sample: $\bar{X} = \frac{\sum X}{n}$
• Mean of a population: $\mu = \frac{\sum X}{N}$
• $\bar{X}$ (X bar) refers to the mean of a sample and $\mu$ refers to the mean of a population
• $\sum X$ is an instruction to add all of the X values
• n is the total number of values in the series of a sample and N is the same for a population
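The mean calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name `mean` and the `ages` values are hypothetical and do not come from the slides.

```python
def mean(values):
    """Arithmetic mean: the sum of all values divided by how many there are."""
    if not values:
        raise ValueError("mean of an empty series is undefined")
    return sum(values) / len(values)


# Hypothetical sample of n = 5 ages (illustrative values only)
ages = [28, 35, 40, 45, 52]
sample_mean = mean(ages)  # (28 + 35 + 40 + 45 + 52) / 5
print(sample_mean)  # 40.0
```

The same function applies to a population; only the notation changes (N and the population mean instead of n and X bar).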

Mode
• The most frequently occurring value in a series
• The modal value is the highest bar in a histogram

Median
• The value that divides a series of values in half when they are all listed in order
• When there is an odd number of values, the median is the middle value
• When there is an even number of values, count from each end of the series toward the middle and then average the 2 middle values
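The rules for the mode and the median can be written out as a short Python sketch. The function names and the example series are assumptions made for illustration; `modes` returns a list because a series can have more than one most frequent value.

```python
from collections import Counter


def median(values):
    """Middle value of the ordered series (average of the 2 middle values if even)."""
    if not values:
        raise ValueError("median of an empty series is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: the middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the 2 middle values


def modes(values):
    """All values that occur with the greatest frequency."""
    if not values:
        raise ValueError("mode of an empty series is undefined")
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


print(median([3, 1, 7]))     # 3
print(median([1, 3, 7, 9]))  # 5.0
print(modes([2, 3, 3, 5]))   # [3]
```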

• Each of the three methods of measuring central tendency has certain advantages and disadvantages
• Which method should be used?
• It depends on the type of data being analyzed (e.g., categorical or continuous) and the level of measurement involved

• There are 4 levels of measurement: nominal, ordinal, interval, and ratio
1. Nominal
• Data are coded by a number, name, or letter that is assigned to a category or group
• No ordering
• Examples
  • Gender (e.g., male, female)
  • Race
  • Treatment preference (e.g., surgery, radiotherapy, hormone, chemotherapy)

2. Ordinal
• Similar to nominal because the measurements involve categories
• However, the categories are ordered by rank
• Examples
  • Pain level (e.g., mild, moderate, severe)
  • Military rank (e.g., lieutenant, captain, major, colonel, general)
  • Opinion (e.g., agree, strongly agree)
  • Severity of dysplasia (mild, moderate, severe)

• Ordinal values only describe order, not quantity
• Thus, severe pain is not the same as 2 times mild pain
• The only mathematical operations allowed for nominal and ordinal data are counting of categories
• e.g., 25 males and 30 females

3. Interval
• Measurements are ordered (like ordinal data)
• Have equal intervals
• Do not have a true zero
• Example: the Fahrenheit scale, where 0° does not correspond to an absence of heat (no true zero), in contrast to Kelvin, which does have a true zero

4. Ratio
• Measurements have equal intervals
• There is a true zero
• Ratio is the most advanced level of measurement and can handle most types of mathematical operations, including multiplication and division, which are not meaningful on interval scales

Ratio examples
• Range of motion
  • No movement corresponds to zero degrees
  • The interval between 10 and 20 degrees is the same as between 40 and 50 degrees
• Lifting capacity
  • A person who is unable to lift scores zero
  • A person who lifts 30 kg can lift twice as much as one who lifts 15 kg

• NOIR is a mnemonic to help remember the names and order of the levels of measurement
• Nominal, Ordinal, Interval, Ratio

Measurement scale | Permissible mathematical operations | Best measure of central tendency
Nominal | Counting | Mode
Ordinal | Greater-than or less-than operations | Median
Interval | Addition and subtraction | Symmetrical: mean; skewed: median
Ratio | Addition, subtraction, multiplication and division | Symmetrical: mean; skewed: median

• Histograms of frequency distributions have shape
• Distributions are often symmetrical, with most scores falling in the middle and fewer toward the extremes
• Most biological data are symmetrically distributed and form a normal curve (a.k.a. bell-shaped curve)

[Figure: line depicting the shape of the data]

• Data that form a normal curve have a normal distribution (a.k.a. Gaussian distribution)
• Properties of a normal distribution
  • It is symmetric about its mean
  • The highest point is at its mean
  • The height of the curve decreases as one moves away from the mean in either direction, approaching, but never reaching, zero

[Figure: normal curve with the mean marked. The distribution is symmetric about its mean; the highest point of the overlying normal curve is at the mean; as one moves away from the mean in either direction, the height of the curve decreases, approaching, but never reaching, zero.]

[Figure: normal distribution, where mean = median = mode]

• In skewed distributions, the data are not distributed symmetrically
• Consequently, the mean, median, and mode are not equal and are in different positions
• Scores are clustered at one end of the distribution
• A small number of extreme values are located at the opposite end

• Skew is always in the direction of the longer tail (not the hump)
• Positive if skewed to the right
• Negative if skewed to the left
• Of the three measures, the mean is shifted the most toward the tail

• Because the mean is shifted so much, it is not the best estimate of the average score for skewed distributions
• The median is a better estimate of the center of skewed distributions
• It is the central point of any distribution: 50% of the values are above the median and 50% are below it
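A small numerical sketch can make the effect of skew concrete. The data below are hypothetical lengths of hospital stay (in days), invented for illustration, with one extreme value forming a long right tail; the calls come from Python's standard `statistics` module.

```python
import statistics

# Hypothetical, positively skewed data: most stays are short, one is very long
stay_days = [2, 3, 3, 4, 4, 4, 5, 5, 40]

mean_stay = statistics.mean(stay_days)      # pulled toward the long right tail
median_stay = statistics.median(stay_days)  # middle value of the ordered series
mode_stay = statistics.mode(stay_days)      # most frequent value

print(f"mean = {mean_stay:.2f}, median = {median_stay}, mode = {mode_stay}")

# The single extreme value shifts the mean well above the median,
# while the median and mode stay where most of the data lie.
```

Here the mean is 70/9, or about 7.8 days, while the median and mode are both 4 days, which is why the median is the better summary of a skewed series.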

[Figure: mean, median and mode]

Summary statistics
• Midrange: $\frac{\text{smallest observation} + \text{largest observation}}{2}$
• Mode: the value which occurs with the greatest frequency, i.e., the most common value

Summary statistics
• Median: the observation which lies in the middle of the ordered observations
• Arithmetic mean (mean): $\frac{\text{sum of all observations}}{\text{number of observations}}$
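Of the summary statistics listed above, the midrange is the simplest to compute. A minimal sketch follows; the function name and the observations are hypothetical.

```python
def midrange(values):
    """Midrange: the average of the smallest and largest observations."""
    if not values:
        raise ValueError("midrange of an empty series is undefined")
    return (min(values) + max(values)) / 2


observations = [3, 5, 8, 12, 24]  # hypothetical observations
print(midrange(observations))  # 13.5
```

Because it uses only the two extreme observations, the midrange is very sensitive to outliers.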

• The mean represents the average of a group of scores, with some of the scores being above the mean and some below
• This spread of scores around the mean is referred to as variability or spread
• Measures of variability
  • Range
  • Variance
  • Standard deviation
  • Semi-interquartile range
  • Coefficient of variation
  • "Standard error"

• The standard deviation (SD) is the average amount of spread in a distribution of scores
• The next slide shows a group of 10 patients whose mean age is 40 years
• Some are older than 40 and some younger

[Figure: ages spread out along an X axis. The amount by which the ages are spread out is known as dispersion or spread.]

[Figure: deviation of each age from the mean. Adding the deviations always equals zero.]

• To find the average deviation, one would normally total the deviations of the scores above and below the mean and then divide by the number of values
• However, the total of the deviations always equals zero, because the positive and negative deviations cancel
• The deviations must therefore first be squared, which removes the negative signs
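The reasoning above (deviations sum to zero, so they are squared before averaging) can be followed step by step in code. The ten ages are hypothetical values chosen to have a mean of 40; they are not the data from the original slide. The slide's own formula did not survive, so both common conventions are shown as an assumption: dividing by N for a population and by n - 1 for a sample.

```python
import math

# Hypothetical ages of 10 patients with a mean of 40 years
ages = [30, 35, 38, 40, 40, 40, 42, 45, 45, 45]

n = len(ages)
mean_age = sum(ages) / n

# Deviations of each age from the mean: positives and negatives cancel
deviations = [x - mean_age for x in ages]
print(sum(deviations))  # 0.0

# Squaring removes the negative signs
sum_of_squares = sum(d ** 2 for d in deviations)

population_variance = sum_of_squares / n    # divide by N for a population
sample_variance = sum_of_squares / (n - 1)  # divide by n - 1 for a sample

# The SD is the square root of the variance, back in the original units (years)
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)
print(round(population_sd, 2), round(sample_sd, 2))
```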

• The symbol for the SD of a sample is S; for a population it is σ
• The variance (S²) is not in the same units as the data (age), but the SD is


[Figure: three distributions with the same mean but different spread: mean = 7, SD = 0; mean = 7, SD = 0.63; mean = 7, SD = 4.04]

• About 68.3% of the area under a normal curve is within one standard deviation (SD) of the mean
• About 95.5% is within two SDs
• About 99.7% is within three SDs
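These proportions can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); the helper name `area_within` is made up for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def area_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k) * 100:.2f}%")
```

The results agree with the rounded figures quoted above and hold for any normal distribution, whatever its mean and SD.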



Z-scores
• A z-score is the number of SDs that a specific score is above or below the mean in a distribution
• Raw scores can be converted to z-scores by subtracting the mean from the raw score and then dividing the difference by the SD: $z = \frac{X - \text{mean}}{\text{SD}}$

• If the element lies above the mean, it will have a positive z-score
• If it lies below the mean, it will have a negative z-score
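The conversion from a raw score to a z-score, and the sign convention just described, can be sketched as follows. The function name and the example numbers (a mean of 70 and an SD of 10) are chosen only for illustration.

```python
def z_score(raw_score, mean, sd):
    """Number of SDs by which raw_score lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("the SD must be positive")
    return (raw_score - mean) / sd


print(z_score(85, mean=70, sd=10))  # 1.5  (above the mean: positive)
print(z_score(50, mean=70, sd=10))  # -2.0 (below the mean: negative)
print(z_score(70, mean=70, sd=10))  # 0.0  (exactly at the mean)
```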

Standardization
• The process of converting raw scores to z-scores
• The resulting distribution of z-scores will always have a mean of zero, an SD of one, and an area under the curve equal to one
• The proportion of scores that are higher or lower than a specific z-score can be determined by referring to a z-table
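A quick check of the claim that standardized scores have a mean of zero and an SD of one. The exam scores are hypothetical, and the population SD (`statistics.pstdev`) is used here as an assumption so that the result is exactly one.

```python
import statistics

# Hypothetical exam scores
scores = [55, 60, 65, 70, 75, 80, 85]

mu = statistics.mean(scores)       # 70
sigma = statistics.pstdev(scores)  # population SD: 10.0

# Standardize: subtract the mean, then divide by the SD
z_scores = [(x - mu) / sigma for x in scores]
print(z_scores)

print(statistics.mean(z_scores))    # mean of the z-scores
print(statistics.pstdev(z_scores))  # SD of the z-scores
```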

[Figure: refer to a z-table to find the proportion under the curve]

[Table: partial z-table (to z = 1.5) showing proportions of the area under a normal curve for different values of z. Each proportion corresponds to the area under the curve shaded in black.]

• Tables of z-scores state what proportion of any normal distribution lies above or below any given z-score, not just z-scores of ±1, 2, or 3
• Z-score tables can be used to
  • Determine the proportion of the distribution above or below a certain score (e.g., finding the proportion of the class scoring below 65% on an exam)
  • Find the scores that divide the distribution into certain proportions (e.g., finding the score that separates the top 5% of the class from the remaining 95%)
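Both uses of a z-table can be reproduced with `NormalDist` from Python's standard `statistics` module, which plays the role of the printed table. The exam distribution below (mean 70, SD 10) is an assumption made for illustration; the slides do not give the class mean or SD.

```python
from statistics import NormalDist

# Hypothetical exam scores, assumed to be normally distributed
exam = NormalDist(mu=70, sigma=10)

# Use 1: proportion of the distribution below (or above) a given score
below_65 = exam.cdf(65)  # area under the curve to the left of 65
above_65 = 1 - below_65  # area to the right of 65
print(f"below 65: {below_65:.3f}, above 65: {above_65:.3f}")

# Use 2: the score that separates the top 5% from the remaining 95%
cutoff_top_5 = exam.inv_cdf(0.95)
print(f"top 5% score above about {cutoff_top_5:.1f}")
```

`cdf` answers "what proportion lies below this score?", and `inv_cdf` answers the reverse question, "which score has this proportion below it?".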

• Z-scores allow us to specify the probability that a randomly picked element will lie above or below a particular score
• For example, if we know that 5% of the population has a heart rate above 86.5 beats/min, then the probability of one randomly selected person from this population having a heart rate above 86.5 beats/min is 5%

• What is the probability that a randomly picked person would have a heart rate of less than 50 beats per minute in a population with a mean heart rate of 70 beats per minute and an SD of 10?
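One way to work through this question, assuming heart rates in the population are normally distributed (the slide does not say so explicitly): convert 50 beats/min to a z-score, then look up the area below that z-score. The sketch uses `NormalDist` from Python's standard `statistics` module in place of a printed z-table.

```python
from statistics import NormalDist

mean_hr = 70  # population mean heart rate (beats/min), from the question
sd_hr = 10    # population SD (beats/min), from the question

# Step 1: convert the raw score to a z-score
z = (50 - mean_hr) / sd_hr  # 50 is 2 SDs below the mean
print(z)  # -2.0

# Step 2: find the proportion of the normal curve below that z-score
p_below_50 = NormalDist(mu=0, sigma=1).cdf(z)
print(f"P(heart rate < 50) is about {p_below_50:.4f}")
```

Under the normality assumption the probability is about 0.023, or roughly 2.3%. This is consistent with about 95.5% of values lying within two SDs of the mean, leaving about 2.3% in each tail.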