Introduction to Statistics

Presentation transcript:

Introduction to Statistics Do I have to??

Why we “do it” "What we really want to get at [in health care research] is not how many reports have been done, but how many people's lives are being bettered by what has been accomplished. In other words, is it being used, is it being followed, is it actually being given to patients—... What effect is it having on people—" Rep. John Porter (R-IL), retired chairman House Appropriations Subcommittee on Labor, Health and Human Services (HHS), and Education

Is Statistics Important? Statistics is important because we can use it to find out whether something we observe can be applied to new and different situations. Knowing this allows us to plan for the future, and to make decisions about how to allocate our scarce resources of money, energy, and ultimately life. We use the term generalizable: can what we know help to predict what will happen in new and different situations?

Why Statistics Scientific knowledge represents the best understanding that has been produced by means of current evidence. Research design, if used properly, strengthens the objectivity of the research. Statistical methods allow us to compare what is actually observed to what is logically expected.

Why Statistics (cont’d) Knowledge of statistics is: useful in conducting investigations; helpful in preparing and evaluating research proposals; vital in deciding whether the claims of a researcher are valid; a way to keep abreast of current developments; an aid to effective presentation of the findings.

Evils of Pickle Eating Pickles are associated with all the major diseases of the body. Eating them breeds war and Communism. They can be related to most airline tragedies. Auto accidents are caused by pickles. There exists a positive relationship between crime waves and consumption of this fruit of the cucurbit family. For example

Evils of Pickle Eating (cont’d) Nearly all sick people have eaten pickles. 99.9% of all people who die from cancer have eaten pickles. 100% of all soldiers have eaten pickles. 96.8% of all Communist sympathizers have eaten pickles. 99.7% of the people involved in air and auto accidents ate pickles within 14 days preceding the accident. 93.1% of juvenile delinquents come from homes where pickles are served frequently. Evidence points to the long-term effects of pickle eating. Of the people born in 1839 who later dined on pickles, there has been a 100% mortality.

Evils of Pickle Eating (cont’d) All pickle eaters born between 1849 and 1859 have wrinkled skin, have lost most of their teeth, have brittle bones and failing eyesight, if the ills of pickle eating have not already caused their death. Even more convincing is the report of a noted team of medical specialists: rats force-fed with 20 pounds of pickles per day for 30 days developed bulging abdomens. Their appetites for WHOLESOME FOOD were destroyed.

Evils of Pickle Eating (cont’d) In spite of all the evidence, pickle growers and packers continue to spread their evil. More than 120,000 acres of fertile U.S. soil are devoted to growing pickles. Our per capita consumption is nearly four pounds. Eat orchid petal soup. Practically no one has as many problems from eating orchid petal soup as they do with eating pickles. EVERETT D. EDINGTON

Types of Statistics: Descriptive Statistics. Enumerate, organize, summarize, and categorize; graphical representation of data. These types of statistics describe the data. Examples: means and frequency of outcomes; charts and graphs.

Types of Statistics: Inferential Statistics. Drawing conclusions from incomplete information. They make predictions about a larger population given a smaller sample. These are thought of as the statistical tests. Examples: t-test, chi-square test, ANOVA, regression.

Variables. J.D. Bramble, Ph.D., Creighton University Medical Center, Med 483 -- Fall 2006

Types of Data. Qualitative: data fall into separate classes with no numerical relationship (sex, mortality, correct/incorrect, etc.). Quantitative: numerical data that are continuous (pharmaceutical costs, LOS, etc.).

Parameters and Statistics. Parameters: characteristics of the population; calculating the exact population parameter is often impractical or impossible. Statistics: characteristics of the sample; they represent summary measures of observed values.

Types of Variables. Variables are symbols to which numerals or values are assigned, e.g., X and Y are variables. Dependent (Y’s): that which is predicted. Independent (X’s): that which predicts. Extraneous (Confounding or Control): statistical models “adjust” for their influence.

Independent variables. Independent variables are the presumed cause of change in the dependent variable: the variable responsible for the change in the phenomena being observed. Nothing is for sure, so avoid the word ‘cause’ and think in terms of independent and dependent variables.

Dependent variables. Also referred to as the outcome variable: the outcome of the changes due to the independent variables. Example: in y = a + bx, y is the dependent variable and x is the independent variable.

Confounding variables. Additional variables that may affect the changes in the dependent variable attributed to the independent variables. These variables are controlled by measuring them, and statistical methods adjust for their influence. Sometimes referred to as control variables.

Active vs. attribute variables. Active variables are those under the control of the researcher (controlled experimental studies), e.g., amount of drug administered. Attribute variables cannot be manipulated by the researcher (quasi-experimental studies), e.g., sex or age of subject; blood pressure; smoker.

The Wrong data Leads to Migraines

Levels of Measurement. Categorical Variables: Nominal Scale, Ordinal Scale. Continuous Variables: Interval Scale, Ratio Scale.

Continuous Variables. Continuous variables are measured and can take on any value along the scale: quantitative variables measured on an interval or ratio level. Examples: age, income, number of medications.

Categorical Variables. Categorical variables are measured as dichotomous or polytomous measures: qualitative variables measured on a nominal or ordinal level. Examples: sex; smoking status; ownership. Continuous variables can also be categorized.
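
Categorizing a continuous variable can be sketched in a few lines of Python. The age cutoffs and category labels below are hypothetical, chosen only for illustration; they do not come from the slides.

```python
def age_group(age):
    """Map a continuous age in years to an ordinal category.

    The cutoffs (18 and 65) are illustrative assumptions.
    """
    if age < 18:
        return "child"
    if age < 65:
        return "adult"
    return "older adult"

ages = [4, 17, 18, 42, 64, 65, 80]       # continuous (ratio-level) data
groups = [age_group(a) for a in ages]    # ordinal categories
print(groups)
```

Note that categorizing discards information: 18 and 64 land in the same group even though they are far apart on the original scale.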

Nominal measurement scale. Used for qualitative data. Two or more levels of measurement. The name of the groups does not matter. Examples: Sex (Male/Female); Smoker (Yes/No); Political Party (Rep, Dem, Ind).

Ordinal measurement scale. All the properties of nominal plus . . . The groups are ordered or ranked. Intervals between groups are not necessarily equal. Examples: Income (low, med, high); disease severity; Likert scales.

Interval measurement scale. All properties of nominal and ordinal plus . . . A scale is used to measure the response of the study subjects. The interval scale’s units are equal but arbitrary (e.g., a relative scale). Example: temperature on the Fahrenheit scale.

Ratio measurement scale. All properties of the previous scales plus . . . An absolute zero point. Can perform mathematical operations. Highest level of measurement. Examples: income, age, height, weight.

Summarizing Data: Measures of Central Tendency and Variation. The mean is our usual concept of an overall average - add up the items and divide them by the number of sharers (100 candy bars collected for five kids next Halloween will yield 20 for each in a just world). The median, a different measure of central tendency, is the half-way point. If I line up five kids by height, the median child is shorter than two and taller than the other two (who might have trouble getting their mean share of the candy). A politician in power might say with pride, "The mean income of our citizens is $15,000 per year." The leader of the opposition might retort, "But half our citizens make less than $10,000 per year." Both are right, but neither cites a statistic with impassive objectivity. The first invokes a mean, the second a median.

Mean. Arithmetic mean: the balance point. Sum all observations, then divide the sum by the number of observations. Means are higher than medians in cases like the income example because one millionaire may outweigh hundreds of poor people in setting a mean, but he can balance only one mendicant in calculating a median.

Median. Divides the distribution into two equal parts. Considered the most “typical” observation. Less sensitive to extreme values.
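
The pull of one extreme value on the mean, described on the Mean slide, can be seen in a short Python sketch. The income figures are invented for illustration and are not data from the presentation.

```python
from statistics import mean, median

# Hypothetical yearly incomes: nine modest earners and one millionaire.
incomes = [8_000, 9_000, 9_500, 10_000, 10_000,
           10_500, 11_000, 12_000, 13_000, 1_000_000]

mean_income = mean(incomes)      # pulled upward by the single extreme value
median_income = median(incomes)  # midpoint of the ordered data

print(f"mean   = {mean_income:,.0f}")
print(f"median = {median_income:,.0f}")
```

With these made-up numbers the mean is more than ten times the median, even though nine of the ten people earn 13,000 or less.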

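Calculating Medians. To find the median value, use the location formula q(n+1) with q = 0.5. Data: 41, 28, 34, 36, 26, 44, 39, 32, 40, 35, 36, 33. Order the data in ascending order: 26, 28, 32, 33, 34, 35, 36, 36, 39, 40, 41, 44. Apply the median location formula: 0.5(12+1) = 6.5. Note: this is ONLY the location of the median. The median itself is the average of the 6th and 7th ordered values: (35 + 36)/2 = 35.5.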
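
The location rule on this slide can be sketched in Python as follows. The function name `median_by_location` is a hypothetical helper introduced for illustration; it returns both the location and the median so the two are not confused.

```python
def median_by_location(values):
    """Find the median using the location formula 0.5 * (n + 1)."""
    ordered = sorted(values)
    n = len(ordered)
    location = 0.5 * (n + 1)              # 1-based position in the ordered data
    lower = ordered[int(location) - 1]    # value at the whole part of the location
    if location.is_integer():
        return location, lower            # odd n: the location hits one value
    upper = ordered[int(location)]        # even n: average the two neighbours
    return location, (lower + upper) / 2

data = [41, 28, 34, 36, 26, 44, 39, 32, 40, 35, 36, 33]
location, med = median_by_location(data)
print(location, med)
```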

Quantiles. Quantiles are those values that divide the distribution into n equal parts so that there is a given proportion of data below each quantile. The median is the middle quantile. Quartiles are also very common (25, 50, 75). If we divide the distribution into 100 parts, we have percentiles.
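
A minimal sketch of quartiles for the twelve values from the Calculating Medians slide, assuming the same q(n+1) location rule with linear interpolation between neighbouring values. Other interpolation conventions exist and give slightly different answers; the helper name `quantile` is hypothetical.

```python
def quantile(values, q):
    """Return the q-th quantile (0 < q < 1) using the location rule q * (n + 1)."""
    ordered = sorted(values)
    n = len(ordered)
    location = q * (n + 1)                # 1-based, may be fractional
    location = min(max(location, 1), n)   # clamp to the ends of the data
    below = int(location)                 # whole part of the location
    fraction = location - below
    if below >= n:
        return float(ordered[-1])
    # Interpolate linearly between the two neighbouring ordered values.
    return ordered[below - 1] + fraction * (ordered[below] - ordered[below - 1])

data = [41, 28, 34, 36, 26, 44, 39, 32, 40, 35, 36, 33]
quartiles = [quantile(data, q) for q in (0.25, 0.50, 0.75)]
print(quartiles)
```

Python's built-in `statistics.quantiles(data, n=4)` should give the same cut points with its default method, which is based on the same n + 1 positions.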

Mode. The observation that occurs most frequently. Graphically it is the value at the peak of the distribution. A frequency distribution may be bimodal (two modes). If all values occur equally often, no mode exists.
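
A short Python sketch of finding modes. The helper `describe_modes` is hypothetical and encodes one assumption: when every distinct value is equally frequent, it reports no mode. The bimodal data are invented for illustration.

```python
from statistics import multimode

def describe_modes(values):
    """Return the list of modes, or an empty list when every value ties."""
    modes = multimode(values)             # all values sharing the top frequency
    if len(modes) > 1 and len(modes) == len(set(values)):
        return []                         # every distinct value is equally common
    return modes

unimodal = [26, 28, 32, 33, 34, 35, 36, 36, 39, 40, 41, 44]
bimodal = [1, 2, 2, 2, 3, 4, 5, 5, 5, 6]  # hypothetical data with two peaks
no_mode = [1, 2, 3, 4]

print(describe_modes(unimodal))
print(describe_modes(bimodal))
print(describe_modes(no_mode))
```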

Single Modal

Bimodal Example

Symmetrical: The relationship between the Mean, Median, & Mode

Positive Skew: The relationship between the Mean, Median, & Mode

Negative Skew: The relationship Between the Mean, Median, & Mode

Summarizing Data. Frequency distributions. Measures of central tendency: the tendency of data to center around certain numerical and ordinal values. Three common measures: mean, median, & mode. Measures of variation: standard deviation.

Five Figure Summary: Median; Quartiles (lower and upper); Maximum; Minimum. Can be shown in a box and whisker plot.
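
As a sketch, the five figures can be computed with Python's standard library for the twelve values used on the Calculating Medians slide. The quartile convention is an assumption: `statistics.quantiles` is used with its default method, and other conventions may give slightly different quartiles.

```python
from statistics import quantiles

data = [41, 28, 34, 36, 26, 44, 39, 32, 40, 35, 36, 33]

q1, q2, q3 = quantiles(data, n=4)   # lower quartile, median, upper quartile
five_number = (min(data), q1, q2, q3, max(data))

print("min, Q1, median, Q3, max =", five_number)
```

These are the five values a box and whisker plot draws: the box spans Q1 to Q3 with a line at the median, and the whiskers reach toward the minimum and maximum.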

Which Measure? Mean: numerical data, symmetric distribution. Median: ordinal data, skewed distribution. Mode: bimodal distribution, most popular value.

Variation. Must also report measures of variation. Measures of variability reflect the degree to which data differ from one another as well as from the mean. Together the mean and variability help describe the characteristics of the data and show how distributions vary from one another.

Example of Variation. Take the following three sets of data: 1) 10, 8, 5, 5, 2; 2) 5, 6, 6, 7, 6; 3) 6, 6, 6, 6, 6. In all three cases the mean is 6, but the variability differs: there is a lot of variability in set 1, little in set 2, and none in set 3. We will discuss three measures of variability: 1) the range; 2) the standard deviation; and 3) the variance.
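
The three data sets above can be compared with a short Python sketch. It assumes the sample versions of the variance and standard deviation (dividing by n - 1), which is what `statistics.variance` and `statistics.stdev` compute.

```python
from statistics import mean, stdev, variance

data_sets = {
    "set 1": [10, 8, 5, 5, 2],
    "set 2": [5, 6, 6, 7, 6],
    "set 3": [6, 6, 6, 6, 6],
}

summary = {}
for name, values in data_sets.items():
    summary[name] = {
        "mean": mean(values),
        "range": max(values) - min(values),
        "variance": variance(values),   # sample variance, divides by n - 1
        "sd": stdev(values),            # square root of the sample variance
    }
    print(name, summary[name])
```

All three means are 6, while the ranges are 8, 2, and 0 and the sample variances are 9.5, 0.5, and 0.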

Measures of Variation. Range: the difference between the highest and the lowest observations, Range = x_max - x_min. It has limited usefulness since it only accounts for the extreme values. Can also report the inter-quartile range (q3 - q1).

Standard Deviation. Most widely used & preferred measure of variation. Represented by the symbol s or sd. The square root of the variance (s^2). Larger values = more heterogeneous distribution. At least 75% of the observations lie between x̄ - 2s and x̄ + 2s, whatever the shape of the distribution. If the distribution is normal (bell shaped): about 68% of observations lie within x̄ ± 1s, 95% within x̄ ± 2s, and 99.7% within x̄ ± 3s.
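
The normal-curve percentages can be checked with a small Python sketch using the standard library's `NormalDist`. The 75% figure is taken here to be Chebyshev's lower bound, 1 - 1/k^2 with k = 2, which holds for any distribution; that reading is an assumption about the slide's intent.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Share of a normal distribution within k standard deviations of the mean.
coverage = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

# Chebyshev's bound: at least this share lies within k sd for ANY distribution.
chebyshev = {k: 1 - 1 / k**2 for k in (2, 3)}

for k, share in coverage.items():
    print(f"within {k} sd (normal): {share:.1%}")
print(f"within 2 sd (any distribution): at least {chebyshev[2]:.0%}")
```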

Variance and Std Deviation Standard Deviation
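
The usual sample definitions are given below, on the assumption that they match what this slide displayed. Here n is the number of observations and x̄ is the sample mean.

```latex
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1},
\qquad
s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
```

For the data 10, 8, 5, 5, 2 with mean 6, the squared deviations sum to 16 + 4 + 1 + 1 + 16 = 38, so s^2 = 38/4 = 9.5 and s is about 3.08.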

Example. Data on the sexual activity of male and female subjects can be found in Chatterjee, Handcock, and Simonoff (1995), A Casebook for a First Course in Statistics. New York: Wiley. They provide data on the reported number of sexual partners for 1682 females and 1850 males. The dependent variable is the number of reported partners.

Descriptive Statistics
          Male (n=1850)   Female (n=1685)
Mean      10.9            3.4
Median    4               1
Mode      1               1

Using Excel When Syntax is Known. Write the formulas right into the spreadsheet. Be sure to start with an equal sign. Use your mouse to highlight the data to analyze.

Using Excel When Syntax is Unknown. Use the wizard and follow the instructions. All wizards work about the same way. Select the fx button to select the appropriate test. Select the category and then the desired test.

Follow the Wizard Either highlight the array or just write it in These icons reduce/enlarge the Wizard box