Chapter Eleven A Primer for Descriptive Statistics.

Descriptive Statistics: a variety of tools, conventions, and procedures for describing variables and the relationships between variables.

Measurement is the process of assigning numbers to phenomena according to a set of rules. Levels of Measurement. Nominal: involves no underlying continuum; the assignment of numeric values is arbitrary. Examples: religious affiliation, gender, etc.

Levels of Measurement. Ordinal: implies an underlying continuum; values are ordered but intervals are not equal. Examples: community size, Likert items, etc.

Levels of Measurement Cont. Ratio: involves an underlying continuum; the numeric values assigned reflect equal intervals; the zero point is aligned with true zero. Examples: weight, age in years, percent minority.

Data Distributions. A data distribution is a listing of all the values for any one variable. The most basic technique for presenting a large data set is to create a frequency distribution table: a systematic listing of all the values on a variable, from the lowest to the highest, with the number of times (frequency) each value was observed.
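A frequency distribution table of this kind can be sketched in a few lines of Python. The data and the helper name `frequency_table` are hypothetical, chosen only to illustrate the idea.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, frequency) pairs ordered from lowest to highest value."""
    counts = Counter(values)
    return sorted(counts.items())


# Hypothetical data: number of children reported by ten respondents.
children = [2, 0, 1, 2, 3, 2, 0, 1, 2, 4]
table = frequency_table(children)

print("Value  Frequency")
for value, freq in table:
    print(f"{value:>5}  {freq:>9}")
```

The frequencies always sum to the number of cases, which is a quick check that no value was dropped.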

Normal Distribution. A normal distribution roughly follows a bell-shaped curve. Other shapes include the bimodal distribution (2 peaks, e.g., male and female body weight), the platykurtic distribution (flat and wide, a great deal of variability), and the leptokurtic distribution (peaked, little variability).

Measures of Central Tendency. A measure of central tendency is a single numeric value that summarizes the data set in terms of its “average” value. E.g., the nurse researcher uses the value of 98.6 °F or 37 °C to describe the average adult body temperature.

Measures of Central Tendency. Mean: calculated by summing the values and dividing by the number of cases. Median: calculated by ordering a set of values and then using the middlemost value (in cases of two middle values, calculate the mean of the two values). Mode: the most frequently occurring value.
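As a rough illustration, the three measures can be computed with Python's standard `statistics` module. The scores below are hypothetical; the list has an even number of cases so that the median falls between two middle values.

```python
import statistics

# Hypothetical scores, used only for illustration.
scores = [3, 5, 5, 6, 8, 9]

mean_score = statistics.mean(scores)      # sum of values / number of cases
median_score = statistics.median(scores)  # mean of the two middle values (5 and 6)
mode_score = statistics.mode(scores)      # most frequently occurring value

print(mean_score, median_score, mode_score)
```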

Measures of Dispersion. Range: calculated by subtracting the lowest value from the highest value in a set of values. Standard Deviation: a measure reflecting the average amount of deviation from the mean in a set of values.

Dispersion Cont. Variance: this measure is simply the standard deviation squared.
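A minimal sketch of the three dispersion measures follows, using hypothetical values. One assumption to flag: it uses the population formulas (`pstdev`, `pvariance`, dividing by N); the slides do not say whether N or N - 1 is intended, and `statistics.stdev` / `statistics.variance` give the N - 1 versions.

```python
import statistics

# Hypothetical values, used only for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(values) - min(values)   # highest value minus lowest value
sd = statistics.pstdev(values)            # population standard deviation
variance = statistics.pvariance(values)   # population variance

print(value_range, sd, variance)
# The variance is the standard deviation squared (up to rounding error).
print(abs(sd ** 2 - variance) < 1e-9)
```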

Standardizing Data. To standardize data is to report data in a way that allows comparisons between units of different sizes.

Standardizing Data. Proportions: a proportion represents the part of 1 that some element represents. A so-called batting average is actually a proportion because it represents: \( BA = \frac{\text{Number of Hits}}{\text{Number of At Bats}} \)

Percentage: a proportion may be converted to a percentage by multiplying by 100. If a player's batting “average” is .359, we could convert that to a percentage by multiplying by 100. In this case, the percentage of the time the person gets a hit is 35.9%. In short, a percentage represents how often something happens per 100 times.
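The conversion from proportion to percentage can be sketched as follows. The counts of 79 hits in 220 at bats are hypothetical, picked only because they reproduce a batting average of about .359.

```python
# Hypothetical counts chosen to give a batting average near .359.
hits, at_bats = 79, 220

batting_average = hits / at_bats     # a proportion: part of 1
percentage = batting_average * 100   # hits per 100 at bats

print(f"proportion = {batting_average:.3f}, percentage = {percentage:.1f}%")
```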

Percentage Change: a measure of how much something has changed over a given time period. Percentage change is: \( \frac{\text{Time 2} - \text{Time 1}}{\text{Time 1}} \times 100 \). Thus, if there were 25 nurses now compared to 17 five years earlier, the percentage change over the 5-year period would be: \( \frac{25 - 17}{17} \times 100 = 47.1\% \)
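The nurse example can be checked with a short function. The name `percentage_change` is an illustrative choice, not something from the text.

```python
def percentage_change(time1, time2):
    """Change from time1 to time2, expressed as a percentage of time1."""
    return (time2 - time1) / time1 * 100


# 17 nurses five years ago, 25 nurses now.
nurse_change = percentage_change(17, 25)
print(f"{nurse_change:.1f}%")
```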

Rates: a rate represents the frequency of something for a standard-sized unit. Divorce rates, suicide rates, and crime rates are examples. So if we had 104 suicides in a population of 757,465, the suicide rate per 100,000 would be calculated as follows: \( SR = \frac{104}{757{,}465} \times 100{,}000 = 13.73 \). I.e., there are 13.73 suicides per 100,000 population.
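A small sketch of the rate calculation, dividing the number of events by the population and multiplying by the standard base of 100,000. The function name `rate_per` and its `base` parameter are hypothetical.

```python
def rate_per(events, population, base=100_000):
    """Frequency of an event per `base` members of the population."""
    return events / population * base


# 104 suicides in a population of 757,465.
suicide_rate = rate_per(104, 757_465)
print(f"{suicide_rate:.2f} per 100,000")
```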

Ratio: a ratio represents a comparison of one thing to another. So if there are 200 suicides per 100,000 in the U.S. and 57 per 100,000 in Canada, the U.S./Canadian suicide ratio is: \( \frac{\text{U.S. suicide rate}}{\text{Canadian suicide rate}} = \frac{200}{57} = 3.51 \)

Normal Distribution. Much data in the social and physical world is “normally distributed”. If it is, this means that there will be a few low values, many more clustered toward the middle, and a few high values. In a normal distribution: the curve is symmetrical and bell-shaped; the mean, mode, and median are similar; about 2/3 of cases (68.3%) fall within ± 1 standard deviation of the mean; and about 95% of cases (95.4%) fall within ± 2 standard deviations of the mean.
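The shares of cases within 1 and 2 standard deviations of the mean can be computed from the standard normal curve. This sketch uses `statistics.NormalDist` from the Python standard library (3.8+); the helper name `share_within` is hypothetical. It gives roughly 68.3% and 95.4%.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of cases within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2):
    print(f"within ±{k} SD: {share_within(k) * 100:.1f}%")
```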

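Z Scores. A Z score represents the distance, in standard deviation units, of any value in a distribution from the mean. The Z score formula is as follows: \( Z = \frac{X - \bar{X}}{sd} \), where \( X \) is the value, \( \bar{X} \) is the mean, and \( sd \) is the standard deviation.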

Exercise: Suppose: Income Mean = $72,000; SD = $18,000. Education Mean = 11 years; SD = 4 years.

Subject   Income    Education
Case 1    80,000    14
Case 2    70,000    10
Case 3    91,000    19
Case 4    56,000     8

Calculation, Case 1:
Case 1 Z (income): \( \frac{80{,}000 - 72{,}000}{18{,}000} = .44 \)
Case 1 Z (education): \( \frac{14 - 11}{4} = .75 \)
SES score, Case 1 = .44 + .75 = 1.19
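The same calculation can be run for all four cases. This is a sketch that takes the income and education values as read from the exercise table; the names `z_score` and `ses_scores` are illustrative, and the SES score is simply the sum of the two Z scores, as in the Case 1 calculation.

```python
def z_score(value, mean, sd):
    """Distance of a value from the mean, in standard deviation units."""
    return (value - mean) / sd


INCOME_MEAN, INCOME_SD = 72_000, 18_000
EDUCATION_MEAN, EDUCATION_SD = 11, 4

# (income, years of education) for each case in the exercise.
cases = {
    "Case 1": (80_000, 14),
    "Case 2": (70_000, 10),
    "Case 3": (91_000, 19),
    "Case 4": (56_000, 8),
}

ses_scores = {}
for name, (income, education) in cases.items():
    z_income = z_score(income, INCOME_MEAN, INCOME_SD)
    z_education = z_score(education, EDUCATION_MEAN, EDUCATION_SD)
    ses_scores[name] = z_income + z_education
    print(f"{name}: Z(income) = {z_income:.2f}, "
          f"Z(education) = {z_education:.2f}, SES = {ses_scores[name]:.2f}")
```

Because both variables are expressed in standard deviation units, they can be added even though one is measured in dollars and the other in years.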

Areas Under the Normal Curve. Draw the normal curve, including lines to represent the problem; calculate the Z score(s) for the problem; look up the value in Table 11.14; solve the problem, recalling that .5 of cases fall above the mean and .5 below; convert the proportion to a percentage, if needed.

Exercise: Suppose you wished to know what percentage of cases will fall above $100,000 in a sample whose mean is $65,000 and whose SD is $22,000 (see p. 370 of the text). \( Z = \frac{100{,}000 - 65{,}000}{22{,}000} = 1.59 \). Look up 1.59 in Table 11.14 (p. 368): the value is .4441. Then .5000 - .4441 = .0559 (proportion), and .0559 x 100 = 5.6% (percentage).
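The table lookup can be approximated in code. This sketch assumes `statistics.NormalDist` (Python 3.8+) as a stand-in for Table 11.14, and rounds Z to two decimals so that it matches the value that would be looked up by hand.

```python
from statistics import NormalDist

mean, sd, cutoff = 65_000, 22_000, 100_000

# Z score for the cutoff, rounded to two decimals as in a printed table.
z = round((cutoff - mean) / sd, 2)

# Proportion of cases above the cutoff: the area to the right of z.
proportion_above = 1 - NormalDist().cdf(z)
percentage_above = proportion_above * 100

print(f"Z = {z}, proportion = {proportion_above:.4f}, "
      f"percentage = {percentage_above:.1f}%")
```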

Describing Relationships Between Variables. 1. Crosstabular Analysis: used with a nominal dependent variable. We cross-classify the information to show the relation between an independent and a dependent variable. A standard table looks like the following:

Rules for Crosstabular Tables: in the table title, name the dependent variable first; place the dependent variable on the vertical axis; place the independent variable on the horizontal plane; use clear variable labels; run % figures toward the independent variable; report % to one decimal point; report statistical data below the table; interpret by comparing % in categories of the independent variable.
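A crosstabulation that follows these rules might be sketched as below. The respondents, variable names, and the helper `column_percentages` are all hypothetical. The sketch reads "run % figures toward the independent variable" as computing percentages within each category of the independent variable, so each column sums to 100; that reading is an assumption.

```python
from collections import Counter

# Hypothetical respondents as (independent, dependent) = (gender, opinion).
records = (
    [("Male", "Agree")] * 30 + [("Male", "Disagree")] * 20
    + [("Female", "Agree")] * 45 + [("Female", "Disagree")] * 15
)


def column_percentages(pairs):
    """Percent of each independent-variable category in each dependent category."""
    cell_counts = Counter(pairs)
    column_totals = Counter(independent for independent, _ in pairs)
    return {
        (independent, dependent): round(100 * n / column_totals[independent], 1)
        for (independent, dependent), n in cell_counts.items()
    }


percentages = column_percentages(records)

# Dependent variable on the vertical axis, independent across the top.
print(f"{'Opinion':<10}{'Male':>8}{'Female':>8}")
for opinion in ("Agree", "Disagree"):
    male = percentages[("Male", opinion)]
    female = percentages[("Female", opinion)]
    print(f"{opinion:<10}{male:>8.1f}{female:>8.1f}")
```

The table is then interpreted by comparing the percentages across the columns, for example 60.0% of males against 75.0% of females agreeing in this made-up data.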

2. Comparing Means: used when the dependent variable is ratio level; the means are compared across categories of the independent variable; both the t-test and ANOVA may be used. Presentation may be as follows:

t Test. The t-test is used to determine whether the difference in the means of two groups is statistically significant. It is used with samples under 30, when comparing 2 groups on a ratio-level dependent variable.
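The slides do not give a formula, so the following is only a sketch of one common version: the independent-samples t statistic with a pooled variance, which assumes the two groups have similar variances. The data and the function name `pooled_t` are hypothetical.

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Independent-samples t statistic with pooled variance, plus degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (N - 1)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df


# Hypothetical scores for two small groups.
t_value, degrees_of_freedom = pooled_t([4, 5, 6, 7, 8], [1, 2, 3, 4, 5])
print(f"t = {t_value:.2f}, df = {degrees_of_freedom}")
```

The t value is then compared with the critical value for its degrees of freedom in a t table; with 8 degrees of freedom the two-tailed .05 critical value is 2.306, so a t of 3.00 would be judged statistically significant at that level.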

Analysis of Variance (ANOVA). ANOVA is used when the means of 3 or more groups are compared, or when the means for 2 or more groups are compared at 2 or more points in time in a single analysis (e.g., a pre-post experimental design). It computes a ratio that compares 2 kinds of variability: within-group and between-groups variability.
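The ratio of between-groups to within-group variability can be sketched for the simplest case, a one-way ANOVA. The groups and the function name `f_ratio` are hypothetical, and the sketch covers only the comparison of 3 or more group means, not the repeated-measures design.

```python
import statistics


def f_ratio(groups):
    """One-way ANOVA F: between-groups mean square over within-group mean square."""
    all_values = [value for group in groups for value in group]
    grand_mean = statistics.mean(all_values)

    # Between-groups: how far each group mean lies from the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2
        for group in groups
    )
    # Within-group: how far each value lies from its own group mean.
    ss_within = sum(
        (value - statistics.mean(group)) ** 2
        for group in groups
        for value in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical scores for three groups.
f_value = f_ratio([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F = {f_value:.2f}")
```

A large F means the group means differ by more than the spread inside the groups would suggest.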

3. Correlation: used with ratio-level variables; the interest is in both the equation and the strength of the correlation. Y = a + bX is the general equation; r is the symbol used to report the strength of the correlation, and it can vary from -1.0 to +1.0.
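A least-squares sketch of the equation Y = a + bX and of r follows. The data are hypothetical and are not the sample data set from the slides; `fit_line` is an illustrative name.

```python
import math


def fit_line(xs, ys):
    """Return (a, b, r) for the least-squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx                  # slope
    a = mean_y - b * mean_x        # intercept: value of Y when X is 0
    r = sxy / math.sqrt(sxx * syy) # strength of the correlation, -1.0 to +1.0
    return a, b, r


# Hypothetical data, used only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b, r = fit_line(xs, ys)

predicted = a + b * 6  # predicted Y for X = 6
print(f"Y = {a:.2f} + {b:.2f}X, r = {r:.2f}, predicted Y at X = 6: {predicted:.2f}")
```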

Sample Data Set (X, Y): (2, 3), (3, 4), (5, 4), (7, 6), (8, …)

Figure: plot of the sample data set, with X (0 to 8) on the horizontal axis and Y (0 to 8) on the vertical axis.

Figure: the same plot with the regression line drawn.

Figure: the same plot showing where the a value and the b value (slope, h/b) are read.

Figure: the same plot showing a predicted value.
