Descriptive Statistics Unit 6. Variable Any characteristic (data) recorded for the subjects of a study ex. blood pressure, nesting orientation, phytoplankton.


1 Descriptive Statistics Unit 6

2 Variable Any characteristic (data) recorded for the subjects of a study, e.g. blood pressure, nesting orientation, phytoplankton count. Can be classified as either: -categorical -quantitative: *discrete *continuous

3 Categorical Variable Data belongs to one of a set of categories Exs: 1.Gender (Male or Female) 2.Pets owned (dog, cat, great white…) 3.Type of food imported (beef, pork, shellfish …) 4.Engage in 30 minutes of exercise daily (Yes or No) Type of graph(s): bar, pie

4 Pie Charts Summarizes categorical variable Drawn as circle where each category is a slice The size of each slice is proportional to the percentage in that category

5 Bar Graphs Summarizes categorical variable Vertical bars for each category Height of each bar represents either counts or percentages Easier to compare categories with bar graph than with pie chart Called Pareto Charts when ordered from tallest to shortest
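As a rough illustration of how a categorical variable is summarized, the sketch below counts each category, converts the counts to percentages, and orders the bars from tallest to shortest as in a Pareto chart. The `pets` responses are made up for this example and do not come from the slides.

```python
from collections import Counter

# Hypothetical survey responses (not from the slides).
pets = ["dog", "cat", "dog", "fish", "cat", "dog", "bird", "dog"]

counts = Counter(pets)
total = sum(counts.values())

# most_common() lists categories from highest to lowest count,
# which is the bar ordering used in a Pareto chart.
pareto_order = counts.most_common()

for category, count in pareto_order:
    percent = 100 * count / total
    # One '#' per observation gives a simple text bar.
    print(f"{category:<5} {'#' * count:<5} {count} ({percent:.1f}%)")
```

The same percentages would set the slice sizes in a pie chart.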

6 Quantitative Variable Data is given numerical values for different magnitudes. Exs: 1.Age of test subjects 2.Number of siblings 3.Seasonal changes in pH of pond water Type of graph: scatter-plot, line, stem and leaf

7 Quantitative vs. Categorical For Quantitative variables, key features are the center (a representative value) and spread (variability). For Categorical variables, a key feature is the percentage of data in each of the categories

8 Discrete Quantitative Variable Quantitative variable is discrete if its possible values form a set of separate numbers: 0,1,2,3,…. Exs: 1.Number of calico cats sold 2.Number of nests with down linings 3.Number of students who fall asleep in Stats class

9 Continuous Quantitative Variable Quantitative variable is continuous if its possible values form an interval Measurements Examples: 1.Height/Weight 2.Age 3.Blood pressure

10

11

12 Most Common Way to Describe Data Central tendency Statistical variation

13 Central Tendency Used to represent entire data set Highlights distribution of data Measures one of the following: mode, mean, and median

14 Mode Value that occurs most often Highest bar in the histogram Mode is most often used with categorical data Best if not used alone

15 Mode examples: (a) 12, 12, 13, 14, 14, 15, 15, 15, 15, 37, 38 (b) 2, 3, 3, 4, 5, 5 - bimodal (c) 65, 68, 69, 71, 72, 73, 75, 77 - mode?
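The three data sets on this slide can be checked with a short sketch. It uses `statistics.multimode` from the Python standard library; the helper `modes_or_none` is a hypothetical name, and treating "every value occurs once" as "no mode" is a convention assumed here rather than something the slide states.

```python
from statistics import multimode

datasets = {
    "one mode": [12, 12, 13, 14, 14, 15, 15, 15, 15, 37, 38],
    "bimodal": [2, 3, 3, 4, 5, 5],
    "all unique": [65, 68, 69, 71, 72, 73, 75, 77],
}

def modes_or_none(values):
    """Return the most frequent value(s), or [] when every value occurs once."""
    modes = multimode(values)
    # multimode returns every value when all counts tie at 1;
    # this sketch treats that case as "no mode".
    if len(modes) == len(values):
        return []
    return modes

for label, values in datasets.items():
    print(label, modes_or_none(values))
```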

16 Mean The sum of the observations divided by the number of observations. Measure of centermost point when there is a symmetrical distribution of values in a data set. Mean = Σx / n, where Σ = sum and n = total number of values

17 Data: 8 g/cm³, 10 g/cm³, 7 g/cm³, 9 g/cm³. Mean = (8 g/cm³ + 10 g/cm³ + 7 g/cm³ + 9 g/cm³) / 4 = 34 g/cm³ / 4 = 8.5 g/cm³
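The same arithmetic, written as a minimal sketch with the four density values from this slide:

```python
densities = [8, 10, 7, 9]  # g/cm³, values from the worked example

total = sum(densities)   # Σx
n = len(densities)       # number of observations
mean = total / n         # Σx / n

print(f"sum = {total} g/cm³, n = {n}, mean = {mean} g/cm³")
```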

18 Median Midpoint of the observations when ordered from least to greatest Used when there are extremes in data 1. Order observations 2. If the number of observations is: a)Odd, the median is the middle observation b)Even- the median is the average of the two middle observations
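The odd and even cases above can be sketched as a small function. The name `median_of` is chosen only for this illustration; the standard library's `statistics.median` gives the same results.

```python
def median_of(values):
    """Median following the slide's steps: order, then pick the middle."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)          # 1. order the observations
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # 2a. odd: the middle observation
        return ordered[mid]
    # 2b. even: average of the two middle observations
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_of([7, 3, 5]))           # odd number of observations
print(median_of([2, 3, 3, 4, 5, 5]))  # even number of observations
```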

19

20 Central Tendency If data set has normal distribution: mean, median and mode are the same value If data set is not distributed normally: values of central tendency will vary. *requires inferential statistics: t-test, ANOVA

21

22 Comparing the Mean and Median Mean and median of a symmetric distribution are close. In a skewed distribution, the mean is pulled farther toward the long tail than the median
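A quick way to see this is to compare the two measures on one data set with a long upper tail and one that is roughly symmetric. The sketch below reuses two data sets from the mode slide (slide 15); reusing them here is a choice made for this illustration.

```python
from statistics import mean, median

skewed = [12, 12, 13, 14, 14, 15, 15, 15, 15, 37, 38]  # two large values in the tail
symmetric = [65, 68, 69, 71, 72, 73, 75, 77]            # roughly balanced

skewed_mean, skewed_median = mean(skewed), median(skewed)
sym_mean, sym_median = mean(symmetric), median(symmetric)

# The two high values pull the mean above the median.
print(f"skewed:    mean = {skewed_mean:.2f}, median = {skewed_median}")
# With no long tail, the two measures stay close together.
print(f"symmetric: mean = {sym_mean:.2f}, median = {sym_median}")
```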

23 Statistical Variation Shows how scores differ from one another AKA: variation, dispersion, spread Describes how far values typically fall from the center of the data Four measures of variation: range, interquartile range, standard deviation, variance

24 Range Most general measure of variation Measures difference between highest and lowest values: spread of data Ex. pH 6, 6, 6, 7, 7, 7, 7, 5, 3 range: 7 - 3 = 4 pH units

25 Range Range is strongly affected by outliers.
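A minimal sketch of the range calculation on the pH readings from slide 24, followed by the same calculation with the single lowest reading dropped, to show how much one extreme value moves the result. The helper name `value_range` is hypothetical.

```python
def value_range(values):
    """Range = highest value minus lowest value."""
    return max(values) - min(values)

ph = [6, 6, 6, 7, 7, 7, 7, 5, 3]
ph_without_lowest = [v for v in ph if v != 3]  # drop the one reading of 3

print(value_range(ph))                 # 7 - 3
print(value_range(ph_without_lowest))  # 7 - 5
```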

26 Interquartile Range- IQR AKA mid-fifty or midspread Organizes data into 4 quartiles, each with 25% of data To calculate IQR: 1. Find median of entire data set 2. Find median of lower half of set- lower quartile 3. Find median of upper half of set- upper quartile

27 Quartiles

28 Measure of Spread: Quartiles M = median = 3.4; Q1 = first quartile = 2.2; Q3 = third quartile = 4.35 * 25% of the data are at or below Q1 and 75% above * 50% of the observations are above the median and 50% are below * 75% of the data are at or below Q3 and 25% above

29 Calculating Interquartile Range Interquartile range: distance between the third and first quartile, giving spread of middle 50% of the data: IQR = Q3 - Q1
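The median-of-halves procedure from slide 26 and the formula IQR = Q3 - Q1 can be sketched as follows, using the pH readings from slide 24. One assumption is labeled in the code: when the number of observations is odd, the overall median is left out of both halves. Other conventions exist, and library functions such as `statistics.quantiles` may return slightly different quartiles.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: for odd n the middle observation is excluded from both halves.
    lower_half = ordered[: n // 2]
    upper_half = ordered[(n + 1) // 2 :]
    return median(lower_half), median(ordered), median(upper_half)

ph = [6, 6, 6, 7, 7, 7, 7, 5, 3]
q1, med, q3 = quartiles(ph)
iqr = q3 - q1  # spread of the middle 50% of the data

print(f"Q1 = {q1}, median = {med}, Q3 = {q3}, IQR = {iqr}")
```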

30 Standard Deviation Each data value has an associated deviation from the mean. A deviation is positive if the value falls above the mean and negative if it falls below the mean. The sum of the deviations is always zero

31 Standard Deviation Summarizes the deviations of each observation from the mean by calculating an adjusted average of these deviations: 1. Find mean 2. Find each deviation 3. Square deviations 4. Sum squared deviations 5. Divide sum by n-1 6. Take square root
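The six steps can be followed one at a time in code. The sketch below applies them to the density values from slide 17 (choosing that data set is an assumption made for illustration) and compares the result with `statistics.stdev`, which also divides by n-1.

```python
import math
from statistics import stdev

data = [8, 10, 7, 9]  # g/cm³, values from the mean example

mean = sum(data) / len(data)               # 1. find mean
deviations = [x - mean for x in data]      # 2. find each deviation
squared = [d ** 2 for d in deviations]     # 3. square deviations
sum_squared = sum(squared)                 # 4. sum squared deviations
variance = sum_squared / (len(data) - 1)   # 5. divide sum by n-1
sd = math.sqrt(variance)                   # 6. take square root

print("deviations:", deviations, "sum:", sum(deviations))
print(f"variance = {variance:.4f}, standard deviation = {sd:.4f}")
print(f"statistics.stdev gives {stdev(data):.4f}")
```

The deviations sum to zero, which is why they are squared before averaging.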

32 Outlier An outlier falls far from the rest of the data

33 Graphs for Quantitative Data 1.Dot Plot: shows a dot for each observation placed above its value on a number line 2.Stem-and-Leaf Plot: displays individual observations 3.Histogram: uses bars to portray the data

34 Which Graph? Dot plot and stem-and-leaf plot: more useful for small data sets; data values are retained. Histogram: more useful for large data sets; most compact display; more flexibility in defining intervals

35 Dot Plots To construct a dot plot 1.Draw and label horizontal line 2.Mark regular values 3.Place a dot above each value on the number line Sodium in Cereals

36 Stem-and-leaf plots Summarizes quantitative variables Separates each observation into a stem (first part of #) and a leaf (last digit) Write each leaf to the right of its stem; order leaves if desired Sodium in Cereals
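A minimal sketch of building a stem-and-leaf plot, taking the stem as everything except the last digit and the leaf as the last digit. The cereal sodium data from the slide is not included in this transcript, so the values below are hypothetical two-digit observations; the function assumes non-negative whole numbers.

```python
def stem_and_leaf(values):
    """Map each stem to its ordered list of leaves (last digits)."""
    plot = {}
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # e.g. 23 -> stem 2, leaf 3
        plot.setdefault(stem, []).append(leaf)
    return plot

# Hypothetical observations, not the slide's cereal data.
observations = [23, 12, 37, 21, 30, 15, 28, 41, 23, 34]
plot = stem_and_leaf(observations)

for stem, leaves in plot.items():
    # Write each leaf to the right of its stem.
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Because every leaf is kept, the original values can be read back from the plot, which is what "data values are retained" means for this display.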

