Statistics is the science of collecting, organizing, and interpreting numerical facts we call data. Individuals – objects described by a set of data.


1 Statistics is the science of collecting, organizing, and interpreting numerical facts, which we call data. Individuals – the objects described by a set of data (a person or thing).

2 Variable – any characteristic of an individual
Categorical variable – places an individual into one of several groups or categories. Ex) race, sex, political party
Quantitative variable – takes numerical values. Ex) height in feet, income in dollars
Distribution – tells what values the variable takes and how often it takes them

3 Ex.) Fuel-efficient cars: the following table describes the fuel economy, in miles per gallon (MPG), of 1998 model motor vehicles.

Make and model   | Vehicle type  | Transmission type | # of cylinders | City MPG | Highway MPG
BMW 318I         | Compact       | Automatic         | 4              | 22       | 31
BMW 318I         | Compact       | Manual            | 4              | 23       | 32
Buick Century    | Midsize       | Automatic         | 6              | 20       | 29
Chevrolet Blazer | 4-wheel drive | Automatic         | 6              | 16       | 20

a) What are the individuals in this data set?
b) For each individual, what variables are given? Which of these are categorical and which are quantitative?

4 Ex.) Medical study variables
Data from medical studies contain values of many variables for each of the people who were subjects of the study. Which of the following variables are categorical and which are quantitative?
a) Gender (female or male)
b) Age (years)
c) Race (Asian, black, white, or other)
d) Smoker (yes or no)
e) Systolic blood pressure (millimeters of mercury)
f) Level of calcium in the blood (micrograms per milliliter)

5 Section 1.1 Displaying Distributions with Graphs
I. Stemplots – a quick way to picture the shape of a distribution while preserving the original data in the plot
Individual data values can be seen relative to the overall picture of the distribution
Separate each observation into a stem and a leaf (the leaf is the final single digit)
List the stems in increasing order from top to bottom
Arrange the leaves on each stem in increasing order, spaced evenly

6 Sometimes you need to split stems; five stems is a good minimum. Use the stemplot to ask questions: are there unusually large observations? Why are there groupings? Why are they even? Why are there so few observations in that group?
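The stemplot steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `stemplot`, its `split` option, and the `scores` data are hypothetical, and the values are assumed to be non-negative whole numbers.

```python
from collections import defaultdict


def stemplot(values, split=False):
    """Return the lines of a stemplot for non-negative whole numbers.

    The stem is every digit except the last; the leaf is the final digit.
    With split=True each stem is listed twice: leaves 0-4, then leaves 5-9.
    """
    rows = defaultdict(list)
    for v in sorted(values):              # sorting puts leaves in increasing order
        stem, leaf = divmod(v, 10)
        half = 1 if (split and leaf >= 5) else 0
        rows[(stem, half)].append(leaf)

    # List every stem from smallest to largest, even those with no leaves,
    # so that gaps in the distribution stay visible.
    stems = range(min(values) // 10, max(values) // 10 + 1)
    halves = (0, 1) if split else (0,)
    lines = []
    for stem in stems:
        for half in halves:
            leaves = " ".join(str(leaf) for leaf in rows.get((stem, half), []))
            lines.append(f"{stem:>2} | {leaves}".rstrip())
    return lines


# Hypothetical data, not taken from the slides.
scores = [12, 15, 21, 23, 23, 27, 30, 34, 48]
print("\n".join(stemplot(scores)))
print()
print("\n".join(stemplot(scores, split=True)))
```

With only four stems in the first plot, passing `split=True` doubles the number of rows, which is the situation the slide describes when it recommends at least five stems.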


10 II. Dotplots – one of the simplest graphs to construct. Draw a horizontal line and label it with the variable.
1) Title your graph
2) Make a scale based on the values of the variable
3) For each observation, mark a dot above the corresponding number on the horizontal axis
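As a rough sketch of the three dotplot steps, the following Python draws a text dotplot. The function name `dotplot`, the sample values, and the labels are assumptions made for illustration; the values are assumed to be non-negative whole numbers.

```python
from collections import Counter


def dotplot(values, title, label):
    """Build a text dotplot: title, stacked dots, numeric scale, axis label."""
    counts = Counter(values)
    lo, hi = min(values), max(values)
    width = max(len(str(lo)), len(str(hi))) + 1   # column width for each scale value
    scale = range(lo, hi + 1)

    lines = [title]                                # step 1: title the graph
    # Step 3: one dot per observation, stacked above its value (top row first).
    for level in range(max(counts.values()), 0, -1):
        row = "".join(("*" if counts[x] >= level else " ").rjust(width) for x in scale)
        lines.append(row.rstrip())
    # Step 2: the scale covers every value from the minimum to the maximum.
    lines.append("".join(str(x).rjust(width) for x in scale))
    lines.append(label)
    return "\n".join(lines)


# Hypothetical data: number of cylinders for a handful of cars.
print(dotplot([4, 4, 6, 6, 6, 8], "Cylinders per vehicle", "Number of cylinders"))
```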

11 Looking at a distribution
1). Locate the center
2). Examine the shape (symmetric or skewed)
3). Look for marked deviations from the overall shape (gaps, and outliers, which fall outside the overall pattern)


13 III. Bar Graph & Pie Chart (usually categorical data)
A. Bar Graph
1). Label axes & title graph
2). Scale axes; be sure categories are equally spaced
3). Draw a vertical bar to a height that corresponds with the count in that category
B. Pie Chart
Best to use a computer
Divide the whole into slices by each category's % of the whole
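Both graphs start from the same numbers: a count for each category (the bar heights) and each category's percent of the whole (the pie slices). A minimal sketch, assuming hypothetical helper names `category_summary` and `text_bar_graph`, and using the four vehicle types from the fuel-economy example:

```python
from collections import Counter


def category_summary(labels):
    """Map each category to (count, percent of the whole)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}


def text_bar_graph(labels):
    """One line per category: a bar of '#' as long as the count, then count and percent."""
    summary = category_summary(labels)
    width = max(len(cat) for cat in summary)
    return [
        f"{cat:<{width}} | {'#' * n} {n} ({pct:.1f}%)"
        for cat, (n, pct) in sorted(summary.items())
    ]


# Vehicle types of the four cars listed in the fuel-economy example.
vehicle_types = ["Compact", "Compact", "Midsize", "4-wheel drive"]
print("\n".join(text_bar_graph(vehicle_types)))
```

The bars here run horizontally only because that is easier to print as text; the counts and percents are the same ones a vertical bar graph or a pie chart would display.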

14 IV. Histograms – graph of a quantitative variable
1). Make a frequency distribution by dividing the data into classes of equal width. Be sure each observation falls into exactly one class
2). Label & scale your axes & title your graph
3). Draw a bar that represents each class's frequency; the base of the bar should cover its class
4). Use the break-in-scale symbol '//' on an axis that does not start at zero
5). Interpretation (center, spread, shape, outliers)
Histogram tips – five classes is a good minimum; use neither too few classes nor too many
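Step 1 is the part most easily gotten wrong by hand, so here is a small sketch of it. The function name `frequency_table`, the class boundaries, and the `mpg` values are assumptions for illustration. Each class includes its lower boundary and excludes its upper boundary, which is one common way to guarantee that every observation falls into exactly one class.

```python
def frequency_table(values, start, width, n_classes):
    """Count observations in n_classes equal-width classes [lower, upper)."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)      # which class v belongs to
        if not 0 <= index < n_classes:
            raise ValueError(f"{v} falls outside the classes")
        counts[index] += 1
    return [
        ((start + i * width, start + (i + 1) * width), counts[i])
        for i in range(n_classes)
    ]


# Hypothetical miles-per-gallon readings.
mpg = [16, 18, 20, 20, 22, 23, 25, 27, 29, 31, 32]
for (lower, upper), count in frequency_table(mpg, start=15, width=5, n_classes=4):
    # Each row is one histogram bar; its length is the class frequency.
    print(f"{lower:>2} to <{upper:<2} | {'#' * count} {count}")
```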

15 Appearance of Data, Distribution Shapes
1) Symmetric (symmetrical distribution) – the data values are evenly distributed on both sides of the center, so the right & left sides of the histogram are approximately mirror images of each other
The mean and median are approximately the same, and for a single-peaked distribution so is the mode
mode = median = mean (approximately)

16 2) Skewed to the right (positively skewed distribution) – the right side of the histogram extends much farther out than the left
The mean and median are typically to the right of the mode
The majority of the data fall to the left of the mean and cluster at the lower end of the distribution
Typical order: mode < median < mean

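17 3) Skewed to the left (negatively skewed distribution) – the left side of the histogram extends much farther out than the right
The mean and median are typically to the left of the mode
The majority of the data fall to the right of the mean and cluster at the upper end of the distribution
Typical order: mean < median < mode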
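The relationship between shape and the three measures of center can be checked numerically. The sketch below uses Python's standard `statistics` module on three small made-up data sets; the helper name `centers` and all of the data are hypothetical. The ordering it shows is the usual pattern for single-peaked data, not a guarantee for every data set.

```python
from statistics import mean, median, mode


def centers(values):
    """Return (mode, median, mean) for a data set with a single mode."""
    return mode(values), median(values), mean(values)


# Hypothetical data sets chosen to illustrate each shape.
symmetric = [1, 2, 3, 3, 3, 4, 5]
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]         # long tail of large values
left_skewed = [1, 6, 10, 11, 12, 12, 13, 13, 13, 14]   # long tail of small values

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    mo, med, avg = centers(data)
    print(f"{name:<13} mode={mo}  median={med}  mean={avg}")
```

The mean moves furthest toward the long tail because it is pulled by the extreme values, while the mode stays at the peak.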

18 VI. Relative frequency, cumulative frequency, percentiles, and ogives
1) Percentile – the pth percentile of a distribution is the value such that p percent of the observations fall at or below it
2) Relative cumulative frequency graph (ogive)
a). Make a frequency table
b). Add 3 columns (relative frequency, cumulative frequency, and relative cumulative frequency)
1). Relative frequency: divide the count in each class by the total
2). Cumulative frequency: add the counts in the frequency column for the current class and all previous classes
3). Relative cumulative frequency: divide the entries in the cumulative frequency column by the total
c). Label and scale axes, title graph
d). Plot a point corresponding to the relative cumulative frequency in each class
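The three added columns are simple running arithmetic, which the following sketch works through. The function name `cumulative_table` and the class counts are hypothetical; the table it prints gives the heights of the points an ogive would plot.

```python
from itertools import accumulate


def cumulative_table(classes_and_counts):
    """Return rows of (class, count, relative, cumulative, relative cumulative)."""
    counts = [count for _, count in classes_and_counts]
    total = sum(counts)
    rows = []
    # accumulate() yields the running total of the counts: the cumulative frequency.
    for (label, count), cumulative in zip(classes_and_counts, accumulate(counts)):
        rows.append((label, count, count / total, cumulative, cumulative / total))
    return rows


# Hypothetical frequency table with four classes and 10 observations.
table = [("15-19", 2), ("20-24", 4), ("25-29", 3), ("30-34", 1)]
print("class   count  rel.   cum.  rel. cum.")
for label, count, rel, cum, rel_cum in cumulative_table(table):
    print(f"{label:<7} {count:>5}  {rel:>4.0%}  {cum:>5}  {rel_cum:>8.0%}")
```

Reading the last column against the percentile definition: in this made-up table, 60 percent of the observations fall at or below the top of the 20-24 class.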

19 3) Time Plots – plot each observation against the time at which it was measured
Time goes on the horizontal scale; the variable of interest goes on the vertical scale
Look for a trend (a common overall pattern, such as a steady upward or downward movement)
Seasonal variation – a pattern that repeats itself at regular time intervals

20 Outlier – an observation that seems to stand apart from the rest of the data. Identifying one is a matter of judgement; the 1.5 × IQR rule flags suspected outliers by comparing the position of the extreme values with the spread of the middle half of the data. When discussing data, it is important to look for the big effects: major peaks in a distribution rather than minor ups and downs in a histogram, clear skewness rather than a slight imbalance between the two sides of a distribution, and serious outliers rather than just the largest and smallest observations. It is also important to decide what to do with an outlier: if we remove it before continuing with the analysis, we risk removing an important piece of information.
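The 1.5 × IQR rule flags any observation that lies more than 1.5 × IQR below the first quartile or above the third quartile, where IQR = Q3 - Q1. The sketch below is one way to apply it, under stated assumptions: the function names and the data are hypothetical, and the quartiles are computed as the medians of the lower and upper halves of the sorted data. Other quartile conventions exist and can give slightly different fences.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, median, Q3), taking Q1 and Q3 as medians of the two halves.

    With an odd number of observations the overall median is left out of
    both halves. Assumes at least two observations.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return median(lower), median(data), median(upper)


def suspected_outliers(values):
    """Return the observations outside the 1.5 x IQR fences."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]


# Hypothetical miles-per-gallon readings with one unusually large value.
mpg = [16, 18, 20, 20, 22, 23, 25, 27, 29, 31, 60]
print(quartiles(mpg))            # Q1, median, Q3
print(suspected_outliers(mpg))   # values flagged for a closer look
```

The rule only marks values as suspected outliers; whether to keep or remove them is still the judgement call described above.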

21 Graphing – looking at the legitimate ways to display data. Many bad graphs are designed to deceive. www.homerunrecord.com cnnsi.com/baseball/mlb/2002/postseason

