# Descriptive statistics using Excel

## Presentation on theme: "Descriptive statistics using Excel"— Presentation transcript:

Descriptive statistics using Excel
Displaying Data Calculating Measures of Central Tendency and Dispersion Classification of data Index Numbers and Correlation Bell Shaped distribution Kari Peisa, Ramk/Teli 2006

Displaying Data Types of Data and Measurement Scales Scale Data
Kvalitative Kvantitative Nominal X Ordinal Interval Ratio

Constructing Diagrams in Excel
General instructions 1. Construct an array where data values form continues series either in rows or in columns 2. Write titles for each series 3. Paint (=activate) the series which you want to include in diagram (data + titles), select Chart Wizard, and follow the instructions

Examples of Displaying Data Nominal Scale and Pie Chart
Observations can be assigned to different categories which can’t be placed in a meaningful order

Examples of Displaying Data Ordinal Scale and Bar Chart
Classes can be rank-ordered from highest to lowest. Classes may be represented also numerically when the measures of central tendency have a real value. But, do they actually have a real meaning?

Examples of Displaying Data Interval Scale and Line Chart
Addition and subtraction but not multiplication or division can be performed on data to compare observations. We can say that temperature at 15:00 is 1.1 C warmer than at 12:00, but we can’t say for example that -6C is twice as warm as -12 C or vice versa.

Examples of Displaying Data Ratio Scale and Line Chart
Ratio data has an absolute beginning point (true 0 point). All mathematical operations can be performed to compare values. We can say that the length of the plant in Septemper is more than twice as tall as in June.

Measures of Central Tendency
Average (Mean) Mode Median Quartiles and Percentiles The most frequently occurred The occurrence in the middle of a set of ordered occurrences The occurrence at the first (25%), second (50%) or third (75%) quartile or at the given percentile (p %) of a set of ordered occurrences

Measures of Central Tendency in Excel
Average (Mean) Write the formula into a cell … =AVERAGE(B2:E5) = =AVERAGE(A1:A4;B2;C1:C2;E1:E4) A colon (:) stands between the upper left corner and the lower right corner of an array Empty cells don’t effect on the value of mean The reference to an array is made by painting the array The semicolon (;) connects separate arrays

Measures of Central Tendency in Excel
Or, use the wizard: Insert function… Select Category: Statistics Function: Average Activate the command line in the box and paint an array in Excel sheet

Measures of Central Tendency in Excel
Mode Median Quartiles Percentiles =MODE(B2:B5;D2:E4) 1., 2. or 3. quartile (the 2. = Median) =MEDIAN(B2:B5;D2:E4) =QUARTILE((B2:B5;D2:E4);1) =PERCENTILE((B2:B5;D2:E4);0.35))

Measures of Dispersion
Average deviation Variance Standard Deviation Skewness Skewness characterizes the degree of asymmetry of a distribution around its mean. It has been formulated in many ways. Pearson:

Measures of Dispersion in Excel
Average deviation Variance Standard Deviation Skewness =AVEDEV(B2:B5;D2:E4) =VAR(B2:B5;D2:E4) =STDEV(B2:B5;D2:E4) =SKEW(B2:B5;D2:E4)

Classification (Grouping) of Data
In classification we arrange a large sample of data into classes There are some rules usually followed when arranging classes The classes should be of equal size (if possible) All data values from the original table need to be included in one and only in one class The number of classes should be between 5 and 15.

Classification in Excel
The frequencies indicate the number of observations in the data array that are more than the upper limit in the previous row but less than or equal to the upper limit in this row Activate the (whole) frequency column and write the formula =FREQUENCY(data;bins) into the first cell. Remark! This is an array formula, which means that we have to accept the formula by pressing: shift + ctrl + enter

Index Numbers Index numbers are used to display data in proportional form Source: CIA/ World Factbook 2002 https://www.cia.gov/cia/publications/factbook/index.html

Writing own formulas in Excel
Fill right the formula Formula: =C2/\$B\$2*\$B\$6 Some fundamental rules for writing formulas Starts with = sign A pair ColumnRow refers a cell in a relative position A pair \$Column\$Row refers a cell in an absolute position In arrays you should be able to fill right or fill down formulas that are correctly formed Press F4 for switching between relative, absolute, and mixed references

Displaying Correlation in Scattered Chart
In scattered chart two data series are displayed in correspondence. The correlation coefficient between two measurement variables and different fitted curves are associated with this type of chart.

Bell-Shaped model When the sample size is reasonably large and tha data are not too skewed, we can estimate the real distribution by a mathematical model called ”bell-shaped” model (or Gaussian shape or Normal Distribution). The model needs to know the mean and the standard deviation of the original data

Example of using Bell-Shaped Model
How many percentage of the students in the previous frequency distribution got the mark at least 4? From original data: With the bell-shaped model, where the average =2.697 and standard deviation = of the original data, we have to calculate the total area under the model curve after the point x=3.5 In Excel the formula is And it returns 0.246 =1-NORMDIST(3.5;2.697;1.1677;1) Total area = 1 Returns the area before x=3.5