Presentation on theme: "Descriptive statistics using Excel"— Presentation transcript:
1 Descriptive statistics using Excel
Displaying Data
Calculating Measures of Central Tendency and Dispersion
Classification of Data
Index Numbers and Correlation
Bell-Shaped Distribution
Kari Peisa, Ramk/Teli 2006
2 Displaying Data: Types of Data and Measurement Scales
Data: Qualitative, Quantitative
Scale: Nominal (X), Ordinal, Interval, Ratio
3 Constructing Diagrams in Excel
General instructions
1. Construct an array where the data values form continuous series, either in rows or in columns
2. Write a title for each series
3. Paint (= select) the series which you want to include in the diagram (data + titles), select Chart Wizard, and follow the instructions
4 Examples of Displaying Data: Nominal Scale and Pie Chart
Observations can be assigned to different categories, which cannot be placed in a meaningful order.
5 Examples of Displaying Data: Ordinal Scale and Bar Chart
Classes can be rank-ordered from highest to lowest.
Classes may also be represented numerically, in which case the measures of central tendency can be calculated as values. But do they actually have a real meaning?
6 Examples of Displaying Data: Interval Scale and Line Chart
Addition and subtraction, but not multiplication or division, can be performed on the data to compare observations. We can say that the temperature at 15:00 is 1.1 °C warmer than at 12:00, but we cannot say, for example, that -6 °C is twice as warm as -12 °C or vice versa.
7 Examples of Displaying Data: Ratio Scale and Line Chart
Ratio data has an absolute beginning point (a true 0 point).
All mathematical operations can be performed to compare values. We can say that the plant in September is more than twice as tall as in June.
8 Measures of Central Tendency
Average (Mean)
Mode: the most frequently occurring observation
Median: the observation in the middle of a set of ordered observations
Quartiles and Percentiles: the observation at the first (25%), second (50%) or third (75%) quartile, or at the given percentile (p %), of a set of ordered observations
9 Measures of Central Tendency in Excel
Average (Mean)
Write the formula into a cell:
=AVERAGE(B2:E5)
=AVERAGE(A1:A4;B2;C1:C2;E1:E4)
A colon (:) stands between the upper left corner and the lower right corner of an array.
The semicolon (;) connects separate arrays.
The reference to an array is made by painting (selecting) the array.
Empty cells do not affect the value of the mean.
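The remark about empty cells can be checked outside Excel. The following is a minimal Python sketch, not part of the original slides; the function name and the sample values are hypothetical, and `None` is assumed to stand for an empty cell.

```python
from statistics import mean


def average_ignoring_empty(cells):
    """Return the mean of the filled cells; None marks an empty cell."""
    values = [c for c in cells if c is not None]
    return mean(values)


# Hypothetical column with two empty cells.
cells = [4, None, 6, 8, None]

# Only 4, 6 and 8 enter the calculation, so the divisor is 3, not 5.
print(average_ignoring_empty(cells))
```

Treating the empty cells as zeros instead would divide the same sum by 5 and pull the mean down, which is why the distinction matters.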
10 Measures of Central Tendency in Excel
Or, use the wizard: Insert function…
Select Category: Statistical, Function: AVERAGE
Activate the input line in the dialog box and paint (select) an array in the Excel sheet
11 Measures of Central Tendency in Excel
Mode: =MODE(B2:B5;D2:E4)
Median: =MEDIAN(B2:B5;D2:E4)
Quartiles: =QUARTILE((B2:B5;D2:E4);1), where the last argument selects the 1st, 2nd or 3rd quartile (the 2nd = median)
Percentiles: =PERCENTILE((B2:B5;D2:E4);0.35)
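For readers who want to see what these functions compute, here is a minimal Python sketch that is not part of the original slides. The helper `percentile_inclusive` and the sample marks are hypothetical. The sketch assumes the common "inclusive" rule, which interpolates linearly at position p * (n - 1) in the ordered data; this is, as far as I know, the rule behind Excel's PERCENTILE and QUARTILE, but verify it against your Excel version.

```python
from statistics import median, mode


def percentile_inclusive(values, p):
    """Percentile for 0 <= p <= 1 by linear interpolation
    at position p * (n - 1) in the ordered data."""
    data = sorted(values)
    position = p * (len(data) - 1)
    lower = int(position)          # index of the value just below the position
    fraction = position - lower    # how far to move towards the next value
    if lower + 1 < len(data):
        return data[lower] + fraction * (data[lower + 1] - data[lower])
    return data[lower]


# Hypothetical marks of eight students.
marks = [1, 2, 2, 3, 4, 4, 4, 5]

print(mode(marks))                        # most frequent mark
print(median(marks))                      # middle of the ordered marks
print(percentile_inclusive(marks, 0.25))  # 1st quartile
print(percentile_inclusive(marks, 0.35))  # 35th percentile
```

Under this rule the second quartile, `percentile_inclusive(marks, 0.5)`, coincides with the median.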
12 Measures of Dispersion
Average deviation
Variance
Standard Deviation
Skewness
Skewness characterizes the degree of asymmetry of a distribution around its mean. It has been formulated in many ways.
Pearson:
13 Measures of Dispersion in Excel
Average deviation: =AVEDEV(B2:B5;D2:E4)
Variance: =VAR(B2:B5;D2:E4)
Standard Deviation: =STDEV(B2:B5;D2:E4)
Skewness: =SKEW(B2:B5;D2:E4)
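The four measures can be written out by hand as a cross-check. The following Python sketch is not part of the original slides; the function names and sample data are hypothetical. It assumes the sample (n - 1) versions of variance and standard deviation, and the adjusted sample skewness n / ((n - 1)(n - 2)) * Σ((x - mean) / s)³, which to my knowledge are the definitions used by VAR, STDEV and SKEW.

```python
from math import sqrt


def avedev(values):
    """Average of the absolute deviations from the mean."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations divided by n - 1."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


def sample_stdev(values):
    """Square root of the sample variance."""
    return sqrt(sample_variance(values))


def sample_skewness(values):
    """Adjusted sample skewness; needs at least three values."""
    n = len(values)
    m = sum(values) / n
    s = sample_stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)


symmetric = [1, 2, 3, 4, 5]      # hypothetical, symmetric around 3
right_tail = [1, 1, 1, 2, 10]    # hypothetical, long tail to the right

print(avedev(symmetric), sample_variance(symmetric), sample_stdev(symmetric))
print(sample_skewness(symmetric), sample_skewness(right_tail))
```

A symmetric data set gives a skewness of (numerically) zero, while a long right tail gives a positive value.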
14 Classification (Grouping) of Data
In classification we arrange a large sample of data into classes.
There are some rules usually followed when arranging classes:
The classes should be of equal size (if possible).
Every data value from the original table needs to be included in one and only one class.
The number of classes should be between 5 and 15.
15 Classification in Excel
The frequencies indicate the number of observations in the data array that are greater than the upper limit in the previous row but less than or equal to the upper limit in this row.
Select the (whole) frequency column and write the formula
=FREQUENCY(data;bins)
into the first cell.
Remark! This is an array formula, which means that we have to accept the formula by pressing Ctrl + Shift + Enter.
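The counting rule described on this slide can be sketched in Python as follows. This is not part of the original slides; the function name and sample data are hypothetical. The sketch assumes the bins are ascending upper limits and, like Excel's FREQUENCY, returns one extra count for values above the highest limit.

```python
from bisect import bisect_left


def frequency(data, bins):
    """Count the values falling into each class.

    bins holds ascending upper limits. A value x is counted in class i
    when bins[i-1] < x <= bins[i]. The last count collects values that
    exceed the highest limit.
    """
    counts = [0] * (len(bins) + 1)
    for x in data:
        # bisect_left finds the first upper limit that is >= x.
        counts[bisect_left(bins, x)] += 1
    return counts


# Hypothetical marks and upper class limits.
marks = [1, 2, 2, 3, 4, 4, 4, 5, 6]
upper_limits = [1, 2, 3, 4, 5]

print(frequency(marks, upper_limits))
```

A value equal to an upper limit lands in that class, not in the next one, which matches the "less than or equal to" wording above.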
16 Index Numbers
Index numbers are used to display data in proportional form.
Source: CIA World Factbook 2002, https://www.cia.gov/cia/publications/factbook/index.html
17 Writing own formulas in Excel
Formula: =C2/$B$2*$B$6
Fill the formula to the right.
Some fundamental rules for writing formulas:
A formula starts with the = sign.
A pair ColumnRow (e.g. C2) refers to a cell in a relative position.
A pair $Column$Row (e.g. $B$2) refers to a cell in an absolute position.
In arrays, correctly formed formulas can be filled right or filled down.
Press F4 to switch between relative, absolute, and mixed references.
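The difference between relative and absolute references can be illustrated outside Excel. In the minimal Python sketch below, which is not part of the original slides, the loop variable plays the role of the relative reference (C2, D2, ... as the formula is filled right), while `base_value` and `base_index` stay fixed like $B$2 and $B$6. The figures are hypothetical, and reading $B$6 as a base index of 100 is an assumption, since the slide does not show the sheet contents.

```python
def index_series(values, base_value, base_index=100.0):
    """Express each value relative to a fixed base value.

    base_value and base_index act like absolute references: they do not
    change while the calculation moves along the series.
    """
    return [v / base_value * base_index for v in values]


# Hypothetical yearly figures; the first year is used as the base.
figures = [50.0, 55.0, 60.0]
indices = index_series(figures, base_value=figures[0])

print([round(i, 1) for i in indices])
```

Without the fixed base, each filled copy of the formula would divide by a different cell, and the results would no longer be comparable.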
18 Displaying Correlation in a Scatter Chart
In a scatter chart, two data series are plotted against each other. The correlation coefficient between the two measured variables, and different fitted curves, are associated with this type of chart.
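The slide does not show how the correlation coefficient is obtained. As a minimal sketch under that caveat, the Pearson correlation coefficient can be computed in Python as below; the function name and data are hypothetical. Excel also has a CORREL worksheet function, which is not shown on the slide.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two sums of squares.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical measurements: y grows exactly linearly with x.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]

print(pearson_r(xs, ys))
```

Values close to 1 or -1 mean the points in the scatter chart lie close to a rising or falling straight line; values near 0 mean there is no linear relationship.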
19 Bell-Shaped Model
When the sample size is reasonably large and the data are not too skewed, we can estimate the real distribution by a mathematical model called the "bell-shaped" model (also known as the Gaussian or normal distribution).
The model requires the mean and the standard deviation of the original data.
20 Example of Using the Bell-Shaped Model
What percentage of the students in the previous frequency distribution got a mark of at least 4?
From the original data:
With the bell-shaped model, where the average = 2.697 and the standard deviation = 1.1677 of the original data, we have to calculate the total area under the model curve after the point x = 3.5. Marks are whole numbers, so "at least 4" corresponds to the area to the right of 3.5.
In Excel the formula is
=1-NORMDIST(3.5;2.697;1.1677;1)
and it returns 0.246.
The total area under the curve is 1, and NORMDIST(3.5;2.697;1.1677;1) returns the area before x = 3.5.
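The same area can be cross-checked with Python's standard library. This sketch is not part of the original slides; it takes the mean 2.697 and the standard deviation 1.1677 from the Excel formula above, and the variable names are hypothetical.

```python
from statistics import NormalDist

# Bell-shaped model with the mean and standard deviation from the slide.
model = NormalDist(mu=2.697, sigma=1.1677)

# cdf(3.5) is the area before x = 3.5; the total area is 1,
# so the area after x = 3.5 is the complement.
share_at_least_4 = 1 - model.cdf(3.5)

print(round(share_at_least_4, 3))  # 0.246, as on the slide
```

Here `model.cdf(3.5)` plays the role of NORMDIST with the last argument set to 1 (cumulative).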