Elementary statistics for foresters Lecture 2 Socrates/Erasmus WAU Spring semester 2005/2006.


1 Elementary statistics for foresters Lecture 2 Socrates/Erasmus Program @ WAU Spring semester 2005/2006

2 Descriptive statistics

3 Data grouping (frequency distribution); graphical data presentation (histogram, polygon, cumulative histogram); measures of location (mean, quadratic mean, weighted mean, median, mode); measures of dispersion (range, variance, standard deviation, coefficient of variation); measures of asymmetry

4 Descriptive statistics Descriptive statistics are used to summarize or describe the characteristics of a known set of data. We use them when we want to describe or summarize data clearly and concisely with graphical and/or numerical methods.

5 Descriptive statistics For example, we can consider everybody in the class as a group to be described. Each person is a source of data for such an analysis. Characteristics recorded for each person may include age, weight, height, sex, country of origin, etc.

6 Descriptive statistics Closer-to-forestry example: we can consider all pine stands in central Poland as a group to be characterized. Each stand can be described by its area, age, site index, average height, QMD, volume per hectare, volume increment per hectare per year, amount of carbon sequestered, species composition, damage index,...

7 Frequency distribution

8 Frequency distribution A frequency distribution is statistical material (measurements) ordered into classes (bins) that are built according to the values of the investigated variable.

9 Frequency distribution How to build it? – determine the classes (values/mid-points and class limits), depending on the variable type; – assign each unit/measurement to the appropriate class; – count the units in each class
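The three steps above can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_distribution`, the half-open class convention [lower, upper), and the diameter values are assumptions made for the example, not part of the lecture.

```python
from collections import Counter

def frequency_distribution(values, lower, width, n_classes):
    """Count values in n_classes equal-width classes starting at `lower`.

    Class i covers [lower + i*width, lower + (i+1)*width).
    Returns a list of (class mid-point, count) pairs in class order.
    """
    counts = Counter()
    for v in values:
        i = int((v - lower) // width)          # index of the class holding v
        if not 0 <= i < n_classes:
            raise ValueError(f"{v} lies outside the class limits")
        counts[i] += 1
    return [(lower + (i + 0.5) * width, counts[i]) for i in range(n_classes)]

# Hypothetical tree diameters in cm (made-up numbers for illustration).
dbh = [12.4, 15.1, 17.8, 18.2, 19.9, 21.0, 22.5, 23.3, 26.7, 31.2]
table = frequency_distribution(dbh, lower=10, width=4, n_classes=6)
for mid_point, count in table:
    print(f"{mid_point:5.1f}  {count}")
```

Choosing the lower limit 10 and width 4 gives mid-points 12, 16, 20, ..., which are easy to work with by hand.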

10 Frequency distribution Practical issues: – the number of classes should be between 6 and 16; – classes should have identical widths; – class mid-points should be chosen so that they are easy to work with
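One possible way to apply these rules of thumb automatically is to try a few round class widths and keep the first one that yields 6 to 16 classes. The helper `choose_class_width` and its list of candidate widths are hypothetical; the lecture does not prescribe a procedure.

```python
import math

def choose_class_width(min_value, max_value, candidates=(1, 2, 4, 5, 10, 20, 50)):
    """Return (width, lower_limit, n_classes) for the first candidate width
    that covers [min_value, max_value] with between 6 and 16 classes."""
    for width in candidates:
        # Round the first class limit down to a multiple of the width.
        lower = math.floor(min_value / width) * width
        # Classes are half-open, so the maximum falls in class floor(...).
        n_classes = math.floor((max_value - lower) / width) + 1
        if 6 <= n_classes <= 16:
            return width, lower, n_classes
    raise ValueError("no candidate width gives 6 to 16 classes")

print(choose_class_width(12.4, 31.2))
```

Because the lower limit is a multiple of the width, the class limits stay round numbers, which keeps the mid-points simple.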

11 Frequency distribution

12 Graphical description of data Pictures are very informative and can tell the entire story about the data. Different plots suit different sorts of variables: for example bar plots (histograms), pie charts, and box plots.

13 Graphical description of data

14

15

16

17 Numerical data description

18 Sums and their properties

19 Measures of location: arithmetic mean, quadratic mean, weighted mean, median, mode, other

20 Arithmetic mean

21 Quadratic mean

22 Properties of the mean Weighted mean...
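The formulas on the slides for the three means were not preserved in this transcript. As a sketch, the standard textbook definitions can be written as follows; the function names and the sample numbers are illustrative assumptions.

```python
import math

def arithmetic_mean(xs):
    """Sum of the values divided by their number."""
    return sum(xs) / len(xs)

def quadratic_mean(xs):
    """Square root of the mean of the squared values."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))

def weighted_mean(xs, ws):
    """Mean of xs where each value x counts with weight w."""
    return sum(w * x for x, w in zip(xs, ws)) / sum(ws)

diameters = [10.0, 20.0, 30.0]           # made-up values
print(arithmetic_mean(diameters))
print(quadratic_mean(diameters))

# With class mid-points as values and class counts as weights, the
# weighted mean gives the mean of a frequency distribution.
print(weighted_mean([12, 16, 20], [1, 2, 3]))
```

The quadratic mean is never smaller than the arithmetic mean of the same data, which is why the quadratic mean diameter (QMD) of a stand is at least its arithmetic mean diameter.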

23 Median If observations of a variable are ordered by value, the median value corresponds to the middle observation in that ordered list. The median value corresponds to a cumulative percentage of 50% (i.e., 50% of the values are below the median and 50% of the values are above the median).

24 Median The position of the median is calculated by the following formula:

25 Median How to calculate it? If the individual values are available, sort the data and find the appropriate value. If only the frequency distribution is available, use the following formula:
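The formulas from these slides are missing from the transcript. The sketch below uses the common textbook forms, which are an assumption here: for sorted raw data the median sits at position (n + 1) / 2, and for grouped data it is interpolated inside the class where the cumulative frequency reaches n / 2, as lower limit + (n/2 - cumulative count before the class) / class count × class width.

```python
def median_ungrouped(values):
    """Median of raw data: middle value, or mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def median_grouped(lower_limits, width, counts):
    """Median interpolated within the class containing the n/2-th unit."""
    n = sum(counts)
    if n == 0:
        raise ValueError("the frequency distribution is empty")
    cumulative = 0                      # units in all classes before this one
    for lower, f in zip(lower_limits, counts):
        if cumulative + f >= n / 2:
            return lower + (n / 2 - cumulative) / f * width
        cumulative += f

# Made-up diameters (cm) and the same data grouped into 4 cm classes.
dbh = [12.4, 15.1, 17.8, 18.2, 19.9, 21.0, 22.5, 23.3, 26.7, 31.2]
print(median_ungrouped(dbh))
print(median_grouped([10, 14, 18, 22, 26, 30], 4, [1, 2, 3, 2, 1, 1]))
```

The two results differ slightly because grouping discards the exact values inside each class.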

26 Mode The mode is the most frequently observed data value. There may be no mode if no value appears more often than any other. There may also be two modes (bimodal), three (trimodal), or more (multimodal). In a grouped frequency distribution, the modal class is the class with the largest frequency.

27 Mode If no exact mode is available in the data, you can estimate its value by: – Pearson's approximate formula; – interpolation

28 Relationship between measures

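29 [Figure: distribution curve f(x) over x with the mode μo, median μe, and mean μ marked in that order; the distances c and 3c are indicated]

30 Relationship between measures [Figure: distribution curve f(x) over x with the mean μ, median μe, and mode μo marked in that order; the distances c and 3c are indicated]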

31 Sample calculations

32

33

34 Measures of dispersion: range, variance, standard deviation, coefficient of variation

35 Range and variance The range is the difference between the highest and the lowest value in the data set. The variance is the average squared difference between the data values and the arithmetic mean.
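A minimal sketch of both definitions follows. The slide speaks of an "average" squared difference without saying whether the divisor is n or n − 1, so the `sample` switch below is an assumption that lets the reader pick either convention; the data values are made up.

```python
def value_range(xs):
    """Highest value minus lowest value."""
    return max(xs) - min(xs)

def variance(xs, sample=True):
    """Mean squared deviation from the arithmetic mean.

    sample=True divides by n - 1 (sample variance),
    sample=False divides by n (population variance).
    """
    n = len(xs)
    mean = sum(xs) / n
    squared_deviations = sum((x - mean) ** 2 for x in xs)
    return squared_deviations / (n - 1 if sample else n)

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(value_range(data))
print(variance(data, sample=False))
print(variance(data, sample=True))
```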

36 Variance

37

38

39 Standard deviation and coefficient of variation
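The slide's formulas are not preserved here. Under the usual definitions, the standard deviation is the square root of the variance, and the coefficient of variation is the standard deviation expressed as a percentage of the arithmetic mean. The sketch uses Python's standard `statistics` module; the function name and data are illustrative, and the population form (`pstdev`) is chosen only to keep the numbers round.

```python
import statistics

def coefficient_of_variation(xs):
    """Standard deviation as a percentage of the arithmetic mean."""
    return statistics.pstdev(xs) / statistics.mean(xs) * 100

data = [2, 4, 4, 4, 5, 5, 7, 9]
sd = statistics.pstdev(data)            # population standard deviation
cv = coefficient_of_variation(data)
print(sd, cv)
```

Because the coefficient of variation is unit-free, it allows the variability of, say, tree heights in metres and diameters in centimetres to be compared directly. For the sample form, `statistics.stdev` can be substituted.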

40 Sample calculations

41 Measures of asymmetry Skewness is a measure of the degree of asymmetry of a distribution. If the left tail is more pronounced than the right tail, the distribution has negative skewness. If the reverse is true, it has positive skewness. If the two tails are equal, it has zero skewness.

42 Skewness

43 Skewness can be calculated as a distance between mean and mode expressed in standard deviations:
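The formula itself is missing from the transcript. Read literally, the sentence above corresponds to (mean − mode) / standard deviation, often called Pearson's first skewness coefficient; the sketch below assumes that reading and uses the sample standard deviation. The function name and data are illustrative.

```python
import statistics

def pearson_mode_skewness(xs):
    """(mean - mode) / standard deviation for raw data.

    Note: statistics.mode returns the first mode found when several
    values tie (Python 3.8+), so this is meaningful for unimodal data.
    """
    mean = statistics.mean(xs)
    mode = statistics.mode(xs)
    sd = statistics.stdev(xs)           # sample standard deviation (n - 1)
    return (mean - mode) / sd

data = [2, 4, 4, 4, 5, 5, 7, 9]
skew = pearson_mode_skewness(data)
print(skew)
```

A positive result means the mean lies to the right of the mode, that is, the right tail is the longer one.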

44 Acknowledgements This presentation was made thanks to the support and contribution of dr Lech Wróblewski

