Lecture 02 Dr. MUMTAZ AHMED MTH 161: Introduction To Statistics.


1 Lecture 02 Dr. MUMTAZ AHMED MTH 161: Introduction To Statistics

2 Objectives: Methods of Data Presentation; Classification of Data; Bases of Classification; Types of Classification; Tabulation of Data; Types of Tabulation; Constructing a Statistical Table; General Rules of Tabulation; Table of Frequency Distributions; Frequency Distribution; Relative Frequency Distribution; Cumulative Frequency Distribution

3 Organizing Data After collecting data, the first task for a researcher is to organize and simplify the data so that it is possible to get a general overview of the results. Raw Data: Data which is not organized is called raw data. Un-Grouped Data: Data in its original form is called Un-Grouped Data. Note: Raw data is also called ungrouped data.

4 Different Ways of Organizing Data To get an understanding of the data, it is organized and arranged into a meaningful form. This is done by the following methods: Classification Tabulation (e.g. simple tables, frequency tables, stem and leaf plots etc.) Graphs (Bar Graph, Pie chart, Histogram, Frequency Ogive etc.)

5 Classification of Data The process of arranging data into homogeneous groups or classes according to some common characteristics present in the data is called classification. Example: When letters are sorted in a post office, they are classified according to city and then further arranged according to street.

6 Bases of Classification There are four important bases of classification: Qualitative Base Quantitative Base Geographical Base Chronological or Temporal Base

7 Bases of Classification Qualitative Base: When the data are classified according to some quality or attributes such as sex, religion, etc. Quantitative Base: When the data are classified by quantitative characteristics like heights, weights, ages, income etc.

8 Bases of Classification Geographical Base: When the data are classified by geographical regions or location, like states, provinces, cities, countries etc. Chronological or Temporal Base: When the data are classified or arranged by their time of occurrence, such as years, months, weeks, days etc. (e.g. Time series data).

9 Types of Classification There are three main types of classification: One-way Classification, Two-way Classification, Multi-way Classification

10 One-way Classification If we classify observed data keeping in view a single characteristic, this type of classification is known as one-way classification. Example: The population of the world may be classified by religion as Muslim, Christian etc.

11 Two-way Classification If we consider two characteristics at a time in order to classify the observed data, then we are doing two-way classification. Example: The population of the world may be classified by Religion and Sex.

12 Multi-way Classification If we consider more than two characteristics at a time in order to classify the observed data, then we are doing multi-way classification. Example: The population of the world may be classified by Religion, Sex and Literacy.

13 Tabulation of Data The process of placing classified data into tabular form is known as tabulation. A table is a systematic arrangement of statistical data in rows and columns. Rows are horizontal arrangements whereas columns are vertical arrangements.

14 Types of Tabulation There are Three types of tabulation: Simple or One-way Table Double or Two-way Table Complex or Multi-way Table

15 Simple or One-way Table When the data are tabulated according to one characteristic, it is said to be simple tabulation or one-way tabulation. Example: Tabulation of data on the population of the world classified by one characteristic, like Religion, is an example of simple tabulation.

16 Double or Two-way Table When the data are tabulated according to two characteristics at a time, it is said to be double tabulation or two-way tabulation. Example: Tabulation of data on the population of the world classified by two characteristics, like Religion and Sex, is an example of double tabulation.

17 Complex or Multi-way Table When the data are tabulated according to many characteristics (generally more than two), it is said to be complex tabulation. Example: Tabulation of data on the population of the world classified by three characteristics, like Religion, Sex and Literacy, is an example of complex tabulation.

18 Construction of Statistical Table A statistical table has at least four major parts and some other minor parts. Major parts: The Title, The Box Head (column captions), The Stub (row captions), The Body. Minor parts: Prefatory Notes, Foot Notes, Source Notes.

19 General Sketch of Table
THE TITLE
(Prefatory Notes)
Box Head:  Row Caption  | Column Caption
Stub:      Stub Entries | The Body
Foot Notes…
Source Notes…

20 General Sketch of Table THE TITLE A title is the main heading, written in capitals and shown at the top of the table. It must explain the contents of the table and throw light on the table as a whole. Different parts of the heading can be separated by commas, and no full stop should be used in the title.

21 General Sketch of Table THE Box Head (Column Captions) The vertical headings and subheadings of the columns are called column captions. The space where these column headings are written is called the box head. Only the first letter of the box head is capitalized; the remaining words must be written in small letters.

22 General Sketch of Table THE Stub (Row Captions) The horizontal headings and subheadings of the rows are called row captions. The space where these row headings are written is called the stub.

23 General Sketch of Table THE Body It is the main part of the table, which contains the numerical information classified with respect to row and column captions.

24 General Sketch of Table Prefatory Notes A statement given below the title and enclosed in brackets; it usually describes the units of measurement.

25 General Sketch of Table Foot Notes A foot note appears immediately below the body of the table and provides additional explanation.

26 General Sketch of Table Source Notes The source note is given at the end of the table, indicating the source from which the information has been taken. It includes information about the compiling agency, publication etc.

27 General Rules of Tabulation A table should be simple and attractive. A complex table may be broken into relatively simple tables. Headings for columns and rows should be proper and clear. Suitable approximation may be adopted and figures may be rounded off, but this should be mentioned in the prefatory note or in the foot note. The unit of measurement and nature of data should be well defined.

28 Organizing Data via Frequency Tables One method for simplifying and organizing data is to construct a frequency distribution. Frequency Distribution: The organization of a set of data in a table showing the distribution of the data into classes or groups together with the number of observations in each class or group is called a Frequency Distribution. Class Frequency: The number of observations falling in a particular class is called class frequency or simply frequency, denoted by ‘f’. Grouped Data: Data presented in the form of a frequency distribution is called grouped data.

29 Why Use Frequency Distributions? A frequency distribution is a way to summarize data. A frequency distribution condenses the raw data into a more meaningful form. A frequency distribution allows for a quick visual interpretation of the data. Frequency Distributions can be drawn for qualitative data as well as quantitative data.

30 Frequency Distribution of Discrete Data Example: Number of children in 20 families: 2 3 1 3 2 5 4 1 4 2 3 5 2 5 2 1 3 1 2 0. Construct an un-grouped (discrete) frequency distribution.

No. of Children   Tally      No. of Families (frequency) f
0                 |          1
1                 ||||       4
2                 ||||| |    6
3                 ||||       4
4                 ||         2
5                 |||        3
Total                        20

Interpretation: There is 1 family with no children, 4 families with 1 child, 6 families with 2 children, 4 families with 3 children, 2 families with 4 children and 3 families with 5 children.
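The tally above can also be reproduced programmatically. The following is a minimal Python sketch, not part of the original lecture; the function name `discrete_frequency` and the variable names are illustrative assumptions.

```python
from collections import Counter

# Number of children in 20 families (data from slide 30).
children = [2, 3, 1, 3, 2, 5, 4, 1, 4, 2, 3, 5, 2, 5, 2, 1, 3, 1, 2, 0]


def discrete_frequency(values):
    """Return {value: frequency}, ordered by value (hypothetical helper)."""
    counts = Counter(values)
    return {value: counts[value] for value in sorted(counts)}


freq_children = discrete_frequency(children)

# Print a simple tally table: value, tally strokes, frequency.
for value, f in freq_children.items():
    print(f"{value:>2}  {'|' * f:<6}  {f}")
print("Total", sum(freq_children.values()))
```

Counting with `Counter` mirrors the manual tally: each observation adds one stroke to its value's row, and the frequencies must add up to the number of observations.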

31 Grouped Frequency Distribution Sometimes, when the data is continuous or covers a wide range of values, it becomes very burdensome to make a list of all values as in that case the list will be too long. To remedy this situation, a grouped frequency distribution table is used.

32 Grouped Frequency Distribution for Continuous Data Example (Temperature Data): Temperature of 20 winter days in Pakistan is recorded below: 24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27 Construct frequency distribution. Note: Temperature is a continuous variable because it could be measured to any degree of precision desired.

33 Steps in Constructing Grouped Frequency Distribution
1. Sort raw data from low to high: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
2. Find the range: Range = maximum value - minimum value = 58 - 12 = 46
3. Select the number of classes: 5 (usually between 5 and 20)
4. Compute the class width: Class width = Range / number of classes = 46/5 = 9.2, rounded up to 10
5. Determine the class limits: 11-20, 21-30, 31-40, 41-50, 51-60 (Note: the classes should cover the full data)
6. Count the number of values in each class

34 Frequency Distribution of Grouped Data Sorted Data: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Frequency Distribution (Temp Data)
Classes   Tally       Frequency (f)
11-20     |||         3
21-30     ||||| ||    7
31-40     ||||        4
41-50     ||||        4
51-60     ||          2
Total                 20
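The construction steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper `grouped_frequency` and its parameters are hypothetical, the class width is rounded up with `math.ceil` (which matches the 9.2 to 10 rounding in the lecture), the starting lower limit of 11 is taken from the slides, and the data are assumed to be whole numbers.

```python
import math

# Temperature of 20 winter days (data from slide 32).
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]


def grouped_frequency(values, n_classes, first_lower):
    """Build {(lower, upper): frequency} for integer-valued data.

    Assumption: the class width is the range divided by the number of
    classes, rounded up to the next whole number.
    """
    data_range = max(values) - min(values)       # 58 - 12 = 46
    width = math.ceil(data_range / n_classes)    # 46 / 5 = 9.2 -> 10
    table = {}
    for i in range(n_classes):
        lower = first_lower + i * width
        upper = lower + width - 1                # inclusive upper class limit
        table[(lower, upper)] = sum(lower <= v <= upper for v in values)
    return table


temp_table = grouped_frequency(temps, n_classes=5, first_lower=11)
for (lower, upper), f in temp_table.items():
    print(f"{lower}-{upper}: {f}")
```

A quick check that the frequencies sum to 20 confirms that the chosen classes cover the full data, as the note in step 5 requires.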

35 Frequency Distribution of Qualitative Data Political Party Affiliations: Professor X asked his introductory statistics students to state their political party affiliations as PML-N (N), PPP, PTI and PML-Q (Q). The responses of the 30 students in a class are: PPP N Q PTI N Q N PPP PTI N PTI N PTI PPP N Q N PTI Q PTI PPP PTI N PTI Q PTI N PTI Q PPP. Construct a frequency distribution.

Party   Tally          Freq (f)
PTI     ||||| |||||    10
N       ||||| ||||     9
Q       ||||| |        6
PPP     |||||          5
Total                  30

Interpretation: Out of 30 students in the class, 10 are in favor of PTI, 9 are in favor of PML-N, 6 are in favor of PML-Q and 5 are in favor of PPP.

36 Relative Frequency Distribution Relative Frequency is the ratio of the frequency to the total number of observations. Relative frequency = Frequency / Number of observations. Example:
Relative frequency of students who favored PTI = 10/30 = 0.333 = 33.33%
Relative frequency of students who favored PML-N = 9/30 = 0.3 = 30%
Relative frequency of students who favored PML-Q = 6/30 = 0.2 = 20%
Relative frequency of students who favored PPP = 5/30 = 0.167 = 16.67%

37 Frequency Distribution of Qualitative Data Party Affiliation Example:

Party   Freq (f)   Relative Freq
PTI     10         10/30 = 0.3333
N       9          9/30 = 0.30
Q       6          6/30 = 0.20
PPP     5          5/30 = 0.1667
Total   30         1

Interpretation: Out of 30 students in the class, 33.3% are in favor of PTI, 30% are in favor of PML-N, 20% are in favor of PML-Q and 16.7% are in favor of PPP.
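The party-affiliation frequencies and relative frequencies can be computed directly from the 30 responses. This is a minimal Python sketch, not part of the original lecture; the helper name `relative_frequencies` is an illustrative assumption.

```python
from collections import Counter

# Responses of the 30 students (data from slide 35).
responses = (
    "PPP N Q PTI N Q N PPP PTI N PTI N PTI PPP N Q N PTI Q PTI "
    "PPP PTI N PTI Q PTI N PTI Q PPP"
).split()


def relative_frequencies(values):
    """Return {label: (frequency, relative frequency)}, most frequent first."""
    counts = Counter(values)
    total = sum(counts.values())
    return {label: (f, f / total) for label, f in counts.most_common()}


party_table = relative_frequencies(responses)
for party, (f, rel) in party_table.items():
    print(f"{party:<4} {f:>3}  {rel:.4f}  {rel:.2%}")
```

Because every relative frequency is a frequency divided by the same total, the relative frequencies always sum to 1 (up to floating-point rounding).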

38 Cumulative Frequency Distribution Cumulative Frequency: The total frequency of a variable from one end of the distribution up to a certain value, called the base (usually the upper class boundary in grouped data), is known as the cumulative frequency. It is called a "less than" or "more than" cumulative frequency, depending on whether it counts the observations below or above the base. Cumulative Frequency Distribution: The table showing cumulative frequencies is called a cumulative frequency distribution.

39 Cumulative Frequency Distribution Constructing Class Boundaries: Take the difference between the lower limit of the second class and the upper limit of the first class (e.g. 21 - 20 = 1), then divide this difference by 2 (i.e. 1/2 = 0.5). Subtract the resulting number (i.e. 0.5) from the lower class limit of each class and add it to the upper class limit of each class. The newly obtained classes are called Class Boundaries (C.B).

Classes   Class Boundaries   Frequency (f)
11-20     10.5-20.5          3
21-30     20.5-30.5          7
31-40     30.5-40.5          4
41-50     40.5-50.5          4
51-60     50.5-60.5          2
Total                        20
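The class-boundary rule is simple arithmetic and can be sketched in Python as below. The function name `class_boundaries` is a hypothetical helper for illustration; it assumes at least two classes and the same gap between every pair of consecutive classes.

```python
def class_boundaries(class_limits):
    """Convert class limits to class boundaries.

    Assumes at least two classes and an equal gap between consecutive classes.
    """
    # Gap between the lower limit of the second class and the upper limit
    # of the first class, e.g. 21 - 20 = 1.
    gap = class_limits[1][0] - class_limits[0][1]
    half = gap / 2  # e.g. 1 / 2 = 0.5
    # Subtract half the gap from each lower limit, add it to each upper limit.
    return [(lower - half, upper + half) for lower, upper in class_limits]


limits = [(11, 20), (21, 30), (31, 40), (41, 50), (51, 60)]
boundaries = class_boundaries(limits)
for (lo_lim, up_lim), (lo_b, up_b) in zip(limits, boundaries):
    print(f"{lo_lim}-{up_lim}  ->  {lo_b}-{up_b}")
```

After the adjustment, the upper boundary of each class coincides with the lower boundary of the next one, so the classes leave no gaps.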

40 Less than Cumulative Frequency Distribution

Frequency distribution of temperature data
Classes   Class Boundaries   Frequency (f)
11-20     10.5-20.5          3
21-30     20.5-30.5          7
31-40     30.5-40.5          4
41-50     40.5-50.5          4
51-60     50.5-60.5          2
Total                        20

Less than cumulative frequency distribution of temperature data
Class Boundaries   Cumulative Frequency
Less than 10.5     0
Less than 20.5     3
Less than 30.5     3+7=10
Less than 40.5     10+4=14
Less than 50.5     14+4=18
Less than 60.5     18+2=20

41 More than Cumulative Frequency Distribution

Frequency distribution of temperature data
Classes   Class Boundaries   Frequency (f)
11-20     10.5-20.5          3
21-30     20.5-30.5          7
31-40     30.5-40.5          4
41-50     40.5-50.5          4
51-60     50.5-60.5          2
Total                        20

More than cumulative frequency distribution of temperature data
Class Boundaries   Cumulative Frequency
More than 10.5     20
More than 20.5     20-3=17
More than 30.5     17-7=10
More than 40.5     10-4=6
More than 50.5     6-4=2
More than 60.5     2-2=0
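Both cumulative distributions can be checked by counting directly from the raw temperature data of slide 32, which avoids carrying an error forward through the running sums. This is a minimal Python sketch; the function names are illustrative assumptions, not part of the lecture.

```python
# Temperature of 20 winter days (data from slide 32).
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

# Class boundaries for the classes 11-20, 21-30, 31-40, 41-50, 51-60.
bounds = [10.5, 20.5, 30.5, 40.5, 50.5, 60.5]


def less_than_cumulative(values, bases):
    """Number of observations below each base."""
    return [sum(v < b for v in values) for b in bases]


def more_than_cumulative(values, bases):
    """Number of observations above each base."""
    return [sum(v > b for v in values) for b in bases]


less_than = less_than_cumulative(temps, bounds)
more_than = more_than_cumulative(temps, bounds)

for b, lt, mt in zip(bounds, less_than, more_than):
    print(f"base {b}: less than = {lt}, more than = {mt}")
```

Since no observation falls exactly on a class boundary, the "less than" and "more than" counts at each base always add up to the total number of observations, 20.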

42 Stem and Leaf Plot Disadvantage of Frequency Table: An obvious disadvantage of using a frequency table is that the identity of individual observations is lost in the grouping process. A stem and leaf plot provides the solution by offering a quick and clear way of sorting and displaying data simultaneously.

43 Stem and Leaf Plot METHOD:
1. Sort the data series.
2. Separate the sorted data series into leading digits (the stem) and trailing digits (the leaves), e.g. in 13, the leading digit (stem) is 1 and the trailing digit (leaf) is 3, and in 21, the leading digit (stem) is 2 and the trailing digit (leaf) is 1.
3. List all stems in a column from low to high.
4. For each stem, list all associated leaves.

44 Stem and Leaf Plot Example 1: Consider the temp data again. The sorted data from low to high is shown below: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58. Here, use the 10's digit for the stem unit:

Value   Stem   Leaf
13      1      3
21      2      1
35      3      5

45 Stem and Leaf Plot Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Completed Stem-and-leaf diagram
Stem   Leaf
1      2 3 7
2      1 4 4 6 7 7
3      0 2 5 7 8
4      1 3 4 6
5      3 8
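The stem and leaf method can be sketched in Python for two-digit data, using the temperature data as recorded on slide 32. The helper `stem_and_leaf` is a hypothetical name, and splitting with `divmod(v, 10)` assumes non-negative whole numbers with the 10's digit as the stem.

```python
from collections import defaultdict

# Temperature of 20 winter days (data from slide 32).
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]


def stem_and_leaf(values):
    """Return {stem: [leaves]} with stems and leaves in ascending order."""
    plot = defaultdict(list)
    for v in sorted(values):          # step 1: sort the data
        stem, leaf = divmod(v, 10)    # step 2: 10's digit is the stem
        plot[stem].append(leaf)       # steps 3-4: collect leaves per stem
    return dict(plot)


temp_plot = stem_and_leaf(temps)
for stem, leaves in temp_plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Unlike the grouped frequency table, every original observation can be read back from the plot by recombining each stem with its leaves.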

46 Review Let's review the main concepts: Methods of Data Presentation; Classification of Data; Bases of Classification; Types of Classification; Tabulation of Data; Types of Tabulation; Constructing a Statistical Table; General Rules of Tabulation; Table of Frequency Distributions; Frequency Distribution; Relative Frequency Distribution; Cumulative Frequency Distribution

47 Next Lecture In the next lecture, we will study Graphical Methods of Data Presentation. Graphs for qualitative data: Bar Charts (Simple Bar Chart, Multiple Bar Chart, Component Bar Chart) and Pie Charts. Graphs for quantitative data: Histograms, Frequency Polygon, Cumulative Frequency Polygon (Frequency Ogive).

