Introduction to Quantitative Data Analysis

Quantitative Data Analysis
- Types of Statistics
  - Descriptive
  - Inferential: probabilistic sampling techniques, the notion of randomness
- Data Preparation (Coding & Cleaning Data)
- Common Ways of Presenting Statistics
  - Tables
  - Charts
  - Graphs

Presenting Data (Raw Data)
Source: Regan, T. (1985). In search of sobriety: Identifying factors contributing to the recovery from alcoholism. Kentville, NS.

Simple Univariate Tables of Frequency Distributions and Percentages (Neuman 2000: 318)
- univariate = one variable
- “raw count” (frequencies, percentages)

Revision of Example: Collapsing Categories and Treatment of Missing Data in Tables
Source: Johnson, A. G. (1977). Social Statistics Without Tears. Toronto: McGraw Hill.
- Example: Raw Data Frequencies

Types of Missing Data
- Examples: non-response, “don’t know”, refusal, etc.
- Categories of missing data
  - Missing data completely at random (MCAR)
    - e.g., equipment malfunction, illness
  - Missing data at random (MAR)
    - Can be explained by controlling for another variable
  - Missing data that is not random (MNAR)

Some Techniques for Dealing with Missing Data
- Omission (may involve using statistical techniques or logic to decide whom to omit, e.g., add all like cases based on other responses)
- Imputation (estimate what the likely responses would be by comparing with other response patterns)
  - Match other characteristics
  - Distribute equally or use weighted responses
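
One possible reading of "distribute equally or use weighted responses" can be sketched in Python. This is only an illustration at the level of category counts, not a method given in the slides: the function name `distribute_missing` is hypothetical, and the counts are borrowed from the worker-alienation example (Table 5-1) used elsewhere in this presentation.

```python
from fractions import Fraction

def distribute_missing(observed, n_missing, weighted=True):
    """Spread n_missing cases over the observed categories.

    weighted=True  -> in proportion to each category's observed share
    weighted=False -> equally across the categories
    Returns exact (possibly fractional) counts; nothing is rounded.
    """
    total = sum(observed.values())
    n_categories = len(observed)
    result = {}
    for category, count in observed.items():
        share = Fraction(count, total) if weighted else Fraction(1, n_categories)
        result[category] = count + share * n_missing
    return result

# Observed responses and 60 non-respondents (figures from Table 5-1).
observed = {"High": 30, "Medium": 100, "Low": 20}
weighted = distribute_missing(observed, 60, weighted=True)
equal = distribute_missing(observed, 60, weighted=False)

print({cat: float(n) for cat, n in weighted.items()})
print({cat: float(n) for cat, n in equal.items()})
```

Weighted distribution leaves the percentage distribution of respondents unchanged, while equal distribution pulls the categories toward one another, so the choice is itself an analytic decision.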

Treatment of Missing Data (Omission vs. Inclusion)
- Comparison of % distributions with and without non-respondents

Table 5-1 Alienation of Workers (non-respondents included)
Level of Alienation      F      %
High                    30     14
Medium                 100     48
Low                     20     10
No Response             60     29
(Total)                210    100

Table 5-1 Alienation of Workers (non-respondents eliminated)
Level of Alienation      F      %
High                    30     20
Medium                 100     67
Low                     20     13
(Total)                150    100
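
The arithmetic behind the two percentage distributions can be reproduced with a short Python sketch. The frequencies are those of Table 5-1; the helper name `percent_distribution` is an assumption made for illustration.

```python
def percent_distribution(freqs):
    """Return each category's share of the total as a rounded percentage."""
    total = sum(freqs.values())
    return {cat: round(100 * f / total) for cat, f in freqs.items()}

# Frequencies from Table 5-1.
freqs = {"High": 30, "Medium": 100, "Low": 20, "No Response": 60}

# Inclusion: non-respondents stay in the base (N = 210).
included = percent_distribution(freqs)

# Omission: non-respondents are dropped from the base (N = 150).
respondents = {cat: f for cat, f in freqs.items() if cat != "No Response"}
excluded = percent_distribution(respondents)

print(included)
print(excluded)
```

Because each percentage is rounded separately, the rounded figures with non-respondents included add up to 101 rather than exactly 100.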

Treatment of Missing Data & Collapsing Categories (creating new variables after data collection)
- Comparison with high & medium alienation collapsed

Table 5-1 Alienation of Workers (non-respondents included)
Level of Alienation      F      %
High & Medium          130     62
Low                     20     10
No Response             60     29
(Total)                210    100

Table 5-1 Alienation of Workers (non-respondents eliminated)
Level of Alienation      F      %
High & Medium          130     87
Low                     20     13
(Total)                150    100

Treatment of Missing Data
- Comparison with medium & low collapsed

Table 5-1 Alienation of Workers (non-respondents included)
Level of Alienation      F      %
High                    30     14
Medium & Low           120     57
No Response             60     29
(Total)                210    100

Table 5-1 Alienation of Workers (non-respondents eliminated)
Level of Alienation      F      %
High                    30     20
Medium & Low           120     80
(Total)                150    100

Effects of Collapsing Response Categories
- Comparison of two different ways of collapsing response categories

Table 5-1 Alienation of Workers (high & medium collapsed)
Level of Alienation      F      %
High & Medium          130     87
Low                     20     13
(Total)                150    100

Table 5-1 Alienation of Workers (medium & low collapsed)
Level of Alienation      F      %
High                    30     20
Medium & Low           120     80
(Total)                150    100
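
The two ways of collapsing can be compared side by side with a small Python sketch. It uses the respondent frequencies from Table 5-1; the helper names `collapse` and `percents` are hypothetical.

```python
def collapse(freqs, mapping):
    """Sum the frequencies of original categories into broader categories.

    Categories not named in `mapping` are carried over unchanged.
    """
    collapsed = {}
    for cat, f in freqs.items():
        new_cat = mapping.get(cat, cat)
        collapsed[new_cat] = collapsed.get(new_cat, 0) + f
    return collapsed

def percents(freqs):
    """Rounded percentage distribution of a frequency table."""
    total = sum(freqs.values())
    return {cat: round(100 * f / total) for cat, f in freqs.items()}

# Respondents only (non-respondents eliminated), from Table 5-1.
respondents = {"High": 30, "Medium": 100, "Low": 20}

high_medium = collapse(
    respondents, {"High": "High & Medium", "Medium": "High & Medium"}
)
medium_low = collapse(
    respondents, {"Medium": "Medium & Low", "Low": "Medium & Low"}
)

print(percents(high_medium))
print(percents(medium_low))
```

The same 150 responses read as a workforce that is mostly alienated under the first grouping and mostly not highly alienated under the second.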

Collapsing Categories (U.N. example)
Source: Babbie, E. (1995). The practice of social research. Belmont, CA: Wadsworth.

Collapsing Categories & Omitting Missing Data
Source: Babbie, E. (1995). The practice of social research. Belmont, CA: Wadsworth.

Grouping Response Categories
- To make new categories
- Facilitate analysis of trends
- But decisions have effects on the interpretation of patterns
- Importance of understanding logic, conceptual and operational definitions
- Same data can produce totally different-looking results

Bivariate Tables (Cross Tabulations): Tables Presenting Relationship between Two Variables
Source: Singleton, R., Straits, B. & Straits, M. (1993). Approaches to social research. Toronto: Oxford.

Expected Outcomes (Null Hypothesis)
Source: Singleton, R., Straits, B. & Straits, M. (1993). Approaches to social research. Toronto: Oxford.

Interpretation Issues (Bivariate Tables)
- Percentages are calculated within the categories (attributes) of the independent variable
- In the example:
  - Independent variable: gender
  - Dependent variable: fear of walking alone at night
  - Women are more afraid than men

Styles of Presentation of Percentaged Tables (Bivariate)

Table 1. Percentage in support of strike by type of school

Type of School      Percent supporting strike
Secondary           60% (800)
Elementary          30% (1000)
__________________________________________________________
ε = .30                                           N = 1800

Parts of the table labelled on the slide:
- Serial number ("Table 1.")
- Descriptive caption
- Dependent variable (support of strike)
- Independent variable (type of school)
- Categories (Secondary, Elementary)
- One category of the dichotomous dependent variable (percent supporting)
- Marginals for the independent variable (800, 1000)
- Percentage difference (epsilon)
- Total sample (N)
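
The percentage difference (epsilon) and the total sample in Table 1 follow directly from the figures in the table, as this minimal Python sketch shows. The function name `percentage_difference` is an assumption, and epsilon is expressed as a proportion to match the ".30" shown in the table.

```python
def percentage_difference(pct_a, pct_b):
    """Epsilon: difference between two percentages, as a proportion."""
    return round((pct_a - pct_b) / 100, 2)

# (percent supporting strike, marginal N) per type of school, from Table 1.
table = {"Secondary": (60, 800), "Elementary": (30, 1000)}

epsilon = percentage_difference(table["Secondary"][0], table["Elementary"][0])
n_total = sum(n for _, n in table.values())

print(f"epsilon = {epsilon:.2f}, N = {n_total}")
```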

Factors to Consider When Reading a Table
- Sampling technique? Or total population?
- Conceptual & operational definitions (validity & reliability issues)
  - What measure was used?
  - How was it used?
- Data preparation and cleaning issues (treatment of inconsistencies, non-responses, etc.)
- Data analysis issues

Other Ways of Presenting Same Data & Interpretation Issues
- Deciding on the direction in which to calculate percentages
  - Depends on objectives (research questions), for example:
    - Are we interested in the patterns within each school type?
    - Are we interested in overall support of strike?
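
How the direction of percentaging changes the picture can be sketched in Python. The cell counts below are not given in the slides: they are derived as an assumption from the strike-support table's percentages and marginals (60% of 800 and 30% of 1000), and the variable names are hypothetical.

```python
# Assumed cell counts, derived from 60% of 800 and 30% of 1000.
counts = {
    "Secondary": {"support": 480, "not_support": 320},
    "Elementary": {"support": 300, "not_support": 700},
}

# Direction 1: percentages within each school type.
within_school = {
    school: round(100 * c["support"] / (c["support"] + c["not_support"]))
    for school, c in counts.items()
}

# Direction 2: overall support across the whole sample.
total_support = sum(c["support"] for c in counts.values())
total_n = sum(c["support"] + c["not_support"] for c in counts.values())
overall_support = round(100 * total_support / total_n)

# Direction 3: where the supporters come from.
share_of_supporters = {
    school: round(100 * c["support"] / total_support)
    for school, c in counts.items()
}

print(within_school)
print(overall_support)
print(share_of_supporters)
```

Each direction answers a different research question, so the choice should be made before the table is built rather than after.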

Other Ways of Presenting Bivariate Relationships in tabular form (ex. Ratios)

Control Variables: Trivariate Tables (Men/Women Drivers)
- In Say It with Figures, Hans Zeisel presents the following data:

Automobile Accidents by Sex
------------------------------------------
            Per Cent Accident Free
Women       68% (6,950)
Men         56% (7,080)
------------------------------------------

Automobile Accidents by Sex and Distance Driven
----------------------------------------------------------------------------
            Distance Driven
            Under 10,000 km             Over 10,000 km
            Per Cent Accident Free      Per Cent Accident Free
Women       75% (5,035)                 48% (1,915)
Men         75% (2,070)                 48% (5,010)
----------------------------------------------------------------------------

Women have fewer accidents than men because women tend to drive less than men do, and people who drive less tend to have fewer accidents.
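
The link between the two tables can be checked with a short Python sketch: each sex's overall accident-free rate is the average of the two distance groups, weighted by group size. The figures are those quoted from Zeisel; the function and variable names are hypothetical.

```python
# (per cent accident free, number of drivers) by sex and distance driven.
drivers = {
    "Women": {"Under 10,000 km": (75, 5035), "Over 10,000 km": (48, 1915)},
    "Men": {"Under 10,000 km": (75, 2070), "Over 10,000 km": (48, 5010)},
}

def overall_rate(groups):
    """Weighted average of group percentages, weighted by group size."""
    total_n = sum(n for _, n in groups.values())
    weighted_sum = sum(pct * n for pct, n in groups.values())
    return weighted_sum / total_n

overall = {sex: round(overall_rate(groups)) for sex, groups in drivers.items()}

# Share of each sex driving under 10,000 km (the control variable).
low_distance_share = {
    sex: round(
        100 * groups["Under 10,000 km"][1] / sum(n for _, n in groups.values())
    )
    for sex, groups in drivers.items()
}

print(overall)
print(low_distance_share)
```

Within each distance category the rates for women and men are identical; the overall gap appears only because a much larger share of women fall in the low-distance group.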

Another Way to Present Percentaged Tables (Trivariate)

Table 2. Percentage who support strike by type of school and sex

                    Sex
                    Female                         Male
Type of School      Per cent supporting strike     Per cent supporting strike
Secondary           60% (400)                      60% (400)
Elementary          30% (900)                      30% (100)
__________________________________________________________
Female ε = .30; Male ε = .30                      N = 1800

Parts of the table labelled on the slide:
- Dependent variable (support of strike)
- Independent variable (type of school)
- Control variable (sex)
- Categories of the control variable (Female, Male)

Common Types of Charts & Graphs
- Bar charts
- Histograms
- Pie charts
- Line graphs/polygons
- Scattergrams

Bar Chart
- parallel bars or rectangles with lengths proportional to the frequency with which specified quantities occur in a set of data
- graphic representation of a frequency distribution
- generally used for discrete data

A Bar Chart (flat: best for two-dimensional data)

Bar Chart with Break
- World Population Growth Showing Projections (time to add billions)

Histograms
- graphical representation of the grouped data of a frequency distribution
- the baseline typically depicts the classes, and the vertical scale represents the frequencies or percentages
- used for continuous data

Example
- A survey of people between the ages of 18 and 74 to determine the number of bike users, categorized by age group
- Q. Which age group do you belong to?
  - 18 to 24
  - 25 to 34
  - 35 to 44
  - 45 to 54
  - 55 to 64
  - 65 to 74
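
Grouping continuous ages into the survey's classes can be sketched in Python. The age groups are those listed in the survey question; the sample of ages and the names `AGE_GROUPS` and `age_group` are hypothetical and serve only to illustrate the grouping.

```python
from collections import Counter

# Class boundaries taken from the survey question.
AGE_GROUPS = [(18, 24), (25, 34), (35, 44), (45, 54), (55, 64), (65, 74)]

def age_group(age):
    """Return the label of the class that contains `age`."""
    for low, high in AGE_GROUPS:
        if low <= age <= high:
            return f"{low} to {high}"
    raise ValueError(f"age {age} is outside the surveyed range 18 to 74")

# Hypothetical ages of bike users (not from the slides).
ages = [19, 22, 27, 31, 33, 38, 41, 47, 52, 58, 66, 70]
frequencies = Counter(age_group(a) for a in ages)

# A text histogram: classes along the baseline, one '#' per case.
for low, high in AGE_GROUPS:
    label = f"{low} to {high}"
    print(f"{label:<9} {'#' * frequencies[label]}")
```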

Histogram

Pie Chart
- circular chart
- divided into sectors, illustrating relative magnitudes or frequencies
  - the arc length of each sector (and consequently its central angle and area) is proportional to the quantity it represents
- together the sectors form a full disk

Example: 2004 Election Results of EU
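
The proportionality rule for sectors can be written out as a minimal Python sketch: each central angle is the category's share of the total multiplied by 360 degrees. The party labels and counts are hypothetical and are not the 2004 election figures.

```python
def central_angles(quantities):
    """Central angle (in degrees) of each sector, proportional to its quantity."""
    total = sum(quantities.values())
    return {label: 360 * q / total for label, q in quantities.items()}

# Hypothetical seat counts (not the 2004 election data).
seats = {"Party A": 50, "Party B": 30, "Party C": 20}
angles = central_angles(seats)

print(angles)
```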

Exploded Pie Chart
- one or more sectors separated from the rest of the disk

Example: 2004 Election Results of EU

Presentation of Identical Data in Pie and Bar Charts
- Problem with pie charts: bar charts are easier to compare visually and make differences in proportions easier to see

Line and Scatter Charts (Graphs)
- start with mapping quantitative data points
- usually a dot or a small circle represents a single data point
- one mark (point) for every data point
- show the visual distribution of the data
- when both variables are quantitative, the line segment that connects two points on the chart expresses a slope
- the slope can be visually interpreted relative to the slope of other lines

Example of Frequency Distribution Table from Textbook

Frequency Polygon Showing Same Data (Graph Plotting Frequency Distribution)

Common Types of Distributions
- Normal distribution (bell-shaped curve)
- Skewed distributions
- Bi-modal distributions

Normal Distribution (Neuman 2000: 319)

Skewed Distributions (Neuman 2000: 319)

Multiple Line charts

Multi-symbol Line chart

Combining Quantitative & Qualitative Info. In Graphs: Temperatures during Napoleon’s March (E. Tufte)

Line Chart (Poor Example)
- Example of a bad choice of graphic representation
- The data are discrete: connecting the dots does not make sense because the colour categories are nominal here

Scattergrams

Design & Interpretation Issues: Choice of Scales
- Same data presented using different scales for the x and y axes

Core Notions in Basic Univariate Statistics
- Ways of describing data about one variable (“uni” = one)
  - Measures of central tendency
    - Summarize information about one variable (“averages”)
  - Measures of dispersion
    - Variations or “spread”

Measures of Central Tendency
- summarize information about one variable in a single number
  - Mode
  - Median
  - Mean
- Use of measures of central tendency
  - to summarize common “overall”, “centralized” trends
  - they do not show variability, spread, or dispersion
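
Why a single "average" hides spread can be illustrated with a small Python sketch using the standard `statistics` module. Both data sets are hypothetical; the range and the population standard deviation are used here as two common measures of dispersion, chosen for illustration.

```python
import statistics

# Two hypothetical sets of scores with the same centre but different spread.
group_a = [48, 49, 50, 51, 52]
group_b = [10, 30, 50, 70, 90]

mean_a = statistics.mean(group_a)
mean_b = statistics.mean(group_b)

# Range: distance between the largest and smallest value.
range_a = max(group_a) - min(group_a)
range_b = max(group_b) - min(group_b)

# Population standard deviation: typical distance from the mean.
sd_a = statistics.pstdev(group_a)
sd_b = statistics.pstdev(group_b)

print(mean_a, range_a, round(sd_a, 2))
print(mean_b, range_b, round(sd_b, 2))
```

The two groups share a mean of 50, yet one is tightly clustered and the other widely scattered, which only the dispersion measures reveal.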

Mode (Babbie 1995: 378)
- most common or most frequently occurring case (for all types of data)

Median (Babbie 1995: 378)
- middle point (only for ordinal, interval or ratio data)

Mean (arithmetic mean) (Babbie 1995: 378)
- “average” = sum of values divided by number of cases (only for ratio and interval data)
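
The three measures can be computed with Python's standard `statistics` module, as in this minimal sketch. The data are hypothetical and were chosen so that the three measures differ.

```python
import statistics

# Hypothetical ratio-level data: number of books read last year by 7 people.
books = [1, 2, 2, 3, 4, 7, 9]

mode = statistics.mode(books)      # most frequently occurring value
median = statistics.median(books)  # middle value of the sorted cases
mean = statistics.mean(books)      # sum of values divided by number of cases

print(mode, median, mean)

# The mode is the only one of the three defined for nominal data.
colours = ["red", "blue", "red", "green"]
modal_colour = statistics.mode(colours)
print(modal_colour)
```

In these hypothetical data the mean exceeds the median, which exceeds the mode, because a few large values pull the mean upward.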

Normal Distribution & Measures of Central Tendency (Neuman 2000: 319)

Skewed Distributions & Measures of Central Tendency (Neuman 2000: 319)
