Propensity Score Adjustments for Internet Survey of Voting Behavior: A Case in Japan

Presentation transcript:

Propensity Score Adjustments for Internet Survey of Voting Behavior: A Case in Japan Tetsuro Kobayashi National Institute of Informatics, Japan k-tetsu@nii.ac.jp General Online Research 09 April 7, 2009 Vienna

Expectations toward Internet Surveys Deterioration of survey environments: changes in living environment, increasing “privacy” sensitivity ⇒ Expectations toward Internet surveys. Definition of Internet survey: survey methods in which research entities (normally research companies) recruit potential survey respondents and then have a selected group of respondents fill out online questionnaires

Potential problems in applying to social survey Ambiguous definition of population: originates from panel construction. Purposive sampling: non-negligible divergences from probability sampling. Problem of representativeness: larger percentages of housewives in their thirties, “professional respondents” ⇒ Quota sampling: divergences still remain. Advantages: collecting samples satisfying specific conditions

Propensity Score Adjustment Propensity Score (Rosenbaum and Rubin, 1983): the probability of a given respondent belonging to a particular group, calculated from covariate information; the ‘typicality’ of a given respondent in a particular group. Covariates: variables that can be considered to affect both the variables that are targets of adjustment AND survey assignment (probability sampling survey vs. internet survey)

Propensity Score Adjustment Data: probability sampling survey and Internet survey, with shared covariates. Logistic (or probit) regression. Independent vars: covariates. Dependent var: survey assignment (probability sampling survey vs. internet survey). Predict the probability (e) for each respondent; e = propensity score
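The regression step on this slide can be sketched as follows. This is a minimal illustration, not the author's implementation: the toy covariates (a centered age score and a gender indicator), the data values, the function names `propensity` and `fit_logistic`, and the plain gradient-ascent fitting routine are all assumptions introduced here. In practice a statistics package would be used for the logistic (or probit) fit.

```python
import math

def propensity(coef, row):
    """Predicted probability e of belonging to the internet survey."""
    s = coef[0] + sum(c * x for c, x in zip(coef[1:], row))
    return 1.0 / (1.0 + math.exp(-s))

def fit_logistic(X, z, lr=0.1, epochs=2000):
    """Fit a logistic regression by batch gradient ascent (illustrative only).

    X: list of covariate rows; z: 1 = internet survey, 0 = probability sample.
    Returns [intercept, coefficient per covariate].
    """
    n, k = len(X), len(X[0])
    coef = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for row, zi in zip(X, z):
            err = zi - propensity(coef, row)
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        coef = [c + lr * g / n for c, g in zip(coef, grad)]
    return coef

# Hypothetical pooled data: (age score centered at 0, female indicator).
X = [(-1.5, 1), (-1.0, 0), (-0.5, 1), (0.0, 1), (0.5, 0), (-1.0, 1),
     (1.5, 0), (1.0, 1), (0.5, 0), (0.0, 0), (2.0, 1), (-0.5, 0)]
z = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # survey assignment

coef = fit_logistic(X, z)
e = [propensity(coef, row) for row in X]  # one propensity score per respondent
```

Both samples are pooled before fitting, so each respondent, whichever survey they answered, receives a score e between 0 and 1.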

Strongly Ignorable Treatment Assignment Condition Selection of covariates is extremely important, but the selection criteria were unclear in past studies. Effective selection criteria for survey data: Hoshino & Shigemasu (2004), Hoshino & Maeda (2006). Strongly Ignorable Treatment Assignment: survey assignments depend on covariates yet do NOT directly depend on the dependent variables (the variables that are targets of adjustments)

Strongly Ignorable Treatment Assignment Condition
1. Select covariates that are stable within individuals and have the potential to be used continuously in both internet surveys and probability sampling surveys.
2. Select covariates that differentiate internet surveys from probability sampling surveys (in particular, select covariates that yield larger standardized partial regression coefficients and smaller p-values in analyses such as Student’s t-test and logistic regression analysis).
3. Select covariates whose partial regression coefficients in predicting the target variables of adjustments have the same sign (both positive or both negative) in the two sets of data. In particular, select covariates whose standardized partial regression coefficients are relatively large in absolute value.
4. Remove covariates from the set selected with the above criteria so as to make the sum of squared errors smaller (or to minimize its increase).

Strongly Ignorable Treatment Assignment Condition Criterion 3 is particularly important. Adjustments can have adverse effects if covariates are selected only to maximize the predictive power of a logistic regression analysis that predicts survey assignments (Hoshino & Maeda, 2006). We follow the four criteria when determining covariates for calculating propensity scores and verify the effectiveness of propensity score adjustments.

Survey Data
Internet Survey: Three-wave panel design survey conducted during the last days before the Upper House election of Japan in summer 2007. Respondents: those who “regularly obtain information about social and political issues from internet resources”. Purposive sampling. An intensive panel survey that spans three days would have been difficult to conduct by traditional interview or telephone surveys.
Probability Sampling Survey: Two-stage stratified random sampling; N=1373 (54.9%).
Sponsored by the Ministry of Education, Science, Sports and Culture, Grant-in-Aid (PI: Ken’ichi Ikeda; assignment number 18203033)

Dependent variables: The targets of adjustments Party ID

Dependent variables: The targets of adjustments Political parties for which respondents (Rs) voted in the 2007 Upper House election: proportional representation seats, electoral district seats

Candidates for Covariates Gender Age Education level Annual household income Occupation Subjective social class Conservative vs. Progressive ideology Social tolerance Participation in organizations Participation in informal groups Openness and verticalness of the participating organizations and informal groups Exposure to election campaigns

Selection of Covariates
1. Select covariates that are stable within individuals and have the potential to be used continuously in both internet surveys and probability sampling surveys. ⇒ OK
2. Select covariates that differentiate internet surveys from probability sampling surveys (in particular, select covariates that yield larger standardized partial regression coefficients and smaller p-values in analyses such as Student’s t-test and logistic regression analysis).

Selection of Covariates
3. Select covariates whose partial regression coefficients in predicting the target variables of adjustments have the same sign (both positive or both negative) in the two sets of data. In particular, select covariates whose standardized partial regression coefficients are relatively large in absolute value.
Multinomial logit models. Dependent vars: Party ID and votes. Independent vars: each of the covariate candidates

Selection of Covariates

Calculation of Propensity Score
Method 1: Only the demographic variables (gender, age, annual household income, and subjective social class).
Method 2: Besides the demographic variables, political attitudes and behaviors were also included (gender, age, annual household income, subjective social class, conservative or progressive ideology, participation in organizations, exposure to political campaigns).
Method 3: Criterion 3 was disregarded. With all the covariate candidates as independent variables, a logistic regression analysis that predicts survey assignments was performed, and covariates were selected with a variable reduction technique with 0.05 as the threshold for p-values.

Calculation of Propensity Score Logistic regressions. Dependent var: survey assignment (probability sampling survey vs. internet survey). Independent vars: each set of covariates. Calculate the predicted probability (e) of each R being assigned to the Internet survey. If e < .10, replace e with .10 as the lower limit. Calculate (1-e)/e, and use it as the weight for Internet survey respondents.
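The weighting rule on this slide can be sketched as below. The propensity scores, answers, and function names are hypothetical and chosen only to show the arithmetic; the floor of .10 and the weight (1-e)/e are taken from the slide. Flooring e caps the weight at (1 - 0.10) / 0.10 = 9, which keeps a few respondents with very small e from dominating the weighted estimates.

```python
def internet_weight(e, floor=0.10):
    """Weight for an internet respondent with propensity score e.

    Scores below the floor are raised to the floor before computing (1 - e) / e.
    """
    e = max(e, floor)
    return (1.0 - e) / e

def weighted_shares(answers, weights):
    """Weighted share of each answer category among internet respondents."""
    totals = {}
    for answer, w in zip(answers, weights):
        totals[answer] = totals.get(answer, 0.0) + w
    grand_total = sum(totals.values())
    return {answer: t / grand_total for answer, t in totals.items()}

# Hypothetical internet respondents: propensity scores and reported party ID.
scores = [0.05, 0.50, 0.80, 0.20]
answers = ["Party A", "Party B", "Party B", "Party A"]

weights = [internet_weight(e) for e in scores]
adjusted = weighted_shares(answers, weights)
```

Respondents who look typical of the internet panel (large e) receive small weights, and respondents who resemble the probability sample (small e) receive large ones, so the weighted internet sample is pulled toward the covariate profile of the probability sample.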

Adjustment of Party ID Taking the distribution in the probability sampling survey as the true value, the squared errors under the three methods were calculated. The adjustments were unsuccessful: the squared errors were larger after the adjustments than before.
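The evaluation described here can be sketched as follows. The shares below are made-up numbers, not the study's results, and summing the squared differences over answer categories is an assumption about how the slide's squared error was aggregated.

```python
def squared_error(estimate, benchmark):
    """Sum of squared differences between two share distributions.

    Categories missing from one distribution are treated as a share of 0.
    """
    categories = set(estimate) | set(benchmark)
    return sum((estimate.get(c, 0.0) - benchmark.get(c, 0.0)) ** 2
               for c in categories)

# Hypothetical party ID shares (not taken from the presentation).
benchmark = {"Party A": 0.40, "Party B": 0.35, "None": 0.25}   # probability sample
unadjusted = {"Party A": 0.50, "Party B": 0.30, "None": 0.20}  # raw internet sample
adjusted = {"Party A": 0.55, "Party B": 0.30, "None": 0.15}    # after weighting

se_before = squared_error(unadjusted, benchmark)
se_after = squared_error(adjusted, benchmark)
adjustment_helped = se_after < se_before
```

With these illustrative numbers the weighted distribution is further from the benchmark than the raw one, which is the pattern the slide reports for party ID.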

Adjustment of Party ID

Adjustment of Votes The squared errors of electoral district The squared errors of proportional representation

Adjustment of Votes Proportional representation seats; electoral district seats

Why did the adjustment fail for Party ID?
Case 1: The selected covariates have explanatory power over assignments but do NOT have explanatory power over the dependent variables that are the targets of adjustments. Unlikely: the pseudo R² values of the multinomial logit models do not differ much between party ID and votes.
Case 2: The data from the probability sampling survey diverged from the true values of the population.
Case 3: The number of covariate candidates was too small.
Case 4: The check of Strongly Ignorable Treatment Assignment is retroactive.

Why did the adjustment fail for Party ID? The Strongly Ignorable Treatment Assignment condition is not satisfied if survey assignment directly depends on party ID. The internet respondents were those who “regularly obtain information about social and political issues from internet resources”: they had strong interests in politics and tended to support particular political parties. People who support a particular political party might have a stronger tendency to participate in the internet survey

Why did the adjustment fail for Party ID? If survey assignments depend on variables that are not reducible to basic social attributes such as gender and age, adjustments with covariates alone would be difficult.

Dilemma of Adjustments in Social Survey Advantages of Internet survey Purposively extracting samples that meet certain conditions that are difficult to capture with probability sampling surveys Intensive panel design with short intervals However, Strongly Ignorable Treatment Assignment condition may be violated if the criteria for purposive sampling themselves depend on dependent variables that are the targets of adjustment Adjustments of internet surveys in voting behavior and public opinion research can be effective if they do not involve purposive sampling with particular political inclinations