Student Consensus on RateMyProfessors.com
April Bleske-Rechek, Amber Fritsch, and Brittany Henn
University of Wisconsin-Eau Claire
[Results panel titles: Instructors Vary in Perceived Quality and Easiness; Student Consensus About Instructors Is of Similar Degree Across Number of Student Raters; Students Show the Most Consensus About Instructors with Really High or Low Mean Ratings; Variance in Student Ratings Follows Quality, Not Easiness]

Background

RateMyProfessors.com (RMP.com) is a widely used public forum on which students use five-point scales to rate their instructors on characteristics such as Clarity, Helpfulness, Overall Quality (a composite of Clarity and Helpfulness), and Easiness ("how much work it is to get a good grade"). Despite RMP.com's widespread use and increasing influence,1 there are few systematic analyses of its ratings, and both popular and scholarly evaluations of the site are mixed.2,3,4 Previous research from our lab provided some evidence for the validity of ratings on the site: Students report rating instructors for a variety of pedagogical (rather than emotional) reasons, and students who post ratings do not differ from other students in their learning goals or grade orientation.5

In the current study, we further test the assumption that students' postings on RateMyProfessors.com reflect students' emotions rather than objective appraisals of the instruction they have received. If students' postings reflect emotional responses to the instructor, then students should vary widely in their ratings of a given instructor. That is, each student's rating should include varied sources of bias and error, and only by aggregating many students' responses should consensus about an instructor emerge. By this logic, the number of ratings should be negatively associated with the degree of variance in the ratings; that is, it should take many ratings to reach consensus around the mean.
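The aggregation logic above can be sketched numerically. Under the standard assumption of independent rater errors, the standard error of a mean rating shrinks with the square root of the number of raters; the per-rater error value below is purely hypothetical, not a figure from the study.

```python
from math import sqrt

# Sketch of the aggregation logic: if each student's rating carries
# independent error with standard deviation sd, then the mean of n
# ratings has standard error sd / sqrt(n), so aggregating more raters
# tightens the estimate of an instructor's "true" rating.
def standard_error(sd: float, n: int) -> float:
    return sd / sqrt(n)

sd = 1.0  # hypothetical per-rater error on the five-point scale
for n in (1, 10, 50):
    print(n, round(standard_error(sd, n), 2))
```

With these assumed numbers, moving from 1 to 50 raters shrinks the standard error of the mean by a factor of about seven, which is the sense in which aggregation produces consensus around the mean.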
However, if students are not ranting and raving, but rather providing objective assessments of instructor pedagogy, then the number of ratings should not be related to variance in ratings; that is, it should not take many ratings to reach consensus around the mean.

Method

We selected instructors from a single university who had 10 or more student ratings on RMP.com. First, we created a data set for each instructor, containing the quality, helpfulness, clarity, and easiness ratings from every student who had rated that instructor; we computed descriptive statistics (M, SD, and variance) for each of these variables. We then compiled an umbrella dataset covering all instructors. For each instructor, we recorded sex and discipline, the means, standard deviations, and variances described above, and the number of student raters who had contributed to those summary statistics. After removing six outlier instructors who had more than 87 ratings, the final dataset included 367 instructors with between 10 and 86 ratings. All disciplines were well represented.

Results

The histogram at left shows the percent of instructors at each mean quality rating. The instructors under investigation here show a distribution of ratings similar to that found in other research on RMP.com6: Instructors differ widely in how high or low in quality they are rated, but the distribution of means is slightly negatively skewed, such that, on average, instructors are rated more positively than negatively on quality. The histogram at right displays the percent of instructors at each mean easiness rating. Again, instructors differ widely in how low or high in easiness they are rated. Easiness ratings follow a normal distribution.
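The per-instructor descriptive statistics described in the Method can be computed with Python's standard library. The ratings below are invented for illustration, not data from the study.

```python
from statistics import mean, stdev, variance

# Hypothetical five-point quality ratings from ten students for one
# instructor (the study computed M, SD, and variance in this way for
# quality, helpfulness, clarity, and easiness).
quality = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4]

m = mean(quality)        # mean rating
sd = stdev(quality)      # sample standard deviation
var = variance(quality)  # sample variance (= sd squared)

print(round(m, 2), round(sd, 2), round(var, 2))
```

Low variance here would indicate strong student consensus about the instructor; the study's key question is whether that variance depends on how many students contributed ratings.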
The scatter plot at left displays the association between the number of quality ratings and the degree of consensus in those ratings; the scatter plot at right displays the association between the number of easiness ratings and the degree of consensus in those ratings. Each dot represents one instructor's number of ratings and degree of consensus in those ratings. As we expected, and counter to negative assumptions about the reliability of student ratings on the site, the degree of variance in a given instructor's ratings is not associated with how many students have rated them. In other words, instructors with 10 ratings show the same degree of consensus in their ratings as instructors with 50 ratings. The scatter at left shows r(367) = .03, p = .52; at right, r(367) = .09, p = .09. If anything, the trend at right suggests that more variance in easiness ratings is tied to more raters, not fewer. Within each sex and within each discipline, the number of ratings was not associated with the degree of variance in instructor quality ratings, all ps > .07, with values of r ranging from −.18 to

Although the degree of variance in student ratings is not tied to how many students have provided ratings, it is tied to the overall perception of instructor quality (shown at left; quadratic F(2, 364), p < .001, R² = .70) and easiness (shown at right; quadratic F(2, 364) = 66.25, p < .001, R² = .27). The effect is very robust for quality and replicates across instructor sex and discipline. Instructors with very high mean quality ratings showed very little variance in students' ratings (strong consensus); in some cases, essentially no variance at all.
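The key null result rests on the Pearson product-moment correlation between number of raters and rating variance. A minimal sketch, using made-up instructor records rather than the study's data:

```python
from math import sqrt

def pearson_r(x, y):
    # Pearson product-moment correlation between two equal-length lists.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical (number of raters, variance of ratings) pairs; under the
# "objective assessment" account, r should hover near zero.
records = [(10, 0.60), (25, 0.50), (40, 0.65), (55, 0.55), (70, 0.60)]
n_raters = [n for n, _ in records]
variances = [v for _, v in records]
print(round(pearson_r(n_raters, variances), 2))
```

A correlation near zero, as the study reports for its 367 instructors, means an instructor's rating variance cannot be predicted from how many students rated them.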
The bar graph at right displays a finding we have documented previously5: The 103 instructors in Math & Natural Sciences departments are rated as less easy than instructors in each of the other disciplines (135 in Arts & Humanities, 66 in Social Sciences, and 63 in Pre-professional programs), F(3, 363) = 6.21, p < .001, partial η² = .05, all pairwise ps ≤ .01. However, instructors in Math & Natural Sciences are rated similarly in quality, F(3, 363) = 0.21, p = .89, partial η² = .002. This finding implies that students distinguish between easiness and quality. In further support of students as objective judges of instruction, the bar graph at right shows that variance in students' ratings of both easiness and quality does not differ by discipline. That is, students show similar consensus about their instructors regardless of discipline (easiness F(3, 363) = 0.68, p = .56, partial η² = .006; quality F(3, 363) = 0.10, p = .96, partial η² = .001).

Discussion

Research suggests that a majority of students use RateMyProfessors.com to either view or post ratings about instructors at their institution.5 However, scholars have suggested that student ratings are not valid; that students are biased, grade-oriented consumers.3,4 In this study we have documented several effects that, taken together, suggest that students provide valid judgments of instruction: (1) the typical instructor is rated as higher in quality than in easiness; (2) student consensus about instructors can be achieved with as few as 10 ratings; (3) students show very strong consensus about instructors who are perceived to be high in quality; and (4) although one might presume that disciplines differ in how susceptible they are to student subjectivity or bias, the degree of variance (and therefore consensus) in students' ratings is similar across disciplines. In conclusion, we have shown that student consensus can be achieved with relatively few ratings. Our data call common assumptions about RateMyProfessors.com into question and reinforce previous findings suggesting that, in the aggregate, students provide objective ratings of instructional quality.2,5,7

Acknowledgements

We thank the Office of Research and Sponsored Programs at the University of Wisconsin-Eau Claire for supporting this research, and the McNair Foundation for providing financial support to Brittany Henn.

References

1. Steinberg, J. (2009, August 6). As Forbes sees it, West Point beats Princeton (and Harvard, too). Retrieved from m/2009/08/06/westpoint/
2. Otto, J., Sanford, D. A., & Ross, D. N. (2008). Does ratemyprofessor.com really rate my professor? Assessment & Evaluation in Higher Education, 33.
3. Davison, E., & Price, J. (2009). How do we rate? An evaluation of online student evaluations. Assessment & Evaluation in Higher Education, 34.
4. Felton, J., Mitchell, J., & Stinson, M. (2004). Web-based student evaluations of professors: The relationships between perceived quality, easiness, and sexiness. Assessment & Evaluation in Higher Education, 29.
5. Bleske-Rechek, A., & Michels, K. (2010). RateMyProfessors.com: Testing assumptions about student use and misuse. Practical Assessment, Research & Evaluation, 15.
6. Silva, K. M., Silva, F. J., Quinn, M. A., Draper, J. N., Cover, K. R., & Munoff, A. A. (2008). Rate my professor: Online evaluations of psychology instructors. Teaching of Psychology, 35.
7. Coladarci, T., & Kornfield, I. (2007). RateMyProfessors.com versus formal in-class student evaluations of teaching. Practical Assessment, Research & Evaluation, 12.