Analyzing Reliability and Validity in Outcomes Assessment

Presentation transcript:

Analyzing Reliability and Validity in Outcomes Assessment Deborah K. van Alphen and Robert W. Lingard California State University, Northridge

Overview
Introduction
Basic statistical concepts
Reliability
Validity
Correlation
March 19, 2009 van Alphen & Lingard

Introduction – Basic Questions
How can we make assessment easier?
  Minimize the effort required to collect data
How can we design an assessment to provide useful information?
  Use tools to evaluate the approach
How can we learn more from the assessment results we obtain?
  Use tools to interpret quantitative data

Making Assessment Easier
Assess existing student work rather than creating or acquiring separate instruments.
Measure only a sample of the population to be assessed.
Depend on assessments at the College or University level.

Designing Assessments to Produce Useful Information
Does the assessment instrument measure what is intended to be assessed?
How should the students to be measured be selected?
Does the assessment approach produce consistent results?

Learning More from Assessment Results
What do the assessment results really mean?
  Was the sample used representative and large enough?
  Were the results obtained meaningful?
  Were the instrument and process utilized dependable?
  Is a difference between two results significant?

Basic Statistical Concepts
Central tendency for data
Frequency distribution of data
Variance among data

Measure of Central Tendency: Mean
n = number of observations in a sample
\(x_1, x_2, \ldots, x_n\) denotes these n observations
\(\bar{x}\), the sample mean, is the most common measure of center (a statistic)
\(\bar{x}\) is the arithmetic mean of the n observations:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
\(\mu\) represents the population mean, a parameter

Measure of Central Tendency: Median
The median of a set of measurements is the middle value when the measurements are arranged in numerical order.
If the number of measurements is even, the median is the mean of the two middle measurements.
Example: {1, 2, 3, 4, 5} Median = 3
Example: {1, 2, 3, 4, 100} Median = 3
Example: {1, 2, 3, 4, 5, 6} Median = (3 + 4)/2 = 3.5
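
The median rule above can be sketched in a few lines of Python. This is an illustrative sketch only; the function name `median` and the list `examples` are our own choices, not part of the slides.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if the count is even."""
    ordered = sorted(values)          # arrange the measurements in numerical order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: a single middle value exists
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two middle values

# The three examples from the slide.
examples = [[1, 2, 3, 4, 5], [1, 2, 3, 4, 100], [1, 2, 3, 4, 5, 6]]
medians = [median(e) for e in examples]
print(medians)  # [3, 3, 3.5]
```

Note that replacing 5 with 100 in the second example leaves the result unchanged, which is the property the next slide relies on.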

Comparison of Mean and Median
A survey of computer scientists yielded the following seven annual salaries: $31.3K, $41K, $45.1K, $46.3K, $47.5K, $51.6K, $61.3K
  median = $46.3K and mean = $46.3K
If we add Bill Gates to the sample for this survey, the new sample (8 values) is: $31.3K, $41K, $45.1K, $46.3K, $47.5K, $51.6K, $61.3K, $966.7K
  median = $46.9K (slight increase)
  mean = $161.35K (large increase)
Outliers have a large effect on the mean, but not the median
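
The salary comparison can be checked with Python's standard `statistics` module. This is a small sketch using the figures from the slide (in thousands of dollars); the variable names are our own.

```python
from statistics import mean, median

salaries = [31.3, 41.0, 45.1, 46.3, 47.5, 51.6, 61.3]  # in $K, taken from the slide
with_outlier = salaries + [966.7]                       # the same sample plus one extreme value

mean_before, median_before = mean(salaries), median(salaries)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(f"before: mean = {mean_before:.2f}K, median = {median_before:.2f}K")
print(f"after:  mean = {mean_after:.2f}K, median = {median_after:.2f}K")
```

One added value moves the mean by more than $115K while the median shifts by about $0.6K.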

Frequency Distribution of Data
The tabulation of raw data obtained by dividing the data into groups of some size and computing the number of data elements falling within each pair of group boundaries
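
As a rough illustration of this tabulation, the sketch below groups values into equal-width intervals and counts the members of each. The scores are hypothetical, and the function name and its `width` and `start` parameters are assumptions made for the example.

```python
from collections import Counter

def frequency_distribution(data, width, start=0):
    """Count how many values fall in each interval [low, low + width)."""
    # Map every value to the lower boundary of its interval, then count.
    counts = Counter(start + ((x - start) // width) * width for x in data)
    return {(low, low + width): counts[low] for low in sorted(counts)}

scores = [52, 67, 71, 74, 78, 81, 83, 85, 88, 90, 94, 97]  # hypothetical test scores
table = frequency_distribution(scores, width=10, start=50)
for (low, high), count in table.items():
    print(f"{low}-{high - 1}: {count}")
```

Intervals that contain no data are simply omitted from the result in this sketch.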

Frequency Distribution – Tabular Form

Histogram
A histogram is a graphical display of statistical information that uses rectangles to show the frequency of data items in successive numerical intervals of equal size.
In the most common form of histogram, the independent variable is plotted along the horizontal axis and the dependent variable is plotted along the vertical axis.
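
A histogram can be approximated in plain text, as in the sketch below. The counts are hypothetical and the function `text_histogram` is our own. Unlike the common form described above, this version is rotated: the intervals run down the page and bar length shows frequency.

```python
def text_histogram(counts):
    """Render one row of '#' marks per interval; counts maps an interval label to its frequency."""
    width = max(len(label) for label in counts)  # pad labels so the bars line up
    return "\n".join(f"{label.ljust(width)} | {'#' * n}" for label, n in counts.items())

counts = {"50-59": 1, "60-69": 1, "70-79": 3, "80-89": 4, "90-99": 3}  # hypothetical frequencies
chart = text_histogram(counts)
print(chart)
```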

Frequency Distribution – Histogram

Variation among Data
The following three sets of data have a mean of 10:
  {10, 10, 10}
  {5, 10, 15}
  {0, 10, 20}
A numerical measure of their variation is needed to describe the data.
The most commonly used measures of data variation are:
  Range
  Variance
  Standard Deviation

Measures of Variation: Variance
Sample of size n: \(x_1, x_2, \ldots, x_n\)
One measure of positive variation is \((x_i - \bar{x})^2\), where \(\bar{x}\) is the sample mean
Definition of sample variance (sample size = n):
\[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]
Definition of population variance (population size = N):
\[ \sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2 \]

Measures of Variation: Standard Deviation
Sample Standard Deviation:
\[ s = \sqrt{s^2} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2} \]
Population Standard Deviation:
\[ \sigma = \sqrt{\sigma^2} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2} \]
The units of standard deviation are the same as the units of the observations

Measures of Variation: Variance and Standard Deviation
The following data sets each have a mean of 10.

Data Set      Variance                    Standard Deviation
10, 10, 10    (0 + 0 + 0)/2 = 0           0
5, 10, 15     (25 + 0 + 25)/2 = 25        5
0, 10, 20     (100 + 0 + 100)/2 = 100     10

Good measure of variation
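
The figures for these three data sets can be reproduced with Python's standard `statistics` module, where `variance` and `stdev` divide by n − 1 (the sample definitions) and `pvariance` divides by N (the population definition). The variable names are our own.

```python
from statistics import pvariance, stdev, variance

data_sets = [[10, 10, 10], [5, 10, 15], [0, 10, 20]]  # each has a mean of 10

# (sample variance, sample standard deviation, population variance) per data set
results = [(variance(d), stdev(d), pvariance(d)) for d in data_sets]

for d, (s2, s, sigma2) in zip(data_sets, results):
    print(d, "sample variance:", s2, "sample std dev:", s, "population variance:", sigma2)
```

The population variance is smaller than the sample variance for the same data because it divides by 3 rather than 2.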

Definition of Terms Related to Sampling
Data: Observations (test scores, survey responses) that have been collected
Population: Complete collection of all elements to be studied (e.g., all students in the program being assessed)
Sample: Subset of elements selected from a population
Parameter: A numerical measurement of a population
Statistic: A numerical measurement describing some characteristic of a sample

Sampling Example
There are 1000 students in our program, and we want to study certain achievements of these students. A subset of 100 students is selected for measurements.
Population = 1000 students
Sample = 100 students
Data = 100 achievement measurements

Methods of Sample Selection
Probability Sampling: The sample is representative of the population
Non-Probability Sampling: The sample may or may not represent the population

Probability Samples
Random sample: Each member of a population has an equal chance of being selected
Stratified random sample: The population is divided into sub-groups (e.g., male and female) and a random sample from each sub-group is selected
Cluster sample: The process of sampling moves through stages until the final sample has been selected
January 12 - 13, 2009 S. Katz & D. van Alphen
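
The first two sampling schemes can be sketched with Python's `random` module. Everything below is illustrative: the student records, the `level` attribute used as the stratum, and the function names are hypothetical, and the 1000/100 sizes echo the sampling example earlier in the deck.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k members without replacement; every member is equally likely to be chosen."""
    return random.Random(seed).sample(population, k)

def stratified_sample(population, stratum_of, fraction, seed=None):
    """Split the population into sub-groups, then draw the same fraction at random from each."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, round(len(members) * fraction)))
    return sample

# Hypothetical population: 1000 students, one quarter of them "upper" division.
students = [{"id": i, "level": "upper" if i % 4 == 0 else "lower"} for i in range(1000)]

srs = simple_random_sample(students, 100, seed=1)
strat = stratified_sample(students, lambda s: s["level"], 0.10, seed=1)
print(len(srs), len(strat), sum(s["level"] == "upper" for s in strat))
```

The stratified draw guarantees that each sub-group appears in proportion to its size, whereas the simple random sample only achieves that on average.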

Non-Probability Samples
Convenience sample: The sample that is easily available at the time
Network sample: The sample is constructed by finding members of the population through the contacts of a known member
Quota sample: The sample takes available subjects, but it attempts to ensure inclusion of representatives from certain elements of the population

Problems with Sampling
The sample may not be representative of the population.
The sample may be too small to provide valid results.
It may be difficult to obtain the desired sample.

Reliability and Validity
Reliability refers to the stability, repeatability, and consistency of an assessment instrument (i.e., how good the operational metrics and the measurement data are).
Validity refers to whether the measurement really measures what it was intended to measure (i.e., the extent to which an empirical measure reflects the real meaning of the concept under consideration).

Reliability
Tests of stability: If an instrument can be used on the same individual more than once and achieve the same results, it is stable.
Tests of repeatability: If different observers using the same instrument report the same results, the instrument is repeatable.
Tests of internal consistency: If all parts of the instrument measure the same concept, it is internally consistent.
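
The slides do not name a specific statistic for internal consistency. One commonly used choice is Cronbach's alpha, sketched below as an assumption on our part rather than as the authors' method. The response data are hypothetical: each row is one respondent, each column one item of the instrument.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a table with one row of item scores per respondent."""
    k = len(rows[0])                                  # number of items in the instrument
    items = list(zip(*rows))                          # transpose: one tuple of scores per item
    item_var = sum(variance(col) for col in items)    # sum of the per-item sample variances
    total_var = variance([sum(r) for r in rows])      # sample variance of respondents' totals
    return (k / (k - 1)) * (1 - item_var / total_var)

# Hypothetical responses: 5 respondents, 3 items scored 1 to 5.
responses = [[4, 5, 4], [3, 3, 2], [5, 5, 5], [2, 3, 2], [4, 4, 3]]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items vary together. The sketch needs at least two items and two respondents, and it fails if all respondents have the same total score.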

Validity
Self-evident measures: Does the instrument appear to measure what it is supposed to measure?
  Face validity: “It looks all right on the face of it.”
  Content validity: The validity is estimated from a review of literature on the topic or through consultation with experts.

Validity (Cont’d)
Pragmatic measures: These test the practical value of a particular instrument.
  Concurrent validity: The results have high correlation with an established measurement.
  Predictive validity: The results predicted actually occur.
Construct validity: When what you are attempting to measure is not directly observable, and the results correlate with a number of instruments attempting to measure the same construct.

Reliability and Validity
[Figure: three panels labeled "Reliable & valid", "Reliable but not valid", and "Valid but not reliable"]

Correlation
Correlation is probably the most widely used statistical method to assess relationships among observational data.
Correlation can show whether and how strongly two sets of observational data are related.
It can be used to show reliability or validity by correlating the results of different approaches to assessing the same outcome.
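
The slides do not say which correlation measure is meant. A common choice is the Pearson correlation coefficient, sketched below under that assumption. The two score lists are hypothetical rubric scores that two raters gave to the same five pieces of student work.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, with at least two values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))  # co-variation of x and y
    sxx = sum((a - mean_x) ** 2 for a in x)                       # variation of x
    syy = sum((b - mean_y) ** 2 for b in y)                       # variation of y
    return sxy / sqrt(sxx * syy)

rater_a = [2, 3, 3, 4, 5]  # hypothetical scores from the first rater
rater_b = [1, 3, 4, 4, 5]  # hypothetical scores from the second rater
r = pearson_r(rater_a, rater_b)
print(round(r, 3))
```

A value near +1 means the two raters rank the work similarly, which would support the repeatability of the rubric. The function divides by zero if either list is constant, so such input needs separate handling.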

Example Correlation

Group Problem
Assume your goal is to assess the written communication skills of students in your program. (Assume the number of students in the program is large and that you already have a rubric to use in assessing student writing.)
Working with your group, devise an approach to accomplish this task.

Group Problem (Cont’d)
Specifically, who would you assess and what student-produced work items would you evaluate, i.e., how would you construct an appropriate sample of students (or student work) to assess?
Identify any concerns or potential difficulties with your plan, including issues of reliability or validity.
What questions do you have regarding the interpretation of results once the assessment is completed?