Operational Definition: The activities of the researcher in measuring and manipulating a variable. Ex: Caffeine: consumption of 1 cup of coffee 1 hour before the experiment. Anxiety: measured by galvanic skin response. In a study of sleep deprivation and test performance: sleep deprivation is defined as being awake for 24 hours; exam performance is defined as the score on a 50-question exam, 1 point per question.
Identity: Data points that are different receive different scores. Ex: types of cereal coded as Corn Flakes = 12, Cheerios = 5, Raisin Bran = 9. The numerical values serve only to divide the data into categories; they carry no order or quantity. This is the nominal scale of measurement. Categorical variables are measured on this scale. Ex: ethnicity, gender, religion. There is no absolute zero value: there is no such thing as no ethnicity or no gender.
Categorical Data: Analyzed with nonparametric statistical procedures: the chi-square test of independence and the chi-square test of goodness-of-fit.
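As a rough illustration of the goodness-of-fit test named above, the sketch below computes the chi-square statistic by hand as the sum of (observed - expected)^2 / expected. The cereal counts and the function name `chi_square_gof` are hypothetical, made up for this example; the null hypothesis assumed here is that shoppers prefer the three cereals equally.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 shoppers each pick one of three cereals.
observed = [30, 18, 12]   # Corn Flakes, Cheerios, Raisin Bran
expected = [20, 20, 20]   # equal preference under the null hypothesis

chi_square = chi_square_gof(observed, expected)
degrees_of_freedom = len(observed) - 1

print(f"chi-square = {chi_square:.2f}, df = {degrees_of_freedom}")
```

The statistic is then compared against the critical value from a chi-square table for the given degrees of freedom (5.99 for df = 2 at alpha = .05).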
Magnitude: Data that are ranked in order along a continuum of the variable being measured. Ex: finishing places in the NYC marathon: 1st place: 2:09:59; 2nd place: 2:13:51; 3rd place: 2:34:27. The ranks order the runners, but the difference between adjacent ranks is not the same. This is the ordinal scale of measurement. There is no absolute zero value. Nonparametric statistics: Wilcoxon tests.
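The unequal spacing between ranks can be checked with a little arithmetic. The sketch below converts the three finishing times from the marathon example to seconds and compares the gaps between adjacent places; the helper name `to_seconds` is an assumption made for this illustration.

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Finishing times from the marathon example, keyed by rank.
finish_times = {1: "2:09:59", 2: "2:13:51", 3: "2:34:27"}

gap_1st_to_2nd = to_seconds(finish_times[2]) - to_seconds(finish_times[1])
gap_2nd_to_3rd = to_seconds(finish_times[3]) - to_seconds(finish_times[2])

print(gap_1st_to_2nd, gap_2nd_to_3rd)  # 232 1236
```

One step in rank corresponds to 232 seconds in the first case and 1,236 seconds in the second, which is why ranks alone cannot be treated as equal-sized units.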
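Equal Unit Size: Data that have an equal unit size vary by the same amount throughout the scale. Ex: temperature in degrees Fahrenheit or Celsius: the difference between 10 and 20 degrees is the same size as the difference between 20 and 30 degrees. This is the interval scale of measurement. Interval data do not have an absolute zero: 0 degrees is still a temperature. Math operations such as addition and subtraction can be performed on interval data, but ratios are not meaningful (20 degrees is not twice as hot as 10 degrees).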
Absolute Zero: A score of zero indicates the absence of the variable being measured. Ex: words recalled: the score is zero if no words are recalled. This is the ratio scale of measurement. Data can be described in terms of proportions or ratios. Ex: recalling 16 words is twice as many as recalling 8 words. Statistics: t-tests, ANOVAs, correlation coefficients.
Discrete vs. Continuous Variables: Discrete variables consist of whole-number units or categories; their values are distinct and detached from each other. Ex: gender, religion, number of children. Continuous variables allow for fractional values and fall on a continuum. Ex: weight (75.45 lbs), reaction time (23.41 seconds).
Reliability: The consistency or stability of a measuring instrument or a measure of behavior. Observed score = true score + measurement error. Reliability is measured using correlation coefficients, which range from r = 0 to r = 1. As measurement error increases, the reliability coefficient drops further below 1.00.
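The relationship between measurement error and reliability can be sketched with a small simulation. Each simulated person has a true score, and two administrations of the test each add independent random error to it; the correlation between the two sets of observed scores is then an estimate of reliability. Everything here is an assumption made for illustration: the normal distributions, the means and standard deviations, the sample size, and the function names `pearson_r` and `simulate_reliability`.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate_reliability(error_sd, n=2000, seed=0):
    """Correlate two administrations of a test that share true scores."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(50, 10) for _ in range(n)]
    # Observed score = true score + measurement error (independent each time).
    first = [t + rng.gauss(0, error_sd) for t in true_scores]
    second = [t + rng.gauss(0, error_sd) for t in true_scores]
    return pearson_r(first, second)


r_small_error = simulate_reliability(error_sd=1)
r_large_error = simulate_reliability(error_sd=20)

print(f"small error: r = {r_small_error:.2f}")
print(f"large error: r = {r_large_error:.2f}")
```

With little error the correlation stays close to 1.00; with error that is large relative to the spread of true scores, it falls well below that.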
Reliability Types: Test/retest reliability, alternate-forms reliability, split-half reliability, and interrater reliability. Each type is defined below.
Test/Retest Reliability: Giving the same test again after a short time interval. Measures how performance on the 1st test is correlated with performance on the 2nd test. If the correlation is high, then the test is reliable. Measures the stability of a test over time. Problem: practice effects.
Alternate-forms Reliability (parallel-forms reliability): Administering 2 tests that are slightly altered versions of each other. Measures how performance on the 1st test is correlated with performance on the alternate 2nd test. If the correlation is high, then the tests are reliable. Measures the stability of a test over time. Problems: it is difficult to create 2 tests that are truly parallel; practice effects.
Split-half Reliability: One test is divided into 2 parts. Measures how scores on one half of the test correlate with scores on the other half. If the correlation is high, then the test is reliable. Does not measure the stability of a test over time. Problem: it is difficult to determine how to divide the test. The test is usually divided into odd-numbered and even-numbered questions, so that a half made up of easy questions is not compared with a half made up of difficult questions.
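The odd/even split can be sketched as follows. Each row holds one person's item-by-item results (1 = correct, 0 = incorrect) on an 8-question test; the odd-numbered and even-numbered questions are summed separately and the two half scores are correlated. The response data and the function name `pearson_r` are hypothetical, chosen only to illustrate the procedure.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical responses: one row per person, one column per question.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 1, 1, 0],
]

# Index 0 is question 1, so items[0::2] are the odd-numbered questions.
odd_half = [sum(items[0::2]) for items in responses]
even_half = [sum(items[1::2]) for items in responses]

split_half_r = pearson_r(odd_half, even_half)
print(f"split-half r = {split_half_r:.2f}")
```

Because both halves come from a single administration, this estimate says nothing about stability over time, consistent with the point above.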
Interrater Reliability (Interobserver Agreement): Measures the extent to which 2 or more raters agree on observations. Based on the % agreement between raters. If the raters' data are reliable, then the % agreement should be high. When low interrater reliability is observed: check the protocol, check the measuring instruments, and retrain the raters.
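Percent agreement between two raters can be computed as the number of observations on which they agree, divided by the total number of observations, times 100. The sketch below assumes two raters coding the same 10 observations; the codes and the function name `percent_agreement` are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of observations on which two raters give the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * agreements / len(rater_a)


# Hypothetical codes for the same 10 observations.
rater_a = ["aggressive", "neutral", "neutral", "aggressive", "neutral",
           "aggressive", "neutral", "neutral", "aggressive", "neutral"]
rater_b = ["aggressive", "neutral", "aggressive", "aggressive", "neutral",
           "aggressive", "neutral", "neutral", "neutral", "neutral"]

agreement = percent_agreement(rater_a, rater_b)
print(f"{agreement:.0f}% agreement")  # 80% agreement
```

The raters disagree on 2 of the 10 observations, giving 80% agreement.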
Validity: The truth of a measure or observation, that is, whether it measures what it claims to measure. Ex: to validate a new machine for measuring heart rate, correlate the results obtained with the new machine with the results obtained with existing machines. 4 types of validity: criterion validity, construct validity, external validity, and internal validity.
Criterion Validity: Validates a measure by checking it against a standard measure (or criterion). Predictive validity: making predictions about one aspect of behavior based on another measure of behavior. Ex: SAT scores correlate with freshman-year GPA in college.
Construct Validity: The degree to which the IV and DV measure what they are intended to measure. Ex: Coke vs. Root Beer vs. Pepsi. Confounding variables reduce the construct validity of a study. Minimize invalidity by using operational definitions and adhering to a protocol during the study.
External Validity: The extent to which the observations can be generalized to other settings and populations. Ex: Stroop effect. Replications test whether the observations can be repeated under different circumstances, and so provide insight into the generality of the observations.
Internal Validity: Experiments aim to determine cause-effect relations in the world. Internal validity is the extent to which we can make causal statements about the relationship between variables. Confounding variables reduce the internal validity of a study: when confounds are present, we cannot infer causality.
Reliability and Validity: A study can be reliable but not valid. Ex: Rorschach test. But if a study is valid, it is also reliable. Ex: Beck Depression Inventory.