Making Social Work Count Lecture 4 An ESRC Curriculum Innovation and Researcher Development Initiative
What is being studied? Approaches to measuring variables
Assessment and judgment Social workers have to assess all the time: – Is there a problem or need here? – What is the risk of things getting worse? – Have I made a difference? Researchers carry out similar tasks This lecture considers the key issue of developing meaningful measurements for use in quantitative research Many of the issues are of relevance to the more general task of “assessment”
Quantitative and qualitative All research involves simplification – The question is whether we know what is gained and lost by simplification Qualitative studies tend to focus on meaning – Common strategy is identifying themes of relevance Quantitative studies convert issues to numbers – Allows certain types of important description (e.g. how many people have this problem?) – And – crucially - comparison (e.g. are things getting better? Does one group have more problems?)
Quantitative and qualitative Quantitative research This session focuses on quantitative research It identifies key considerations in thinking about the quality of quantitative studies – Reliability – Validity Qualitative research Some of these considerations can also be applied to qualitative research However, qualitative studies also have their own criteria for assessing good research
Learning outcomes Understand what a variable is Appreciate different types of variable that can be used in quantitative research Understand issues in relation to reliability and validity Know what a standardised instrument is Have had the opportunity to reflect on implications for practice
Example of children in care Returning to the idea that care “fails” children Lecture 3 suggested that comparing children who have left care with the general population is not a valid comparison Now let’s look at outcome measures
Forrester et al. (2009) review The literature review focused on studies that looked at child welfare over time for children in care Strongest finding: very poor research base – this is a difficult area to research Of 13 studies, almost all suggested: – Most of the harm occurs before care – Children tend to do better once in care – Some harm occurs as children leave care – Even in good placements children still tend to have problems
But… What “outcomes” were being measured? What outcomes do YOU think should be measured for children in care?
Key points Deciding on “outcomes” or variables for a study is NOT some value-neutral, technocratic activity Key issues to consider: – WHO is deciding what is to be measured? (e.g. experts? Government? Service users?) – WHAT is being measured? – HOW is it being measured? [focus of this lecture]
Key points What is measured? For instance, in studies reviewed by Forrester: – the most common issue “measured” was behaviour (and particularly problem behaviour) – education was the second most common – others included physical growth, social relations, etc How is it measured? Studies in the review: – obtained information from social work files and made a researcher “judgment” – used school tests – pooled interview and other data and made a researcher “judgment” – used questionnaires to carers What are the strengths and weaknesses of each?
Attributes and variables An attribute – is a characteristic of an individual e.g. height, intelligence, beauty, serenity A variable – is the operationalisation of an attribute e.g. metres, IQ score, marks out of 10?, err… It allows attributes to be compared and described The focus of this lecture is on how attributes are operationalised
Variables need to be reliable and valid Reliability Are the results consistent, e.g. can the results be replicated in different conditions and across different groups? Validity Does the instrument measure what it claims to measure?
Measures should be both reliable and valid [Diagram: target illustrations of the possible combinations] – Reliable but not valid – Not reliable AND not valid – A measure cannot be valid if it is not reliable…
Standardised Instruments (SIs) Tools that measure a specific quality or characteristic e.g. psychological distress They let us compare results across groups in different settings e.g. social workers, families, teachers, police..... SIs need to be high in both reliability and validity
Reliability – overview The consistency of a measure A test is considered reliable if we get the same result repeatedly Reliability can be estimated in a number of different ways – Test-retest reliability: over time – Inter-rater reliability: between different scorers – Internal Consistency Reliability: across items on the same test
Test-Retest Reliability Tests the extent to which the test is repeatable and stable over time The same social workers are given the same questions 2 to 3 weeks later If the results differ substantially, and there has been no intervention, then we should question the reliability of those questions
Inter-rater reliability Where two or more people rate/score/judge the test The scores of the judges are compared to find the degree of correlation/consistency between their judgements If there is a high degree of correlation between the different judgements, the test can be said to be reliable
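One common statistic for agreement between two raters is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. A sketch with hypothetical judgements (the category labels are invented for illustration):

```python
# Inter-rater reliability: Cohen's kappa for two raters making the same
# categorical judgements. Ratings below are illustrative.
def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

a = ["concern", "no concern", "concern", "concern", "no concern", "concern"]
b = ["concern", "no concern", "no concern", "concern", "no concern", "concern"]
print(f"kappa = {cohens_kappa(a, b):.2f}")
```

Kappa of 1 means perfect agreement; 0 means no better than chance.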
Internal Consistency Reliability For example where there are two questions within a SI that seem to be asking the same thing If the test is internally consistent the respondent should give the same answer to both questions More generally, questions should be linked to one another if they are measuring the same attribute
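Internal consistency is most often summarised as Cronbach's alpha: how strongly the items on a scale vary together. A minimal sketch with invented item responses:

```python
# Internal consistency: Cronbach's alpha across items intended to measure
# the same attribute. Rows are respondents, columns are item scores
# (illustrative data, not from any instrument discussed here).
def cronbach_alpha(item_scores):
    """item_scores: list of rows, one row of item responses per respondent."""
    k = len(item_scores[0])   # number of items
    def variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

responses = [
    [3, 3, 2],
    [1, 1, 1],
    [4, 3, 4],
    [2, 2, 2],
    [4, 4, 3],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

Alpha close to 1 indicates that the items move together, i.e. that they plausibly measure the same attribute.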
Validity The extent to which a test measures what it claims to measure: – Construct validity: The degree to which the test measures the construct of what it wants to measure – the overarching type of validity – Predictive validity: The degree of effectiveness with which the performance on a test predicts performance in a real-life situation – Content validity: that items on the test represent the entire range of possible items the test should cover
Construct validity The degree to which the test measures what it is intended to measure The over-arching concept in validity – all other types of validity are ways of assessing this As a result construct validity has many elements: – Predictive validity (can it predict things e.g. IQ scores and later test results) – Criterion validity (does it correctly differentiate e.g. does a screening instrument identify people who are depressed) – Content validity (is the full range of the construct included) – And other types…
Predictive validity Can structured risk assessment tools predict children who will be abused? Are the predictions more accurate than practitioners’ decisions?
Predictive validity Barlow et al (2013) found that most attempts to predict had low success i.e. high numbers of false positives or false negatives Further research needed to develop reliable tools that predict abuse or re-abuse Though this is also true for practitioners…
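False positives and false negatives are the two ways a predictive tool can fail, and they trade off against each other. A sketch of how they are counted, using entirely invented predictions and outcomes (not Barlow et al.'s data):

```python
# Predictive validity: counting errors for a hypothetical screening tool
# against what actually happened later. All data are illustrative.
def prediction_errors(predicted, actual):
    """predicted/actual: lists of booleans (True = abuse predicted/occurred)."""
    fp = sum(p and not a for p, a in zip(predicted, actual))  # false alarms
    fn = sum(a and not p for p, a in zip(predicted, actual))  # missed cases
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum(not p and not a for p, a in zip(predicted, actual))
    sensitivity = tp / (tp + fn)  # share of real cases the tool caught
    specificity = tn / (tn + fp)  # share of non-cases correctly cleared
    return fp, fn, sensitivity, specificity

predicted = [True, True, False, True, False, False, True, False]
actual    = [True, False, False, False, False, True, True, False]
fp, fn, sens, spec = prediction_errors(predicted, actual)
print(f"false positives={fp}, false negatives={fn}, "
      f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A tool with many false positives flags families unnecessarily; one with many false negatives misses children at risk, which is why "low success" on either count matters.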
Content validity Refers to the extent to which a measure represents elements of a social construct or trait For example, a depression scale may lack content validity if it only assesses the affective dimension of depression but fails to take into account the behavioural dimension Or : how should “ethnicity” be defined? In practice it is not possible to capture the full range of possible ethnicities – but what level of simplification is “valid”?
General Health Questionnaire (GHQ) A reliable and valid screening instrument identifying aspects of current mental health (anxiety/depression/social phobia) The self-administered questionnaire asks if someone has experienced a particular symptom or behaviour recently Each item is rated on a four-point scale Used in many countries in different languages
GHQ 12 questions Have you recently:
1. Been able to concentrate on whatever you are doing
2. Lost much sleep over worry
3. Felt that you are playing a useful part in things
4. Felt capable of making decisions about things
5. Felt constantly under strain
6. Felt you couldn’t overcome your difficulties
7. Been able to enjoy your normal day to day activities
8. Been able to face up to your problems
9. Been feeling unhappy and depressed
10. Been losing confidence in yourself
11. Been thinking of yourself as a worthless person
12. Been feeling reasonably happy, all things considered
GHQ 12 Different ways of scoring the data to measure risk of psychiatric problems All show a reasonable link with clinical diagnosis A common method scores each item as ‘yes’ or ‘no’ (depending on the question), with a total of 4 or more indicating risk How do social workers do…?
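The binary scoring described above can be sketched as follows. This is a simplified illustration, assuming each item's four response options score 0 for the first two positions and 1 for the last two; the responses are invented:

```python
# GHQ-12 binary scoring, as a sketch: each of the 12 items has four response
# options; the first two score 0 and the last two score 1, and a total of 4
# or more flags possible psychiatric problems. Responses below are positions
# 0-3 on the four-point scale (illustrative data).
def ghq12_score(responses):
    """responses: 12 integers in 0..3 (position on the four-point scale)."""
    assert len(responses) == 12
    return sum(1 for r in responses if r >= 2)  # last two options score 1

responses = [1, 2, 1, 0, 3, 2, 1, 1, 2, 0, 0, 1]
score = ghq12_score(responses)
print(f"GHQ score = {score}, at risk = {score >= 4}")
```

Consult the GHQ user's guide (Goldberg & Williams, 1988) for the actual scoring rules and thresholds.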
[Chart: Clinical scores for social workers and the general population using the GHQ (Carpenter et al., 2010; ONS, 2010)]
How to measure children’s emotional and behavioural welfare? SDQ: Questionnaire designed for carers, children and teachers Reliability is tested by: comparing ratings of emotional and behavioural welfare between informants – and over time Validity is tested by: seeing whether scores predict “real world” outcomes such as receiving specialist help, criminal behaviour and school exclusion also comparing with clinical assessment and other instruments
Strengths and Difficulties Questionnaire (SDQ) A brief behavioural screening questionnaire for parents/carers/teachers of 3-16 year olds Asks about psychological attributes, some positive and others negative – E.g. emotional, conduct, hyperactivity, peer relationship, prosocial behaviour
SDQ questions 25 questions composed of five scales with five questions in each scale E.g. the 5 questions in the Emotional Symptoms Scale:
1. I get a lot of headaches
2. I worry a lot
3. I am often unhappy
4. I am nervous in new situations
5. I have many fears
Responses: Not true/Somewhat true/Certainly true
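Scoring one SDQ scale can be sketched as below, assuming the three response options map to 0/1/2 and the five item scores are summed to give 0-10 per scale. (The published SDQ reverse-scores some items; that detail is omitted here, and the answers are invented.)

```python
# Scoring one SDQ scale, as a simplified sketch: response options map to
# 0/1/2 and the five item scores are summed (0-10 per scale). Illustrative
# answers; the real instrument reverse-scores some items.
SCORES = {"Not true": 0, "Somewhat true": 1, "Certainly true": 2}

def scale_score(responses):
    """responses: five answers to one scale's items."""
    return sum(SCORES[r] for r in responses)

emotional = ["Somewhat true", "Certainly true", "Not true",
             "Somewhat true", "Certainly true"]
print(f"Emotional symptoms score = {scale_score(emotional)} / 10")
```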
Why does this matter? Worth considering common social work research methods such as coming to a “researcher judgment” – how reliable? How valid? More importantly – what about your practice? What is a better way of judging whether a child has emotional or behavioural problems, or an adult is at risk of psychological problems – your judgment or a standardised instrument? If you want to evaluate whether you are making a difference – what role might a standardised instrument have?
Learning outcomes Do you? Understand what a variable is Appreciate different types of variable that can be used in quantitative research Understand issues in relation to: – Reliability – Validity Know what a standardised instrument is Have had the opportunity to reflect on implications for practice
References
Goldberg, D. & Williams, P. (1988) A User’s Guide to the General Health Questionnaire. Slough: NFER-Nelson
Goodman, R. (1997) The Strengths and Difficulties Questionnaire: A Research Note. Journal of Child Psychology and Psychiatry, 38,
Barlow, J., Fisher, J.D. and Jones, D. (2013) Systematic Review of Models for Analysing Significant Harm. Department for Education Research Report; London. Accessed: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/183949/DFE-RR199.pdf