Performance Monitoring in the Public Services: Challenges, opportunities, pitfalls

Presentation transcript:

Performance Monitoring in the Public Services

Challenges, opportunities, pitfalls

- Challenge: Performance Monitoring in the Public Services (missed)
- Opportunities: Formal experiments in evaluating new policies (rescue)
- Pitfalls: Reporting, Reliance & rMDT, Freedom of Information

Performance Indicators: Good, Bad, and Ugly

Some good examples, but scientific standards, in particular statistical standards, had been largely ignored.

Royal Statistical Society concern: PM schemes should be

- Well-designed, avoiding perverse behaviours*
- Sufficiently analysed (context/case-mix)
- Fairly reported (measures of uncertainty)
- Shielded from political interference

*Address seriously the criticisms/concerns of those being monitored

1. Introduction

1990s rise in "government by measurement":
- goad to efficiency & effectiveness
- better public accountability (financial)

Three uses of PM data

- What works? (research role)
- Well[/under]-performing institutions or public servants (managerial role)
- Hold Ministers to account for stewardship of public services (democratic role)

2. PM Design, Target Setting & Protocol

How to set targets:
- Step 1: Reasoned assessment of plausible improvement within the PM time-scale
- Step 2: Work out the PM scheme's statistical potential ("power") against this rational target {see p11}

Power matters

- Excess power incurs unnecessary cost
- Insufficient power risks failing to identify effects that matter
- Insufficient power means claims of policy 'equivalence' cannot be trusted

How not to set targets {see p12}

How not to set targets {p12}

- Progressive sharpening: "better of current target & current performance" (ignores uncertainty: prisons)
- Setting an extreme target: "no-one to wait 4 hours" (abandon)
- Cascading the same target: 50% reduction in MRSA within 3 years (most hospitals: < 10 MRSA)

3. Analysis of PM data: same principles

- Importance of variability: an intrinsic part of the real world & interesting per se; it also contributes to uncertainty in primary conclusions {p15}
- Adjusting for context to achieve comparability {note p17}: incompleteness of any adjustment
- Multiple indicators resist a 1-number summary (avoid value judgements + reveal intrinsic variation)

4. Presentation of PIs: same principles

- Simplicity does not mean discarding uncertainty
- League tables → show uncertainty of ranking {PLOT 1}
- Star 'banding' → show uncertainty of an institution's banding
- Funnel plot: variability depends on sample size; divergent hospitals stand out {see PLOT 2}

Plot 1: 95% intervals for ranks
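The report's PLOT 1 is not reproduced here, but the idea is easy to simulate. Below is a minimal sketch (invented data, not the report's) of how 95% intervals for institutional ranks can be obtained by parametric bootstrap: resample each institution's indicator from its sampling distribution, re-rank, and read off percentiles of the simulated ranks.

```python
# A minimal sketch (invented data, not the report's) of 95% intervals
# for ranks via parametric bootstrap: resample each institution's rate
# from its sampling distribution, re-rank, and take percentiles.
import numpy as np

rng = np.random.default_rng(1)
rates = np.array([0.12, 0.15, 0.16, 0.18, 0.25])  # observed PI per institution
ses = np.array([0.03, 0.02, 0.04, 0.03, 0.05])    # standard errors (assumed)

sims = rng.normal(rates, ses, size=(10_000, len(rates)))  # simulated rates
ranks = sims.argsort(axis=1).argsort(axis=1) + 1          # rank 1 = lowest rate

for i in range(len(rates)):
    lo, hi = np.percentile(ranks[:, i], [2.5, 97.5])
    print(f"Institution {i + 1}: median rank {int(np.median(ranks[:, i]))}, "
          f"95% interval [{int(lo)}, {int(hi)}]")
```

Typically only the extremes rank precisely; mid-table institutions have intervals spanning most of the table, which is exactly the uncertainty a bare league table hides.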

Funnel plot: an alternative to the ‘league table’
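The slide's PLOT 2 is not available, so here is a hedged sketch with simulated data: a funnel plot places each institution's observed rate against its sample size, with control limits that narrow as size grows, so genuinely divergent institutions stand out without the spurious precision of a league table.

```python
# A hedged sketch of a funnel plot: institutional proportions against
# sample size, with 95% and 99.8% control limits around the overall
# rate. All data are simulated; nothing here comes from the report.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
n = rng.integers(50, 2000, size=60)   # institution sizes
p0 = 0.10                             # overall (benchmark) rate
obs = rng.binomial(n, p0) / n         # observed rates

grid = np.linspace(n.min(), n.max(), 200)
se = np.sqrt(p0 * (1 - p0) / grid)
plt.scatter(n, obs, s=12)
plt.plot(grid, p0 + 1.96 * se, 'b--', grid, p0 - 1.96 * se, 'b--')  # 95%
plt.plot(grid, p0 + 3.09 * se, 'r:', grid, p0 - 3.09 * se, 'r:')    # 99.8%
plt.axhline(p0, color='k')
plt.xlabel("Sample size"); plt.ylabel("Observed rate")
plt.title("Funnel plot: divergent institutions fall outside the limits")
plt.show()
```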

Teenage pregnancies

- Government aim: to reduce teenage pregnancies
- Target reduction: 15% between 1998 and 2004
- Hope for a 7.5% reduction by 2001

5. Impact of PM on the public services

- Public cost: if PM fails to identify under-performing institutions & so no remedial action is taken
- Less well recognised, institutional cost: being falsely labelled as under-performing
- Unintended consequences: e.g. risk-averse surgeons

6. Evaluating PM initiatives

- Commensurate with risks & costs
- How soon to start evaluation
- Pre-determined policy roll-out (DTTOs)
- Disentangling (several) policy effects
- Role of experiments (& randomisation)

“What works” in UK criminal justice? RCTs essentially untried...

Judges prescribe sentences on lesser evidence than doctors prescribe medicines. Is the public aware?

7. Integrity, confidentiality & ethics

Integrity (statistical): for public accountability, PIs need wider-than-government consensus + safeguards, as for National Statistics. Integrity is lacking if targets are irrational, power is insufficient, the scheme is cost-inefficient, or the analysis lacks objectivity or is superficial.

Royal Statistical Society is calling for

- PM protocols
- Independent scrutiny of disputed PIs
- Reporting of measures of uncertainty
- Research into strategies other than "name & shame" + better designs for evaluating policy initiatives
- Wider consideration of PM ethics & cost-effectiveness

Application of scientific method

- Randomisation: to compare like with like
- Adequate study size: for precise estimation
- Reporting standards: as in medical journals
- Efficacy and costs: rational, prior estimates
- Peer scientific review of
- Study/trial protocol

Concept of randomisation

- Biology, 1926: Sir Ronald Fisher
- Medicine, 1947: Sir Austin Bradford Hill's Randomised Controlled Trial
- Criminal justice: ?

Randomisation in medicine

- Toss of a coin determines experimental or control treatment
- RCT assignment is unpredictable
- Fair [=> ethical] allocation of a scarce resource
- Balances treatment numbers overall, in each hospital, and for major prognostic factors
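As an illustration of the balancing property in the last bullet, here is a minimal sketch assuming permuted-block randomisation stratified by hospital (the slide names no specific method): equal experimental/control counts inside every block keep treatment numbers balanced overall and within each hospital, while the shuffle keeps individual assignments unpredictable.

```python
# A minimal sketch, assuming permuted-block randomisation stratified by
# hospital: each block of 4 holds two 'experimental' and two 'control'
# slots in random order, so numbers stay balanced overall and within
# every hospital while assignments remain unpredictable.
import random

def permuted_block_assignments(n_patients, block_size=4, seed=None):
    """Treatment list with equal experimental/control counts per block."""
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_patients:
        block = (["experimental"] * (block_size // 2)
                 + ["control"] * (block_size // 2))
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n_patients]

# One independent list per hospital = stratification by hospital
for hospital in ["Hospital A", "Hospital B"]:
    print(hospital, permuted_block_assignments(8))
```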

RCT: telephone randomisation

Experiments & power matter

Designs for policy evaluations... which respect financial/political constraints

Evaluations-charade

"Public money spent on inferior (usually non-randomised) study designs that result in poor-quality evidence about how well policies actually work" → costly, inefficient by denying scientific method, & a serious loss in public accountability

Missed opportunities for experiments (including randomisation)

- Drug Treatment & Testing Orders (DTTOs)
- Cost-effectiveness matters!

SSRG Court, DTTO-eligible offenders: do DTTOs work?

- Offender 1: DTTO
- Offender 2: DTTO
- Offender 3: alternative
- Offender 4: DTTO
- Offender 5: alternative
- Offender 6: alternative

Database linkage to find out about major harms: offenders' deaths, re-incarcerations &...

SSRG Court, DTTO-eligible offenders: cost-effectiveness?

- Offender 7: DTTO
- Offender 8: alternative
- Offender 9: alternative
- Offender 10: DTTO
- Offender 11: DTTO
- Offender 12: alternative
- Offender 13: DTTO
- Offender 14: alternative

Breaches... drugs spend?

UK courts' DTTO-eligible offenders: ? guess

- Offender 7: DTTO [?]
- Offender 8: DTTO [?]
- Offender 9: DTTO [?]
- Offender 10: DTTO [?]
- Offender 11: DTTO [?]
- Offender 12: DTTO [?]
- Offender 13: DTTO [?]
- Offender 14: DTTO [?]

(before/after) Interviews versus... [?]

Evaluations-charade

- Failure to randomise
- Failure to find out about major harms
- Failure even to elicit the alternative sentence → funded guesswork on relative cost-effectiveness
- Volunteer-bias in follow-up interviews
- Inadequate study size re major outcomes...

Power (study size) matters!

Back-of-envelope sum for 80% power (in percentages or counts). If MPs/journalists don't know it, UK plc keeps hurting.

For 80% power, 5% significance: comparison of failure (re-conviction) rates

Randomise, per treatment group, 8 times the Step 1 answer, where (working in percentages):

Step 1 answer = [success rate × fail rate (new disposal) + success rate × fail rate (control)] / (success rate for new − success rate for control)²

DTTO example: TARGET 60% v. control 70% reconviction rate? (i.e. success rates of 40% for DTTOs v. 30% for control)

Step 1 answer = (40 × 60 + 30 × 70) / (40 − 30)² = 4500 / 100 = 45

Randomise 8 × 45 = 360 per 'CJ disposal' group.
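The back-of-envelope rule is simple enough to check in a few lines. A sketch (the function name is mine, not the slide's):

```python
# The slide's back-of-envelope rule for 80% power at 5% significance,
# checked in code; p_new and p_control are success rates in percent.
def n_per_group(p_new, p_control):
    """8 * [p1(100-p1) + p2(100-p2)] / (p1 - p2)^2, per treatment group."""
    step1 = (p_new * (100 - p_new)
             + p_control * (100 - p_control)) / (p_new - p_control) ** 2
    return 8 * step1

# DTTO example: 60% v. 70% reconviction, i.e. success rates 40% v. 30%
print(n_per_group(40, 30))  # -> 360.0 offenders per disposal group
```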

Five PQs for every CJ initiative

- PQ1: Minister, why no randomised controls?
- PQ2: Minister, why have judges not even been asked to document the offender's alternative sentence that this CJ initiative supplants {re cost-effectiveness}?
- PQ3: What statistical power does the Ministerial pilot have re well-reasoned targets? {or just kite-flying...}
- PQ4: Minister, cost-effectiveness is driven by longer-term health & CJ harms; how are these ascertained {→ database linkage}?
- PQ5: Minister, any ethical/consent issues?

“ If I had 50p for every prisoner that was liberated in error by the Scottish Prison Service and the police when they were doing the job I think I'd be quite a rich man”

Reliance: PIs, thresholds, penalties?

Performance Indicator table (columns: severity; expected monthly count; thresholds >A 0.02, >B): BLANKED OUT under a confidentiality clause... Appeal to Scotland's Freedom of Information Commissioner. PIs included: late delivery, SMB key compromise, prisoner disorder, serious assault, self-harm, overdue response.

Random Mandatory Drugs Testing of Prisoners: rMDT

- Home Affairs Select Committee Inquiry, 2000
- ONS contract from Home Office, 2001
- Final report, 2003
- With Minister... raised with National Statistician, Statistics Commission, 2004
- Publication?... Freedom of Information!
- Disputed PI: costly, potential danger, impact on parole, underestimates inside-use of heroin, human rights...

Restorative Justice: Youth Justice Board

- 46 Restorative Justice projects with about 7000 clients by October 2001; evaluation report for YJB, 2004: "to let 1000 flowers bloom..."
- Satisfaction rates by victim & offender typically high (both having been willing for RJ? eligibility for, & response rate to, interviews?)
- YJB targets: RJ used in 60% of disposals by 2003, & in 80% by 2004; 70% of victims taking part to be satisfied!

Specific Recommendations

Royal Statistical Society Working Party on Performance Monitoring in the Public Services

Royal Statistical Society: 11 Recommendations

1. PM procedures need a detailed protocol
2. Must have clearly specified objectives, achieve them with rigour; & input to PM from the institutions being monitored
3. Designed so that counter-productive behaviour is discouraged
4. Cost-effectiveness given wider consideration in design; & PM's benefits should outweigh the burden of collecting quality-assured data
5. Independent scrutiny: as a safeguard of public accountability, of methodological rigour, and of those being monitored

Royal Statistical Society: 11 Recommendations (continued)

6. Major sources of variation (due to case-mix, for example) must be recognised in design, target setting & analysis
7. Report measures of uncertainty: always
8. Research Councils: to investigate a range of aspects of PM, including strategies other than "name & shame"
9. Research into robust methods for evaluating new government policies, including the role of randomised trials... in particular, efficient designs for 'roll-out' of new initiatives

Royal Statistical Society: 11 Recommendations (continued)

10. Ethical considerations may be involved in all aspects of PM procedures, and must be properly addressed
11. A wide-ranging educational effort is required about the role and interpretation of PM data

Scotland's Airborne score-card: 11/11... wrong!

Statistician's role in PM: both

- Strenuously to safeguard those who are monitored from misconceived reactions to uncertainty
- To design an effective PM protocol so that data are properly collected, exceptional performance can be recognised & reasons further investigated → efficient, informative random sampling for inspections
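The slide does not spell out what "efficient, informative random sampling for inspections" looks like; one hedged possibility is to sample institutions with probability proportional to a prior risk score, keeping inspection unpredictable while concentrating effort where the PIs diverge. A sketch with invented scores:

```python
# One hedged possibility (not spelled out on the slide): sample
# institutions for inspection with probability proportional to a prior
# risk score, so inspection stays unpredictable but effort concentrates
# where the PIs diverge. Names and scores are invented.
import numpy as np

rng = np.random.default_rng(3)
institutions = np.array(["A", "B", "C", "D", "E", "F"])
risk_score = np.array([1.0, 1.0, 2.0, 3.0, 1.0, 5.0])  # e.g. PI divergence

probs = risk_score / risk_score.sum()
chosen = rng.choice(institutions, size=3, replace=False, p=probs)
print("Inspect this round:", chosen)
```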

(PM) Protocol

- Assumptions / rationale in choice of PI
- Objectives
- Calculations (power) & consultations + piloting
- Anticipated perverse consequences + avoidance
- Context/case-mix + data checks
- Analysis plan & dissemination rules
- Statistical performance of proposed PI monitoring + follow-up inspections
- PM's cost-effectiveness?
- Identify the PM designer & analyst to whom queries...