Experimental Design In Social Research - Do we know for example what works against crime?


1 Experimental Design In Social Research - Do we know for example what works against crime?
Seminar, Centre for Census and Survey Research, University of Manchester, 5 Dec
Paul Marchant, Leeds Metropolitan University
(Paul Baxter from the Department of Statistics, University of Leeds, is involved in developing some of this work.)

2 The Basic Point
- If non-RCTs are used, we need a sound understanding of the system being studied and a quantitative model to work out what is lost and what the effect is.
- The effects being sought may be small, so the impact of small systematic errors can be important.
- Use the best scientific methods, especially when policy implications are costly.
- Rigorous scientific evaluation of the implementation of policy is needed.

3 Science and Statistics
- Science is the belief in the unimportance of experts (Richard Feynman). (He wrote Cargo Cult Science on what distinguishes science from the rest.)
- Science is sceptical enquiry (note Gorard 2002).
- Statistics is the logical glue which attaches conclusions to data.

4 Randomised Controlled Trial v. something else
- In crime research there is a 5-point Scientific Methods Scale which orders trial designs (the RCT is at the top).
- While the ordering may be fine, there is no formal indication of what is lost by using a 4 rather than a 5.
- It would seem that a large potential exists to draw false inferences.

5 The Randomised Controlled Trial (A truly marvellous scientific invention)
Diagram: Population -> Take Sample -> Randomise to 2 groups (Old Treatment, New Treatment) -> Compare outcomes (averages), recognising that these are sample results and subject to sampling variation when applying back to the population.
Note, to avoid bias:
- Register trial / protocol.
- Allocation is best made tamper-proof (e.g. use concealment).
- Use multiple blinding of: patients, physicians, assessors, analysts ...

6 Counts of those cured and not cured under the two treatments

                                Cured   Not Cured
New Treatment                     a         b
Control (Standard treatment)      c         d

By comparing the ratios of numbers cured to not cured in the 2 arms of the trial, the Cross Product Ratio, CPR = (ad)/(cb), it is possible to tell if the new treatment is better.
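As a small illustration of the Cross Product Ratio defined on this slide, the sketch below computes CPR = (ad)/(cb) for a 2x2 table. The function name and the counts are hypothetical, chosen only for illustration; they do not come from any study discussed here.

```python
def cross_product_ratio(a: float, b: float, c: float, d: float) -> float:
    """Return CPR = (a*d)/(c*b) for the 2x2 table
           cured  not cured
    new      a        b
    control  c        d
    """
    if b == 0 or c == 0:
        raise ValueError("b and c must be non-zero to form the ratio")
    return (a * d) / (c * b)


# Hypothetical counts (an assumption, not trial data).
cpr = cross_product_ratio(a=30, b=70, c=20, d=80)
print(f"CPR = {cpr:.3f}")  # a value above 1 means higher odds of cure on the new treatment
```

A CPR of exactly 1 corresponds to the same cured to not-cured ratio in both arms.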

7 Confidence Intervals
- However, there is sampling variability, because we don't study everybody of interest; just our random sample.
- So we cannot have perfect knowledge of the effect of interest, but only an estimate of it within a confidence interval (CI).
- We need to know how to calculate the CI appropriately. This can be done under assumptions which seem reasonable for the case of a clinical RCT, and leads to a simple formula for the approximate CI (+/- 1.96 standard errors) of ln(CPR):
  $(\mathrm{s.e.}(\ln \mathrm{CPR}))^2 = \mathrm{Var}(\ln \mathrm{CPR}) = 1/a + 1/b + 1/c + 1/d$
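The approximate interval can be sketched as follows, using the usual large-sample variance of ln(CPR), 1/a + 1/b + 1/c + 1/d, which assumes independent events as in a clinical RCT. The function name and the counts are hypothetical.

```python
import math


def ln_cpr_confidence_interval(a, b, c, d, z=1.96):
    """Return (ln CPR, lower, upper) for the approximate CI of ln(CPR).

    Assumes independent events, so that
    Var(ln CPR) = 1/a + 1/b + 1/c + 1/d.
    """
    ln_cpr = math.log((a * d) / (c * b))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return ln_cpr, ln_cpr - z * se, ln_cpr + z * se


# Hypothetical counts (an assumption, not trial data).
estimate, lower, upper = ln_cpr_confidence_interval(30, 70, 20, 80)
print(f"ln CPR = {estimate:.3f}, 95% CI ({lower:.3f}, {upper:.3f})")
# Exponentiate to return to the CPR scale.
print(f"CPR = {math.exp(estimate):.3f}, 95% CI ({math.exp(lower):.3f}, {math.exp(upper):.3f})")
```

If the interval for ln(CPR) contains 0 (equivalently, the interval for CPR contains 1), the data are compatible with no difference between the arms under these assumptions.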

8 Crime counts before and after in two areas, one of which gets a Crime Reduction Intervention (CRI) (e.g. 4 on the Methods Scale)

                                                                     Before   After
Treatment Area (intervention is introduced between the 2 periods)      a        b
Comparison Area (nothing is changed)                                    c        d

A similar table results. But this is not the same as the RCT set-up, as:
1. It is not randomised, so no statistical equivalence exists at the start.
2. The unit is area, rather than crime event.

9 Lighting and crime
There seem to be many theoretical suggestions why lighting might increase or decrease crime. The meta-analysis, HORS251, by Farrington and Welsh suggests strongly that lighting beats crime. However, my contention is that this study contains flaws and so we cannot be sure of the effect of lighting on crime. (Note also HORS252 on CCTV.)

10 Forest Plot as HORS251 Meta-analysis reconstructed
[Figure: forest plot]

11 But this can't be right.
- The assumptions for calculating the CIs cannot be correct in this case. The unit is area, not crime. The events are not statistically independent within areas.
- Too much variation (heterogeneity), as seen by the Q-statistic, exists between individual study results compared with the uncertainty indicated by the confidence intervals (if the lighting has the same effect on crime in every study).
- Note there is great variation in crime counts between periods in the comparison areas, where nothing is changed (TAU). (The variance is an order of magnitude greater than the mean.) This shows the heterogeneity is inherent to the natural variation of crime. Crime is committed by criminals, and it is their changing activity which can cause great variation.
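The heterogeneity check mentioned above can be sketched with the standard form of Cochran's Q: inverse-variance weights, a weighted mean effect, and a weighted sum of squared deviations. The slides do not show how Q was computed for HORS251, so this is only the textbook form, and the effects and standard errors below are hypothetical.

```python
def cochran_q(effects, standard_errors):
    """Return (Q, degrees of freedom) for a set of study effects.

    effects: per-study estimates, e.g. ln(CPR).
    standard_errors: their standard errors.
    If every study shares one true effect and the standard errors are right,
    Q is approximately chi-squared with len(effects) - 1 degrees of freedom.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, len(effects) - 1


# Hypothetical ln(CPR) values and standard errors (assumptions, not HORS251 data).
q, df = cochran_q([0.10, 0.55, -0.20, 0.80], [0.10, 0.12, 0.15, 0.20])
print(f"Q = {q:.1f} on {df} degrees of freedom")
```

A Q far above its degrees of freedom indicates that the studies disagree by more than their stated standard errors allow, which is the situation described on this slide.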

12 Pointing out the problem
- Marchant (2004): a 7-page article in the British Journal of Criminology drawing attention to the problem. The formula used for the CIs must be inappropriate (the article also mentions other shortcomings).
- The authors of HORS251 had a 20-page response on the next page, justifying the claim that lighting reduces crime.
- But I remain unconvinced by the claim.

13 Examine Overdispersion in Comparison Areas
Table: study name against $k_{obs} = s^2 / \bar{x}$, calculated from the before and after counts in the comparison areas (the values are not preserved in this transcript). Studies listed: Atlanta, Milwaukee, Portland, Kansas City, Harrisburg, New Orleans, Fort Worth, Indianapolis, Dover, Bristol, Birmingham, Dudley, Stoke-on-Trent.
- The $k_{obs}$ are extremely variable and right skewed. The arithmetic mean is 15 for these comparison areas.
- Correlation between intervention and comparison areas tends to reduce the effect of overdispersion.
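A minimal sketch of how a variance-to-mean ratio of this kind could be obtained from the two counts in one comparison area. It assumes the sample variance (divisor n - 1) of the before and after counts; the slides do not state the exact estimator, and the counts are hypothetical.

```python
import statistics


def k_obs(before: int, after: int) -> float:
    """Variance-to-mean ratio of two counts from the same area.

    For a Poisson process with a constant rate this is about 1 on average;
    values well above 1 indicate overdispersion.
    """
    counts = [before, after]
    return statistics.variance(counts) / statistics.mean(counts)


# Hypothetical comparison-area counts (an assumption, not study data).
print(f"k_obs = {k_obs(100, 140):.2f}")
```

With only two counts the ratio reduces to (before - after)^2 / (before + after), so a single area gives a very noisy estimate, consistent with the remark that the values are extremely variable.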

14 Fixing the Heterogeneity Problem
- A way of making the problem go away is simply to increase the uncertainty, i.e. stretch the CIs (a quasi-Poisson/Binomial model).
- Here the CIs are stretched by a factor of 2.1 (equivalent to reducing the events counted in every setting by a factor of 4.4). This adjustment has been made by the authors (Farrington and Welsh 2006).
- Problem solved ... or is it? Is such a model plausible? It assumes every study should have its CI stretched by the same factor. This cannot be guaranteed.
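The equivalence between stretching the CI by 2.1 and dividing every count by about 4.4 follows because the standard error of ln(CPR) scales as one over the square root of the counts, and 2.1 squared is 4.41. The sketch below checks this numerically; the counts are hypothetical.

```python
import math


def se_ln_cpr(a, b, c, d):
    """Standard error of ln(CPR) under the independent-events formula."""
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)


stretch = 2.1               # CI stretch factor quoted on the slide
deflation = stretch ** 2    # 4.41, the "factor of 4.4" reduction in counts

counts = (200, 150, 180, 190)  # hypothetical counts (an assumption)
se_plain = se_ln_cpr(*counts)
se_deflated = se_ln_cpr(*(n / deflation for n in counts))

# Dividing every count by 4.41 widens the interval by exactly 2.1.
ratio = se_deflated / se_plain
print(f"se ratio = {ratio:.3f}")
```

The same factor applies whatever the counts are, which is exactly the assumption questioned in the last bullet: one stretch factor for every study.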

15 Some doubts
- Only relatively few studies.
- Need to understand the sensitivity of the result to assumptions.
- How does the variation of crime depend on the level of crime?
- Publication bias, which is associated with positive findings, bedevils systematic reviews.
- The 2 studies which are given as statistically significant under the quasi-model also have time series of counts, and these do not appear to show convincing evidence for a lighting benefit.

16 The Bristol Study (Shaftoe 1994)
Shaftoe said there was no discernible lighting benefit, but HORS251 said z = 6.6.
Note: had the data for the year immediately prior to the introduction of the relighting, i.e. periods 2 and 3, been used, rather than unnaturally using periods 1 and 2, which leaves a gap of ½ year, the effect found would have been half of that claimed. (This shows large variability.)

17 Time Variation in Crime
- It appears that little is known, in general, about how crime varies on various scales.
- Much more needs to be known about the occurrence of crime events to know how to analyse them properly and to be able to find effects.
- Need access to suitable data sets to examine this issue. This is ongoing research in which my colleagues and I are engaged. (Plea for data.)
- A general point: one needs to have knowledge about the system in order to understand if an intervention changes things (and in order to design studies).

18 Household studies
- In a couple of instances, instead of just counting recorded crimes a, b, c, d in the 4 cells (before, after, intervention, comparison), a household survey of crimes recalled for the previous period was carried out, before and after, within the 2 areas (intervention, comparison).
- One problem (unrecognised by the authors Painter and Farrington) is that spatial correlation between occurrences of crime needs to be considered (shared experience of crime on the local scale). This gives rise to a Design Effect, familiar in clustered designs, which reduces the precision of the estimate of the effect.
- There are other problems, e.g. differential change of composition between periods.

19 The Dudley Study: some problems
- Different crimes at the start.
- Overdispersion is clear in the number of crimes per household.
- Differential loss to follow-up.
- Old people are much less prone to experience crime, and their number is much reduced due to loss to follow-up in the comparison area. So the relative composition changes during the experiment.
- Results are very sensitive to the loss or addition of just one person.
- But importantly there is correlation between households, giving extra overdispersion (variability).
- Essentially it's a non-randomised two-cluster trial.

20 Spatial Correlation (1)
An expression can be derived for the variance of ln(CPR) for a household survey, before and after, intervention-comparison study, i.e. of the Dudley type. This includes, in addition to the variability between households, both:
(1) correlations within households between times;
(2) correlations between households at any one time.

21 Spatial Correlation (2)
What you get, basically, is the expression you would get if you ignored the correlation between households at one time (i.e. ignored the spatial correlation), multiplied by the Design Effect, Deff (just as in clustered surveys / trials):
$\mathrm{Deff} = 1 + (n - 1)\rho_s$
where $\rho_s$ is the spatial correlation and $n$ is the number in a cluster, i.e. area.

22 Spatial Correlation (3)
- The spatial correlation was not taken into account in the Dudley and Stoke analyses, thus ignoring the fact that neighbours share risk.
- An expression for the variance of the logarithm of the Cross Product Ratio (CPR) incorporating spatial correlation is: [equation not preserved in this transcript]

23 Spatial Correlation (4)
- λ (lambda) can be estimated from the variance of the number of crimes experienced by households divided by the mean.
- r is the correlation of crimes before to after for the same households. This must be reconstructed because linking never happened. It is said to be 0.3.
- But we do not know and cannot estimate Deff (Deff > 1) for this sort of study. Note, though, that n = 400 and $\rho_s$ = 0.01 give Deff = 5.
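The design effect quoted in these slides, Deff = 1 + (n - 1)ρ_s, is easy to evaluate. The sketch below reproduces the n = 400, ρ_s = 0.01 example and shows the resulting widening of a standard error; the function name is hypothetical.

```python
import math


def design_effect(n: int, rho_s: float) -> float:
    """Deff = 1 + (n - 1) * rho_s for clusters of size n with spatial correlation rho_s."""
    return 1 + (n - 1) * rho_s


deff = design_effect(n=400, rho_s=0.01)
print(f"Deff = {deff:.2f}")  # 4.99, i.e. about 5 as stated on the slide

# The variance is multiplied by Deff, so standard errors and CI widths
# are multiplied by its square root.
print(f"CI widening factor = {math.sqrt(deff):.2f}")
```

Even a spatial correlation as small as 0.01 therefore more than doubles the width of the interval when there are 400 households in an area.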

24 To summarize the issue of overdispersion from the Dudley study
On the matter of overdispersion alone, the value obtained from the between-household variation cannot be correct, as the effect of spatial correlation, i.e. the shared experience of nearby households, needs to be taken into account: crime experiences will be linked. This will increase overdispersion. We would need more than 2 clusters to be able to estimate Deff.

25 Lack of Equivalence between Areas
Invariably it is the most crime-ridden area that gets the lighting, whereas the relatively crime-free control area is not re-lit. So there is a lack of equivalence at the start. One effect of this is to allow regression towards the mean to operate. The name "Control Area" is a misnomer. "Comparison Area" is a better name.

26 Regression towards the mean
[Figure: cloud of data points, with X = the before measurement and Y = the after measurement, showing the line of equality and the line of the mean of Y for a given X.]

27 The response given to the lack of equivalence between the 2 areas (RTM)
- Farrington and Welsh (2006) claim that RTM is not a problem because the effect in counted crimes in 250 Police Basic Command Units (BCUs) going from 2002/3 to 2003/4 showed only a small effect (a few %). This is perhaps unsurprising, as the areas, and hence the numbers of crimes counted, are an order of magnitude larger than in HORS251, so the year-to-year correlation may be expected to be higher than for the small lighting study areas.
- Note Wrigley (1995): "This tendency for correlation coefficients to increase in magnitude as the size of the areal unit involved increases has been known since the work of Gehlke and Biehl (1934)."

28 Log crime rates in successive periods: data from Tilley et al.

29 Estimating the effect of RTM
On the basis of log-normal crime rates it can be shown that, if the intervention has no effect, the expected value of ln CPR is
$E[\ln \mathrm{CPR}] = (1 - \rho \sigma_y / \sigma_x) \ln(x_1 / x_2)$
where $x_1 / x_2$ is the crime rate ratio, $\sigma_x$ and $\sigma_y$ are the standard deviations on the log scale, and $\rho$ is the correlation on the log scale. The variance is
$\mathrm{Var}(\ln \mathrm{CPR}) = 2 \sigma_y^2 (1 - \rho^2)$

30 Estimation of the effect of RTM
- The simple model of crime rates suggests that the high year-to-year correlation, typically 0.95 for the BCU crime rate data, would indeed give an effect of a few %.
- However, the smaller areas used in CRI evaluation might be expected to have lower correlation.
- Burglary data from a study of 124 areas have a correlation of about 0.8, giving, all else equal, an expected effect 4 times larger.
- Note: in general we don't know the correlation nor the rates being compared for the lighting studies. However, we do know that, whereas the household crime rate ratio at the start was 1.40 for Dudley, that for Stoke was 2.51, giving a much larger expected RTM effect.
- Without better knowledge we can't be definite about the impact of RTM, but the indications are that the bias could be important and the uncertainty large.
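The "4 times larger" statement can be checked with the log-normal model from the previous slide, in which the expected ln CPR under no intervention effect is (1 - ρσ_y/σ_x) ln(x_1/x_2). The sketch assumes equal standard deviations in the two periods (σ_x = σ_y), which is an assumption made here for illustration; the function names are hypothetical, and the starting ratio of 1.40 is the Dudley figure quoted above.

```python
import math


def expected_ln_cpr(rho, sigma_x, sigma_y, rate_ratio):
    """Expected ln CPR from regression towards the mean alone (no intervention effect)."""
    return (1 - rho * sigma_y / sigma_x) * math.log(rate_ratio)


def var_ln_cpr(rho, sigma_y):
    """Variance of ln CPR under the same log-normal model."""
    return 2 * sigma_y ** 2 * (1 - rho ** 2)


# Assumption: sigma_x == sigma_y, so only rho and the starting rate ratio matter.
rtm_bcu = expected_ln_cpr(rho=0.95, sigma_x=1.0, sigma_y=1.0, rate_ratio=1.40)
rtm_small = expected_ln_cpr(rho=0.80, sigma_x=1.0, sigma_y=1.0, rate_ratio=1.40)

print(f"rho = 0.95: apparent effect {100 * (math.exp(rtm_bcu) - 1):.1f}%")
print(f"rho = 0.80: apparent effect {100 * (math.exp(rtm_small) - 1):.1f}%")
print(f"ratio of expected ln CPR: {rtm_small / rtm_bcu:.2f}")
```

The factor of 4 is simply (1 - 0.8)/(1 - 0.95) on the log scale. A larger starting rate ratio, such as the 2.51 quoted for Stoke, scales the expected effect further through ln(x_1/x_2).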

31 Expected natural log of CPR and its CI for a set of burglary data.

32 Insights from a set of police beat data
Crime rate ratios (one year to the next) for crime counts for each police beat in a file of data have been calculated. It is possible to take the ratio of this with that from the next beat in the file, thereby constructing a CPR for the pair of beats. This shows that effects of the size claimed for lighting, 20%, are common in such comparisons and are considerably larger than expected on the basis of $\mathrm{s.e.} = \sqrt{1/a + 1/b + 1/c + 1/d}$.
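The pairing procedure described on this slide might be sketched as follows. The slide does not say whether pairs overlap, so pairing each beat with the next one in the list is an assumption made here, and the counts are hypothetical rather than taken from the police beat file.

```python
import math


def adjacent_beat_ln_cprs(beats):
    """Pair each beat with the next one and return (ln CPR, nominal s.e.) per pair.

    beats: list of (count_year_1, count_year_2) tuples, one per beat.
    For a pair, a and b are the first beat's counts and c and d the second
    beat's, so CPR = (a*d)/(c*b). No intervention happens in either beat.
    """
    results = []
    for (a, b), (c, d) in zip(beats, beats[1:]):
        ln_cpr = math.log((a * d) / (c * b))
        nominal_se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
        results.append((ln_cpr, nominal_se))
    return results


# Hypothetical beat counts (an assumption, not real police data).
example_beats = [(120, 100), (90, 95), (200, 210), (60, 45)]
for ln_cpr, se in adjacent_beat_ln_cprs(example_beats):
    print(f"CPR = {math.exp(ln_cpr):.2f}, ln CPR / nominal s.e. = {ln_cpr / se:.2f}")
```

Since nothing is changed in either beat of a pair, the spread of these ln CPR values shows how large a purely natural "effect" can be; comparing that spread with the nominal standard error shows how far the simple formula understates the variability.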

33 Strengths and weaknesses of police beat data
- The strength is that it does not rely on a statistical model; overdispersion and RTM are automatically taken into account.
- The weakness is that this data set may not represent the situation of the lighting studies. But that will be the case for any other sets. Yet we do not have sufficient relevant information about the individual studies used to make the HORS251 claim.

34 Opportunity of checking the effect of lighting
- Many brighter street lighting programmes are being implemented under government PFI schemes.
- Admittedly this is non-randomised implementation, but it presents an opportunity to check the claim.
- The checking needs to be done to high scientific standards.
- Perhaps it is possible to link changes of night-time brightness measured by satellites to changes in crime.

35 General potential consequences of weak methods
- Because there is a tendency to report positive effects (dissemination bias), probably even more so with less rigorous work, one is likely to end up with an even more distorted research record.
- This might lead to dubious justification of a bad policy.
- While it might be possible to estimate the effect of the excess variability or the effect of RTM, it would seem problematic to be confident about adequately adjusting for them.
- Scientifically stronger methods could avoid many problems and may be very cheap relative to policy costs.

36 My interest
"…Paul Marchant, statistician at Leeds Metropolitan University who argues that statistics used in the Home Office Study 251 could equally be used to show that street lighting actually increases levels of crime. This is an argument which the APPLG, alongside the ILE, would hope to show as utterly absurd. Of course it is worth noting that Paul Marchant is also an astronomer as well as being a statistician, and that this may lead to some bias in his interpretation of the statistics he refers to."
(p. 56 of the March/April 2004 issue of the Lighting Journal, the magazine of the Institution of Lighting Engineers.)
APPLG = The All-Party Parliamentary Lighting Group; ILE = The Institution of Lighting Engineers.
My view is that there is large uncertainty and one cannot be too sure what lighting does to crime. (Perhaps on average lighting increases crime.)

37 Some conclusions
- A Methods Scale seems to suggest that designs weaker than RCTs might suffice, without indicating what is lost.
- I have indicated some of the problems which result.
- I remain to be convinced that the deficiencies can be adequately overcome through estimating quantitatively the consequences of using a weaker design (e.g. as in the lighting study and looking at other crime data).
- Weaker designs can be useful in preliminary research, but it is doubtful they are adequate when there are expensive consequences.
- RCTs can be problematic enough! (We need registered trials, published protocols, declaration of interest, concealment of allocation, blinding, etc.)
- Evaluations of policies once implemented need to be done to a high scientific standard.

38 References
Farrington D.P. and Welsh B.C. (2002) The Effects of Improved Street Lighting on Crime: A Systematic Review, Home Office Research Study 251.
Farrington D.P. and Welsh B.C. (2004) Measuring the Effects of Improved Street Lighting on Crime: A Reply to Dr. Marchant, The British Journal of Criminology.
Farrington D.P. and Welsh B.C. (2006) How Important is Regression to the Mean in Area-Based Crime Prevention Research?, Crime Prevention and Community Safety 8, 50.
Feynman R.P. (1985) Cargo Cult Science, in Surely You're Joking, Mr Feynman, Norton, New York.
Gorard S. (2002) Fostering Scepticism: The Importance of Warranting Claims, Evaluation and Research in Education 16, 3, p136.
Marchant P.R. (2004) A Demonstration that the Claim that Brighter Lighting Reduces Crime is Unfounded, The British Journal of Criminology.

39 References continued
Marchant P.R. (2005) What Works? A Critical Note on the Evaluation of Crime Reduction Initiatives, Crime Prevention and Community Safety.
Painter K. and Farrington D.P. (1997) The Crime Reducing Effect of Improved Street Lighting: The Dudley Project, in R.V. Clarke (ed.) Situational Crime Prevention: Successful Case Studies, Harrow and Heston, Guilderland NY.
Shaftoe H. (1994) Easton/Ashley, Bristol: Lighting Improvements, in S. Osborn (ed.) Housing Safe Communities: An Evaluation of Recent Initiatives, 72-77, Safe Neighbourhoods Unit, London.
Tilley N., Pease K., Hough M. and Brown R. (1999) Burglary Prevention: Early Lessons from the Crime Reduction Programme, Crime Reduction Research Series Paper 1, Home Office, London.
Wrigley N. (1995) Revisiting the Modifiable Areal Unit Problem and Ecological Fallacy, pp. 49-71 in Gould P.R., Hoare A.G. and Cliff A.D. (eds) Diffusing Geography: Essays for Peter Haggett.

