Treatment of correlated systematic errors PDF4LHC August 2009 A M Cooper-Sarkar Systematic differences combining ZEUS and H1 data  In a QCD fit  In a.

Slides:



Advertisements
Similar presentations
Design of Experiments Lecture I
Advertisements

Lara De Nardo BNL October 8, 2007 QCD fits to the g 1 world data and uncertainties THE HERMES EXPERIENCE QCD fits to the g 1 world data and uncertainties.
High Energy neutrino cross-sections HERA-LHC working week Oct 2007 A M Cooper-Sarkar, Oxford Updated predictions of high energy ν and ν CC cross-sections.
Low-x and PDF studies at LHC Sept 2008 A M Cooper-Sarkar, Oxford At the LHC high precision (SM and BSM) cross section predictions require precision Parton.
CHAPTER 21 Inferential Statistical Analysis. Understanding probability The idea of probability is central to inferential statistics. It means the chance.
FTP Biostatistics II Model parameter estimations: Confronting models with measurements.
1 META PDFs as a framework for PDF4LHC combinations Jun Gao, Joey Huston, Pavel Nadolsky (presenter) arXiv: , Parton.
1 Practical Statistics for Physicists LBL January 2008 Louis Lyons Oxford
G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 10 1Probability, Bayes’ theorem, random variables, pdfs 2Functions.
Report on fitting FINAL new data: e- CC (175pb -1 : P=0.30, 71pb -1, P=-0.27, 104pb -1 )(DESY ) e- NC (169pb -1, P=+0.29, P=-0.27)(DESY ) Now.
HERAPDF0.2 and predictions for W/Z production at LHC PDF4LHC A M Cooper-Sarkar 29 May 2009 Motivation Some of the debates about the best way of estimating.
Sept 2003PHYSTAT41 Uncertainty Analysis (I) Methods.
Methods and Measurement in Psychology. Statistics THE DESCRIPTION, ORGANIZATION AND INTERPRATATION OF DATA.
May 2005CTEQ Summer School1 Uncertainties of Parton Distribution Functions Dan Stump Michigan State University.
H1/ZEUS fitters meeting Jan 15 th 2010 Am Cooper-Sarkar Mostly about fitting the combined F2c data New work on an FFN fit PLUS Comparing HERAPDF to Tevatron.
HERAPDF0.2 and predictions for W/Z production at LHC PDF4LHC A M Cooper-Sarkar 29 May 2009 Motivation Some of the debates about the best way of estimating.
7. Least squares 7.1 Method of least squares K. Desch – Statistical methods of data analysis SS10 Another important method to estimate parameters Connection.
Detect Unknown Systematic Effect: Diagnose bad fit to multiple data sets Advanced Statistical Techniques in Particle Physics Grey College, Durham 18 -
Sept 2003PHYSTAT1 Uncertainties of Parton Distribution Functions Daniel Stump Michigan State University & CTEQ.
Sept 2003PHYSTAT11 … of short-distance processes using perturbative QCD (NLO) The challenge of Global Analysis is to construct a set of PDF’s with good.
Sept 2003PHYSTAT41 Uncertainty Analysis (I) Methods.
G. Cowan Lectures on Statistical Data Analysis Lecture 10 page 1 Statistical Data Analysis: Lecture 10 1Probability, Bayes’ theorem 2Random variables and.
Preliminary results in collaboration with Pavel Nadolsky Les Houches /6/5.
1 Combination of PDF uncertainties for LHC observables Pavel Nadolsky (SMU) in collaboration with Jun Gao (Argonne), Joey Huston (MSU) Based on discussions.
Copyright © Cengage Learning. All rights reserved. 13 Linear Correlation and Regression Analysis.
G. Cowan Lectures on Statistical Data Analysis Lecture 3 page 1 Lecture 3 1 Probability (90 min.) Definition, Bayes’ theorem, probability densities and.
A Neural Network MonteCarlo approach to nucleon Form Factors parametrization Paris, ° CLAS12 Europen Workshop In collaboration with: A. Bacchetta.
Update on fits for 25/3/08 AM Cooper-Sarkar Central fit: choice of parametrization Central fit: choice of error treatment Quality of fit to data PDFs plus.
Maximum Likelihood Estimator of Proportion Let {s 1,s 2,…,s n } be a set of independent outcomes from a Bernoulli experiment with unknown probability.
VI. Evaluate Model Fit Basic questions that modelers must address are: How well does the model fit the data? Do changes to a model, such as reparameterization,
Lecture 4 Basic Statistics Dr. A.K.M. Shafiqul Islam School of Bioprocess Engineering University Malaysia Perlis
Intro: “BASIC” STATS CPSY 501 Advanced stats requires successful completion of a first course in psych stats (a grade of C+ or above) as a prerequisite.
The New HERAPDF Nov HERA SFgroup AM Cooper-Sarkar Appears compatible with HERAPDF0.1 when doing fits at Q20=4.0 GeV2 But humpy gluon is Chisq favoured.
ZEUS PDF analysis 2004 A.M Cooper-Sarkar, Oxford Low-x 2004 New Analysis of ZEUS data alone using inclusive cross-sections from all of ZEUS data from HERA-I.
Update of ZEUS PDF analysis A.M Cooper-Sarkar, Oxford DIS2004 New Analysis of ZEUS data alone using inclusive cross-sections from all of HERA-I data –
May 14 th 2008 averaging meeting A M Cooper-Sarkar Look at the HERA-I PDFs in new ways Flavour break-up High-x Compare to ZEUS data alone/ H1 data alone.
Predictions for high energy neutrino cross-sections from ZEUS-S Global fit analysis S Chekanov et al, Phys Rev D67, (2002) The ZEUS PDFs are sets.
PDF fitting to ATLAS jet data- a first look A M Cooper-Sarkar, C Doglioni, E Feng, S Glazov, V Radescu, A Sapronov, P Starovoitov, S Whitehead ATLAS jet.
PDF fits with free electroweak parameters Overview of what has happened since March’06 Collaboration meeting Emphasis on the NC couplings au,vu,ad,vd and.
Ronan McNulty EWWG A general methodology for updating PDF sets with LHC data Francesco de Lorenzi*, Ronan McNulty (University College Dublin)
Flavour break-up July7th 2008 Our aim was modest: 1)To alter fc=0.15 to fc=0.09 following investigations of the charm fraction 2)To take into account the.
NLO QCD fits How far can we get without jet data/HERA-II data? A. M. Cooper-Sarkar March-04 Collaboration Meeting ZEUSNOTE Extended ZEUS-S fits.
More on NLOQCD fits ZEUS Collab Meeting March 2003 Eigenvector PDF sets- ZEUS-S 2002 PDFS accessible on HEPDATA High x valence distributions from ZEUS-Only.
Discussion of calculation of LHC cross sections and PDF/  s uncertainties J. Huston Michigan State University 1.
High Q 2 Structure Functions and Parton Distributions Ringberg Workshop 2003 : New Trends in HERA physics Benjamin Portheault LAL Orsay On behalf of the.
Further investigations on the fits to new data Jan 12 th 2009 A M Cooper-Sarkar Considering ONLY fits with Q 2 0 =1.9 or 2.0 –mostly comparing RTVFN to.
In the context of the HERA-LHC workshop the idea of combining the H1 and ZEUS data arose. Not just putting both data sets into a common PDF fit but actually.
Jets and α S in DIS Maxime GOUZEVITCH Laboratoire Leprince-Ringuet Ecole Polytechnique – CNRS/IN2P3, France On behalf of the collaboration On behalf of.
11 QCD analysis with determination of α S (M Z ) based on HERA inclusive and jet data: HERAPDF1.6 A M Cooper-Sarkar Low-x meeting June 3 rd 2011 What inclusive.
G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate.
June 1st 2008 averaging meeting A M Cooper-Sarkar Model dependence fs Model dependence fc Model dependence need to be consistent when varying Q2_0 Model.
H1 QCD analysis of inclusive cross section data DIS 2004, Štrbské Pleso, Slovakia, April 2004 Benjamin Portheault LAL Orsay On behalf of the H1 Collaboration.
CT14 PDF update J. Huston* PDF4LHC meeting April 13, 2015 *for CTEQ-TEA group: S. Dulat, J. Gao, M. Guzzi, T.-J. Hou, J. Pumplin, C. Schmidt, D. Stump,
G. Cowan Lectures on Statistical Data Analysis Lecture 10 page 1 Statistical Data Analysis: Lecture 10 1Probability, Bayes’ theorem 2Random variables and.
H1 and ZEUS Combined PDF Fit DIS08 A M Cooper Sarkar on behalf of ZEUS and H1 HERA Structure Function Working Group NLO DGLAP PDF fit to the combined HERA.
D Parton Distribution Functions, Part 2. D CT10-NNLO Parton Distribution Functions.
CHAPTER- 3.1 ERROR ANALYSIS.  Now we shall further consider  how to estimate uncertainties in our measurements,  the sources of the uncertainties,
Costas Foudas, Imperial College, Jet Production at High Transverse Energies at HERA Underline: Costas Foudas Imperial College
MSTW update James Stirling (with Alan Martin, Robert Thorne, Graeme Watt)
Methods of multivariate analysis Ing. Jozef Palkovič, PhD.
1 Proton Structure Functions and HERA QCD Fit HERA+Experiments F 2 Charged Current+xF 3 HERA QCD Fit for the H1 and ZEUS Collaborations Andrew Mehta (Liverpool.
1 A M Cooper-Sarkar University of Oxford ICHEP 2014, Valencia.
News from HERAPDF A M Cooper-Sarkar PDF4LHC CERN March
Statistical Issues in PDF fitting
HERA I - Preliminary H1 and ZEUS QCD Fit
HESSIAN vs OFFSET method
Geology Geomath Chapter 7 - Statistics tom.h.wilson
Where did we stop? The Bayes decision rule guarantees an optimal classification… … But it requires the knowledge of P(ci|x) (or p(x|ci) and P(ci)) We.
May 14th 2008 averaging meeting A M Cooper-Sarkar
ATLAS 2.76 TeV inclusive jet measurement and its PDF impact A M Cooper-Sarkar PDF4LHC Durham Sep 26th 2012 In 2011, 0.20 pb-1 of data were taken at √s.
Presentation transcript:

Treatment of correlated systematic errors PDF4LHC August 2009 A M Cooper-Sarkar Systematic differences combining ZEUS and H1 data  In a QCD fit  In a ‘theory free’ fit

Treatment of correlated systematic errors χ2 = Σ i [ F i QCD (p) – F i MEAS ] 2 (  i STAT ) 2 +( Δ i SYS ) 2 Quadratic combination: errors on the fit parameters, p, evaluated from Δχ2 = 1, THIS IS NOT GOOD ENOUGH if experimental systematic errors are correlated between data points- χ 2 = Σ i Σ j [ F i QCD (p) – F i MEAS ] V ij - 1 [ F j QCD (p) – F j MEAS ] V ij = δ ij (б i STAT ) 2 + Σ λ Δ iλ SYS Δ jλ SYS Where Δ iλ SYS is the correlated error on point i due to systematic error source λ It can be established that this is equivalent to χ 2 = Σ i [ F i QCD (p) –Σ λ s  i SYS – F i MEAS ] 2 + Σs 2 (  i STAT ) 2 Where s λ are systematic uncertainty fit parameters of zero mean and unit variance This has modified the fit prediction by each source of systematic uncertainty CTEQ, ZEUS, H1, MRST/MSTW have all adopted this form of χ2 (MSTW still use some errors as quadratic) – but use it differently in the OFFSET and HESSIAN methods

How do experimentalists often proceed: OFFSET method Perform fit without correlated errors (s λ = 0) for central fit, and propagate statistical errors to the PDFs = T Σ j Σ k ∂ q V jk ∂ q ∂ p j ∂ p k Where T is the χ2 tolerance, T = 1. 1.Shift measurement to upper limit of one of its systematic uncertainties (s λ = +1) 2.Redo fit, record differences of parameters from those of step 1 3.Go back to 2, shift measurement to lower limit (s λ = -1) 4.Go back to 2, repeat 2-4 for next source of systematic uncertainty 5.Add all deviations from central fit in quadrature (positive and negative deviations added in quadrature separately) 6.This method does not assume that correlated systematic uncertainties are Gaussian distributed Fortunately, there are smart ways to do this – see extras (Pascaud and Zomer LAL-95-05, Botje hep-ph )

HESSIAN method (covariance method) Allow s λ parameters to vary for the central fit. The total covariance matrix is then the inverse of a single Hessian matrix expressing the variation of χ2 wrt both theoretical and systematic uncertainty parameters. If we believe the theory why not let it calibrate the detector(s)? Effectively the theoretical prediction is not fitted to the central values of published experimental data, but allows these data points to move collectively according to their correlated systematic uncertainties The fit determines the optimal settings for correlated systematic shifts such that the most consistent fit to all data sets is obtained. In a global fit the systematic uncertainties of one experiment will correlate to those of another through the fit The resulting estimate of PDF errors is much smaller than for the Offset method for Δχ2 = 1 CTEQ have used this method with Δχ2 ~ 100 for 90%CL limits MRST have used Δχ2 ~ 50 H1, Alekhin have used Δχ2= 1 Tolerances are now dynamic according to the eigenvector

Quadratic OFFSET HESSIAN For the HERAPDF with its small correlated systematic errors the treatment of these errors in the PDF fit gives only small differences in PDF central values and PDF uncertainty estimates It is not my purpose to justify these methods Just to compare them. First for the HERAPDF to the combined HERA data for which correlated systematic errors are SMALL relative to the statistical and uncorrelated errors

NOTE the fit formalism, parametrization etc is the same as for the HERAPDF fit. The OFFSET method gives largest errors and covers the difference between the valence PDFs of the other two Secondly for the separate ZEUS and H1 data for which correlated systematic errors are LARGE relative to the statistical and uncorrelated errors for Q2 < 150 Quadratic OFFSET HESSIAN When fitting to data with large correlated systematic errors the treatment of these errors in the PDF fit gives significant differences in PDF central values and PDF uncertainty estimates

Applying an increased tolerance to the HESSIAN fit results in similar sized errors to those of the OFFSET method for Δχ2 ~ 10 OFFSET For the separate ZEUS and H1 data for which correlated systematic errors are LARGE relative to the statistical and uncorrelated errors HESSIAN with increased χ2 tolerance

HESSIAN PDF fit to HERA combined data HESSIAN PDF fit to H1 and ZEUS separately A comparable experimental error results BUT the shapes of the gluon and valence are significantly different. For the PDF fit to the HERA combined data the systematic error parameters of the separate data sets were already shifted in the fit which combined the data. The PDF fit does not make further significant shifts. For the separate H1 and ZEUS data the systematic error parameters of the separate data sets are shifted in the PDF fit itself. hessian Now compare Hessian fitting to the HERA combined data to Hessian fitting to separate H1 and ZEUS data sets.

Fitted systematics Combination fit NLO QCD PDF fit 20 zd1_e_eff zd2_e_theta_a zd3_e_theta_b zd4_e_escale zd5_had zd6_had zd7_had h1670e h1670e h1670e h1670e h1670e h1670e h195-00e h195-00e h195-00e h195-00e h195-00e h195-00e h195-00e h195-00e Agreement on Systematic shift BAD GOOD Middling When you put the separate data sets into a PDF fit it floats the systematic parameters of the data sets to different values than the ‘theory free’ combination fit. Representative examples below So whereas you might think that the QCD PDF fits are already doing the job of combining the HERA data- they are not getting the same answer as the HERA combination fit.

Model dependence is also important The statistical criterion for parameter error estimation within a particular hypothesis is Δχ2 = T 2 = 1. But for judging the acceptability of an hypothesis the criterion is that χ2 lie in the range N ±√2N, where N is the number of degrees of freedom There are many choices, such as the form of the parametrization at Q 2 0, the value of Q 0 2 itself, the flavour structure of the sea, etc., which might be considered as superficial changes of hypothesis, but the χ2 change for these different hypotheses often exceeds Δχ2=1, while remaining acceptably within the range N ±√2N. The model uncertainty on the PDFs generally exceeds the experimental uncertainty, if this has been evaluated using T=1, with the Hessian method.

Compare HESSIAN fitting to HERA combined data To HESSIAN fitting to H1 and ZEUS separately The differences between fitting the combination and fitting the separate data sets are covered by the model and param uncertainties- and would also be covered by a larger Δχ2 tolerance BUT we are trying our best to get it right and to define clearly where the uncertainties come from To Hessian fitting to HERA combined data PLUS model and param. errors

extras

Fortunately, there are smart ways to do this (Pascaud and Zomer LAL-95-05) Define matrices M jk = 1 ∂ 2 χ 2 C jλ = 1 ∂ 2 χ 2 2 ∂p j ∂p k 2 ∂p j ∂s λ Then M expresses the variation of χ2 wrt the theoretical parameters, accounting for the statistical errors, and C expresses the variation of χ2 wrt theoretical parameters and systematic uncertainty parameters. Then the covariance matrix accounting for statistical errors is V p = M -1 and the covariance matrix accounting for correlated systematic uncertainties is V ps = M -1 CC T M -1. The total covariance matrix V tot = V p + V ps is used for the standard propagation of errors to any distribution F which is a function of the theoretical parameters = T Σ j Σ k ∂ F V jk tot ∂ F ∂ pj ∂ pk Where T is the χ2 tolerance, T = 1 for the OFFSET method. This is a conservative method which gives predictions as close as possible to the central values of the published data. It does not use the full statistical power of the fit to improve the estimates of s λ, since it chooses to distrust that systematic uncertainties are Gaussian distributed.

Luckily there are also smart ways to do perform the Hessian method CTEQ have given an analytic method CTEQ hep-ph/ ,hep-ph/ χ2 = Σ i [ F i QCD (p) – F i MEAS ] 2 - B A -1 B ( s i STAT ) 2 where B λ = Σ i Δ iλ sys [F i QCD (p) – F i MEAS ], A λμ = δ λμ + Σ i Δ iλ sys Δ iμ sys ( s i STAT ) 2 such that the contributions to χ2 from statistical and correlated sources can be evaluated separately.

This leads them to suggest a modification of the χ2 tolerance, Δχ2 = 1, with which errors are evaluated such that Δχ2 = T 2, T = 10. Why? Pragmatism. The size of the tolerance T is set by considering the distances from the χ2 minima of individual data sets from the global minimum for all the eigenvector combinations of the parameters of the fit. All of the world’s data sets must be considered acceptable and compatible at some level, even if strict statistical criteria are not met, since the conditions for the application of strict statistical criteria, namely Gaussian error distributions are also not met. One does not wish to lose constraints on the PDFs by dropping data sets, but the level of inconsistency between data sets must be reflected in the uncertainties on the PDFs. CTEQ6 look at eigenvector combinations of their parameters rather than the parameters themselves. They determine the 90% C.L. bounds on the distance from the global minimum from ∫ P(χ e 2, N e ) dχ e 2 =0.9 for each experiment illustration for eigenvector-4 Distance WHY change the χ2 tolerance?