HESSIAN vs OFFSET method

Slides:



Advertisements
Similar presentations
Design of Experiments Lecture I
Advertisements

Lara De Nardo BNL October 8, 2007 QCD fits to the g 1 world data and uncertainties THE HERMES EXPERIENCE QCD fits to the g 1 world data and uncertainties.
High Energy neutrino cross-sections HERA-LHC working week Oct 2007 A M Cooper-Sarkar, Oxford Updated predictions of high energy ν and ν CC cross-sections.
Low-x and PDF studies at LHC Sept 2008 A M Cooper-Sarkar, Oxford At the LHC high precision (SM and BSM) cross section predictions require precision Parton.
Copyright © 2009 Pearson Education, Inc. Chapter 29 Multiple Regression.
1 META PDFs as a framework for PDF4LHC combinations Jun Gao, Joey Huston, Pavel Nadolsky (presenter) arXiv: , Parton.
G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 10 1Probability, Bayes’ theorem, random variables, pdfs 2Functions.
HERAPDF0.2 and predictions for W/Z production at LHC PDF4LHC A M Cooper-Sarkar 29 May 2009 Motivation Some of the debates about the best way of estimating.
Sept 2003PHYSTAT41 Uncertainty Analysis (I) Methods.
May 2005CTEQ Summer School1 Uncertainties of Parton Distribution Functions Dan Stump Michigan State University.
Detect Unknown Systematic Effect: Diagnose bad fit to multiple data sets Advanced Statistical Techniques in Particle Physics Grey College, Durham 18 -
Sept 2003PHYSTAT41 Uncertainty Analysis (I) Methods.
Physics 114: Lecture 15 Probability Tests & Linear Fitting Dale E. Gary NJIT Physics Department.
1 Combination of PDF uncertainties for LHC observables Pavel Nadolsky (SMU) in collaboration with Jun Gao (Argonne), Joey Huston (MSU) Based on discussions.
A Neural Network MonteCarlo approach to nucleon Form Factors parametrization Paris, ° CLAS12 Europen Workshop In collaboration with: A. Bacchetta.
Update on fits for 25/3/08 AM Cooper-Sarkar Central fit: choice of parametrization Central fit: choice of error treatment Quality of fit to data PDFs plus.
Lecture 4 Basic Statistics Dr. A.K.M. Shafiqul Islam School of Bioprocess Engineering University Malaysia Perlis
Update of ZEUS PDF analysis A.M Cooper-Sarkar, Oxford DIS2004 New Analysis of ZEUS data alone using inclusive cross-sections from all of HERA-I data –
May 14 th 2008 averaging meeting A M Cooper-Sarkar Look at the HERA-I PDFs in new ways Flavour break-up High-x Compare to ZEUS data alone/ H1 data alone.
Predictions for high energy neutrino cross-sections from ZEUS-S Global fit analysis S Chekanov et al, Phys Rev D67, (2002) The ZEUS PDFs are sets.
Ronan McNulty EWWG A general methodology for updating PDF sets with LHC data Francesco de Lorenzi*, Ronan McNulty (University College Dublin)
More on NLOQCD fits ZEUS Collab Meeting March 2003 Eigenvector PDF sets- ZEUS-S 2002 PDFS accessible on HEPDATA High x valence distributions from ZEUS-Only.
Discussion of calculation of LHC cross sections and PDF/  s uncertainties J. Huston Michigan State University 1.
High Q 2 Structure Functions and Parton Distributions Ringberg Workshop 2003 : New Trends in HERA physics Benjamin Portheault LAL Orsay On behalf of the.
Further investigations on the fits to new data Jan 12 th 2009 A M Cooper-Sarkar Considering ONLY fits with Q 2 0 =1.9 or 2.0 –mostly comparing RTVFN to.
Treatment of correlated systematic errors PDF4LHC August 2009 A M Cooper-Sarkar Systematic differences combining ZEUS and H1 data  In a QCD fit  In a.
In the context of the HERA-LHC workshop the idea of combining the H1 and ZEUS data arose. Not just putting both data sets into a common PDF fit but actually.
SWADHIN TANEJA (STONY BROOK UNIVERSITY) K. BOYLE, A. DESHPANDE, C. GAL, DSSV COLLABORATION 2/4/2016 S. Taneja- DIS 2011 Workshop 1 Uncertainty determination.
Jets and α S in DIS Maxime GOUZEVITCH Laboratoire Leprince-Ringuet Ecole Polytechnique – CNRS/IN2P3, France On behalf of the collaboration On behalf of.
11 QCD analysis with determination of α S (M Z ) based on HERA inclusive and jet data: HERAPDF1.6 A M Cooper-Sarkar Low-x meeting June 3 rd 2011 What inclusive.
G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate.
H1 QCD analysis of inclusive cross section data DIS 2004, Štrbské Pleso, Slovakia, April 2004 Benjamin Portheault LAL Orsay On behalf of the H1 Collaboration.
CT14 PDF update J. Huston* PDF4LHC meeting April 13, 2015 *for CTEQ-TEA group: S. Dulat, J. Gao, M. Guzzi, T.-J. Hou, J. Pumplin, C. Schmidt, D. Stump,
HYPOTHESIS TESTING FOR DIFFERENCES BETWEEN MEANS AND BETWEEN PROPORTIONS.
H1 and ZEUS Combined PDF Fit DIS08 A M Cooper Sarkar on behalf of ZEUS and H1 HERA Structure Function Working Group NLO DGLAP PDF fit to the combined HERA.
D Parton Distribution Functions, Part 2. D CT10-NNLO Parton Distribution Functions.
MSTW update James Stirling (with Alan Martin, Robert Thorne, Graeme Watt)
Methods of multivariate analysis Ing. Jozef Palkovič, PhD.
1 Proton Structure Functions and HERA QCD Fit HERA+Experiments F 2 Charged Current+xF 3 HERA QCD Fit for the H1 and ZEUS Collaborations Andrew Mehta (Liverpool.
1 A M Cooper-Sarkar University of Oxford ICHEP 2014, Valencia.
MECH 373 Instrumentation and Measurements
Logic of Hypothesis Testing
Dependent-Samples t-Test
Physics 114: Lecture 13 Probability Tests & Linear Fitting
Chapter 23 Comparing Means.
Michigan State University
Global QCD Analysis and Collider Phenomenology — CTEQ
News from HERAPDF A M Cooper-Sarkar PDF4LHC CERN March
Going Backwards In The Procedure and Recapitulation of System Identification By Ali Pekcan 65570B.
Georgi Iskrov, MBA, MPH, PhD Department of Social Medicine
Chapter 9 Hypothesis Testing.
Statistics Review ChE 477 Winter 2018 Dr. Harding.
Statistical Issues in PDF fitting
Introduction to Instrumentation Engineering
HERA I - Preliminary H1 and ZEUS QCD Fit
Discrete Event Simulation - 4
Geology Geomath Chapter 7 - Statistics tom.h.wilson
Comparing Populations
Where did we stop? The Bayes decision rule guarantees an optimal classification… … But it requires the knowledge of P(ci|x) (or p(x|ci) and P(ci)) We.
10701 / Machine Learning Today: - Cross validation,
PDF4LHC: LHC needs February 2008 A M Cooper-Sarkar, Oxford
Counting Statistics and Error Prediction
May 14th 2008 averaging meeting A M Cooper-Sarkar
ATLAS 2.76 TeV inclusive jet measurement and its PDF impact A M Cooper-Sarkar PDF4LHC Durham Sep 26th 2012 In 2011, 0.20 pb-1 of data were taken at √s.
Computing and Statistical Data Analysis / Stat 7
Inferential Statistics
Chapter 24 Comparing Means Copyright © 2009 Pearson Education, Inc.
Psych 231: Research Methods in Psychology
Uncertainties of Parton Distribution Functions
Georgi Iskrov, MBA, MPH, PhD Department of Social Medicine
Presentation transcript:

HESSIAN vs OFFSET method PDF4LHC February 2008 A M Cooper-Sarkar Comparisons on using the SAME NLOQCD fit analysis in a global fit In a fit to just ZEUS data Model dependence in Hessian and Offset fits- comparing ZEUS and H1 Systematic differences combining ZEUS and H1 data In a QCD fit In a ‘theory free’ fit

Treatment of correlated systematic errors χ2 = Σi [ FiQCD (p) – Fi MEAS]2 (siSTAT)2+(ΔiSYS)2 Errors on the fit parameters, p, evaluated from Δχ2 = 1, THIS IS NOT GOOD ENOUGH if experimental systematic errors are correlated between data points- χ2 = Σi Σj [ FiQCD(p) – Fi MEAS] Vij-1 [ FjQCD(p) – FjMEAS] Vij = δij(бiSTAT)2 + Σλ ΔiλSYS ΔjλSYS Where ΔiλSYS is the correlated error on point i due to systematic error source λ It can be established that this is equivalent to χ2 = Σi [ FiQCD(p) –ΣλslDilSYS – Fi MEAS]2 + Σsl2 (siSTAT) 2 Where sλ are systematic uncertainty fit parameters of zero mean and unit variance This has modified the fit prediction by each source of systematic uncertainty CTEQ, ZEUS, H1, MRST/MSTW have all adopted this form of χ2 – but use it differently in the OFFSET and HESSIAN methods …hep-ph/0205153

How do experimentalists often proceed: OFFSET method Perform fit without correlated errors (sλ = 0) for central fit, and propagate statistical errors to the PDFs < б2q > = T Σj Σk ∂ q Vjk ∂ q ∂ pj ∂ pk Where T is the χ2 tolerance, T = 1. Shift measurement to upper limit of one of its systematic uncertainties (sλ = +1) Redo fit, record differences of parameters from those of step 1 Go back to 2, shift measurement to lower limit (sλ = -1) Go back to 2, repeat 2-4 for next source of systematic uncertainty Add all deviations from central fit in quadrature (positive and negative deviations added in quadrature separately) This method does not assume that correlated systematic uncertainties are Gaussian distributed Fortunately, there are smart ways to do this (Pascaud and Zomer LAL-95-05, Botje hep-ph-0110123)

< б2F > = T Σj Σk ∂ F Vjk tot ∂ F Fortunately, there are smart ways to do this (Pascaud and Zomer LAL-95-05) Define matrices Mjk = 1 ∂2 χ2 C jλ = 1 ∂2 χ2 2 ∂pj ∂pk 2 ∂pj ∂sλ Then M expresses the variation of χ2 wrt the theoretical parameters, accounting for the statistical errors, and C expresses the variation of χ2 wrt theoretical parameters and systematic uncertainty parameters. Then the covariance matrix accounting for statistical errors is Vp = M-1 and the covariance matrix accounting for correlated systematic uncertainties is Vps = M-1CCT M-1. The total covariance matrix Vtot = Vp + Vps is used for the standard propagation of errors to any distribution F which is a function of the theoretical parameters < б2F > = T Σj Σk ∂ F Vjk tot ∂ F ∂ pj ∂ pk Where T is the χ2 tolerance, T = 1 for the OFFSET method. This is a conservative method which gives predictions as close as possible to the central values of the published data. It does not use the full statistical power of the fit to improve the estimates of sλ, since it chooses to distrust that systematic uncertainties are Gaussian distributed.

There are other ways to treat correlated systematic errors- HESSIAN method (covariance method) Allow sλ parameters to vary for the central fit. The total covariance matrix is then the inverse of a single Hessian matrix expressing the variation of χ2 wrt both theoretical and systematic uncertainty parameters. If we believe the theory why not let it calibrate the detector(s)? Effectively the theoretical prediction is not fitted to the central values of published experimental data, but allows these data points to move collectively according to their correlated systematic uncertainties The fit determines the optimal settings for correlated systematic shifts such that the most consistent fit to all data sets is obtained. In a global fit the systematic uncertainties of one experiment will correlate to those of another through the fit The resulting estimate of PDF errors is much smaller than for the Offset method for Δχ2 = 1 CTEQ have used this method with Δχ2 = 100 for 90%CL limits MRST have used Δχ2 = 50 H1, Alekhin have used Δχ2= 1

χ2 = Σi [ FiQCD(p) – Fi MEAS]2 - B A-1B Luckily there are also smart ways to do this: CTEQ have given an analytic method CTEQ hep-ph/0101032,hep-ph/0201195 χ2 = Σi [ FiQCD(p) – Fi MEAS]2 - B A-1B (siSTAT) 2 where Bλ = Σi Δiλ sys [Fi QCD(p) – Fi MEAS] , Aλμ = δλμ + Σi Δiλ sys Δiμsys (siSTAT) 2 (siSTAT) 2 such that the contributions to χ2 from statistical and correlated sources can be evaluated separately.

illustration for eigenvector-4 WHY change the χ2 tolerance? CTEQ6 look at eigenvector combinations of their parameters rather than the parameters themselves. They determine the 90% C.L. bounds on the distance from the global minimum from ∫ P(χe2, Ne) dχe2 =0.9 for each experiment Distance This leads them to suggest a modification of the χ2 tolerance, Δχ2 = 1, with which errors are evaluated such that Δχ2 = T2, T = 10. Why? Pragmatism. The size of the tolerance T is set by considering the distances from the χ2 minima of individual data sets from the global minimum for all the eigenvector combinations of the parameters of the fit. All of the world’s data sets must be considered acceptable and compatible at some level, even if strict statistical criteria are not met, since the conditions for the application of strict statistical criteria, namely Gaussian error distributions are also not met. One does not wish to lose constraints on the PDFs by dropping data sets, but the level of inconsistency between data sets must be reflected in the uncertainties on the PDFs.

Compare gluon PDFs for HESSIAN and OFFSET methods for the ZEUS global PDF fit analysis Hessian method T2=1 Hessian method T2=50 The Hessian method gives comparable size of error band as the Offset method, when the tolerance is raised to T2 ~ 50 – (similar ball park to CTEQ, T2=100) BUT this was not just an effect of having many different data sets of differing levels of compatibility in this fit

Comparison off Hessian and Offset methods for ZEUS-JETS FIT 2005 which uses only ZEUS data For the gluon and sea distributions the Hessian method still gives a much narrower error band. A comparable size of error band to the Offset method, is again achieved when the tolerance is raised to T2 ~ 50. Note this is not a universal number T2 ~5 is more appropriate for the valence distributions. It depends on the relative size of the systematic and statistical experimental errors which contribute to the distribution

Model dependence is also important when comparing Hessian and Offset methods The statistical criterion for parameter error estimation within a particular hypothesis is Δχ2 = T2 = 1. But for judging the acceptability of an hypothesis the criterion is that χ2 lie in the range N ±√2N, where N is the number of degrees of freedom There are many choices, such as the form of the parametrization at Q20, the value of Q02 itself, the flavour structure of the sea, etc., which might be considered as superficial changes of hypothesis, but the χ2 change for these different hypotheses often exceeds Δχ2=1, while remaining acceptably within the range N ±√2N. The model uncertainty on the PDFs generally exceeds the experimental uncertainty, if this has been evaluated using T=1, with the Hessian method. Compare ZEUS-JETS 2005 and H1 PDF2000 both of which use restricted data sets, both use Δχ2=1 but with OFFSET and HESSIAN methods respectively

For the H1 analysis model uncertainties are larger than the HESSIAN experimental uncertainties because each change of model assumption can give a different set of systematic uncertainty parameters, sλ, and thus a different estimate of the shifted positions of the data points. For the H1 fit √2N ~ 35 For the ZEUS analysis model uncertainties are smaller than the OFFSET experimental uncertainties because sλ cannot change in evaluation of the central values and because the OFFSET uncertainty is equivalent to T2~50 - larger than the hypothesis uncertainty for the ZEUS fit √2N ~ 35

ZEUS/H1 published fits comparison including model uncertainties gives similar total uncertainty

QCD fits to both ZEUS and H1 data One can make an NLOQCD fit to both data sets using the Hessian method OR one can combine the data sets using the Hessian method with no theoretical assumption- other than that the data measure the same ‘truth’ The systematic shift parameters as determined by these two fits are quite different. systematic shift sλ QCDfit ZEUS+H1 ‘theory free’ ZEUS+H1 zd1_e_eff 1.65 -0.41 zd3_e_theta_b -1.26 -0.29 zd4_e_escale -1.04 1.05 zd6_had2 -0.85 0.01 zd7_had3 1.05 -0.73 h2_Ee_Spacal -0.51 0.63 h8_H_Scale_L -0.26 -0.99 h9_Noise_Hca 1.00 -0.43 h11_GP_BG_LA -0.36 1.44 What then is the optimal setting for these parameters? One can also make an NLOQCD fit to the combined data set The QCDfit to the combined HERA data set gives different central values from the QCDfit to the separate data sets

QCD PDF fit to the H1 and ZEUS combined data set QCD PDF fit to H1 and ZEUS separate data sets The central values of the PDFs are rather different particularly for gluon and dv because the systematic shifts determined by these fits are different NOTE: this is very preliminary and there is no model uncertainty applied

Summary OFFSET method is non Bayesian and non-optimal in terms of using the full statistical power of the fit to set the systematic shifts sλ BUT it does produce uncertainty estimates compatible with those of the HESSIAN method with increased tolerance T2 ~50-100 The uncertainty estimates are also generous enough to encompass a large variety of model dependent uncertainties Systematic shifts set by the Hessian method depend on the theoretical input of the fit- this makes experimentalists feel uncomfortable