Unfolding with system identification


Unfolding with system identification: dimensional independent unfolding with D-optimal system identification. Nikolai Gagunashvili, Faculty of Information Technology, University of Akureyri, Iceland, nikolai@unak.is. 5/4/2019 PHYSTAT 05 Oxford, UK

Contents: Introduction; Basic equation; System identification; The unfolding procedure; A numerical example; Conclusions; References.

Introduction

Basic equation. We use a linear model for the transformation of a true distribution to the measured one, $f = P\varphi + \varepsilon$, where $f = (f_1, f_2, \ldots, f_m)^T$ is the vector of the experimentally measured histogram content, $\varphi = (\varphi_1, \varphi_2, \ldots, \varphi_n)^T$ is the vector of the true histogram content, and $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_m)^T$ is a vector of random residual components with mean value 0 and diagonal variance matrix $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_m^2)$, where $\sigma_1, \ldots, \sigma_m$ are the statistical errors of the measured distribution. $P$ is the $m \times n$ matrix of the transformation.

Basic equation (cont.) The least squares method gives an estimator of the true distribution, $\hat{\varphi} = (P^T \Sigma^{-1} P)^{-1} P^T \Sigma^{-1} f$, with $\Sigma$ the diagonal variance matrix of $\varepsilon$. The estimator $\hat{\varphi}$ is called the unfolded distribution. According to the least squares method, the full matrix of errors of the unfolded distribution is $(P^T \Sigma^{-1} P)^{-1}$.
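
The least squares step can be sketched in a few lines of NumPy. This is only an illustration, assuming the standard weighted least squares formulas $\hat{\varphi} = (P^T \Sigma^{-1} P)^{-1} P^T \Sigma^{-1} f$ and error matrix $(P^T \Sigma^{-1} P)^{-1}$. The function name `unfold_wls` and all numbers in the toy example are hypothetical and do not come from the talk.

```python
import numpy as np

def unfold_wls(P, f, sigma):
    """Weighted least squares estimate of phi in the model f = P phi + eps.

    P     : (m, n) transformation matrix
    f     : (m,) measured histogram content
    sigma : (m,) statistical errors of the measured bins
    """
    W = np.diag(1.0 / np.asarray(sigma, dtype=float) ** 2)  # Sigma^{-1}
    cov = np.linalg.inv(P.T @ W @ P)   # full matrix of errors of the estimate
    phi_hat = cov @ P.T @ W @ f        # unfolded distribution
    return phi_hat, cov

# Toy example: 3 measured bins, 2 true bins (made-up numbers).
P = np.array([[0.8, 0.1],
              [0.2, 0.7],
              [0.0, 0.2]])
phi_true = np.array([100.0, 50.0])
f = P @ phi_true          # noise-free "measurement", so the fit is exact
sigma = np.sqrt(f)        # Poisson-like errors (an assumption)
phi_hat, cov = unfold_wls(P, f, sigma)
```

With noise-free input the estimate reproduces `phi_true`. With real data the diagonal of `cov` gives the squared statistical errors of the unfolded bins, and the off-diagonal elements give their correlations.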

Solving the unfolding (inverse) problem has two stages: 1. Investigation and calculation of the matrix P. This is known as the system identification problem, which may be defined as the process of determining a model of a dynamic system from observed input-output data. 2. Solution of equation (1), which gives the unfolded function $\hat{\varphi}$ with the complete matrix of statistical errors of $\hat{\varphi}$.

System identification or calculation of matrix P. A Monte-Carlo simulation of the set-up can be used to obtain input-output data for identification. The most popular input control signals are the step control signal and the impulse control signal. The matrix P can be calculated using impulse control signals for identification. Equation (1), with the matrix calculated this way, gives a strongly fluctuating unfolded distribution with large statistical errors. It is also possible that no solution exists, because the matrix that must be inverted in the least squares solution is singular.

System identification or calculation of matrix P. Regularization of solution (1) can be achieved if, for the system identification, we use not an impulse control distribution but a priori distributions that may be known from theory or from other experimental data, or can be proposed hypothetically. Assume we have q control generated distributions and present them as a $q \times n$ matrix $\Phi$, where each row represents one control generated histogram.

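System identification or calculation of matrix P. For each i-th row $p_i$ of the matrix P we can write the equation $f_i^c = \Phi p_i + \varepsilon_i$, where $\Phi$ is the matrix of control generated distributions, $f_i^c$ is the vector of the reconstructed i-th bin content for the different generated control distributions, and $\varepsilon_i$ is a vector of random residuals with expectation value 0 and diagonal variance matrix $\Sigma_i = \mathrm{diag}(\sigma_{i1}^2, \ldots, \sigma_{iq}^2)$, where $\sigma_{ij}$ is the statistical error of the reconstructed distribution for the i-th bin and j-th control generated distribution. The least squares method gives an estimator for $p_i$: $\hat{p}_i = (\Phi^T \Sigma_i^{-1} \Phi)^{-1} \Phi^T \Sigma_i^{-1} f_i^c$.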
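
A row of P can be estimated the same way as the unfolded distribution, with the control generated histograms playing the role of the design matrix. The sketch below assumes the weighted least squares form $\hat{p}_i = (\Phi^T \Sigma_i^{-1} \Phi)^{-1} \Phi^T \Sigma_i^{-1} f_i^c$. The names `identify_row`, `Phi`, `f_c` and `sigma_c` are illustrative choices, and the control histograms are random toy numbers rather than Monte-Carlo output.

```python
import numpy as np

def identify_row(Phi, f_c, sigma_c):
    """Weighted least squares estimate of one row p_i of the matrix P.

    Phi     : (q, n) control generated histograms, one per row
    f_c     : (q,) reconstructed content of bin i for each control distribution
    sigma_c : (q,) statistical errors of f_c
    """
    w = 1.0 / np.asarray(sigma_c, dtype=float) ** 2   # diagonal of Sigma_i^{-1}
    A = Phi.T @ (w[:, None] * Phi)                    # Phi^T Sigma_i^{-1} Phi
    b = Phi.T @ (w * f_c)                             # Phi^T Sigma_i^{-1} f_c
    return np.linalg.solve(A, b)

# Toy example: q = 6 control distributions with n = 3 true bins each.
rng = np.random.default_rng(0)
Phi = rng.uniform(10.0, 100.0, size=(6, 3))
p_true = np.array([0.1, 0.7, 0.2])    # hypothetical row of P
f_c = Phi @ p_true                    # noise-free reconstructed bin content
sigma_c = np.sqrt(f_c)                # Poisson-like errors (an assumption)
p_hat = identify_row(Phi, f_c, sigma_c)
```

Solving the normal equations with `np.linalg.solve` avoids forming an explicit inverse. If the columns of `Phi` are strongly correlated the system becomes ill-conditioned, which is the situation that motivates fitting only a subset of the elements of the row.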

System identification or calculation of matrix P. Columns of the matrix of control generated distributions can correlate with each other. This means that the transformation of a control generated distribution to the i-th bin of the reconstructed distribution can be parameterized by a subset of the elements of the row $p_i$. There may be more than one subset that describes this transformation sufficiently well.

System identification or calculation of matrix P. Thus for each i-th reconstructed bin we have $N_i$ candidate rows, and for all reconstructed bins $N_1 N_2 \cdots N_m$ candidate matrices P. We need to choose a matrix P that is good, or optimal, in some sense. The most convenient criterion in this case is D-optimality, which is related to minimizing the determinant of the full matrix of errors of the unfolded distribution, $\det\left[(P^T \Sigma^{-1} P)^{-1}\right]$, where $\Sigma$ is the variance matrix of the measured distribution.
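
For a small problem, the D-optimal choice can be illustrated by brute force: build every candidate matrix from one candidate row per bin and keep the one with the smallest determinant of the error matrix. The sketch below does this by maximizing $\log\det(P^T \Sigma^{-1} P)$, which is equivalent. Exhaustive enumeration is an assumption made for clarity; the talk does not say which optimization algorithm is used, and the number of combinations grows too quickly for this to be practical in general. The name `d_optimal_matrix` and the candidate rows are made up.

```python
import itertools
import numpy as np

def d_optimal_matrix(candidates, sigma):
    """Pick one candidate row per measured bin so that
    det[(P^T Sigma^{-1} P)^{-1}] is minimal.

    candidates : list of m lists; candidates[i] holds the candidate rows for bin i
    sigma      : (m,) statistical errors of the measured distribution
    """
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2
    best_P, best_logdet = None, -np.inf
    for rows in itertools.product(*candidates):
        P = np.vstack(rows)
        info = P.T @ (w[:, None] * P)              # P^T Sigma^{-1} P
        sign, logdet = np.linalg.slogdet(info)
        if sign > 0 and logdet > best_logdet:      # skip singular candidates
            best_P, best_logdet = P, logdet
    return best_P, -best_logdet                    # log det of the error matrix

# Toy example: 3 measured bins, 2 true bins, 2 * 1 * 2 = 4 candidate matrices.
candidates = [
    [[0.8, 0.1], [0.9, 0.0]],
    [[0.2, 0.7]],
    [[0.0, 0.2], [0.0, 0.0]],
]
best_P, logdet_err = d_optimal_matrix(candidates, sigma=[1.0, 1.0, 1.0])
```

Working with `slogdet` rather than `det` keeps the comparison numerically stable when the determinant is very small or very large.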

System identification or calculation of matrix P. Main advantages of D-optimization: it minimizes the generalized variance of the components of the unfolded distribution; it minimizes the volume of the confidence ellipsoid for this distribution; and there are many computer algorithms for the optimization. Further optimization can be achieved by introducing selection criteria for the control generated distributions used for identification.

System identification or calculation of matrix P. Selection criteria for control generated distributions. Each control generated distribution has a corresponding reconstructed control distribution, which can be compared with the experimentally measured distribution by a χ² test. For identification, we take a generated distribution whose corresponding reconstructed distribution satisfies the selection criterion χ² < a. The parameter a defines a significance level p(a) for the comparison of the two histograms. Applying this selection criterion increases the number of candidate matrices P and can decrease the determinant of the full matrix of errors and the statistical errors of the unfolded distribution.
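
The selection step can be sketched as a simple filter on the control distributions. The statistic below, $\chi^2 = \sum_i (f_i - r_i)^2 / (\sigma_{f,i}^2 + \sigma_{r,i}^2)$, is a simplified stand-in that assumes both histograms are already normalized to the same total. It is not necessarily the test used in the talk, and the names `chi2_distance` and `select_controls` and all numbers are hypothetical.

```python
import numpy as np

def chi2_distance(f, sigma_f, r, sigma_r):
    """Simplified chi-square between a measured histogram f and a
    reconstructed control histogram r (same binning, same normalization)."""
    f, r = np.asarray(f, dtype=float), np.asarray(r, dtype=float)
    var = np.asarray(sigma_f, dtype=float) ** 2 + np.asarray(sigma_r, dtype=float) ** 2
    return float(np.sum((f - r) ** 2 / var))

def select_controls(f, sigma_f, recos, sigmas_reco, a):
    """Indices of the control distributions whose reconstructed
    histogram satisfies chi2 < a."""
    return [j for j, (r, s) in enumerate(zip(recos, sigmas_reco))
            if chi2_distance(f, sigma_f, r, s) < a]

# Toy example with two reconstructed control histograms.
f = [100.0, 200.0, 100.0]
sigma_f = np.sqrt(f)
recos = [[105.0, 195.0, 100.0], [150.0, 150.0, 100.0]]
sigmas_reco = [[5.0, 5.0, 5.0], [5.0, 5.0, 5.0]]
selected = select_controls(f, sigma_f, recos, sigmas_reco, a=5.0)
```

Here only the first control distribution passes the cut. A smaller `a` keeps only control distributions whose reconstructed shape is close to the data.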

System identification or calculation of matrix P. Diagram labels: experimental distribution; Monte-Carlo; set of generated control distributions; set of reconstructed control distributions.

The unfolding procedure
1. Initialization: define a binning for the experimental data; define a binning for the unfolded distribution.
2. System identification: choose a set of control generated distributions; calculate the set of candidates for the matrix P; calculate the D-optimal matrix P.
3. Basic equation solution: calculate the unfolded distribution with the full matrix of errors.
4. Test of goodness of the unfolding: fit the unfolded distribution and compare the experimental distribution with the reconstructed simulated distribution.

The unfolding procedure. Initialization: define a binning for the experimental data.

The unfolding procedure. Initialization: define a binning for the unfolded distribution.

A numerical example. We take a true distribution φ(x) with parameters A1, A2, B1, B2, C1 and C2. An experimentally measured distribution f(x) is defined from φ(x), the acceptance function A(x) and the detector resolution function R(x, x′) with σ = 1.5.

Figure: an example of the true distribution φ(x), the acceptance function A(x) and the resolution function R(x,10).

Figure: an example of the measured distribution f.

A numerical example (cont.) A histogram of the measured distribution was obtained by simulating $10^4$ events. The parameters that define a distribution for identification are generated uniformly on the intervals [1, 3] for A1; [0.5, 1.5] for A2; [8, 12] for B1; [10, 18] for B2; [0.5, 1.5] for C1; and [0.5, 1.5] for C2.
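
Drawing the parameters of the control distributions can be sketched as follows. Only the intervals are taken from the slide. The function name `sample_control_parameters`, the seed and the number of parameter sets are assumptions, and the sketch stops at the parameters because the functional form of the distribution is not reproduced in this transcript.

```python
import numpy as np

# Uniform sampling intervals for the six parameters, as listed on the slide.
INTERVALS = {
    "A1": (1.0, 3.0),
    "A2": (0.5, 1.5),
    "B1": (8.0, 12.0),
    "B2": (10.0, 18.0),
    "C1": (0.5, 1.5),
    "C2": (0.5, 1.5),
}

def sample_control_parameters(q, seed=0):
    """Draw q parameter sets, each defining one distribution for identification."""
    rng = np.random.default_rng(seed)
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in INTERVALS.items()}
        for _ in range(q)
    ]

controls = sample_control_parameters(q=5)
```

Each dictionary in `controls` would then be passed to the generator of the true distribution and through the detector simulation to obtain one generated and one reconstructed control histogram.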

Figure: control distributions generated for system identification and the unfolded distribution for different χ² cuts.

Conclusions. The proposed method uses a set of a priori distributions for identification to obtain a stable solution of the unfolding problem. D-optimization and the application of the least squares method make it possible to minimize the statistical errors of the solution. The χ² selection criterion makes it possible to decrease the possible bias of the procedure. The procedure has no restriction due to the dimensionality of the problem. The procedure can be applied to unfolding problems with smooth as well as non-smooth solutions. Being based only on a statistical approach, the method has a good statistical interpretation.

References
V. Blobel, Unfolding methods in high-energy physics experiments, CERN 85-02 (1985).
V. P. Zhigunov, Improvement of resolution function as an inverse problem, Nucl. Instrum. Meth. 216 (1983) 183.
A. Höcker, V. Kartvelishvili, SVD approach to data unfolding, Nucl. Instrum. Meth. A372 (1996) 469.
N. D. Gagunashvili, Unfolding of true distributions from experimental data distorted by detectors with finite resolutions, Nucl. Instrum. Meth. A451 (1993) 657.
N. D. Gagunashvili, χ² test for comparison of weighted and unweighted histograms, PHYSTAT 05, Oxford, UK.