
Luca Lista, IEEE NSS-MIC 2003, Portland A Toolkit for Multi-variate Fitting Designed with Template Metaprogramming Luca Lista 1, Francesco Fabozzi 1,2.


1 A Toolkit for Multi-variate Fitting Designed with Template Metaprogramming
Luca Lista (1), Francesco Fabozzi (1,2)
(1) INFN Napoli; (2) Università della Basilicata

2 Introduction
The toolkit provides:
- a language to describe and model parametric fit problems in C++
- utilities to study the frequentist properties of the fit
It is not intended to provide new mathematical algorithms:
- the underlying minimization engine is Minuit
Motivated by analyses in the BaBar experiment that require complex fit modeling and Toy MC.

3 Main functionalities
Description of Probability Distribution Functions (PDF):
- most common PDFs provided (Gaussian, Poisson, etc.)
- random number generators for each provided PDF
- utilities to combine PDFs
Manipulation of symbolic expressions:
- simplifies the definition of PDF models and fit functions
Fitter tools:
- different Unbinned Maximum Likelihood (UML) fitters and a chi-square fitter are supported
Toy Monte Carlo:
- utility to generate random data samples to validate the fit results (pull distribution, fit bias estimate, etc.)
User-defined components can be easily plugged in.

4 Design choices
The code is optimized for speed:
- Toy Monte Carlo studies of complex fits are very CPU intensive
Speed can be achieved without losing good OO design:
- avoid virtual functions where not necessary
- use template generic programming
- the Boost C++ library provides powerful tools
Metaprogramming permits type manipulations at compile time.
Users don’t “see” these technical details in the interface.
External package dependencies are well isolated:
- random number generator engines (ROOT, CLHEP, …)
- Minuit wrapper (ROOT, …)
Other minimizers may be adopted (NAG, …).
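The point about avoiding virtual functions can be illustrated with a minimal sketch. It is not the toolkit's code: `UnitFlat`, `Triangle` and `integrate` are hypothetical names introduced here, and the midpoint rule is only a stand-in for whatever numerical work a real fit would do. The PDF type is a template parameter, so the call `pdf( x )` is bound at compile time and can be inlined.

```cpp
#include <cassert>

// Hypothetical PDF types: plain structs with operator(), no common base class.
struct UnitFlat {
  double operator()(double x) const { return (x < 0 || x > 1) ? 0.0 : 1.0; }
};

struct Triangle {
  // Density 2x on [0, 1], zero elsewhere.
  double operator()(double x) const { return (x < 0 || x > 1) ? 0.0 : 2.0 * x; }
};

// The PDF type is a template parameter: pdf(x) is resolved at compile time,
// so no virtual dispatch is involved and the call can be inlined.
template <typename Pdf>
double integrate(const Pdf& pdf, double lo, double hi, int steps) {
  assert(steps > 0 && hi > lo);
  const double h = (hi - lo) / steps;
  double sum = 0.0;
  for (int i = 0; i < steps; ++i) {
    sum += pdf(lo + (i + 0.5) * h);  // midpoint rule
  }
  return sum * h;
}
```

Calling `integrate( Triangle(), 0.0, 1.0, 100 )` instantiates a version of the loop specialized for `Triangle`; both example PDFs integrate to 1 over [0, 1].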

5 PDF interface
A PDF implements the “()” operator: P = f( x, y, … )
The arguments are the variable set; a sequence of any variable type is supported.
Users can define new PDFs respecting the above interface.

    struct Flat {
      Flat( double a, double b ) : min( a ), max( b ) { }
      // returns dP(x) / dx
      double operator()( double x ) const {
        return ( x < min || x > max ? 0 : 1 / ( max - min ) );
      }
      double min, max;
    };

    struct Poissonian {
      Poissonian( double m ) : mean( m ) { }
      // returns P(n)
      double operator()( int n ) const {
        return ( exp( - mean ) * pow( mean, n ) / factorial( n ) );
      }
      double mean;
    };
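The slide's `Poissonian` relies on a `factorial` helper that is not shown. As a hedged, self-contained variant of the same interface, the sketch below (the name `PoissonianSketch` is hypothetical) evaluates the probability through its logarithm with `std::lgamma`, which avoids overflow of n! for large n. This is an alternative formulation, not the toolkit's implementation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-alone PDF following the "operator()" interface.
struct PoissonianSketch {
  explicit PoissonianSketch(double m) : mean(m) { assert(m > 0); }

  // Returns P(n) = exp(-mean) * mean^n / n!, computed as
  // exp(-mean + n * log(mean) - log(n!)) with log(n!) = lgamma(n + 1).
  double operator()(int n) const {
    if (n < 0) return 0.0;
    return std::exp(-mean + n * std::log(mean) - std::lgamma(n + 1.0));
  }

  double mean;
};
```

For example, `PoissonianSketch( 2.0 )( 0 )` evaluates to exp(-2).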

6 Random number generators
A generator implements the “generate” method: r.generate( x, y, … )
The random engine (CLHEP, ROOT, …) is a template parameter; the generator for a given PDF is a partial specialization:

    template< typename Generator >
    struct RandomGenerator< Flat, Generator > {
      RandomGenerator( const Flat& pdf ) :
        _min( pdf.min ), _max( pdf.max ) { }
      void generate( double & x ) const {
        x = Generator::shootFlat( _min, _max );
      }
    private:
      const double& _min, &_max;
    };

Users can define new generators with their preferred method. Numerical implementations are also provided:

    RANDOM_GENERATOR_SAMPLE(MyPdf, Bins, Min, Max)      // trapezoidal PDF sampling
    RANDOM_GENERATOR_HITORMISS(MyPdf, Min, Max, fMax)   // “hit or miss” technique
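The “hit or miss” technique named above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the expansion of `RANDOM_GENERATOR_HITORMISS`: the names `HitOrMissSketch`, `Ramp` and `sampleMean` are hypothetical, `std::mt19937` stands in for the CLHEP or ROOT engine, and `generate` is non-const here because the engine state changes on every call.

```cpp
#include <cassert>
#include <random>

// Hypothetical PDF used for the demonstration: density 2x on [0, 1].
struct Ramp {
  double operator()(double x) const { return (x < 0 || x > 1) ? 0.0 : 2.0 * x; }
};

// Hit-or-miss sampling: draw x uniformly in [min, max) and y uniformly in
// [0, fMax); accept x when y falls below pdf(x). fMax must bound the PDF.
template <typename Pdf, typename Engine = std::mt19937>
class HitOrMissSketch {
 public:
  HitOrMissSketch(const Pdf& pdf, double min, double max, double fMax,
                  unsigned seed = 12345u)
      : pdf_(pdf), x_(min, max), y_(0.0, fMax), engine_(seed) {
    assert(max > min && fMax > 0);
  }

  void generate(double& x) {
    do {
      x = x_(engine_);
    } while (y_(engine_) >= pdf_(x));  // reject and retry on a "miss"
  }

 private:
  Pdf pdf_;
  std::uniform_real_distribution<double> x_, y_;
  Engine engine_;
};

// Helper for checking a generator: average of n generated values.
template <typename Generator>
double sampleMean(Generator& gen, int n) {
  double sum = 0.0, x = 0.0;
  for (int i = 0; i < n; ++i) {
    gen.generate(x);
    sum += x;
  }
  return sum / n;
}
```

With `Ramp` on [0, 1] and `fMax = 2`, half of the trials are accepted on average, and the mean of the generated values should approach 2/3.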

7 Combining PDFs
Argus + Gaussian peaking, with a 10% peaking component:

    Argus shoulder( 5.20, 5.28, -0.1 );
    Gaussian peak( 5.28, 0.05 );
    typedef Mixture< Gaussian, Argus > Mix;
    Mix pdf( peak, shoulder, 0.1 );
    RandomGenerator rnd;
    double x;
    rnd.generate( x );

2D Gaussian peaking:

    Gaussian sigX( 5.28, 0.05 );
    Gaussian sigY( 0, 0.015 );
    typedef Independent< Gaussian, Gaussian > SigXY;
    RandomGenerator rndXY;
    double x, y;
    rndXY.generate( x, y );

Random generators for the combined PDFs are defined automatically.
Transformation of variables is also supported:
- random variables are generated in the original coordinate system, then transformed
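How such combinations can be expressed with templates is sketched below. The class names `MixtureSketch`, `IndependentSketch` and `FlatSketch` are hypothetical, and the convention that the fraction applies to the first component is an assumption taken from the “10% peaking component” example, where the peak is the first argument.

```cpp
#include <cassert>

// Hypothetical flat PDF on [min, max].
struct FlatSketch {
  FlatSketch(double a, double b) : min(a), max(b) { assert(b > a); }
  double operator()(double x) const {
    return (x < min || x > max) ? 0.0 : 1.0 / (max - min);
  }
  double min, max;
};

// Weighted sum of two PDFs of the same variable:
// P(x) = f * P1(x) + (1 - f) * P2(x).
template <typename Pdf1, typename Pdf2>
struct MixtureSketch {
  MixtureSketch(const Pdf1& p1, const Pdf2& p2, double f)
      : first(p1), second(p2), fraction(f) {
    assert(f >= 0 && f <= 1);
  }
  double operator()(double x) const {
    return fraction * first(x) + (1.0 - fraction) * second(x);
  }
  Pdf1 first;
  Pdf2 second;
  double fraction;
};

// Product of two PDFs of independent variables: P(x, y) = P1(x) * P2(y).
template <typename Pdf1, typename Pdf2>
struct IndependentSketch {
  IndependentSketch(const Pdf1& p1, const Pdf2& p2) : first(p1), second(p2) {}
  double operator()(double x, double y) const { return first(x) * second(y); }
  Pdf1 first;
  Pdf2 second;
};
```

Because the component types are template parameters, the combined objects expose the same “()” interface and can be nested, for example a mixture of two independent products.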

8 Fit PDF parameters and run Toy MC

    const int sig = 100;
    double mean = 0, sigma = 1;

    // definition of fit model and fitter
    Gaussian pdf( mean, sigma );
    Likelihood like( pdf );
    UMLParameterFitter< … > fitter( like );
    // parameters “linked” to the fitter
    fitter.addParameter( "mean", & pdf.mean );
    fitter.addParameter( "sigma", & pdf.sigma );

    // Poisson PDF for MC generation
    Poissonian num( sig ); // alternative: Constant
    Gaussian pdfExp( mean, sigma );
    Experiment experiment( num, pdfExp );

    for ( int i = 0; i < 50000; i++ ) {
      Sample sample;
      experiment.generate( sample );
      double par[ 2 ] = { mean, sigma }, err[ 2 ] = { 1, 1 }, logLike;
      logLike = fitter.fit( par, err, sample );
      double pullm = ( par[ 0 ] - mean ) / err[ 0 ];
      double pulls = ( par[ 1 ] - sigma ) / err[ 1 ];
    }

The type list is deduced from the Likelihood type.
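The pull study can be reproduced in a stand-alone sketch for the Gaussian mean, under simplifying assumptions that differ from the slide: the maximum likelihood estimates of a Gaussian are analytic, so no minimizer is called; the number of events per toy is fixed rather than Poisson distributed; and `std::mt19937` replaces the toolkit's generators. `PullSummary` and `toyPullsOfMean` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct PullSummary {
  double mean;  // average pull over all toys
  double rms;   // root mean square of the pulls
};

// Runs nToys toy experiments of nEvents Gaussian events each and summarises
// the pull of the fitted mean, (fitMean - trueMean) / errMean.
inline PullSummary toyPullsOfMean(int nToys, int nEvents, double trueMean,
                                  double trueSigma, unsigned seed) {
  assert(nToys > 0 && nEvents > 1 && trueSigma > 0);
  std::mt19937 engine(seed);
  std::normal_distribution<double> gauss(trueMean, trueSigma);
  double sumPull = 0.0, sumPull2 = 0.0;
  for (int t = 0; t < nToys; ++t) {
    double sum = 0.0, sum2 = 0.0;
    for (int i = 0; i < nEvents; ++i) {
      const double x = gauss(engine);
      sum += x;
      sum2 += x * x;
    }
    const double fitMean = sum / nEvents;                      // ML estimate of the mean
    const double fitVar = sum2 / nEvents - fitMean * fitMean;  // ML variance (1/n form)
    const double errMean = std::sqrt(fitVar / nEvents);        // error on the mean
    const double pull = (fitMean - trueMean) / errMean;
    sumPull += pull;
    sumPull2 += pull * pull;
  }
  PullSummary s;
  s.mean = sumPull / nToys;
  s.rms = std::sqrt(sumPull2 / nToys);
  return s;
}
```

For an unbiased estimate with a correctly estimated error, the pull distribution should have mean close to 0 and RMS close to 1, which is what `toyPullsOfMean( 2000, 100, 0.0, 1.0, 42u )` is expected to show within statistical fluctuations.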

9 Parameter fit results (pulls)
There is a bias (as expected):
$\sigma^2 = \frac{1}{n} \sum_i ( x_i - \mu )^2 \neq \frac{1}{n-1} \sum_i ( x_i - \mu )^2$
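The size of this bias follows from a standard textbook derivation, given here as a worked step. Hats mark the fitted values and plain symbols the true ones, a notation introduced for this derivation only.

```latex
\begin{align}
\hat{\mu} &= \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \hat{\mu})^2 \\
\sum_{i=1}^{n} (x_i - \hat{\mu})^2 &= \sum_{i=1}^{n} (x_i - \mu)^2 - n\,(\hat{\mu} - \mu)^2 \\
E\left[\hat{\sigma}^2\right] &= \frac{1}{n}\left( n\,\sigma^2 - n\,\frac{\sigma^2}{n} \right)
 = \frac{n-1}{n}\,\sigma^2
\end{align}
```

The maximum likelihood variance is therefore low by a factor (n-1)/n on average, a relative bias of about 1% for samples of around 100 events.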

10 UML Yield fit

    const int sig = 10, bkg = 5;

    // in 2 dimensions: Gaussian signal, flat background in a signal box
    typedef Independent< Gaussian, Gaussian > PdfSig;
    typedef Independent< Flat, Flat > PdfBkg;
    PdfSig pdfSig( Gaussian( 0, 1 ), Gaussian( 0, 0.5 ) );
    PdfBkg pdfBkg( Flat( -5, 5 ), Flat( -5, 5 ) );

    // ext. likelihood with two samples
    typedef ExtendedLikelihood2< PdfSig, PdfBkg > Likelihood;
    Likelihood like( pdfSig, pdfBkg );
    // yield fitter extracts the yield of the two components
    UMLYieldFitter fitter( like );

    typedef Poissonian Fluctuation; // alternative: Constant
    Fluctuation fluctuationSig( sig ), fluctuationBkg( bkg );
    typedef Experiment< Fluctuation, PdfSig > ToySig;
    typedef Experiment< Fluctuation, PdfBkg > ToyBkg;
    ToySig toySig( fluctuationSig, pdfSig );
    ToyBkg toyBkg( fluctuationBkg, pdfBkg );
    Experiment2 toy( toySig, toyBkg );

    for ( int i = 0; i < 50000; i++ ) {
      Sample sample;
      toy.generate( sample );
      double s[] = { sig, bkg }, err[] = { 1, 1 };
      double logLike = fitter.fit( s, err, sample );
      double pull1 = ( s[0] - sig ) / err[0],
             pull2 = ( s[1] - bkg ) / err[1];
    }
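The quantity minimized in a yield fit can be sketched as follows. This is the standard two-component extended likelihood, -ln L = (s + b) - Σ ln( s·P_sig(x_i) + b·P_bkg(x_i) ) up to a constant, written for one variable instead of the two used on the slide; `negLogExtendedLikelihood` and `FlatSketch` are hypothetical names, and the toolkit's `ExtendedLikelihood2` may differ in detail.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical flat PDF on [min, max].
struct FlatSketch {
  FlatSketch(double a, double b) : min(a), max(b) { assert(b > a); }
  double operator()(double x) const {
    return (x < min || x > max) ? 0.0 : 1.0 / (max - min);
  }
  double min, max;
};

// Negative log of the extended likelihood for signal yield s and background
// yield b, dropping the constant term ln(n!).
template <typename PdfSig, typename PdfBkg>
double negLogExtendedLikelihood(const PdfSig& pdfSig, const PdfBkg& pdfBkg,
                                double s, double b,
                                const std::vector<double>& sample) {
  assert(s >= 0 && b >= 0 && s + b > 0);
  double nll = s + b;  // Poisson term for the total expected yield
  for (std::size_t i = 0; i < sample.size(); ++i) {
    nll -= std::log(s * pdfSig(sample[i]) + b * pdfBkg(sample[i]));
  }
  return nll;
}
```

A minimizer such as Minuit would vary `s` and `b` to minimize this function; the fitted yields and their errors then enter the pulls computed in the loop above.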

11 Yield fit results (pulls)
The pull distributions show a discrete structure because of low statistics.
Poisson fluctuation: mean signal yield = 10, mean background yield = 5

12 Combined yield and parameter fit
2D Gaussian signal over a 2D flat background: simultaneous fit of yields and Gaussian mean.

    const int sig = 10, bkg = 5;
    typedef Poissonian Fluctuation;
    Fluctuation fluctuationSig( sig ), fluctuationBkg( bkg );
    typedef Independent< Gaussian, Gaussian > PdfSig;
    typedef Independent< Flat, Flat > PdfBkg;
    Gaussian g1( 0, 1 ), g2( 0, 0.5 );
    Flat f1( -5, 5 ), f2( -5, 5 );
    PdfSig pdfSig( g1, g2 );
    PdfBkg pdfBkg( f1, f2 );
    typedef Experiment< Fluctuation, PdfSig > ToySig;
    typedef Experiment< Fluctuation, PdfBkg > ToyBkg;
    ToySig toySig( fluctuationSig, pdfSig );
    ToyBkg toyBkg( fluctuationBkg, pdfBkg );
    Experiment2 toy( toySig, toyBkg );

    typedef ExtendedLikelihood2< PdfSig, PdfBkg > Likelihood;
    Gaussian G1( 0, 1 );
    PdfSig pdfSig1( G1, g2 );
    Likelihood like( pdfSig1, pdfBkg );
    UMLYieldAndParameterFitter fitter( like );
    fitter.addParameter( "mean", & G1.mean );

    double pull1, pull2, pull3;
    for ( int i = 0; i < 50000; i++ ) {
      Sample sample;
      toy.generate( sample );
      double s[] = { sig, bkg, 0 };
      double err[] = { 1, 1, 1 };
      double logLike = fitter.fit( s, err, sample );
      pull1 = ( s[ 0 ] - sig ) / err[ 0 ];
      pull2 = ( s[ 1 ] - bkg ) / err[ 1 ];
      pull3 = ( s[ 2 ] - 0 ) / err[ 2 ];
    }

13 Symbolic function package
Symbolic expressions make the definition of PDFs easier.

    {
      X x; // declare the variable x

      // normalize using the symbolic integration at c-tor
      PdfNonParametric f1( sqr( sin(x) + cos(x) ), 0, 4 * M_PI );

      // recompute the normalization every time, since
      // the parameter tau may change from call to call
      Parameter tau( 0.123 );
      PdfParametric f2( x * exp( - tau * x ), 0, 10 );
    }

Users can specify different ways of performing normalization and integration.
Normalization: the analytic integral is performed by the compiler.
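One common way to build such symbolic expressions in C++ is the expression-template technique, sketched below. It only covers evaluation: the compile-time analytic integration mentioned on the slide is not attempted. The namespace `sym` and all class names in it are hypothetical and are not the toolkit's types.

```cpp
#include <cmath>

namespace sym {

// CRTP base that tags a type as a symbolic expression.
template <typename E>
struct Expr {
  const E& self() const { return static_cast<const E&>(*this); }
};

// The independent variable: evaluates to its argument.
struct X : Expr<X> {
  double operator()(double x) const { return x; }
};

// Node for lhs + rhs; the operand types are encoded in the node type.
template <typename A, typename B>
struct Sum : Expr<Sum<A, B>> {
  Sum(const A& lhs, const B& rhs) : a(lhs), b(rhs) {}
  double operator()(double x) const { return a(x) + b(x); }
  A a;
  B b;
};

template <typename A>
struct Sin : Expr<Sin<A>> {
  explicit Sin(const A& arg) : a(arg) {}
  double operator()(double x) const { return std::sin(a(x)); }
  A a;
};

template <typename A>
struct Cos : Expr<Cos<A>> {
  explicit Cos(const A& arg) : a(arg) {}
  double operator()(double x) const { return std::cos(a(x)); }
  A a;
};

template <typename A>
struct Sqr : Expr<Sqr<A>> {
  explicit Sqr(const A& arg) : a(arg) {}
  double operator()(double x) const {
    const double v = a(x);
    return v * v;
  }
  A a;
};

// Builders: each call returns a new node type, so the whole expression
// tree is known to the compiler.
template <typename A, typename B>
Sum<A, B> operator+(const Expr<A>& lhs, const Expr<B>& rhs) {
  return Sum<A, B>(lhs.self(), rhs.self());
}
template <typename A>
Sin<A> sin(const Expr<A>& e) { return Sin<A>(e.self()); }
template <typename A>
Cos<A> cos(const Expr<A>& e) { return Cos<A>(e.self()); }
template <typename A>
Sqr<A> sqr(const Expr<A>& e) { return Sqr<A>(e.self()); }

}  // namespace sym
```

With `sym::X x;`, the expression `sym::sqr( sym::sin( x ) + sym::cos( x ) )` has type `Sqr<Sum<Sin<X>, Cos<X>>>` and can be called like any other function object. Because the structure is visible in the type, a library can in principle apply compile-time rules to it, which is the property the slide relies on for normalization.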

14 Example of χ² fit

    {
      X x;
      Parameter a( 0 ), b( 1 ), c( 0 );
      Function parabola( c + x*( b + x*a ) );
      UniformPartition partition( 100, -1.0, 1.0 );
      Chi2< … > chi2( parabola, partition );
      Chi2Fitter< … > fitter( chi2 );
      fitter.addParameter( "a", a.ptr() );
      fitter.addParameter( "b", b.ptr() );
      fitter.addParameter( "c", c.ptr() );
      SampleErr sample( partition.bins() );
      // fill the sample...
      double par[] = { a, b, c }, err[] = { 1, 1, 1 };
      fitter.fit( par, err, sample );
    }
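The function that such a fitter minimizes is the usual χ² = Σ ( ( y_i - f(x_i) ) / σ_i )². A minimal sketch, assuming uncorrelated errors and evaluating the model at one point per bin, is given below; `BinSketch`, `ParabolaSketch` and `chi2Sketch` are hypothetical names, and the toolkit's `Chi2` class may treat bins differently (for example by integrating the function over each bin).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One measured bin: position, content and error on the content.
struct BinSketch {
  double x;
  double y;
  double err;
};

// Same parabola as in the slide: c + x * (b + x * a).
struct ParabolaSketch {
  double a, b, c;
  double operator()(double x) const { return c + x * (b + x * a); }
};

// Chi-square of a model f against a set of bins with uncorrelated errors.
template <typename F>
double chi2Sketch(const F& f, const std::vector<BinSketch>& bins) {
  double sum = 0.0;
  for (std::size_t i = 0; i < bins.size(); ++i) {
    assert(bins[i].err > 0);
    const double r = (bins[i].y - f(bins[i].x)) / bins[i].err;
    sum += r * r;
  }
  return sum;
}
```

A bin whose content lies exactly one error away from the model contributes 1 to the sum, so a good fit has a χ² close to the number of degrees of freedom.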

15 Possible future improvements
Upper limit extraction based on Toy Monte Carlo:
- could be based on existing code from a BaBar B-decay analysis
Support for χ² fits with correlated errors and covariance matrix
Provide more “standard” PDFs:
- Crystal Ball, Chebyshev polynomials, …
Managing singular PDFs:
- Dirac delta components
Managing (un)folding
…

16 Conclusion
We designed a new tool to model fit problems.
Using template generic programming we obtained:
- Generality: users can plug in new components (PDFs, transformations, random generators, etc.); external contributions are easy to incorporate in the tool
- Light weight: most of the code is contained in header ( #include ) files; mild external dependencies
- Ease of use: very “synthetic” and “expressive” code
- CPU speed: virtual function calls are extremely limited; most of the methods are inlined
Interest has been expressed by:
- Geant4 Statistical testing toolkit
- LCG/PI (LHC Computing Grid - Physics Interfaces)
We will focus on a release version shortly.

