DC2 at GSFC - 28 Jun 05 - T. Burnett 1
DC2 C++ decision trees
Toby Burnett, Frank Golf
- Quick review of classification (or decision) trees
- Training and testing
- How Bill does it with Insightful Miner
- Applications to the "good-energy" trees: how does it compare?

DC2 at GSFC - 28 Jun 05 - T. Burnett 2
Quick Review of Decision Trees
- Introduced to GLAST by Bill Atwood, using Insightful Miner.
- Each branch node is a predicate, or cut on a variable, like CalCsIRLn > … If true, this defines the right branch; otherwise the left branch.
- If there is no branch, the node is a leaf; a leaf contains the purity of the sample that reaches that point.
- Thus the tree defines a function of the event variables used, returning a value for the purity from the training sample.
[Diagram: a branch node applies its predicate, true → right branch, false → left branch; leaves carry the purity]
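A minimal sketch (not the classifier package's actual code; all names are illustrative) of how such a tree can be represented and evaluated in C++: internal nodes hold a variable index and cut value, and leaves hold the purity of the training events that reached them.

```cpp
#include <memory>
#include <vector>

// Illustrative node layout: an internal node applies the predicate
// event[var] > cut and descends right if true, left otherwise;
// a leaf stores the purity of the training sample reaching it.
struct Node {
    int    var    = -1;   // index of the variable tested (negative => leaf)
    double cut    = 0.0;  // cut value, e.g. CalCsIRLn > cut
    double purity = 0.0;  // meaningful only at leaves
    std::unique_ptr<Node> left, right;
    bool isLeaf() const { return var < 0; }
};

// Walk down to a leaf and return its purity: the tree acts as a
// function of the event variables.
double evaluate(const Node& node, const std::vector<double>& event) {
    if (node.isLeaf()) return node.purity;
    return event[node.var] > node.cut ? evaluate(*node.right, event)
                                      : evaluate(*node.left,  event);
}
```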

DC2 at GSFC - 28 Jun 05 - T. Burnett 3
Training and testing procedure
- Analyze a training sample containing a mixture of "good" and "bad" events: I use the even events in order to have an independent set for testing.
- Choose a set of variables and, for each node, find the optimal cut such that the left and right subsets are purer than the original. Two standard criteria for this: "Gini" and entropy. I currently use the former.
  W_S: sum of signal weights; W_B: sum of background weights
  Gini = 2 W_S W_B / (W_S + W_B)
  Thus we want Gini to be small. Actually maximize the improvement: Gini(parent) - Gini(left child) - Gini(right child).
- Apply this recursively until too few events remain (100 for now).
- Finally test with the odd events: measure the purity for each node.
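Written out as code, the criterion above might look like the following sketch (struct and function names are ours, not the package's): the Gini index of a subset is 2 W_S W_B / (W_S + W_B), and a candidate cut is scored by Gini(parent) - Gini(left) - Gini(right).

```cpp
#include <vector>

struct WeightedEvent { double value; double weight; bool signal; };

// Gini index of a subset from its summed signal and background weights.
double gini(double wS, double wB) {
    double w = wS + wB;
    return w > 0 ? 2.0 * wS * wB / w : 0.0;
}

// Improvement of a candidate cut (events with value > cut go right):
// Gini(parent) - Gini(left) - Gini(right); the best cut maximizes this.
double improvement(const std::vector<WeightedEvent>& events, double cut) {
    double wSL = 0, wBL = 0, wSR = 0, wBR = 0;
    for (const auto& e : events) {
        bool right = e.value > cut;
        if (e.signal) (right ? wSR : wSL) += e.weight;
        else          (right ? wBR : wBL) += e.weight;
    }
    return gini(wSL + wSR, wBL + wBR) - gini(wSL, wBL) - gini(wSR, wBR);
}
```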

DC2 at GSFC - 28 Jun 05 - T. Burnett 4
Evaluate by comparing with IM results
From Bill's Rome '03 talk: the "good cal" analysis.
[Plots: CAL-Low and CAL-High CT probability distributions for All, Good, and Bad events]

DC2 at GSFC - 28 Jun 05 - T. Burnett 5
Compare with Bill, cont.
Since Rome: three energy ranges, three trees. CalEnergySum: 5-350; 350-3500; >3500.
- Resulting decision trees implemented in Gleam by a "classification" package: results saved to the IMgoodCal tuple variable.
- Development until now for training and comparison: the all_gamma from v4r2
  - 760 K events
  - E from 16 MeV to 160 GeV (uniform in log(E)), and 0 < θ < 90 (uniform in cos(θ))
  - Contains IMgoodCal for comparison
Now:

DC2 at GSFC - 28 Jun 05 - T. Burnett 6
Bill-type plots for all_gamma

DC2 at GSFC - 28 Jun 05 - T. Burnett 7
Another way of looking at performance
Define efficiency as the fraction of good events remaining after a given cut on node purity; determine the bad fraction for that cut.
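A hedged illustration of that definition (the names here are illustrative): for a given purity cut, the efficiency is the fraction of good test events whose leaf purity exceeds the cut, and the bad fraction is the same quantity computed for bad events.

```cpp
#include <utility>
#include <vector>

// One test event: the leaf purity assigned by the tree and whether the
// event is truly "good" (signal) or "bad" (background).
struct Scored { double purity; bool good; };

// Returns {efficiency, bad fraction} for a given cut on node purity.
std::pair<double, double> efficiencyAndBadFraction(
        const std::vector<Scored>& test, double purityCut) {
    double goodPass = 0, goodAll = 0, badPass = 0, badAll = 0;
    for (const auto& e : test) {
        if (e.good) { ++goodAll; if (e.purity > purityCut) ++goodPass; }
        else        { ++badAll;  if (e.purity > purityCut) ++badPass; }
    }
    return { goodAll > 0 ? goodPass / goodAll : 0.0,
             badAll  > 0 ? badPass  / badAll  : 0.0 };
}
```

Scanning the purity cut from 0 to 1 traces out an efficiency vs. bad-fraction curve, which is the performance view this slide describes.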

DC2 at GSFC - 28 Jun 05 - T. Burnett 8
The classifier package
Properties:
- Implements Gini separation (entropy now implemented as well)
- Reads ROOT files
- Flexible specification of variables to use
- Simple and compact ASCII representation of decision trees
- Recently implemented: multiple trees, boosting
- Not yet: pruning
Advantages vs. IM:
- Advanced techniques involving multiple trees, e.g. boosting, are available
- Anyone can train trees, play with variables, etc.
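For context, a minimal sketch of the kind of ROOT-reading loop such a package needs; the file name, the tuple name "MeritTuple", and the branch types are assumptions made for illustration, not taken from the package.

```cpp
#include <TFile.h>
#include <TTree.h>
#include <iostream>

int main() {
    // File and tuple names are assumed for illustration.
    TFile* file = TFile::Open("all_gamma.root");
    if (!file || file->IsZombie()) { std::cerr << "cannot open file\n"; return 1; }

    TTree* tree = nullptr;
    file->GetObject("MeritTuple", tree);          // assumed tuple name
    if (!tree) { std::cerr << "tuple not found\n"; return 1; }

    // Attach a couple of the training variables (assumed to be doubles).
    double calCsIRLn = 0, calTrackDoca = 0;
    tree->SetBranchAddress("CalCsIRLn",    &calCsIRLn);
    tree->SetBranchAddress("CalTrackDoca", &calTrackDoca);

    for (Long64_t i = 0; i < tree->GetEntries(); ++i) {
        tree->GetEntry(i);
        // ... hand calCsIRLn, calTrackDoca, ... to the tree builder
    }
    return 0;
}
```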

DC2 at GSFC - 28 Jun 05 - T. Burnett 9
Growing trees
For simplicity, run a single tree: initially try all of Bill's classification tree variables (list below). The run over the full 750 K events, trying each of the 70 variables, takes only a few minutes!

"AcdActiveDist", "AcdTotalEnergy", "CalBkHalfRatio", "CalCsIRLn", "CalDeadTotRat", "CalDeltaT", "CalEnergySum", "CalLATEdge", "CalLRmsRatio", "CalLyr0Ratio", "CalLyr7Ratio", "CalMIPDiff", "CalTotRLn", "CalTotSumCorr", "CalTrackDoca", "CalTrackSep", "CalTwrEdge", "CalTwrGap", "CalXtalRatio", "CalXtalsTrunc", "EvtCalETLRatio", "EvtCalETrackDoca", "EvtCalETrackSep", "EvtCalEXtalRatio", "EvtCalEXtalTrunc", "EvtCalEdgeAngle", "EvtEnergySumOpt", "EvtLogESum", "EvtTkr1EChisq", "EvtTkr1EFirstChisq", "EvtTkr1EFrac", "EvtTkr1EQual", "EvtTkr1PSFMdRat", "EvtTkr2EChisq", "EvtTkr2EFirstChisq", "EvtTkrComptonRatio", "EvtTkrEComptonRatio", "EvtTkrEdgeAngle", "EvtVtxEAngle", "EvtVtxEDoca", "EvtVtxEEAngle", "EvtVtxEHeadSep", "Tkr1Chisq", "Tkr1DieEdge", "Tkr1FirstChisq", "Tkr1FirstLayer", "Tkr1Hits", "Tkr1KalEne", "Tkr1PrjTwrEdge", "Tkr1Qual", "Tkr1TwrEdge", "Tkr1ZDir", "Tkr2Chisq", "Tkr2KalEne", "TkrBlankHits", "TkrHDCount", "TkrNumTracks", "TkrRadLength", "TkrSumKalEne", "TkrThickHits", "TkrThinHits", "TkrTotalHits", "TkrTrackLength", "TkrTwrEdge", "VtxAngle", "VtxDOCA", "VtxHeadSep", "VtxS1", "VtxTotalWgt", "VtxZDir"
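Putting the pieces together, a sketch (again with illustrative names, not the package's code) of the exhaustive scan described above: for each variable, try every distinct observed value as a cut and keep the one with the largest Gini improvement.

```cpp
#include <algorithm>
#include <vector>

// One training event: values of all candidate variables plus label and weight.
struct Event { std::vector<double> vars; double weight; bool signal; };

struct BestCut { int var = -1; double cut = 0; double improvement = 0; };

double giniIndex(double wS, double wB) {
    double w = wS + wB;
    return w > 0 ? 2.0 * wS * wB / w : 0.0;
}

// Scan every variable and every distinct observed value as a candidate cut
// (events with value > cut go right) and keep the cut with the largest
// Gini(parent) - Gini(left) - Gini(right).
BestCut findBestCut(const std::vector<Event>& events, int nVars) {
    double wS = 0, wB = 0;
    for (const auto& e : events) (e.signal ? wS : wB) += e.weight;
    const double parent = giniIndex(wS, wB);

    BestCut best;
    std::vector<const Event*> sorted(events.size());
    for (size_t i = 0; i < events.size(); ++i) sorted[i] = &events[i];

    for (int v = 0; v < nVars; ++v) {
        std::sort(sorted.begin(), sorted.end(),
                  [v](const Event* a, const Event* b) { return a->vars[v] < b->vars[v]; });
        double wSL = 0, wBL = 0;   // signal/background weight left of the cut
        for (size_t i = 0; i + 1 < sorted.size(); ++i) {
            (sorted[i]->signal ? wSL : wBL) += sorted[i]->weight;
            if (sorted[i]->vars[v] == sorted[i + 1]->vars[v]) continue;  // not a real boundary
            double imp = parent - giniIndex(wSL, wBL) - giniIndex(wS - wSL, wB - wBL);
            if (imp > best.improvement) {
                best.var = v; best.cut = sorted[i]->vars[v]; best.improvement = imp;
            }
        }
    }
    return best;
}
```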

DC2 at GSFC - 28 Jun 05 - T. Burnett 10
Performance of the truncated list
Note: this is cheating: evaluation using the same events as for training.
1691 leaf nodes, 3381 total.

Name                  improvement
CalCsIRLn             21000
CalTrackDoca           3000
AcdTotalEnergy         1800
CalTotSumCorr           800
EvtCalEXtalTrunc        600
EvtTkr1EFrac            560
CalLyr7Ratio            470
CalTotRLn               400
CalEnergySum            370
EvtLogESum              370
EvtEnergySumOpt         310
EvtTkrEComptonRatio     290
EvtCalETrackDoca        220
EvtVtxEEAngle           140

DC2 at GSFC - 28 Jun 05 - T. Burnett 11
Separate tree comparison
[Plots: IM vs. classifier]

DC2 at GSFC - 28 Jun 05 - T. Burnett 12
Current status: boosting works!
Example using D0 data.
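A hedged sketch of how boosted trees are typically combined (an AdaBoost-style weighted vote; the exact weighting used in the classifier package is not specified here):

```cpp
#include <functional>
#include <vector>

// One boosted component: a trained tree, abstracted here as a scoring
// function returning a value in [-1, +1] (e.g. 2*purity - 1), plus its
// weight alpha; in the usual AdaBoost scheme alpha = 0.5*ln((1-err)/err).
struct BoostedTree {
    std::function<double(const std::vector<double>&)> score;
    double alpha;
};

// Final classifier output: weighted sum of the individual tree scores;
// larger values mean more signal-like.
double boostedScore(const std::vector<BoostedTree>& forest,
                    const std::vector<double>& event) {
    double sum = 0;
    for (const auto& t : forest) sum += t.alpha * t.score(event);
    return sum;
}
```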

DC2 at GSFC - 28 Jun 05 - T. Burnett 13
Current status, next steps
- The current classifier "goodcal" single-tree algorithm applied to all energies is slightly better than the three individual IM trees.
- Boosting will certainly improve the result - Done.
- One-track vs. vertex: which estimate is better? - In progress in Seattle (as we speak).
- PSF tail suppression: 4 trees to start - In progress in Padova (see F. Longo's summary).
- Good-gamma prediction, or background rejection.
- Switch to new all_gamma run, rename variables.