Performance Monitoring Update Daniele Francesco Kruse August 2010.


1 Performance Monitoring Update Daniele Francesco Kruse August 2010

2 Summary
- Refinement of the Nehalem analysis methodology following David Levinthal's recommendations
- Added a CSV export feature (./pfm_analysis results/ --csv) for spreadsheet programs (e.g. MS Excel)
- Symbol-level detail is accessible directly from the modular analysis page (links for each module)
- Problems and future work

3 New analysis methodology for Nehalem
- BASIC STATS: Total Cycles, Instructions Retired, CPI
- IMPROVEMENT OPPORTUNITY: iMargin, iFactor
- BASIC STALL STATS: Stalled Cycles, % of Total Cycles, Total Counted Stalled Cycles
- INSTRUCTION USEFUL INFO: Instruction Starvation, # of Instructions per Call
- FLOATING POINT EXCEPTIONS: % of Total Cycles spent handling FP exceptions
- LOAD OPS STALLS: L2 Hit, L3 Unshared Hit, L2 Other Core Hit, L2 Other Core Hit Modified, L3 Miss -> Local DRAM Hit, L3 Miss -> Remote DRAM Hit, L3 Miss -> Remote Cache Hit
- DTLB MISSES: L1 DTLB Miss Impact, L1 DTLB Miss % of Load Stalls
- DIVISION & SQUARE ROOT STALLS: Cycles spent during DIV & SQRT Ops
- L2 IFETCH MISSES: Total L2 IFETCH misses, IFETCHes served by Local DRAM, IFETCHes served by L3 (Modified), IFETCHes served by L3 (Clean Snoop), IFETCHes served by Remote L2, IFETCHes served by Remote DRAM, IFETCHes served by L3 (No Snoop)
- BRANCHES, CALLS & RETS: Total Branch Instructions Executed, % of Mispredicted Branches, Direct Near Calls, Indirect Near Calls, Indirect Near Non-Calls, All Near Calls, All Non Calls, All Returns, Conditionals
- ITLB MISSES: L1 ITLB Miss Impact, ITLB Miss Rate
- INSTRUCTION STATS: Branches, Loads, Stores, Other, Packed UOPS
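A few of the basic statistics listed above are simple ratios of raw counter values. The following is a minimal sketch of how such ratios could be derived, not the formulas used by pfm_analysis (those were still awaiting validation at the time of the talk). The function name, argument names and input numbers are hypothetical and chosen only for illustration; iMargin and iFactor are left out because the slides do not define them.

```python
def basic_stats(total_cycles, instructions_retired, stalled_cycles,
                branches_executed, branches_mispredicted):
    """Derive ratio-style statistics from raw counter values.

    All arguments are plain event counts; the names are illustrative
    and do not correspond to actual perfmon2/libpfm event names.
    """
    return {
        # Cycles per instruction: total cycles over instructions retired.
        "cpi": total_cycles / instructions_retired,
        # Share of all cycles in which the core was stalled.
        "stalled_pct": 100.0 * stalled_cycles / total_cycles,
        # Share of executed branches that were mispredicted.
        "mispredicted_branch_pct": 100.0 * branches_mispredicted / branches_executed,
    }


# Made-up counter values, not measurements from the CMSSW runs.
stats = basic_stats(
    total_cycles=2_000_000,
    instructions_retired=1_600_000,
    stalled_cycles=500_000,
    branches_executed=200_000,
    branches_mispredicted=4_000,
)
print(stats)
```

With these made-up inputs the CPI is 1.25, a quarter of the cycles are stalled, and 2% of branches are mispredicted.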

4 First draft results for Nehalem
Results of a first analysis of CMSSW 3.8.0 are available at the following addresses:
  http://dkruse.web.cern.ch/dkruse/2010_07_30_CMSSW_3_8_0_slc5_amd64_gcc434_minbias/
  http://dkruse.web.cern.ch/dkruse/2010_07_30_CMSSW_3_8_0_slc5_amd64_gcc434_qcd/
  http://dkruse.web.cern.ch/dkruse/2010_07_30_CMSSW_3_8_0_slc5_amd64_gcc434_ttbar/
The analysis was carried out on a quad-core, single-socket Nehalem system (Core i7) with the following configurations:
  cmsDriver.py recominbias -s RAW2DIGI,RECO -n -1 --filein file:500evt_MinBias_cfi_GEN_SIM_DIGI_L1_DIGI2RAW_HLT_RAW2DIGI_L1Reco.root --eventcontent RECOSIM --conditions auto:mc --no_exec
  cmsDriver.py recottbar -s RAW2DIGI,RECO -n -1 --filein file:100evt_TTbar_cfi_GEN_SIM_DIGI_L1_DIGI2RAW_HLT_RAW2DIGI_L1Reco.root --eventcontent RECOSIM --conditions auto:mc --no_exec
  cmsDriver.py recoqcd -s RAW2DIGI,RECO -n -1 --filein file:100evt_QCD_Pt_3000_3500_cfi_GEN_SIM_DIGI_L1_DIGI2RAW_HLT_RAW2DIGI_L1Reco.root --eventcontent RECOSIM --conditions auto:mc --no_exec
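The three configurations above differ only in the job name and the input file. As a rough sketch, the command lines could be generated from one template. The helper name cmsdriver_args and the SAMPLES dictionary are hypothetical; the flags and file names are copied from the slide. The script only builds and prints the commands and does not run cmsDriver.py.

```python
import shlex

# Job name -> input file, as listed on the slide.
SAMPLES = {
    "recominbias": "500evt_MinBias_cfi_GEN_SIM_DIGI_L1_DIGI2RAW_HLT_RAW2DIGI_L1Reco.root",
    "recottbar": "100evt_TTbar_cfi_GEN_SIM_DIGI_L1_DIGI2RAW_HLT_RAW2DIGI_L1Reco.root",
    "recoqcd": "100evt_QCD_Pt_3000_3500_cfi_GEN_SIM_DIGI_L1_DIGI2RAW_HLT_RAW2DIGI_L1Reco.root",
}


def cmsdriver_args(name, input_file):
    """Return the argument list for one reconstruction configuration."""
    return [
        "cmsDriver.py", name,
        "-s", "RAW2DIGI,RECO",
        "-n", "-1",
        "--filein", f"file:{input_file}",
        "--eventcontent", "RECOSIM",
        "--conditions", "auto:mc",
        "--no_exec",
    ]


# shlex.join (Python 3.8+) renders each list as a shell-safe command line.
commands = [shlex.join(cmsdriver_args(name, path)) for name, path in SAMPLES.items()]
for command in commands:
    print(command)
```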

5 Problems and future work
- Perfmon2 is not yet compatible with Westmere-based processors
- Events with custom Umasks do not always work correctly with libpfm
- Waiting for David Levinthal's final validation of the formulas used in the Nehalem analysis
- Deployment for CMSSW as soon as possible
- Deployment for Gaudi & Geant4 (end of August / beginning of September)

6 Thank you. Questions?

