2-D PT module concept for the sLHC CMS tracker

Geoff Hall
Special thanks to M. Pesaresi, M. Raymond, A. Rose, A. Marchioro, the CMS Track-trigger task force and the CMS Tracker upgrade simulation team

Outline
Basic idea of, and motive for, PT module
  already presented by M. Pesaresi
Describe one possible way of building such a module
  not fully worked out
  ideas are still evolving and we do not know the best route
The requirements are not fully defined
  originally the long-term upgrade aimed to operate at 10^35 cm^-2 s^-1
  it is obvious that this is a long way in the future, and we should expect that the requirements might evolve with the LHC physics discoveries and the operation of the LHC machine, which still faces many challenges
  nevertheless, the LHC physics programme is of utmost importance, as is the longevity of the accelerator
  hence R&D on new module concepts will prove invaluable
Crucial to advance the means to construct advanced module types
  must understand better the technical drivers to assemble these modules, since significant investment is required
Geoff Hall WIT 2010

Basic module requirements
Compare binary pattern of hit pixels on upper and lower sensors
High pT tracks can be identified if hits lie within a search window in R-φ (rows) in the second layer
Sensor separation and search window determine the pT cut
[Figure: upper and lower sensors stacked in R, with "Pass" and "Fail" hit pairs; labels: 1-2 mm, ~200 μm, ~100 μm, ~2.5 mm, φ (rows), z (columns)]
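The pass/fail comparison on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `find_stubs`, the `window` and `offset` parameters, and the use of integer row indices are hypothetical and do not come from the slides.

```python
def find_stubs(lower_rows, upper_rows, window=1, offset=0):
    """Pair hits on the lower and upper sensors that fall inside a search window.

    lower_rows, upper_rows: iterables of hit row indices (R-phi direction).
    window: half-width of the search window in rows (assumed value).
    offset: expected fixed row shift between the layers (assumed parameter).
    """
    stubs = []
    for lo in sorted(lower_rows):
        for up in sorted(upper_rows):
            # "Pass" if the upper hit lies within the window around the
            # position expected from the lower hit, otherwise "Fail".
            if abs(up - (lo + offset)) <= window:
                stubs.append((lo, up))
    return stubs


if __name__ == "__main__":
    # The hit at row 10 has a partner at row 11; the hit at row 50 has none.
    print(find_stubs([10, 50], [11, 80]))
```

A wider `window`, or a smaller sensor separation, accepts lower-pT tracks, which is the sense in which the two parameters set the pT cut.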

Schematic of PT module
Transfer hits to both edges, with minimal power, for comparison logic
  also store hits on pixel for L1 readout
Module 25.6mm x 80mm
  256 x 32 sub-units = 8192 channels
  Occupancy ~0.5% => 40 hit pixels
  PT reduction ≈ 20 => 2 hit channels/BX
[Figure labels: 2 x 2.5mm, 12.8mm, 104 x 2, Data out, 128 x 100µm x 2 x 2.5mm, Correlator data]
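The channel count and hit rates quoted for the module follow from simple arithmetic, reproduced here as a check. The function name and dictionary keys are illustrative only; the input numbers are those on the slide.

```python
def module_numbers(rows=256, columns=32, pitch_mm=0.1, length_mm=2.5,
                   occupancy=0.005, pt_reduction=20):
    """Recompute the module size, channel count and hit rates per bunch crossing."""
    channels = rows * columns
    hits_per_bx = channels * occupancy
    return {
        "width_mm": rows * pitch_mm,        # R-phi extent of the module
        "length_mm": columns * length_mm,   # z extent of the module
        "channels": channels,
        "hits_per_bx": hits_per_bx,
        "hits_above_pt_cut_per_bx": hits_per_bx / pt_reduction,
    }


if __name__ == "__main__":
    for name, value in module_numbers().items():
        print(f"{name}: {value:g}")
```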

What defines module size?
Assumed radial location ~25-45cm
  allows coverage of the full CMS η range with only a barrel assembly
Pixel size ~100µm x 2.5mm
  ~100µm: matches required resolution and likely assembly precision for double layers
  ~2.5mm: defined by approximate projection of luminous region at R ~25cm and likely radial spacing ~2mm (next slide)
  coarse enough for low cost bump bonding (or perhaps even wire bonding for prototyping)
256 x 32: binary multiple of pixels with practical dimensions
  ~25.6mm x 80mm => 4 sensors in 150mm wafer, i.e. one contribution to yield
  Chip height ~15mm: fits reticle
Expected occupancy at R = 25cm, 10^35
  so average <1 hit in column, which may allow transfer to edge
Of course, this is history and explains the initial parameters chosen for simulations
  module can probably be enlarged, e.g. 384 x 32
  but the concept has avoided chip-to-chip connections, i.e. >2 ASICs in height

PT layer pixel size
R-φ: compare to likely assembly precision ~100µm
Should reduce need to compare many nearby columns
Δ independent of η, but offset in z between layers increases with η
Δ = dL/R
  R = 25cm, L = 28cm, d = 2mm => Δ ≈ 2mm
LHC luminous region L ≈ 28cm (±3σ), may be larger or smaller at SLHC
[Figure: luminous region of length L, layers at radius R separated by d, projected spread Δ, angles θ and θ']
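The relation Δ = dL/R can be evaluated directly for the numbers quoted on this slide. The function name below is illustrative; the second call, at R = 35 cm, is an extra example that uses the other radius considered in the backup slides.

```python
def z_spread_mm(d_mm, lum_region_mm, radius_mm):
    """Spread in z between the two layers: Delta = d * L / R (all lengths in mm)."""
    return d_mm * lum_region_mm / radius_mm


if __name__ == "__main__":
    # d = 2 mm, L = 28 cm, R = 25 cm, as on the slide
    print(f"R = 25 cm: {z_spread_mm(2.0, 280.0, 250.0):.2f} mm")
    # Same d and L at R = 35 cm (illustrative)
    print(f"R = 35 cm: {z_spread_mm(2.0, 280.0, 350.0):.2f} mm")
```

The result at R = 25 cm is about 2.2 mm, consistent with the Δ ≈ 2 mm quoted and with the ~2.5 mm pixel length.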

Why edge-readout?
Original motivations were:
  it was (is) far from clear how to solve the layer-to-layer interconnection problem
    edge readout seemed to offer a means to factorise this, e.g. by constructing a small, dedicated component
    real connectors close to requirements do exist but would probably not be usable for mechanical reasons for ~8000 connections
  profit from low occupancy and high density of lines on ASIC
    no penalty in making comparisons at edge of module
  constructing two module layers independently, then making final assembly, is attractive conceptually
This logic exposes some prejudices (which could be wrong)
  module should evolve from known technologies
  design should profit from known constraints (i.e. occupancy)
  design should be adaptable to changes in requirements
    a unique double layer assembly, which is best commercially, may not be possible
    the logic to be included is not yet well defined and may also evolve

Comparison logic
Modules are flat, not arcs
Compensate for Lorentz drift
Orientation of module => position dependent logic
z offset is η dependent
Search window to allow for luminous region and quantization => 3 pixels (if not tiny)
Family of modules with offsets in z
[Figure: R-φ view and R-z view; labels: d, p = ∞, IP, ~200µm, ~12mm, η = 2.5, luminous region (large!)]

Possible PT layer readout
column of 128 (or 192) pixels
shift register ruled out: 128/25ns = 5.12Gb/s
  probably even if allow several BX and latency penalty
worst case occupancy may mean >1 hit/BX
  with adequate fluctuations to be permitted
  NB from simulations, jets don't have much impact
equip column with N x 8 address/data lines
  N = maximum number of clusters allowed
  ignore combinations consistent with wide clusters
seems easy to read out 3 or more clusters
  (also suspect occupancy is pessimistic)
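The rate that rules out a shift register, and the line count of the alternative, can be checked with a short calculation. The function names are illustrative; `n_bx` is a hypothetical parameter for spreading the transfer over several bunch crossings.

```python
BX_NS = 25.0  # bunch-crossing period in ns


def shift_register_rate_gbps(n_pixels, n_bx=1):
    """Serial rate needed to shift one bit per pixel out of a column in n_bx crossings."""
    return n_pixels / (BX_NS * n_bx)  # bits per ns equals Gb/s


def address_lines(max_clusters, bits_per_cluster=8):
    """Parallel lines needed if each cluster is sent as one N x 8 address/data word."""
    return max_clusters * bits_per_cluster


if __name__ == "__main__":
    print(shift_register_rate_gbps(128))   # 128-pixel column, one crossing
    print(shift_register_rate_gbps(192))   # 192-pixel column, one crossing
    print(address_lines(3))                # lines for up to 3 clusters
```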

One possible design for transferring up to 3 cluster addresses to end-of-column logic
M Raymond
Go through logic functionality stage by stage, then show some examples
[Figure: column logic diagram for channels n to n+4; each channel has a CWD block, hit signal (Dn ... Dn+4) and Ck40 clock, with multiplexer lines An+Dn-1 ... An+4+Dn+3 leading to the end-of-column logic]

Front end produces a 25 ns pulse if signal exceeds comparator threshold
(one possible design for transferring up to 3 cluster addresses to end-of-column logic)
[Figure: column logic diagram, channels n to n+4]

Cluster width discrimination logic passes hit if cluster width ≤ 2
(one possible design for transferring clusters to end-of-column logic)
[Figure: column logic diagram, channels n to n+4]

Further logic ensures that only one channel signals a hit even if it is part of a two-channel cluster
(one possible design for transferring clusters to end-of-column logic)
[Figure: column logic diagram, channels n to n+4]

Priority logic to determine which mux is active for a particular cluster
(one possible design for transferring clusters to end-of-column logic)
[Figure: column logic diagram, channels n to n+4]

Up to 3 cluster addresses can be transmitted to end-of-column logic in this example
(one possible design for transferring clusters to end-of-column logic)
[Figure: column logic diagram, channels n to n+4]
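The stages described on the preceding slides (cluster width discrimination, one address per cluster, at most three clusters per column) can be modelled behaviourally as follows. This is a sketch, not the circuit: the function name is hypothetical, and it assumes that channel 0 is nearest the end-of-column logic and that priority goes to the lowest channel numbers, so clusters further up the column are the ones dropped.

```python
def column_readout(hits, max_clusters=3):
    """Return up to max_clusters (address, d_below) words for one column.

    hits: sequence of 0/1 comparator outputs, one per channel.
    Each word holds the address of the upper channel of a cluster and the
    hit bit of the channel below it (1 for a two-channel cluster).
    """
    words = []
    n = len(hits)
    i = 0
    while i < n:
        if not hits[i]:
            i += 1
            continue
        j = i
        while j + 1 < n and hits[j + 1]:
            j += 1                      # extend over adjacent hit channels
        width = j - i + 1
        if width <= 2:                  # wider clusters are rejected
            words.append((j, 1 if width == 2 else 0))
        i = j + 1
    return words[:max_clusters]         # assumed priority: lowest channels first


if __name__ == "__main__":
    print(column_readout([0, 1, 0, 1, 0]))        # two isolated hits
    print(column_readout([0, 0, 1, 1, 0]))        # one two-channel cluster
    print(column_readout([0, 1, 1, 1, 0]))        # three adjacent: rejected
    print(column_readout([1, 0, 1, 0, 1, 0, 1]))  # fourth cluster is lost
```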

2 hits on non-adjacent channels
2 multiplexer columns are selected, carrying An+1+Dn and An+3+Dn+2
other examples in backup slides
[Figure: column logic diagram, channels n to n+4]

Track stub generation by matching layers
[Figure: upper layer and lower layer, each a column of 128/192 channels feeding an n-bit bus; "assembler"; multi-via interconnect chip; store + (memory buffer for full readout)]
"Assembler"
  n bits from column above
  transmits column to each neighbour
  receives n inputs from each neighbour and stores them
  receives n inputs from module below
  compares pattern from module below with three (n-bit) stored patterns
Need to proceed to VHDL simulation
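A behavioural sketch of the assembler comparison is given below, pending the VHDL simulation mentioned on the slide. It assumes that each stored pattern is reduced to a list of hit row addresses and that a match means the addresses differ by at most `window` rows; the function name, argument names and matching rule are illustrative assumptions.

```python
def assembler_match(own, left, right, other_layer, window=1):
    """Compare hits from the other layer with the three stored column patterns.

    own, left, right: row addresses stored for this column and its two
                      neighbouring columns.
    other_layer: row addresses received from the matching column of the
                 other layer.
    Returns a list of (source_column, stored_address, other_address).
    """
    stored = (("left", left), ("own", own), ("right", right))
    matches = []
    for name, addresses in stored:
        for addr in addresses:
            for addr_other in other_layer:
                if abs(addr - addr_other) <= window:
                    matches.append((name, addr, addr_other))
    return matches


if __name__ == "__main__":
    # A hit in this column and one in the right-hand neighbour both find partners.
    print(assembler_match(own=[40], left=[], right=[90], other_layer=[89, 41]))
```

Including the neighbouring columns is what allows a stub to be found when the track crosses a column boundary in z between the two layers.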

Layout
[Figure: ROC-1: 128 chan, ROC-2: 128 chan, ROC-3: 128 chan, Assembler]
Make ROC + Assembler as single ASIC
  identical for both layers
  may switch off elements
Practical to produce a chip to serve several columns
  which eases data transfer and line density
Looks easy to match 4 x 2.5mm columns, e.g. 2mm pitch
Looks possible to cover 8 columns with pitch ≈ 2mm
  Width: 16mm to read 8 x 2.5mm
Chip size ≈ 18mm x 16mm (128), ~21mm x 16mm (192)

Module schematic
[Figure: φ-z view of module with sensor, assembler, ROC, data out, control in and concentrator (could be part of GBT system); labels 80mm and 26mm. R-φ section with TPG or C foam and PCB; label ~2mm]

Possible assembly sequence
Sensor
ROCs bump bonded to sensor
Invert sensor-ROC object
  exposed ROC areas then face up
Place assembly on hybrid
  hybrid has pre-mounted ancillary chips
Wire bond ROCs to hybrid
Prepare partner module
  assemble and connect together

Weak points in this approach
It is worth emphasising that this concept was originally intended to kick-start a design effort.
Nevertheless, so far insufficient attention has been given (by us) to several important issues
cooling
  seems feasible to include thermally conductive layers
  but not necessarily more difficult, as no interior interlayer connections
location of links: most likely to be on, or close to, the module
  originally it seemed that links could be deployed on a remote bulkhead
  this now seems implausible: high speed electrical links are undesirable for noise and material reasons, and there are no obvious close locations
  optimal matching of bandwidth and power (if adjustable) not yet obvious
provision of power to module
  we assume DC-DC conversion and local converters
overall mechanical design
  involves constraints above

Data volumes and link requirements
Assume 24 bits/hit to transfer in each 25ns BX
  includes time stamp and error coding
  send trigger data from one layer of stack
  but for 40M channels in stacked layer

L = 10^35 cm^-2 s^-1
Channels/chip                        128
Occupancy                            0.005
PT data reduction                    0.050
Channels above PT cut/BX/layer       5,000
bits/channel                         24
No. of 5Gbps links (3.2Gbps data)    1,500
Power/link [W]                       2.0
Link power [kW]                      3.0
Power/chan [µW] with 50% BW usage    150

Comments
  150µW is a conservative estimate for trigger data only
    assumes 50% use of bandwidth and 2W per 5Gbps GBT
    GBT power may improve
    ideally should optimise power & speed
  but additional links are required for full readout
    with 6.4µs storage on each FE pixel
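The link count and power figures in this table can be reproduced with the following sketch. Two readings of the slide are assumed here and are not stated explicitly in it: that "one layer" means half of the 40M channels of the stack, and that the 1,500 links correspond to fully used 3.2 Gbps payloads, with the 50% bandwidth usage applied only to the power per channel. The function name and dictionary keys are illustrative.

```python
def link_budget(channels_in_stack=40e6, occupancy=0.005, pt_reduction=0.05,
                bits_per_hit=24, bx_ns=25.0, payload_gbps=3.2,
                power_per_link_w=2.0, bandwidth_usage=0.5):
    """Recompute the trigger-data link count and power from the slide's inputs."""
    channels_one_layer = channels_in_stack / 2           # assumption: trigger data from one layer only
    hits = channels_one_layer * occupancy * pt_reduction  # channels above pT cut per BX
    rate_gbps = hits * bits_per_hit / bx_ns               # bits per ns equals Gb/s
    links = rate_gbps / payload_gbps                      # links at 100% payload use
    link_power_w = links * power_per_link_w
    # At 50% bandwidth usage twice as many links are needed; share over all channels.
    power_per_chan_uw = link_power_w / bandwidth_usage / channels_in_stack * 1e6
    return {
        "hits_per_bx": hits,
        "links": links,
        "link_power_kw": link_power_w / 1e3,
        "power_per_chan_uw": power_per_chan_uw,
    }


if __name__ == "__main__":
    for name, value in link_budget().items():
        print(f"{name}: {value:.1f}")
```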

Power estimate for PT module
                    P [µW] per pixel   Functions
Front end           25                 amplifier, discriminator, local logic; cf ATLAS 130nm pixel
Control, PLL        10                 1 5mW, x 2
Digital logic       2                  transfer to edge (M Raymond)
Comparison          4                  logic (0.5mW/column)
Data transfer       0.25               few cm across module; data to local GBT; transmit 1pJ/bit? ≈ 2mW/module
Concentrator        5                  buffer to and from GBT: 2 x 20mW
Full readout        20                 following L1 trigger, extrapolate from CMS pixel
Sub-total           ~67
Total with DC-DC    ~90µW              75% efficiency for DC-DC conversion

NB big uncertainties and significant guesswork
  e.g. SEU-robustness, full control and timing, data volumes, …. all required
  essential to improve on this with real design work
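Summing the per-pixel contributions and applying the DC-DC efficiency gives values close to the rounded sub-total and total on the slide. The dictionary below simply restates the slide's numbers; the variable and function names are illustrative.

```python
BUDGET_UW = {
    "Front end": 25,
    "Control, PLL": 10,
    "Digital logic": 2,
    "Comparison": 4,
    "Data transfer": 0.25,
    "Concentrator": 5,
    "Full readout": 20,
}
DCDC_EFFICIENCY = 0.75


def power_totals(budget=BUDGET_UW, efficiency=DCDC_EFFICIENCY):
    """Return (sub-total, total after DC-DC losses) in µW per pixel."""
    subtotal = sum(budget.values())
    return subtotal, subtotal / efficiency


if __name__ == "__main__":
    subtotal, total = power_totals()
    print(f"sub-total: {subtotal:.2f} uW per pixel")
    print(f"with DC-DC: {total:.1f} uW per pixel")
```

The sum comes out slightly below the ~67 µW quoted and the total slightly below ~90 µW, which is well within the uncertainties noted on the slide.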

Other considerations
"Simplicity" is desirable
  designers and users may have different perspectives on issues such as
    grounding and shielding
    control software development
    initialisation and data unpacking software
    off-detector firmware and digital processing
    (trigger system will probably continue to be debugged with real triggers only in-situ)
Up to now, there was modest overlap between real end-users and ASIC designers
  this will probably be more crucial for these modules
  substantial evaluation programme should be foreseen

A parable of evolution
To make soap powder, liquid is blown through a nozzle. As it streams out, the pressure drops and a cloud of particles forms… thirty years ago, the spray came through a simple pipe that narrowed from one end to the other… it had problems with irregularities in size of grains, liquid or blockages… the nozzle has become an intricate duct, longer than before, with many constrictions and chambers. The liquid follows a complex path before it sprays. Each type of powder has its own nozzle design which does the job with great efficiency. The problem was too hard to allow even the finest engineers to explore with mathematics and design… they tried another approach… evolution: preservation of favourable variations and rejection of those injurious. Take a nozzle that works quite well and make copies, each changed at random. Test them for how well they make powder. Then impose a struggle for existence by insisting that not all can survive. Many altered devices are no better (or worse) than the parental form. They are discarded, but the few able to do a superior job are allowed to reproduce and are copied – but again not perfectly. As generations pass, there emerges a new and efficient pipe of complex and unexpected shape.
From Steve Jones, "Almost Like a Whale".

Conclusions
The requirements for trigger modules are not yet clear
  especially the physics case and luminosity scenario
  a lot of factors are involved in constructing the module
Power is crucial for such layers in future trackers
  the trigger data transfer still dominates but maybe can be optimised
Evolution is about finding a solution which matches the environment
  intelligence may not be what we think it should be
  we need to try out alternatives and really evaluate them
It may be necessary to think hard about using expensive technologies in a way which allows us to try variants before committing to a single solution

Backups

Approximate parameters of trigger layers
For stacked layer (doublet)
  Pixel size              100µm x 2.5mm
  ROC                     8 x 128 channels
  <Power>/pixel           250µW (*)
  |η_MAX|                 2.5
  Bandwidth efficiency    50%

R [cm]   L [m]   A [m2]   Nface   Nchan   NROC   Nmodule   Nlinks   P [kW]
25       3.0     9.6      64      38.5M   38k    4700      2880
35       4.2     18.7     88      75M     73k    9200      5610

(*) With overlaps in R-φ or η expect additional 10-15%
Present tracker ~35kW

Making a trigger
Stubs provide track trigger primitives
Not yet proven how these contribute to trigger and rate reduction achievable
  many simulation studies under way
  expect to match a series of stubs to a calorimeter or muon object using off-detector processors
Questions to answer include
  how many layers are needed?
  what is the optimal location, allowing sufficient η coverage?
  what is the impact of material, in trigger layers and elsewhere?
  how important is z-measurement, and resolution?
  what is the impact on tracking performance?
  cost, power and material budget?
  L0 trigger to guide?

some digital power estimates (1)
M Raymond
8 bits data transmission through mux
  hit occupancy per strixel column = 0.5% x 128 = 64%
  93% of time only one cluster per 128 strixel column (simulation) => close to 100%
  each mux line can change state 50% of time (either '1' or '0', so 50% of time will change from '0' to '1')
  so translating to an "average" toggling speed
    20 MHz (line can only change state every 25 ns) x 0.64 x 0.5 = 6.4 MHz
average power consumed in the mux gates
  2 (gates / mux line) x 8 (lines) x 6.4 (MHz) x 9 (nW / MHz / gate) = 1 uW
  (note: pessimistic since not all mux gates will be active, depends on location of hit)
  = ~ 50 nW per strixel channel
  negligible
transmission power associated with transmitting CMOS levels across chip
  (note this power is also consumed in the gate driving the line, but is considered separately here)
  (use 2 uW/MHz/cm)
  2 x 6.4 (MHz) x 1.28 (cm) x 8 (mux lines) / 128 strixels = ~ 1 uW per pixel
[Figure: multiplexer symbol with inputs A, B and select, output OUT]
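The chain of estimates above can be recomputed step by step. The sketch covers only the products written out explicitly on the slide (column occupancy, average toggle rate, mux gate power and line transmission power); the function name and dictionary keys are illustrative.

```python
def mux_power_estimates(occupancy=0.005, channels=128, max_toggle_mhz=20.0,
                        toggle_probability=0.5, gates_per_line=2, lines=8,
                        gate_nw_per_mhz=9.0, line_uw_per_mhz_cm=2.0,
                        chip_height_cm=1.28):
    """Recompute the mux toggle rate and the two power terms from the slide."""
    column_occupancy = occupancy * channels                       # hits per column per BX
    toggle_mhz = max_toggle_mhz * column_occupancy * toggle_probability
    gate_power_nw = gates_per_line * lines * toggle_mhz * gate_nw_per_mhz
    transmission_uw_per_pixel = (line_uw_per_mhz_cm * toggle_mhz
                                 * chip_height_cm * lines / channels)
    return {
        "column_occupancy": column_occupancy,
        "toggle_mhz": toggle_mhz,
        "gate_power_nw": gate_power_nw,
        "transmission_uw_per_pixel": transmission_uw_per_pixel,
    }


if __name__ == "__main__":
    for name, value in mux_power_estimates().items():
        print(f"{name}: {value:.3f}")
```

The gate power comes out a little under 1 µW and the transmission power a little over 1 µW per pixel, matching the rounded values on the slide.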

some digital power estimates (2)
40 MHz clock distribution: 100 uW (2 uW/MHz/cm) (128 channel chip, 1.28 cm high)
  = ~ 1 uW / channel
  (assumes 1 V transitions, 2 pF / cm)
channel logic (including mux select): ~ 30 gates toggling at channel occupancy frequency (0.5% x 40 MHz)
  = 48 nW / channel (9 nW / MHz / gate)
  negligible
=> digital total associated with correlation data to edge of chip only ~ few uW / channel

some circuit area estimates
…. for some of the logic described here
assuming 10 um2 per inverter, 20 um2 per 2 I/P NAND/NOR (in 130nm)*
  CWD logic: ~ 16 x 16 um2
  priority logic to select mux: ~ 16 x 16 um2
  mux logic for 3 per pixel: ~ 50 x 50 um2
so no big area consumption here, c.f. pixel size ~ 100 x 1500 um2
but much other functionality not yet included
* W. Erdmann:

cluster width discrimination
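FUNCTIONALITY REQUIRED
$C_{OUT} = (\overline{C_{-2}} \cdot C_{-1} \cdot C \cdot \overline{C_{+1}}) + (\overline{C_{-1}} \cdot C \cdot \overline{C_{+1}}) + (\overline{C_{-1}} \cdot C \cdot C_{+1} \cdot \overline{C_{+2}})$
  first term: central channel and one below only
  second term: central channel only
  third term: central channel and one above only
In the diagrams the cluster width discrimination logic above is represented by CWD
[Figure: CWD block with output C OUT]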
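The Boolean function can be checked with a short Python model. The complement bars are not legible in the transcript, so the model assumes that the three terms correspond to the three labelled cases (central channel and one below only, central channel only, central channel and one above only); the function name and the treatment of channels beyond the ends of the column as unhit are also assumptions.

```python
def cwd_outputs(hits):
    """Evaluate C_OUT for every channel of a column.

    hits: sequence of 0/1 comparator outputs; higher index means "above".
    Channels outside the column are treated as not hit (assumption).
    """
    n = len(hits)

    def h(i):
        return 0 <= i < n and bool(hits[i])

    outputs = []
    for c in range(n):
        pair_with_below = (not h(c - 2)) and h(c - 1) and h(c) and (not h(c + 1))
        single = (not h(c - 1)) and h(c) and (not h(c + 1))
        pair_with_above = (not h(c - 1)) and h(c) and h(c + 1) and (not h(c + 2))
        outputs.append(pair_with_below or single or pair_with_above)
    return outputs


if __name__ == "__main__":
    print(cwd_outputs([0, 1, 0, 0, 0]))  # single hit passes
    print(cwd_outputs([0, 1, 1, 0, 0]))  # both channels of a pair pass
    print(cwd_outputs([1, 1, 1, 0, 0]))  # three adjacent hits are rejected
```

In this model both channels of a two-channel cluster produce an output, which is why the design needs the further logic that lets only one of them select a multiplexer.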

Example: one hit on one channel
25 ns pulse propagates through logic, selecting multiplexer in first column
address of hit pixel (An+2+Dn+1) passes through multiplexer chain to end of column
[Figure: column logic diagram, channels n to n+4]

One hit shared between 2 channels
simple logic stops the lower channel signal feeding through to the mux select input
so the 7-bit address of only one channel (An+3) is selected
double channel cluster indicated by including hit signal Dn+2 from lower channel (so 8 bits altogether): An+3+Dn+2
[Figure: column logic diagram, channels n to n+4]

One hit shared between 3 channels
CWD logic rejects signals in 3 (or more) adjacent channels
no multiplexer is selected
[Figure: column logic diagram, channels n to n+4]

2 adjacent + 1 isolated
2 multiplexer columns are selected, carrying An+1+Dn and An+3+Dn+2
[Figure: column logic diagram, channels n to n+4]

3 isolated hits
all 3 multiplexers are now active, carrying An+Dn-1, An+2+Dn+1 and An+4+Dn+3
a cluster further up the chip would be lost in this design
(design would have to expand if more than 3 clusters required, but linear expansion only)
[Figure: column logic diagram, channels n to n+4]
