Progress report on the alignment of the tracking system A. Bonissent D. Fouchez A.Tilquin CPPM Marseille Mechanical constraints from optical measurement.

1 Progress report on the alignment of the tracking system
A. Bonissent, D. Fouchez, A. Tilquin (CPPM Marseille)
- Mechanical constraints from optical measurement
- Inversion of big matrix in a decent execution time
- Pixel barrel as a test bench for the alignment code
March 6, 2003

2 Mechanical constraints
ALEPH case:
- The basic unit was a face composed of six silicon wafers, rigidly glued to each other and to an Omega structure.
- During construction, 3-dimensional coordinate measurements were made at each corner and on each wafer.
- The invariance of the locations of neighbouring corners was taken into account by projecting the linear system of equations into the orthogonal subspace:
  → 40% reduction of the number of degrees of freedom
  → Stabilisation of the poorly constrained end wafer
  → Improvement of the final resolution
We investigated such an implementation for the ATLAS pixels.
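The projection into the constraint-compatible subspace can be written out generically as below. This is a sketch under stated assumptions, not the ALEPH formulation itself: the symbols A and b (normal-equation matrix and right-hand side), C (matrix of m exact linear constraints) and Z (orthonormal basis of the null space of C) are notation introduced here for illustration.

```latex
\begin{align}
  A\,x &= b, & A &\in \mathbb{R}^{n \times n}
    && \text{(alignment system, $n$ parameters)} \\
  C\,x &= 0, & C &\in \mathbb{R}^{m \times n}
    && \text{($m$ rigid constraints, assumed exact)} \\
  C\,Z &= 0, & Z^{T} Z &= I_{n-m}
    && \text{(basis of the allowed subspace)} \\
  \left(Z^{T} A\, Z\right) y &= Z^{T} b, & &
    && \text{(reduced system of size $n-m$)} \\
  x &= Z\,y & &
    && \text{(backward projection)}
\end{align}
```

Because the constraints enter only through Z, any error on the measurements that define C is ignored by construction.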

3 Projection factor and execution speed
With a similar approach:
- Two degrees of freedom per junction between two modules can be eliminated: 2 translations, along and perpendicular to the stave.
- 12*3 = 36 degrees of freedom for a total of 13*6 → 40% reduction
We can expect to gain, since inversion is an N³ process. But do not forget the projection into the subspace and the backward projection to the initial space.
[Plot: full inversion compared with projection, as a function of 1 - reduction factor; regions marked "Worse" and "Better"]
- We start to gain with a reduction of at least 60% for 60 modules, and more than 90% for the pixel barrel.
→ Projection is not justified by execution time.
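The arithmetic behind the expected gain can be sketched as follows. The counts are those quoted on the slide (36/78 is about 0.46, of the same order as the 40% quoted); the pure N³ scaling with zero projection overhead is an assumption made here for illustration, and it is exactly the overhead it ignores that removes the gain in practice.

```python
# Degree-of-freedom counts quoted on the slide for one stave.
total_dof = 13 * 6      # 13 modules, 6 alignment parameters each
removed_dof = 12 * 3    # degrees of freedom eliminated, as quoted

reduction = removed_dof / total_dof
remaining = 1.0 - reduction

# Assumption: inversion cost scales purely as N**3, and the cost of the
# forward and backward projections is ignored.
naive_cost_ratio = remaining ** 3

print(f"reduction factor: {reduction:.2f}")
print(f"naive inversion cost relative to full system: {naive_cost_ratio:.2f}")
```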

4 Projection and precision
The projection technique is equivalent to completely neglecting the errors on the optical measurements.
→ Measurements have to be much better than the final resolution.
Optical measurements:
- Done in the lab with an accuracy of ~1 micron
- But at room temperature and before transportation to CERN!
- Temperature effects are difficult to evaluate.
- The mechanical junction between modules is not rigid enough to guarantee movements smaller than a few microns.
- No mechanical measurement after installation and during operation
→ The full alignment would have to be done at the beginning:
  - To get the correct position
  - To verify the stability of the mechanical constraint
→ An unstable mechanical constraint may introduce artificial distortion.
Conclusions: Mechanical constraints not justified (at the beginning)

5 Matrix inversion
Our last conclusions, using the Millepede package and a 1.4 GHz PC, were (for only one iteration):

Pixel barrel    5 hours
Whole pixels    8 hours
SCT alone       2.5 days
Whole ID        10 days

In 2007 we may expect a reduction factor between 5 and 10 in execution time due to an increase of PC power. But this is still too long.
Number of iterations: at least 3 (~10 is more realistic)
- Non-linearity: 2 iterations
- Minimum check: χ²_i - χ²_{i+1}
→ Tests before 2007 will be difficult with this execution time.
New ideas and new technologies should be investigated.

6 New investigations
The general problem can be split into 2 independent parts:
- Solve the linear system, without inversion, and iterate
  - Faster (~2), more accurate and robust
- Invert the big matrix at the end to get the errors
  - Not really necessary if, as in ALEPH, the errors due to alignment are small compared to the intrinsic ones.
But a factor of 2 is still not enough.
The new technology is parallelism. For that purpose we have investigated two freeware packages:
- HPL: High Performance Linpack (solver)
- ScaLAPACK: Scalable LAPACK (inversion)
Both emulate a massively parallel computer on a PC cluster.

7 HPL: High Performance Linpack
HPL solves a dense linear system in double precision (64 bits) on distributed-memory computers (PCs).
It uses:
- Two-dimensional block-cyclic data distribution: each PC receives only part of the big matrix
- LU factorisation with row partial pivoting
- Recursive factorisation with pivot search
See http://www.netlib.org/benchmark/hpl/
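To make "LU factorisation with row partial pivoting" concrete, here is a minimal serial sketch in pure Python. It is not HPL's implementation (which is recursive, blocked and distributed over processes); the function name `lu_solve` and the 3 x 3 test system are chosen here for illustration only.

```python
def lu_solve(a, b):
    """Solve a x = b by LU factorisation with row partial pivoting.

    a is a list of n rows (each a list of n floats), b a list of n floats.
    Inputs are copied, not modified.
    """
    n = len(a)
    lu = [row[:] for row in a]
    rhs = b[:]
    for k in range(n):
        # Partial pivoting: pick the row with the largest |entry| in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        lu[k], lu[p] = lu[p], lu[k]
        rhs[k], rhs[p] = rhs[p], rhs[k]
        for i in range(k + 1, n):
            factor = lu[i][k] / lu[k][k]
            lu[i][k] = factor            # store the L multiplier in place
            for j in range(k + 1, n):
                lu[i][j] -= factor * lu[k][j]
            rhs[i] -= factor * rhs[k]    # forward substitution on the fly
    # Back substitution with the upper triangle U.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(lu[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (rhs[i] - s) / lu[i][i]
    return x


# Small illustrative system whose exact solution is (1, 1, 2).
a = [[2.0, 1.0, 1.0],
     [4.0, -6.0, 0.0],
     [-2.0, 7.0, 2.0]]
b = [5.0, -2.0, 9.0]
x = lu_solve(a, b)
print(x)
```

Solving this way never forms the inverse matrix, which is why it is cheaper than a full inversion when only the solution vector is needed.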

8 Installation and features
Installation. It needs:
- MPI (Message Passing Interface): freeware
- BLAS (Basic Linear Algebra Subprograms): freeware
Provides:
- A testing program (random matrix generation): accuracy
- A timing program: execution time and estimate of Gflops
- An optimisation program:
  - The blocking factor: size of the A_ij sub-matrices (64*64)
  - The cluster size: number of PCs or processes
  - The process grid: n rows * n columns = n machines (N PC * 1)
To investigate performance we used 16 personal PCs from our lab:
- CPU from 500 MHz up to 1.8 GHz
- Memory from 128 MB up to 256 MB
Remark: overall performance is driven by the least powerful PC.
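The roles of the blocking factor and the process grid can be illustrated with the usual two-dimensional block-cyclic mapping. This is a sketch assuming the distribution starts at process (0, 0); the function names and the 2048 x 2048 matrix size are hypothetical and chosen only for the example.

```python
def block_cyclic_owner(i, j, nb, p_rows, p_cols):
    """Return (process row, process column) owning matrix element (i, j).

    i, j are 0-based global indices, nb is the blocking factor,
    and the process grid has p_rows x p_cols processes.
    """
    return (i // nb) % p_rows, (j // nb) % p_cols


def local_block_count(n, nb, p_rows, p_cols):
    """Count how many nb x nb blocks each process holds for an n x n matrix."""
    n_blocks = -(-n // nb)  # ceiling division
    counts = {}
    for bi in range(n_blocks):
        for bj in range(n_blocks):
            owner = (bi % p_rows, bj % p_cols)
            counts[owner] = counts.get(owner, 0) + 1
    return counts


# Compare a 16 x 1 grid with a 4 x 4 grid for a hypothetical
# 2048 x 2048 matrix and 64 x 64 blocks.
for grid in [(16, 1), (4, 4)]:
    counts = local_block_count(2048, 64, *grid)
    print(grid, "blocks per process:", sorted(set(counts.values())))
```

Both grids balance the memory equally in this example; what changes with the grid shape is the communication pattern, which this sketch does not model.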

9 Evolution of performances with cluster size
On a single machine: linear resolution is 2 times faster than Millepede matrix inversion.
Resolution of a system of 2000 linear equations:
- ~Scalable with the number of PCs up to 4.
- Above 4, the limitation is due to the communication protocol: the matrix is too small.
Overall gain compared to Millepede is a factor of 10, with an accuracy close to the hardware limit.

10 Evolution of performances with matrix size
Performances measured with 16 PCs up to a system of 11000 equations (limited by memory size).
[Plot legend: square = measured, circle = corrected]
- It is an N³ process up to 8000.
- Above 8000, the process slows down due to memory swap.
- Time renormalisation with Gflop/s at peak (best point): nominal performances with an ideal system (dedicated, enough memory).
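The N³ behaviour and the renormalisation by a sustained Gflop/s rate can be sketched as below. The operation count is the standard leading-order term for an LU-based solve, (2/3) N³; the rate of 1 Gflop/s is a hypothetical placeholder, not a value measured in this study.

```python
def lu_solve_flops(n):
    """Leading-order floating point operation count for an LU solve."""
    return (2.0 / 3.0) * n ** 3


def estimated_time_s(n, gflops):
    """Estimated wall time in seconds at a sustained rate of `gflops` Gflop/s."""
    return lu_solve_flops(n) / (gflops * 1.0e9)


# Hypothetical sustained rate, for illustration only (not a measured value).
rate = 1.0
for n in (2000, 4000, 8000):
    print(n, f"{estimated_time_s(n, rate):.1f} s")
```

Doubling the matrix size multiplies the estimated time by 8, which is the slope to compare the measured points against before memory swapping sets in.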

11 Final performances
N = matrix size, n = number of PCs, S = speed (GHz) of the slowest
Actual performances with our 16-PC cluster and S = 0.5 GHz:

                           HPL         Millepede (1.4 GHz)
Pixel barrel               50 min      8 hours
Whole pixel                1.4 hours   2.6 days
Whole ID (extrapolated)    18 hours    10 days

But still too long: more than one iteration will be needed, depending on the final accuracy we want.

12 Future improvements
- The processor grid and blocking size can be optimised (gain ~2):
  - We used (16 rows * 1 column) → (4*4) or (2*8)
  - Blocking size 64 → 128, 256 … xxx
- The BLAS routines are not optimised (gain ~5). According to specialists, we may gain a big factor by using either:
  - An optimised BLAS version: not freeware
  - ATLAS (Automatically Tuned Linear Algebra Software): free, not tested yet because our Linux version is too old
- The power of the machines in five years: at least 5 GHz, compared with 0.5 GHz (gain ~10)
- 64-bit floating point arithmetic unit, already available (gain >2):
  - First estimate on a DEC machine is a factor of at least 10
  - But not independent of CPU and memory speed
Total gain T = 200: (18 hours * 10 iterations) / 200 = 0.9 hour for the whole ID
→ ID alignment execution time is no longer a problem.
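The total gain T = 200 is the product of the four individual factors, and the final figure follows directly. In the sketch below the pairing of each factor with each improvement is our reading of the slide, and treating the factors as independent and multiplicative is an assumption.

```python
# Gain factors quoted on the slide (approximate; some are lower bounds).
gains = {
    "process grid and blocking size": 2,
    "optimised BLAS": 5,
    "CPU clock (5 GHz vs 0.5 GHz)": 10,
    "64-bit floating point unit": 2,
}

total_gain = 1
for factor in gains.values():
    total_gain *= factor

hours_per_iteration = 18   # whole ID with HPL on the 16-PC cluster
iterations = 10
projected_hours = hours_per_iteration * iterations / total_gain

print(total_gain, projected_hours)
```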

13 Matrix inversion
Main problems:
- Matrix inversion is always slower than linear resolution (~2)
- Accuracy is worse: non-uniformity of the matrix elements
- It is less robust: null eigenvalues
However:
- Computing the errors gives more confidence in the final results
- It needs to be done only once, at the end
The solution is ScaLAPACK (Scalable LAPACK):
- The package is a general parallel linear algebra library, including inversion
- It needs:
  - PBLAS (BLAS level 1-2-3)
  - BLACS (Basic Linear Algebra Communication Subprograms)
  - MPI (Message Passing Interface)
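How the inverse yields the errors can be shown on a toy case. This is a serial Gauss-Jordan sketch, not ScaLAPACK; it assumes the matrix is a least-squares normal-equation matrix, so that its inverse is the covariance matrix of the fitted parameters. The 2 x 2 matrix and the pivot threshold are hypothetical.

```python
import math


def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augment a copy of the matrix with the identity.
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(aug[i][k]))
        if abs(aug[p][k]) < 1e-14:
            # A (near) null pivot means the matrix is (near) singular,
            # i.e. it has a null eigenvalue and no usable inverse.
            raise ValueError("matrix is singular or nearly singular")
        aug[k], aug[p] = aug[p], aug[k]
        pivot = aug[k][k]
        aug[k] = [v / pivot for v in aug[k]]
        for i in range(n):
            if i != k:
                factor = aug[i][k]
                aug[i] = [vi - factor * vk for vi, vk in zip(aug[i], aug[k])]
    return [row[n:] for row in aug]


# Hypothetical 2 x 2 normal-equation matrix, for illustration only.
a = [[4.0, 1.0],
     [1.0, 3.0]]
cov = invert(a)
# Parameter errors: square roots of the diagonal of the covariance matrix.
errors = [math.sqrt(cov[i][i]) for i in range(len(a))]
print(errors)
```

The singular-matrix check is where the "less robust" remark shows up: a solve can still be regularised or iterated, while an inverse simply does not exist.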

14 First investigations
The philosophy of the package is very close to HPL:
- Same data distribution, basic algebra and communication protocol
- The installation is easier: pre-compiled library, and MPI already installed
- But the example programs are not that clear, and there is no inversion example
However, we succeeded in inverting matrices, but only on a single processor (PII at 0.5 GHz):

Size         Inversion   HPL
1000*1000    69 s        35 s
2000*2000    10.5 min    5.5 min

- A factor of 2 slower than HPL ≈ Millepede
- Same N³ dependency as HPL, apart from a multiplicative factor of 2
- Because of scalability, everything we said about HPL is valid for ScaLAPACK

15 Conclusions on matrix inversion
Using software such as HPL and/or ScaLAPACK and a cluster of 16 dedicated PCs with:
- 5 GHz CPU clock
- 1 GB of fast memory
- 64-bit floating point arithmetic unit
the whole ID will be aligned in one hour or so!
Incredible? Here is a mail from the expert himself:
"If you were to buy new hardware, my guess is that you could solve this problem 10 times in a day on one multiprocessor machine. The forthcoming AMD Opteron gets roughly 85% of peak, and is a 64 bit machine so that you can put enough memory on it. There's certainly no need for 32 nodes for such a small problem." (R. Clint Whaley)

16 Pixel barrel alignment
To test the alignment procedure we used:
- A sample of multimuons generated by Richard:
  - 1500 events (0 < φ < 2π, η < 1.8, 2 < p_T < 50 GeV)
  - 10 muons per event with a common vertex
  - Vertex spread: 5.6 cm in z and 2 mm in x and y
- Richard's code to produce the ntuple with perfect geometry for the full pixel barrel (1418 modules) or a ring (100 modules)
- Alignment code running in two passes:
  - Preparation of the vector and matrix (Pawel's code) and storage on disk (no vertex fit applied yet)
  - Solving the linear system with HPL using a farm of 25 PCs (no inversion)

17 Alignment parameters correction
- Structure in the alignment parameters and errors points to remaining problems.
- The comparison on a single ring between HPL and the standard CERN library shows similar results.
- The machinery is ready for a successful test of the pixel barrel alignment.

18 Conclusions
- Mechanical constraints are not justified:
  - No gain in execution time
  - Stability not guaranteed at the level of a few microns
- Linear system resolution and matrix inversion are no longer a problem:
  - With a cluster of 16 modern PCs, whole ID alignment in 1 hour
- A first test of full-scale alignment of the pixel barrel has been done:
  - Some problems remain in the alignment procedure
  - Detailed documentation on Pawel's software and the geometry package is recommended
  - However, we are very close to a successful test

