Processors with Hyper-Threading and AliRoot performance Jiří Chudoba FZÚ, Prague.

1 Processors with Hyper-Threading and AliRoot performance Jiří Chudoba FZÚ, Prague

2 Motivation
chudoba@fzu.cz
How to choose the optimal hardware?
Contributions are counted in SPECints.
But how to measure it?
www.spec.org: CPU2000 tests – 150 USD
Many results are published, but:
The hardware is often not identical to our machines.
Results depend on OS, compiler, …

3 HP ProLiant DL360 G3
2x Xeon 2.8 GHz HT, 512 KB cache, 2x 18.2 GB Ultra320 hot-pluggable drives
http://h18004.www1.hp.com/products/servers/proliantdl360/description-g3.html
SPECint2000 results:
ProLiant DL560, 2.8 GHz Intel Xeon MP (2 MB L3 cache), HT disabled in BIOS: SPECint2000 1247 (SPECint_base2000 1196)
ProLiant DL360 G3, 3.06 GHz Intel Xeon, 512 KB L2, 1 MB L3, no HT: SPECint2000 1258
Dell PowerEdge 2650, 2.8 GHz Xeon, 512 KB L2 cache, HT disabled: SPECint2000 907
Intel D875PBZ motherboard (2.80C GHz Pentium 4, HT maybe on, the default status): SPECint2000 1204
Intel D875PBZ motherboard (AA-301) (2.8E GHz, 1 MB cache, HT maybe on): SPECint2000 1269

6
10:49pm up 3 days, 4:03, 1 user, load average: 0.00, 0.00, 0.00
31 processes: 30 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU1 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU2 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU3 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
Mem: 2069804K av, 1386140K used, 683664K free, 0K shrd, 145584K buff
Swap: 2097112K av, 0K used, 2097112K free 1006064K cached
Duplication of the architectural state on each processor, while sharing one set of processor execution resources
2 logical processors per physical processor
Details on http://www.intel.com/technology/hyperthread/

7 Not Doubled Performance
Note that a CPU that supports hyper-threading will not match the performance of two physical processors rated at the same speed. The reason is that the two logical processors that make up a hyper-threaded CPU have to share resources, namely the execution engine, the cache, and access to the system bus.
Intel promises a 10-30% performance increase...

8 Hyper-Threading – not always better
… but this is not always the case:
http://www.2cpu.com/Hardware/ht_analysis/3.html

9 Other tests
Unix Benchmark Utility v.0.3, author: Sergei Viznyuk
            noHT                    HT
            gcc 2.96    gcc 3.2     gcc 2.96    gcc 3.2
CPU test    186849      224203      206039      234571
Ratio       1           1.2         1.10        1.05 (1.26)
Klaus Schossmaier reported (numbers per CPU):
Opteron 1.4 GHz    74955
Opteron 1.8 GHz    97749
Xeon 2.4 GHz       88064
Itanium 1.0 GHz    66714
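The ratios quoted with the CPU test scores can be reproduced with a few lines of Python. This is a minimal sketch, assuming the four scores are 186849, 224203, 206039 and 234571 for noHT/gcc 2.96, noHT/gcc 3.2, HT/gcc 2.96 and HT/gcc 3.2 respectively; the names `scores` and `ratio` are illustrative only.

```python
# CPU test scores from the Unix Benchmark Utility slide.
# ASSUMPTION: column order is noHT/gcc 2.96, noHT/gcc 3.2, HT/gcc 2.96, HT/gcc 3.2.
scores = {
    ("noHT", "gcc 2.96"): 186849,
    ("noHT", "gcc 3.2"): 224203,
    ("HT", "gcc 2.96"): 206039,
    ("HT", "gcc 3.2"): 234571,
}


def ratio(numerator, denominator):
    """Score of one configuration divided by the score of another."""
    return scores[numerator] / scores[denominator]


baseline = ("noHT", "gcc 2.96")

# Every configuration relative to noHT with gcc 2.96.
for config in scores:
    print(config, round(ratio(config, baseline), 2))

# Effect of HT alone, with the compiler held fixed at gcc 3.2.
print("HT vs noHT, gcc 3.2:", round(ratio(("HT", "gcc 3.2"), ("noHT", "gcc 3.2")), 2))
```

Under that assumption, HT with gcc 3.2 comes out at about 1.05 times the noHT gcc 3.2 score and about 1.26 times the noHT gcc 2.96 score, which matches the values given on the slide.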

10 Results for AliRoot
                    HT with scheduling    HT                    noHT
2 jobs, parallel    1     296 ± 2 s       1.13  337 ± 48 s      1  297 ± 1 s
4 jobs, parallel    1.73  514 ± 3 s       1.73  515 ± 5 s       2  596 ± 10 s
2+2 jobs            2     592 s           2.26  674 s           2  594 s
CERN RH 7.3.3, kernel 2.4.20, AliRoot v4-01-05, 1000 tracks HIJINGParam, Real time
ftp://ftp.kernel.org/pub/linux/kernel/people/rml/cpu-affinity/ + http://freshmeat.net/projects/sched-utils/
CPU0 states: 100.0% user, 0.0% system, 0.0% nice, 0.0% idle
CPU1 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU2 states: 100.0% user, 0.0% system, 0.0% nice, 0.0% idle
CPU3 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU0 states: 100.0% user, 0.0% system, 0.0% nice, 0.0% idle
CPU1 states: 100.0% user, 0.0% system, 0.0% nice, 0.0% idle
CPU2 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU3 states: 0.0% user, 0.1% system, 0.0% nice, 99.0% idle
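The "HT with scheduling" runs relied on the CPU-affinity kernel patch and sched-utils linked on this slide. On current Linux systems a similar effect is available from Python's standard library through `os.sched_setaffinity`; this is a swapped-in technique, not the tool used for these measurements. A minimal sketch, assuming Linux: the helper `pin_to_cpu` is a hypothetical name, and which logical CPU numbers share a physical processor is system-dependent and not checked here.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict process `pid` (0 means the calling process) to one logical CPU.

    Returns the affinity set reported by the kernel afterwards.
    """
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    original = os.sched_getaffinity(0)   # logical CPUs this process may use
    target = min(original)               # arbitrary choice for the demo
    print("allowed before:", sorted(original))
    print("allowed after: ", sorted(pin_to_cpu(target)))
    os.sched_setaffinity(0, original)    # restore the original mask
```

To keep two jobs off the same physical processor, each job would be pinned to a logical CPU belonging to a different physical package; the mapping has to be looked up on the machine in question.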

11 Conclusions
CPU resource estimates are probably very rough.
HT can add 15% in performance but in some cases in Real Time
Publicly available results of some of our standard CPU tests would help (an update of the Root benchmark tests?)
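The 15% figure can be cross-checked against the AliRoot real times. A minimal sketch, assuming 514 s is the HT time and 596 s the noHT time for four parallel jobs, and that 296 s and 337 s are the two-job HT times with and without explicit scheduling; the function name `speedup` is illustrative.

```python
def speedup(t_reference, t_new):
    """Throughput ratio when the same work takes t_new instead of t_reference."""
    return t_reference / t_new


# Four parallel jobs: noHT versus HT (real time in seconds).
ht_gain = speedup(596.0, 514.0) - 1.0

# Two parallel jobs with HT: default scheduler versus explicit scheduling.
unscheduled_penalty = 337.0 / 296.0 - 1.0

print(f"HT gain, 4 parallel jobs: {ht_gain:.0%}")
print(f"Extra real time without scheduling, 2 jobs: {unscheduled_penalty:.0%}")
```

Under those assumptions the gain for four parallel jobs is roughly 16%, in line with the quoted 15%, while two jobs left to the default scheduler take roughly 14% longer on average than two jobs placed explicitly.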

12 Root benchmark
stress results: Root 3.10.02, gcc 3.2, -O
4 parallel jobs, 9000 events, HT: 512
4 parallel jobs, 9000 events, noHT: 732
2 parallel jobs, 9000 events, HT: 733
Opteron 1.4 GHz, 1 MB cache, 8 GB RAM
Opteron 1.8 GHz, 1 MB cache, 8 GB RAM
Itanium2 1.0 GHz, 3 MB cache, 2 GB RAM
P4 Xeon 3.06 GHz, 512 KB cache, 2 GB RAM
750 rootmarks, g++ 3.2.1 (-O2)
950 rootmarks, g++ 3.2.1 (-O2)
497 rootmarks, g++ 3.2.3 (-O2)
750 rootmarks, g++ 3.2.3 (-O2)
550 rootmarks, 32-bit binary compiled on P4 with g++ 3.2.2 (-O2)
1020 rootmarks, ecc 7.1 (-O)
Klaus Schossmaier

