High-performance tracing of many-core systems with LTTng


1 High-performance tracing of many-core systems with LTTng
Simon Marchi, DORSAL Laboratory, Department of Computer Engineering
Operating system kernel

2 Outline
- Intro to tracing
- Problem description
- Studied platforms
- Characteristics of many-core processors
- Work done and planned work

3 Intro to tracing (1/2)
- Very high-performance logging
- How?
  - Very compact output format
  - Lockless synchronization
  - Low-level optimization (architecture dependent)
- Small footprint
- Won't block the application
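
To illustrate why a compact output format matters, here is a minimal sketch comparing a fixed-size binary event record with an equivalent text log line. The record layout (8-byte timestamp, 2-byte event id, 4-byte payload), the helper names, and the sample values are assumptions made for illustration; this is not LTTng's actual trace format.

```python
import struct

# Hypothetical record layout, NOT LTTng's real on-disk format:
# 8-byte timestamp, 2-byte event id, 4-byte payload, little-endian, no padding.
RECORD = struct.Struct("<QHI")

def encode_binary(timestamp_ns, event_id, value):
    """Pack one event into a fixed-size binary record."""
    return RECORD.pack(timestamp_ns, event_id, value)

def encode_text(timestamp_ns, event_name, value):
    """Format the same event as a conventional text log line."""
    return f"[{timestamp_ns}] {event_name}: value={value}\n".encode("ascii")

# Illustrative values only.
binary = encode_binary(1_357_000_000_123_456_789, 42, 7)
text = encode_text(1_357_000_000_123_456_789, "sched_switch", 7)

print(len(binary))              # 14 bytes per event
print(len(text) > len(binary))  # True: the text line is several times larger
```

The binary record also skips number-to-string formatting on the traced application's critical path, which is part of how a tracer keeps its overhead low.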

4 Intro to tracing (2/2)
Used by:
- Kernel and application developers
- Sysadmins
- Security analysts
- Education

5 LTTng!
- Open source project started at Polytechnique
- Linux kernel and userspace application tracer
- Very active development, many industrial partners

6 LTTng!

7 Problem description
- Latest generation of many-core processors
  - Tilera, Intel Xeon Phi, Freescale, Adapteva
- Expected to become more popular
  - Energy-efficient
  - Best way to use the ever-increasing number of transistors on chips
- Developers need good tools
LTTng helps developers with performance problems or bugs related to parallel programming. There is no doubt a tracer will be a good friend on a 50-core machine.

8 Problem description
- Port and optimize LTTng for many-core architectures
- Expected challenges
  - Limited storage
  - High volume of data generated
  - Highly parallel architectures, performance scaling
We expect to do more at the same time with these processors, so necessarily there will be more to trace.

9 Studied platforms
- Tilera TILE-Gx8036
  - 36 cores (versions up to 100 cores to come)
  - Target market: cloud computing, packet processing, data mining, multimedia, etc.
  - Already available at the lab
- Intel Xeon Phi
  - 60 x86-compatible cores
  - Target market: coprocessor in servers, high performance computing
  - Launched November 2012, general availability January
  - On its way!

10 Tilera TILE-Gx architecture
Source: TILE-Gx8036 product brief

11 Intel Xeon Phi architecture
Source: Intel

12 Common characteristics
- Interconnection network between cores
  - Shared memory becomes a bottleneck
  - TILE-Gx: mesh-like network
  - Xeon Phi: ring interconnect
  - Very fast
- Distributed cache architecture
  - Each core has its own L1/L2 cache
  - On an L2 cache miss, the core looks the line up in the other cores' L2 caches (virtual L3)
  - Uses the interconnection network
  - Direct L2 to I/O transfer to avoid main memory
  - Tilera: 64k L k L2
  - Xeon Phi: ?
Very fast network, delay for interconnection: ~1-5 cycles per hop
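
As a rough illustration of what "~1-5 cycles per hop" means on each topology, the sketch below counts hops on a 2D mesh and on a ring and turns them into a latency range. Several things here are assumptions rather than facts from the slides: the 36 TILE-Gx cores are laid out as a 6 x 6 grid numbered row by row, routing follows the shortest X-then-Y path, the 60 Xeon Phi cores each occupy one stop on a bidirectional ring, and all function names are hypothetical.

```python
def mesh_hops(src, dst, width):
    """Hops between two cores on a 2D mesh (assumed shortest X-then-Y route)."""
    sx, sy = src % width, src // width
    dx, dy = dst % width, dst // width
    return abs(sx - dx) + abs(sy - dy)

def ring_hops(src, dst, n_cores):
    """Hops between two cores on a ring, taking the shorter direction."""
    d = abs(src - dst) % n_cores
    return min(d, n_cores - d)

def latency_range(hops, min_cycles=1, max_cycles=5):
    """Best and worst case delay in cycles, using the ~1-5 cycles per hop figure."""
    return hops * min_cycles, hops * max_cycles

# Worst cases under the stated assumptions.
worst_mesh = mesh_hops(0, 35, width=6)   # opposite corners of a 6 x 6 mesh
worst_ring = ring_hops(0, 30, 60)        # opposite sides of a 60-stop ring

print(worst_mesh, latency_range(worst_mesh))  # 10 (10, 50)
print(worst_ring, latency_range(worst_ring))  # 30 (30, 150)
```

Under these assumptions the worst-case distance grows with the side length of the mesh but with half the core count on the ring, which is one reason the placement of trace buffers relative to the cores that consume them may matter.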

13 Common characteristics
- In-memory filesystem
  - 8 GB of memory
  - No permanent storage
  - The trace has to be stored somewhere else
- High bandwidth I/O
  - PCI Express link to host
  - Tilera: 4 x 10GbE network controller
- Runs a full Linux OS
  - Most standard tools (e.g. gdb, oprofile) are already compatible
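
A back-of-the-envelope sketch shows why the trace cannot stay on the device. The 8 GB of memory, the 36 cores, and the 4 x 10GbE link come from the slides; the event rate (1,000,000 events per second per core) and event size (32 bytes) are purely hypothetical inputs, and the calculation ignores protocol overhead and the memory the traced applications themselves need.

```python
def trace_rate_bytes(cores, events_per_sec_per_core, bytes_per_event):
    """Aggregate trace output in bytes per second."""
    return cores * events_per_sec_per_core * bytes_per_event

def seconds_until_full(capacity_bytes, rate_bytes_per_sec):
    """How long an in-memory filesystem of the given size can absorb the trace."""
    return capacity_bytes / rate_bytes_per_sec

def fits_link(rate_bytes_per_sec, link_bits_per_sec):
    """Whether the raw link bandwidth could carry the trace off the device."""
    return rate_bytes_per_sec * 8 <= link_bits_per_sec

# Hypothetical workload: 1,000,000 events/s per core, 32 bytes per event.
rate = trace_rate_bytes(cores=36, events_per_sec_per_core=1_000_000, bytes_per_event=32)

print(rate)                                           # 1152000000 bytes/s
print(round(seconds_until_full(8 * 2**30, rate), 1))  # 7.5 seconds to fill 8 GB
print(fits_link(rate, 4 * 10 * 10**9))                # True for 4 x 10GbE
```

With these made-up numbers, memory fills in seconds while the network link still has headroom, which is consistent with the slide's point that the trace has to be sent elsewhere.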

14 Tilera TILE-Gx characteristics
- Mesh network
  - Developers can use it as a “software” ASIC
- Many hardware accelerators for
  - Packet processor/router
  - Cryptography (SSL, DSA, RSA, IPSec, etc.)
  - Compression (gzip)
- Runs a hypervisor
  - Possibility to dedicate cores to different simultaneously running OSes
  - Possibility to run Zero Overhead Linux and bare metal applications
Software ASIC: lots of small processors connected together, short wires

15 Work done
- Basic port of LTTng (UST and kernel) to the Tilera
  - Only one small fix was necessary on the LTTng side
  - A few issues reported to Tilera were fixed on their side

16 Planned work
- Direct port of LTTng to the platforms
  - Just get it to work
- Develop a benchmark suite
  - Various real-life, heavily parallel applications
- Find bottlenecks, optimize
  - Make use of the special communication hardware
  - Adapt to the architectural features
- Integrate the work
  - Find ways to abstract for other many-core platforms

17 Conclusion
- Problem: many-core = a lot of data to trace
- Requires different approaches than classic processors
  - Different bottlenecks / constraints
  - New hardware features / accelerators

18 Questions?

