1 CLAS12 Software. D.P. Weygand, Thomas Jefferson National Accelerator Facility

2 Projects
  ClaRA
  Simulation: GEMC
  CCDB
  Geometry Service
  Event Display
  Tracking: SOT Gen III
  Event Reconstruction
  Post-Reconstruction Data Access: Data-Mining
  Slow Controls
  Documentation: Doxygen, Javadoc
  Testing/Authentication
  Detector Subsystems (Reconstruction and Calibration): EC, PCAL, FTOF, CTOF, LTCC, HTCC
  OnLine
  Code Management: SVN, Bug Reporting, Support
  Visualization Services
  Support Packages, e.g. CLHEP, ROOT

3 Service Oriented Architecture
Overview: services are unassociated, loosely coupled units of functionality that have no calls to each other embedded in them. Each service implements one action. Rather than embedding calls to each other in their source code, services use defined protocols that describe how they pass and parse messages.
SOA aims to let users string together fairly large chunks of functionality to form ad hoc applications built almost entirely from existing software services. The larger the chunks, the fewer interface points are required to implement any given set of functionality; however, very large chunks of functionality may not prove sufficiently granular for easy reuse. Each interface brings with it some processing overhead, so there is a performance consideration in choosing the granularity of services.
The great promise of SOA is that the marginal cost of creating the n-th application is low, since all of the software required already exists to satisfy the requirements of other applications; ideally, only orchestration is needed to produce a new application.
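As an illustration of that last point, here is a minimal sketch of service composition, assuming nothing about the ClaRA API: the service names and the orchestrate helper below are hypothetical, and plain Python calls stand in for the message-passing protocol a real SOA would use.

# Toy illustration only: each "service" implements one action and knows
# nothing about the others; an orchestrator wires them into an application.
def decode_service(raw_event):
    """Turn a raw event (a list of ints) into calibrated values."""
    return [x * 0.5 for x in raw_event]

def track_service(calibrated):
    """Reduce calibrated values to a simple summary."""
    return {"n_hits": len(calibrated), "sum": sum(calibrated)}

def orchestrate(event, pipeline):
    """Pass data through a chain of services chosen only by the orchestrator."""
    data = event
    for service in pipeline:
        data = service(data)
    return data

# A new "application" is just a new ordering of existing services.
print(orchestrate([10, 20, 30], [decode_service, track_service]))  # {'n_hits': 3, 'sum': 30.0}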

4 SOA/Complexity
SOA is principally based on object-oriented design. Each service is built as a discrete piece of code. This makes it possible to reuse the code in different ways throughout the application by changing only the way an individual service interoperates with the other services that make up the application, rather than making code changes to the service itself. SOA design principles are used during software development and integration.
Software complexity is a term that encompasses numerous properties of a piece of software, all of which affect internal interactions. There is a distinction between the terms complex and complicated. Complicated implies being difficult to understand, but with time and effort ultimately knowable. Complex, on the other hand, describes the interactions between a number of entities. As the number of entities increases, the number of interactions between them grows combinatorially (pairwise links alone grow as n(n-1)/2, e.g. 45 links among just 10 entities), and it reaches a point where it becomes impossible to know and understand all of them. Similarly, higher levels of complexity in software increase the risk of unintentionally interfering with interactions, and so increase the chance of introducing defects when making changes. In more extreme cases, they can make modifying the software virtually impossible.

5 ClaRA and Cloud Computing
  Addresses the major components of physics data processing as services.
  Services, and the information bound to those services, can be further abstracted into process layers and composite applications for developing various analysis solutions.
  Agility: the ability to change the physics data processing process on top of existing services.
  Ability to monitor points of information and points of service, in real time, to determine the well-being of the entire physics data processing application.
  SOA is the key architecture choice for ClaRA: highly concurrent, cloud computing.

6 ClaRA Stress Test. V. Gyurjyan, S. Mancilla, JLab Scientific Computing Group

7 ClaRA Components. Diagram: a platform (cloud controller) manages DPEs, one DPE per compute node; each DPE hosts containers (C), each container hosts services (S), and an orchestrator steers the services.
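For orientation only, a toy model of that hierarchy; the class names below are illustrative and are not the real ClaRA interfaces.

# Illustrative model of the deployment hierarchy described above: a platform
# (cloud controller) tracks DPEs, one DPE runs per compute node, each DPE
# hosts containers, and each container hosts services. Not the real ClaRA API.
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class Service:
    name: str                 # e.g. one reconstruction engine

@dataclass
class Container:
    name: str
    services: Dict[str, Service] = field(default_factory=dict)

@dataclass
class DPE:                    # one per compute node
    host: str
    containers: Dict[str, Container] = field(default_factory=dict)

@dataclass
class Platform:               # the cloud controller
    dpes: Dict[str, DPE] = field(default_factory=dict)

platform = Platform()
node = DPE(host="farm-node-01")
node.containers["reco"] = Container("reco", {"FTOF": Service("FTOF")})
platform.dpes[node.host] = node
print(list(platform.dpes["farm-node-01"].containers["reco"].services))  # ['FTOF']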

8 Batch Deployment

9 Figure: event reconstruction rate (kHz) vs. number of threads on a 16-core hyper-threaded node (no I/O); about 150 ms/event/thread.

10 Batch job submission (excerpt from the job script):
  setenv CLARA_SERVICES /group/clas12/ClaraServices
  $CLARA_SERVICES/bin/clara-dpe -host claradm-ib

11 Single Data-stream Application. Diagram: an application orchestrator (AO) on the executive node coordinates reader (R) and writer (W) services backed by persistent storage, together with the administrative services and the ClaRA master DPE; farm nodes 1..N each run a chain of services S1, S2, ..., Sn.

12 Multiple Data-stream Application. Diagram: the same layout as the single data-stream application (administrative services, ClaRA master DPE, application orchestrator, reader/writer services, farm nodes running S1, S2, ..., Sn), with a DS component and multiple persistent-storage streams feeding the farm nodes.

13 Batch queues: a common queue and an exclusive queue; CentOS 6.2, 16-core nodes, 12 processing nodes.

14 Single Data-stream Application: CLAS12 reconstruction on the JLab batch farm.

15 Computing Capacity Growth
  Today: 1K cores in the farm (3 racks, 4-16 cores per node, 2 GB/core); 9K LQCD cores (24 racks, 8-16 cores per node, 2-3 GB/core); 180 nodes with 720 GPUs plus Xeon Phi as LQCD compute accelerators.
  2016: 20K cores in the farm (10 racks, 16-64 cores per node, 2 GB/core); accelerated nodes for Partial Wave Analysis, perhaps even 1st pass?
  Total footprint, power, and cooling will grow only slightly.
  Capacity for detector simulation will be deployed in 2014 and 2015, with additional capacity for analysis in 2015 and 2016.
  Today Experimental Physics has less than 5% of the compute capacity of LQCD; in 2016 it will be closer to 50% in dollar terms and number of racks (still small in terms of flops).

16 Compute Paradigm Changes
  Today, most codes and jobs are serial. Each job uses one core, and we try to run enough jobs to keep all cores busy without overusing memory or I/O bandwidth.
  Current weakness: if we have 16 cores per box and run 24 jobs to keep them all busy, there are 24 input and 24 output file I/O streams running for that one box, which means a lot of "head thrashing" in the disk system.
  Future: most data analysis will be event-parallel ("trivially parallel"). Each thread will process one event. Each box will run one job (DPE) handling 32-64 events in parallel, with one input and one output stream, giving much less head thrashing and higher I/O rates.
  Possibility: the farm will include GPU- or Xeon Phi-accelerated nodes. As software becomes ready, we will deploy it.
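A minimal sketch of that event-parallel pattern, with a placeholder reconstruct step and an arbitrary thread count: one reader feeds events to a pool of worker threads and one writer consumes the results, so a box needs only a single input and a single output stream.

# Sketch of event-level parallelism: one input stream, one output stream,
# many worker threads, each thread processing one event at a time.
from concurrent.futures import ThreadPoolExecutor

def reconstruct(event):
    """Placeholder for per-event reconstruction."""
    return sum(event)

def process_stream(events, n_threads=32):
    # A single reader feeds the pool; map() returns results in input order,
    # so a single writer can stream them out without extra file handles.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        yield from pool.map(reconstruct, events)

if __name__ == "__main__":
    fake_input_stream = ([i, i + 1] for i in range(8))   # stands in for a file reader
    for out in process_stream(fake_input_stream):
        print(out)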

17 Tested on calibration and simulated data. V. Ziegler & M. Mestayer

18 FTOF Reconstruction. Jerry Gilfoyle & Alex Colvill
  Code to reconstruct the signals from the Forward Time-of-Flight system (FTOF), written as a software service so that it can be easily integrated into the ClaRA framework. The algorithms are modeled after the ones used in the CLAS6 FTOF reconstruction and tested using Monte Carlo data from gemc, the physics-based CLAS12 simulation.
  The FTOF code converts the TDC and ADC signals into times and energies and corrects for effects like time walk. The position of the hit along the paddle is determined by the difference between the TDC signals, and the time of the hit is reconstructed using the average TDC signal, correcting for the propagation time of light along the paddle. The energy deposited is extracted from the ADC signal and corrected for light attenuation along the paddle. Modifications to this procedure are applied when one or more of the ADC or TDC signals are missing.
  The FTOF code is up and running and will be used in the upcoming "stress test" of the full CLAS12 event reconstruction package.
  Fig. 1: histogram of the number N_adj of adjacent paddles in a cluster, normalized to the total number of events. Clusters are formed in a single panel by grouping adjacent hits together. The red open circles are N_adj for panel 1b; the black filled squares are for panel 1a, which is behind panel 1b relative to the target. Most events consist of a single hit, but a significant number have additional paddles in each cluster.
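A back-of-the-envelope sketch of those corrections, with made-up constants (effective light speed, attenuation length, paddle length, pedestal, gain) and the time-walk correction omitted; the real service takes its constants from the CLAS12 calibration database.

import math

V_EFF = 16.0     # effective light speed in the paddle, cm/ns (assumed value)
ATTEN = 150.0    # light attenuation length, cm (assumed value)
LENGTH = 200.0   # paddle length, cm (assumed value)

def hit_position(t_left, t_right):
    """Position along the paddle from the left-right TDC time difference."""
    return 0.5 * V_EFF * (t_left - t_right)

def hit_time(t_left, t_right):
    """Hit time from the average TDC time, corrected for light propagation."""
    return 0.5 * (t_left + t_right) - 0.5 * LENGTH / V_EFF

def energy(adc_left, adc_right, x, pedestal=0.0, gain=1.0):
    """Deposited energy from the ADCs, correcting each side for attenuation
    over its distance from the hit to the paddle end."""
    e_left = gain * (adc_left - pedestal) * math.exp((x + 0.5 * LENGTH) / ATTEN)
    e_right = gain * (adc_right - pedestal) * math.exp((0.5 * LENGTH - x) / ATTEN)
    return 0.5 * (e_left + e_right)

x = hit_position(t_left=12.5, t_right=7.5)    # 40.0 cm from the paddle centre
print(x, hit_time(12.5, 7.5), energy(800.0, 1200.0, x))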

19 Intel Xeon Phi MIC Processor
  The Intel Xeon Phi KNC processor is essentially a 60-core SMP chip where each core has a dedicated 512-bit wide SIMD vector unit. All the cores are connected via a 512-bit bidirectional ring interconnect. Currently the Phi coprocessor is packaged as a separate PCIe device, external to the host processor. Each Phi contains 8 GB of RAM that provides all the memory and file-system storage available to user processes, the Linux operating system, and ancillary daemon processes. The Phi can mount an external host file system, which should be used for all file-based activity to conserve device memory for user applications.

20 Virtual Machine

21 Virtual Machine

22 CLAS12 Constants Database (CCDB). Johann Goetz & Yelena Prok

23 CLAS12 Constants Database (CCDB)

24 Proposed Programming Standards
  Programming standards create a unified collaboration among members.
  Standards and documentation increase the expected life of software by creating a unified design that aids future maintenance.
  This is a proposed working standard and is open for suggestions and modification. It is found on the Hall-B wiki.

25 Profiling and Static Analysis
  Profiling is a form of dynamic program analysis:
    individual call stacks
    time analysis
    memory analysis
  Static analysis is a pre-run analysis that pinpoints areas of potential error:
    possible bugs
    dead code
    duplicate code
    suboptimal code
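As a concrete illustration of the dynamic side (call stacks and timing), Python's built-in cProfile can be wrapped around any function; the analysis code below is only a stand-in.

# Example of dynamic profiling with Python's standard cProfile module:
# it records per-function call counts and cumulative time.
import cProfile
import io
import pstats

def slow_inner(n):
    return sum(i * i for i in range(n))

def analysis_pass():
    return [slow_inner(50_000) for _ in range(20)]

profiler = cProfile.Profile()
profiler.enable()
analysis_pass()
profiler.disable()

report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())   # top 5 functions by cumulative time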

26 Testing
  Unit testing:
    extends the life of code
    catches errors introduced by modification
    decreases the amount of debugging
    decreases the amount of suboptimal code
  Testing is a required part of the proposed coding standards and decreases the amount of time spent working with incorrect code.
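For example, a minimal unit test in Python's standard unittest framework; the function under test is a stand-in for any calibration or reconstruction routine.

# Minimal unit test: it pins down expected behaviour so that later
# modifications that break it are caught immediately.
import unittest

def time_walk_correction(t, adc, a=0.0, b=1.0):
    """Stand-in calibration function: t_corrected = t - b / sqrt(adc) - a."""
    return t - b / adc ** 0.5 - a

class TestTimeWalk(unittest.TestCase):
    def test_no_correction_for_zero_constants(self):
        self.assertAlmostEqual(time_walk_correction(10.0, 4.0, a=0.0, b=0.0), 10.0)

    def test_correction_reduces_time(self):
        self.assertLess(time_walk_correction(10.0, 4.0, b=1.0), 10.0)

if __name__ == "__main__":
    unittest.main()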

27 Python/ClaRA
  Python has a simple syntax that allows very quick prototyping of experimental analysis services, along with a large number of useful built-in functions and data types. There is also a huge number of open-source, highly optimized, and well-documented computational analysis modules available to be imported into any analysis service. A small sample of the supported areas:
    full statistical and function optimization toolkits
    eigenvalues and eigenvectors of large sparse matrices
    integration functions, with support for integrating ordinary differential equations
    Fourier transforms, interpolation, signal processing
    linear algebra, special functions, a full statistics suite
    highly efficient file I/O, spatial algorithms and data structures
    clustering algorithms for theory, target detection, and other areas
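For instance, assuming NumPy and SciPy are available in the service's Python environment, an analysis service can call an optimized routine directly; here a simple Gaussian fit with scipy.optimize.curve_fit on synthetic data.

# Example of pulling an optimized scientific routine into an analysis service:
# a Gaussian fit with SciPy (requires numpy and scipy to be installed).
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, amplitude, mean, sigma):
    return amplitude * np.exp(-0.5 * ((x - mean) / sigma) ** 2)

# Synthetic "histogram" standing in for detector output.
x = np.linspace(-5.0, 5.0, 101)
y = gaussian(x, 100.0, 0.5, 1.2) + np.random.normal(0.0, 1.0, x.size)

params, covariance = curve_fit(gaussian, x, y, p0=[90.0, 0.0, 1.0])
print("fitted amplitude, mean, sigma:", params)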

28 Python/ClaRA (continued)
  The existing cMsg protocol will be wrapped in an importable Python module that provides the methods needed to receive and send a cMsg container between the Python service and ClaRA. If the received cMsg container holds EVIO data, a separate EVIO support module can be imported to provide the functions needed to read and append data to the EVIO event stored in the cMsg container.
  Components: Python analysis service; EVIO support module written in Python; existing cMsg library written in C, with an imported Python wrapper.
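The sketch below shows only the intended call pattern; the real cMsg wrapper and EVIO support module interfaces are not given in the slides, so the helper functions here are explicit stand-ins, not the actual APIs.

# Stand-in helpers for the wrapped cMsg protocol and the EVIO support module
# described above; the names and behaviour are placeholders, not the real APIs.
def receive_container():
    """Stand-in for the cMsg wrapper's receive call."""
    return {"evio": [1, 2, 3]}

def send_container(container):
    """Stand-in for the cMsg wrapper's send call."""
    print("sending", container)

def read_evio(container):
    """Stand-in for the EVIO support module's read function."""
    return container["evio"]

def append_evio(container, result):
    """Stand-in for the EVIO support module's append function."""
    container.setdefault("results", []).append(result)

def analysis_service(container):
    """A Python analysis service: read the EVIO event, analyze, append results."""
    event = read_evio(container)
    append_evio(container, sum(event))   # user physics code goes here
    return container

send_container(analysis_service(receive_container()))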

29 Summary
  Scaled SOA implemented successfully via ClaRA.
  The software infrastructure is being integrated into the JLab batch farm system.
  More services need to be written, along with more orchestrators/applications.
  Progress on major systems: CCDB, Geometry Service, Reconstruction (Tracking, TOF and EC), Testing/Standards/QA; all actively being developed.

30 Summary (continued)
  User interfaces / ease of use: GUI to CCDB.
  Data handling: data access/mining; EVIO data access via dictionary.
  Virtual Box.
  Programming libraries: ROOT, ScaVis, SciPy/NumPy, ...
  Examples, examples, examples.


