Daimler Benz Aerospace Airbus


Daimler Benz Aerospace Airbus
Batch Computing in Client/Server IT Infrastructures Using LSF and the MSC/Analysis Manager
Klaus-Peter Wessel, Daimler Benz Aerospace Airbus, Dept. EIC (CAE-Systems Support)
Tel.: 0421-538-3161, Fax: 0421-538-4994, E-Mail: klaus-peter.Wessel@airbus.dasa.de

About me: Klaus-Peter Wessel, Dipl.-Ing. (Aerospace Technology), University of Stuttgart
Responsible at DASA Airbus for:
- Installation, customizing and user support of pre-/postprocessor systems: MSC/Patran (major system), I_DEAS, MSC/ARIES (in use with MSC/EMAS), MENTAT
- Services: Intranet CAE-Services
- Batch queueing system (Unix): LSF (Load Sharing Facility) and MSC/Analysis Manager (covered by this paper)

General themes of this paper
- CAE downsizing project at DASA Airbus: from host-based to distributed computing; hardware situation in the CAE area; realization with LSF and MSC/Analysis Manager
- CAE job management/handling: batch queueing system (Load Sharing Facility (LSF) from Platform Computing Ltd.); job-submitting tool (MSC/Analysis Manager)

CAE Downsizing at DASA Airbus
- DASA Airbus has been running a CAE downsizing project since 1994.
- The main goal is to move all MSC/Nastran and self-developed, CAE-specific Fortran applications from the IBM mainframe to decentralized Unix-based workstations, and to reduce the elapsed/CPU-time factor of typical CAE batch jobs to a maximum of 2.
- Platform decision (1995): HP workstations as desktop systems for the engineers; IBM RS6000/R24 as central CAE batch server systems (at each site); no further DEC investment because DEC systems are not 100% IEEE compatible.
- Platform decision revised (09/97): implement a pure HP environment in the major CAE departments; use more decentralized batch servers instead of the former IBM RS6000/R24.

CAE Downsizing at DASA Airbus: What is LSF?
- LSF stands for Load Sharing Facility.
- It is developed by Platform Computing Ltd., Canada.
- It is the world-leading Unix-based batch queueing and load leveling system (LSF can even share batch jobs with NQS running on a CRAY).

CAE Downsizing at DASA Airbus: And why did we choose LSF?
- It combines all Unix-based systems into one "virtual cluster".
- The CAE user gets a "one system" view.
- It is a tool for optimizing the usage of workstation resources.
- It is very flexible in setting up specific batch queues.
- It is flexible in setting up site-specific load indices.
- It works in an optimal way together with the MSC/Analysis Manager.

LSF architecture

xlsbatch
- GUI for submitting, controlling and monitoring batch jobs, hosts and queues.
- DA-specific implementation: clicking the "submit" button opens the MSC/Analysis Manager.
- Note: This tool is currently not used by the "normal" CAE user. LSF is used at DA more or less as a "black box" behind the MSC/Analysis Manager.

Load Indices
A number of built-in load indices are measured continuously by the LIM process on each participating workstation. Examples:
  r15s   15-second run-queue length
  r1m    1-minute run-queue length
  r15m   15-minute run-queue length
  ut     CPU utilization
  it     interactive idle time of the workstation
  pg     paging rate
  mem    memory usage
  swp    swap space
External load indices can be defined in addition. Example:
  nas_scratch   free disk space in directory /nas_scratch
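As an illustration of what an external load index such as nas_scratch measures, here is a minimal Python sketch. It is not the DASA Airbus implementation: the function names are hypothetical, and the one-line report layout (count, then name/value pairs) is an assumption that should be checked against the LSF documentation for external load information managers before use.

```python
import shutil


def free_mb(path):
    """Return the free disk space under `path` in whole megabytes."""
    return shutil.disk_usage(path).free // (1024 * 1024)


def format_report(indices):
    """Format indices as '<count> <name> <value> ...' on one line.

    The layout is an assumption made for this sketch.
    """
    parts = [str(len(indices))]
    for name, value in indices.items():
        parts.extend([name, str(value)])
    return " ".join(parts)


def report_line(path="/nas_scratch"):
    """Build one report line for the nas_scratch index (path taken from the slide)."""
    return format_report({"nas_scratch": free_mb(path)})
```

A site script would call report_line() periodically and hand the result to the batch system; how often, and through which channel, depends on the LSF configuration.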

Load Indices
All load indices can be used in a number of configuration options as well as in job-submit options. Examples:
1.) In the overall LSF workstation declarations:
  - declare a workstation as busy if a load index is above a specific value
  - declare a workstation as busy if it is used interactively (index: it)
2.) In queue definitions:
  - consider only those workstations for job placement whose load index is not higher than specified
3.) In submit commands:
  bsub -q normal -R "type==any order[ut:r1m]" command
  or: bsub -q server -R "type==HP700 && mem>200" command
4.) During the runtime of a job:
  - if a load index rises above a specified value (loadstop): stop the job
  - if a load index falls below a specified value (loadsched): resume or start a job
  - if the software allows it and LSF is configured to do so: migrate the job to another machine
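The two ideas on this slide, ordering candidate hosts by load indices and suspending or resuming jobs around the loadstop/loadsched thresholds, can be sketched in Python as follows. This is a simplified model under stated assumptions, not LSF's actual scheduler: the function names, the dictionary layout and the state names are hypothetical.

```python
def pick_host(hosts, max_ut=None):
    """Pick the least loaded host, ordered by ut first and r1m second.

    hosts: dict mapping host name -> dict with the load indices "ut" and "r1m".
    max_ut: optional threshold; hosts with higher CPU utilization are skipped.
    Returns the chosen host name, or None if no host qualifies.
    """
    candidates = {
        name: load
        for name, load in hosts.items()
        if max_ut is None or load["ut"] <= max_ut
    }
    if not candidates:
        return None
    return min(candidates, key=lambda n: (candidates[n]["ut"], candidates[n]["r1m"]))


def next_state(state, load, loadsched, loadstop):
    """Return the next job state for one load index reading.

    A running job is suspended when the load exceeds loadstop; a pending or
    suspended job is started or resumed only when the load drops below
    loadsched. Between the two thresholds the state is left unchanged.
    """
    if state == "running" and load > loadstop:
        return "suspended"
    if state in ("pending", "suspended") and load < loadsched:
        return "running"
    return state
```

Keeping loadsched below loadstop gives a band in which nothing changes, so a job is not suspended and resumed repeatedly when the load hovers near a single limit.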

Flexibility in Queue Definitions
Scheduling policies:
- FIRST IN - FIRST OUT
- FAIRSHARE POLICY
- PREEMPTIVE and PREEMPTABLE jobs
- EXCLUSIVE jobs
- USER (and/or department) shares
Limits can be set:
- CPULIMIT, RUNLIMIT, FILELIMIT, DATALIMIT, STACKLIMIT, CORELIMIT, MEMLIMIT, PROCLIMIT
Resources/load indices can be used:
- RUN_WINDOWS
- DISPATCH_WINDOWS
- USERS
- HOSTS
- PREEXEC/POSTEXEC commands
- LOADSCHED/LOADSTOP values of all load indices
- MULTICLUSTERING

Flexibility in Queue Definitions
bigmem queue:
- contains only one big, pure CAE batch server
- accepts one job per processor (currently one)
- accepts two jobs in the queue per user
- no overload indices declared
- Note: see the Multi-Clustering feature described later
smallmem queue:
- contains all CAE workstations, e.g. old and small HP/710s
- jobs are killed when they require more than 22 MB of memory
- accepts two jobs per processor
- accepts five jobs in the queue per user
- jobs are suspended when a load index rises above its limit
- jobs are resumed when the load index falls below its limit
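The per-queue limits listed for bigmem and smallmem can be modelled in a few lines of Python. This is only a sketch of the rules as stated on the slide, not LSF configuration syntax; the type and function names are hypothetical.

```python
from collections import namedtuple

# mem_limit_mb is None when the queue declares no memory limit.
QueuePolicy = namedtuple(
    "QueuePolicy",
    ["name", "jobs_per_processor", "jobs_per_user", "mem_limit_mb"],
    defaults=[None],
)

# Values taken from the slide.
BIGMEM = QueuePolicy("bigmem", jobs_per_processor=1, jobs_per_user=2)
SMALLMEM = QueuePolicy("smallmem", jobs_per_processor=2, jobs_per_user=5, mem_limit_mb=22)


def accepts_submission(policy, user_jobs_in_queue):
    """True if the user may put one more job into the queue."""
    return user_jobs_in_queue < policy.jobs_per_user


def has_free_slot(policy, processors, running_jobs):
    """True if a host with `processors` CPUs can start one more job."""
    return running_jobs < processors * policy.jobs_per_processor


def must_kill(policy, job_mem_mb):
    """True if the job exceeds the queue's memory limit, if one is set."""
    return policy.mem_limit_mb is not None and job_mem_mb > policy.mem_limit_mb
```

For example, a third submission by the same user is refused by bigmem but accepted by smallmem, and a 30 MB job is killed only in smallmem.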

xlsmon
GUI for monitoring the cluster load. Shown here: the different host statuses
  ok:      host accepts new batch load
  busy:    host threshold values are exceeded and the host will not accept new load
  unav.:   host is currently offline
  closed:  host run window is currently closed
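The four host statuses can be read as a simple classification over a host's availability, run window and load. The Python sketch below shows one possible reading; the function name, its parameters and the order in which the conditions are checked are assumptions, not a description of how LSF derives the status.

```python
def host_status(online, window_open, load, thresholds):
    """Classify a host as 'unav', 'closed', 'busy' or 'ok'.

    online: False if the host is currently offline.
    window_open: False if the host's run window is currently closed.
    load: dict of load index name -> current value.
    thresholds: dict of load index name -> maximum acceptable value.
    """
    if not online:
        return "unav"
    if not window_open:
        return "closed"
    for name, limit in thresholds.items():
        # An index missing from `load` is treated as not exceeding its limit.
        if name in load and load[name] > limit:
            return "busy"
    return "ok"
```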

xlsmon
GUI for monitoring the cluster load. Shown here: the current workload indices of two hosts.
The tool is useful for LSF administrators and/or CAE systems support staff.

LSF MultiCluster Feature
Reasons for independent (multiple) clusters may be:
- decentralized IT infrastructure
- independent clusters for organizational reasons
- geographically distributed sites
With the LSF MultiCluster functionality:
- independent LSF clusters can be combined into one "virtual cluster"
- jobs can be distributed from one cluster to another

Additional LSF Features
LSF offers a number of further functions that are currently not in use at DASA Airbus:
- production job scheduling (calendars)
- sharing / load leveling of interactive jobs
- load sharing shell (lstcsh)
- parallel (or distributed) make utility

What is the MSC/Analysis Manager?
- It is an easy-to-use graphical user interface (GUI) for submitting and monitoring typical CAE batch jobs.
- It is developed by the MacNeal-Schwendler Corp.
And why did we choose it?
- In downsizing projects, GUIs are needed to ensure that complex software remains easy to use.
- The need to develop in-house GUIs/tools is reduced by using a standard but customizable GUI.
- It works in an optimal way together with LSF and MSC/Nastran.
- No user-specific training is needed to use the LSF cluster features.
- Other codes can be integrated easily.
- The CAE user can use one interface for all CAE batch jobs.
- It is an integral part of MSC/Patran but can also be run in standalone mode.

Submit CAE Jobs with MSC/Analysis Manager
- All kinds of typical CAE batch systems can be integrated into the GUI. At DASA Airbus the following are integrated: MSC/Nastran (shown left), LS-Dyna 3D, Marc, Exform, VSAERO, STARS.
- In addition, DASA Airbus has integrated a special feature (shown right) that allows users to compile, link and run any Fortran job, even on pure batch servers without any interactive access. Later, the developed code can be integrated as a real application inside the MSC/Analysis Manager GUI.
- Note: The GUI changes its appearance according to the chosen application.

Submit CAE Jobs with MSC/Analysis Manager
- The user interface is the same for all applications!
- The user just has to choose the application by clicking the GROUP button, and the GUI is adjusted to the settings applicable to that application.
- The interface changes its appearance according to the organisation/department chosen.
- Usage for standard jobs is very easy: choose the input file (e.g. MSC/Nastran *.dat file), then click the "Apply" button.
- Note: In MSC/Patran this procedure is fully integrated.
What happens after clicking the "Apply" button?

What happens with an MSC/Nastran job after clicking the "Apply" button?
- The monitoring window appears on the screen.
- The MSC/Analysis Manager builds a single MSC/Nastran input file (merging included files).
- The MSC/Analysis Manager can generate MSC/Nastran FMS commands.
- The CAE job is submitted to the chosen LSF queue.
- LSF determines which workstation is able to run the job.
- Files needed for the job are copied to the execute host.
- The application program (e.g. MSC/Nastran) is executed.
- All generated (or specified) files are copied back to the submit host.
- Specifically for MSC/Nastran: the *.f06 file is searched for FATAL messages.
Note: Everything is done automatically; the user doesn't have to use any operating system command!
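The last step, searching the *.f06 file for FATAL messages, amounts to a line scan. A minimal Python sketch follows; it is not the MSC/Analysis Manager's own code, the function names are hypothetical, and matching on the bare word FATAL is a simplifying assumption.

```python
def find_fatal_messages(f06_text):
    """Return all lines of an .f06 listing that contain the word FATAL."""
    return [line.strip() for line in f06_text.splitlines() if "FATAL" in line]


def job_failed(f06_text):
    """True if the listing contains at least one FATAL message."""
    return bool(find_fatal_messages(f06_text))


def check_f06_file(path):
    """Read an .f06 file from disk and return its FATAL lines."""
    with open(path, "r", errors="replace") as handle:
        return find_fatal_messages(handle.read())
```

A monitoring front end could call check_f06_file() once the result files have been copied back to the submit host and flag the job if the returned list is not empty.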

Job Monitoring with MSC/Analysis Manager
- During execution, users can download some files to their desktop system.
- At the end of a job, the user is informed which files were received on the submit host.
- Each file can be opened directly with a mouse click.
- Important tasks and the MSC/Nastran *.log file are monitored; at the end of the job, the output is searched for MSC/Nastran FATAL messages.

Setting Up Job Chains inside MSC/Analysis Manager
- With the PRE and POST options, users can set up any job chain.
- All Unix commands, scripts and programs can be chained.
Shown here:
- transfer the .nastxxrc file to the execute host before execution of MSC/Nastran
- at the end of the MSC/Nastran run, generate an I_DEAS universal file on the execute host
- compress all generated files on the execute host (e.g. to reduce network traffic)
- uncompress all transferred files on the submit host
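The PRE/POST idea, running commands before and after the main application step, can be sketched as a small chain runner in Python. This is an illustration only: the function names are hypothetical, and stopping the chain at the first failing command is an assumption, not documented MSC/Analysis Manager behaviour.

```python
import subprocess


def run_command(command):
    """Run one shell command and return its exit code."""
    return subprocess.run(command, shell=True).returncode


def run_chain(pre, main, post, runner=run_command):
    """Run PRE commands, the main command, then POST commands in order.

    Stops at the first command that returns a non-zero exit code.
    Returns (list of commands that were started, exit code of the last one).
    """
    executed = []
    for command in list(pre) + [main] + list(post):
        executed.append(command)
        code = runner(command)
        if code != 0:
            return executed, code
    return executed, 0
```

Passing the runner as a parameter makes it possible to exercise the chain logic with a stand-in function, without starting any real process.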

Setting Up Specific Job Options inside MSC/Analysis Manager
  left:    MSC/Nastran and LSF memory parameters
  center:  MSC/Nastran restart options
  right:   submit time of the job

Future Enhancements (Requests) for the MSC/Analysis Manager
- Deeper integration of other CAE applications, comparable to that of MSC/Nastran (application-specific job handling)
- Better procedure for running jobs on local machines (no copying of files)
- Interaction with / integration of MSC/Estimate: choose the right host for the job; set up MSC/Nastran-specific job parameters automatically (e.g. memory/disk)
- Continue/enhance the partnership with Platform Computing, e.g. integrate LSF libraries with MSC/Nastran to make a job checkpointable, rerunnable and migratable inside the LSF environment

Conclusion
- With the combination of LSF and MSC/Analysis Manager, CAE batch jobs can be set up, scheduled and monitored very easily.
- In a heterogeneous Unix environment, the combination of both tools achieved at DASA Airbus: maximum performance for each batch job, maximum job throughput, and maximum usage of otherwise unused hardware resources (e.g. turnaround times of typical jobs were reduced by a factor of 6 compared with the mainframe).
- All workstations are coupled into a "virtual cluster"; the CAE user doesn't need special Unix know-how to use workstations other than "his own".