Distributed Monte Carlo Instrument Simulations at ISIS Tom Griffin, ISIS Facility & University of Manchester.



Introduction
- What is Distributed Computing
- The software we use
- VITESS Specifics
- McStas Specifics
- Conclusions

What do I mean by ‘Distributed Grid’?
- A way of speeding up large, compute-intensive tasks
- Break large jobs into smaller chunks
- Send these chunks out to (distributed) machines
- Distributed machines do the work
- Collate and merge the results
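
The split, distribute, and merge cycle listed above can be sketched in a few lines. This is an illustrative stand-in only: a local thread pool plays the role of the distributed machines, a toy Monte Carlo estimate of pi plays the role of the simulation, and the names `run_chunk` and `run_distributed` are hypothetical, not part of any grid product.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_chunk(n_samples, seed):
    """Toy Monte Carlo work unit: count random points inside the unit quarter circle."""
    rng = random.Random(seed)  # each chunk gets its own seeded generator
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def run_distributed(total_samples, n_chunks, base_seed=1):
    """Split one large job into chunks, run them in a pool, and merge the results."""
    base, extra = divmod(total_samples, n_chunks)
    sizes = [base + (1 if i < extra else 0) for i in range(n_chunks)]
    seeds = [base_seed + i for i in range(n_chunks)]
    # The thread pool stands in for the remote machines (assumption for illustration).
    with ThreadPoolExecutor(max_workers=4) as pool:
        hits = list(pool.map(run_chunk, sizes, seeds))
    # Merge step: the partial hit counts simply add up.
    return 4.0 * sum(hits) / total_samples


if __name__ == "__main__":
    print(run_distributed(200_000, 8))
```

Because every chunk carries its own seed, the merged answer is reproducible regardless of which worker finishes first.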

Spare Cycles Concept
- Typical PC usage is about 10%
- Most PCs not used at all after 5pm
- Even with ‘heavily used’ (Outlook, Word, IE) PCs, the CPU is still grossly underutilised
- Everyone wants a fast PC!
- Can we use (“steal?”) their unused CPU cycles?
- World Community Grid

Possible Software Implementations
- Toolkit, e.g. COSM
  - Low-level toolkit: source-code-level integration
  - So time-consuming work for each application
- Entropia DC Grid
  - Trial run at ISIS two years ago, with some success
  - Company bought out and in limbo (?)
- United Devices Grid MP
  - What we’re currently using
  - Quite expensive
- Condor
  - Free (academic research project)
  - In our experience two years ago, not reliable with Windows

The United Devices System
- Server hardware
  - We use two dual-Xeon servers
  - client licenses
  - Could (will) easily cope with more clients
- Software
  - Servers run RedHat Linux Advanced Server / DB2
  - Clients available for Windows, Linux, SPARCs and Macs
- Programming
  - MGSI: Web Services interface (XML, SOAP)
  - Accessed with C++ and Java classes etc.
- Management Console
  - Web browser based
  - Can manage services, jobs, devices etc.

Visual Introduction to the Grid

Suitable / Unsuitable Applications
- CPU intensive
- Low to moderate memory use
- Not too much file output
- Coarse grained
- Command line / batch driven
- Licensing issues?

Objects within the Grid
- Program: McStas
- Job: wish_simulation
- Jobstep
- Workunit: sent to a Device
- Data Set
- Data

How to write Grid Programs
1) Think about how to split your data and merge results
2) Wrap and upload your executable
3) Write the application service (pre- and post-processing)
4) Use the Grid
- Fairly easy to write
- Interface to grid via Web Services
- So far used: C++, Java, Perl, Fortran, C#

Wrapping Your Executable
- Executable + any DLLs etc.
- Standard data files
- Compression
- Encryption
- Capture screen output
- Set environment variables
- Command line

Application Service
Pre-processing
1) Partition data
2) Package data partitions
3) Log in to the Grid server
4) Create a Job and Job Step
5) Create a Data Set
6) Create Datas and upload data packages
7) Create Workunits
8) Set the Job running
Post-processing
1) Retrieve results
2) Merge results

Monte Carlo Speed-up Ideas
Two scenarios:
- Single large simulation run
  - Split the neutrons into smaller numbers and execute separately
  - Merge results in some way
- Many smaller runs
  - Parameter scan

VITESS – Splitting It
- Easy mode of operation: fixed executables + data files
- Executables held on server
- Split command line into bits: divide Ncount
- Vary the random seed
- Create data packages
- Upload data packages
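
Dividing Ncount and varying the random seed amounts to a small planning step before any packages are built. The sketch below is one possible way to do it, not the actual Submit program: `split_run` and `base_seed` are hypothetical names, and how each count and seed is written into a VITESS command line is deliberately left out.

```python
def split_run(total_neutrons, n_chunks, base_seed=1000):
    """Return one (neutron_count, seed) pair per chunk.

    Counts differ by at most one and always add up to total_neutrons;
    seeds are distinct so the chunks are not identical copies of each other.
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    base, extra = divmod(total_neutrons, n_chunks)
    plan = []
    for i in range(n_chunks):
        count = base + (1 if i < extra else 0)  # spread any remainder over the first chunks
        plan.append((count, base_seed + i))
    return plan


if __name__ == "__main__":
    plan = split_run(30_000_000, 20)
    print(len(plan), plan[0])
```

With 3e7 neutrons and 20 chunks, each chunk carries 1,500,000 neutrons.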

VITESS – Running It
- Use GUI to create instrument, then ‘Save As Command’
- “Parameter directory” set to “.”
- Submit program parses bat file
  - Substitutes ‘V’ and ‘P’
  - Removes ‘header’ and ‘footer’
  - Creates many new bat files with different ‘--Z’s and

VITESS – Running It
Submit program creates many bat files

    C:\My_GRID\VITESSE\VITESSE\build>Vitess-Submit.exe example_job example.bat req_files 20
    logging in to as tom....
    Adding Vitesse dataset....
    Adding Vitesse datas....
    3e+007 neutrons split into 20 chunks, of -n neutrons
    Total number of Vitesse 'runs' = 20
    Uploading data for run #1...
    Uploading data for run #2....
    Uploading data for run #19...
    Uploading data for run #20...
    Adding Vitesse datas to system....
    Adding job....
    Adding jobstep....
    Turning on automatic workunit generation....
    Closing jobstep....
    All done
    Your job_id is 4878

VITESS – Monitoring It
- Web Interface

VITESS – Merging It
- Download the ‘chunks’
- Merge data files
  - DetectedNeutrons.dat: concatenate
  - vpipes: trajectories & count rate
- Two classes of files
  - 1D
    - Values: sum and divide by number of chunks
    - Errors: square, sum and divide
  - 2D: sum / number of chunks
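
The 1D and 2D merge rules above translate into a short averaging routine. This is a sketch under stated assumptions rather than the actual merge program: the in-memory data layout and the function names are hypothetical, and "square, sum and divide" is read here as adding errors in quadrature (with a square root) before dividing by the number of chunks.

```python
import math


def merge_1d(chunks):
    """Merge 1D results; each chunk is a list of (value, error) pairs, one per bin."""
    n = len(chunks)
    merged = []
    for bins in zip(*chunks):  # bins holds the same bin taken from every chunk
        value = sum(v for v, _ in bins) / n
        # Assumption: errors add in quadrature, then scale by 1/n like the values.
        error = math.sqrt(sum(e * e for _, e in bins)) / n
        merged.append((value, error))
    return merged


def merge_2d(chunks):
    """Merge 2D results; each chunk is a list of rows of values."""
    n = len(chunks)
    return [
        [sum(cells) / n for cells in zip(*rows)]  # same cell from every chunk
        for rows in zip(*chunks)                  # same row from every chunk
    ]
```

All chunks are assumed to share the same binning; mismatched shapes would be silently truncated by `zip`, so a real merge tool would want to check that first.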

VITESS – Advantages and Problems
- Many times faster: linear increase
- Needs verification runs (x3)
- Typically 11, (potentially) 30+ times faster
- 12-hour runs in 1 hour!
- Very large simulations reach random limits

VITESS – Some Results
[Figure: run times of 176 hours, 59 hours, and 6 hrs 20 mins]

McStas – Splitting It
- Different executable for every run
- Executable must be uploaded at run time
- Split –n into chunks or run many instances (parameter scan)
- Create data (+ executable) packages
- Upload packages

McStas – Running It
- Use McGui to create and compile executable
- Create input file for Submit program

McStas – Running It
- Large run
  - Submit program breaks up –n#####
  - Uploads new command line + data + executable
- Parameter scan
  - Send each run to a separate machine
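
The two modes above differ only in what changes between command lines: the neutron count and seed for a large run, or one instrument parameter for a scan. The sketch below builds such command lines as argument lists. It assumes the compiled instrument accepts the usual McStas options `--ncount`, `--seed` and `--dir` plus `name=value` parameters; the function names, output directory names, and any executable or parameter names passed in are hypothetical.

```python
def large_run_commands(executable, ncount, n_chunks, params, base_seed=1):
    """Split one large run into n_chunks command lines with smaller --ncount values."""
    base, extra = divmod(ncount, n_chunks)
    param_args = [f"{name}={value}" for name, value in params.items()]
    commands = []
    for i in range(n_chunks):
        chunk = base + (1 if i < extra else 0)
        commands.append(
            [executable, f"--ncount={chunk}", f"--seed={base_seed + i}",
             f"--dir=chunk_{i:03d}"] + param_args
        )
    return commands


def scan_commands(executable, ncount, scan_name, scan_values, fixed_params):
    """Parameter scan: one full-size run per value of the scanned parameter."""
    commands = []
    for i, value in enumerate(scan_values):
        # The scanned value overrides or extends the fixed parameters.
        params = dict(fixed_params, **{scan_name: value})
        param_args = [f"{name}={v}" for name, v in params.items()]
        commands.append(
            [executable, f"--ncount={ncount}", f"--dir=scan_{i:03d}"] + param_args
        )
    return commands
```

Each returned list corresponds to one workunit; in the scan case every run keeps the full neutron count and is simply sent to a different machine.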

McStas – Merging It
- Many output files, so a separate merge program
- PGPLOT and Matlab implemented; very similar
- PGPLOT
  - 1D: intensities: sum and divide. Errors: square, sum and divide. Events: sum
  - 2D: intensities: sum and divide. Errors: square, sum and divide. Events: sum
- Matlab
  - 1D: same maths, different format
  - 2D: virtually the same
- ‘Metadata’: leave untouched
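
Written as formulas for K chunks, the merge rules above might read as follows, applied per bin. The symbols are introduced here for illustration, and the square root on the error term is an assumption: the slide only says "square, sum and divide", and the form shown corresponds to combining independent chunk errors in quadrature.

```latex
I = \frac{1}{K}\sum_{k=1}^{K} I_k, \qquad
\sigma = \frac{1}{K}\sqrt{\sum_{k=1}^{K} \sigma_k^{2}}, \qquad
N = \sum_{k=1}^{K} N_k
```

Here I_k, sigma_k and N_k are the intensity, error and event count reported by chunk k.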

McStas – Advantages and Problems
- Security: do we trust users?
- 100 times faster [?]
- Linux version much faster than Windows [?]
- How do we merge certain fields?
    values = ' e ';
    statistics = 'X0=3.5418; dX= ; Y0= ; dY=1.0288;';
- Some issue related to randomness of moderator file

Future Developments - Expansion
- Expansion: proposal accepted for an additional 400 licenses, giving us a total of 480
- Change in licensing model

Bottom Line: Costs
- Status categories: Completed, Funded, Seeking funding
- Amounts: $50k, $45k, $50k, $83k
- Setup, server licenses, 80 client licenses + support – $18k – CMSD
- Total ≈ $250k

Conclusions
- Both run well under Grid MP
- Submit & Retrieve: a few hours’ work
- Merge: a bit more
- Needs to merge more output formats [?]
- Issues with very large simulations
- More info on Grid MP at