Wenjing Wu Computer Center, Institute of High Energy Physics Chinese Academy of Sciences, Beijing 2015-10-17 BOINC workshop 2013.

Presentation transcript:

Wenjing Wu, Computer Center, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing. BOINC workshop

Outline
The project
Applications:
– Lammps: molecular dynamics simulation
– treeThreader: protein structure prediction
Remote Job Submission

First and only volunteer computing project in mainland China
Launched in June 2010, hosted by the Computer Center of IHEP, CAS
Supports scientific computing from the Chinese Academy of Sciences and other research institutes
Hosts multiple applications from various research fields, including nanotechnology, bioinformatics, and physics

Status
Since the launch in June 2010:
– 10K active users, 1/3 of them Chinese
– 23K active hosts
– 7M CPU hours since Nov 2012
– Hosting 3 applications: Lammps, treeThreader, Aevol
– Other ongoing applications: BOSS (VBoxwrapper based)
– 1.3 TFLOPS (real-time computing power)
– Peak: 1M validated CPU hours per month

Some project Statistics

Application 1: Lammps
Software for molecular dynamics simulation, widely used by scientists in various research fields
Restartable; developed in C by an international group; can be compiled on both Windows and Linux with some effort
Input/output: 3 mandatory input files (<10 MB) / 1 compressed output file (hundreds of MB)
Running time: 0.5 hours to 800 hours (it depends on a random number that determines the number of simulation steps)

Problems
Results are numerical, and they differ between runs for 2 reasons:
– floating-point calculation behaves differently on different platforms
– checkpoints also cause discrepancies, because precision is lost when values are printed to a text file
Solutions:
– Homogeneous Redundancy, or Homogeneous Application Version
Running problems:
– Some long jobs (hundreds of hours) crash in the middle without getting any credit
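The checkpoint problem can be illustrated with a few lines of Python. This is only a sketch of the general effect: the functions `checkpoint_text` and `restore` are hypothetical and do not reflect how Lammps actually writes its restart data.

```python
# Illustration only: shows how printing a float with few digits loses bits.
x = 0.1 + 0.2  # stands in for a value produced mid-simulation


def checkpoint_text(value: float, digits: int) -> str:
    """Format a value as a text checkpoint with a fixed number of digits."""
    return f"{value:.{digits}e}"


def restore(text: str) -> float:
    """Read the value back from the text checkpoint."""
    return float(text)


lossy = restore(checkpoint_text(x, 6))  # 6 digits after the decimal point
exact = restore(repr(x))                # repr() emits enough digits to round-trip

print(lossy == x)  # False: the restarted run continues from a different value
print(exact == x)  # True
```

A host that restarts from the lossy checkpoint continues from a slightly different state than a host that never stopped, so two replicas of the same job can legitimately disagree.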

Application 2: treeThreader
For protein structure prediction
Written in C by local scientists; compiles easily on both Windows and Linux; restartable
Computing task: compare a protein sequence file against all existing protein templates
Input files: configuration files, a protein sequence file, ~50k protein templates (about 4 GB)
Output files: one text file per template file
It needs about 42 GFLOPS/hour to compare one sequence file against all templates

Computing task on 1 host
A protein sequence is compared against Protein Template 1, Protein Template 2, Protein Template 3, …, Protein Template 50,000
Each comparison takes 6 s
It takes about 84 hours on a single core
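The running-time figures can be checked with simple arithmetic. The sketch below uses the numbers quoted on the slides (50,000 templates, 6 s per comparison, and the 1,500 templates per sub package shown on the following slide); the variable names are just for illustration.

```python
TEMPLATES = 50_000
SECONDS_PER_COMPARISON = 6
TEMPLATES_PER_SUB_PACKAGE = 1_500

# Whole task on one core: every template is compared in sequence.
single_core_hours = TEMPLATES * SECONDS_PER_COMPARISON / 3600

# One sub package on one host.
sub_package_seconds = TEMPLATES_PER_SUB_PACKAGE * SECONDS_PER_COMPARISON
sub_package_hours = sub_package_seconds / 3600

print(f"{single_core_hours:.1f} hours on a single core")  # 83.3, i.e. about 84
print(f"{sub_package_seconds} s = {sub_package_hours} hours per sub package")
```

Splitting the templates into sub packages is what brings the wall-clock time per job down from days to a couple of hours.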

Running it on BOINC
A protein sequence is sent to hosts that each hold sub packages of templates as sticky files
Host A1: Sub Package 1 (sticky file): Protein Template 1, Protein Template 2, …, Protein Template 1500
Host A2: Sub Package 2 (sticky file): Protein Template 1501, Protein Template 1502, …, Protein Template 3000
Host Am: Sub Package 32 (sticky file)
Host An: Sub Package 14, Sub Package 15, Sub Package 16 (sticky files)
Each comparison takes 6 s; each sub package takes 9000 s (2.5 hours) on a host
Locality scheduling (the job goes to where the data is)
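The idea behind locality scheduling can be sketched in a few lines. This is not BOINC's scheduler code; `pick_host` and the data structures are hypothetical, and the host names are taken from the diagram.

```python
from typing import Dict, List, Optional, Set


def pick_host(job_package: int,
              sticky: Dict[str, Set[int]],
              idle: List[str]) -> Optional[str]:
    """Return an idle host that already stores the job's sub package, if any."""
    for host in idle:
        if job_package in sticky.get(host, set()):
            return host
    return None  # no idle host holds the data yet


# Which sub packages each host keeps as sticky files.
sticky_files = {"A1": {1}, "A2": {2}, "An": {14, 15, 16}}
idle_hosts = ["A1", "A2", "An"]

assignments = {pkg: pick_host(pkg, sticky_files, idle_hosts)
               for pkg in (1, 2, 15, 32)}
print(assignments)  # {1: 'A1', 2: 'A2', 15: 'An', 32: None}
```

A job for sub package 32 finds no match here because the host holding it is not idle in this example; a real scheduler would then either wait or ship the sub package to another host.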

Problems
Long tail batches
– A front-end server submits batches and does the pre-processing and post-processing of the sequences, so it can only maintain/watch a limited number (300) of active batches (batches in progress) in parallel
– A whole batch is delayed by its slowest job
– No new batches are submitted to the BOINC server while that many batches are still "in progress" (waiting for their slowest jobs)
– A lot of hosts end up "starving"
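A small sketch makes the long-tail effect concrete. The job durations are made up for illustration; only the limit of 300 active batches comes from the slide.

```python
from typing import List


def batch_turnaround(job_hours: List[float]) -> float:
    """A batch finishes only when its slowest job finishes."""
    return max(job_hours)


def free_slots(active_batches: int, max_active: int = 300) -> int:
    """Slots the front end still has for submitting new batches."""
    return max(0, max_active - active_batches)


# Hypothetical batch: 31 jobs take 2.5 hours, one straggler takes 40 hours.
jobs = [2.5] * 31 + [40.0]

print(batch_turnaround(jobs))  # 40.0: the batch waits for the straggler
print(free_slots(300))         # 0: no new batches, so idle hosts starve
```

While 300 such batches each wait for a single straggler, almost all hosts have nothing to do even though more sequences are queued at the front end.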

Remote Job Submission
The project hosts multiple applications
Each application has multiple users
Application users have no privileges to submit jobs on the server directly
This requires remote job submission, which allows authenticated and authorized users to submit jobs from remote machines
Basic remote job submission functions: batch submit / check status / retire / abort / download results
BOINC provides a fairly rich set of APIs for remote batch operations (a batch is a set of jobs based on the same input files), but each application still needs its own server-side CGI code and client-side code for remote job submission
– Some operations (batch retire/abort/status check) are generic and can use the BOINC API directly
– Other operations, such as batch submission and result downloading, are application specific and need to be customized
– Extra functions such as "test running" and "estimate running time" can be added

Lammps Job Submission
Jobs are created in batches
A batch = 1 set of input files + different parameter-value pairs
A batch comprises hundreds to thousands of jobs
Remote job submission:
– Batches are submitted through a web portal by authenticated and authorized users
– Authenticated and authorized users can "operate" on their batches through the web portal (retire, abort, check status, download results)
Example: Batch A (input file 1, input file 2)
Job 1: Ka1=Va1 Kb1=Vb1
Job 2: Ka2=Va2 Kb2=Vb2
…
Job N: KaN=VaN KbN=VbN
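The batch structure described on this slide can be sketched as follows. The function `make_batch`, the dictionary layout, and the file names are assumptions made for illustration; they are not the project's actual CGI code.

```python
from typing import Dict, List


def make_batch(input_files: List[str],
               param_sets: List[Dict[str, str]]) -> List[dict]:
    """Create one job per parameter set; all jobs share the same input files."""
    return [
        {"job": number, "input_files": list(input_files), "params": dict(params)}
        for number, params in enumerate(param_sets, start=1)
    ]


# Batch A: two shared input files, one parameter-value set per job.
batch_a = make_batch(
    ["input_file_1", "input_file_2"],
    [{"Ka": "Va1", "Kb": "Vb1"}, {"Ka": "Va2", "Kb": "Vb2"}],
)
for job in batch_a:
    print(job["job"], job["params"])
```

Because the input files are shared, only the small parameter lists differ between jobs, which keeps a batch of thousands of jobs cheap to describe.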

LAMMPS CAS User Interface: File Sandbox, Test a Job, Submit a Batch, Check Batch Status, Get Output
LAMMPS CGI, File Sandbox Service
Job1: Para List, Value List1
Job2: Para List, Value List2
Job3: Para List, Value List3
…
JobN: Para List, Value ListN

[Diagram: Lammps job submission workflow]
User, through the Web Portal (http): test a job with chosen input files, submit a batch
Sandbox: File1, File2
LAMMPS CGI on server: Job Tester (syntax check, GFLOPS, output size estimation), Batch Creator (after the test is passed), Batch Monitor, Job Monitor
Batch Operations: Abort/Retire a batch, Download Results (Zip Results)
Volunteer Hosts

BOINC Sandbox
The same file cannot be uploaded twice
Files used by a running batch cannot be deleted

Lammps Job Testing
Test the job on the server
Submit the batch
Lammps specific!

Batch Monitoring
Admin can see the status of all batches
Batch status: In process, Completed, Aborted, Retired

Admin: all batches

Job Status
Input files associated with this job
Results can be downloaded individually

Batch Operations
Download results of this batch
Retire a batch
Download results of a work unit
An unfinished batch can be aborted here

TreeThreader Job Submission
Jobs are created in batches: 1 protein sequence corresponds to 1 batch (32 jobs)
Remote job submission:
– Client side: a set of PHP APIs allows authenticated and authorized users to submit batches and operate on them remotely (check status, retire, abort, get output)
– Server side: generic operations such as batch abort/retire/status check are already included in the BOINC code; operations such as batch submission and result downloading are application specific and are implemented in a CGI program on the server

TreeThreader Job Submission CGI
Batch submission
– Takes the sequence and configuration files uploaded by the client
– Creates a batch of jobs based on the input files and all template files, which are already stored on the server
– Returns a batch ID
Batch result downloading
– Uncompresses all output files of the batch
– Puts the uncompressed output files into one directory and compresses it
– Returns the download URL of the batch result file
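The result-merging step could look roughly like the sketch below. It is an assumption-laden illustration: the real CGI program is not shown in the slides, the archive format is assumed to be zip, and `merge_results` and all file names are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import List


def merge_results(archives: List[Path], merged: Path) -> List[str]:
    """Unpack every job archive into one directory, then pack that directory."""
    with tempfile.TemporaryDirectory() as tmp:
        out_dir = Path(tmp)
        for archive in archives:
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(out_dir)
        names = sorted(p.name for p in out_dir.iterdir())
        with zipfile.ZipFile(merged, "w", zipfile.ZIP_DEFLATED) as zf:
            for name in names:
                zf.write(out_dir / name, arcname=name)
    return names


# Demo with three tiny per-job archives created on the fly.
work = Path(tempfile.mkdtemp())
archives = []
for i in (1, 2, 3):
    path = work / f"job{i}.zip"
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr(f"template_{i}.txt", f"score for template {i}\n")
    archives.append(path)

merged_names = merge_results(archives, work / "batch_result.zip")
print(merged_names)  # ['template_1.txt', 'template_2.txt', 'template_3.txt']
```

The server would then return a URL pointing at the merged archive instead of making the client fetch 32 separate job outputs.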

TreeThreader Job Submission
[Diagram] ICT Web Services, API: Submit a sequence, Status Check, Get Output
TreeThreader CGI: Sequence; Template P1, Template P2, Template P3, Template P4, …, Template P32; Merged Results

Thoughts on a more generic job submission interface
The server side still requires application-specific functions to create batches, merge results, test jobs, and estimate running time
On the client side, the job submission and result downloading functions can be generalized
Use an XML file on the client side to describe the input files and their types

Input file types in the XML description:
0 = upload: the file needs to be uploaded to the BOINC server
1 = online: the file is already stored on the BOINC server
Example: MySEQ.tar.gz has type 0, Templates has type 1
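One way such a description could be written and read is sketched below. The element names (`input_files`, `file`, `type`, `name`) are invented for this example, since the original markup did not survive in the transcript; only the type codes and the two file names come from the slide.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical XML layout; tag names are assumptions, not the project's schema.
DESCRIPTION = """
<input_files>
  <file><type>0</type><name>MySEQ.tar.gz</name></file>
  <file><type>1</type><name>Templates</name></file>
</input_files>
"""

TYPE_MEANING = {"0": "upload", "1": "online"}


def files_to_upload(xml_text: str) -> List[str]:
    """Return the names of files the client still has to send to the server."""
    root = ET.fromstring(xml_text)
    return [
        entry.findtext("name")
        for entry in root.iter("file")
        if TYPE_MEANING[entry.findtext("type")] == "upload"
    ]


print(files_to_upload(DESCRIPTION))  # ['MySEQ.tar.gz']
```

With a description like this, a generic client only needs to upload type 0 files and can refer to type 1 files by name, regardless of the application.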

The End!