The CIPRES Science Gateway: Enabling High-Impact Science for Phylogenetics Researchers with Limited Resources Mark Miller, Wayne Pfeiffer, and Terri Schwartz San Diego Supercomputer Center
Phylogenetics is the study of the diversification of life on Earth, past and present, and the relationships among living things through time.
Evolutionary relationships can be inferred from DNA sequence comparisons:
1. Align sequences to determine evolutionary equivalence.
2. Infer evolutionary relationships based on some set of assumptions.
Inferring evolutionary relationships from DNA sequence comparisons is powerful:
DNA sequences are determined by fully automated procedures.
Sequence data can be gathered from many species at scales from gene to whole genome.
The high speed and low cost of next-generation sequencing mean new levels of sensitivity and resolution can be obtained.
The speed of sequencing is still increasing, while the cost of sequencing is decreasing.
Inferring evolutionary relationships from DNA sequence comparisons is powerful, BUT:
Current analyses often involve thousands of species and thousands of characters, creating very large matrices.
Sequence alignment and tree inference are NP-hard; even with heuristics, computational power already often limits the analyses.
With the codes in current use, the length of a tree search scales exponentially with the number of taxa and the number of characters.
There are at least 10⁷ species, each with thousands of genes, so the need for computational power and new approaches will continue to grow.
In this new, DNA sequence-rich world, laptops and desktops are no longer adequate for phylogenetic analysis….
The CIPRES Portal was created to allow users to analyze large sequence data sets using popular community codes on a significant computational resource. The CIPRES Portal provided:
Login-protected personal user space for storing results indefinitely.
Access to most/all native command line options for each code.
Support for adding new tools and new versions as needed.
Workflow for the CIPRES Portal:
Assemble Sequences → Upload to Portal → Run Alignment → Store → Run Tree Inference → Post-Tree Analysis → Download
Limitations of the original CIPRES Portal
All jobs were run serially (efficient, but no gain in wall time).
The cluster was modest (16 × 8-way dual-core nodes).
Runs were limited to 72 hours.
The cluster was at the end of its useful lifetime.
Funding for the project was ending.
Demand for job runs was increasing.
The solution: make parallel versions of community codes available on scalable, sustainable resources via the Science Gateway Program. (Diagram: the Workbench Framework routes parallel codes to TeraGrid/XSEDE and serial codes to Triton, replacing the original CIPRES cluster.)
Greater than 90% of all computational time was used for three tree inference codes: MrBayes, RAxML, and GARLI.
Deploy parallel versions of these codes on TeraGrid machines, initially using Globus/GRAM.
Work with community developers to improve the speed-up available through the parallel codes offered by the CSG.
Add new parallel codes (e.g., MAFFT) as they appear in the community.
Keep other serial codes on local SDSC resources that provide the project with fee-for-service cycles.
Parallel code profiles on Trestles
(Chart: speedups of roughly 19× and 36× for the parallel codes.)
Use of the new Gateway exceeded all expectations: the initial allocation was greater than the capacity of the original cluster!
Given this high level of consumption, how do we make sure resource usage is efficient?
Job Attrition on the CIPRES Science Gateway*
*March – August 2010
Error impact analysis
Prevent Job Loss When Contact with the Resource is Lost
To make the system robust to outages, jobs in the DB are tracked in a “running jobs table.” When a job completes, results are transferred to the DB, and the job is marked complete. Every 24 hours, a daemon looks for results for jobs that are still incomplete; any results found are transferred to the DB, and the job is marked complete. If there is a service interruption while a job is running, the job results will still be automatically delivered to the user.
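The recovery sweep described above can be sketched as follows. This is a minimal illustration only: the class and function names (JobStore, sweep, fetch_results) are invented for the sketch, and the actual gateway is implemented in Java against a MySQL database.

```python
class JobStore:
    """Stands in for the gateway's 'running jobs table'."""
    def __init__(self):
        self.jobs = {}  # job_id -> {"complete": bool, "results": ...}

    def add(self, job_id):
        self.jobs[job_id] = {"complete": False, "results": None}

    def mark_complete(self, job_id, results):
        # Results are transferred to the DB and the job is marked complete.
        self.jobs[job_id].update(complete=True, results=results)

    def incomplete(self):
        return [j for j, rec in self.jobs.items() if not rec["complete"]]


def sweep(store, fetch_results):
    """The daily daemon pass: for every job still marked incomplete,
    poll the compute resource for finished output that was never
    delivered (e.g. because of a service interruption), and deliver it."""
    recovered = []
    for job_id in store.incomplete():
        results = fetch_results(job_id)   # returns None if not finished yet
        if results is not None:
            store.mark_complete(job_id, results)
            recovered.append(job_id)
    return recovered
```

The key design point is that delivery is driven by the durable table, not by the job process itself, so a crash between job completion and result transfer is repaired on the next sweep.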
Monitor Submissions/Usage to Track Efficiency
(Chart: SU/job increases; usage 12/2009 – 4/2012.)
MrBayes jobs (20%) fail when the Lustre file system is under load
Long-running jobs fail even when there is no other traffic on the system, and users immediately resubmit, driving SU use up. The failures are due to resource contention in the Lustre file system: the code cannot write its output files, and jobs fail. Moving to a mounted ZFS system eliminated the problem.
(Chart: usage 12/2009 – 4/2012 after moving off Lustre.)
With such high consumption, we must track usage by individuals*
SUs          % of users    % of total SUs
0 – 30K      97            45
30 – 300K    3             55
*Reporting period: Sept 2010 – May 2011
We need to monitor individual users, because we want all XSEDE users to be subject to the same level of peer review.
Establish a Fair Use Policy
Anyone, anywhere can sign up for an account. An account is not required to submit a job.
Users at US institutions are permitted to use 50,000 SUs from the community allocation annually.
Users at institutions in other countries can use up to 30,000 SUs annually.
Users can apply for a personal XSEDE allocation if they require more SUs.
Tools required to implement the CIPRES SG Fair Use Policy:
ability to halt submissions from a given user account
ability to monitor usage by each account automatically
ability for users to track their SU consumption
ability to forecast the SU cost of a job for users
ability to charge to a user’s personal XSEDE allocation
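The first two abilities in the list reduce to a simple gate at submission time. A minimal sketch, using the 50,000/30,000 SU annual caps from the policy above; the function name and parameters are assumptions for illustration, not the gateway's actual code:

```python
# Annual caps from the CIPRES fair use policy.
US_LIMIT = 50_000    # SUs per year, users at US institutions
INTL_LIMIT = 30_000  # SUs per year, users at institutions elsewhere

def can_submit(used_sus, estimated_cost, us_institution=True):
    """Halt a submission if the forecast SU cost would push the account
    past its annual cap; such users are pointed at a personal XSEDE
    allocation instead."""
    limit = US_LIMIT if us_institution else INTL_LIMIT
    return used_sus + estimated_cost <= limit
```

Checking the forecast cost (rather than only the consumed total) is what lets the gateway refuse a job before it burns SUs the account does not have.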
Post usage to the user’s work area
(Diagram: job ids and SU charges flow nightly between the XDB and CDB databases; usage reports are posted to the user’s work area and sent as user/management notifications.)
Help users track their resource consumption:
Notify users of their usage level
Create a conditional “warning” element in the interface XML
Steps required to use a personal allocation:
1. User receives a personal allocation from the XRAC.
2. The PI of the allocation adds the “cipres” user to their account.
3. CSG staff change the user profile to charge to the personal allocation account id.
To date, 3 users have completed this process.
Impact of Policy on Usage, Dec 2009 – April 2012
When the Lustre file system is not used, submissions and SU usage grow linearly, with about 29,000 more SUs requested each month. Projected use is 13.4 million SUs.
Each month, 12 more users submit 160 more jobs; growth in usage is driven by new users.
What works about Trestles:
The Trestles machine is managed to keep queue depth low. This is a key requirement for many of our users who run a lot of relatively short jobs, and for class instruction. Run times of up to 334 hours are allowed. This is important because most of the CSG codes do not have restart capability and scalability is typically limited to 64 cores or less.
Impact on Scientific Productivity:
Publications enabled by the CIPRES Science Gateway/CIPRES Portal:
Year    Number
2012*   106
*As of June 1, 2012
Publications in the pipeline:
Status          Number
In preparation  91
In review       25
Impact on Scientific Productivity:
In Q1 2012, 29% of all XSEDE users who ran jobs ran them from the CSG.
50% of users said they had no access to local resources, nor funds to purchase access on cloud computing resources.
Used for curriculum delivery by at least 68 instructors.
Jobs run for researchers in 23/29 EPSCoR states.
Routine submissions from Harvard, Berkeley, Stanford…
76% of users are in the US or have a collaborator in the US.
Impact on Scientific Productivity:
“It is hard for me to imagine how I could work at a reasonable pace without this resource, especially when things like MS or grant submission deadlines loom….”
Impact on Education: “It is an easy-to-use cluster to run BEAST analyses in a short time. This allows students to run analyses that actually converge in a single class.” “I found it is important to be able to let the student explore the analysis 'all the way', i.e. not just show the principle but actually let them run an entire Markov chain and let them evaluate the results. For that I found that having access to the CIPRES Science Gateway to be crucial.”
Seven Success Strategies for Gateways:
1. Identify a user base that cannot do their work without HPC access
2. Focus on providing the key software elements users require
3. Provide scalable access to the best code versions available
4. Make efficient use of resources/user time/keystrokes
5. Provide easy access with adult supervision
6. Provide fast job turnaround on resources appropriate for the workflow
7. Relentless commitment to customer service
Acknowledgements:
CIPRES Science Gateway: Terri Schwartz
Hybrid Code Development: Wayne Pfeiffer, Alexandros Stamatakis
XSEDE Implementation Support: Nancy Wilkins-Diehr, Doru Marcusiu, Leo Carson
XSEDE System Support: Mahidhar Tatineni, Rick Wagner
Workbench Framework: Terri Schwartz, Paul Hoover, Lucie Chan, Jeremy Carver
Next Steps: Expand the accessibility of CIPRES functionalities by exposing ReST services. Expand the number of parallel codes available. Expand the number of computational resources available.
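The planned ReST services would let external tools drive CIPRES over HTTP. A tiny sketch of what a client-side request builder might look like; the endpoint path, parameter names, and payload layout below are purely illustrative assumptions, since the slides do not specify the API:

```python
def build_submit_request(base_url, tool, input_file, params):
    """Assemble the (url, payload) pair for a hypothetical job-submission
    call to a CIPRES-style REST service. Nothing here is the real API;
    it only shows the shape such a client could take."""
    url = f"{base_url.rstrip('/')}/v1/job"
    payload = {
        "tool": tool,          # e.g. a RAxML or MrBayes tool id
        "input": input_file,   # alignment file to upload
        # tool-specific options, namespaced so they can't collide
        **{f"vparam.{k}": v for k, v in params.items()},
    }
    return url, payload
```

Separating request construction from transport like this is what makes it easy to embed the same submission logic in desktop tools such as raxmlGUI, as the later slides propose.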
Next Steps: iPlant as Prototype Partner for REST Services
(Diagram: iPDE and iPDB connect to CSG parallel codes on XSEDE; planned services include database access, NGS sequencing, ancestral character estimation, tree reconciliation, taxonomic name resolution, and phylogenetic workflows.)
Workflow for the CIPRES Gateway:
Assemble Sequences → Upload to Portal → Run Alignment → Store → Run Tree Inference → Post-Tree Analysis → Download
REST Services will put CIPRES in many environments
(Diagram: external clients such as raxmlGUI call the CSG’s parallel codes on XSEDE through REST services.)
Next Steps: Expand the Number of Parallel Codes Available
MrBayes 3.2: MPI → hybrid parallel code
PhyloBayes: serial → parallel code
IMa/IMa2: serial → parallel code
LAMARC: serial → parallel code
Next Steps: Expand the Resources for all users
At 10 million SUs, the CSG allocation amounts to only about 0.7% of allocatable XSEDE resources.
BUT……
Next Steps: Expand the Resources for all users
(Diagram: the CSG sends parallel jobs to Trestles and serial jobs to Triton.) 10 million SUs on Trestles (2011/2012) = 16% of the allocatable Trestles machine, with projected growth to 21% of Trestles. We need to expand to more machines!
Next Steps: Expand the Resources for all users
(Diagram: parallel jobs spread across Stampede and others, Trestles (8%), and Gordon (2%); serial jobs go to OSG; clients include raxmlGUI.)
Presentation layer is based on Java Struts2
Uses 2 Java classes to access the Core
We currently use only a web client.
The Workbench Framework (Java) deploys generic “tasks”….
….and queries generic DBs
Specific information is coded in a Central Registry
User information, data, and job runs are stored in a MySQL database
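To make the storage model concrete, here is an illustrative sketch of the kind of tables involved. It uses SQLite (for a self-contained example) rather than the MySQL database actually used, and every table and column name is invented for illustration:

```python
import sqlite3

# In-memory stand-in for the gateway's MySQL database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id  INTEGER PRIMARY KEY,
    username TEXT UNIQUE
);
CREATE TABLE user_data (                     -- uploaded files / results
    item_id  INTEGER PRIMARY KEY,
    user_id  INTEGER REFERENCES users,
    filename TEXT
);
CREATE TABLE jobs (                          -- one row per job run
    job_id   INTEGER PRIMARY KEY,
    user_id  INTEGER REFERENCES users,
    tool     TEXT,
    status   TEXT
);
""")

# A user uploads data and submits a job.
conn.execute("INSERT INTO users (username) VALUES ('alice')")
uid = conn.execute("SELECT user_id FROM users WHERE username='alice'").fetchone()[0]
conn.execute("INSERT INTO user_data (user_id, filename) VALUES (?, 'aln.phy')", (uid,))
conn.execute("INSERT INTO jobs (user_id, tool, status) VALUES (?, 'RAxML', 'SUBMITTED')", (uid,))
```

Keying both data items and job runs to a user id is what lets the gateway offer login-protected personal storage and per-account usage tracking from the same store.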
Tasks and queries are sent to remote machines and DBs
An XML standard is used to create forms….
<?xml version="1.0" encoding="ISO…"?>
<!DOCTYPE pise SYSTEM "…">
<!-- the interface was modified by mamiller to accommodate submission of jobs to both trestles and abe -->
<pise>
  <head> </head>
  <title>RAxML-HPC2 on TG</title>
  <version>7.2.8</version>
  <description>Phylogenetic tree inference using maximum likelihood/rapid bootstrapping run on teragrid. (beta interface)</description>
  <authors>Alexandros Stamatakis</authors>
  <reference>Stamatakis A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 2006 Nov 1;22(21):2688-2690.</reference>
  <category>Phylogeny / Alignment</category>
  <doclink>…</doclink>
  <command>raxmlhpc2_tgb</command>
  <parameters>
    <!-- Start -N … -->
    <parameter ishidden="1" type="String">
      <name>raxmlhpc_hybridlogic2</name>
      <attributes>
        <format>
          <language>perl</language>
          <code>"raxmlHPC-HYBRID -T 6"</code>
        </format>
        <precond>…</precond>
        <!-- If -N nnn is specified with nnn < 50, run the hybrid parallel
             version of RAxML on a single node of Trestles … -->
      </attributes>
    </parameter>
    …
  </parameters>
</pise>
User-entered values are then rendered into command lines and delivered with input files to the compute resource file system….
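The rendering step can be illustrated with a toy function. In the real framework the hidden parameters carry Perl code fragments like the one in the XML above; here each visible parameter is just a (flag, name) pair, and every name is an assumption made for the sketch:

```python
def render_command(command, parameters, values):
    """Render a command line from a form definition and user entries.

    command    -- base executable, e.g. "raxmlHPC-HYBRID"
    parameters -- ordered list of (command-line flag, parameter name)
    values     -- dict of user-entered values, keyed by parameter name
    Parameters the user left blank are simply omitted, mirroring how
    optional form fields map to optional flags."""
    parts = [command]
    for flag, name in parameters:
        if name in values:
            parts.append(f"{flag} {values[name]}")
    return " ".join(parts)
```

The point of the XML layer is exactly this separation: the form definition fixes which flags exist and in what order, while the user supplies only values.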
A key feature for adapting the architecture for use on XSEDE resources was the creation of pluggable job submission and monitoring interfaces and pluggable file system interfaces. Job input and output can be stored in the local server file system or transferred via SFTP or GridFTP to the compute resource file system. Jobs can be submitted and monitored by communicating with a local cluster’s scheduler (e.g., PBS, LSF, or SGE), or with a remote public cluster via DRMAA, GRAM, or GSISSH. CSG submissions to XSEDE resources use GridFTP and GSISSH. A pluggable job submission and monitoring module distributes jobs to local and remote resources.
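The pluggable-interface idea can be sketched with an abstract base class and two stub plug-ins. The real framework is Java and actually drives qsub/GSISSH; the class and method names here are assumptions, and the stubs only return markers instead of contacting a scheduler:

```python
from abc import ABC, abstractmethod

class JobSubmitter(ABC):
    """The pluggable submission/monitoring interface: one implementation
    per transport, all sharing the same two operations."""
    @abstractmethod
    def submit(self, script_path): ...
    @abstractmethod
    def status(self, remote_id): ...

class LocalPBSSubmitter(JobSubmitter):
    """Would talk to a local cluster scheduler such as PBS (qsub/qstat)."""
    def submit(self, script_path):
        return f"pbs:{script_path}"      # stand-in for a PBS job id
    def status(self, remote_id):
        return "QUEUED"

class GsisshSubmitter(JobSubmitter):
    """Would run the submit command on a remote XSEDE resource over GSISSH."""
    def submit(self, script_path):
        return f"gsissh:{script_path}"   # stand-in for a remote job id
    def status(self, remote_id):
        return "RUNNING"

def pick_submitter(resource):
    # The gateway selects a plug-in per resource; this mapping is illustrative.
    return GsisshSubmitter() if resource == "xsede" else LocalPBSSubmitter()
```

Because callers only see the JobSubmitter interface, adding a new transport (DRMAA, GRAM, a new scheduler) means writing one more plug-in, not touching the workflow code.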