Galaxy in Production Nate Coraor Galaxy Team Penn State University.

Slides:



Advertisements
Similar presentations
Chapter 20 Oracle Secure Backup.
Advertisements

Internet Information Server 6.0. IIS 6.0 Enhancements  Fundamental changes, aimed at: Reliability & Availability Reliability & Availability Performance.
1.  Understanding about How to Working with Server Side Scripting using PHP Framework (CodeIgniter) 2.
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment Chapter 11: Monitoring Server Performance.
1 CS6320 – Why Servlets? L. Grewe 2 What is a Servlet? Servlets are Java programs that can be run dynamically from a Web Server Servlets are Java programs.
MCTS Guide to Microsoft Windows Server 2008 Network Infrastructure Configuration Chapter 11 Managing and Monitoring a Windows Server 2008 Network.
Platform as a Service (PaaS)
Sharepoint Portal Server Basics. Introduction Sharepoint server belongs to Microsoft family of servers Integrated suite of server capabilities Hosted.
April-June 2006 Windows Hosting Seminar Series Product Roadmap: IIS 7.0 Matthew Boettcher Web Platform Technical Evangelist (Hosting) Developer & Platform.
Enterprise Reporting with Reporting Services SQL Server 2005 Donald Farmer Group Program Manager Microsoft Corporation.
NovaBACKUP 10 xSP Technical Training By: Nathan Fouarge
Windows Server MIS 424 Professor Sandvig. Overview Role of servers Performance Requirements Server Hardware Software Windows Server IIS.
Windows.Net Programming Series Preview. Course Schedule CourseDate Microsoft.Net Fundamentals 01/13/2014 Microsoft Windows/Web Fundamentals 01/20/2014.
This presentation will guide you though the initial stages of installation, through to producing your first report Click your mouse to advance the presentation.
Deploying Ruby on Rails How to make your application actually serve Dan Buettner 18 Oct 2007.
Module 10: Designing an AD RMS Infrastructure in Windows Server 2008.
Applets & Servlets.
Architecture Of ASP.NET. What is ASP?  Server-side scripting technology.  Files containing HTML and scripting code.  Access via HTTP requests.  Scripting.
DNN Performance & Scalability Planning, Evaluating & Improving : Part 2.
Lecture 8 – Platform as a Service. Introduction We have discussed the SPI model of Cloud Computing – IaaS – PaaS – SaaS.
Securing Microsoft® Exchange Server 2010
Oracle10g RAC Service Architecture Overview of Real Application Cluster Ready Services, Nodeapps, and User Defined Services.
5 Chapter Five Web Servers. 5 Chapter Objectives Learn about the Microsoft Personal Web Server Software Learn how to improve Web site performance Learn.
Advanced Features of Nagios XI Sam Lansing -
Database Laboratory Regular Seminar TaeHoon Kim.
Module 7: Fundamentals of Administering Windows Server 2008.
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment, Enhanced Chapter 11: Monitoring Server Performance.
9 Chapter Nine Compiled Web Server Programs. 9 Chapter Objectives Learn about Common Gateway Interface (CGI) Create CGI programs that generate dynamic.
Module 10: Monitoring ISA Server Overview Monitoring Overview Configuring Alerts Configuring Session Monitoring Configuring Logging Configuring.
Ramiro Voicu December Design Considerations  Act as a true dynamic service and provide the necessary functionally to be used by any other services.
Esri UC 2014 | Technical Workshop | ArcGIS for Server: Administrative Scripting and Automation Shreyas Shinde Ranjit Iyer.
Enabling High-Quality Printing in Web Applications
Module 2: Installing and Maintaining ISA Server. Overview Installing ISA Server 2004 Choosing ISA Server Clients Installing and Configuring Firewall Clients.
1 Overview of the Application Hosting Environment Stefan Zasada University College London.
Phone: Mega AS Consulting Ltd © 2007  CAT – the problem & the solution  Using the CAT - Administrator  Mega.
Module 11: Implementing ISA Server 2004 Enterprise Edition.
Some Design Notes Iteration - 2 Method - 1 Extractor main program Runs from an external VM Listens for RabbitMQ messages Starts a light database engine.
Stuart Wakefield Imperial College London Evolution of BOSS, a tool for job submission and tracking W. Bacchi, G. Codispoti, C. Grandi, INFN Bologna D.
ArcGIS Server for Administrators
Module 10 Administering and Configuring SharePoint Search.
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment, Enhanced Chapter 11: Monitoring Server Performance.
Module 2: Overview of IIS 7.0 Application Server.
Apache JMeter By Lamiya Qasim. Apache JMeter Tool for load test functional behavior and measure performance. Questions: Does JMeter offers support for.
1 Session 1: Introduction to PHP & MySQL iNET Academy Open Source Web Development.
Scale Fail or, how I learned to stop worrying and love the downtime.
Module 8 : Configuration II Jong S. Bok
SAN DIEGO SUPERCOMPUTER CENTER Inca Control Infrastructure Shava Smallen Inca Workshop September 4, 2008.
ICM – API Server Gary Ratcliffe. 2 Agenda Webinar Programme API Server Overview JSON-RPC iCM API Service API Server and Forms New services under.
Threads. Readings r Silberschatz et al : Chapter 4.
SPI NIGHTLIES Alex Hodgkins. SPI nightlies  Build and test various software projects each night  Provide a nightlies summary page that displays all.
(ITI310) By Eng. BASSEM ALSAID SESSIONS 10: Internet Information Services (IIS)
HNC COMPUTING - Network Concepts 1 Network Concepts Network Concepts Network Operating Systems Network Operating Systems.
IBM Express Runtime Quick Start Workshop © 2007 IBM Corporation Deploying a Solution.
© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems Operating Systems Overview: Using Hardware.
Configuring SQL Server for a successful SharePoint Server Deployment Haaron Gonzalez Solution Architect & Consultant Microsoft MVP SharePoint Server
Windows Server 2003 { First Steps and Administration} Benedikt Riedel MCSE + Messaging
How to use Drupal Awdhesh Kumar (Team Leader) Presentation Topic.
XNAT 1.7: Getting Started 6 June, Introduction In this presentation we’ll discuss:  Features and functions in XNAT 1.7  Requirements  Installing.
Start-SPPowerShell – Introduction to PowerShell for SharePoint Admins and Developers Paul BAker.
Platform as a Service (PaaS)
Platform as a Service (PaaS)
Node.Js Server Side Javascript
Node.js Express Web Applications
Embeddable Discussions Ivelin Elenchev
Node.js Express Web Services
Node.Js Server Side Javascript
Google App Engine Ying Zou 01/24/2016.
Module P3 Practical: Building a webapp in nodejs and
Galaxy API.
Hosting Geodesign and Analysis Services in Your Portal for ArcGIS
Presentation transcript:

Galaxy in Production Nate Coraor Galaxy Team Penn State University

Galaxy runs out of the box! Simple download, setup, and install design: % hg clone % sh run.sh Great for development! Not designed to support multiple users in a production environment with default configuration

But more powerful scenarios are supported By default, Galaxy uses: SQLite Built-in HTTP server for all tasks Local job runner Single process Simplest error-proof configuration

Groundwork for scalability

Start with a clean environment Galaxy becomes a managed system service Give Galaxy its own user Don't share database or db users Make sure Galaxy is using a clean Python interpreter: virtualenv, or compile your own Galaxy can be housed in NFS or other cluster/network filesystems (has been tested w/ GPFS)

Basic Configuration

Disable the developer settings use_interactive = False - Not even safe (exposes config) use_debug = False - You'll still be able to see tracebacks in the log file, doesn't load response in memory

Get a real database SQLite is serverless Galaxy is a heavy database consumer Locking will be an immediate issue Migrating data is no fun Setup is very easy: 'database_connection = postgres://'

Offload the menial tasks: Proxy Directly serve static content faster than Galaxy's HTTP server Reduce load on the application Caching and compression Load balancing (more on that later) Hook your local authentication and authorization system

Metadata Detection Setting metadata is CPU intensive and will make Galaxy unresponsive Make a new process (better yet, run on the cluster!) All you need is: 'set_metadata_externally = True' Run the data source tools on the cluster if they have access to the Internet Remove 'tool = local:///' from config file

Advanced Configuration

Python and threading Galaxy is multi-threaded. No problem, right? Problem... Enter the Global Interpreter Lock Guido says: "run multiple processes instead of threads!"

Galaxy's multiprocess model One job manager - responsible for preparing and finishing jobs, monitoring cluster queue Many web servers Doesn't really need IPC, job notification through database

Defining extra servers is easy [server:web_0] port = 8000 [server:web_1] port = 8001 [server:web_2] port = [server:runner_0] port = 8100

Let your tools run free: Cluster Move intensive processing (tool execution) to other hosts Frees up the application server to serve requests and manage jobs Utilize existing resources No job interruption upon restart Per-tool cluster options Generic DRMAA support: SGE (and derivatives), LSF, PBS Pro, Condor? It's easy: Set 'start_job_runners' and 'default_cluster_job_runner' and go! If your cluster has Internet access, run upload, UCSC, etc. on the cluster too

Tune Database Parameters Let Postgres (not Galaxy) keep the result in memory. 'database_engine_option_server_side_cursors = True' Allow more database connections. 'database_engine_option_pool_size = 10' 'database_engine_option_max_overflow = 20' Don't create unnecessary connections to the database. 'database_engine_option_strategy = threadlocal'

Downloading data from Galaxy

Downloading data from the proxy

The proxy server can send files much faster than Galaxy's internal HTTP server and file I/O methods Reduce load on the application, free the process Restartability Security is maintained: the proxy consults Galaxy for authZ Proxy server requires minimal config and then: nginx: 'nginx_x_accel_redirect_base = /_download' Apache: 'apache_xsendfile = True'

Uploading data to Galaxy

Uploading data to the proxy

The proxy is also better at receiving files than Galaxy Again, reduce load on the application, free the process Again, restartability More reliable Slightly more complicated to set up, and nginx only

How do we run Galaxy Main?

All the details: usegalaxy.org/production

Automating Galaxy Dannon Baker Galaxy Team Emory University

Simple URIs Collections: Elements: The method performed defines the operation HTTP GET/PUT/POST/DELETE RESTful API

Galaxy's REST Overview Uses generated API keys for per-user authentication No username/password No credential caching (not REST!) Request parameters and responses are in JSON (JavaScript Object Notation) Maintains security Enable with enable_api = True in config

GET Example » GET /api/libraries?key=966354fc14c9e427cee380ef50a72a21 « [ « { « 'url': '/api/libraries/f2db41e1fa331b3e', « 'id': 'f2db41e1fa331b3e', « 'name': 'Library 1' « } « ]

Modules Libraries Users and Roles Sample Tracking Forms Workflows Histories *

Scripted Usage More in galaxy/scripts/api Share yours - Galaxy tool shed import os, sys, traceback from common import update try: data = {} data[ 'update_type' ] = 'request_state' except IndexError: print 'usage: %s key url' % os.path.basename( sys.argv[0] ) sys.exit( 1 ) update( sys.argv[1], sys.argv[2], data, return_formatted=True )

Extending the API It's still an early beta Wrap controller methods It's easy!