Sushi – An exquisite recipe for NGS data analysis Hubert Rehrauer & Masaomi Hatakeyama Supporting User for SHell-script Integration.

What is your data analysis wishlist?
We had in mind:
- analyze by clicking
- scriptable
- manage my meta-information
- document all analysis steps
- organize my work
- I can add analysis applications
- connects to my compute resources
- keep everything in files on my disk
- no painful file formats
The bioinformatician stays in the driver seat

The Sushi idea (I) Start with a bunch of raw data files on your disk:

The Sushi idea (II) Add the magic seasoning: Meta information

Meta-information turns mere data files into a data set
- One row per sample
- associated files
- everything else noteworthy about the files and the samples
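A minimal sketch of the "one row per sample" idea, assuming a tab-separated file with a header row. The column names (Name, Read1, Species) and the helper parse_dataset are hypothetical illustrations, not Sushi's actual schema or API.

```ruby
# Hypothetical data set: one header row, then one row per sample.
# Column names are illustrative only, not Sushi's required schema.
DATASET_TSV = <<~TSV
  Name\tRead1\tSpecies
  sample_a\tdata/sample_a.fastq.gz\tmouse
  sample_b\tdata/sample_b.fastq.gz\tmouse
TSV

# Turn the TSV text into an array of hashes, one hash per sample,
# keyed by the column names from the header row.
def parse_dataset(tsv)
  header, *rows = tsv.lines.map { |line| line.chomp.split("\t") }
  rows.map { |fields| header.zip(fields).to_h }
end

parse_dataset(DATASET_TSV).each do |sample|
  puts "#{sample['Name']}: #{sample['Read1']} (#{sample['Species']})"
end
```

Each sample becomes a plain hash, so any extra column added to the file travels along with the sample without code changes.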

Sushi offers a choice of analysis apps for your data set The meta-information columns drive the available applications
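One way to picture how columns could drive the list of available applications: each app declares the columns it needs, and an app is offered only if the data set has all of them. The app names, the column requirements, and the helper available_apps below are assumptions made for illustration; the slides do not show how Sushi implements this.

```ruby
# Hypothetical registry: app name => columns the app needs in the data set.
REQUIRED_COLUMNS = {
  'QualityControlSketch' => %w[Name Read1],
  'CountingSketch'       => %w[Name BAM]
}.freeze

# Return the names of all apps whose required columns are all present.
def available_apps(dataset_columns, registry = REQUIRED_COLUMNS)
  registry.select { |_app, needed| (needed - dataset_columns).empty? }.keys
end

puts available_apps(%w[Name Read1 Species]).inspect
```

With only raw reads in the data set, the counting app is not offered; it would appear once a BAM column exists.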

Sushi lets you control all parameters as selectors or as free text

The processing jobs generate the data files …

Sushi adds all ingredients to make them a new, documented data set …

… and Sushi lets you move to the next analysis

The Sushi Data Analysis Process
Step 1: Generate job script(s)
Step 2: Submit the job script(s)
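The two steps can be sketched as two small functions. This is a simplified stand-in, not Sushi's code: generate_job_script and submit_job_script are hypothetical names, and "submission" here just runs the script locally with sh, whereas the deck describes a separate workflow manager for job control.

```ruby
require 'open3'
require 'tempfile'

# Step 1: turn one sample (a hash of meta-information) into job script text.
def generate_job_script(sample)
  <<~SH
    #!/bin/sh
    set -e
    echo "processing #{sample['Name']}"
  SH
end

# Step 2: "submit" the script. Assumption: run it locally with sh and
# return its standard output; a real setup would hand it to a scheduler.
def submit_job_script(script_text)
  Tempfile.create(['job', '.sh']) do |file|
    file.write(script_text)
    file.flush
    stdout, status = Open3.capture2('sh', file.path)
    raise 'job failed' unless status.success?
    stdout
  end
end

script = generate_job_script({ 'Name' => 'sample_a' })
puts submit_job_script(script)
```

Keeping generation and submission apart means the generated script can be inspected, logged, or rerun by hand before anything is executed.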

Sushi Modules
1. Sushi UI: Ruby on Rails, GUI
2. Sushi Application: Single Ruby file, CLI
3. Workflow Manager: Ruby gem library, Job Control

Sushi Components at FGCZ

Sushi Rank

Why choose Sushi?
- It has never been easier to import meta-information
- It has never been easier to add new data analysis applications
- Sushi does not impose constraints on your data analysis: your applications define the semantics
- You never have to export your data again: it's already exported!
- You never have to document your analysis again: the result is fully self-contained and documented by the time the analysis is done
- Sushi keeps your work organized even if you work on 10 different projects with thousands of samples

Acknowledgements
FGCZ Genome Informatics Team
Giancarlo Russo
Lennart Opitz
Weihong Qi
Slavica Dimitrieva

Sushi Takeaways: SUSHI
S: Super Easy Pipeline System
U: Ultra Fast Development
S: Surprisingly Flexible Ruby code
H: Highly Independent Modules
I: Intermediary between biologist and informatician

Sushi Demo (10 minutes)
1. Installation: 1 min
2. Data import/Job submission: 2 mins
3. New application import: 3 mins
4. Case Study: 4 mins (RNAseq DEG analysis)

Demo Environment
- Mac OS X
- Ruby
- Ruby on Rails 3.2.9

Installation
1. Download the Ruby on Rails package
2. Install libraries: bundle install
3. Set up the DB: bundle exec rake db:migrate

Documents fgcz-sushi.uzh.ch

Download fgcz-sushi.uzh.ch/download.html

Installation
Download, Extraction, Library installation, DB setup
$ wget sushi.uzh.ch/sushi_ tgz
$ tar zxvf sushi_ tgz
$ cd sushi
$ bundle install
$ bundle exec rake db:migrate

Sushi run, workflow_manager run
$ rails server
$ workflow_manager

Sushi Access localhost:3000

Data import / Job submission
- Prepare your data set
- Import dataset.tsv
- Check samples
- Select an application: WordCountApp
- Set parameters
- Submit a job
- Check job status
- Check job script/log: public/projects
- Check result

DataSet Import

New Application Import fgcz-sushi.uzh.ch/download.html

New Application Import
$ wget sushi.uzh.ch/fgcz_sushi_apps.tgz
$ tar xvf fgcz_sushi_apps.tgz
$ cp fgcz_sushi_apps/FastqcApp.rb sushi/lib/
$ cp -r fgcz_sushi_apps/R_scripts sushi/lib/

A Case Study: RNAseq DEG analysis
RNAseq Analysis
- Quality Control: FastQC
- Mapping: STAR
- Counting: HTSeq
- Differential Gene Expression: EdgeR

Summary: SASHIMI
S: Shell script auto-generating
A: Application Framework
S: Support all languages
H: Help bio-logist/-informatician
I: Implemented in Ruby
M: Meta-Information DataSet
I: Interface: GUI / CLI

Thank you for your attention!!

DataSet

Sushi Application Run

Parameter Setting

Job Submission

Job Status

New DataSet

Log, Job Script

Job Script

New Application Import
$ wget sushi.uzh.ch/fgcz_sushi_apps.tar
$ tar xvf fgcz_sushi_apps.tar
$ cp fgcz_sushi_apps/FastqcApp.rb sushi/lib/
$ cp -r fgcz_sushi_apps/R_scripts sushi/lib/

New Application Import

FastQC result

A Case Study

Appendix

Sushi Run Style GUI CLI

Application Mode
1. SAMPLE mode: Job per Sample, e.g. Tophat
2. DATASET mode: Job per DataSet, e.g. FastQC
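The difference between the two modes comes down to how samples are grouped into jobs. A minimal sketch, assuming a hypothetical helper plan_jobs and the symbols :sample and :dataset (neither is taken from Sushi's code):

```ruby
# Group samples into jobs. Each inner array is the input of one job.
#   :sample  mode -> one job per sample
#   :dataset mode -> a single job that sees every sample
def plan_jobs(samples, mode)
  case mode
  when :sample  then samples.map { |sample| [sample] }
  when :dataset then [samples]
  else raise ArgumentError, "unknown mode: #{mode}"
  end
end

samples = %w[sample_a sample_b sample_c]
puts "SAMPLE mode:  #{plan_jobs(samples, :sample).size} jobs"
puts "DATASET mode: #{plan_jobs(samples, :dataset).size} job"
```

A per-sample tool such as an aligner fits the first grouping, while a report over the whole data set fits the second.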

Import New Application
2 ways: prepare
1. Shell script
2. Ruby script
Save it in Sushi repository
- lib directory
- No reboot
Test on CLI
- $ sushi_fabric --class WordCountApp --dataset_id 1 --run
Test on GUI

How to add a new application
1. Inherit SushiApp class
   - Write Ruby code
   - Template Method Design Pattern
   - Possible to tune-up details
2. Delegate to SushiWrap class
   - Write Shell script
   - Ruby Metaprogramming
   - Quick import

Template Method Pattern Write Ruby code directly
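A generic illustration of the Template Method pattern applied to job scripts: the base class fixes the skeleton (header, commands, footer) and a subclass fills in only the commands. TemplateApp and WordCountSketch are hypothetical classes written for this sketch; they are not the real SushiApp interface, which the slides do not spell out.

```ruby
# Base class: owns the fixed skeleton of every job script.
class TemplateApp
  # The template method. Subclasses do not override this.
  def job_script(sample)
    [header, commands(sample), footer].join("\n")
  end

  # Shared part written by the framework.
  def header
    "#!/bin/sh\nset -e"
  end

  def footer
    'echo "done"'
  end

  # Hook: the only part an application has to provide.
  def commands(_sample)
    raise NotImplementedError, "#{self.class} must implement commands"
  end
end

# An application: supplies just the tool invocation.
class WordCountSketch < TemplateApp
  def commands(sample)
    "wc -w #{sample['File']}"
  end
end

puts WordCountSketch.new.job_script({ 'File' => 'notes.txt' })
```

The split mirrors a job script that is partly generated by the framework and partly by the app: the header and footer come from the base class, the middle from the subclass.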

Meta-programming Ruby code auto generation – from shell script code
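A rough sketch of how Ruby metaprogramming can build an app class from a shell script at runtime, using Class.new and define_method. The helper wrap_shell_script, the $Column placeholder convention, and the class name WordCountWrapped are all assumptions for illustration; the actual SushiWrap mechanism is not shown in the slides.

```ruby
# Hypothetical helper: create a new class whose commands method fills a
# shell template with values from a sample's meta-information.
def wrap_shell_script(class_name, template)
  app_class = Class.new do
    define_method(:commands) do |sample|
      # Replace each $Column placeholder with that sample's value.
      template.gsub(/\$\w+/) { |match| sample.fetch(match.delete_prefix('$')) }
    end
  end
  # Register the generated class under the requested constant name.
  Object.const_set(class_name, app_class)
end

wrap_shell_script('WordCountWrapped', 'wc -w $File > $Name.count')
puts WordCountWrapped.new.commands({ 'File' => 'notes.txt', 'Name' => 'notes' })
```

No Ruby has to be written by the person adding the tool; the class exists only because the wrapper generated it from the shell text.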

A Sushi App – WordCountApp.rb

A Sushi App – WordCount.sh

The R Sushi Apps – FastqcApp.rb

Structure of a Job Script Generated by Sushi Generated by App

Sushi gem