© 2011 Illumina, Inc. All rights reserved. Illumina, illuminaDx, BaseSpace, BeadArray, BeadXpress, cBot, CSPro, DASL, DesignStudio, Eco, GAIIx, Genetic.

Presentation transcript:

BaseSpaceR
Adrian Alexa

© 2011 Illumina, Inc. All rights reserved. Illumina, illuminaDx, BaseSpace, BeadArray, BeadXpress, cBot, CSPro, DASL, DesignStudio, Eco, GAIIx, Genetic Energy, Genome Analyzer, GenomeStudio, GoldenGate, HiScan, HiSeq, Infinium, iSelect, MiSeq, Nextera, Sentrix, SeqMonitor, Solexa, TruSeq, VeraCode, the pumpkin orange color, and the Genetic Energy streaming bases design are trademarks or registered trademarks of Illumina, Inc. All other brands and names contained herein are the property of their respective owners.

2 BaseSpace - Plug and Play Genomic Cloud Solution
From Sample to Biological Insight
1. Seamless Instrument Integration
2. Harness the Internet
3. Accessible to Everyone
– Data transferred as the instrument runs
– Data available in BaseSpace within minutes of a run finishing
– Automatic de-multiplexing and FASTQ generation

3 BaseSpace
Data storage, analysis and collaboration.
Almost 40,000 instrument runs streamed to BaseSpace by April 2013.

4 REST API and Native Apps
[Architecture diagram. Labels: UI, Dev Portal, App, Users, Developers, Instruments, BaseSpace API. Layers: SaaS, PaaS, IaaS (OpenStack). App categories: Secondary Analysis, Visualization, Biological Inference, Validation, Methods Development.]

5 BaseSpace API Data Model

6 BaseSpaceR = BaseSpace + R + Bioconductor
A translation layer between the BaseSpace REST API and R data structures.
BaseSpace: cloud-based content management system, facilitating the storage and sharing of genomic data.
R + Bioconductor: rich environment of statistical and data analysis tools for high-throughput genomic data.
Features
– Persistent connection with the REST server and support for the REST API query parameters.
– Vectorized operations in line with R semantics. Allows for queries across multiple Projects, Samples, AppResults, Files, etc.
– S4 class system used to represent the BaseSpace data model.
– Integration with Bioconductor libraries and data containers [working on it…].
– Portability on most platforms: Linux, Windows and Mac OS X.

7 BaseSpaceR
REST API and R data structures
[Figure: a "REST API" column and an "R API" column shown side by side, with a GET request as the example.]

8 Q-score distribution
Access to the FASTQ files
1) Authentication [3 lines of code]
   – User needs to interact with the BaseSpace UI (or via a web server).
> aAuth <- AppAuth(client_id = "5b b473ba740e9a9eb0abf64",
+                  client_secret = "b3168bf65bf543f3b6e7f df",
+                  scope = "CREATE GLOBAL, BROWSE GLOBAL, CREATE PROJECTS")
Launching browser for OAuth authentication...
> requestAccessToken(aAuth)
Access token successfully acquired!
> aAuth
Object of class "AppAuth" with:

9 Q-score distribution
Access to the FASTQ files
1) Authentication [3 lines of code]
   – User needs to interact with the BaseSpace UI (or via a web server).
2) Select a sample (collection of FASTQ files) from the Project of your choice [3 lines]
> myProj <- listProjects(aAuth)
> data.frame(Name = Name(myProj), Id = Id(myProj))
                    Name Id
1          BaseSpaceDemo  2
2 Cancer Sequencing Demo  4
3                  HiSeq
> sampl <- listSamples(aAuth, projectId = 2, Limit = 1)
> inSample <- Samples(sampl, simplify = TRUE)
> inSample #Samples object:

10 Q-score distribution
Access to the FASTQ files
1) Authentication [3 lines of code]
   – User needs to interact with the BaseSpace UI (or via a web server).
2) Select a sample (collection of FASTQ files) from the Project of your choice [3 lines]
3) Download the files (FASTQs in our case) [4 lines]
> f <- listFiles(inSample, Extensions = ".gz")
> idx <- grep("_R(1|2)_", Name(f))
> outDir <- paste("Sample", Id(inSample), sep = "_")
> getFiles(aAuth, id = Id(f)[idx], destDir = outDir, verbose = TRUE)
Downloading 4 files in directory: Sample_16018
Downloading file: data/intensities/basecalls/s_G1_L001_R1_001.fastq.1.gz
> file.exists(file.path(outDir, f$Path[idx]))
[1] TRUE TRUE TRUE TRUE
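The file selection in step 3 depends on the regular expression "_R(1|2)_", which keeps read 1 and read 2 files and skips anything else. A minimal base R sketch of that filtering follows; the file names are hypothetical, modelled on the single name visible in the download log, and no BaseSpace connection is needed to run it.

```r
# Hypothetical file names (assumption: only the first pattern appears in the slides).
fnames <- c("s_G1_L001_R1_001.fastq.1.gz",
            "s_G1_L001_R2_001.fastq.1.gz",
            "s_G1_L001_I1_001.fastq.1.gz",   # index read, should be skipped
            "s_G1_L001_R1_002.fastq.1.gz")

# Same pattern as in the slide: matches "_R1_" or "_R2_" anywhere in the name.
idx <- grep("_R(1|2)_", fnames)

# Extract the read label so the files can be grouped per read.
read <- sub(".*_(R[12])_.*", "\\1", fnames[idx])
by_read <- split(fnames[idx], read)
print(by_read)
```

Grouping by read label is the same distinction the later processing step makes when it builds separate indices for "_R1_" and "_R2_".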

11 Q-score distribution
Access to the FASTQ files
1) Authentication [3 lines of code]
   – User needs to interact with the BaseSpace UI (or via a web server).
2) Select a sample (collection of FASTQ files) from the Project of your choice [3 lines]
3) Download the files (FASTQs in our case) [4 lines]
4) Process the downloaded files and compute the stats […]
> library(ShortRead)
> source("QscoreApp-functions.R")
> qtab <- lapply(floc, getQscoreCounts)
> idxR1 <- grep("_R1_", names(floc), fixed = TRUE)
> idxR2 <- grep("_R2_", names(floc), fixed = TRUE)
> x <- getQscoreStats(cbind(Reduce("+", qtab[idxR1]), Reduce("+", qtab[idxR2])))
> ylim <- range(x) + c(-2L, 2L)
> plot(x = seq_len(nrow(x)), type = "n", ylim = ylim,
+      xlab = "Cycle", ylab = "Q-score",
+      main = "Q-scores statistics")
> sx <- apply(x[, c("5%", "95%")], 2, function(x) smooth.spline(x)$y)
> sx[, "95%"] <- pmax(sx[, "95%"], x[, "median"])
> polygon(c(1L:nrow(x), nrow(x):1L), c(sx[, "95%"], rev(sx[, "5%"])), col = "#CCEBC580", border = NA)
> matpoints(sx, type = "l", lwd = .5, lty = 2, col = "black")
> lines(x[, "mean"], lwd = 2, col = "red")
> lines(x[, "median"], lwd = 2, col = "black")
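The helper getQscoreStats() comes from QscoreApp-functions.R, which the slides do not show. The plotting code only tells us that its result has one row per cycle and columns named "mean", "median", "5%" and "95%". The sketch below is one plausible way to produce such a matrix in base R, assuming the input is a count matrix with one row per Q-score value (stored in the row names) and one column per cycle. The function name qscore_stats_sketch and the input layout are assumptions, not the original implementation.

```r
# Hypothetical stand-in for getQscoreStats(); the real helper is not shown.
# counts: numeric matrix, one row per Q-score value, one column per cycle.
#         rownames(counts) hold the Q-score values as strings, e.g. "30".
qscore_stats_sketch <- function(counts, probs = c(0.05, 0.95)) {
  q <- as.numeric(rownames(counts))
  ord <- order(q)
  q <- q[ord]
  counts <- counts[ord, , drop = FALSE]

  # Weighted quantile: smallest Q-score whose cumulative share reaches p.
  wquantile <- function(w, p) {
    cum <- cumsum(w) / sum(w)
    q[which(cum >= p)[1]]
  }

  # One set of statistics per cycle (column), then transpose to cycles x stats.
  stats <- t(apply(counts, 2, function(w) {
    c(mean = sum(q * w) / sum(w),
      median = wquantile(w, 0.5),
      sapply(probs, function(p) wquantile(w, p)))
  }))
  colnames(stats) <- c("mean", "median", paste0(probs * 100, "%"))
  stats
}

# Toy input: three Q-score values observed over two cycles.
counts <- matrix(c(1, 1, 8,    # cycle 1: mostly Q30
                   5, 4, 1),   # cycle 2: mostly Q10 and Q20
                 nrow = 3,
                 dimnames = list(c("10", "20", "30"), NULL))
x <- qscore_stats_sketch(counts)
print(x)
```

A matrix of this shape can be passed to the plotting calls in the slide as written, since they index x by those four column names. Cycles with no reads at all would need extra handling, which the sketch omits.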

12 Q-score distribution
Access to the FASTQ files
1) Authentication [3 lines of code]
   – User needs to interact with the BaseSpace UI (or via a web server).
2) Select a sample (collection of FASTQ files) from the Project of your choice [3 lines]
3) Download the files (FASTQs in our case) [4 lines]
4) Process the downloaded files and compute the stats […]
5) Upload results back to BaseSpace [~10 lines]
   – Results are a collection of files for now, with minimal visualisation.

13 Copy Number Abnormalities
Accessing the coverage via a high-level REST method
Detect amplifications and deletions in cancer samples
– Data can be obtained from a single MiSeq run (one for tumor and one for normal, or even both on one flow cell).
– Typical analysis requires only the coverage data, and this can be obtained directly using a REST method.
– Corrects for tumor ploidy and purity.
[Figure: Amplification of distal chr8q. Homozygous deletion of part of chr8p. Location of centromeres indicated by vertical dotted lines.]
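The slide does not describe how amplifications and deletions are called from coverage. As a generic illustration only, and not the method used in the presentation, a common starting point is the log2 ratio of depth-normalized tumor coverage to normal coverage per genomic bin. The function name, the pseudocount and the toy data below are all assumptions, and the sketch does not correct for ploidy or purity.

```r
# Generic illustration; not the analysis shown in the slides.
# tumor, normal: read coverage per genomic bin (same bins, same order).
log2_ratio_sketch <- function(tumor, normal, pseudo = 1) {
  # Scale each sample by its total coverage so sequencing depth cancels out.
  t_norm <- (tumor + pseudo) / sum(tumor + pseudo)
  n_norm <- (normal + pseudo) / sum(normal + pseudo)
  log2(t_norm / n_norm)
}

# Toy data: six bins, bin 5 amplified and bin 2 deleted in the tumor.
normal <- c(100, 100, 100, 100, 100, 100)
tumor  <- c(100,   0, 100, 100, 400, 100)

lr <- log2_ratio_sketch(tumor, normal)
gain_bin <- which.max(lr)  # bin with the strongest gain
loss_bin <- which.min(lr)  # bin with the strongest loss
print(round(lr, 2))
```

Positive values suggest a gain relative to the normal sample and negative values a loss. A real analysis would also segment neighbouring bins and adjust for tumor ploidy and purity, as the slide notes.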

14 RConsole Exploring BaseSpace data using RStudio

15 What's next
– Facilitate the use of Bioconductor packages – there is much to gain if as many Bioconductor packages as possible can consume data (directly) from BaseSpace.
– Introduce high-level methods (REST or R API) for random access to BAMs, VCFs, metric data, etc. One can already use Rsamtools for indexed BAMs.
– R-level methods to facilitate RNAseq, ChipSeq, etc. analyses.
– BaseSpace Data Central – publicly available data – most of it will be data coming from our latest instruments, chemistry, workflows.

16 Resources
Tutorials, videos, whitepapers and other educational material:
BaseSpace homepage:
BaseSpace developer portal:
Bio-IT World Asia presentation:
More documentation on BaseSpace: umentation.ilmn
BaseSpaceR homepage on Bioconductor: