Building Data-intensive Pipelines Ravi K Madduri Argonne National Lab University of Chicago.

1 Building Data-intensive Pipelines
Ravi K Madduri
Argonne National Lab, University of Chicago

2 Recap from other talks on genomics
FBIRN: combining imaging, clinical, and genetics data
CIDR: providing better value to end users
  – Globus Online is helping CIDR reliably transfer large sequencing data sets to end users
Ivo and Fabio presented various challenges in building pipelines in genomics
  – Large data volumes
  – Multiple, complex analytical tools
This talk focuses on how to provide workflow capabilities to end users in a way that is both easy to use and scalable

3 Enter Galaxy
A free (for everyone) web service integrating a wealth of tools, compute resources, terabytes of reference data, and permanent storage
Open source software that makes it easy to integrate your own tools and data and customize your own site
Flexible architecture -> Customizable

4 Galaxy Adoption
~50 deployments of Galaxy
  – Galaxy for microarray analysis, machine learning, drug discovery, etc.
~130,000 jobs a month and growing on the public instance of Galaxy
1 TB/week in user uploads
  – 60 TB from China
150+ attendees at the Galaxy users conference
  – From 6 continents
Adoption driven primarily by
  – Ease of use
  – Software as a service
  – Responsiveness to user needs

5 Opportunities for BIRN collaborators
Galaxy for biomedical informatics
  – Researchers can discover and download interesting and useful datasets provided by BIRN
  – Analyze data using various BIRN tools
  – Create and share pipelines with other researchers
  – Create virtual collaborations by leveraging flexible, secure user and group management

6 Use case: CVRG-Galaxy
Created a Galaxy instance for the CVRG community
Integrated it with Globus Online file transfer capabilities so researchers can get data for analysis
Created a CVRG Toolbox in Galaxy with Bioconductor tools from CRData.org
Investigating how individual PIs can contribute their own compute and storage

7 CVRG CRData Galaxy

