Brian Corrie Technical Lead, iReceptor Technical Director, IRMACS Centre Simon Fraser University Services for Distributed Data, Security and Computation.


Services for Distributed Data, Security and Computation: an iReceptor Perspective
Brian Corrie
Technical Lead, iReceptor
Technical Director, IRMACS Centre
Simon Fraser University

WHAT IS IMMUNOGENETICS?
- Explores the relationship between the immune system and genetics
- Immune receptors (antibodies and T-cell receptors) are important to:
  - Immune response to infectious disease
  - Vaccine design
  - Therapeutic antibodies to fight autoimmune diseases
  - Targeting the immune system against cancer cells

WHY IRECEPTOR?
- Goal: To improve the design of vaccines and therapeutic antibodies by integrating data repositories of antibody and T-cell receptor gene sequences
- Driver: Data deluge from Next Generation Sequencing (NGS)
  - New research area: first NGS of an immune repertoire (zebrafish) published in Science, 2009
  - Millions of sequences per subject, many subjects per lab, 100s of labs
  - Crude analysis tools, few data standards
- Researcher need:
  - Federate data from multiple labs/institutions (securely)
  - Perform analyses across federated data (as authorized)
  - Perform complex analyses (using advanced computational resources)
- iReceptor solution: iReceptor Scientific Gateway for immunogenetics

IRECEPTOR MODEL
- Fundamental concept: distributed data model
  - Data is maintained/controlled by data stewards
  - Data stewards expose data as desired/allowed through iReceptor services
  - iReceptor federates data and coordinates analysis/processing
- IR DB: DB model for immunogenetics
  - Patient data, sample data, sequence data, annotated data, analysis data…
- IR Data Service: Service to ingest data into the IR DB
- IR DB Service: Service to expose the database
- IR Auth Service: Service to authenticate to resources (DB and compute)
- IR Computation Services: Service interface to perform analyses on federated data
- iReceptor Scientific Gateway: Web interface to coordinate/control these services

IRECEPTOR MODEL
- Fundamental concept: distributed data model
  - Data is maintained using the iReceptor DB model, controlled by data stewards
  - Data stewards expose data as desired/allowed through the iReceptor DB service
  - iReceptor federates data (DB Service) and coordinates analysis/processing (HPC Services)
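The distributed data model can be illustrated with a small sketch. This is not iReceptor code: the `Repository` class, the `federated_query` function, the user names, and the toy sequence records are all hypothetical, chosen only to show stewards controlling access while a federating layer fans one query out to every repository and merges the results.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """One data steward's database (hypothetical stand-in for a DB service)."""
    steward: str
    sequences: list                                  # records held by this steward
    authorized_users: set = field(default_factory=set)

    def query(self, user, predicate):
        # The steward decides who may see the data; others get nothing back.
        if user not in self.authorized_users:
            return []
        return [s for s in self.sequences if predicate(s)]


def federated_query(repositories, user, predicate):
    """Send the same query to every repository and merge the results."""
    results = []
    for repo in repositories:
        for record in repo.query(user, predicate):
            # Tag each record with its source so provenance is not lost.
            results.append({"steward": repo.steward, **record})
    return results


# Toy data: two labs, each exposing its records to a different set of users.
lab_a = Repository("lab_a", [{"id": 1, "chain": "IGH"}, {"id": 2, "chain": "TRB"}], {"alice", "bob"})
lab_b = Repository("lab_b", [{"id": 3, "chain": "IGH"}], {"alice"})


def is_igh(record):
    return record["chain"] == "IGH"


alice_hits = federated_query([lab_a, lab_b], "alice", is_igh)
bob_hits = federated_query([lab_a, lab_b], "bob", is_igh)
```

Here `alice` sees matching records from both labs, while `bob` only sees what `lab_a` has chosen to expose to him. A real deployment would replace the in-memory lists with calls to each steward's service.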

IRECEPTOR TECHNICAL CHALLENGES
- Authentication
  - iReceptor Gateway, HPC (Compute Canada + others), DBs
  - How to federate/unify (OAuth2, oauth-myproxy, DB)
- Distributed DB model
  - Scalability to multiple DBs, performance on LARGE DBs
- HPC integration for analysis
  - Moving data to/from HPC, running/controlling jobs, monitoring
- Gateway
  - How do we make something that makes it all easy to use?

AGAVE - WHAT IS IT?
- Hosted, multi-tenant REST API (yes, it is hosted!)
- Developed at the Texas Advanced Computing Center (TACC), UT Austin
- Evolved from the iPlant Collaborative (gateway for plant genomics)
- Web service abstraction enabling:
  - secure, uniform access to HPC, HTC, and Cloud systems
  - secure, uniform access to data
  - an app store model for finding and running scientific codes
- White-label Science-as-a-Service for everyone

AGAVE - WHAT DOES IT DO?
Service-based APIs for:
- Auth: Token-based authentication
- System: HPC system management
- App: HPC App model for abstracting HPC codes
- Job: Job execution, job monitoring, data staging
- File: Data management, metadata support, provenance, reporting
- Transform: Data transformation
- User: User management, discovery
- Event: Notifications and events

AGAVE – IDENTITY/AUTH
- Multi-tenancy for IdP management and customization
- Identity: WSO2 API Manager with customizations
  - Hosted identity as a service
  - Pluggable identity providers: LDAP, AD, DB, etc.
  - Client registration, metering/monitoring (scopes access)
- Authentication/Authorization
  - AGAVE v2 is an OAuth2 provider
  - Fine-grained access control over resources (e.g. clients)
  - Integrated group and role support
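Because AGAVE v2 is described as an OAuth2 provider, a client would typically trade credentials for a token and then send that token with each API call. The sketch below builds (but never sends) a standard OAuth2 password-grant request as defined in RFC 6749. The token URL, client key, client secret, and scope value are placeholders, not AGAVE's actual endpoints or settings; check the tenant's documentation for the real values.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_url, client_key, client_secret,
                        username, password, scope="default"):
    """Build, without sending, an OAuth2 password-grant token request."""
    body = urllib.parse.urlencode({
        "grant_type": "password",
        "username": username,
        "password": password,
        "scope": scope,          # placeholder scope name (assumption)
    }).encode("ascii")
    # Client credentials travel in an HTTP Basic Authorization header.
    basic = base64.b64encode(
        f"{client_key}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def bearer_header(access_token):
    """Header to attach to later API calls once a token has been issued."""
    return {"Authorization": f"Bearer {access_token}"}


# All values below are made up for illustration.
request = build_token_request(
    "https://tenant.example.org/token",
    "my-client-key", "my-client-secret",
    "bcorrie", "not-a-real-password",
)
```

Passing `request` to `urllib.request.urlopen` would perform the exchange; the JSON response of a conforming OAuth2 server carries an `access_token` field that `bearer_header` can wrap for subsequent calls.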

AGAVE – HPC SYSTEM MANAGEMENT
- Use publicly available data and compute systems (e.g. TACC)
- Register private systems you have accounts on (e.g. Compute Canada)
- Systems are associated with people…
  - Register bugaboo.westgrid.ca, silo.westgrid.ca for bcorrie
- Uses common HPC authentication methods
  - Username/password (trust?), ssh keys, X509 certs, myproxy certs, etc.
- Understands and can manage schedulers
  - Submit and monitor jobs

AGAVE – HPC APP MANAGEMENT
- AGAVE Apps are wrappers around computational codes
  - Expose any HPC code through the AGAVE API
  - Share and publish Apps for someone, anyone, everyone
- Implement cross-platform solutions
  - App interface stays the same (inputs, parameters)
  - App hides system dependencies
- Apps are instantiated by associating an App with a System
  - Apps can be associated with many systems

AGAVE – HPC JOB MANAGEMENT
- AGAVE Jobs are instantiations of an App on a System
  - Jobs have inputs and parameters
- Provides:
  - Common service-based interface for all job execution
  - Automatic data staging and archiving
  - Fire-and-forget execution
  - Events and notifications of job status
  - Provenance and reproducibility
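The relationship between Systems, Apps, and Jobs can be sketched as a small data model. This is an illustration of the concepts on the slides, not AGAVE's schema or API: the class names, fields, status values, the `align` app, and the command templates are all hypothetical. Only the two host names come from the slides.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"


@dataclass(frozen=True)
class System:
    name: str                     # a registered compute system


@dataclass
class App:
    name: str
    input_names: tuple            # the interface stays the same on every system
    launch_commands: dict = field(default_factory=dict)   # system name -> template

    def deploy(self, system, command_template):
        # System-specific details live here, hidden from whoever runs the job.
        self.launch_commands[system.name] = command_template


@dataclass
class Job:
    app: App
    system: System
    inputs: dict
    status: Status = Status.PENDING
    listeners: list = field(default_factory=list)

    def command(self):
        """Render the command this job would run on its system."""
        return self.app.launch_commands[self.system.name].format(**self.inputs)

    def set_status(self, status):
        # Notify every registered listener about the status change.
        self.status = status
        for notify in self.listeners:
            notify(self, status)


bugaboo = System("bugaboo.westgrid.ca")
silo = System("silo.westgrid.ca")

# One App, associated with two Systems that launch it differently.
align = App("align", ("fasta",))
align.deploy(bugaboo, "module load aligner && aligner --in {fasta}")
align.deploy(silo, "/opt/aligner/bin/aligner {fasta}")

events = []
job = Job(align, silo, {"fasta": "sample1.fasta"},
          listeners=[lambda j, s: events.append(s.value)])
job.set_status(Status.RUNNING)
job.set_status(Status.FINISHED)
```

The caller supplies the same inputs whichever system is chosen, and the listener list stands in for the event and notification service: the submitter can fire and forget, then react when a status change arrives.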

AGAVE – HPC DATA MANAGEMENT
- Data can be moved between AGAVE Systems
- Provides:
  - Access to distributed file systems through a common API
    - FTP, SFTP, GridFTP, IRODS, …
  - Built-in metadata support
  - Synchronous and asynchronous file transfers
  - Automated retry, parallelism, and monitoring
  - Notifications and updates
  - Provenance and reporting
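Automated retry is a general technique, and a minimal version fits in a few lines. The sketch below is not how AGAVE implements it; `transfer_with_retry`, its parameters, and the exponential backoff policy are assumptions made for illustration, and `flaky_copy` simulates a transfer that fails twice before succeeding.

```python
import time


def transfer_with_retry(transfer, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call transfer() until it succeeds, backing off exponentially between tries."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transfer()
        except OSError:
            # Network failures such as ConnectionError are OSError subclasses.
            if attempt == max_attempts:
                raise                      # out of attempts: surface the error
            sleep(base_delay * 2 ** (attempt - 1))


calls = {"n": 0}


def flaky_copy():
    """Simulated transfer that drops the link on its first two attempts."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("link dropped")
    return "copied"


delays = []
# Record the requested delays instead of actually sleeping.
result = transfer_with_retry(flaky_copy, sleep=delays.append)
```

Injecting the `sleep` function keeps the example fast and makes the backoff schedule observable. In real use the default `time.sleep` applies.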

AGAVE – OTHER COOL STUFF…
- Good documentation (although sometimes out of date/buggy)
  - agaveapi.co
  - Quick start, samples/templates, tutorial, App builder
- Command line API (for the coders)
- Client SDKs (Java, Python, PHP)
- Gateway ToGo
  - How to “spin up” a Scientific Gateway quickly

SUMMARY
- AGAVE has been very valuable for iReceptor
  - Helping us deal with IdP/authentication/authorization
  - Our main tool for data staging and job management to HPC systems
  - It is very powerful and very easy to use. Have a look at it…
- Drawbacks
  - Hosted service
  - Some things are outside of our control (uptime)
  - System is widely used, uptime/availability is high

USEFUL LINKS
- AGAVE Stuff
  - AGAVE – agaveapi.co
  - AGAVE Presentations – agaveapi.co/presentations
  - AGAVE Developer API Presentation – agaveapi.co/slides/agave-overview/#/
  - iPlant –
  - WSO2 – wso2.com/products/api-manager/
- iReceptor Stuff
  - Main site: ireceptor.irmacs.sfu.ca
  - Gateway site: ireceptorgw.irmacs.sfu.ca (alpha – no access for you! Yet!)

Questions?