Brian Corrie Technical Lead, iReceptor Technical Director, IRMACS Centre Simon Fraser University Services for Distributed Data, Security and Computation.


1 Brian Corrie. Technical Lead, iReceptor; Technical Director, IRMACS Centre, Simon Fraser University. Services for Distributed Data, Security and Computation - an iReceptor Perspective

2 WHAT IS IMMUNOGENETICS? Explores the relationship between the immune system and genetics. Immune receptors (antibodies and T-cell receptors) are important to: immune response to infectious disease; vaccine design; therapeutic antibodies to fight autoimmune diseases; targeting the immune system against cancer cells.

3 WHY IRECEPTOR? Goal: to improve the design of vaccines and therapeutic antibodies by integrating data repositories of antibody and T-cell receptor gene sequences. Driver: data deluge from Next Generation Sequencing (NGS). New research area: the first NGS study of an immune repertoire (zebrafish) was published in Science in 2009. Millions of sequences per subject, many subjects per lab, 100s of labs. Crude analysis tools, few data standards. Researcher need: federate data from multiple labs/institutions (securely); perform analyses across federated data (as authorized); perform complex analyses (using advanced computational resources). iReceptor solution: iReceptor Scientific Gateway for immunogenetics.

4 IRECEPTOR MODEL Fundamental concept: distributed data model. Data is maintained/controlled by data stewards. Data stewards expose data as desired/allowed through iReceptor services. iReceptor federates data and coordinates analysis/processing. Components: IR DB: DB model for immunogenetics (patient data, sample data, sequence data, annotated data, analysis data…). IR Data Service: service to ingest data into the IR DB. IR DB Service: service to expose the database. IR Auth Service: service to authenticate to resources (DB & Compute). IR Computation Services: service interface to perform analyses on federated data. iReceptor Scientific Gateway: web interface to coordinate/control these services.

5 IRECEPTOR MODEL Fundamental concept: distributed data model. Data is maintained using the iReceptor DB model and controlled by data stewards. Data stewards expose data as desired/allowed through the iReceptor DB Service. iReceptor federates data (DB Service) and coordinates analysis/processing (HPC Services).
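
The distributed data model on this slide can be sketched in a few lines of Python. This is a minimal illustration only, not iReceptor code: the `Repository` class, the `federated_query` function, the `public` flag, and the sample records are all hypothetical names and data invented for the example. It shows each steward keeping control of what is exposed while a federating layer sends the same query to every repository and merges the answers.

```python
from dataclasses import dataclass


@dataclass
class Repository:
    """One lab's database, controlled by its data steward (hypothetical model)."""
    name: str
    sequences: list  # each item: {"subject": str, "gene": str, "public": bool}

    def query(self, gene: str, authorized: bool) -> list:
        # The steward decides what is exposed: non-public records are
        # returned only to authorized callers.
        return [
            s for s in self.sequences
            if s["gene"] == gene and (s["public"] or authorized)
        ]


def federated_query(repos, gene, authorized_for=()):
    """Send the same query to every repository and merge the results."""
    merged = []
    for repo in repos:
        hits = repo.query(gene, authorized=repo.name in authorized_for)
        # Tag each hit with its origin; the data itself stays with the steward.
        merged.extend({**h, "repository": repo.name} for h in hits)
    return merged


# Made-up sample records, for illustration only.
lab_a = Repository("lab_a", [
    {"subject": "A1", "gene": "IGHV1-69", "public": True},
    {"subject": "A2", "gene": "IGHV1-69", "public": False},
])
lab_b = Repository("lab_b", [
    {"subject": "B1", "gene": "IGHV1-69", "public": True},
    {"subject": "B2", "gene": "TRBV20-1", "public": True},
])

public_hits = federated_query([lab_a, lab_b], "IGHV1-69")
all_hits = federated_query([lab_a, lab_b], "IGHV1-69", authorized_for={"lab_a"})
```

An unauthenticated caller sees only the public records from both labs, while a caller authorized for `lab_a` also sees that lab's restricted record.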

6 IRECEPTOR TECHNICAL CHALLENGES Authentication: iReceptor Gateway, HPC (Compute Canada + others), DBs; how to federate/unify (OAuth2, oauth-myproxy, DB). Distributed DB model: scalability to multiple DBs, performance on LARGE DBs. HPC integration for analysis: moving data to/from HPC, running/controlling jobs, monitoring. Gateway: how do we make something that makes it all easy to use?

8 IRECEPTOR MODEL Fundamental concept: distributed data model. Data is maintained using the iReceptor DB model and controlled by data stewards. Data stewards expose data as desired/allowed through the iReceptor DB Service. iReceptor federates data (DB Service) and coordinates analysis/processing (HPC Services).

9 AGAVE - WHAT IS IT? Hosted, multi-tenant REST API (yes, it is hosted!). Developed at the Texas Advanced Computing Center (TACC), UT Austin. Evolved from the iPlant Collaborative (gateway for plant genomics). Web service abstraction enabling: secure, uniform access to HPC, HTC, and Cloud systems; secure, uniform access to data; an app store model for finding and running scientific codes. White-label, Science-as-a-Service for everyone.

10 AGAVE - WHAT DOES IT DO? Service-based APIs for: Auth: token-based authentication. System: HPC system management. App: HPC App model for abstracting HPC codes. Job: job execution, job monitoring, data staging. File: data management, metadata support, provenance, reporting. Transform: data transformation. User: user management, discovery. Event: notifications and events.

12 AGAVE – IDENTITY/AUTH Multi-tenanting for IdP management and customization. Identity: WSO2 API Manager with customizations; hosted identity as a service; pluggable identity providers (LDAP, AD, DB, etc.); client registration, metering/monitoring (scopes access). Authentication/Authorization: AGAVE v2 is an OAuth2 provider; fine-grained access control over resources (e.g. clients); integrated group and role support.
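
Since the slide describes the service as an OAuth2 provider, a token request would follow the standard OAuth2 resource owner password grant. The sketch below builds such a request with Python's standard library without sending it. The token URL, client key/secret, and user credentials are placeholders, and the exact endpoint path of any real tenant is an assumption rather than something stated in the talk.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_url, client_key, client_secret, username, password):
    """Build (but do not send) an OAuth2 password-grant token request."""
    # Client credentials go in an HTTP Basic Authorization header.
    basic = base64.b64encode(f"{client_key}:{client_secret}".encode()).decode()
    # The user's credentials and the grant type go in the form-encoded body.
    body = urllib.parse.urlencode({
        "grant_type": "password",
        "username": username,
        "password": password,
    }).encode()
    return urllib.request.Request(
        token_url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Hypothetical tenant URL and credentials, for illustration only.
req = build_token_request(
    "https://tenant.example.org/token",
    "my_key", "my_secret", "some_user", "not-a-real-password",
)
```

Passing `req` to `urllib.request.urlopen` would send it; a successful OAuth2 response is a JSON document containing an `access_token` that is then presented as a bearer token on later API calls.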

13 AGAVE – HPC SYSTEM MANAGEMENT Use publicly available data and compute systems (e.g. TACC). Register private systems you have accounts on (e.g. Compute Canada). Systems are associated with people, e.g. register bugaboo.westgrid.ca and silo.westgrid.ca for bcorrie. Uses common HPC authentication methods: username/password (trust?), ssh keys, X.509 certs, MyProxy certs, etc. Understands and can manage schedulers: submit and monitor jobs.

14 AGAVE – HPC APP MANAGEMENT AGAVE Apps are wrappers around computational codes. Expose any HPC code through the AGAVE API. Share and publish Apps for someone, anyone, everyone. Implement cross-platform solutions: the App interface stays the same (inputs, parameters); the App hides system dependencies. Apps are instantiated by associating an App with a System. Apps can be associated with many systems.
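
The relationship between Apps and Systems can be modelled as a small data structure. The sketch below is a conceptual illustration, not the AGAVE API: the `System` and `App` classes, their fields, the host names, and the launch commands are all hypothetical. It shows one App keeping a fixed interface (inputs, parameters) while the system-specific details are stored per System.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class System:
    """A registered compute resource (fields are illustrative)."""
    host: str
    scheduler: str


@dataclass
class App:
    """A wrapper around a computational code with a fixed interface."""
    name: str
    inputs: tuple       # names of the input files the code expects
    parameters: tuple   # names of the tunable parameters
    # Per-system details (e.g. how the code is launched) stay hidden in here.
    deployments: dict = field(default_factory=dict)

    def deploy(self, system: System, launch_command: str) -> None:
        """Associate this App with a System."""
        self.deployments[system] = launch_command

    def command_for(self, system: System) -> str:
        return self.deployments[system]


# Hypothetical systems and launch commands, for illustration only.
cluster_a = System("cluster-a.example.org", "PBS")
cluster_b = System("cluster-b.example.org", "SLURM")

annotate = App("annotate", inputs=("sequences",), parameters=("species",))
annotate.deploy(cluster_a, "qsub run_annotate.pbs")
annotate.deploy(cluster_b, "sbatch run_annotate.sh")
```

A caller only ever sees `annotate.inputs` and `annotate.parameters`; which scheduler command runs underneath depends on the System chosen.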

15 AGAVE – HPC JOB MANAGEMENT AGAVE Jobs are instantiations of an App on a System. Jobs have inputs and parameters. Provides: a common service-based interface for all job execution; automatic data staging and archiving; fire-and-forget execution; events and notifications of job status; provenance and reproducibility.
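
The fire-and-forget pattern with status notifications can be sketched as follows. This is a toy model under stated assumptions, not the AGAVE Jobs service: the `Job` class, the status names in `STATUSES`, and the example app, system, inputs, and parameters are all hypothetical. The caller subscribes once and is then notified of each status change, and the recorded history stands in for provenance.

```python
from dataclasses import dataclass, field

# Hypothetical status sequence; a real service's state names may differ.
STATUSES = ("PENDING", "STAGING_INPUTS", "RUNNING", "ARCHIVING", "FINISHED")


@dataclass
class Job:
    """An instantiation of an App on a System, with inputs and parameters."""
    app: str
    system: str
    inputs: dict
    parameters: dict
    status: str = "PENDING"
    history: list = field(default_factory=list)
    subscribers: list = field(default_factory=list)

    def subscribe(self, callback) -> None:
        """Register a callable to be invoked on every status change."""
        self.subscribers.append(callback)

    def advance(self) -> None:
        """Move to the next status and notify every subscriber."""
        if self.status == STATUSES[-1]:
            return  # already finished, nothing to do
        nxt = STATUSES[STATUSES.index(self.status) + 1]
        self.history.append((self.status, nxt))  # simple provenance trail
        self.status = nxt
        for notify in self.subscribers:
            notify(self)


job = Job(
    app="annotate",
    system="cluster-a.example.org",
    inputs={"sequences": "reads.fasta"},
    parameters={"species": "human"},
)
events = []
job.subscribe(lambda j: events.append(j.status))

# In a real deployment the service drives these transitions; here we step manually.
while job.status != "FINISHED":
    job.advance()
```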

16 AGAVE – HPC DATA MANAGEMENT Data can be moved between AGAVE Systems. Provides: access to distributed file systems through a common API (FTP, SFTP, GridFTP, IRODS, …); built-in metadata support; synchronous and asynchronous file transfers; automated retry, parallelism, and monitoring; notifications and updates; provenance and reporting.
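
Automated retry for file transfers is commonly implemented with exponential backoff. The sketch below shows that general technique, not AGAVE's actual implementation, which the talk does not describe. `transfer_with_retry`, `flaky_transfer`, and the source and destination strings are hypothetical names chosen for the example.

```python
import time


def transfer_with_retry(transfer, source, dest, attempts=3, base_delay=0.01,
                        sleep=time.sleep):
    """Call transfer(source, dest), retrying on failure with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return transfer(source, dest)
        except OSError:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))


# Stand-in transfer that fails twice before succeeding.
calls = []


def flaky_transfer(source, dest):
    calls.append((source, dest))
    if len(calls) < 3:
        raise OSError("connection dropped")
    return f"{source} -> {dest}"


result = transfer_with_retry(
    flaky_transfer, "sftp://lab/reads.fastq", "hpc:/scratch/reads.fastq"
)
```

The `sleep` parameter is injectable so the backoff can be tested or replaced, for example by a scheduler that re-queues the transfer instead of blocking.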

17 AGAVE – OTHER COOL STUFF… Good documentation (although sometimes out of date/buggy): agaveapi.co. Quick start, samples/templates, tutorial, App builder. Command line API (for the coders). Client SDKs (Java, Python, PHP). Gateway ToGo: how to “spin up” a Scientific Gateway quickly.

18 SUMMARY AGAVE has been very valuable for iReceptor: it helps us deal with IdP/authentication/authorization, and it is our main tool for data staging and job management on HPC systems. It is very powerful and very easy to use – have a look at it… Drawbacks: it is a hosted service, so some things are outside of our control (uptime); however, the system is widely used and uptime/availability is high.

19 USEFUL LINKS AGAVE Stuff: AGAVE – agaveapi.co; AGAVE Presentations – agaveapi.co/presentations; AGAVE Developer API Presentation – agaveapi.co/slides/agave-overview/#/; iPlant – www.iplantcollaborative.org; WSO2 – wso2.com/products/api-manager/. iReceptor Stuff: Main site – ireceptor.irmacs.sfu.ca; Gateway site – ireceptorgw.irmacs.sfu.ca (alpha – no access for you! Yet!)

20 Questions?

