HP-SEE project and the HPC Bioinformatics Life Science gateway

M. Kozlovszky, Obuda University

The HP-SEE initiative is co-funded by the European Commission under the FP7 Research Infrastructures contract no. 261499.

Summer School on Workflows and Gateways for Grids and Clouds 2012 – Budapest, Hungary, 2-6.07.2012

Overview
- The HP-SEE project
- HP-SEE Life Sciences Virtual Community
- HP-SEE Bioinformatics Life Science gateway
- Sequence alignment applications
- Workflow-based online bioinformatics services
- Working with workflows/gUSE

Pan-European e-Infrastructures vision
- The Research Network infrastructure provides fast interconnection and advanced services among research and education institutes of different countries.
- The Research Distributed Computing Infrastructure (Grid, HPC) provides a distributed environment for sharing computing power, storage, instruments and databases through the appropriate software (middleware) in order to solve complex application problems.
- This integrated environment is called electronic infrastructure (eInfrastructure). It allows new methods of global collaborative research, often referred to as electronic science (eScience).
- The creation of the eInfrastructure is one of the key objectives to facilitate building of the European Research Area.

[Figure labels: e-Science Collaborations; DCI Infrastructure; Network Infrastructure]

Context: the Model - Converged Communication & Service Infrastructure for South-East Europe

[Figure labels: SEE-LIGHT & GEANT; SEE-GRID & EGI; HP-SEE; User / Knowledge layer. Application domains shown: computational physics, computational chemistry, life sciences, seismology, meteorology, environment.]

Context: Timeline and funding

HP-SEE: Project
- Contract: RI-261499
- Project type: CP & CSA
- Call: INFRA-2010-1.2.3: VRCs
- Start date: 01/09/2010
- Duration: 24 + 9 months
- Total budget: 3 885 196 €
- Funding from the EC: 2 100 000 €
- Total funded effort: 539.5 person-months (PMs)
- Web site: www.hp-see.eu

HP-SEE: Partnership
- Contractors (14)
- Third Party / JRU mechanism used
- Associate universities / research centres

HP-SEE: Project Objectives
- Objective 1: Empowering multi-disciplinary virtual research communities
- Objective 2: Deploying integrated infrastructure for virtual research communities, including a GEANT link to Southern Caucasus
- Objective 3: Policy development and stimulating regional inclusion in pan-European HPC trends
- Objective 4: Strengthening the regional and national human network

The HP-SEE Life Science VRC and its objectives

Main goal: Match the combined HPC resources to the regional needs coming from the life/bioscience communities, foster research in the field within the region with the help of the large-scale, high-availability infrastructure, and facilitate cooperation between the sparsely distributed life science research centres.

Data and limitations
- The Life Sciences domain has been revolutionized by advances in both computer hardware and software algorithms:
  - Assembling the Human Genome
  - Gene-expression chips to understand cellular processes
- Exponential growth in the amount of publicly available genomic data (for example, GenBank).
- Traditional database approaches are no longer sufficient for rapidly performing life science queries involving the fusion of data types.
- Existing computational tools were created by experimentalists dealing with data sets that were minuscule in comparison to those available today. As a result, software that was once perfectly adequate now performs slowly or is incapable of successful analysis on traditional computational platforms.

Accessible infrastructure

Country            Computing center   Cores   Teraflops
Bulgaria           BG Blue Gene/P      8192       27.85
                   HPCG                 576        3.23
FYR of Macedonia   FINKI SC            2016        9
Hungary            NIIFI SC             144        0.5
                   Pecs SC             1152       10
                   Debrecen SC         3078       18
                   Szeged              2112       14
Romania            InfraGRID            400        2.5
                   IFIN_BIO             256        2.72
                   IFIN_BC              368        3.9
                   NCIT                 562        3.4
                   UVT Blue Gene/P     4096       13.9
Serbia             PARADOX              672        6.26
TOTAL                                 23624      115.26

- HP-SEE Supercomputing infrastructure
- SEE-GRID-SCI Grid infrastructure
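The TOTAL row of the infrastructure table can be cross-checked with a few lines of Python. This is only a sanity-check sketch: the figures are transcribed from the slide, while the names `SITES`, `totals` and `per_country` are illustrative and not part of any HP-SEE tooling.

```python
# Rows transcribed from the slide: (country, computing center, cores, teraflops).
SITES = [
    ("Bulgaria", "BG Blue Gene/P", 8192, 27.85),
    ("Bulgaria", "HPCG", 576, 3.23),
    ("FYR of Macedonia", "FINKI SC", 2016, 9.0),
    ("Hungary", "NIIFI SC", 144, 0.5),
    ("Hungary", "Pecs SC", 1152, 10.0),
    ("Hungary", "Debrecen SC", 3078, 18.0),
    ("Hungary", "Szeged", 2112, 14.0),
    ("Romania", "InfraGRID", 400, 2.5),
    ("Romania", "IFIN_BIO", 256, 2.72),
    ("Romania", "IFIN_BC", 368, 3.9),
    ("Romania", "NCIT", 562, 3.4),
    ("Romania", "UVT Blue Gene/P", 4096, 13.9),
    ("Serbia", "PARADOX", 672, 6.26),
]


def totals(sites):
    """Return (total cores, total teraflops rounded to 2 decimals)."""
    cores = sum(row[2] for row in sites)
    tflops = round(sum(row[3] for row in sites), 2)
    return cores, tflops


def per_country(sites):
    """Aggregate (cores, teraflops) per country, keeping the slide's order."""
    summary = {}
    for country, _center, cores, tflops in sites:
        c, t = summary.get(country, (0, 0.0))
        summary[country] = (c + cores, round(t + tflops, 2))
    return summary


if __name__ == "__main__":
    print("Totals:", totals(SITES))
    for country, (cores, tflops) in per_country(SITES).items():
        print(f"{country}: {cores} cores, {tflops} Tflops")
```

The computed totals agree with the TOTAL row on the slide (23624 cores, 115.26 Tflops).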

HP-SEE’s LS Applications

7 applications from 5 countries:
- Greece:
  - Searching for novel miRNA genes and their targets (miRs)
  - Network models of short and long term memory (CMSLTM)
- Montenegro:
  - DNA Multi-core Analysis (DNAMA)
- Hungary:
  - Deep sequencing for short fragment alignment (DeepAligner), gUSE & workflow based
  - In-silico Disease Gene Mapper (DiseaseGene), gUSE & workflow based
- Georgia:
  - Modeling of some biochemical processes with the purpose of realization of their thin and purposeful synthesis (MSBP)
- Armenia:
  - Molecular Dynamics Study of Complex systems (MDSCS)

Why gUSE/WS-PGRADE

Infrastructure
- HP-SEE infrastructure, based on gLite and ARC as middleware
- Authentication procedures are painful (as usual)
- Interoperability with grids is a plus

Application
- Workflow-like process with embedded (legacy) applications
- Restricted input parameter sets for the algorithms
- Service-like operation
- Portal features for a community

Knowledge, licensing & support
- Open source software environment needed
- Knowledge transfer required for the application-specific modules
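The "workflow-like process with embedded applications" requirement can be pictured as a small dependency graph whose steps run only after their inputs are ready. The sketch below is generic Python built on the standard-library `graphlib` module (Python 3.9+); it is not the gUSE or WS-PGRADE API, and `run_workflow` and the step names in the usage example are hypothetical.

```python
from graphlib import TopologicalSorter


def run_workflow(steps, dependencies):
    """Run every step after the steps it depends on and return all results.

    steps:        dict mapping a step name to a callable that receives a dict
                  of its upstream results.
    dependencies: dict mapping a step name to the set of upstream step names
                  (every upstream name is assumed to be a key of `steps`).
    """
    graph = {name: set(dependencies.get(name, ())) for name in steps}
    results = {}
    # static_order() yields each step only after all of its predecessors.
    for name in TopologicalSorter(graph).static_order():
        upstream = {dep: results[dep] for dep in graph[name]}
        results[name] = steps[name](upstream)
    return results


if __name__ == "__main__":
    # Toy three-step pipeline; each lambda stands in for a wrapped legacy tool.
    steps = {
        "split": lambda up: ["ACGT", "TTAG"],
        "align": lambda up: [len(chunk) for chunk in up["split"]],
        "merge": lambda up: sum(up["align"]),
    }
    deps = {"align": {"split"}, "merge": {"align"}}
    print(run_workflow(steps, deps))
```

In a real gateway each step would submit a job to the middleware rather than call a local function; the ordering logic is the part this sketch illustrates.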

HP-SEE Bioinformatics eScience Gateway
- Hosted at Obuda University, operated by MTA SZTAKI
- gUSE + WS-PGRADE (v3.3.2), Liferay based
- Uses the SEE region’s supercomputing & grid infrastructure
- Accessible at: http://ls-hpsee.nik.uni-obuda.hu:8080/liferay-portal-6.0.5

Architecture and application porting steps

Unified porting steps of the applications:

DeepAligner - Deep sequencing for short fragment alignment

Description & Objectives
Mapping short fragment reads to open-access eukaryotic genomes is solvable by a group of algorithms (BLAST, BWA, PatternHunter, and other sequence alignment tools). BLAST (mpiBLAST or ScalaBLAST) is one of the most frequently used tools in bioinformatics; the others are relatively new, fast, lightweight tools that align short sequences. Local installations of these algorithms are typically unable to handle such problem sizes, so the procedure runs slowly, while web-based implementations cannot accept a high number of queries. The HP-SEE infrastructure gives access to massively parallel architectures, and the sequence alignment code is distributed free for academia.

Result
Online workflow-based short sequence alignment service

Impact
Freely available service/code for large-scale short sequence alignment

Collaborations
Hungarian Bioinformatics Association, Semmelweis University

HP-SEE infrastructure used
Hungarian HPC, NIIF’s supercomputing sites
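To make "mapping short reads to a genome" concrete, here is a toy seed-and-verify mapper in Python. It is a teaching sketch under simplifying assumptions (substitutions only, no gaps, the whole reference held in memory) and is not the algorithm used by BLAST, BWA, PatternHunter or DeepAligner; the function names and parameters are illustrative.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every length-k substring of the reference to its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def map_read(read, reference, index, k, max_mismatches=1):
    """Return sorted reference positions where the read aligns without gaps
    and with at most `max_mismatches` substitutions.

    Candidate positions come from exact k-mer (seed) matches, so a hit is
    found only if at least one k-mer of the read matches the reference exactly.
    """
    hits = set()
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        for pos in index.get(seed, ()):
            start = pos - offset
            if start < 0 or start + len(read) > len(reference):
                continue  # read would hang over an end of the reference
            window = reference[start:start + len(read)]
            mismatches = sum(1 for a, b in zip(read, window) if a != b)
            if mismatches <= max_mismatches:
                hits.add(start)
    return sorted(hits)


if __name__ == "__main__":
    reference = "ACGTACGTTAGC"
    index = build_kmer_index(reference, k=4)
    print(map_read("TACGATAG", reference, index, k=4, max_mismatches=1))
```

Each read is mapped independently of the others, which is why this kind of workload spreads well over many cores when the read set is large.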

DeepAligner - Deep sequencing for short fragment alignment (contd.)
- Small-scale launch (home cluster): PBS/Linux cluster at Obuda University, John von Neumann Faculty of Informatics.
- Activity and technical assistance in pre-production stage: technical assistance was provided by MTA SZTAKI and NIIF.
- Porting: the application was ported using Perl/C. The workflow and GUI for the application were created by Obuda University.
- Benchmarking: scaled from 32 cores to 96 cores (MPI).
- DeepAligner status: the online service uses two sites of NIIF’s supercomputing infrastructure (Budapest and Szeged).
- Foreseen activities: optimization of parameter assignments in the GUI, and more scientific publications about short sequence alignment. Further scaling is planned, with performance analysis.
- More information: http://hpseewiki.ipb.ac.rs/index.php/DeepAligner
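A scaling run such as the 32-to-96-core benchmark is usually summarized by relative speedup and parallel efficiency. The sketch below shows the standard definitions in Python; the slide reports no timings, so the wall-clock values in the usage example are hypothetical placeholders and not DeepAligner measurements.

```python
def relative_speedup(t_base, t_scaled):
    """Speedup of the scaled run relative to the baseline run (T_base / T_scaled)."""
    if t_base <= 0 or t_scaled <= 0:
        raise ValueError("wall-clock times must be positive")
    return t_base / t_scaled


def parallel_efficiency(t_base, t_scaled, cores_base, cores_scaled):
    """Achieved speedup divided by the ideal speedup (cores_scaled / cores_base)."""
    if cores_base <= 0 or cores_scaled <= 0:
        raise ValueError("core counts must be positive")
    ideal = cores_scaled / cores_base
    return relative_speedup(t_base, t_scaled) / ideal


if __name__ == "__main__":
    # HYPOTHETICAL wall-clock times in seconds, for illustration only.
    t_32_cores, t_96_cores = 900.0, 360.0
    s = relative_speedup(t_32_cores, t_96_cores)
    e = parallel_efficiency(t_32_cores, t_96_cores, 32, 96)
    print(f"speedup {s:.2f}x, efficiency {e:.0%}")
```

An efficiency close to 1 would mean the extra cores are used almost ideally; lower values point to communication or I/O overhead that the planned performance analysis would need to locate.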

Development & working on gUSE/WS-PGRADE

Pros
- Close collaboration and useful support
- The ARC middleware connector was developed from scratch by MTA SZTAKI on request
- ASM and ARC submitter related bugs have been found and reported
- Helpful and skilled support & development team

Cons
- Internal ARC middleware problems are hard to find

Future plans
- Additional plug-in-like online bioinformatics services:
  - More sequence alignment workflows
  - More multiple sequence alignment workflows
  - Sequence database quality measurement workflows
- Open up the gateway for users outside the SEE region

Thank you for your attention! Questions?

gUSE/WS-PGRADE architecture

[Figure labels: DeepAligner; DiseaseGene; ASM (Application Specific Module); WS-PGRADE]