OurGrid: A Simple Solution for Running Bag-of-Tasks Applications on Grids
Marcelo Meira, Walfredo Cirne
Universidade Federal de Campina Grande

eScience
Computers are changing scientific research
−Enabling collaboration
−As investigation tools (simulations, data mining, etc.)
As a result, many research labs around the world are now computation hungry
Buying more computers is just part of the answer
Making better use of existing resources is the other part

Solution 1: Globus
Grids promise “plug into the wall and solve your problem”
Globus is the closest realization of this vision
−Deployed at dozens of sites
But it requires highly specialized skills and complex off-line negotiation
Good solution for large labs that work in collaboration with other large labs
−CERN’s LCG is a good example of the state of the art

Solution 2: Voluntary Computing
Voluntary computing projects have been a great success, harnessing the power of millions of computers
However, to use this solution, you must
−have a very high-visibility project
−be in a well-known institution
−invest a good deal of effort in “advertising”

And what about the thousands of small and medium-sized research labs throughout the world that also need lots of compute power?

Solution 3: OurGrid
OurGrid is a peer-to-peer grid
Each lab corresponds to a peer in the system
OurGrid is easy to install and automatically configures itself
Labs can freely join the system without any human intervention
To keep it doable, we focus on Bag-of-Tasks applications

Bag-of-Tasks Applications
Data mining
Massive search (such as searching for crypto keys)
Parameter sweeps
Monte Carlo simulations
Fractals (such as Mandelbrot)
Image manipulation (such as tomography)
And many others…

OurGrid Challenges
How to make people collaborate?
−Free-riders are the norm in peer-to-peer networks
−Why should you collaborate with someone you don’t know?
How to keep it simple?
−Grids are complex for a reason
−How to deal with the need for information and configuration?
How to keep it safe?
−Labs you don’t know (or trust) can freely join the grid

Network of Favors
OurGrid forms a peer-to-peer community in which peers are free to join
It’s important to encourage collaboration (i.e., resource sharing) within OurGrid
−In file-sharing, most users free-ride
OurGrid uses the Network of Favors (NoF)
−Each peer maintains a local balance for every peer it knows
−Peers with greater balances have priority
−The emergent behavior of the system is that by donating more, you get more resources
−No additional infrastructure is needed
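The bookkeeping described on this slide can be illustrated with a minimal Python sketch. It is not OurGrid's actual code: the class name `Peer`, its method names, the unit of a "favor", and the rule that a balance never drops below zero are all assumptions made for illustration. The example balances (B = 60, D = 45, E = 0) are the ones shown in the "NoF at Work" slides.

```python
from collections import defaultdict


class Peer:
    """Hypothetical peer that keeps a local favor balance per known peer."""

    def __init__(self, name):
        self.name = name
        # Other peer's name -> favors we owe it (unknown peers start at 0).
        self.balance = defaultdict(float)

    def record_received(self, provider, amount):
        # We consumed `amount` of the provider's resources, so we owe it more.
        self.balance[provider] += amount

    def record_donated(self, consumer, amount):
        # We repaid `consumer`; assumption: balances are floored at zero.
        self.balance[consumer] = max(0.0, self.balance[consumer] - amount)

    def pick_consumer(self, requesters):
        # Peers with greater balances have priority; ties keep request order.
        return max(requesters, key=lambda r: self.balance[r])


a = Peer("A")
a.record_received("B", 60)
a.record_received("D", 45)
first = a.pick_consumer(["E", "D", "B"])   # B holds the largest balance
a.record_donated("B", 30)                  # B's balance drops to 30
second = a.pick_consumer(["E", "B", "D"])  # D (45) now outranks B (30)
```

Because every balance is purely local, no peer has to trust another peer's bookkeeping, which matches the slide's point that no additional infrastructure is needed.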

NoF at Work [1]
(Figure: peers A, B, C, D and E plus a broker; messages ConsumerFavor and ProviderFavorReport; local balances B = 60, D = 45; * = no idle resources now)

NoF at Work [2]
(Figure: peers A, B, C, D and E plus a broker; messages ConsumerQuery and ProviderWorkRequest; local balances B = 60, D = 45, E = 0; * = no idle resources now)

Free-rider Consumption
Epsilon is the fraction of resources consumed by free-riders

Equity Among Collaborators

Scheduling with No Information
Grid scheduling typically depends on information about the grid (e.g., machine speed and load) and the application (e.g., task size)
However, getting good information is hard
Can we schedule without information and deploy the system now?
Work-queue with Replication (WQR)
−Tasks are sent to idle processors
−When there are no more tasks, running tasks are replicated on idle processors
−The first replica to finish is the official execution
−Other replicas are cancelled
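The four WQR rules can be sketched as a small event-driven simulation in Python. This is an illustration under stated assumptions, not the scheduler shipped with OurGrid: the function name `wqr_makespan`, the `max_replicas` cap, the choice to replicate the task with the fewest replicas first, and the model "run time = task size / machine speed" are all hypothetical. Note that the scheduler itself never reads sizes or speeds; only the simulated clock does.

```python
import heapq


def wqr_makespan(task_sizes, machine_speeds, max_replicas=2):
    """Simulate Work-queue with Replication; return (makespan, wasted_time)."""
    pending = list(range(len(task_sizes)))    # tasks not started yet (FIFO)
    running = {}                              # task -> {machine: start time}
    done = set()
    events = []                               # heap of (finish, machine, task)
    idle = list(range(len(machine_speeds)))
    now, wasted = 0.0, 0.0

    def start(task, machine):
        running.setdefault(task, {})[machine] = now
        finish = now + task_sizes[task] / machine_speeds[machine]
        heapq.heappush(events, (finish, machine, task))

    def dispatch():
        while idle:
            if pending:                       # rule 1: feed idle processors
                start(pending.pop(0), idle.pop(0))
                continue
            # Rule 2: no tasks left, so replicate a task that is still running.
            candidates = [t for t, reps in running.items()
                          if len(reps) < max_replicas]
            if not candidates:
                break
            task = min(candidates, key=lambda t: len(running[t]))
            start(task, idle.pop(0))

    dispatch()
    while events:
        finish, machine, task = heapq.heappop(events)
        if task in done:
            continue                          # stale event of a cancelled replica
        now = finish
        done.add(task)                        # rule 3: first replica wins
        for other, started in running.pop(task).items():
            if other != machine:
                wasted += now - started       # rule 4: cancel the other replicas
            idle.append(other)
        dispatch()
    return now, wasted


with_replication = wqr_makespan([10, 10], [1.0, 4.0])
without_replication = wqr_makespan([10, 10], [1.0, 4.0], max_replicas=1)
```

In this toy run, replication halves the makespan because the fast machine re-executes the task stuck on the slow one, at the cost of the slow machine's discarded work.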

Work-queue with Replication
8000 experiments
Experiments varied in
−grid heterogeneity
−application heterogeneity
−application granularity
Performance summary: (figure not preserved in the transcript)

WQR Overhead
The obvious drawback of WQR is the cycles wasted by the cancelled replicas
Wasted cycles: (figure not preserved in the transcript)

Data Aware Scheduling
WQR achieves good performance for CPU-intensive BoT applications
However, many important BoT applications are data-intensive
These applications frequently reuse data
−During the same execution
−Between two successive executions
Storage Affinity uses replication and just a bit of static information to achieve good scheduling for data-intensive applications
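The slide does not define the "bit of static information" that Storage Affinity uses. The Python sketch below assumes it is the size and location of input files, and ranks sites by how many bytes of a task's input they already hold. The function names, file names and sizes are hypothetical, and the replication step is omitted.

```python
def storage_affinity(task_inputs, site_files):
    """Bytes of the task's input already stored at a site (assumed metric)."""
    return sum(size for name, size in task_inputs.items() if name in site_files)


def pick_site(task_inputs, sites):
    """Choose the site that would need to receive the least new input data."""
    return max(sites, key=lambda s: storage_affinity(task_inputs, sites[s]))


# Hypothetical task: a large reusable database plus a small per-task query.
task = {"genome.db": 500_000_000, "query.fa": 2_000}
sites = {
    "siteA": {"query.fa"},     # holds only the small file
    "siteB": {"genome.db"},    # already holds the large, reused file
}
chosen = pick_site(task, sites)
```

Sending the task to the site that already stores the large file avoids the expensive transfer, which is the kind of data reuse the slide describes.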

Storage Affinity Results
3000 experiments
Experiments varied in
−grid heterogeneity
−application heterogeneity
−application granularity
Performance summary: (table comparing Storage Affinity, X-Suffrage and WQR by average execution time in seconds and standard deviation; values not preserved in the transcript)

SWAN: OurGrid Security
Bag-of-Tasks applications only communicate to receive input and return the output
−This is done by OurGrid itself
The remote task runs inside a Xen virtual machine, with no network access, and disk access only to a designated partition

SWAN Architecture
(Figure: three identical stacks, each labelled Guest OS, Grid OS, Grid Middleware, Grid Application)

Making it Work for Real...

OurGrid Status
The OurGrid free-to-join community has been in production since December 2004
OurGrid is open source (GPL) and available for download
−We’ve had external contributions
The latest OurGrid version is 3.3
−It contains the 10th version of MyGrid
−The Network of Favors has been available since version 3.0
−SWAN has been available since version 3.1
−We’ve had around 180 downloads

Some projects using OurGrid
HIV Research (LNCC)
Smart Pumping (UFCG/PETROBRAS)
GerpavGrid (PUC-RS)
GridVida (UFPE/CESAR)
GrinfoSeg (UNIFOR)
BioPAUÁ (LNCC)
SegHidro (UFCG)

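HIV research with OurGrid
(Figure: HIV classification tree with types HIV-1 and HIV-2; HIV-1 groups M, N and O; subtypes A, B, C, D, F, G, H, J, K; subtypes B, C and F highlighted; annotations: “prevalent in Europe and Americas”, “prevalent in Africa”, “majority in the world”, “18% in Brazil”)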

HIV protease x Ritonavir
(Figure: RMSD plots for Subtype B and Subtype F)

Performance Results for the HIV Application
55 machines in 6 administrative domains in the US and Brazil
Task = 3.3 MB input, 1 MB output, 4 to 33 minutes of dedicated execution
Ran 60 tasks in 38 minutes
Speed-up is 29.2 for 55 machines
−Considering an 18.5-minute average machine
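The reported speed-up can be reproduced from the numbers on this slide. The short Python check below assumes that "an 18.5-minute average machine" means a task takes 18.5 minutes on an average machine, so the sequential baseline is 60 tasks at 18.5 minutes each. The efficiency figure is derived here and does not appear on the slide.

```python
tasks = 60
avg_task_minutes = 18.5      # assumed: time per task on an average machine
wall_clock_minutes = 38      # time the grid took to run all 60 tasks
machines = 55

sequential_minutes = tasks * avg_task_minutes       # 1110 minutes on one machine
speedup = sequential_minutes / wall_clock_minutes   # about 29.2
efficiency = speedup / machines                     # about 0.53

print(f"speed-up = {speedup:.1f}, efficiency = {efficiency:.2f}")
```

With only 60 tasks for 55 machines and task times ranging from 4 to 33 minutes, most machines run a single task, which helps explain why the speed-up is roughly half the machine count.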

Smart Pumping
Objective: Control PETROBRAS pumping stations in the states of Rio Grande do Norte and Ceará.
Uses genetic algorithms to search for the best scenario, considering operational and economic factors.

GerpavGrid
Objective: Provide a system for better management of Porto Alegre’s street maintenance plan.

GridVida
Objectives:
−Use grid computing to integrate the SUS/Recife electronic medical information system.
−Design a tool to support medical imaging diagnosis.
−Use the processing power provided by grid computing to enable Similarity Measurement (SM) applied to Content-Based Medical Image Retrieval (CBIR).

GrinfoSeg
Objective: Provide a data mining system for SENASP, the Secretaria Nacional de Segurança Pública (Brazil’s National Secretariat of Public Security).

BioPAUÁ
Objective: Offer a tool, as well as the facility, for researchers working in several important fields (e.g., bioinformatics, structural biology, biochemistry, medicinal chemistry, biopharmacology) to run Molecular Dynamics (MD) simulations on a computational grid.

SegHidro
Objective: Provide means to simulate a wide variety of scenarios, based on weather and climate forecasts, to support better decisions about water reservoir management, agricultural planning and flood control.
A typical SegHidro application is a workflow in which the input of each model is the output of the preceding one.
Portal users can upload their own models or use models previously uploaded by other users.

Conclusions
We have a free-to-join grid solution for Bag-of-Tasks applications working today.
Real users provide invaluable feedback for systems research.
Delivering results to real users is really cool! :-)
OurGrid may be a tool to help small labs join EELA.

Questions?

Thank you! Merci! Danke! Grazie! Gracias! Obrigado! More at