Current team –Anubis Rossetto – PhD student Dependability; health care –Carlos Oberdan Rolim – PhD student Context aware; health care –João Ladislao – PhD student Context aware; agricultural
MW for Ubicomp / Wireless Sensor Networks Current team –Gisele Souza – Master student SW engineering –Rodrigo Souza – PhD student Wireless sensor networks; agricultural –Valderi Leithardt – PhD student Resource discovery on WSN
MW for Ubicomp / Wireless Sensor Networks Recent team –Cristiano Costa – former PhD student Architecture model; context service –Diego Midon Pereira – master student Probabilistic diffusion –Luciano C. da Silva – PhD student Context adaptation –Several other PhD and master students since 2000
MW for Ubicomp / Wireless Sensor Networks Mobile team –TG students –Intelligent Systems for Urban Transport –Other applications –Humberto Felizzola –Luciano Goulart –Renan Drabach –Sébastien Skorupski Polytech, Grenoble, France
MW for Ubicomp / Wireless Sensor Networks Ubicomp (UC) → delivering more meaningful services which are ubiquitously available –higher integration of systems, –improved mobility and scalability, –context-awareness, self-adaptation, etc.
MW for Ubicomp / Wireless Sensor Networks Research interests → development of middleware and frameworks to foster development of UC systems –Current focus: support for mobile context-aware systems; support for autonomous control of adaptations; communication protocols for ad-hoc networks; agricultural systems; health care systems
MW for Ubicomp / Wireless Sensor Networks Research interests → … –Previous grants from Fapergs, CNPq and RNP –Since 2000
MW for Ubicomp / Wireless Sensor Networks Research interests → … –Outcomes: so far, produced 4 generations of Ubicomp middleware (ISAM, ContextS, GRADEp and Continuum) –Partners include UNISINOS, UFSM, UCPel and UFPEL
MW for Ubicomp / Wireless Sensor Networks MW4G Project –Definition International partnership project between UFRGS and University of Coimbra (Portugal) financed by CAPES-Grice to work on Wireless Sensor Networks (WSNs) –Main objective → proposal and evaluation of new content and mechanisms of middleware for WSNs
MW for Ubicomp / Wireless Sensor Networks Other Projects –INF Smart Cities Large INF project
Current team –Eduardo Bezerra – PhD student, former master student Interest algorithms and load balancing –Fabio Cecin – former PhD and master student Architecture models (P2P); state cheating –Felipe Severino – master student Cheating
Massively Multiplayer Online Games Objective: decentralize the network support –Client-server (traditional and expensive) –Fully decentralized (P2P) –Hybrid (client-server + p2p)
Massively Multiplayer Online Games Issues implied by decentralization –Game state consistency management –Saturation of the peers’ network link –Firewall/NAT between peers –Cheating facilitated by the lack of central arbiter
Massively Multiplayer Online Games Projects/Works: –FreeMMG: MMOG middleware with a hybrid architecture; each cell (a portion of the virtual environment) is managed by a P2P group, and the interaction between cells is mediated by a central server (Fabio Cecin's master's thesis)
Massively Multiplayer Online Games Projects/Works: –P2PSE – hybrid middleware, which divides the game into: Action spaces: –fast-paced and network-demanding game interactions, such as fighting; –consists of small-scale spaces disjoint from the rest of the game world, –where the interaction is in a P2P manner, with a limited number of players
Massively Multiplayer Online Games Projects/Works: –P2PSE – (cont.): Social space: –unique and large space managed by the central server, –where only social interactions are allowed, such as chatting, trading etc., –between an unlimited number of players FreeMMG 2: PhD Thesis of Fabio Cecin
Massively Multiplayer Online Games Projects/Works: –Cosmmus: doctorate plan of Eduardo Bezerra Load balancing algorithms Interest algorithms (communication) Optimistic multicast algorithms –New work, at Lugano –Cheating treatment: master plan of Felipe Severino
Massively Multiplayer Online Games Group Pages –Games https://saloon.inf.ufrgs.br/twiki/view/Projetos/Jogos/WebHome
Activities on Grid and Volunteer Computing MapReduce
Grid and Cloud Computing Current Team –Alexandre Miyazaki – IC student –Bruno Donassolo - Master student –Eduardo Martins da Rocha – TG student –Eder Fontoura - Master student –Julio Anjos – Master student
Grid and Cloud Computing Current Team –Marko Petek – PhD student –Otávio Kreling Zabaleta – TG student –Pedro de Botelho Marcos – Master student –Wagner Kolberg – Master student
Grid and Cloud Computing Recent Team –Diego Gomes - Former master student –Rafael dal Zotto – Former master student
Grid and Cloud Computing Grid: what is it? –A grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed "autonomous" resources dynamically at runtime depending on their availability, capability, performance, cost, and users' quality-of-service requirements http://www.cs.mu.oz.au/~raj/GridInfoware/gridfaq.html
Grid and Cloud Computing Grid: what is it? –Grids are persistent environments that enable software applications to integrate instruments, displays, computational and information resources that are managed by diverse organizations in widespread locations (GGF, 2002)
Grid and Cloud Computing Some software for grid –Globus Project Most (?) widely used grid sw http://www.globus.org/ GSI – Security MDS – Information Service GRAM – Execution Management Data Management Intensive use of Web Services –Not in last version
Grid and Cloud Computing Some software for grid –BOINC Project Desktop grid or volunteer computing http://boinc.berkeley.edu/ Master/slave architecture Several applications (projects) –SETI@home –http://setiathome.berkeley.edu/ Client (anonymous) machines request tasks from the server machine Client machines are not reliable Very complex client scheduler
Grid and Cloud Computing Projects: –Profile-based scheduling algorithms that take resource usage into account, modeled on the XtremWeb architecture (Eder Fontoura); –A proposal for fast diskless checkpointing with object prevalence for volunteer computing environments, focused on small devices (Rafael Dal Zotto);
Grid and Cloud Computing Projects: –Evaluation and comparison of the XtremWeb scheduler (FCFS) and the model proposed by Fontoura, through simulations on SimGrid and experiments on Grid5000; –Running SimGrid simulations in a distributed way on a real grid architecture; –Modeling the BOINC scheduler on SimGrid –Bruno, Julio, Wagner and Eduardo
Grid and Cloud Computing Projects: –Evaluation of the BOINC scheduler: classical throughput-oriented projects versus new burst projects –With game-theoretic modeling (Nash equilibrium) –Using the BOINC scheduler on SimGrid –Bruno Donassolo and Eduardo Rocha
Grid and Cloud Computing Projects: –In the context of the CERN CMS experiment: development of a PhD thesis and of an MSc dissertation on the creation of a file and replica system for use on the Grid Currently: –one PhD student and –a former MSc student –Both doing research at CERN, Geneva Marko Petek and Diego Gomes
Grid and Cloud Computing Implementations: –XtremWeb deployment on Grid5000 with the original XW scheduler (FCFS) and with the model proposed by Eder Fontoura (Julio Anjos); –XtremWeb simulation on SimGrid: Desktop Computing scenario, both schedulers
MapReduce on Desktop Grid and Cloud Environments
MapReduce Team Current Team –Alexandre Miyazaki – IC student –Julio Anjos – Master student –Otávio Kreling Zabaleta – TG student –Pedro de Botelho Marcos – Master Student –Wagner Kolberg - Master student
The MapReduce Model MapReduce is a programming model for large-scale parallel computing and for processing large data sets. It abstracts the complexity of distributing parallel applications. The model is inspired by the map and reduce primitives present in many functional languages, such as Lisp and Haskell. The Hadoop implementation of the MapReduce model is one of the most widely used in production systems (e.g. Yahoo, Facebook, Amazon)
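The model can be illustrated with a single-process sketch. This is not Hadoop code: the names `run_mapreduce`, `wc_mapper` and `wc_reducer` are hypothetical, and a real framework would distribute the map, shuffle and reduce steps across many machines.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Apply mapper to every record, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        # Map phase: each record yields zero or more (key, value) pairs.
        for key, value in mapper(record):
            # Shuffle: values with the same key end up in the same group.
            groups[key].append(value)
    # Reduce phase: one call per distinct key.
    return {key: reducer(key, values) for key, values in groups.items()}


def wc_mapper(line):
    """Word count mapper: emit (word, 1) for every word in the line."""
    return [(word, 1) for word in line.split()]


def wc_reducer(word, counts):
    """Word count reducer: sum the partial counts of one word."""
    return sum(counts)


if __name__ == "__main__":
    print(run_mapreduce(["a b a", "b c"], wc_mapper, wc_reducer))
```

The user only writes the mapper and the reducer; grouping and (in a real system) distribution are handled by the framework.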
Implementations on other platforms GPGPU o MARS o http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5557865&tag=1 o Three versions: GPU; GPU/CPU; GPU + Hadoop (uses streaming) o Problem: GPGPUs have no dynamic memory allocation, so additional steps are needed to calculate the output size of the Map and Reduce phases
Implementations on other platforms Multicore o Phoenix o http://dl.acm.org/citation.cfm?id=1317533.1318097 o The number of maps is the number of cores o The same holds for reduces o Main problem: input size is limited by the computer's memory capacity
Implementations on other platforms Desktop Grid o BitDew o http://dl.acm.org/citation.cfm?id=1918097 o Works similarly to Hadoop o Main problem: environment volatility
Implementations on other platforms Cloud computing o Cloud MapReduce - http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5948637 Decentralized Deployed on the Amazon cloud OS Only 3000 lines of code (Hadoop has 300k) There is a task queue Each VM fetches a task and executes it Results are written to queues to be consumed in other stages Problem: it uses proprietary OS-specific services, which prevents its use elsewhere
Implementations on other platforms o Cloud computing o Azure MapReduce - http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5708501 Same as above, but using the Microsoft cloud
Problems Heterogeneity o LATE o http://dl.acm.org/citation.cfm?id=1855744 o Predicting the tasks that will take longer and launching them speculatively, shortening the job execution time o Problem: it can launch tasks unnecessarily
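The idea of predicting which task will take longest can be sketched as below. This is a simplified illustration, not the LATE implementation: it assumes progress is a fraction between 0 and 1 that grows linearly with time, the function names are hypothetical, and it omits the thresholds and caps a real scheduler would need to limit unnecessary launches.

```python
def estimate_time_left(progress, elapsed):
    """Estimate remaining time assuming a constant progress rate (an assumption)."""
    rate = progress / elapsed          # fraction of the task completed per second
    return (1.0 - progress) / rate     # seconds needed for the remaining fraction


def pick_speculative(tasks):
    """Return the running task with the longest estimated time left, or None.

    tasks maps a task name to a (progress, elapsed_seconds) tuple.
    """
    candidates = {
        name: estimate_time_left(progress, elapsed)
        for name, (progress, elapsed) in tasks.items()
        if 0.0 < progress < 1.0 and elapsed > 0.0
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    running = {"t1": (0.9, 10.0), "t2": (0.2, 10.0), "t3": (0.5, 10.0)}
    print(pick_speculative(running))
```

In the example, `t2` has made the least progress in the same elapsed time, so it has the longest estimated time left and is the candidate for a speculative copy.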
Problems Volatility o MOON o http://dl.acm.org/citation.cfm?id=1851476.1851489 o Stable machines were used to prevent data loss and ensure that a job finishes execution o Problem: o It uses task re-execution to tolerate failures o The cost could be very high in a volatile environment
Problems Fault tolerance o Single points of failure Master: master replication http://dl.acm.org/citation.cfm?id=1651263.1651271 HDFS NameNode fault-tolerant library http://dl.acm.org/citation.cfm?id=1629602
Problems Fault tolerance o Byzantine faults Result verification Centralized http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6008723 Distributed http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6009055
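The slide does not say how results are verified. One common approach, shown here purely as an assumption, is to run the same task on several workers and accept a result only if a strict majority of replicas agree. The function name is hypothetical.

```python
from collections import Counter


def verify_by_majority(results):
    """Return the value reported by a strict majority of replicas, else None."""
    if not results:
        return None
    value, votes = Counter(results).most_common(1)[0]
    # Accept only when more than half of the replicas agree.
    if votes > len(results) / 2:
        return value
    return None


if __name__ == "__main__":
    print(verify_by_majority([42, 42, 7]))   # two of three replicas agree
    print(verify_by_majority([1, 2, 3]))     # no agreement
```

In a centralized scheme the master would run this check; in a distributed scheme the comparison would be done among the workers themselves.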
Problems Fault tolerance o Persistence of intermediate pairs Prevents replay in case of failures RAFT algorithms http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5767877 Low-cost algorithms to persist and retrieve intermediate pairs
Problems o Allocation of new machines during execution Useful for restoring CPU capacity after a machine failure and for respecting SLAs MapReduce RSA - Resource Scaling Algorithm http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5999808
Problems The Hadoop framework makes some assumptions about the behavior of the environment, since it is optimized to run on computer clusters: –Nodes execute tasks at almost the same rate, therefore the tasks finish in waves. –A node with a lower execution rate than the cluster average is a straggler. –There is no cost to launch speculative tasks. –A non-responsive machine is in a failure state
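The second assumption can be made concrete with a small sketch. It is not Hadoop's actual detection code: it simply flags every node whose rate is below the cluster average, as the slide states, and the node names and rates are invented for illustration.

```python
def find_stragglers(rates):
    """Return the nodes whose execution rate is below the cluster average."""
    average = sum(rates.values()) / len(rates)
    return sorted(node for node, rate in rates.items() if rate < average)


if __name__ == "__main__":
    # Near-homogeneous cluster: only the genuinely slow node is flagged.
    print(find_stragglers({"a": 1.0, "b": 1.0, "c": 0.2}))
    # Heterogeneous environment: one fast node raises the average,
    # so every ordinary node is flagged as a straggler.
    print(find_stragglers({"fast": 10.0, "n1": 1.0, "n2": 1.0, "n3": 1.0}))
```

The second call shows why the rule breaks down outside clusters: healthy but slower machines are treated as faulty, which triggers speculative tasks that are not free in practice.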
MapReduce Simulator for Heterogeneous Environments
Agenda Introduction Motivation MRPerf Simulator Architecture Characteristics Current Development State Final Considerations
Motivation Implementing theoretical algorithms in a simulator is much easier and faster than in a real deployment, thanks to the simplified model. Modifications to task scheduling algorithms, the speculative execution mechanism, and data distribution can be validated easily (HDFS is modeled as a simple data structure that represents data placement in the grid).
Motivation Testing several configurations and scenarios is also faster. The number of nodes, their computation power and the network topology are specified in a description file. No real computation or data transfer takes place. There is no need to develop different MapReduce applications to validate different task sizes.
Motivation Without the restrictions of a real environment, it is possible to test an algorithm's scalability with different environments, topologies, and MapReduce configurations: chunk size and number of replicas; number of reduces; etc.
MRPerf MRPerf is a MapReduce simulator developed on top of the ns-2 network simulator. So, why didn't we use MRPerf? MRPerf does not implement key Hadoop features that are important in heterogeneous environments, such as HDFS chunk replication, the speculative execution mechanism, and stragglers (faulty, slow nodes).
MRPerf Without data replication, once a node has processed all its local chunks, the master node will schedule map tasks over non-local chunks, leading to increased data transfer. If there is no speculative mechanism, slower machines will slow down the whole MapReduce job. The lack of these two features alone significantly changes the framework's behavior.
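The effect of replication on data locality can be reproduced with a toy event-driven simulation. This sketch is unrelated to MRPerf or to the simulator presented here: the round-robin chunk placement, the `simulate` function and the node speeds are all assumptions, and transfer costs are ignored so that only the number of non-local map tasks is counted.

```python
import heapq


def simulate(speeds, num_chunks, replicas):
    """Count non-local map tasks for a greedy, locality-first scheduler.

    speeds[i] is the number of chunks node i processes per time unit.
    Chunk c is stored on `replicas` consecutive nodes starting at c % n.
    Returns (non_local_tasks, makespan).
    """
    n = len(speeds)
    holders = [{(c + k) % n for k in range(replicas)} for c in range(num_chunks)]
    pending = set(range(num_chunks))
    events = [(0.0, node) for node in range(n)]   # (time the node becomes idle, node)
    heapq.heapify(events)
    non_local = 0
    makespan = 0.0
    while pending:
        now, node = heapq.heappop(events)
        local = [c for c in pending if node in holders[c]]
        if local:
            chunk = min(local)          # prefer a chunk stored on this node
        else:
            chunk = min(pending)        # otherwise take any chunk (non-local)
            non_local += 1
        pending.remove(chunk)
        finish = now + 1.0 / speeds[node]
        makespan = max(makespan, finish)
        heapq.heappush(events, (finish, node))
    return non_local, makespan


if __name__ == "__main__":
    speeds = [4.0, 1.0, 1.0, 1.0]       # one fast node, three slow ones
    for replicas in (1, 2, 4):
        print(replicas, simulate(speeds, 16, replicas))
```

With a single replica the fast node exhausts its local chunks early and starts taking non-local ones; with a copy of every chunk on every node no task is non-local.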
Characteristics of the Proposed Simulator Developed on top of SimGrid. SimGrid has a simplified model, which makes it faster. The chosen SimGrid API was MSG: although the map and reduce phases follow a DAG model, a few Hadoop mechanisms did not seem to fit SimDag well. The environment is described through SimGrid's native platform and deployment files.
Current Development State Implemented: HDFS data distribution; Map, Copy and Reduce phases; HDFS chunk replication; speculative execution mechanism. To do: rack configuration; node volatility (desktop grids).
Current Development State
Characteristic | MRPerf | MR Version
Base system | ns-2 | SimGrid
Heterogeneous environments | No | Yes
Rack configuration | Yes | To do
HDFS chunk replication | No | Yes
Speculative execution | No | Yes
Node volatility | No | To do
Current Development State Validation with 20 nodes and 20 reduce tasks:
Current Development State Validation with 200 nodes and 1 reduce task:
Simulation of the New Data Distribution Algorithm Simulating the new data distribution resulted in the expected behavior: fewer non-local tasks than with the original algorithm.
Final Considerations The new simulator should assist research on MapReduce, especially scalability and behavioral analysis. Handling heterogeneity and volatility will allow the study of MapReduce deployment on Desktop Grids.
Maresia Project Pedro Botelho PEP (master) Motivation Hadoop has single points of failure: Master; NameNode of the distributed file system
Maresia Project Goals Avoid single points of failure Decentralized approach – P2P Information previously contained in single points is now distributed/replicated among workers
Future Work To continue this work, we intend to refine and fully define the modifications to the Hadoop framework, then implement and thoroughly test them. We intend to validate the presented approaches by developing a simulator using SimGrid. With the adapted Hadoop framework, we expect to obtain a reliable and efficient implementation of the MapReduce model aimed at Desktop Grid environments.
Grid and Cloud Computing Group Pages –HEP Project https://saloon.inf.ufrgs.br/twiki/view/Projetos/Grade/HEP/WebHome –Boinc, XW and MR activities https://saloon.inf.ufrgs.br/twiki/view/Projetos/Grade/Overview
GPPD Activities on SLD (“Sistemas Largamente Distribuídos”, Widely Distributed Systems)