GPPD Activities on SLD “Sistemas Largamente Distribuídos”

1 GPPD Activities on SLD “Sistemas Largamente Distribuídos”

2 Areas
–Ubiquitous Computing and Sensor Networks
–Massively Multiplayer Online Games
–Grid and Cloud Computing
 –MapReduce

3 MW for Ubicomp / Wireless Sensor Networks

4 MW for Ubicomp / Wireless Sensor Networks Current team
–Anubis Rossetto – PhD student: dependability; health care
–Carlos Oberdan Rolim – PhD student: context awareness; health care
–João Ladislao – PhD student: context awareness; agriculture

5 MW for Ubicomp / Wireless Sensor Networks Current team
–Gisele Souza – Master student: SW engineering
–Rodrigo Souza – PhD student: wireless sensor networks; agriculture
–Valderi Leithardt – PhD student: resource discovery on WSNs

6 MW for Ubicomp / Wireless Sensor Networks Recent team
–Cristiano Costa – former PhD student: architecture model; context service
–Diego Midon Pereira – Master student: probabilistic diffusion
–Luciano C. da Silva – PhD student: context adaptation
–Several other PhD and Master students since 2000

7 MW for Ubicomp / Wireless Sensor Networks Mobile team
–TG students
–Intelligent Systems for Urban Transport
–Other applications
–Humberto Felizzola
–Luciano Goulart
–Renan Drabach
–Sébastien Skorupski (Polytech, Grenoble, France)

8 MW for Ubicomp / Wireless Sensor Networks
Ubicomp (UC) → delivering more meaningful services which are ubiquitously available:
–higher integration of systems,
–improved mobility and scalability,
–context-awareness, self-adaptation, etc.

9 MW for Ubicomp / Wireless Sensor Networks
Research interests → development of middleware and frameworks to foster the development of UC systems
–Current focus:
 support for mobile context-aware systems
 support for autonomous control of adaptations
 communication protocols for ad-hoc networks
 agricultural systems
 health care systems

10 MW for Ubicomp / Wireless Sensor Networks Research interests → … –Previous grants from Fapergs, CNPq and RNP –Since 2000

11 MW for Ubicomp / Wireless Sensor Networks Research interests → …
–Outcomes: so far, four generations of Ubicomp middleware (ISAM, ContextS, GRADEp and Continuum)
–Partners include UNISINOS, UFSM, UCPel and UFPel

12 MW for Ubicomp / Wireless Sensor Networks MW4G Project
–Definition: international partnership project between UFRGS and the University of Coimbra (Portugal), financed by CAPES-Grices, to work on Wireless Sensor Networks (WSNs)
–Main objective → proposal and evaluation of new content and mechanisms of middleware for WSNs

13 MW for Ubicomp / Wireless Sensor Networks Other Projects –INF Smart Cities Large INF project

14 MW for Ubicomp / Wireless Sensor Networks Group pages
–Current activities: s/UbicompOverview
–Isam project: s/ISAM/WebHome
–MW4R: s/MW4R/WebHome

15 Massively Multiplayer Online Games

16 Massively Multiplayer Online Games Current team
–Eduardo Bezerra – PhD student (former Master student): interest algorithms and load balancing
–Fabio Cecin – former PhD and Master student: architecture models (P2P); state cheating
–Felipe Severino – Master student: cheating

17 Massively Multiplayer Online Games
Objective: decentralize the network support
–Client-server (traditional and expensive)
–Fully decentralized (P2P)
–Hybrid (client-server + P2P)

18 Massively Multiplayer Online Games
Issues implied by decentralization:
–Game state consistency management
–Saturation of the peers' network links
–Firewalls/NATs between peers
–Cheating facilitated by the lack of a central arbiter
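One classical way to fight link saturation (and the basis of the interest algorithms studied in the group) is interest management: a peer only receives updates from entities inside its area of interest (AOI), so per-peer traffic stays bounded regardless of the total population. A minimal sketch, assuming a hypothetical radius-based AOI and player records with `id` and `pos` fields (names are illustrative, not from the projects above):

```python
import math

def within_aoi(p, q, radius):
    # A peer only needs state updates from entities whose position
    # falls inside its circular area of interest.
    return math.dist(p["pos"], q["pos"]) <= radius

def recipients(sender, players, radius=10.0):
    # Filter the update's recipients: every other player inside the
    # sender's AOI. This bounds bandwidth per peer instead of
    # broadcasting to the whole game world.
    return [p["id"] for p in players
            if p["id"] != sender["id"] and within_aoi(sender, p, radius)]

# Illustrative world: "b" is close to "a", "c" is far away.
players = [{"id": "a", "pos": (0, 0)},
           {"id": "b", "pos": (3, 4)},
           {"id": "c", "pos": (100, 0)}]
```

With the sample world above, an update from "a" reaches only "b"; "c" is outside the radius and receives nothing.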

19 Massively Multiplayer Online Games Projects/Works:
–FreeMMG: MMOG middleware with a hybrid architecture
 each cell – a portion of the virtual environment – is managed by a P2P group, and the interaction between cells is mediated by a central server
 Fabio Cecin's master thesis

20 Massively Multiplayer Online Games Projects/Works:
–P2PSE – hybrid middleware, which divides the game into:
 Action spaces:
 –fast-paced and network-demanding game interactions, such as fighting;
 –consist of small-scale spaces disjoint from the rest of the game world,
 –where the interaction is in a P2P manner, with a limited number of players

21 Massively Multiplayer Online Games Projects/Works:
–P2PSE (cont.):
 Social space:
 –a unique and large space managed by the central server,
 –where only social interactions are allowed, such as chatting, trading, etc.,
 –between an unlimited number of players
 FreeMMG 2: PhD thesis of Fabio Cecin

22 Massively Multiplayer Online Games Projects/Works:
–Cosmmus: doctoral plan of Eduardo Bezerra
 load-balancing algorithms
 interest algorithms (communication)
 optimistic multicast algorithms (new work, at Lugano)
–Cheating treatment: master plan of Felipe Severino

23 Massively Multiplayer Online Games Group Pages
–Games: s/Jogos/WebHome

24 Activities on Grid and Volunteer Computing MapReduce

25 Grid and Cloud Computing Current Team
–Alexandre Miyazaki – IC student
–Bruno Donassolo – Master student
–Eduardo Martins da Rocha – TG student
–Eder Fontoura – Master student
–Julio Anjos – Master student

26 Grid and Cloud Computing Current Team
–Marko Petek – PhD student
–Otávio Kreling Zabaleta – TG student
–Pedro de Botelho Marcos – Master student
–Wagner Kolberg – Master student

27 Grid and Cloud Computing Recent Team
–Diego Gomes – former Master student
–Rafael dal Zotto – former Master student

28 Grid and Cloud Computing
Grid: what is it?
–A grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed "autonomous" resources dynamically at runtime, depending on their availability, capability, performance, cost, and users' quality-of-service requirements
 gridfaq.html

29 Grid and Cloud Computing
Grid: what is it?
–Grids are persistent environments that enable software applications to integrate instruments, displays, and computational and information resources that are managed by diverse organizations in widespread locations
 GGF 2002

30 Grid and Cloud Computing
Some software for grids
–Globus Project
 the most (?) widely used grid software
 GSI – security
 MDS – information service
 GRAM – execution management
 data management
 intensive use of Web Services (but not in the latest version)

31 Grid and Cloud Computing
Some software for grids
–BOINC Project
 desktop grid, or Volunteer Computing
 master/slave architecture
 several applications (projects), e.g. Seti@home
 the client (anonymous) machine requests tasks from the server machine
 client machines are not reliable
 very complex client scheduler
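Because client machines are not reliable, BOINC copes by issuing the same task to several clients and accepting a result only once a quorum of matching answers arrives. A minimal sketch of that quorum-validation idea (the function name and `quorum` parameter are illustrative, not BOINC's actual API):

```python
from collections import Counter

def validate(results, quorum=2):
    # Redundancy sketch: `results` holds the answers returned by the
    # unreliable clients that received replicas of one task. A value is
    # accepted once `quorum` clients agree on it; otherwise the server
    # must issue more replicas before trusting any answer.
    if not results:
        return None
    value, hits = Counter(results).most_common(1)[0]
    return value if hits >= quorum else None
```

For example, `validate([42, 42, 7])` accepts 42 (two clients agree), while `validate([42, 7])` returns `None`, signalling that another replica is needed.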

32 Grid and Cloud Computing Projects:
–Profile-based scheduling algorithms taking into account the use of the resources; modeled on the XtremWeb architecture (Eder Fontoura);
–A proposal for fast diskless checkpointing with object prevalence for volunteer computing environments, focused on small devices (Rafael Dal Zotto);

33 Grid and Cloud Computing Projects:
–Evaluation and comparison between the XtremWeb scheduler (FCFS) and the model proposed by Fontoura, through simulations on SimGrid and experiments on Grid5000;
–Running SimGrid simulations in a distributed way on a real grid architecture;
–Modeling the BOINC scheduler on SimGrid
(Bruno, Julio, Wagner and Eduardo)

34 Grid and Cloud Computing Projects:
–Evaluation of the BOINC scheduler when classical throughput-oriented projects compete with new burst projects
 with game-theoretic modeling (Nash equilibrium)
 using the BOINC scheduler modeled on SimGrid
(Bruno Donassolo and Eduardo Rocha)

35 Grid and Cloud Computing Projects:
–In the context of the CERN-CMS experiment: development of a PhD thesis and of an MSc dissertation on the creation of a files-and-replicas system to use on the Grid
 currently: one PhD student and a former MSc student, both doing research at CERN, Geneva
 (Marko Petek and Diego Gomes)

36 Grid and Cloud Computing Implementations:
–XtremWeb deployment with the original XW architecture scheduler (FCFS) and the model proposed by Eder Fontoura on Grid5000 (Julio Anjos);
–XtremWeb simulation on SimGrid: desktop computing scenario, both schedulers

37 MapReduce on Desktop Grid and Cloud Environments

38 MapReduce Team
Current Team
–Alexandre Miyazaki – IC student
–Julio Anjos – Master student
–Otávio Kreling Zabaleta – TG student
–Pedro de Botelho Marcos – Master student
–Wagner Kolberg – Master student

39 The MapReduce Model
MapReduce is a programming model for large-scale parallel computing and for processing large data sets.
It abstracts away the complexity of distributing parallel programming applications.
The model is inspired by the map and reduce primitives present in many functional languages, such as Lisp and Haskell.
The Hadoop implementation of the MapReduce model is one of the most used in production systems (e.g. Yahoo, Facebook, Amazon).
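The model above can be sketched in a few lines: a map function emits key/value pairs, a shuffle step groups values by key, and a reduce function aggregates each group. A minimal sequential word-count sketch (the function names are illustrative; a real framework like Hadoop runs the same phases distributed over many nodes):

```python
from collections import defaultdict

def map_phase(split):
    # Map: emit (word, 1) for every word in one input split
    for word in split.split():
        yield (word, 1)

def shuffle(pairs):
    # Shuffle: group all intermediate values by key
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: aggregate the values of one key (here: count occurrences)
    return (key, sum(values))

def run_job(splits):
    pairs = [p for split in splits for p in map_phase(split)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

For instance, `run_job(["to be or", "not to be"])` yields `{"to": 2, "be": 2, "or": 1, "not": 1}`; in a real deployment each split would be a chunk of a distributed file system and each phase a set of parallel tasks.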

40 The MapReduce Model

41 MapReduce: State of the Art

42 Implementations on other platforms
GPGPU
o MARS
o umber=5557865&tag=1
o Three versions:
 GPU
 GPU/CPU
 GPU + Hadoop – uses streaming
o Problem:
 GPGPUs have no dynamic memory allocation
 additional steps are necessary to calculate the output size of the Map and Reduce phases

43 Implementations on other platforms
Multicore
o Phoenix
o 8097
o Number of map tasks equals the number of cores (likewise for reduce tasks)
o Main problem:
 input size limited to the machine's memory capacity

44 Implementations on other platforms
Desktop Grid
o BitDew
o Functioning similar to Hadoop's
o Main problem:
 environment volatility

45 Implementations on other platforms
Cloud computing
o Cloud MapReduce – umber=5948637
 decentralized
 deployed on the Amazon cloud OS
 only 3,000 lines of code (Hadoop has 300k)
 there is a task queue; each VM fetches a task and executes it
 results are written to queues to be consumed in later stages
 Problem: it uses proprietary, OS-specific services, which prevents its use elsewhere

46 Implementations on other platforms
Cloud computing
o Azure MapReduce – umber=5708501
 same idea as above, but using the MS cloud

47 Problems
Heterogeneity
o LATE
o Predicts which tasks will take longer and launches them speculatively, shortening the job execution time
o Problem: it can launch tasks unnecessarily

48 Problems
Volatility
o MOON
o 1489
o Stable machines are used to prevent data loss and ensure that a job finishes execution
o Problem:
 it uses task re-execution to tolerate failures
 the cost can be very high in a volatile environment

49 Problems
Fault tolerance
o Single points of failure
 Master
 master replication
 3.1651271
 HDFS NameNode
 fault-tolerant library
 2

50 Problems
Fault tolerance
o Byzantine failures
 result verification
 centralized: p?arnumber=6008723
 distributed: p?arnumber=6009055

51 Problems
Fault tolerance
o Persistence of intermediate pairs
 prevents re-execution in case of failures
 RAFT algorithms: p?arnumber=5767877
 low-cost algorithms to persist and retrieve intermediate pairs

52 Problems
o Allocation of new machines during execution
 interesting to be able to restore the CPU power in case of computer failure
 respect SLAs
 MapReduce RSA – Resource Scaling Algorithm: arnumber=5999808

53 Problems
The Hadoop framework makes some assumptions about the environment's behavior, since it is optimized to run on computer clusters:
–Nodes execute tasks at almost the same rate, so tasks finish in waves.
–A node with a lower execution rate than the cluster average is a straggler.
–There is no cost to launch speculative tasks.
–A non-responsive machine is in a failure state.
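The heterogeneity problem above is what LATE-style heuristics attack: estimate each task's remaining time from its progress rate and launch backup copies only for the slowest ones, capped to avoid the unnecessary launches mentioned earlier. A minimal sketch, assuming hypothetical task records with `progress` (0..1) and `elapsed` seconds fields:

```python
def estimate_time_left(progress, elapsed):
    # Progress rate = progress / elapsed time; the remaining work
    # divided by that rate estimates the time left for the task.
    rate = max(progress, 1e-9) / elapsed  # guard against zero progress
    return (1.0 - progress) / rate

def pick_speculative(tasks, cap):
    # Launch backup copies only for the tasks with the largest
    # estimated time left, up to a speculation cap, so fast tasks
    # never get useless duplicates.
    ranked = sorted(tasks,
                    key=lambda t: estimate_time_left(t["progress"], t["elapsed"]),
                    reverse=True)
    return ranked[:cap]

# Illustrative snapshot: t2 is the straggler (20% done after 10 s).
tasks = [
    {"id": "t1", "progress": 0.9, "elapsed": 10.0},
    {"id": "t2", "progress": 0.2, "elapsed": 10.0},
    {"id": "t3", "progress": 0.8, "elapsed": 10.0},
]
```

With a cap of 1, only `t2` gets a speculative copy, which is exactly the behavior a heterogeneity-aware scheduler wants on a desktop grid.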

54 MapReduce Our projects Julio and Wagner

55 Motivation There are two important properties in Desktop Grid environments: Heterogeneity and Volatility.

56 Adapting MapReduce to Desktop Grids 1st Approach - Data Distribution

57 Adapting MapReduce to Desktop Grids 2nd Approach - Speculative Task Execution

58 Adapting MapReduce to Desktop Grids 3rd Approach - Addressing Volatility

59 MapReduce Simulator for Heterogeneous Environments

60 Agenda
–Introduction
–Motivation
–MRPerf
–Simulator Architecture
–Characteristics
–Current Development State
–Final Considerations

61 Motivation
Implementing theoretical algorithms is much easier and faster, due to a simplified model.
It is possible to easily validate modifications to task scheduling algorithms, the speculative execution mechanism, and data distribution (HDFS is modeled as a simple data structure that represents data placement in the grid).

62 Motivation
Testing several configurations and scenarios is also faster: the number of nodes, computation power, and network topology are indicated in a description file.
There is no real computation or data transfer.
There is no need to develop different MapReduce applications to perform validation over different task sizes.

63 Motivation
Without the restrictions of a real environment, it is possible to test an algorithm's scalability with different environments, topologies, and MapReduce configurations:
 chunk size and number of replicas;
 number of reduce tasks;
 etc.

64 MRPerf
MRPerf is a MapReduce simulator developed on top of the ns-2 network simulator.
So, why didn't we use MRPerf?
MRPerf does not implement key features present in Hadoop that are really important in heterogeneous environments, such as:
 HDFS chunk replication,
 the speculative execution mechanism,
 and stragglers (faulty, slow nodes).

65 MRPerf
Without data replication, once a node has processed all its local chunks, the master node will schedule map tasks over non-local chunks, leading to increased data transfer rates.
Without a speculative mechanism, slower machines will slow down the whole MapReduce job.
The lack of these two features alone generates a great change in the framework's behavior.

66 Characteristics of the Proposed Simulator
Developed on top of SimGrid. SimGrid has a simplified model, making it faster.
The chosen SimGrid API was MSG, since a few Hadoop mechanisms did not seem adequate for SimDag, even though the map and reduce phases do follow a DAG model.
The environment is described through SimGrid's native platform and deployment files.

67 Current Development State
Implemented:
 HDFS data distribution;
 Map, Copy and Reduce phases;
 HDFS chunk replication;
 speculative execution mechanism.
To do:
 rack configuration;
 node volatility (desktop grids).

68 Current Development State

Characteristic               MRPerf   MR Version
Base system                  ns-2     SimGrid
Heterogeneous environments   No       Yes
Rack configuration           Yes      To do
HDFS chunk replication       No       Yes
Speculative execution        No       Yes
Node volatility              No       To do

69 Current Development State
Validation with 20 nodes and 20 reduce tasks:

70 Current Development State
Validation with 200 nodes and 1 reduce task:

71 Simulation of the New Data Distribution Algorithm
Simulating the new data distribution resulted in the expected behavior: fewer non-local tasks than the original algorithm.

72 Final Considerations
The new simulator should assist research on MapReduce, especially in scalability and behavioral analysis.
Handling heterogeneity and volatility will allow the study of MapReduce deployment on Desktop Grids.

73 Maresia Project
Pedro Botelho – PEP (master)
Motivation: Hadoop has single points of failure
–Master
–NameNode of the distributed file system

74 Maresia Project
Goals:
–Avoid single points of failure
–Decentralized approach – P2P
–Information previously contained in single points is now distributed/replicated among workers

75 Future Works
To continue this work, we intend to perfect and define all the modifications to the Hadoop framework, and also to implement these changes and fully test them.
Through the development of a simulator using SimGrid, we intend to validate the presented approaches.
With the adapted Hadoop framework, we expect to obtain a reliable and efficient implementation of the MapReduce model targeted at Desktop Grid environments.

76 Grid and Cloud Computing Group Pages
–HEP Project: s/Grade/HEP/WebHome
–BOINC, XW and MR activities: s/Grade/Overview

77 GPPD Activities on SLD “Sistemas Largamente Distribuídos”
