Going Large-Scale in P2P Experiments Using the JXTA Distributed Framework
Mathieu Jan & Sébastien Monnet
Projet PARIS
Paris, 13 February 2004
2 Outline
- How to test P2P systems at a large scale?
- The JDF tool
- Experimenting with various network configurations
- Experimenting with various volatility conditions
- Ongoing and future work

3 How to test P2P systems at a large scale?
- How to reproduce and test P2P systems?
  - Volatility
  - Heterogeneous architectures
  - Large scale
- Many papers on Gnutella, KaZaA, etc.
  - Behavior not yet fully understood
- Experiments on CFS, PAST, etc.
  - Mostly simulation
  - Real experiments up to a few tens of physical nodes
  - Large scale (thousands of nodes) via emulation
- The methodology for testing is not discussed
  - Deployment
  - How to control the volatility?
- A need for infrastructures

4 Solutions used for testing P2P prototypes
- Simulation
  - Pro: results are reproducible
  - Con: may require significant adaptations
  - Con: simplified model compared to reality
- Emulation
  - Pro: the network can be configured with various characteristics
  - Con: heterogeneity not fully captured
  - Con: results are not reproducible
  - Con: deployment and management burden
- Experiments on real testbeds
  - Pro: a needed step when validating software
  - Pro: real heterogeneity
  - Con: results are not reproducible
  - Con: deployment and management burden

5 Conducting JXTA-based experiments with JDF (1/2)
- JDF: a framework for automated testing of JXTA-based systems from a single node (the control node)
- Two modes
  - Run one distributed test
  - Run multiple tests, called batch mode (useful with crontab)
- We added support for PBS

6 Conducting JXTA-based experiments with JDF (2/2)
- Hypothesis
  - All the nodes must be "visible" from the control node
- Requirements
  - Java Virtual Machine
  - Bourne shell
  - SSH/RSH configured to run without a password on each node
- JDF: several shell scripts
  - Deploy the resources needed for one test or several tests (jar files and scripts used on each node)
  - Configure JXTA peers
  - Launch peers
  - Collect log and result files from each node
  - Analyze results on the control node
  - Clean up deployed and generated files for the test
  - Kill remaining processes
  - Update resources for a test

7 How to define a test using JDF?
- An XML description file of the JXTA-based network
  - Type of peers (rendezvous, edge peers)
  - How peers are interconnected, etc.
- A set of Java classes describing the behavior of each peer
  - Extend JDF's framework (start and stop JXTA, etc.); a sketch follows below
- A Java class for analyzing collected results
- A file listing the nodes and the path of the JVM on each node
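To make this concrete, here is a minimal sketch of what such a per-peer behavior class might look like. JdfTest and its helper methods are hypothetical stand-ins for JDF's actual base class and API, which are not shown on the slides:

```java
// Hypothetical sketch; JdfTest stands in for whatever base class
// the JDF framework actually provides.
abstract class JdfTest {
    abstract void run() throws Exception;
    void startJxta() { /* boot the JXTA platform */ }
    void stopJxta()  { /* shut the platform down */ }
    void storeResult(String key, String value) { /* write to the result property file */ }
}

public class MyPeerTest extends JdfTest {
    @Override
    void run() throws Exception {
        startJxta();                      // join the JXTA network
        // ... exercise the JXTA-based system under test here ...
        storeResult("latency.ms", "42");  // record a measurement for the analyze class
        stopJxta();
    }
}
```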

8 Describing a simple JuxMem network (1/2)
- Notion of profile
  - A set of peers having the same behavior
- Instance attribute of a profile
  - Specifies the total number of nodes hosting this type of peer
- Instance attribute of a peer
  - Specifies the total number of peers of this type on one node
- Simplest example: one cluster manager and one provider
[Figure: a "cluster A" group containing Cluster Manager A and Provider A]

9 Describing a simple JuxMem network (2/2)
[Figure: the XML description file for the network above; the listing did not survive extraction]
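As an illustration only, a description along the lines of slide 8 could plausibly look like the sketch below; the element and attribute names are invented for illustration and are not the actual JDF schema:

```xml
<!-- Illustrative sketch only: NOT the actual JDF schema. -->
<!-- One node hosting a cluster manager, one node hosting a provider. -->
<network name="clusterA">
  <profile name="manager" instance="1">    <!-- 1 node hosts this peer type -->
    <peer type="rendezvous" instance="1"/> <!-- 1 peer of this type per node -->
  </profile>
  <profile name="provider" instance="1">
    <peer type="edge" instance="1"/>
  </profile>
</network>
```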

10 A more complex JuxMem network (1/2)
[Figure: a juxmem group containing cluster A, cluster B, and cluster C groups]

11 A more complex JuxMem network (2/2)
[Figure: excerpt of the corresponding XML description file, elided on the slide]

12 Usage of JDF's scripts
runAll.sh [ ]
  -debug: show all script commands executed
  -unsecure: use rsh instead of ssh
  -cleanup: clean up the JDF directory on each host
  -bundle: create a bundle for distribution
  -install: install the distribution bundle
  -update: update files on each peer
  -config: configure the JXTA network
  -kill: kill existing JDF processes
  -run: run the test
  -nohup: run and return without waiting for peers to exit
  -analyze: analyze test results
  -log: keep test results and log4j logs from peers
batchAll.sh [ ]

13 Experimental results with JDF (1/2)
- Experimental setup
  - Distributed ASCI Supercomputer 2 (DAS-2), managed by PBS (The Netherlands)
  - 5 clusters, for a total of 200 dual 1-GHz Pentium-III nodes
  - Site mainly used: 72 nodes
  - SSH/SCP used
- Experiments with JDF on up to 64 nodes
  - Deployment of JXTA + JDF + JuxMem
  - Configuration of JuxMem peers
  - Update of JuxMem only

14 Experimental results with JDF (2/2)
[Figure: results chart, not preserved in the transcript]

15 Standard JDF vs Optimized JDF
[Figure: comparison chart, not preserved in the transcript]

16 Launching peers
- For each peer, a JVM is started
  - Several JXTA instances cannot share the same JVM
- How to deal with connections between edge and rendezvous peers?
  - Rendezvous peers must be started before edge peers
  - JDF uses the notion of delay: time to wait before launching peers (see the sketch below)
  - A mechanism for distributed synchronization is needed
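As a minimal sketch of this delay mechanism (the property name jdf.peer.delay is illustrative, not JDF's actual configuration key):

```java
// Minimal sketch of delay-based launch ordering: rendezvous peers get a
// delay of 0, edge peers a positive delay, so rendezvous peers are up
// before edge peers try to connect. The property name is illustrative.
public class DelayedLaunch {
    public static void main(String[] args) throws InterruptedException {
        long delayMs = Long.parseLong(System.getProperty("jdf.peer.delay", "0"));
        if (delayMs > 0) {
            Thread.sleep(delayMs);  // wait for rendezvous peers to come up
        }
        // ... start the JXTA platform and the peer's test class here ...
        System.out.println("peer started after " + delayMs + " ms delay");
    }
}
```

A fixed delay is fragile, which is why the slide notes the need for a real distributed synchronization mechanism.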

17 Getting the logs and the results
- Framework of JDF
  - Starts and stops JXTA (the net peer group as well as custom groups, as in JuxMem)
  - Stores the results in a property file
- Retrieve the log files generated on each node
  - Library used: log4j
  - Files starting with log.
- Retrieve the result files of each node
  - The specified analyze class is called
  - Displays the results

18 Experimenting with various volatility conditions
- Goals
  - Provide multiple failure conditions
  - Experiment with various failure detection techniques
  - Experiment with various replication strategies
  - Identify classes of applications and system states
  - Adapt fault tolerance mechanisms

19 Providing multiple failure conditions
- Go large scale
  - Control faults across thousands of nodes
- Precision
  - Possibility to kill a node at a given time/state
  - Some nodes may be "fail-safe"
- Easy to use
  - Changing the failure model should not affect the code being tested

20 Failure injection: going large scale
- Using statistical distributions
- Advantages
  - Ease of use: multiple failure dates can be generated automatically
  - Suitable for large scale
- Which statistical distributions?
  - Exponential (to model life expectancy); see the sampling sketch below
  - Uniform (to choose among numerous nodes)
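As a sketch of how such failure dates can be drawn, the snippet below uses inverse-transform sampling with plain java.util.Random to stay self-contained (the actual tools use the PSOL library, see slide 22); the numbers match the sample experiment of slide 24:

```java
import java.util.Random;

// Sketch: draw one failure date per node from an exponential law.
// With a per-node rate of 1/64 per minute, the minimum of 64 such
// lifetimes is exponential with rate 1 per minute, i.e. a system-wide
// MTBF of about 1 minute, as in the sample experiment on slide 24.
public class FailureDates {
    public static void main(String[] args) {
        int nodes = 64;
        double ratePerSecond = 1.0 / (64 * 60.0); // per-node rate: 1/64 per minute
        Random rng = new Random();
        for (int i = 0; i < nodes; i++) {
            // Inverse-transform sampling: -ln(U)/rate is Exp(rate)-distributed.
            double lifetimeSec = -Math.log(1.0 - rng.nextDouble()) / ratePerSecond;
            System.out.printf("node%02d fails at t0 + %.0f s%n", i, lifetimeSec);
        }
    }
}
```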

21 Failure injection: precision
- Why?
  - Play the role of the enemy
    - Kill a node that holds a lock
    - Kill multiple nodes during some data replication
  - Model reality
    - Some nodes may be almost "fail-safe"
    - A particular node may have a very high MTBF
- How?
  - Combine statistical flows with a more precise configuration file

22 Failure injection in JDF: design
- Add a unique configuration file
  - Generated by a set of tools using "The Probability/Statistics Object Library" (PSOL)
  - Deployed on each node by JDF
- Launch a new Java thread that
  - Reads the configuration file
  - Sleeps for a while
  - Kills its node at the given time (sketched below)
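A minimal sketch of such a killer thread is shown below; the property key failure.date.ms and the file layout are illustrative, not JDF's actual fi.properties format (per slide 24, JDF's real class lives in the fi package):

```java
import java.io.FileInputStream;
import java.util.Properties;

// Sketch of the killer thread: read the scheduled failure date from
// fi.properties, sleep until then, and crash the local JVM abruptly.
// The property key "failure.date.ms" is illustrative only.
public class Killer extends Thread {
    @Override
    public void run() {
        try {
            Properties props = new Properties();
            try (FileInputStream in = new FileInputStream("fi.properties")) {
                props.load(in);
            }
            long failAfterMs = Long.parseLong(props.getProperty("failure.date.ms", "-1"));
            if (failAfterMs < 0) {
                return;                    // "fail-safe" node: never killed
            }
            Thread.sleep(failAfterMs);     // sleep until the scheduled failure
            Runtime.getRuntime().halt(1);  // abrupt halt: simulates a crash failure
        } catch (Exception e) {
            // on any error, do not kill the node
        }
    }
}
```

The test class then only needs new Killer().start(), matching the usage shown on slide 24.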

23 Failure injection: execution flow
[Diagram] The main flow (the test class) spawns the killer thread with new Killer().start(). The Killer thread reads the configuration file (fi.properties) and, at the scheduled time, kills its node (suicide); results are written to the result file (fi.results).

24 Failure injection: sample experiment
- 64 peers running on 64 nodes
- Create fi.properties for an initial MTBF of 1 minute
  - Each node's lifetime follows an exponential law with a rate of 1/64
  - java -cp .:PSOL.jar CreateFiProperties
- With JDF it becomes easy to use
  - new fi.Killer().start(); // in the test class
  - runAll.sh -cleanup -with-nfs -install -config -run -analyze -log paraci_01-64 test.xml

25 Failure injection: sample experiment
[Figure: experiment results, not preserved in the transcript]

26 Failure injection: ongoing work
- Time deviation
  - Initial time (t0)
  - Clock drift
- Tools to precisely specify fi.properties
- Suicide interface (event handler)
  - More flexibility

27 Failure detection and replication strategies
- Running the same test multiple times
- Failure detection
  - Change the failure detection techniques
  - Tune Δ (the delay between heartbeats): which Δ for which MTBF? (illustrated below)
- Replication strategies
  - Adapt the replication degree to the "current" MTBF (level of risk)
  - Experiment with multiple replication strategies in various conditions (failures/detection)
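To illustrate the role of Δ, here is a minimal heartbeat-style detector sketch; note the planned work uses Marin Bertier's adaptive failure detectors (see slide 28), and this is not their implementation:

```java
// Minimal heartbeat-style failure detector, to illustrate the trade-off
// on Δ: a small Δ detects failures quickly but raises more false
// suspicions; a large Δ is safer but slower. Not Bertier's algorithm.
public class HeartbeatMonitor {
    private final long deltaMs;             // Δ: expected delay between heartbeats
    private volatile long lastHeartbeatMs;  // arrival time of the last heartbeat

    public HeartbeatMonitor(long deltaMs) {
        this.deltaMs = deltaMs;
        this.lastHeartbeatMs = System.currentTimeMillis();
    }

    public void onHeartbeat() {             // call on every received heartbeat
        lastHeartbeatMs = System.currentTimeMillis();
    }

    public boolean isSuspected() {
        // Suspect the peer once no heartbeat arrived within 2Δ (a simple
        // fixed timeout; adaptive detectors adjust this bound dynamically).
        return System.currentTimeMillis() - lastHeartbeatMs > 2 * deltaMs;
    }
}
```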

28 Fault tolerance in JuxMem: road map
- Finalize the failure injection tools
- Experiment with Marin Bertier's failure detectors with JXTA/JDF
- Integrate the failure detectors in JuxMem
- Experiment with various replication strategies
- Automatic adaptation

29 Ongoing work
- Improving JDF
  - There is a lot to do
  - Enable concurrent tests via PBS
  - Submit issues to Bugzilla
- Write more tests for JuxMem
  - Measure the cost of elementary operations in JuxMem
  - Various consistency protocols at large scale
- Benchmark other elementary steps of JDF
  - Launching peers
  - Collecting result and log files
- Use of emulation tools like Dummynet or NIST Net
  - Visit of Fabio Picconi at IRISA

30 Future work
- Hierarchical deployment
  - Ka-run/Taktuk-like (ID IMAG)
- Distributed synchronization mechanism
  - Support more complex tests
- Allow the use of JDF over Globus
- Support protocols other than SSH/RSH
  - Especially when updating resources