REMOTE++: A Tool for Automatic Remote


1 REMOTE++: A Tool for Automatic Remote Distribution of Programs on Windows Computers
Ashley Hopkins
Department of Computer Science and Engineering
University of South Florida, Tampa, Florida
AMH001 (thesis.ppt - 04/16/03)

2 Acknowledgements
I wish to thank Dr. Kenneth Christensen for his encouragement, his enthusiasm, and his support in writing this thesis. I also wish to thank my committee member Zornitza Genova Prodanoff for taking the time to read this thesis and provide valuable feedback.

3 Topics
- Introduction: remote distribution
- Description of remote distribution methods
- Design of REMOTE++
- Evaluation of REMOTE++
- Summary and future work

4 Introduction
Two key issues addressed by remote distribution:
- Simulation programs require significant time to execute
  - Many require multiple runs to complete an experiment
- Many computer resources are underutilized

5 Introduction continued
- Parallelization of programs reduces overall execution time
- Two types of parallelization
  - Space-based parallelization
    - Addresses programs that can be broken down easily
  - Time-based parallelization
    - Addresses programs that require multiple executions
    - Many simulations fit this category
- REMOTE++ implements time-based parallelization

6 Introduction continued
[Figure: Remote distribution network. A master machine is connected over a network to several remote machines.]

7 Introduction continued
Remote distribution of programs
- Enables execution of independent programs in parallel
- Harnesses the idle CPU cycles of remote machines
- Reduces the overall execution time of experiments

8 Introduction continued
Requirements of distribution tools
1) Distribution must be automatic (no manual interaction)
2) Tool must be simple for easy maintenance and modification
3) Output files must be available on the master PC
4) A single process must be distributed to each remote machine
5) Once a job completes, the next job must be sent
6) Each job must be executed only once
7) The failure of a job to complete must be detected
8) The failure of a remote host must be detected
9) Error messages must be displayed at the master PC
10) A log file should be kept

9 Remote Distribution Methods
Methods for remote distribution
- Remote shell (rsh) and remote execute (rexec) commands
- Cluster systems
  - Beowulf
- Grid computing
- Unix-based remote distribution tools
  - Condor
- Original REMOTE tool developed by Dr. Christensen
  - REMOTE++ is built upon this tool

10 Remote Distribution Methods continued
Drawbacks of current tools
- Primarily designed for Unix platforms
- Many are large or complex
- Many require extensive installation and maintenance

11 Remote Distribution Methods continued
Key challenge is…
Develop a Windows-based remote distribution tool that is easy to use, maintain, and modify.
- Must be able to reduce overall execution time
  - Overhead in distribution of processes must be overcome
- Must be able to execute many different programs
  - No modification to the programs
  - Various input and output methods allowed

12 Description of REMOTE++
- REMOTE++ is built upon REMOTE
  - Sockets interface replaced by rcp/rsh commands
  - Programs read/write to standard input/output
  - An invalid job is detected
  - An invalid host is detected
- REMOTE++ also has drawbacks
  - Each remote host is required to have an rsh/rcp daemon
  - Status feature of REMOTE is not available
  - Security concerns with remote shell commands

13 Description of REMOTE++ continued
Set-up of REMOTE++
1) Each client must have a remote shell/remote copy daemon.
2) REMOTE++ must be loaded on the master machine.
3) A joblist.txt file must contain a list of jobs to be executed.
4) A hostlist.txt file must contain a list of the hostnames of all remote machines.
5) A status.txt file must be created as a log file containing the success or failure of each job and each remote host.

14 Description of REMOTE++ continued
Sample joblist.txt file
    file mm1.exe input1.txt output1.txt
    std hello.exe input2.txt output2.txt
    file mm1.exe input3.txt output3.txt
Sample hostlist.txt file
    giga2.csee.usf.edu
    giga3.csee.usf.edu
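The two sample files can be read with a few lines of code. The following is a minimal Python sketch, not the tool's own parser: the `Job` record, the function names, and the rule that every job line has exactly four whitespace-separated fields are assumptions drawn only from the samples on this slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """One joblist.txt entry (hypothetical record type for illustration)."""
    mode: str         # "file" (classic) or "std" (new)
    executable: str
    input_file: str
    output_file: str


def parse_joblist(text):
    """Parse joblist.txt content, assuming 'mode exe input output' per line."""
    jobs = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4 or fields[0] not in ("file", "std"):
            raise ValueError(f"joblist line {line_no} is malformed: {line!r}")
        jobs.append(Job(*fields))
    return jobs


def parse_hostlist(text):
    """Parse hostlist.txt content, assuming one hostname per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


SAMPLE_JOBLIST = """\
file mm1.exe input1.txt output1.txt
std hello.exe input2.txt output2.txt
file mm1.exe input3.txt output3.txt
"""
SAMPLE_HOSTLIST = "giga2.csee.usf.edu\ngiga3.csee.usf.edu\n"

jobs = parse_joblist(SAMPLE_JOBLIST)
hosts = parse_hostlist(SAMPLE_HOSTLIST)
```

Rejecting malformed lines up front keeps later stages from having to guess which field is missing.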

15 Description of REMOTE++ continued
Sample status.txt file
    Mode is classic.
    Executable file mm1.exe found
    Input file input1.txt found
    Output file output1.txt found
    Mode is new.
    Executable file hello.exe found
    Input file input2.txt found
    Output file output2.txt found
    Input file input3.txt was not found
    Output file output3.txt found
    giga2.csee.usf.edu is a valid host
    giga3.csee.usf.edu is a valid host
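The per-job log lines in the sample suggest a simple existence check on each file named in a job. Below is a hedged Python sketch of such a check. The message wording is copied from the sample, but the function name `validate_job`, the injectable `exists` predicate, and the exact set of checks are assumptions; the slides do not show how REMOTE++ performs validation.

```python
import os

# Mode names as they appear in the sample log: "file" is classic, "std" is new.
MODE_NAMES = {"file": "classic", "std": "new"}


def validate_job(mode, executable, input_file, output_file, exists=os.path.exists):
    """Return (ok, log_lines) for one job; `exists` is swappable for testing."""
    lines = [f"Mode is {MODE_NAMES[mode]}."]
    ok = True
    for label, name in (("Executable", executable),
                        ("Input", input_file),
                        ("Output", output_file)):
        if exists(name):
            lines.append(f"{label} file {name} found")
        else:
            lines.append(f"{label} file {name} was not found")
            ok = False  # an invalid job is detected rather than dispatched
    return ok, lines


# Demo with a fake file system: input3.txt is missing, as in the sample log.
present = {"mm1.exe", "output3.txt"}
ok, lines = validate_job("file", "mm1.exe", "input3.txt", "output3.txt",
                         exists=lambda name: name in present)
```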

16 Description of REMOTE++ continued
Operation of REMOTE++
1) The existence of each job in joblist.txt is validated.
2) Threads are used to assign a job to each host in the host list.
3) The executable is remote copied (rcp) to the remote host.
   - An rcp failure makes the host unavailable, and the job is reassigned.
4) The job is executed using a remote shell (rsh) command.
5) When the job finishes, the host is assigned another job until all jobs in joblist.txt are complete.
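Steps 2 through 5 above can be sketched as one worker thread per host pulling from a shared job queue. This Python sketch is an illustration under stated assumptions, not the REMOTE++ implementation: `dispatch`, `run_command`, and the log wording are hypothetical, jobs are reduced to an executable name, and the `rcp file host:file` and `rsh host command` forms assume conventional rcp/rsh clients. The demo uses a fake runner so nothing is actually copied or executed.

```python
import queue
import subprocess
import threading


def run_command(argv):
    """Run a command and report success (a real run needs rcp/rsh installed)."""
    return subprocess.run(argv).returncode == 0


def dispatch(jobs, hosts, run=run_command, log=print):
    """Start one worker per host; each pulls jobs until every job is finished."""
    pending = queue.Queue()
    for job in jobs:
        pending.put(job)
    remaining = len(jobs)
    finished = []                      # (host, executable, succeeded)
    lock = threading.Lock()
    all_done = threading.Event()
    if remaining == 0:
        all_done.set()

    def worker(host):
        nonlocal remaining
        while not all_done.is_set():
            try:
                executable = pending.get(timeout=0.05)
            except queue.Empty:
                continue               # a failed host may still requeue a job
            # Step 3: copy the executable; on failure retire this host.
            if not run(["rcp", executable, f"{host}:{executable}"]):
                log(f"{host} is unavailable; reassigning {executable}")
                pending.put(executable)
                return
            # Step 4: execute the job remotely; it is never run a second time.
            succeeded = run(["rsh", host, executable])
            with lock:
                finished.append((host, executable, succeeded))
                remaining -= 1
                if remaining == 0:
                    all_done.set()

    threads = [threading.Thread(target=worker, args=(h,)) for h in hosts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return finished


def fake_run(argv):
    """Stand-in runner: every rcp to 'badhost' fails, everything else succeeds."""
    return not (argv[0] == "rcp" and argv[2].startswith("badhost:"))


results = dispatch(["a.exe", "b.exe", "c.exe"], ["badhost", "goodhost"],
                   run=fake_run, log=lambda message: None)
```

Workers keep polling until the completion event is set, so a job requeued by a failing host is still picked up by a healthy one. If every host fails, all workers exit and the unfinished jobs are simply absent from the result.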

17 Description of REMOTE++ continued
Sample Execution of REMOTE++

18 Description of REMOTE++ continued
Two input/output methods are supported by REMOTE++
1) File or “Classic” method
   - Used with programs that read from and write to files
   - Implemented in original REMOTE tool
   - Requires transfer of input and output files
2) Std or “New” method
   - Used with programs that use standard input/output
   - New in REMOTE++ tool
   - Input and output redirected from files
   - No transfer of files required
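The difference between the two methods shows up in the sequence of commands needed per job. The sketch below is one possible reading in Python, with several assumptions: the `Step` record and `build_steps` are hypothetical, the rcp/rsh argument forms assume conventional clients, and the slides do not say how a "file" program learns its file names, so the sketch assumes it opens them itself.

```python
from collections import namedtuple

# One command plus optional local files to attach to its stdin and stdout.
Step = namedtuple("Step", ["argv", "stdin", "stdout"])


def build_steps(mode, host, executable, input_file, output_file):
    """List the commands for one job under the 'file' or 'std' method."""
    copy_exe = Step(["rcp", executable, f"{host}:{executable}"], None, None)
    if mode == "file":
        # Classic: ship the input, run, then fetch the output file back.
        return [
            copy_exe,
            Step(["rcp", input_file, f"{host}:{input_file}"], None, None),
            Step(["rsh", host, executable], None, None),
            Step(["rcp", f"{host}:{output_file}", output_file], None, None),
        ]
    if mode == "std":
        # New: no data files are copied; rsh's stdin and stdout are
        # redirected from and to files that stay on the master.
        return [copy_exe, Step(["rsh", host, executable], input_file, output_file)]
    raise ValueError(f"unknown mode: {mode!r}")


file_steps = build_steps("file", "giga2.csee.usf.edu",
                         "mm1.exe", "input1.txt", "output1.txt")
std_steps = build_steps("std", "giga3.csee.usf.edu",
                        "hello.exe", "input2.txt", "output2.txt")
```

Under these assumptions the std method needs two commands per job instead of four, which matches the slide's point that no file transfer is required.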

19 Description of REMOTE++ continued
The remote shell/remote copy daemon:
1) Vendor version (tested with Denicomp’s rshd)
   - Dependable
   - Cost prohibitive
   - Not open source
2) Free version (by Silviu Marghescu)
   - Free
   - Open source
   - Does not support standard input/output method
   - Not as reliable

20 Evaluation of REMOTE++
- Queuing systems can be modeled using simulation
- Queue simulations must be executed numerous times with varying input to gather statistical information
- A queue simulation was used to evaluate the REMOTE++ tool

21 Evaluation of REMOTE++ continued
A queue is a sequence of customers waiting to receive service.
The following features determine the behavior of a queue:
- The distribution of time between arriving customers
- The distribution of time to service a customer
- The number of servers available to service the customers
- The capacity of the queue
- The population size of customers
The queuing discipline determines the order of service.

22 Evaluation of REMOTE++ continued
An M/M/1 queue has the following features:
- Markovian (exponentially distributed) inter-arrival times of customers
- Markovian (exponentially distributed) service times
- A single server
- An unlimited queue capacity
- An infinite customer population
An M/M/1 queue has a FIFO queuing discipline.
[Figure: Arrivals enter a queue, are served by a single server, and depart.]

23 Evaluation of REMOTE++ continued
- Evaluated REMOTE++ with an M/M/1 queue simulation
- Performance of an M/M/1 queue is measured by its utilization
- Utilization (ρ) is the fraction of the time the system is busy
- Utilization is the ratio of the arrival rate to the service rate
- The length (L) of the queue depends on the utilization
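For reference, the standard M/M/1 results behind these statements can be written out. The symbols λ (arrival rate) and μ (service rate) are not named on the slide, and the block assumes L denotes the mean number of customers in the system; if the thesis uses L for the number waiting in the queue only, the formula would differ.

```latex
% Standard M/M/1 results, valid for \rho < 1.
\[
  \rho = \frac{\lambda}{\mu},
  \qquad
  L = \frac{\rho}{1 - \rho}
\]
% Worked values showing how sharply L grows as \rho approaches 1:
\[
  \rho = 0.5 \Rightarrow L = 1,
  \qquad
  \rho = 0.9 \Rightarrow L = 9,
  \qquad
  \rho = 0.99 \Rightarrow L = 99,
  \qquad
  \rho = 0.995 \Rightarrow L = 199
\]
```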

24 Evaluation of REMOTE++ continued
Goal of evaluation…
- Determine the relationship between the utilization and the simulation run time needed for the mean queue length to fall within a given percentage of the theoretical length
At the same time…
- Evaluate the reduction in execution time when executing the simulation with REMOTE++ on five machines

25 Evaluation of REMOTE++ continued
M/M/1 queue simulation time was evaluated for...
- Utilization from 1% to 99.5%
- Length within 10% of the theoretical length
- Statistical mean of 10 executions at each interval
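To make the experiment concrete, here is a minimal event-driven M/M/1 simulation in Python. It is a sketch under assumptions, not the mm1.exe program used in the thesis: the function name, the fixed-departure stopping rule, and the use of the mean number in the system as "length" are all illustrative choices.

```python
import random


def simulate_mm1(arrival_rate, service_rate, num_departures, seed=1):
    """Return the time-averaged number of customers in an M/M/1 system."""
    rng = random.Random(seed)
    clock = 0.0
    in_system = 0
    area = 0.0                          # integral of in_system over time
    next_arrival = rng.expovariate(arrival_rate)
    next_departure = float("inf")       # no customer in service yet
    departed = 0
    while departed < num_departures:
        next_event = min(next_arrival, next_departure)
        area += in_system * (next_event - clock)
        clock = next_event
        if next_arrival <= next_departure:
            # Arrival: join the system; start service if the server was idle.
            in_system += 1
            next_arrival = clock + rng.expovariate(arrival_rate)
            if in_system == 1:
                next_departure = clock + rng.expovariate(service_rate)
        else:
            # Departure: the next customer in FIFO order, if any, starts service.
            in_system -= 1
            departed += 1
            if in_system > 0:
                next_departure = clock + rng.expovariate(service_rate)
            else:
                next_departure = float("inf")
    return area / clock


utilization = 0.5
measured = simulate_mm1(arrival_rate=utilization, service_rate=1.0,
                        num_departures=200_000)
theoretical = utilization / (1.0 - utilization)
print(f"measured L = {measured:.3f}, theoretical L = {theoretical:.3f}")
```

Each utilization level is an independent run like this one, which is what makes the experiment a natural fit for time-based parallelization across hosts.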

26 Evaluation of REMOTE++ continued
As the target utilization approaches 100%, the simulation time of the M/M/1 queue grows increasingly long.

27 Evaluation of REMOTE++ continued
Simulation time grows slightly faster than a sixth-order polynomial.

28 Evaluation of REMOTE++ continued
The M/M/1 queue execution...
- Projected a fivefold speedup on five machines
- Achieved about a two-and-a-half-fold speedup on five machines
  - Seven seconds of overhead per job
  - At low utilization, jobs executed in several seconds
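A simple back-of-the-envelope model shows why per-job overhead can halve the speedup. The model is an assumption made for illustration and is not taken from the thesis: N machines, equal-sized jobs with compute time t each, a fixed distribution overhead o added to every remote job, and a sequential baseline that pays no overhead.

```latex
% Illustrative speedup model (assumed, not from the slides).
\[
  S = \frac{N\,t}{t + o}
\]
% With N = 5 machines and o = 7 s, a speedup of S = 2.5 corresponds to
\[
  2.5 = \frac{5\,t}{t + 7\,\mathrm{s}}
  \;\Longrightarrow\;
  2.5\,t + 17.5\,\mathrm{s} = 5\,t
  \;\Longrightarrow\;
  t = 7\,\mathrm{s}
\]
```

Under this model the observed speedup is what one would expect when a typical job runs for about as long as its overhead, which is consistent with the slide's remark that low-utilization jobs finished in several seconds.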

29 Summary and future work
- Remote distribution can be used to reduce execution time
  - Existing systems are Unix-based and complex
  - A simple Windows-based tool is needed
- REMOTE++ improves upon REMOTE
  - Complex sockets interface replaced by simple rsh/rcp script
  - Enables a wider variety of programs to be executed
  - Able to recover from invalid jobs and hosts

30 Summary and future work
- Improve the free remote shell daemon
  - Support std or “new” input/output method
- Reduce overhead in distribution to further reduce execution time
- Support more and mixed input/output methods
- Implement security in REMOTE++
  - Currently relies on rsh daemon for security
- Implement a status feature similar to the original REMOTE tool

31 Thank You
Questions?
Ashley Hopkins
Department of Computer Science and Engineering
University of South Florida, Tampa, Florida
REMOTE++ soon available at:

