CS591x Cluster Computing and Programming Parallel Computers Grid Computing Global Grid Exchange and Parabon’s Frontier.

1 CS591x Cluster Computing and Programming Parallel Computers Grid Computing Global Grid Exchange and Parabon’s Frontier

2 GGE - Frontier: Global Grid Exchange is based on Frontier. Frontier is a massively parallel computing platform from Parabon, Inc. It provides a framework to define, launch, execute, and manage parallel applications across the Internet.

3 GGE - Frontier: Frontier derives its computational processing power from “relatively low-power, unreliable, high latency nodes” … but potentially on a massive scale. These nodes are largely desktop computers. Frontier uses idle time on these nodes.

4 GGE - Frontier: Processing on a specific node occurs only if the processor is in an unused state. “Unreliable” means that when a node returns to an active state (i.e., the user moves a mouse) the task is “killed”. Frontier has facilities to help mitigate the unexpected loss of nodes.

5 GGE - Frontier: Program in Java, or in any language that produces bytecode; the code must run under a JVM. The client application must make Java calls to the Client API. The work task must be coded in Java.

6 GGE - Frontier: Frontier platform. Client application: the front-end application that defines the overall application and communicates with the Frontier API; this runs on the GGE/Frontier user’s computer. Frontier compute engine: compute nodes running an application that provides spare (idle) machine cycles to the Frontier job. Frontier Server: handles the assignment and allocation of resources, and schedules and manages jobs.

7 GGE - Frontier: [Diagram: Frontier Server, Client Application, Compute Engine …]

8 GGE - Frontier: Main components of a Frontier application. Jobs: relatively isolated applications. Tasks: a set of one or more work applications; arbitrary; independent; a task runs on a single compute node.

9 GGE - Frontier: Jobs = a set of elements and a set of tasks. Elements are either data elements (blocks of data represented as a parameter list, sent to the compute engine for execution of a task) or executable elements (Java bytecode that defines the executable task to be performed by the compute engine).

10 GGE - Frontier: Tasks are defined by a task specification. A task is made up of the executable elements that it requires, a list of parameters, and an entry point class name. The executable element defines the computational work of the compute node; it must be bytecode in a jar file and is defined by the .addJarFileExecutableElement and .addRequiredExecutableElement methods.

11 GGE - Frontier: About tasks. Tasks cannot communicate with other running tasks. All communication with a task comes from the server at the time that the task is instantiated. All communication from a task takes place via status reports; some may be passed back to the client application. A task can see and use global and job-level elements.

12 GGE - Frontier: Parameter lists are defined as a set of name-value pairs. Name: some text string. Value: an assigned value of a primitive type. Parameter lists are grouped together into parameter maps. Parameter maps are attached to the task specification.
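The name-value structure described on this slide can be sketched in plain Java. This is only an illustration: `ParamMapSketch` and its methods are hypothetical names, not Frontier's `NamedParameterMap` API, and limiting values to a few primitive wrapper types is an assumption drawn from the slide's wording.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a parameter map; NOT the Frontier API.
class ParamMapSketch {
    private final Map<String, Object> values = new LinkedHashMap<>();

    // Accept only boxed primitives, mirroring the slide's "primitive type" rule.
    void put(String name, Object value) {
        boolean primitiveLike = value instanceof Integer || value instanceof Long
                || value instanceof Double || value instanceof Boolean;
        if (!primitiveLike) {
            throw new IllegalArgumentException("unsupported value type for " + name);
        }
        values.put(name, value);
    }

    int getInt(String name) {
        return (Integer) values.get(name);
    }

    double getDouble(String name) {
        return (Double) values.get(name);
    }

    public static void main(String[] args) {
        ParamMapSketch params = new ParamMapSketch();
        params.put("input", 7);
        params.put("scale", 0.5);
        System.out.println(params.getInt("input") * params.getDouble("scale"));
    }
}
```

Rejecting non-primitive values at `put` time keeps a map cheap to serialize and ship to a remote node, which is presumably why the platform restricts parameter types.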

13 GGE - Frontier: Task status. All information is reported back from the compute engine as status reports. Some status reports are passed back from the server to the client application. Recent status reports replace previous status reports.

14 GGE - Frontier: Types of status reports. Run mode: unstarted, running, complete, … Results: returned as name-value pairs in parameter maps. Exceptions: codes and text descriptions of exceptions. Progress: a single scalar metric of task progress.
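The "recent status reports replace previous status reports" behavior can be modeled with a map keyed by task ID. The sketch below is an assumption-laden illustration: `StatusBoardSketch`, `RunMode`, and `Status` are hypothetical names invented here, and only the three run modes named on the slide are included.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of status reporting; NOT the Frontier API.
class StatusBoardSketch {
    // Run modes named on the slide; the slide's list continues with "…".
    enum RunMode { UNSTARTED, RUNNING, COMPLETE }

    static final class Status {
        final RunMode mode;
        final double progress; // single scalar, assumed to lie in [0.0, 1.0]

        Status(RunMode mode, double progress) {
            this.mode = mode;
            this.progress = progress;
        }
    }

    private final Map<Integer, Status> latest = new HashMap<>();

    // Storing by task ID overwrites any earlier report for the same task.
    void report(int taskId, RunMode mode, double progress) {
        latest.put(taskId, new Status(mode, progress));
    }

    // Returns the most recent report, or null if the task never reported.
    Status current(int taskId) {
        return latest.get(taskId);
    }

    public static void main(String[] args) {
        StatusBoardSketch board = new StatusBoardSketch();
        board.report(1, RunMode.RUNNING, 0.25);
        board.report(1, RunMode.COMPLETE, 1.0);
        System.out.println(board.current(1).mode);
    }
}
```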

15 Checkpointing: Compute nodes are “unreliable” and can frequently leave the compute node “pool” (the owner/user moves a mouse, etc.). The in-progress job is stopped and exited, with the potential to lose a lot of work.

16 Checkpointing: Checkpointing allows the Frontier server to restart a task at or near the place where it exited, that is, where it was last checkpointed. Checkpointing involves recording the important state variables in the task and sending them to the server. The server replaces the task’s task spec with the checkpoint data. The task is restarted with the checkpoint data as its task spec rather than the original task spec.
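The restart-from-checkpoint idea can be sketched without any Frontier classes. In the hypothetical `CheckpointSketch` below, a plain `Map<String, Long>` stands in for the task spec, `maxSteps` simulates the node being reclaimed, and `interval` is an assumed checkpoint frequency; none of these names come from the Frontier API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simulation of checkpoint and restart; NOT the Frontier API.
class CheckpointSketch {
    // Sums 1..n. `spec` is the original parameters on a first run, or
    // checkpoint data on a restart. Returns the last checkpoint if the
    // task is "killed" after maxSteps, else the final state with "done".
    static Map<String, Long> run(Map<String, Long> spec, long maxSteps, long interval) {
        long n = spec.get("n");
        long next = spec.getOrDefault("next", 1L);
        long sum = spec.getOrDefault("sum", 0L);
        Map<String, Long> checkpoint = new HashMap<>(spec);
        long steps = 0;
        while (next <= n) {
            if (steps == maxSteps) {
                return checkpoint; // work since the last checkpoint is lost
            }
            sum += next;
            next++;
            steps++;
            if (steps % interval == 0) {
                checkpoint = state(n, next, sum); // "send state to the server"
            }
        }
        Map<String, Long> result = state(n, next, sum);
        result.put("done", 1L);
        return result;
    }

    private static Map<String, Long> state(long n, long next, long sum) {
        Map<String, Long> m = new HashMap<>();
        m.put("n", n);
        m.put("next", next);
        m.put("sum", sum);
        return m;
    }

    public static void main(String[] args) {
        Map<String, Long> spec = new HashMap<>();
        spec.put("n", 100L);
        Map<String, Long> saved = run(spec, 37, 10);            // killed mid-run
        Map<String, Long> done = run(saved, Long.MAX_VALUE, 10); // restarted
        System.out.println(saved + " -> " + done);
    }
}
```

The restarted run repeats only the steps taken after the last checkpoint, so the checkpoint interval trades reporting overhead against the amount of work that can be lost.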

17 Design Issues: Java for task code: the code must run under the compute node’s JVM, and the JVM has a tight security model. “Work” is done in tasks: except for Frontier job management, all work is done in tasks. Tasks are pushed out to compute nodes (maybe local).

18 Design Issues: Best for jobs that have a high compute-to-I/O ratio (compute bound). Frontier has limited data I/O capabilities and no inter-task communication capabilities. Launch and listen: most serial programs have program-controlled I/O; in Frontier you push out the task and listen for results. You don’t know when those results will come back.
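The launch-and-listen pattern can be illustrated with the standard `java.util.concurrent` classes in place of a grid. This is a local sketch only: `LaunchAndListenSketch` and its `ResultListener` interface are hypothetical, and a thread pool stands in for remote compute nodes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical local analogue of launch-and-listen; NOT the Frontier API.
class LaunchAndListenSketch {
    interface ResultListener {
        void resultPosted(int taskId, int square);
    }

    // Pushes out one squaring task per input, then waits for all callbacks.
    static int[] launchAll(int[] inputs, ResultListener listener) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CountDownLatch done = new CountDownLatch(inputs.length);
        int[] results = new int[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            final int taskId = i;
            final int input = inputs[i];
            pool.execute(() -> {
                try {
                    int square = input * input;
                    results[taskId] = square;
                    listener.resultPosted(taskId, square); // arrival order is not fixed
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            done.await(); // the "listen" half: block until every task has reported
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while listening", e);
        } finally {
            pool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) {
        launchAll(new int[] {2, 3, 4},
                (id, sq) -> System.out.println("Task " + id + " reports " + sq));
    }
}
```

The listener may be called in any order and from any worker thread, which mirrors the slide's point that the client cannot know when results will come back.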

19 Design Issues: Unreliable task execution. Tasks can be stopped, re-executed, or reassigned. You can’t reasonably predict when tasks will complete. You can redundantly allocate tasks to make a job more resilient.
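Redundant allocation can be mimicked locally with `ExecutorService.invokeAny`, which returns the result of whichever submitted copy completes successfully. The sketch is hypothetical throughout: `RedundantTaskSketch` and the `failing` set (indices of simulated nodes that get reclaimed) are inventions for illustration, and nothing here reflects how Frontier schedules redundant copies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical local simulation of redundant task allocation.
class RedundantTaskSketch {
    // Runs `copies` replicas of the same computation; replicas whose index
    // is in `failing` simulate a node that was reclaimed before finishing.
    static int squareRedundantly(int input, int copies, Set<Integer> failing) {
        ExecutorService pool = Executors.newFixedThreadPool(copies);
        List<Callable<Integer>> replicas = new ArrayList<>();
        for (int i = 0; i < copies; i++) {
            final int node = i;
            replicas.add(() -> {
                if (failing.contains(node)) {
                    throw new IllegalStateException("node " + node + " reclaimed");
                }
                return input * input;
            });
        }
        try {
            return pool.invokeAny(replicas); // first successful replica wins
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("all replicas failed", e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Set<Integer> failing = new java.util.HashSet<>(java.util.Arrays.asList(0, 1));
        System.out.println(squareRedundantly(6, 3, failing));
    }
}
```

The job survives as long as at least one replica finishes, at the cost of spending compute time on copies whose results are discarded.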

20 GGE/Frontier Programs: In the most general sense you must have two programs. A client application sets up, starts, and potentially manages the job; it is written in Java or makes Java calls to the client API. A task contains the scientific or engineering work that you are trying to accomplish; it must be in Java and run under a JVM.

21 Frontier tasks: We’ll start by looking at tasks. Code samples shown here are from the Global Grid Exchange website, www.globalgridexchange/developers. See (in your installation) local/src/LocalTask.java and local/src/LocalApp.java.

22 Frontier Tasks

    public class LocalTask implements Task {
        private TaskContext context; // task context
        private boolean runtimeWantsMeToStop = false;

        public LocalTask(TaskContext context) {
            this.context = context;
            runtimeWantsMeToStop = false;
        }

        // Task interface method to start the task
        public void run() throws TaskStoppedException {
            try {
                DoLocalTask();
            } catch(TaskStoppedException e) {
                throw(e);
            }
        }

        // Task interface method to stop the task
        public void stop() {
            runtimeWantsMeToStop = true;
        }
        …..
    }

23 Frontier Tasks

    private void DoLocalTask() throws TaskStoppedException {
        double progress = 0.0;
        NamedParameterMap map = new NamedParameterMap();

        if(runtimeWantsMeToStop) {
            throw new TaskStoppedException();
        }

        square = input * input;

        // Return the final status (with the results)
        map.put("square", square);
        context.postResults(1.0, map); // 1.0 denotes 100% completion
    }

24 Frontier Tasks

    // task parameters
    private int input;
    private int square;

    public void setInput(int val) { input = val; }
    public void setSquare(int val) { square = val; }

Must have in your task code…
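The sample task checks its stop flag once, which is enough for a single multiplication. For a longer computation, a common approach is to poll the flag inside the loop. The sketch below is a generic, hypothetical illustration (`StoppableLoopSketch` and its unchecked `StoppedException` are invented here and do not implement Frontier's `Task` interface). It declares the flag `volatile` because `stop()` is assumed to be called from a different thread than the one doing the work.

```java
// Hypothetical cooperative-stop pattern; NOT the Frontier Task interface.
class StoppableLoopSketch {
    // Unchecked stand-in for a "task stopped" exception.
    static class StoppedException extends RuntimeException {
        StoppedException(String message) {
            super(message);
        }
    }

    // volatile so a write from another thread is visible to the worker.
    private volatile boolean stopRequested = false;

    // Called by the runtime (assumed: from another thread) to request a stop.
    void stop() {
        stopRequested = true;
    }

    // Sums the squares of 1..n, checking the stop flag on every iteration.
    long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            if (stopRequested) {
                throw new StoppedException("stopped at i=" + i);
            }
            sum += (long) i * i;
        }
        return sum;
    }

    public static void main(String[] args) {
        StoppableLoopSketch task = new StoppableLoopSketch();
        System.out.println(task.sumOfSquares(1000));
    }
}
```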

25 Frontier Tasks: Compile the task file to a Java class file. Put the class file in a jar file.

26 Frontier Client Application: The client application defines jobs. Jobs define tasks. The client works with the Frontier server (for remote tasks) or locally to launch tasks.

27 Client Application: Basic steps. Create a session manager: manager = new LocalSessionManager(); Define job attributes (up to you). Assign job attributes to a parameter map. Attach the parameter map to the job attribute map.

28 Client applications: Define executable elements (the class file that you will execute as a task). Set that element as the executable element. Define task attributes. Assign task attributes to a map. Define parameters for the task. Attach the task attribute map to the task spec. Define a task proxy.

29 Client Application: Create a listener. Events (results) do not return automatically. The listener listens for events in the GGE system. Events include results, exceptions, and progress reports.

30 Listener: Listener class

    class LocalAppJobListener implements TaskProgressListener,
            TaskResultListener, TaskExceptionListener {
        SessionManager manager;
        Job job;

        // Constructor
        public LocalAppJobListener(SessionManager manager_, Job job_) {
            manager = manager_;
            job = job_;
        }

31 Progress Listener

        public void progressReported(TaskProgressEvent event) {
            // Get the task's ID from the task attributes
            int taskId = event.getAttributes().getIntValue("TaskID");
            System.out.println("Task " + taskId + " is "
                    + event.getProgress()*100 + "% complete");
        }

32 Results Listener

        public void resultsPosted(TaskResultEvent event) {
            NamedParameterMap resultsMap = event.getResults();
            if(resultsMap == null) {
                return;
            }
            if(event.isComplete()) {
                // Get the Task proxy from the event
                TaskProxy proxy = event.getTaskProxy();
                // Get the task's ID from the task attributes
                int taskId = proxy.getAttributes().getIntValue("TaskID");
                // Get the result from the event
                int square = resultsMap.getIntValue("square");
                System.out.println("Task " + taskId
                        + " reports the square is: " + square);

33 Listener: shut things down

                // Remove the task from the job
                proxy.remove();
                System.out.println("Tasks have been completed. Removing job");
                // Remove the job
                job.remove();
                // Destroy the manager
                manager.destroy();
                // Exit all running threads
                System.exit(0);
            }
        }
