
1 Pattern Programming Seeds Framework Workpool Assignment 1
ITCS 4/5145 Parallel Programming UNC-Charlotte, B. Wilkinson, Jan 15, PatternProg-2

2 Seeds Workpool DiffuseData, Compute, and GatherData Methods
Master: DiffuseData creates DataMap d and returns d to each slave. GatherData receives the Data argument data from each slave and updates the private variable total (the answer).
Slaves: Compute receives the Data argument data, casts it to DataMap input, and creates DataMap output.
d is created in DiffuseData; output is created in Compute.
Note: the DiffuseData, Compute, and GatherData method names start with a capital letter, although Java method names should not!

3 Objects sent between master and slaves identified by a key, based upon using a Java HashMap
(Diagram: a HashMap holding an object under a key.)
For implementation convenience there are two classes:
Data class: used to pass data between master and slaves. (A “segment” number keeps track of packets as they go from one method to another.)
DataMap class: used inside the DiffuseData, Compute, and GatherData methods.
DataMap is a subclass of Data and so allows casting.
(Class diagram labels: "Extends Data", "Extends DataMap".)

4 DataMap methods
(Used inside DiffuseData, Compute, and GatherData methods)
put (key, data) – puts data into DataMap identified by key
get (key) – gets stored data identified by key
key is usually a String, a programmer-chosen name for the data.
Often, data holds a wrapper object for a primitive type, such as Integer or Long.
key and data are actually of the Object class.
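The put/get behaviour described on this slide can be tried outside the framework. The following is a minimal sketch that uses java.util.HashMap as a stand-in for DataMap (the slides say DataMap is based on a Java HashMap); the class name DataMapSketch and the helper names pack and unpack are illustrative assumptions, not part of Seeds.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in sketch: java.util.HashMap plays the role of the Seeds DataMap,
// which is not available outside the framework.
class DataMapSketch {
    // Stores a value under a programmer-chosen key, as DiffuseData would.
    static Map<String, Object> pack(long seed) {
        Map<String, Object> d = new HashMap<>();
        d.put("seed", seed);          // the long is autoboxed to a Long
        return d;
    }

    // Retrieves the value by key. Values are stored as Object,
    // so a cast back to the wrapper type is needed.
    static long unpack(Map<String, Object> d) {
        return (Long) d.get("seed");
    }

    public static void main(String[] args) {
        Map<String, Object> d = pack(12345L);
        System.out.println("seed = " + unpack(d));
    }
}
```

Because the value type is Object, a wrong cast (for example to Integer) compiles but fails at run time, which is why the same key and type must be used on both sides.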

5 Data cast into a DataMap
Note: segment is used by the framework to keep track of where to put results.

public Data DiffuseData (int segment) {            // segment supplied by framework
    DataMap<String, Object> d = new DataMap<String, Object>();
    d.put("name_of_inputdata", inputData);
    return d;
}

public Data Compute (Data data) {
    // Data cast into a DataMap; data produced by DiffuseData()
    DataMap<String, Object> input = (DataMap<String, Object>) data;
    DataMap<String, Object> output = new DataMap<String, Object>();  // output returned to GatherData()
    inputData = input.get("name_of_inputdata");
    …  // computation
    output.put("name_of_results", results);        // to return to GatherData()
    return output;
}

public void GatherData (int segment, Data dat) {   // segment supplied by framework
    DataMap<String, Object> out = (DataMap<String, Object>) dat;     // Data cast into a DataMap
    outdata = out.get("name_of_results");
    result …  // aggregate outdata from all the worker nodes; result is a private variable
}

6 Question
Will a class field modified in the DiffuseData or GatherData methods be updated with the same values in the Compute method?
Answer: No. DiffuseData and GatherData run on the master, while Compute runs on the slaves, so the methods are running on different JVMs (and different nodes).

7 Other methods called by framework
public void initializeModule(String[] args) {
    …  // initialize private variables
    datacount = … ;
}

public int getDataCount() {
    // Set to number of data items to be processed.
    return datacount;
}

8 User methods used in Bootstrap class
Apart from the methods that start and stop the framework pattern, the programmer can specify additional methods in the Workpool class and invoke them in the Bootstrap class. Typically, a method is invoked that produces the final result.
Example:
public double getPi() {  // returns value of pi based on all workers
    double pi = (total / (random_samples * DoubleDataSize)) * 4;
    return pi;
}

9 Workpool Pattern
Embarrassingly Parallel Computation: Monte Carlo π

10 Monte Carlo Methods
A so-called “embarrassingly parallel” computation, as it decomposes into obviously independent tasks that can be done in parallel without any inter-task communication during the computation. Monte Carlo methods use random selections. To parallelize Monte Carlo code, one must address the best way to generate random numbers in parallel.

11 Calculate π using the Monte Carlo method
Circle formed within a 2 x 2 square. Ratio of area of circle to square given by:
$$\frac{\text{Area of circle}}{\text{Area of square}} = \frac{\pi (1)^2}{2 \times 2} = \frac{\pi}{4}$$
Points within square chosen randomly. Score kept of how many points happen to lie within circle. Fraction of points within circle will be $\pi/4$, given sufficient number of randomly selected samples.

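12 One quadrant can be described by integral:
$$\int_0^1 \sqrt{1 - x^2}\, dx = \frac{\pi}{4}$$
Random pairs of numbers, $(x_r, y_r)$, generated, each between 0 and 1. Counted as in circle if $y_r \le \sqrt{1 - x_r^2}$, that is, $x_r^2 + y_r^2 \le 1$.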
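The counting rule for one quadrant can be written as a short sequential program. This is a minimal sketch, not the Seeds version: the class name HitOrMissPi, the method estimatePi, and the fixed seed are assumptions chosen for illustration.

```java
import java.util.Random;

// Sequential sketch of the one-quadrant method: count random points
// (x, y) in the unit square that satisfy x*x + y*y <= 1.
class HitOrMissPi {
    static double estimatePi(long samples, long seed) {
        Random r = new Random(seed);       // fixed seed makes the run repeatable
        long inside = 0;
        for (long i = 0; i < samples; i++) {
            double x = r.nextDouble();     // uniform in [0, 1)
            double y = r.nextDouble();
            if (x * x + y * y <= 1.0) {
                inside++;
            }
        }
        // The fraction inside approximates pi/4, so multiply by 4.
        return 4.0 * inside / samples;
    }

    public static void main(String[] args) {
        System.out.println(estimatePi(1_000_000L, 42L));
    }
}
```

The error shrinks only as the inverse square root of the number of samples, which is why the method benefits from spreading many samples across workers.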

13 Alternative (better) Monte Carlo Method
(Not used here)
Generate random values of x to compute f(x). Sum values of f(x):
$$\text{Area} = \int_{x_1}^{x_2} f(x)\, dx = \lim_{N \to \infty} \frac{1}{N} \sum_{r=1}^{N} f(x_r)\,(x_2 - x_1)$$
where $x_r$ are randomly generated values of x between $x_1$ and $x_2$.
Monte Carlo method very useful if the function cannot be integrated numerically (maybe having a large number of variables).
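As an illustration of this alternative, the sketch below averages f(x) at random points and scales by the interval width. The class name MeanValueMonteCarlo, the method integrate, and the choice of f(x) = sqrt(1 - x*x) on [0, 1] (whose integral is pi/4) are assumptions made for the example.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

// Sketch of the alternative method: average f at random x values
// between x1 and x2, then multiply by the interval width (x2 - x1).
class MeanValueMonteCarlo {
    static double integrate(DoubleUnaryOperator f, double x1, double x2,
                            int n, long seed) {
        Random r = new Random(seed);
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            double xr = x1 + (x2 - x1) * r.nextDouble();  // uniform in [x1, x2)
            sum += f.applyAsDouble(xr);
        }
        return (sum / n) * (x2 - x1);
    }

    public static void main(String[] args) {
        // One quadrant of the unit circle has area pi/4.
        double quadrant = integrate(x -> Math.sqrt(1.0 - x * x), 0.0, 1.0,
                                    1_000_000, 42L);
        System.out.println(4.0 * quadrant);
    }
}
```

Each sample contributes a function value rather than a yes/no hit, so only one random number is needed per sample.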

14 Workpool implementation
Master (source/sink): DiffuseData sends a starting seed for the random sequence to each slave (key "seed"); GatherData aggregates the answers.
Slaves (compute nodes): Compute returns the number of 1000 random points that lie inside the arc of the circle (key "inside").

15 Seeds Monte Carlo code MonteCarloPiModule.java
DiffuseData Method (Required to be implemented)
public Data DiffuseData (int segment) {
    DataMap<String, Object> d = new DataMap<String, Object>();
    d.put("seed", R.nextLong());
    return d;  // returns a random seed for each job unit
}

16 Compute Method (Required to be implemented)
public Data Compute (Data data) {
    DataMap<String, Object> input = (DataMap<String, Object>) data;
    DataMap<String, Object> output = new DataMap<String, Object>();
    Long seed = (Long) input.get("seed");  // get random seed
    Random r = new Random();
    r.setSeed(seed);
    Long inside = 0L;
    for (int i = 0; i < DoubleDataSize; i++) {
        double x = r.nextDouble();
        double y = r.nextDouble();
        double dist = x * x + y * y;
        if (dist <= 1.0) {
            ++inside;
        }
    }
    output.put("inside", inside);  // to return to GatherData()
    return output;
}

17 GatherData Method (Required to be implemented)
public void GatherData (int segment, Data dat) {
    DataMap<String, Object> out = (DataMap<String, Object>) dat;
    Long inside = (Long) out.get("inside");
    total += inside;  // aggregate answer from all the worker nodes
}

18 getDataCount Method (Required to be implemented)
public int getDataCount() {
    return random_samples;
}
Sets the number of data “envelopes” sent from the master by DiffuseData to the slaves, in this case the number of “seeds”. (The number of physical slave processors might be different and is determined by the compute resources.)
Initialized in:
initializeModule(…) {
    random_samples = 3000;
}

19 Method to compute π result (used in bootstrap module)
public double getPi() {  // returns value of pi based on all workers
    double pi = (total / (random_samples * DoubleDataSize)) * 4;
    return pi;
}
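To see how the four methods cooperate without installing Seeds, the flow can be imitated in a single JVM. This is only a sketch under stated assumptions: a plain loop stands in for the framework, java.util.HashMap stands in for DataMap, method names follow the Java lower-case convention, and the class name LocalMonteCarloWorkpool and its run() driver are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Single-JVM imitation of the workpool flow. In Seeds, compute() would run
// on slave nodes; here everything runs in one loop for illustration only.
class LocalMonteCarloWorkpool {
    static final int DOUBLE_DATA_SIZE = 1000;  // random points per work unit
    final int randomSamples;                   // number of work units (seeds)
    final Random seedSource;
    double total = 0;

    LocalMonteCarloWorkpool(int randomSamples, long masterSeed) {
        this.randomSamples = randomSamples;
        this.seedSource = new Random(masterSeed);
    }

    // Master side: one random seed per work unit.
    Map<String, Object> diffuseData(int segment) {
        Map<String, Object> d = new HashMap<>();
        d.put("seed", seedSource.nextLong());
        return d;
    }

    // Slave side: count points that fall inside the quarter circle.
    Map<String, Object> compute(Map<String, Object> input) {
        Random r = new Random((Long) input.get("seed"));
        long inside = 0;
        for (int i = 0; i < DOUBLE_DATA_SIZE; i++) {
            double x = r.nextDouble();
            double y = r.nextDouble();
            if (x * x + y * y <= 1.0) inside++;
        }
        Map<String, Object> output = new HashMap<>();
        output.put("inside", inside);
        return output;
    }

    // Master side: aggregate the partial counts.
    void gatherData(int segment, Map<String, Object> out) {
        total += (Long) out.get("inside");
    }

    int getDataCount() { return randomSamples; }

    double getPi() {
        return (total / ((double) randomSamples * DOUBLE_DATA_SIZE)) * 4;
    }

    // Hypothetical driver playing the role of the framework.
    double run() {
        for (int segment = 0; segment < getDataCount(); segment++) {
            gatherData(segment, compute(diffuseData(segment)));
        }
        return getPi();
    }

    public static void main(String[] args) {
        System.out.println("The result is: " + new LocalMonteCarloWorkpool(3000, 1L).run());
    }
}
```

The value returned by getDataCount() fixes how many work units exist, independently of how many processors would execute them in the real framework.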

20 Complete module class
package edu.uncc.grid.example.workpool;

import java.util.Random;
import java.util.logging.Level;
import edu.uncc.grid.pgaf.datamodules.Data;
import edu.uncc.grid.pgaf.datamodules.DataMap;
import edu.uncc.grid.pgaf.interfaces.basic.Workpool;
import edu.uncc.grid.pgaf.p2p.Node;

public class MonteCarloPiModule extends Workpool {
    private static final long serialVersionUID = 1L;
    private static final int DoubleDataSize = 1000;
    double total;
    int random_samples;
    Random R;

    public MonteCarloPiModule() {
        R = new Random();
    }

    public void initializeModule(String[] args) {
        total = 0;
        Node.getLog().setLevel(Level.WARNING);  // reduce verbosity for logging
        random_samples = 3000;                  // set number of random samples
    }

    public Data Compute (Data data) {
        // input gets the data produced by DiffuseData()
        DataMap<String, Object> input = (DataMap<String, Object>) data;
        DataMap<String, Object> output = new DataMap<String, Object>();
        Long seed = (Long) input.get("seed");   // get random seed
        Random r = new Random();
        r.setSeed(seed);
        Long inside = 0L;
        for (int i = 0; i < DoubleDataSize; i++) {
            double x = r.nextDouble();
            double y = r.nextDouble();
            double dist = x * x + y * y;
            if (dist <= 1.0) {
                ++inside;
            }
        }
        output.put("inside", inside);  // store partial answer to return to GatherData()
        return output;                 // output will emit the partial answers done by this method
    }

    public Data DiffuseData (int segment) {
        DataMap<String, Object> d = new DataMap<String, Object>();
        d.put("seed", R.nextLong());
        return d;  // returns a random seed for each job unit
    }

    public void GatherData (int segment, Data dat) {
        DataMap<String, Object> out = (DataMap<String, Object>) dat;
        Long inside = (Long) out.get("inside");
        total += inside;  // aggregate answer from all the worker nodes
    }

    public double getPi() {
        // returns value of pi based on the job done by all the workers
        double pi = (total / (random_samples * DoubleDataSize)) * 4;
        return pi;
    }

    public int getDataCount() {
        return random_samples;
    }
}

21 Bootstrap class RunMonteCarloPiModule.java
package edu.uncc.grid.example.workpool;

import java.io.IOException;
import net.jxta.pipe.PipeID;
import edu.uncc.grid.pgaf.Anchor;
import edu.uncc.grid.pgaf.Operand;
import edu.uncc.grid.pgaf.Seeds;
import edu.uncc.grid.pgaf.p2p.Types;

public class RunMonteCarloPiModule {
    public static void main(String[] args) {
        // Deploys framework and runs code
        try {
            MonteCarloPiModule pi = new MonteCarloPiModule();
            Seeds.start( args[0] , false);
            PipeID id = Seeds.startPattern(
                new Operand( (String[])null,
                             new Anchor( args[1] , Types.DataFlowRoll.SINK_SOURCE),
                             pi ) );
            System.out.println(id.toString() );
            Seeds.waitOnPattern(id);
            System.out.println( "The result is: " + pi.getPi() ) ;
            Seeds.stop();
        } catch (SecurityException e) {
        }
    }
}

22 Discussion
Does anyone see a flaw in the code? (Clue: random number generation.)

23 Workpool pattern Matrix addition and multiplication
Matrix addition and multiplication very easy to parallelize as each result value independent of other result values.

24 Matrix Addition, C = A + B
Add corresponding elements of each matrix to form elements of result matrix. Given elements of A as $a_{i,j}$ and elements of B as $b_{i,j}$, each element of C computed as:
$$c_{i,j} = a_{i,j} + b_{i,j}$$
Easy to parallelize: each processor computes one C element or group of C elements.

25 Workpool Implementation
Slave computation: adds one row of A to one row of B to create one row of C (rather than each slave adding single elements).

26 Workpool implementation
Master (source/sink): sends one row of A and one row of B to each slave.
Slaves (compute nodes, one for each row): return one row of C.
The following example uses 3 x 3 arrays and 3 slaves.

27 MatrixAddModule.java
(Continues on several slides)
package edu.uncc.grid.example.workpool;
import …

public class MatrixAddModule extends Workpool {
    private static final long serialVersionUID = 1L;
    int[][] matrixA;
    int[][] matrixB;
    int[][] matrixC;

    public MatrixAddModule() {
        matrixC = new int[3][3];  // in this example matrices are 3 x 3
    }

    public void initMatrices() {  // some initial values
        matrixA = new int[][]{{2,5,8},{3,4,9},{1,5,2}};
        matrixB = new int[][]{{2,5,8},{3,4,9},{1,5,2}};
    }

    public int getDataCount() {   // required method: number of data objects (slaves)
        return 3;
    }

    public void initializeModule(String[] args) {
        Node.getLog().setLevel(Level.WARNING);
    }

28 DiffuseData method
public Data DiffuseData(int segment) {
    int[] rowA = new int[3];
    int[] rowB = new int[3];
    // DataMap d returned holds pairs of string key and associated array
    DataMap<String, int[]> d = new DataMap<String, int[]>();
    int k = segment;              // segment variable used to select rows
    for (int i = 0; i < 3; i++) { // copy one row of A and one row of B to be sent to slaves
        rowA[i] = matrixA[k][i];
        rowB[i] = matrixB[k][i];
    }
    d.put("rowA", rowA);          // rowA and rowB put in d DataMap to send to slaves
    d.put("rowB", rowB);
    return d;
}

29 Compute method
public Data Compute(Data data) {
    int[] rowC = new int[3];
    DataMap<String, int[]> input = (DataMap<String, int[]>) data;
    DataMap<String, int[]> output = new DataMap<String, int[]>();
    int[] rowA = (int[]) input.get("rowA");  // get two rows from data received
    int[] rowB = (int[]) input.get("rowB");
    for (int i = 0; i < 3; i++) {            // add rows
        rowC[i] = rowA[i] + rowB[i];
    }
    output.put("rowC", rowC);                // put result row into output with key to be sent back to master
    return output;
}

30 GatherData method
Note the segment variable and the Data from the slave.
public void GatherData(int segment, Data dat) {
    DataMap<String, int[]> out = (DataMap<String, int[]>) dat;
    int[] rowC = (int[]) out.get("rowC");  // get C row sent from slave
    for (int i = 0; i < 3; i++) {          // place row into result matrix
        matrixC[segment][i] = rowC[i];     // segment associated with Data used to choose correct row
    }
}

31 Bootstrap class - RunMatrixAddModule.java
In this example the path to Seeds and the local host name are command line arguments.
package edu.uncc.grid.example.workpool;
import …

public class RunMatrixAddModule {
    public static void main (String [] args ) {
        try {
            long start = System.currentTimeMillis();
            Seeds.start( args[0] , false);
            MatrixAddModule m = new MatrixAddModule();
            m.initMatrices();
            PipeID id = Seeds.startPattern(
                new Operand( (String[])null,
                             new Anchor( args[1], Types.DataFlowRoll.SINK_SOURCE),
                             m ) );
            Seeds.waitOnPattern(id);
            m.printResult();
            Seeds.stop();
            long stop = System.currentTimeMillis();
            double time = (double) (stop - start) / … ;
            System.out.println("Execution time = " + time);
            …

32 Matrix Multiplication, C = A * B
Multiplication of two matrices, A and B, produces matrix C whose elements, $c_{i,j}$ ($0 \le i < n$, $0 \le j < m$), are computed as follows:
$$c_{i,j} = \sum_{k=0}^{l-1} a_{i,k}\, b_{k,j}$$
where A is an n x l matrix and B is an l x m matrix.

33 Parallelizing Matrix Multiplication
Assume throughout that matrices are square (n x n matrices). Sequential code to compute A x B could simply be

for (i = 0; i < n; i++)          // for each row of A
    for (j = 0; j < n; j++) {    // for each column of B
        c[i][j] = 0;
        for (k = 0; k < n; k++)
            c[i][j] = c[i][j] + a[i][k] * b[k][j];
    }

Requires $n^3$ multiplications and $n^3$ additions. Sequential time complexity of $O(n^3)$. Very easy to parallelize as each result is independent.

34 Matrix Multiplication, C = A * B
One slave computes one element of result in workpool implementation

35 Workpool implementation
Master (source/sink): sends one row of A and one column of B to each slave.
Slaves (compute nodes, one for each element of result): return one element of C.
The following example uses 3 x 3 arrays and 9 slaves.

36 MatrixAddModule.java
(Continues on several slides)
package edu.uncc.grid.example.workpool;
import …

public class MatrixAddModule extends Workpool {
    private static final long serialVersionUID = 1L;
    int[][] matrixA;
    int[][] matrixB;
    int[][] matrixC;

    public MatrixAddModule() {
        matrixC = new int[3][3];  // in this example matrices are 3 x 3
    }

    public void initMatrices() {  // some initial values
        matrixA = new int[][]{{2,5,8},{3,4,9},{1,5,2}};
        matrixB = new int[][]{{2,5,8},{3,4,9},{1,5,2}};
    }

    public int getDataCount() {   // required method: number of data objects (slaves)
        return 9;
    }

    public void initializeModule(String[] args) {
        Node.getLog().setLevel(Level.WARNING);
    }

37 Note on mapping rows and columns to segments
segment   Arow   Bcol
   0        0      0
   1        0      1
   2        0      2
   3        1      0
   4        1      1
   5        1      2
   6        2      0
   7        2      1
   8        2      2

int Arow = segment / 3;
int Bcol = segment % 3;
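The mapping can be checked with a few lines of Java. This sketch only prints the row and column selected by each segment for the 3 x 3 case; the class name SegmentMapping and the helper names row and col are illustrative assumptions.

```java
// Sketch: integer division selects the row of A and the remainder selects
// the column of B, so segments 0..8 cover every element of C once.
class SegmentMapping {
    static final int N = 3;  // matrices are 3 x 3 in the slides

    static int row(int segment) { return segment / N; }

    static int col(int segment) { return segment % N; }

    public static void main(String[] args) {
        for (int segment = 0; segment < N * N; segment++) {
            System.out.println("segment " + segment
                    + " -> Arow " + row(segment)
                    + ", Bcol " + col(segment));
        }
    }
}
```

The mapping is invertible (segment = Arow * 3 + Bcol), which is what lets the master place each returned value without any extra bookkeeping.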

38 DiffuseData method
public Data DiffuseData(int segment) {
    int[] rowA = new int[3];
    int[] colB = new int[3];
    // DataMap d returned holds pairs of string key and associated array
    DataMap<String, int[]> d = new DataMap<String, int[]>();
    int a = segment / 3, b = segment % 3;  // segment variable used to select row of A and column of B
    for (int i = 0; i < 3; i++) {          // copy one row of A and one column of B to be sent to slaves
        rowA[i] = matrixA[a][i];
        colB[i] = matrixB[i][b];
    }
    d.put("rowA", rowA);                   // rowA and colB put in d DataMap to send to slaves
    d.put("colB", colB);
    return d;
}

39 Compute method
public Data Compute(Data data) {
    int[] rowC = new int[3];
    DataMap<String, int[]> input = (DataMap<String, int[]>) data;
    DataMap<String, Integer> output = new DataMap<String, Integer>();
    int[] rowA = (int[]) input.get("rowA");  // get row and column from data received
    int[] colB = (int[]) input.get("colB");
    int out = 0;
    for (int i = 0; i < 3; i++) {            // matrix multiplication, one result
        out += rowA[i] * colB[i];
    }
    output.put("out", out);                  // put result into output with key to be sent back to master
    return output;
}

40 GatherData method
Note the segment variable and the Data from the slave.
public void GatherData(int segment, Data dat) {
    DataMap<String, Integer> out = (DataMap<String, Integer>) dat;
    int answer = out.get("out");           // get result sent from slave*
    int a = segment / 3, b = segment % 3;  // segment associated with Data used to choose correct element
    matrixC[a][b] = answer;                // place element into result matrix
}
* Cast from Integer to int not necessary
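The three multiplication methods can be exercised together without Seeds. The following is a sketch under stated assumptions: a plain loop plays the role of the framework, java.util.HashMap stands in for DataMap, method names are lower-cased, and the class name LocalMatrixMultiplyWorkpool with its run() driver is hypothetical. The result is compared against the sequential triple loop.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Single-JVM imitation of the element-per-segment workpool for C = A * B.
class LocalMatrixMultiplyWorkpool {
    static final int N = 3;
    final int[][] a;
    final int[][] b;
    final int[][] c = new int[N][N];

    LocalMatrixMultiplyWorkpool(int[][] a, int[][] b) {
        this.a = a;
        this.b = b;
    }

    // Master side: pack row (segment / N) of A and column (segment % N) of B.
    Map<String, int[]> diffuseData(int segment) {
        int row = segment / N, col = segment % N;
        int[] rowA = new int[N];
        int[] colB = new int[N];
        for (int i = 0; i < N; i++) {
            rowA[i] = a[row][i];
            colB[i] = b[i][col];
        }
        Map<String, int[]> d = new HashMap<>();
        d.put("rowA", rowA);
        d.put("colB", colB);
        return d;
    }

    // Slave side: one dot product gives one element of C.
    Map<String, Integer> compute(Map<String, int[]> input) {
        int[] rowA = input.get("rowA");
        int[] colB = input.get("colB");
        int out = 0;
        for (int i = 0; i < N; i++) out += rowA[i] * colB[i];
        Map<String, Integer> output = new HashMap<>();
        output.put("out", out);
        return output;
    }

    // Master side: the segment number selects the element of C to fill.
    void gatherData(int segment, Map<String, Integer> out) {
        c[segment / N][segment % N] = out.get("out");
    }

    // Hypothetical driver playing the role of the framework.
    int[][] run() {
        for (int segment = 0; segment < N * N; segment++) {
            gatherData(segment, compute(diffuseData(segment)));
        }
        return c;
    }

    // Sequential triple loop used as a reference result.
    static int[][] multiplySequential(int[][] a, int[][] b) {
        int[][] c = new int[N][N];
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                for (int k = 0; k < N; k++)
                    c[i][j] += a[i][k] * b[k][j];
        return c;
    }

    public static void main(String[] args) {
        int[][] a = {{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
        int[][] b = {{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
        int[][] c = new LocalMatrixMultiplyWorkpool(a, b).run();
        System.out.println(Arrays.deepToString(c));
        System.out.println("matches sequential: "
                + Arrays.deepEquals(c, multiplySequential(a, b)));
    }
}
```

Sending one row and one column per element keeps each task independent, at the cost of transmitting each row of A and each column of B three times in the 3 x 3 case.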

41 Questions

