Tutorial for PARK data fitting Paul KIENZLE, Wenwu CHEN and Ziwen FU Reflectometry Group.


1 Tutorial for PARK data fitting Paul KIENZLE, Wenwu CHEN and Ziwen FU Reflectometry Group

2 Objective: Distributed Computing Environment
[Figure labels: User/Client; Service Server (Master Node, Management); Working Server (Cluster Working Nodes).]

3 Prerequisite
Python: version >= 2.4
Windows: cygwin
Client:
- wxPython: version >= 2.6
- matplotlib
Most services may need numpy.

4 Setup of park
Download:
- Source code: svn co
- Package for unix/linux: park tar.gz or park tar.bz2
- Package for windows: park zip
Edit cluster config file:
- park/config/hosts
Start service server:
- park/servers/mapServer.py
Start client:
- park/client/AppJob.py
Provide services:
- park/services

5 Setup of park in Unix/Linux
Download park tar.gz or park tar.bz2 from
Unzip the file:
    tar -xvzf park tar.gz
Make the installation:
    cd park
    make install
or
    setup.py install --install-purelib=home_directory_of_park
The command make install is equivalent to setup.py install --install-purelib=~. It installs park in the directory ~/park.

6 Setup of park in Windows
Download park zip or park tar.bz2 from
Unzip the file:
    unzip park zip
Make the installation in an MS-DOS window:
    cd park
    setup.py install
It installs park in the directory ~/Lib/site-packages/park.

7 Edit the config file
The server uses park/config/hosts to configure the working nodes.
Example of park/config/hosts:
    #
    # hosts configure file for park
    # example for compufans.ncnr.nist.gov cluster:
    # 4 nodes, each node with 2 cpus
    #
    # the format is similar to that of /etc/hosts:
    # ip_address full_name alias_name[:port:number_of_cpus]
    # localhost.localdomain localhost:5300:2
    # n4.ncnr.nist.gov n4:6500:2
    # n3.ncnr.nist.gov n3:6300:2
    # n2.ncnr.nist.gov n2:6200:2
    # n1.ncnr.nist.gov n1:6100:2
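The line format described above (ip_address full_name alias_name[:port:number_of_cpus]) can be read with a few lines of Python. The following is only a sketch written in modern Python 3, not PARK's own parser: the names HostEntry and parse_hosts, and the 192.0.2.x example addresses, are hypothetical and were chosen for illustration.

```python
from typing import List, NamedTuple, Optional


class HostEntry(NamedTuple):
    """One working node as described by a hosts-style line (hypothetical type)."""
    ip_address: str
    full_name: str
    alias: str
    port: Optional[int]
    cpus: Optional[int]


def parse_hosts(text: str) -> List[HostEntry]:
    """Parse lines of the form: ip_address full_name alias_name[:port:number_of_cpus]."""
    entries = []
    for raw in text.splitlines():
        # Drop comments (everything after '#') and surrounding whitespace.
        line = raw.split('#', 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) != 3:
            raise ValueError('expected 3 fields, got %d: %r' % (len(fields), raw))
        ip_address, full_name, alias_spec = fields
        # The alias may carry an optional ":port:number_of_cpus" suffix.
        alias, _, rest = alias_spec.partition(':')
        port = cpus = None
        if rest:
            port_text, _, cpus_text = rest.partition(':')
            port = int(port_text)
            cpus = int(cpus_text) if cpus_text else None
        entries.append(HostEntry(ip_address, full_name, alias, port, cpus))
    return entries


if __name__ == '__main__':
    # Assumed sample content; the addresses are documentation-only examples.
    sample = """
    # two hypothetical nodes
    192.0.2.1 n1.example.org n1:6100:2
    192.0.2.2 n2.example.org n2
    """
    for entry in parse_hosts(sample):
        print(entry)
```

Returning the port and cpu count as None when the suffix is absent keeps the optional part of the format visible to the caller instead of hiding it behind a default.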

8 Start the server
The server is park/servers/mapServer.py:
    cd park/servers
    python mapServer.py
Or, under cygwin in Windows:
    cd Lib/site-packages/park/servers
    python mapServer.py
The full command is:
    python mapServer.py –port port –host host_name –log log_file_name

9 Start the server
Make sure that python and its environment are set correctly.
Make sure that RSH, defined in park/servers/environ.py, is set to the remote shell command for a cluster with multiple working nodes.
Make sure that this remote shell command can start the remote command without a password.
Make sure that the services are executable files.
Common errors:
[Errno 2] No such file or directory: '~/park/config/hosts'
- The config file hosts is missing.
ERROR (111, 'Connection refused')
- The working server did not start.
- Make sure that the port is not in use.
ERROR (xxx, 'port is used')
- Wait a while before restarting the server.
- Make sure that the port is not in use.
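Both errors above come down to whether a TCP port is already taken. One way to check this before starting a server is to try binding to the port, as in the sketch below. It is written in modern Python 3 and is not part of PARK; the helper name port_is_free is hypothetical, and 5400 is simply the default port mentioned for the client connection.

```python
import socket


def port_is_free(port, host='127.0.0.1'):
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": another process holds the port.
            return False
        return True


if __name__ == '__main__':
    port = 5400
    if port_is_free(port):
        print('port %d looks free' % port)
    else:
        print('port %d is in use; wait a while or pick another port' % port)
```

The check is only a snapshot: another process can still grab the port between the check and the server start, so the server should still handle a failed bind.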

10 Stop the server
Shut down the service server with Ctrl-C or the kill command. Use kill without the -9 option, so that the working server program is stopped as well. Otherwise the working server keeps running even after the service server is killed.

11 Start the client
Enter ~/park/client.
Run the client application:
    $ python AppJob.py
Connect to the server:
- server > server | port (default port is 5400)
- Click the connect button to connect to the server.
Prepare and submit the service request:
- shell > load: load an xml service request, which is shown in the upper text field.
- Click the submit button to submit the service request.
- Messages related to the service request are shown in the lower text field.
View the service results:
- view: view the results. Three types of data can be viewed: experimental data (with error bars), simulation data, and chi square. The experimental and simulation data show only the best results, and chi square shows the improvement of chisq during data fitting. Under the panel is a toolbar, which can be used to zoom in/out, save the figure, and change the properties of the figure (property button).
Shut down the client:
- server > disconnect, then close the window
- or close the window directly.

12 Map-reduce parallel pattern
Map: the master node assigns working unit [i] to working node [j]:
- map(fn, input[i]) = output[i] on working node j
Reduce: the master node collects the messages from each working node, performs the reduce function, and sends the result to the user:
- reduce(gn, output[0], …, output[n]) => send to the user client
[Figure labels: Service Server (mapping), Working Nodes, Service Server (reducing).]
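The map and reduce steps above can be tried on a single machine. The sketch below is written in modern Python 3 and uses a thread pool as a stand-in for the working nodes; it is not how PARK distributes work. The function names map_reduce, square, and add are hypothetical, and functools.reduce combines the outputs pairwise rather than receiving all of them in one call as in the slide's notation.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_reduce(fn, gn, inputs, n_workers=4):
    """Apply fn to every input in parallel, then fold the outputs with gn."""
    # Map step: each input[i] is handed to some worker and yields output[i].
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        outputs = list(pool.map(fn, inputs))  # results keep the input order
    # Reduce step: the "master" combines all outputs into a single result.
    # Note: reduce raises TypeError if inputs is empty.
    return reduce(gn, outputs)


def square(x):
    """Hypothetical work unit."""
    return x * x


def add(a, b):
    """Hypothetical reduce function."""
    return a + b


if __name__ == '__main__':
    print(map_reduce(square, add, [1, 2, 3, 4]))
```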

13 Service request
[Figure labels: reduce function, map function, inputs.]

14 Software Infrastructure of PARK for data fitting
[Figure labels: Service Server, Service Working Nodes, User Interface, Scientist, View Developer, Reduce Service Developer, Data reduction, Model Developer, Data simulation, Data presentation, Data View.]

15 Reduce function
The class inherits from park/services/reduce/reduce.Reduce.

    class Reduce:
        """ A base class as the reduce function. """
        def __init__(self):
            """ constructor. """
            self.archive = None
            self.msgqueue = None
        def setArchive(self, archive):
            """ set the archive to store data """
            self.archive = archive
        def setMsgQueue(self, msgqueue):
            """ set the message queue. """
            self.msgqueue = msgqueue
        def __call__(self, msg):
            """ called by the PARK to process the reply from the working node. """
            pass

16 An example of Reduce function
park/services/reduce/Chisq.Chisq:

    class Chisq(Reduce):
        """ A class to handle the chisq for data fitting. """
        def __init__(self):
            """ constructor. """
            Reduce.__init__(self)
            self.chisq = None
        def __call__(self, reply):
            # archive every reply under its group id and job id
            keys = {}
            keys['gid'] = reply.gid
            keys['jid'] = reply.id
            self.archive.put(keys, str(reply))
            if hasattr(reply, 'chisq'):
                chisqval = reply.chisq
                # keep the smallest chisq seen so far
                if self.chisq is None:
                    self.chisq = chisqval
                elif chisqval < self.chisq:
                    self.chisq = chisqval
                self.msgqueue.putMsg(reply.gid, '%s ' \
                    %(XML_HEADER, str(reply.gid), str(reply.id), str(chisqval)))
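To see how such a reduce class behaves, it can be exercised outside PARK with simple stand-ins. The sketch below is written in modern Python 3 and is an assumption-laden illustration: the Reduce base class is copied from the slide above, while ListArchive, ListMsgQueue, BestChisq, the plain-text message format, and the reply objects built with SimpleNamespace are all hypothetical and only mimic the put and putMsg calls used in the example.

```python
from types import SimpleNamespace


class Reduce:
    """Base class as shown in the tutorial."""
    def __init__(self):
        self.archive = None
        self.msgqueue = None

    def setArchive(self, archive):
        self.archive = archive

    def setMsgQueue(self, msgqueue):
        self.msgqueue = msgqueue

    def __call__(self, msg):
        pass


class ListArchive:
    """Hypothetical in-memory archive offering put(keys, value)."""
    def __init__(self):
        self.items = []

    def put(self, keys, value):
        self.items.append((dict(keys), value))


class ListMsgQueue:
    """Hypothetical in-memory message queue offering putMsg(gid, msg)."""
    def __init__(self):
        self.messages = []

    def putMsg(self, gid, msg):
        self.messages.append((gid, msg))


class BestChisq(Reduce):
    """Track the smallest chisq among the replies of the working nodes."""
    def __init__(self):
        Reduce.__init__(self)
        self.chisq = None

    def __call__(self, reply):
        # Archive every reply, keyed by group id and job id.
        self.archive.put({'gid': reply.gid, 'jid': reply.id}, str(reply))
        if hasattr(reply, 'chisq'):
            if self.chisq is None or reply.chisq < self.chisq:
                self.chisq = reply.chisq
            # Plain-text message; the format here is invented for the sketch.
            self.msgqueue.putMsg(
                reply.gid,
                'job %s: chisq=%s best=%s' % (reply.id, reply.chisq, self.chisq))


if __name__ == '__main__':
    reducer = BestChisq()
    reducer.setArchive(ListArchive())
    reducer.setMsgQueue(ListMsgQueue())
    for jid, value in enumerate([3.0, 1.5, 2.0]):
        reducer(SimpleNamespace(gid='g0', id=jid, chisq=value))
    print(reducer.chisq)
    for gid, msg in reducer.msgqueue.messages:
        print(gid, msg)
```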

17 map function
1. The pure python function.
- Runs as a thread in PARK.
- Bad scalability for SMP (due to the python multithreading implementation).
- Only works for pure python functions.
- Format: output_string function_name(input_string)
2. The executable program.
- Runs as a separate process in PARK.
- Excellent scalability for SMP.
- Works for any executable program.
- Needs more memory and a longer start-up time.
- Reads input from standard input and writes the results to standard output.
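The first form, output_string function_name(input_string), might look like the sketch below. It is written in modern Python 3 and is not taken from PARK: the function name sqrt_sum, the element names work and result, and the total attribute are hypothetical. Only the idea of an xml input string carrying a count attribute is borrowed from the tutorial's longwin example.

```python
import math
from xml.dom import minidom


def sqrt_sum(input_string):
    """String-in, string-out map function: sum sqrt(k) for k = 1..count."""
    # The input is assumed to be a small xml document such as <work count="4"/>.
    node = minidom.parseString(input_string).documentElement
    count = int(node.getAttribute('count'))
    total = sum(math.sqrt(k) for k in range(1, count + 1))
    # The output is again a string, so it can be passed on as a reply message.
    return '<result count="%d" total="%.6f"/>' % (count, total)


if __name__ == '__main__':
    print(sqrt_sum('<work count="4"/>'))
```

Keeping both sides of the function as strings means the same work unit could later be wrapped as an executable that reads standard input and writes standard output, which is the second form described above.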

18 An example of map function
park/services/tester/longwinstr.py:

    if __name__ == '__main__':
        try:
            longwin()
        except:
            sys.stderr.write('Exception:%s' %(sys.exc_info()[1]))

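19 An example of map function

    import math
    import sys
    from xml.dom import minidom

    def longwin():
        print 'call longwin'
        # the service request arrives as xml on standard input
        s0 = sys.stdin.read()
        node = minidom.parseString(s0).childNodes[0]
        t = int(node.getAttribute('count'))
        # small values of t are treated as an exponent
        if t > 25:
            count = t
        else:
            count = 2**t
        print ' Start work with iteration number: ', t
        # busy loop standing in for real work
        cnt = 0
        while (cnt < count):
            a = math.sqrt(2.0)
            cnt += 1
        print ' finish work: cnt=', cnt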
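An executable map program of this kind only needs its input on standard input and writes its result to standard output, so it can be driven from any parent process. The sketch below, in modern Python 3 (3.7 or later for capture_output), shows one way to do that with the standard subprocess module. It does not reproduce how PARK launches services: run_map_program, demo, the inline WORKER_SOURCE script, and the work element name are hypothetical.

```python
import subprocess
import sys

# A tiny stand-in for an executable map program: it reads an xml request
# from standard input and writes a one-line result to standard output.
WORKER_SOURCE = r'''
import sys
from xml.dom import minidom
node = minidom.parseString(sys.stdin.read()).documentElement
count = int(node.getAttribute("count"))
sys.stdout.write("finished: cnt=%d" % count)
'''


def run_map_program(argv, input_string, timeout=30):
    """Run an executable, feed it input_string on stdin, return its stdout."""
    proc = subprocess.run(argv, input=input_string, capture_output=True,
                          text=True, timeout=timeout)
    if proc.returncode != 0:
        # Errors written to stderr by the program are surfaced to the caller.
        raise RuntimeError('map program failed: %s' % proc.stderr.strip())
    return proc.stdout


def demo():
    """Run the stand-in worker with a hypothetical request."""
    return run_map_program([sys.executable, '-c', WORKER_SOURCE],
                           '<work count="5"/>')


if __name__ == '__main__':
    print(demo())
```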

20 Fully Distributed Services?
[Figure labels: Service Register, Cluster Management, Service Management, Job Queue, Message Queue, Data Fetching, Archive, Logging, Task Management, User Client, Services, Shared Files.]

21 Pull or put?
[Figure labels: Working Server, Job Server, Message Server.]
1. The job server sends the job to the working server, and the working server sends the results to the message server.
2. The job server sends the job to the working server, and the message server retrieves the results from the working server.
3. The working server retrieves the job from the job server and sends the results to the message server.
4. The working server retrieves the job from the job server, and the message server retrieves the results from the working server.
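Option 3 above, where the working server pulls jobs and pushes results, can be imitated in one process with two queues. The sketch below is written in modern Python 3 and makes several assumptions: threads stand in for working servers, queue.Queue objects stand in for the job server and the message server, and the names working_server and run_pull_model are hypothetical. It says nothing about which of the four options PARK adopts.

```python
import queue
import threading


def working_server(job_queue, message_queue, fn):
    """Pull jobs until a None sentinel arrives; push each result as a message."""
    while True:
        job = job_queue.get()
        if job is None:  # sentinel: no more work for this worker
            break
        job_id, payload = job
        message_queue.put((job_id, fn(payload)))


def run_pull_model(inputs, fn, n_workers=2):
    """Queue all jobs, let the workers pull them, then collect the messages."""
    job_queue = queue.Queue()      # stands in for the job server
    message_queue = queue.Queue()  # stands in for the message server
    for job_id, payload in enumerate(inputs):
        job_queue.put((job_id, payload))
    for _ in range(n_workers):
        job_queue.put(None)        # one sentinel per worker
    workers = [threading.Thread(target=working_server,
                                args=(job_queue, message_queue, fn))
               for _ in range(n_workers)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    # Results arrive in completion order; restore the job order by id.
    results = {}
    while not message_queue.empty():
        job_id, output = message_queue.get()
        results[job_id] = output
    return [results[job_id] for job_id in range(len(inputs))]


if __name__ == '__main__':
    print(run_pull_model([1, 2, 3], lambda x: x * 10))
```

With pulling, a slow worker simply takes fewer jobs, so the job server does not need to know how busy each working server is.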

22 Security: authentication and authorization
[Figure labels: Working Server, Job Server, Security Server, Message Server.]

23 Data Transfer
1. Provide a data center server for the cluster, which retrieves data from the remote data server and stores it for access by the local working nodes. This is necessary for diskless nodes in the cluster.
2. Provide a reference to the remote data (similar to a url), and let each working node access the data individually.

24 UI/Visualization
- MVC model
- Traits-UI
- 2D/3D

25 Multi-tier of PARK
[Figure labels: Service Server, Working Server, Reduce Server, Data Server, Client Server. Legend: explicit direct connection, implicit direct connection, possible connection.]
All are working as both the server and the client.


