

Computing Simulation in Orders Based Transparent Parallelizing
Pavlenko Vitaliy Danilovich, Odessa National Polytechnic University
Burdeinyi Viktor Viktorovych, I.I. Mechnikov Odessa National University

Purpose of research
► There is a huge number of problems that cannot be solved fast enough on a single processor
► Clusters are often used because of their comparatively low price and high scalability
► Cluster computing libraries usually offer the user only low-level operations for sending and receiving messages
Computing Simulation in Orders Based Transparent Parallelizing, ICIM-2008, September

Requirements for the technology
► A high-level technology (to make development and debugging faster and easier)
► Transparency of the parallel architecture
► Efficiency: low overhead, low processor idle time
► High speed and low labor intensity of parallel application development

First principle
[Diagram: the main procedure and nested procedures run on separate processors; nested procedures receive input parameters and produce output parameters]

Second principle
[Diagram: data requests]

Implementing the principles
How to get values before they are used?
► Data-getting logic is added by a preprocessor
► Data-getting logic is added by a bytecode analyzer (.NET, Java)
► Classes encapsulate access to the data they contain, and the data presence check is located in the methods of such classes
How to declare the selected procedures?
► Procedures are declared in an external file
► Procedures are marked in the source with special comments
► Procedures are annotated (C#, Java)
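The third option above, wrapper classes that hide the presence check, can be sketched as a minimal future-like class in Java. The slide does not give the framework's actual class or method names, so RemoteValue, set and get here are illustrative:

```java
// Illustrative sketch: a wrapper that encapsulates access to a value
// and blocks the caller until the server has delivered it.
class RemoteValue<T> {
    private T value;
    private boolean present = false;

    // Called by the runtime when the server delivers the value.
    synchronized void set(T value) {
        this.value = value;
        this.present = true;
        notifyAll();                 // wake up any waiting user code
    }

    // Called by user code; the presence check is hidden in the accessor.
    synchronized T get() throws InterruptedException {
        while (!present) {
            wait();                  // suspend until the value arrives
        }
        return value;
    }
}
```

User code simply calls get() and never sees whether the value was computed locally or is still in flight, which is exactly the transparency the technology aims for.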

Algorithms
Order execution algorithm:
► Get the ID of the procedure to execute from the server
► Get the values of input and input/output parameters, or the corresponding IDs, from the server
► Execute the procedure
► Send the values of output and input/output parameters to the server; if some values are not computed yet, send the corresponding IDs instead
► Notify the scheduler about a new free processor
Algorithm of making an order:
► Send the procedure ID to the server
► Send the values of input and input/output parameters, or the corresponding IDs, to the server
► Get IDs for output and input/output parameters from the server
► Bind the IDs to the corresponding variables
Value getting algorithm:
► Request the data from the server
► If there is no value on the server, suspend the current thread and notify the scheduler about a new free processor; repeat the data request after getting a notification from the scheduler
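The ID-based bookkeeping behind "making an order" can be modeled in a few lines. This is an in-memory toy standing in for the real client-server protocol; OrderServer and its method names are my own and not part of the framework:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the server-side ID bookkeeping: the server hands out
// IDs for output parameters, workers deliver computed values against
// those IDs, and callers request values by ID.
class OrderServer {
    private int nextId = 0;
    private final Map<Integer, Object> values = new HashMap<>();

    // Making an order: assign an ID for an output parameter.
    int registerOutput() {
        return nextId++;
    }

    // Order execution: the worker sends a computed value back.
    void deliver(int id, Object value) {
        values.put(id, value);
    }

    // Value getting: null means "not computed yet" in this sketch
    // (the real framework suspends the requesting thread instead).
    Object request(int id) {
        return values.get(id);
    }
}
```

In the real framework the "not computed yet" case triggers the thread suspension described in the value getting algorithm, rather than returning a sentinel.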

States of an order
1. Some computer of the cluster gets the order.
2. The order tries to get data that has not been computed yet.
3. The needed data has been computed.
4. The number of concurrently executed orders is less than the number of processors of the computer, so execution of the order can continue.
5. Order execution has completed successfully.
6. Order execution threw an exception.
7. Another method that had to compute data for the current one has thrown an exception.
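The seven transitions above can be written down as a simple state enum; the state names are my own paraphrase of the slide, not identifiers from the framework:

```java
// Order lifecycle states, paraphrasing the seven transitions above.
enum OrderState {
    RECEIVED,          // 1. a cluster computer gets the order
    WAITING_FOR_DATA,  // 2. the order requested data not yet computed
    DATA_READY,        // 3. the needed data has been computed
    RUNNING,           // 4. a processor became available, execution continues
    COMPLETED,         // 5. execution finished successfully
    FAILED,            // 6. execution threw an exception
    FAILED_DEPENDENCY; // 7. a method this order depends on threw an exception

    // Terminal states need no further scheduling.
    boolean isTerminal() {
        return this == COMPLETED || this == FAILED || this == FAILED_DEPENDENCY;
    }
}
```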

Execution time estimation
► Needed during development and optimization
► Usual tools: profiling and asymptotic analysis
► Complications in the case of parallel computing: long test runs, no common time scale, and communication must be taken into consideration

Computing emulation, 1st step

    Matrix b;
    ExecutionRuntime.startBlock(Math.pow(a.getN(), 3));
    if (ExecutionRuntime.executeBlocks())
        b = a.mulBy(a);
    else
        b = new Matrix(a.getN());
    ExecutionRuntime.endBlock();

Conditions for replacing a block with its fast alternative:
1. An asymptotically tight bound for the block execution time is known
2. No orders are made and no data getting is performed inside the block
3. The result of block execution is not used, either directly or indirectly, in further compare operations
4. There is a fast alternative implementation that does not break further execution
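To make the pattern concrete, here is a toy re-implementation of the three runtime calls used above: in emulation mode, executeBlocks() returns false, so the cheap alternative branch runs and the runtime merely accumulates the declared asymptotic cost. This stub is my own sketch; the framework's real ExecutionRuntime is not shown on the slides:

```java
// Toy emulation runtime: accumulates declared block costs instead of
// timing real execution. Illustrative only.
class ExecutionRuntime {
    private static boolean emulating = true;
    private static double estimatedTime = 0;

    static void startBlock(double cost) {
        if (emulating) estimatedTime += cost;   // account the block's cost
    }

    static boolean executeBlocks() {
        return !emulating;                      // skip real work when emulating
    }

    static void endBlock() { }

    static double estimatedTime() { return estimatedTime; }
}
```

With this stub, the matrix-multiply block above would charge n^3 "time units" to the estimate while actually constructing only an empty placeholder matrix.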

Computing emulation, 2nd step

Computing emulation, 3rd step

Computing emulation, 4th step

Sample problem
► Algorithm:
 Find the averages of the feature distributions for each class and build covariance matrices
 For each non-empty set of features:
► Build a quadratic decision rule and calculate its values using the provided feature values of the objects of both classes
► Find the maximal probability of correct recognition for this rule
► Save the result as the best one on the first iteration, or compare it with the current best result on any other iteration
► Parallelizing:
 Unite all possible sets of features into groups and create a procedure that analyzes a group of sets. Implement classes for all types that can be passed as parameters, according to the requirements of the framework. Make the results of group processing be compared only after all groups are analyzed.
 Declare the main procedure and the group analysis procedure according to the requirements of the framework.
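The grouping step of the parallelization can be sketched with bitmask subset enumeration. The bitmask encoding and the group size are my own illustrative choices; the slide does not specify how sets are represented or how large a group is:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: enumerate all non-empty feature subsets (as bitmasks) and
// split them into fixed-size groups, one group per ordered procedure.
class FeatureGroups {
    static List<List<Integer>> groups(int nFeatures, int groupSize) {
        List<List<Integer>> groups = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        for (int mask = 1; mask < (1 << nFeatures); mask++) { // skip empty set
            current.add(mask);
            if (current.size() == groupSize) {
                groups.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) groups.add(current);          // final partial group
        return groups;
    }
}
```

For n features there are 2^n - 1 non-empty subsets, so the group size controls the trade-off between scheduling overhead (many small orders) and load imbalance (few large ones).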

Results of experiment
The value P = N*T (N – number of computers, T – execution time) grows by not more than 1.13% when using 2, 3 or 5 computers instead of one, and by not more than 3.25% when using 10 computers instead of one

Conclusion
► A new parallel application development technology, based on transparent replacement of method calls with their execution on other computers of the cluster, is proposed. The proposed technology has been implemented as a framework in the Java programming language.
► The framework has been tested by solving the problem of determining the diagnostic value of formed features.
► The efficiency of the technology has been demonstrated by solving a sample problem on a cluster of 10 computers: the value P = N*T (N – number of computers, T – problem solving time) grew by not more than 3.25% compared to the case of one computer.