A SPMD Model for OCR (with collectives)
Sanjay Chatterjee, 2/9/2015
Intel Confidential



OCR SPMD model

- A SPMD context in OCR is a collection of individual logical execution units called ranks.
- A rank has a unique id within a SPMD context and can be viewed as a sequential chain of SPMD EDTs.
- SPMD EDTs have special semantics and can exist only within a SPMD context.
- A SPMD context includes two kinds of SPMD EDT templates: compute and sync.
- SPMD ranks collectively start computation by individually calling COMPUTE.
- SPMD ranks collectively synchronize by individually calling SYNC.
- A SPMD EDT restarts itself by calling NEXT.

[Figure: a SPMD context with four ranks, each shown as a chain of init (I), compute (C) and sync (S) steps across a compute phase, a collective sync phase, and a second compute phase.
  RANK 0: I0 C00 S00 S01 S02 S03 C01
  RANK 1: I1 C10 C11 C12 S10 S11 C13
  RANK 2: I2 C20 C21 C22 S20 S21 S22 C23
  RANK 3: I3 C30 C31 C32 S30 S31 C33
 Edge legend: RANK MESSAGE, NEXT, SYNC, COMPUTE.]

Creating and launching a SPMD Context

    u8 ocrSpmdLaunch(u64 numRanks, ocrInitFunc_t funcPtr, ocrGuid_t initDb,
                     ocrGuid_t computeTemplate, ocrGuid_t syncTemplate,
                     ocrGuid_t outputEvent);

- [in] numRanks: Number of ranks in the SPMD context.
- [in] funcPtr: Rank initialization function, of type
      typedef void (*ocrInitFunc_t)(void *initDbPtr)
- [in] initDb: DB that is passed to the initialization function on every rank. Every rank gets a private copy of this DB, which is destroyed after the init function completes.
- [in] computeTemplate: Compute SPMD EDT template.
- [in] syncTemplate: Collective synchronization SPMD EDT template.
- [in] outputEvent: SPMD output event.
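
The "private copy per rank" rule for initDb can be illustrated with a small stand-alone simulation. This is only a sketch under stated assumptions: sim_spmd_launch, sim_init_func_t, sim_sample_init and sim_seen_total are hypothetical names invented here, ranks run one after another instead of in parallel, and the DB is modelled as a plain memory buffer rather than an OCR datablock.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ocrInitFunc_t; not the OCR type itself. */
typedef void (*sim_init_func_t)(void *initDbPtr);

/* Sum of the values every rank observed in its copy of the init DB. */
static long sim_seen_total = 0;

/* Sample init function: reads its private copy, then overwrites it.
 * The overwrite must not be visible to any other rank. */
static void sim_sample_init(void *initDbPtr)
{
    long *value = (long *)initDbPtr;
    sim_seen_total += *value;
    *value = 0;
}

/* Simulated launch: each rank gets its own copy of init_db, which is
 * destroyed as soon as that rank's init function has finished.
 * Returns 0 on success, 1 if a copy could not be allocated. */
static int sim_spmd_launch(unsigned num_ranks, sim_init_func_t init,
                           const void *init_db, size_t db_size)
{
    assert(init != NULL && init_db != NULL && db_size > 0);
    for (unsigned rank = 0; rank < num_ranks; rank++) {
        void *private_copy = malloc(db_size);
        if (private_copy == NULL)
            return 1;
        memcpy(private_copy, init_db, db_size);
        init(private_copy);   /* rank-local initialization */
        free(private_copy);   /* copy does not outlive the init function */
    }
    return 0;
}
```

Launching four simulated ranks with a seed of 5 leaves the seed untouched, even though every rank zeroes its own copy.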

SPMD EDTs vs Regular EDTs

SPMD EDTs are similar to regular EDTs, with some differences:

- SPMD EDTs are anonymized, i.e., they do not have a guid.
- A SPMD EDT only lives within a SPMD context and is associated with a rank.
- Returning from a SPMD EDT will exit the rank from the SPMD context.
- A SPMD EDT can restart itself by calling NEXT.
- A SPMD EDT is created as either a compute or a synchronization EDT.
  - A compute EDT can call SYNC to exit itself and start a new sync EDT on the same rank.
  - A sync EDT can call COMPUTE to exit itself and start a new compute EDT on the same rank.
  - A compute EDT calling COMPUTE or a sync EDT calling SYNC is an error.
- A SPMD EDT in one rank can communicate with another rank using rank messages.
- A SPMD EDT can add a self dependence.
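
The legal COMPUTE/SYNC/NEXT transitions listed above can be read as a small per-rank state machine. The sketch below is illustrative only: rank_state_t, spmd_call_t and spmd_transition are hypothetical names, not OCR API. It also assumes that the initialization function may call either COMPUTE or SYNC (as stated on the COMPUTE and SYNC slides), that returning is only modelled for EDTs, and that any call after the rank has exited is an error.

```c
#include <assert.h>

/* Hypothetical per-rank state; none of these names come from OCR. */
typedef enum { CTX_INIT, CTX_COMPUTE, CTX_SYNC, CTX_EXITED } rank_state_t;
typedef enum { CALL_COMPUTE, CALL_SYNC, CALL_NEXT, CALL_RETURN } spmd_call_t;

/* Applies one call to the rank state.
 * Returns 0 and updates *state if the call is legal, 1 if it is an error. */
static int spmd_transition(rank_state_t *state, spmd_call_t call)
{
    assert(state != 0);
    switch (call) {
    case CALL_COMPUTE:
        /* Legal from the init function or a sync EDT; compute -> COMPUTE is an error. */
        if (*state != CTX_INIT && *state != CTX_SYNC)
            return 1;
        *state = CTX_COMPUTE;
        return 0;
    case CALL_SYNC:
        /* Legal from the init function or a compute EDT; sync -> SYNC is an error. */
        if (*state != CTX_INIT && *state != CTX_COMPUTE)
            return 1;
        *state = CTX_SYNC;
        return 0;
    case CALL_NEXT:
        /* NEXT restarts the current EDT, so the kind of EDT does not change. */
        if (*state != CTX_COMPUTE && *state != CTX_SYNC)
            return 1;
        return 0;
    case CALL_RETURN:
        /* Returning from a SPMD EDT exits the rank from the context. */
        if (*state != CTX_COMPUTE && *state != CTX_SYNC)
            return 1;
        *state = CTX_EXITED;
        return 0;
    }
    return 1;
}
```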

Creating SPMD EDTs in a rank

Similar to regular EDT creation, except that there is no output guid and no input template parameter.

    u8 ocrComputeSpmdEdtCreate(u32 paramc, u64* paramv, u32 depc, ocrGuid_t *depv,
                               u16 properties, ocrGuid_t affinity, ocrGuid_t *outputEvent);

    u8 ocrSyncSpmdEdtCreate(u32 paramc, u64* paramv, u32 depc, ocrGuid_t *depv,
                            u16 properties, ocrGuid_t affinity, ocrGuid_t *outputEvent);

SPMD EDTs have to be created inside the initialization function, before the first call to COMPUTE or SYNC.

SPMD Rank Messages

- SPMD rank messages support point-to-point communication between ranks.
- Messages can be communicated only between the same kind of SPMD EDT templates:
  - Compute SPMD EDTs on one rank can only send/receive messages to/from compute SPMD EDTs on other ranks.
  - Sync SPMD EDTs on one rank can only send/receive messages to/from sync SPMD EDTs on other ranks.
- Message ordering at the source rank is guaranteed to be maintained at the destination rank's depv slot.

    u8 ocrSend(u64 dstRank, u64 dstSlot, ocrGuid_t db);

- [in] dstRank: rank id of the message destination rank.
- [in] dstSlot: slot id at the destination rank.
- [in] db: guid of the datablock communicated.
- Called by the message source.
- The message send is guaranteed to be complete after NEXT is called.
- Another send to the same location and slot is permitted only after calling NEXT.

    u8 ocrRecv(u64 srcRank, u64 dstSlot);

- [in] srcRank: rank id of the message source rank.
- [in] dstSlot: slot id in the current rank where the message will be received.
- Called by the message destination.
- The DB at the destination can be accessed in the slot after calling NEXT.
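
The interplay between send, receive and NEXT can be sketched as a single-process simulation. Everything here is an assumption made for illustration: sim_ctx_t, sim_send, sim_recv and sim_next are hypothetical names, a datablock is modelled as a long, the receive does not match on the source rank, and sim_next never blocks (a receiver that calls it before the data has arrived simply sees nothing yet, whereas a real NEXT would wait).

```c
#include <assert.h>

#define SIM_RANKS 4
#define SIM_SLOTS 2

/* Hypothetical simulation state; zero-initialize before use. */
typedef struct {
    int  sent[SIM_RANKS][SIM_RANKS][SIM_SLOTS];     /* [src][dst][slot]: sent, sender has not called NEXT yet */
    long sent_val[SIM_RANKS][SIM_RANKS][SIM_SLOTS];
    int  arrived[SIM_RANKS][SIM_SLOTS];             /* [dst][slot]: send complete, not yet visible */
    long arrived_val[SIM_RANKS][SIM_SLOTS];
    int  recv_posted[SIM_RANKS][SIM_SLOTS];         /* [dst][slot]: receive requested */
    long depv[SIM_RANKS][SIM_SLOTS];                /* what the EDT on each rank can see */
} sim_ctx_t;

/* Returns 1 if the sender already has an un-NEXTed send to this rank and slot. */
static int sim_send(sim_ctx_t *c, int src, int dst, int slot, long value)
{
    assert(src >= 0 && src < SIM_RANKS && dst >= 0 && dst < SIM_RANKS);
    assert(slot >= 0 && slot < SIM_SLOTS);
    if (c->sent[src][dst][slot])
        return 1;
    c->sent[src][dst][slot] = 1;
    c->sent_val[src][dst][slot] = value;
    return 0;
}

/* Posts a receive into one of the caller's own slots. */
static void sim_recv(sim_ctx_t *c, int me, int slot)
{
    c->recv_posted[me][slot] = 1;
}

/* NEXT on rank `me`: completes its sends, then exposes arrived data
 * in every slot for which a receive was posted. */
static void sim_next(sim_ctx_t *c, int me)
{
    for (int dst = 0; dst < SIM_RANKS; dst++) {
        for (int slot = 0; slot < SIM_SLOTS; slot++) {
            if (c->sent[me][dst][slot]) {
                c->arrived[dst][slot] = 1;
                c->arrived_val[dst][slot] = c->sent_val[me][dst][slot];
                c->sent[me][dst][slot] = 0;
            }
        }
    }
    for (int slot = 0; slot < SIM_SLOTS; slot++) {
        if (c->recv_posted[me][slot] && c->arrived[me][slot]) {
            c->depv[me][slot] = c->arrived_val[me][slot];
            c->recv_posted[me][slot] = 0;
            c->arrived[me][slot] = 0;
        }
    }
}
```

In this model a second sim_send to the same rank and slot is rejected until the sender has called sim_next, and the receiver's depv slot changes only on the receiver's own sim_next.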

SPMD EDT Self Dependence

    ocrAddSelfDependence(ocrGuid_t source, u32 slot, ocrDbAccessMode_t mode);

- [in] source: Source of the dependence edge. May be an event or a DB.
- [in] slot: Slot in the current SPMD EDT that will be satisfied by the dependence.
- [in] mode: The access mode on the DB attached to the slot.
- Adds a dependence on an event or DB source.
- Allows a SPMD EDT to wait for an event.
- NEXT has to be called to complete the wait: the EDT restarts once the dependence is satisfied.
- The data from the source is visible only after calling NEXT.

API for NEXT

    void ocrNext();

- Exits and restarts the current SPMD EDT.
- All sends and receives called before ocrNext are guaranteed to be complete before the EDT restarts.
- After the restart, the depv slots that receive messages are updated with the new DB. All other depv slots and params keep their state from the previous ocrNext.

API for COMPUTE

    void ocrCompute();

- Creates and launches a new SPMD EDT in the current rank from the compute template of the SPMD context.
- Can be called from either the initialization function or a sync SPMD EDT.
- All compute EDTs in a rank share the same paramv and depv state.
- This state is set up during the initialization function and can be updated during the lifetime of the rank.

API for SYNC

    void ocrSync(ocrCollective_t colType, ocrGuid_t db, bool reqResult);

- [in] colType: type of collective synchronization to be performed, e.g. sum-reduction, barrier, etc.
- [in] db: DB that the current rank gives to the collective. Placed into depv[0] of the sync EDT.
- [in] reqResult: Boolean indicating whether the current rank needs the result of the collective.
- Creates and launches a new SPMD EDT in the current rank from the sync template of the SPMD context.
- Can be called from the initialization function or a compute SPMD EDT.

    u8 ocrSyncResult(ocrGuid_t *db);

- [out] db: DB of the result from the collective.
- Can be called only from a compute SPMD EDT.
- The call results in an error if the previous ocrSync was called with reqResult set to FALSE.
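
The reqResult rule can be made concrete with a toy sum collective. This is a hedged sketch, not the OCR implementation: sim_collective_t, sim_sync and sim_sync_result are hypothetical names, each rank's DB is modelled as a single long, and the collective is assumed to be complete once every rank has contributed.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_RANKS 4

/* Hypothetical state of one sum collective; zero-initialize before use. */
typedef struct {
    long contribution[SIM_RANKS];
    bool wants_result[SIM_RANKS];
    int  arrived;   /* number of ranks that have contributed so far */
    long result;    /* valid once arrived == SIM_RANKS */
} sim_collective_t;

/* Rank `rank` hands its value to the collective and says whether it
 * will ask for the result later. */
static void sim_sync(sim_collective_t *c, int rank, long value, bool req_result)
{
    assert(rank >= 0 && rank < SIM_RANKS);
    c->contribution[rank] = value;
    c->wants_result[rank] = req_result;
    c->arrived++;
    if (c->arrived == SIM_RANKS) {
        c->result = 0;
        for (int r = 0; r < SIM_RANKS; r++)
            c->result += c->contribution[r];
    }
}

/* Returns 0 and writes *out if this rank requested the result and the
 * collective is complete; returns 1 otherwise. */
static int sim_sync_result(const sim_collective_t *c, int rank, long *out)
{
    if (!c->wants_result[rank] || c->arrived < SIM_RANKS)
        return 1;
    *out = c->result;
    return 0;
}
```

A rank that passed false for req_result gets an error from sim_sync_result, mirroring the error case described for ocrSyncResult.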

Other API supported inside a SPMD context

- u64 ocrGetRank(): returns the current rank.
- u64 ocrNumRanks(): returns the total number of ranks in the SPMD context.

Example: Sum-Reduction

    …
        ocrEdtTemplateCreate(&syncRedTempl, syncRed, 2, 2);
        ocrEventCreate(&outputRed, OCR_EVENT_STICKY_T, TRUE);
        ocrSpmdLaunch(NUM_RANKS, initRed, NULL_GUID, NULL_GUID, syncRedTempl, outputRed);
    …
    }

    void initRed(void *dbPtr) {
        ocrDbCreate(&elementDb, …);
        u64 paramv[2];
        paramv[0] = ocrGetRank();   // tree_counter: this rank's position at the current tree level
        paramv[1] = 1;              // distance to the partner rank at the current tree level
        ocrSyncSpmdEdtCreate(2, paramv, 2, NULL, 0, NULL_GUID, NULL);
        ocrSync(SUM_REDUCTION_BINARY, elementDb, FALSE);
    }

    ocrGuid_t syncRed(u32 paramc, u64* paramv, u32 depc, ocrEdtDep_t depv[]) {
        u64 myRank = ocrGetRank();
        u64 numRanks = ocrNumRanks();
        u64 tree_counter = paramv[0];
        if ((tree_counter % 2) == 0) {
            // reduce: depv[0] = depv[0] + depv[1];
            u64 srcRank = myRank + paramv[1];
            if (srcRank >= numRanks)
                return depv[0].guid;    // no partner left: return and leave the SPMD context
            ocrRecv(srcRank, 1);        // partner's DB is visible in depv[1] after ocrNext()
            paramv[0] = tree_counter / 2;
            paramv[1] *= 2;
            ocrNext();
        } else {
            // Odd position: send the partial sum to the partner, then return.
            u64 dstRank = myRank - paramv[1];
            // ASSERT(dstRank >= 0 && dstRank < numRanks);
            ocrSend(dstRank, 1, depv[0].guid);
        }
        return depv[0].guid;
    }
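
The communication pattern of the example is a binary-tree reduction: at each level a rank in an even tree position adds the partial sum of the rank one stride above it, and the stride doubles. The stand-alone sketch below replays that pattern over a plain array, one array element per rank. sim_tree_sum is a hypothetical helper, not part of the proposed API. It also makes one deliberate departure from the slide, stated as an assumption: a rank with no partner at some level simply skips that level instead of returning, so the result is correct for rank counts that are not powers of two.

```c
#include <assert.h>
#include <stddef.h>

/* Simulated binary-tree sum over `n` ranks, where val[r] is the element
 * owned by rank r. Partial sums are accumulated in place, and the total
 * ends up in val[0], which is also returned.
 *
 * At the level with a given stride, rank r (a multiple of 2 * stride)
 * plays the receiver and rank r + stride plays the sender. This matches
 * the slide's test (tree_counter % 2) == 0 with paramv[1] == stride. */
static long sim_tree_sum(long *val, size_t n)
{
    assert(val != NULL && n > 0);
    for (size_t stride = 1; stride < n; stride *= 2) {
        for (size_t r = 0; r + stride < n; r += 2 * stride) {
            val[r] += val[r + stride];   /* depv[0] = depv[0] + depv[1] */
        }
    }
    return val[0];
}
```

With n ranks the loop runs over ceil(log2(n)) levels, which is the number of times the root's sync EDT would call ocrNext before returning.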