OPTICAL BUSES
CSC 8530: Parallel Algorithm
Instructor: Dr. Sushil Prasad
Presented by: DM Rasanjalee Himali

OVERVIEW
Introduction to Optical Buses
Linear Arrays with Optical Buses
  – Waiting Function
  – Two-Way Communication
Data Communication
  – Broadcasting
  – Permutation
  – Data Distribution
PRAM Simulation
Meshes with Optical Buses
Paper

BUS
Bus:
  – A communication link connecting some or all of the processors in a mesh
  – Can be used for direct communication between non-neighboring processors
  – Traditionally electronic
  – A broadcasting processor has exclusive access to the bus
Properties:
1. Bidirectionality
  – A datum placed on the bus by processor P travels in both directions (left and right) simultaneously, reaching all processors attached to the bus
2. Lack of a precise function describing the time taken by a signal to propagate along a given section of wire
  – Assume O(1) time

OPTICAL BUSES
An alternative to electronic buses
Uses light signals
The bus is called an optical waveguide
Allows several processors to inject data onto the bus simultaneously
  – Each datum is destined for one or several processors
Properties:
1. Unidirectionality
  – A datum placed on the bus travels in only one (always the same) direction
2. Predictability of propagation delay
  – The delay per unit length is constant, so the time taken to travel is proportional to the distance

OPTICAL BUSES
Let processors P_0, P_1, …, P_{n-1} be connected to an optical bus
Several processors can place data on the bus simultaneously, one datum per processor (property 1)
The data form a pipeline and travel down the bus in the same direction
The difference between the arrival times of two data d_i and d_j at a processor P_l can be determined from the distance separating P_i and P_j (property 2)
[Figure: a pipeline of data (d_i, d_j, d_k) on an optical bus (optical waveguide) connecting P_0, …, P_i, P_j, P_k, P_l, …, P_{n-1}]

LINEAR ARRAYS WITH OPTICAL BUSES
Processors are arranged in a 1-dimensional pattern
n processors are connected to an optical waveguide
Each processor is attached to the bus via a two-way link
  – One to write data onto the bus
  – The other to read data from the bus
No processor is directly connected to another processor by a standard linear-array link
Unidirectional bus:
  – P_i can send a message to P_j only if j > i
[Figure: a linear array with an optical bus; processors P_0, …, P_i, …, P_{n-1} are separated by distance D, each with a write link and a read link]

LINEAR ARRAYS WITH OPTICAL BUSES
Let
  – b: number of bits per datum (message) placed on the bus
  – w: width of the light pulse representing one bit, i.e. the duration of the pulse in time units
  – v: speed of light
  – Optical distance between any two processors P_i and P_j: the length of waveguide separating them
  – D: optical distance between any two consecutive processors (a constant)
  – τ_D: number of time units required by a light pulse to traverse D, τ_D = D / v
If P_i sends a message to P_j (j > i), the message arrives at P_j after (j - i) τ_D time units

LINEAR ARRAYS WITH OPTICAL BUSES
All processors can place data on the bus simultaneously
  – This could lead to overlapping messages
To avoid message overlap, the following conditions must be satisfied:
1. D > bwv
2. Processors wishing to write on the bus do so synchronously
  – They place data on the bus at pre-specified times, separated by regular time intervals
Optical length of the bus, L:
  – L = (n-1) D
Bus cycle:
  – τ_B(L): time taken by an optical signal to traverse the optical bus from one end (P_0) to the other (P_{n-1})
  – τ_B(L) = L / v
  – τ_B(L) = O(1)
  – τ_B(L) is assumed to be smaller than or equal to the time required by a basic computational operation
Time is divided into bus-cycle intervals
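The timing quantities on this slide can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function names and the sample values of D, v, b and w are hypothetical and chosen only for illustration.

```python
def tau_d(D, v):
    """Time units for a light pulse to traverse the distance D at speed v."""
    return D / v

def arrival_delay(i, j, D, v):
    """Delay after which a message written by P_i reaches P_j (needs j > i)."""
    if j <= i:
        raise ValueError("on a unidirectional bus P_i only reaches P_j with j > i")
    return (j - i) * tau_d(D, v)

def bus_cycle(n, D, v):
    """tau_B(L) = L / v, with optical bus length L = (n - 1) * D."""
    L = (n - 1) * D
    return L / v

def no_overlap(D, b, w, v):
    """Condition 1 from the slide: D > b * w * v."""
    return D > b * w * v

# Hypothetical sample values (assumptions, not taken from the slides).
D, v, b, w = 10.0, 2.0, 4, 1.0
print(tau_d(D, v))                # 5.0
print(arrival_delay(1, 4, D, v))  # 15.0
print(bus_cycle(5, D, v))         # 20.0
print(no_overlap(D, b, w, v))     # True, since 10 > 4 * 1 * 2
```

With these values a message of b pulses occupies b·w·v = 8 length units of waveguide, which fits inside the spacing D = 10, so consecutive messages do not collide.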

LINEAR ARRAYS WITH OPTICAL BUSES
Waiting function:
  – When should a processor P_j read a message d_i from the bus?
  – Two cases:
1. Receiver P_j knows the identity of the sender P_i
2. Receiver P_j does not know the identity of the sender P_i

WAITING FUNCTION
1. Receiver P_j knows the identity of the sender P_i
All processors wishing to place a datum on the bus do so only at the beginning of a bus cycle
Function wait(i,j): the time P_j waits, relative to the beginning of the bus cycle, before reading d_i
  – Definition 1: wait(i,j) = (j - i) τ_D, which reduces to wait(i,j) = j - i when the constant τ_D = O(1) is taken as the unit of time
  – Definition 2: if, during each bus cycle, each processor is required to place a message on the bus, then wait(i,j) - 1 gives the number of messages P_j should skip before reading d_i
Transmission of d_i from P_i to P_j takes one bus cycle

LINEAR ARRAYS WITH OPTICAL BUSES
1. Receiver P_j knows the identity of the sender P_i
EXAMPLE: n = 5
  – P_3 expects d_0 (from P_0)
  – P_4 expects d_2 (from P_2)
Relative to the beginning of the bus cycle:
  – P_0 and P_2 place their data on the bus at time 0
  – P_4 receives d_2 at time 4 - 2 = 2
  – P_3 receives d_0 at time 3 - 0 = 3
[Figure: a linear array with an optical bus, P_0 to P_4, showing the positions of d_0 and d_2 at times T = 0 to T = 4]
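The example can be reproduced with a few lines of Python. This is a sketch under the assumption that τ_D is the unit of time; the helper names `read_times` and `messages_to_skip` are hypothetical.

```python
def wait(i, j):
    """Waiting function for a receiver P_j that knows its sender P_i (j > i)."""
    return j - i

def read_times(expected):
    """Map each receiver j to the time at which it reads the bus.

    `expected` maps a receiver index j to the sender index i it expects.
    """
    return {j: wait(i, j) for j, i in expected.items()}

def messages_to_skip(i, j):
    """Definition 2: messages P_j lets pass if every processor writes each cycle."""
    return wait(i, j) - 1

# n = 5: P_3 expects d_0 from P_0, and P_4 expects d_2 from P_2.
times = read_times({3: 0, 4: 2})
print(times)                   # {3: 3, 4: 2}
print(messages_to_skip(0, 3))  # 2 (the messages written by P_2 and P_1)
```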

LINEAR ARRAYS WITH OPTICAL BUSES
2. Receiver P_j does not know the identity of the sender P_i
The sender, however, knows the identity of the receiver
Sender P_i writes message d_i, destined for P_j, on the bus at time ((n-1) - j + i) τ_D relative to the beginning of the bus cycle
All processors read the bus simultaneously at the end of the bus cycle (at time (n-1) τ_D)
Transmission of d_i from P_i to P_j takes one bus cycle

LINEAR ARRAYS WITH OPTICAL BUSES
2. Receiver P_j does not know the identity of the sender P_i
EXAMPLE: n = 5
  – P_0 wishes to send d_0 (to P_3)
  – P_2 wishes to send d_2 (to P_4)
Relative to the beginning of the bus cycle:
  – Each sender places its datum at time (n-1) - j + i
  – P_0 places d_0 at time (5-1) - 3 + 0 = 1
  – P_2 places d_2 at time (5-1) - 4 + 2 = 2
  – Both data reach their destinations at time 4
[Figure: a linear array with an optical bus, P_0 to P_4, showing the positions of d_0 and d_2 at times T = 0 to T = 4]
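A short Python sketch makes the sender-side timing explicit. It assumes τ_D is the unit of time; `send_time` and `arrival_time` are hypothetical names. Evaluating the formula (n-1) - j + i for this example gives send times 1 and 2, so that both data arrive at time n - 1 = 4.

```python
def send_time(i, j, n):
    """Time at which P_i writes a datum destined for P_j (j > i)."""
    return (n - 1) - j + i

def arrival_time(i, j, n):
    """Send time plus the propagation delay (j - i); always n - 1."""
    return send_time(i, j, n) + (j - i)

n = 5
print(send_time(0, 3, n), arrival_time(0, 3, n))  # 1 4
print(send_time(2, 4, n), arrival_time(2, 4, n))  # 2 4
```

Because every datum arrives exactly at the end of the bus cycle, a receiver only has to read the bus at time n - 1, without knowing who wrote to it.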

LINEAR ARRAYS WITH OPTICAL BUSES
Two-Way Communication
Allows messages to be sent in both directions
Two optical buses are used
  – Upper bus: sends data left to right (L → R)
  – Lower bus: sends data right to left (R → L)
[Figure: processors P_0, …, P_i, …, P_{n-1} attached to an upper and a lower bus]
Wait function:
If wait(i,j) > 0 then
    P_j reads d_i from the upper bus at time wait(i,j)
Else If wait(i,j) < 0 then
    P_j reads d_i from the lower bus at time -wait(i,j)
Else
    P_j does not read in this cycle
End If
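The pseudocode above translates directly into Python. This is an illustrative sketch only; the function name `read_plan` and the string labels for the two buses are assumptions.

```python
def read_plan(i, j):
    """Decide where and when P_j reads d_i on a two-bus linear array.

    Returns (bus, time), or None if P_j does not read in this cycle.
    """
    w = j - i  # wait(i, j)
    if w > 0:
        return ("upper", w)   # sender is to the left: data travels L -> R
    if w < 0:
        return ("lower", -w)  # sender is to the right: data travels R -> L
    return None               # i == j: nothing to read

print(read_plan(0, 3))  # ('upper', 3)
print(read_plan(4, 1))  # ('lower', 3)
print(read_plan(2, 2))  # None
```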

DATA COMMUNICATION
A linear array with optical buses can be used to execute a variety of data communication schemes, for example:
  – Broadcasting
  – Permutation
  – Data Distribution
All schemes are uniquely determined by the wait function
A communication pattern can be defined simply by giving its wait function:
  – wait(i,j) = j - i

BROADCASTING
Processor P_i wishes to broadcast a datum to all other processors
Set wait(i,j) = j - i, for the specific i and all j ≠ i
Example:
  – n = 5
  – At the beginning of the bus cycle, P_2 places d_2 on both the upper and lower buses
  – The datum reaches P_1 and P_3 simultaneously
  – It then reaches P_0 and P_4 simultaneously
Broadcast takes one bus cycle
[Figure: P_0 to P_4 on two buses; d_2 travels left on the lower bus and right on the upper bus over times T = 0 to T = 4]
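The broadcast example can be tabulated by grouping receivers by arrival time. This is a small Python sketch assuming τ_D is the unit of time; the name `broadcast_arrivals` is hypothetical.

```python
from collections import defaultdict

def broadcast_arrivals(i, n):
    """Group the receivers of a broadcast from P_i by arrival time |j - i|."""
    arrivals = defaultdict(list)
    for j in range(n):
        if j != i:
            arrivals[abs(j - i)].append(j)
    return dict(arrivals)

print(broadcast_arrivals(2, 5))  # {2: [0, 4], 1: [1, 3]}
```

The latest arrival time is at most n - 1, which is why the whole broadcast fits in one bus cycle.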

PERMUTATION
Arbitrary permutation r of the indices {0, 1, …, n-1}
  – 0 <= i, r(i) <= n-1
  – d_i is sent to P_{r(i)} for every i
  – Each processor receives exactly one datum
Example:
  – n = 5
  – P_i holds d_i for 0 <= i <= n-1
  – Permutation needed: P_0 holds d_4, P_1 holds d_3, P_2 holds d_1, P_3 holds d_0, P_4 holds d_2
  – Initially, each processor places its datum on the upper and lower waveguides at the beginning of the bus cycle
  – The entire permutation takes one bus cycle
[Figure: P_0 to P_4 on two buses; d_0 to d_4 travel to their destinations over times T = 0 to T = 4]
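The permutation example can be simulated sequentially to check which datum each processor ends up with and when it reads. This Python sketch assumes τ_D is the unit of time; `permute` and the list `r` (with `r[i]` the destination of d_i) are illustrative names.

```python
def permute(data, r):
    """Simulate one bus cycle in which d_i is delivered to P_{r[i]}.

    Returns (received, schedule): received[j] is the datum P_j ends up with,
    and schedule[j] is (bus, time) telling P_j where and when to read.
    """
    n = len(data)
    if sorted(r) != list(range(n)):
        raise ValueError("r must be a permutation of 0..n-1")
    received = [None] * n
    schedule = {}
    for i, j in enumerate(r):
        w = j - i  # wait(i, j)
        bus = "upper" if w > 0 else "lower" if w < 0 else None
        schedule[j] = (bus, abs(w))
        received[j] = data[i]
    return received, schedule

data = ["d0", "d1", "d2", "d3", "d4"]
r = [3, 2, 4, 1, 0]  # d0 -> P3, d1 -> P2, d2 -> P4, d3 -> P1, d4 -> P0
received, schedule = permute(data, r)
print(received)      # ['d4', 'd3', 'd1', 'd0', 'd2']
print(schedule[0])   # ('lower', 4)
```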

DATA DISTRIBUTION
Send the datum d_i held by P_i to two or more processors, or to no processor at all
s(j): index of the processor from which P_j receives a datum, where 0 <= j, s(j) <= n-1
This allows s(j) = s(k) = i for j ≠ k
  – The function s is not necessarily a permutation
Set wait(s(j), j) = j - s(j), for all j
Example:
  – n = 5
  – P_i holds d_i for 0 <= i <= n-1
  – Distribution needed: P_0 holds d_3, P_1 holds d_3, P_2 holds d_0, P_3 holds d_1, P_4 holds d_0
  – Performed in the same way as a permutation
  – The entire distribution takes one bus cycle
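The distribution example differs from a permutation only in that the source function s may repeat or omit indices. A minimal Python sketch (the name `distribute` and the returned `unread` list are assumptions made for illustration):

```python
def distribute(data, s):
    """Simulate one bus cycle in which P_j reads the datum held by P_{s[j]}.

    Returns (received, waits, unread): the data each processor ends up with,
    the signed wait values j - s[j], and the indices whose datum nobody reads.
    """
    n = len(data)
    received = [data[s[j]] for j in range(n)]
    waits = [j - s[j] for j in range(n)]
    unread = sorted(set(range(n)) - set(s))
    return received, waits, unread

data = ["d0", "d1", "d2", "d3", "d4"]
s = [3, 3, 0, 1, 0]  # P0 and P1 read d3, P2 and P4 read d0, P3 reads d1
received, waits, unread = distribute(data, s)
print(received)  # ['d3', 'd3', 'd0', 'd1', 'd0']
print(waits)     # [-3, -2, 2, 2, 4]
print(unread)    # [2, 4]
```

A negative wait means the receiver reads the lower bus and a positive wait the upper bus; here d_2 and d_4 are sent to no processor.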

PRAM SIMULATION
Broadcasting, permutation and data distribution each take one bus cycle (O(1) time)
Permutation and data distribution allow a linear array of n processors with optical buses and O(1) memory locations per processor to simulate, in constant time, any form of memory access allowed on a PRAM with n processors and O(n) shared memory locations (except the CW operation)
  – ER and EW are permutations of the data
  – CR is a data distribution operation
CW cannot be simulated in constant time
  – τ_B(L) is assumed to be smaller than or equal to the time required by a basic operation
  – but CW typically involves an arbitrary number of such operations

MESHES WITH OPTICAL BUSES
Two problems in linear arrays with optical buses:
1. Optical signals weaken rapidly as they travel over long distances
2. The time taken by a message to travel from one end of the bus to the other (the bus cycle) grows considerably as the bus gets longer
Solution: Mesh with Optical Buses
  – Use a 2-D array of processors augmented with row and column optical buses
  – The array consists of X rows numbered 0 to X-1 and Y columns numbered 0 to Y-1, where XY = n
  – Each processor is connected to 4 optical buses
  – The two buses on each row send messages horizontally
  – The two buses on each column send messages vertically
  – A message can be sent from P(i,j) to P(k,l) in two bus cycles: P(i,j) → P(i,l) → P(k,l)
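The two-cycle route P(i,j) → P(i,l) → P(k,l) can be sketched as a list of hops, one bus cycle each. The function name `mesh_route` and the tuple layout are hypothetical; skipping a hop when source and destination already share a row or column is an assumption of this sketch.

```python
def mesh_route(src, dst):
    """Return the hops of a message from P(i, j) to P(k, l), one per bus cycle.

    Each hop is (bus_kind, bus_index, processor_reached).
    """
    (i, j), (k, l) = src, dst
    hops = []
    if j != l:
        hops.append(("row", i, (i, l)))     # row bus i: P(i, j) -> P(i, l)
    if i != k:
        hops.append(("column", l, (k, l)))  # column bus l: P(i, l) -> P(k, l)
    return hops

print(mesh_route((1, 2), (3, 0)))
# [('row', 1, (1, 0)), ('column', 0, (3, 0))]
print(len(mesh_route((1, 2), (1, 4))))  # 1 (same row, one bus cycle)
```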

MESHES WITH OPTICAL BUSES
A Sorting Algorithm
Sorts n numbers arranged in a rectangular array A of X = 2^s rows and Y = 2^(2r) columns, with s >= r and XY = n
Each step of the algorithm executes one or more of the following operations on A:
  – Sorting a row
  – Sorting a column
  – Cyclically shifting a row
Mesh Sort algorithm:
1. Whenever a row is to be sorted, the linear array with optical buses on that row simulates algorithm PRAM SORT
2. Whenever a column is to be sorted, the linear array with optical buses on that column simulates algorithm PRAM SORT
3. Whenever a row is to be cyclically shifted, this is done using the function wait(i,j), since a cyclic shift is just a permutation

MESHES WITH OPTICAL BUSES
Analysis
PRAM SORT needs no CW instructions
  – Thus, each memory access takes O(1) time on a linear array with optical buses
Algorithm PRAM SORT sorts n^(1/2) numbers in O(log n^(1/2)) = O(log n) time using O(n^(1/2)) processors
  – Each application of PRAM SORT requires O(log n) bus cycles
  – There are 13 such steps in algorithm MESH SORT
Therefore, a mesh with row and column optical buses can sort a sequence of n numbers
  – in time t(n) = O(log n),
  – using p(n) = n processors,
  – for a cost of c(n) = O(n log n), which is optimal
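The analysis can be written out as a short derivation. It follows the slide in taking each row or column to hold about n^(1/2) numbers, and it assumes that each of the 13 steps costs at most one PRAM SORT application; optimality of the cost refers to the standard Ω(n log n) lower bound for comparison-based sorting.

```latex
\begin{align*}
t_{\text{step}}(n) &= O\!\left(\log n^{1/2}\right) = O\!\left(\tfrac{1}{2}\log n\right) = O(\log n), \\
t(n) &= 13 \cdot t_{\text{step}}(n) = O(\log n), \\
c(n) &= p(n)\, t(n) = n \cdot O(\log n) = O(n \log n).
\end{align*}
```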

PARALLEL ALGORITHMS AND ARCHITECTURES BASED ON PIPELINED OPTICAL BUSES
Purpose of the paper:
  – Explore the potential value of a new kind of optical bus for SIMD architectures in which processing units are connected by buses
Two basic multiprocessor interconnection schemes:
  – Nearest-neighbor connection
  – Exclusive-access bus connection

PARALLEL ALGORITHMS AND ARCHITECTURES BASED ON PIPELINED OPTICAL BUSES
Each scheme has both advantages and disadvantages
Nearest-neighbor connection:
  – There can be O(n) active messages in the system at any time instant
  – But it takes up to O(n) steps to deliver a message from one processor to another
Exclusive-access bus connection:
  – Any message transfer takes only one step (one bus cycle)
  – But because processors access the bus exclusively, there can be only one active message on the bus at any time
To obtain high communication efficiency, it is desirable to combine the advantages of the two interconnections
  – allowing both O(n) active messages at any time and a one-step message transfer between any two processors
  – This may be achieved by the use of a message-pipelined optical bus

PIPELINED OPTICAL BUS
The pipelined optical bus takes advantage of two unique properties of optical signal transmission in waveguides:
  – unidirectional propagation
  – predictable path delays
Each processor is coupled to the optical bus waveguide with two passive optical couplers, one for writing signals and the other for reading
All the processors may write their messages on an optical bus simultaneously

PIPELINED OPTICAL BUS
To be able to transmit messages in both directions in the same bus cycle, a second bus can be added
This gives the architecture of a linear array processor with pipelined buses (APPB) with dual-bus connections, in which each processor is coupled to two optical buses, one for message transfers in each direction

APPB WITH FOLDED OPTICAL BUS
A simple structural variation of the dual-bus linear APPB architecture
  – A linear APPB with a folded optical bus
All processors send their messages on the upper segment of the bus (before the fold) and receive messages from the lower segment (after the fold), without regard to the relative sender-receiver position
Compared with routing on the dual-bus connection, routing on the folded bus is simpler:
  – all processors write and read their messages on the same bus (upper segment for writing and lower segment for receiving)
However, this simplification of routing on the folded bus comes at the cost of doubling the bus cycle length

APPB WITH FOLDED OPTICAL BUS
Dual-bus connections: wait takes both positive and negative values
Folded bus: wait is always positive
The linear APPB architectures can be extended to 2-D APPBs, in which the processors in each row and column may be connected with either dual or folded buses

INTEGER ADDITION
Add two n-bit integers x = x_{n-1} … x_0 and y = y_{n-1} … y_0
Let
  – sum z = z_n z_{n-1} … z_0
  – c_i = carry-in at the i-th bit; c_0 = 0
If c_i is known, then x_i + y_i + c_i can be calculated
z_i and c_{i+1} (the carry-out at bit i) are equal to the lower and higher bits of this sum, respectively
Using one processor, we can compute x_i + y_i + c_i to obtain z_i and c_{i+1} sequentially for i = 0, 1, …, n-1 and finish the addition in O(n) steps
With carry look-ahead addition, the computation can be completed faster on an optical bus
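The sequential O(n) procedure described above can be sketched in Python as follows. Bits are stored least-significant first, so `x_bits[i]` is x_i; the function name `ripple_add` is an assumption made for illustration.

```python
def ripple_add(x_bits, y_bits):
    """Add two n-bit integers bit by bit; returns z_0 .. z_n (LSB first)."""
    c = 0  # c_0 = 0
    z = []
    for x, y in zip(x_bits, y_bits):
        total = x + y + c
        z.append(total & 1)  # z_i is the lower bit of x_i + y_i + c_i
        c = total >> 1       # c_{i+1} is the higher bit
    z.append(c)              # z_n is the final carry-out
    return z

# 6 + 3 = 9: x = 110, y = 011, z = 1001 (written MSB first).
print(ripple_add([0, 1, 1], [1, 1, 0]))  # [1, 0, 0, 1]
```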

CARRY LOOK-AHEAD ADDITION
Carry look-ahead logic uses the concepts of generating and propagating carries
The addition of two bits A and B is said to generate if the addition will always carry, regardless of whether there is an input carry
  – G_i = A_i · B_i
A + B is said to propagate if the addition will carry whenever there is an input carry
  – P_i = A_i ⊕ B_i
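The generate and propagate signals determine every carry through the standard recurrence c_{i+1} = G_i OR (P_i AND c_i). The Python sketch below assumes propagate is taken as the exclusive-or of the two bits and stores bits least-significant first; the function names are hypothetical.

```python
def generate_propagate(a_bits, b_bits):
    """Per-bit generate (A AND B) and propagate (A XOR B) signals."""
    G = [a & b for a, b in zip(a_bits, b_bits)]
    P = [a ^ b for a, b in zip(a_bits, b_bits)]
    return G, P

def carries_from_gp(G, P, c0=0):
    """Carries c_0 .. c_n from the recurrence c_{i+1} = G_i | (P_i & c_i)."""
    c = [c0]
    for g, p in zip(G, P):
        c.append(g | (p & c[-1]))
    return c

# 6 + 3: a = 110, b = 011 (MSB first), stored here LSB first.
G, P = generate_propagate([0, 1, 1], [1, 1, 0])
print(G, P)                   # [0, 1, 0] [1, 0, 1]
print(carries_from_gp(G, P))  # [0, 0, 1, 1]
```

The loop still evaluates the carries one after another; a look-ahead circuit expands the same recurrence so that all carries are computed from G and P directly.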

CARRY LOOK-AHEAD ADDITION
For each bit in a binary sequence to be added, the carry look-ahead logic determines whether that bit pair will generate a carry or propagate a carry
This allows the circuit to "pre-process" the two numbers being added to determine the carries ahead of time
Then, when the actual addition is performed, there is no delay from waiting for the ripple-carry effect (the time it takes for the carry from the first full adder to be passed down to the last full adder)

INTEGER ADDITION
With a folded optical bus, carry look-ahead addition can be performed in O(1) steps
Number of processors required: n+1
c_i can be determined quickly by examining the string of g, s and p values obtained at the bit positions lower than i
  – g is assigned to bit i if x_i = y_i = 1
    This is because a carry-out is always generated at this bit position, regardless of the value of the carry-out from the lower bits
  – s is assigned to bit i if x_i = y_i = 0
    This is because the carry-out from the lower bits is stopped at this position
  – p is assigned to bit i if x_i ≠ y_i
    This is because the carry-out from the lower bits is propagated through this bit

INTEGER ADDITION
[Figure: example g-s-p string for the addition of two 12-bit numbers]
  – g is assigned to bit i if x_i = y_i = 1
  – s is assigned to bit i if x_i = y_i = 0
  – p is assigned to bit i if x_i ≠ y_i
The g, s and p values can be represented with a 2-bit variable, A_i, at each processor i

INTEGER ADDITION
Let k be the largest index smaller than i such that A_k ≠ p
Then it is easy to show that c_i = 1 iff A_k = g
This leads to an algorithm for computing the addition, in which:
  – Initially, x_i and y_i are held by processor i
  – At the completion of the algorithm, z_i is stored at processor i
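The rule "c_i = 1 iff A_k = g" can be checked with a sequential Python sketch. This is not the O(1) folded-bus algorithm from the paper: it only simulates, one processor at a time, what each processor i would compute. It assumes bits are stored least-significant first and that c_i = 0 when no index k with A_k ≠ p exists (which follows from c_0 = 0); all function names are hypothetical.

```python
def gsp_string(x_bits, y_bits):
    """Assign g, s or p to every bit position."""
    out = []
    for x, y in zip(x_bits, y_bits):
        if x == 1 and y == 1:
            out.append("g")  # carry always generated
        elif x == 0 and y == 0:
            out.append("s")  # incoming carry stopped
        else:
            out.append("p")  # incoming carry propagated
    return out

def carry_in(A, i):
    """c_i = 1 iff A_k = g for the largest k < i with A_k != p."""
    k = i - 1
    while k >= 0 and A[k] == "p":
        k -= 1
    return 1 if k >= 0 and A[k] == "g" else 0

def lookahead_add(x_bits, y_bits):
    """Compute z_0 .. z_n, deriving each carry independently from A."""
    n = len(x_bits)
    A = gsp_string(x_bits, y_bits)
    z = [x_bits[i] ^ y_bits[i] ^ carry_in(A, i) for i in range(n)]
    z.append(carry_in(A, n))  # z_n, held by the (n+1)-th processor
    return z

print(gsp_string([0, 1, 1], [1, 1, 0]))     # ['p', 'g', 'p']
print(lookahead_add([0, 1, 1], [1, 1, 0]))  # [1, 0, 0, 1], i.e. 6 + 3 = 9
```

Each call to `carry_in` depends only on the g-s-p string, not on other carries, which is the property that lets all processors determine their carries in the same step.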

REFERENCES
  – Akl, Parallel Computation: Models and Methods, Prentice Hall, 1997
  – Zicheng Guo and H. John Caulfield, "Parallel algorithms and architectures based on pipelined optical buses", Applied Optics, Optical Society of America, Vol. 34, No. 35, 10 December 1995
