Copyright (c) 2003 by Valery Sklyarov and Iouliia Skliarova: DETUA, IEETA, Aveiro University, Portugal.

The following C/C++ recursive function gcd calculates the greatest common divisor of two unsigned integers A and B:

    unsigned int gcd(unsigned int A, unsigned int B)
    {
        if (B > A) return gcd(B, A);       // make sure that A >= B
        else if (B <= 0) return A;         // B is unsigned, so this tests B == 0
        else return gcd(B, A % B);
    }

The following example demonstrates how to call this function:

    // requires #include <iostream> and using namespace std;
    int main(int argc, char* argv[])
    {
        cout << "The greatest common divisor is "; cout << gcd(120, 164) << endl;
        cout << "The greatest common divisor is "; cout << gcd(27, 63) << endl;
        cout << "The greatest common divisor is "; cout << gcd(27, 13) << endl;
        return 0;
    }

It produces the following results:

    The greatest common divisor is 4
    The greatest common divisor is 9
    The greatest common divisor is 1

VC_gcd.zip

There are many tools available for hardware design that allow similar code to be translated to the appropriate hardware circuit. For example, the Handel-C language supports all the constructions in the function above except for recursion. Hardware description languages such as VHDL allow very similar expressions, but recursive calls are prohibited.

The proposed technique enables recursion to be implemented in hardware by establishing a special control sequence that is provided by a hierarchical finite state machine (HFSM).

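[Figure: hierarchical graph-scheme (HGS) of the recursive function gcd. It runs from Begin to End through the conditional nodes x1 and x2 and the operational nodes y3; y1,y2,z; and y1,y4,z.]

State codes: a0 = 0, a1 = 1, a2 = 2, a3 = 3, a4 = 4

Conditions:
x1: if (B > A)
x2: if (B == 0)

Micro-operations:
y1: Data_A = B;
y2: Data_B = A;
y3: result = A;
y4: Data_B = A % B;
z: recursive call of the module

The HGS describes the same function as before:

    unsigned int gcd(unsigned int A, unsigned int B)
    {
        if (B > A) return gcd(B, A);
        else if (B <= 0) return A;
        else return gcd(B, A % B);
    }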

Handel-C description of the HFSM for gcd (A = 164, B = 120):

    void main()   // similar code can be used for any recursive algorithm
    {
        unsigned int 6 module;
        unsigned int 4 state;
        unsigned int 1 done;
        unsigned int 8 A, B, result, Data_A, Data_B;

        Data_A = 164; Data_B = 120;   // Example of initial data
        HFSM_reset();                 // Reset the HFSM
        HFSM_new_module(0);           // The 1st HFSM module has code 0
        do {                          // do-while C/Handel-C loop executing transitions
            par {                     // parallel execution of operations between braces {…} below
                module = get_module();   // Get the current HFSM module
                state = get_state();     // Get the current module state
                A = Data_A;              // Get the 1st argument for the module
                B = Data_B;              // Get the 2nd argument for the module
            }
            switch(module) {          // Select appropriate HFSM module
            case 0:                   // HGS has just one module coded by 0
                switch(state) {       // Select the state within the module
                case 0:               // Execute micro operations for state 0
                    if (B > A) HFSM_next_state(1);         // Transition to state a1
                    else if (B <= 0) HFSM_next_state(2);   // Transition to a2
                    else HFSM_next_state(3);               // Transition to a3
                    break;            // Exit the switch statement
                case 1:               // Execute micro operations for state 1
                    par {             // execute operations below {…} in parallel
                        HFSM_next_state(4);
                        Data_A = B; Data_B = A;            // Swap A and B
                        HFSM_new_module(0);                // Recursive call of module 0
                    }
                    break;            // Exit the switch statement
                case 2:               // Execute micro operations for state 2
                    par { HFSM_next_state(4); result = A; }
                    break;            // See the HGS
                case 3:               // Execute micro operations for state 3
                    par {
                        HFSM_next_state(4);
                        Data_A = B; Data_B = A % B;        // See the HGS
                        HFSM_new_module(0);                // Recursive call of module 0
                    }
                    break;
                case 4:               // Execute micro operations for state 4
                    HFSM_end_module();   // Recursive return, or terminate algorithm
                }                        // if stack_ptr=0 (i.e. set ends=1)
                // case statements for other modules if necessary
            }
            done = test_ends();
        }                             // Exit the do-while loop when ends=1
        while(!done);
        // display the results and end the function main
    }

HC_gcd.zip

Let us assume that we are receiving unsigned integers from a channel in the following order: 1) 17, 2) 6, 3) 18, 4) 9, 5) 5, 6) 21, 7) … The incoming data are placed in a binary tree as follows:

17: root of the binary tree
6: left node, because 6 < 17
18: right node, because 18 > 17
9: right node of the node 6, because 9 < 17 and 9 > 6
5: left node of the node 6, because 5 < 17 and 5 < 6
21: right node of the node 18, because 21 > 17 and 21 > 18
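The insertion rule above can be sketched in software as follows. This is a minimal C++ sketch, not the authors' hardware implementation: the names Node, insert, in_order and tree_sort are hypothetical, and sending equal values to the right is an assumption that the slides do not address.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical node type: one data item and two owned child pointers.
struct Node {
    unsigned value;
    std::unique_ptr<Node> left, right;
    explicit Node(unsigned v) : value(v) {}
};

// Insert v below root: smaller values go left, all others go right
// (placing equal values on the right is an assumption of this sketch).
void insert(std::unique_ptr<Node>& root, unsigned v) {
    if (!root) { root = std::make_unique<Node>(v); return; }
    insert(v < root->value ? root->left : root->right, v);
}

// Recursive in-order walk: left subtree, the node itself, right subtree.
void in_order(const Node* n, std::vector<unsigned>& out) {
    if (!n) return;
    in_order(n->left.get(), out);
    out.push_back(n->value);
    in_order(n->right.get(), out);
}

// Build the tree from the incoming sequence and read it back in ascending order.
std::vector<unsigned> tree_sort(const std::vector<unsigned>& input) {
    std::unique_ptr<Node> root;
    for (unsigned v : input) insert(root, v);
    std::vector<unsigned> out;
    in_order(root.get(), out);
    return out;
}
```

Calling tree_sort on the sequence 17, 6, 18, 9, 5, 21 builds the tree described above and returns its items in ascending order.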

[Figure sequence: step-by-step trace of the recursive traversal of a tree with the nodes 17, 6 and 18 stored in RAM. Each RAM word holds a data item and the pointers to its left and right nodes; both pointers of the nodes 6 and 18 are equal to 1..1. The local stack holds RAM addresses: 0..00 is the address of 17 and 0..01 is the address of 6.]

x1 = 0 when there is no pointer to other nodes (i.e. the pointer is equal to 1..1)

Micro-operations used in the trace:
y1: push RAM address onto the local stack
y2: write an address of the left node
y3: record output data
y4: write an address of the right node
y5: pop data from the local stack to the address register
m2: call itself

Steps shown on the successive slides:
1. y1, y2, m2.
2. y1, y2, m2. Local stack: 0..00 (address of 17), 0..01 (address of 6).
3. Pointer 1..1: y5, y3. Local stack: 0..00, 0..01.
4. y1, y4, m2. Local stack: 0..00, 0..01.
5. Pointer 1..1: y5, y3. Local stack: 0..00, 0..01.
6. y1, y4, m2. Local stack: 0..00.
7. Pointer 1..1: y5, y3. Local stack: 0..00, 0..10.
8. y1, y4, m2. Local stack: 0..00, 0..10.
9. Pointer 1..1: y5. Local stack: 0..00, 0..10.

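Changes that are required in order to change the sorting order: exchange the operational nodes y1,y4,m2 and y1,y2,m2 in the HGS, so that the right node is visited before the left one.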
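The effect of exchanging the two nodes can be illustrated with a small software model. The sketch below is an assumption-laden C++ approximation: it uses an iterative loop with an explicit stack instead of the recursive module calls of the slides, the names Cell, NIL and traverse are hypothetical, and the comments map the steps only loosely onto y1 to y5.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// All-ones pointer meaning "no node" (x1 = 0 in the slides); 8-bit width is assumed.
constexpr std::uint8_t NIL = 0xFF;

// Hypothetical RAM word: a data item plus the addresses of the left and right nodes.
struct Cell {
    unsigned data;
    std::uint8_t left, right;
};

// Walk the tree stored in ram, starting at address root.
// descending == false visits left, node, right; true exchanges the two branches.
std::vector<unsigned> traverse(const std::vector<Cell>& ram, std::uint8_t root,
                               bool descending) {
    std::vector<unsigned> out;
    std::vector<std::uint8_t> stack;   // local stack of RAM addresses
    std::uint8_t addr = root;
    while (addr != NIL || !stack.empty()) {
        if (addr != NIL) {
            stack.push_back(addr);                                  // like y1
            addr = descending ? ram[addr].right : ram[addr].left;   // like y2 (or y4)
        } else {
            addr = stack.back();                                    // like y5
            stack.pop_back();
            out.push_back(ram[addr].data);                          // like y3
            addr = descending ? ram[addr].left : ram[addr].right;   // like y4 (or y2)
        }
    }
    return out;
}
```

With the three-node tree of the trace (17 at address 0, 6 at address 1, 18 at address 2), the call with descending set to false yields 6, 17, 18 and the call with descending set to true yields 18, 17, 6.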

[Figure, panels a) to d): HGS of the recursive module m2 with the states a0 to a4. It runs from Begin to End through the conditional node x1 and the operational nodes y1,y2,m2; y3; y1,y4,m2; and y5. The datapath consists of a RAM, a register and a local stack. An example tree with the nodes 7, 3, 9 and 6 is traversed in the steps c1, c2, r3, c4, c5, r6, c7, r8, r9, r10, c11, c12, r13, c14, r15, r16, with r3 → 3, r6 → 6, r10 → 7, r13 → 9.]

Legend:
y1 / y5: push / pop for the local stack
y2: write the address of the left node
y3: record output data
y4: write the address of the right node
x1 = 1: there is a node; x1 = 0: there is no node

    switch(module) {
    case 0:
        switch(state) {
            // describing the functionality of the module 0
            // (see case 2: statement as an example)
        }
        break;
    // describing the functionality of the module 1
    case 2:
        switch(state) {
        case 0: if (reg != 31) next_state(1);
                else next_state(4);
                break;
        case 1: par { next_state(2);
                      // datapath operations
                      new_module(2); }
                break;
        case 2: par { next_state(3);
                      // datapath operations
                    }
                break;
        case 3: par { next_state(4);
                      // datapath operations
                      new_module(2); }
                break;
        case 4: par { end_module();
                      // datapath operations
                    }
                break;
        }
        break;
    // describing the functionality of other modules
    }

HC_sort.zip

    switch(module) {
    case 0:
        switch(state) {
        case 0: HFSM_next_state(1);
                break;
        case 1: par { HFSM_next_state(2); HFSM_new_module(2); }
                break;
        case 2: HFSM_next_state(2); HFSM_end_module();
                break;
        }
        break;

Many engineering problems can be formulated as instances of the knapsack problem. Examples of such problems are public key encryption in cryptography, routing nets on FPGAs interconnected by a switch matrix, analysis of the power distribution networks of a chip, etc. There exist numerous versions of the knapsack problem as well as of the solution methods. We will consider a 0-1 problem and a branch-and-bound method.

A 0-1 problem is a special instance of the bounded knapsack problem. In this case there exist n objects, each with a weight $w_i \in \mathbb{Z}^+$ and a volume $v_i \in \mathbb{Z}^+$, $i = 0, \ldots, n-1$. The objective is to determine which objects should be placed in the knapsack so as to maximize the total weight of the knapsack without exceeding its total volume V. In other words, we have to find a binary vector $x = [x_0, x_1, \ldots, x_{n-1}]$ that maximizes the objective function $\sum_{i=0}^{n-1} w_i x_i$ while satisfying the constraint $\sum_{i=0}^{n-1} v_i x_i \le V$.

A simple approach to solving the 0-1 knapsack problem is to consider in turn all $2^n$ possible solutions, calculating each time their volume and keeping track of both the largest weight found and the corresponding vector x. Since each $x_i$, $i = 0, \ldots, n-1$, can be either 0 or 1, all possible solutions can be generated by a backtracking algorithm traversing a binary search tree in a depth-first fashion. In the search tree the level i corresponds to the variable $x_i$ and the leaves represent all possible solutions.

This exhaustive search algorithm has an exponential complexity $\Theta(n 2^n)$ (because the algorithm generates $2^n$ binary vectors and takes time $\Theta(n)$ to check each solution), making it unacceptable for practical applications. The average case complexity of the algorithm may be improved by applying two simple methods:

1. Pruning the branches that lead to non-feasible solutions. This can easily be done by calculating the current volume at each node of the search tree, which will include the volumes of the objects selected so far. If the current volume at some node exceeds the capacity constraint V, the respective branch does not need to be explored further and can safely be pruned away, since it will lead to non-feasible solutions.

2. Pruning branches that lead to (potentially feasible) solutions whose weight is smaller than the optimal weight found so far. This can be done by introducing a bounding function, which calculates an upper bound for each node, and comparing the obtained bound with the optimal weight found up to that point. If the bound is less than or equal to the optimal weight, the current branch can be pruned. One way to calculate the bounding function is to add the weights of already selected objects to the optimal solution of the relaxed “remaining” sub-problem, which can be computed using a fast greedy algorithm.
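The two pruning methods above can be combined in a short software sketch. The following C++ code is one possible reading rather than the authors' implementation: the names Item, Solver and knapsack_bb are hypothetical, and the relaxed sub-problem is assumed to be the fractional knapsack, solved greedily after sorting the objects by weight per unit of volume.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical object description; volumes are assumed to be positive.
struct Item { unsigned weight, volume; };

struct Solver {
    std::vector<Item> items;   // sorted by weight/volume ratio, best first
    unsigned V;                // knapsack capacity
    unsigned opt_W = 0;        // best weight found so far

    // Upper bound for a node: weight selected so far plus a greedy fill of the
    // remaining room in which the last object may be taken fractionally.
    double bound(std::size_t level, unsigned cur_V, unsigned cur_W) const {
        double b = cur_W;
        unsigned room = V - cur_V;
        for (std::size_t i = level; i < items.size() && room > 0; ++i) {
            if (items[i].volume <= room) {
                b += items[i].weight;
                room -= items[i].volume;
            } else {
                b += static_cast<double>(items[i].weight) * room / items[i].volume;
                room = 0;
            }
        }
        return b;
    }

    void search(std::size_t level, unsigned cur_V, unsigned cur_W) {
        if (level == items.size()) { opt_W = std::max(opt_W, cur_W); return; }
        if (bound(level, cur_V, cur_W) <= opt_W) return;      // method 2: bounding
        const Item& it = items[level];
        if (cur_V + it.volume <= V)                           // method 1: feasibility
            search(level + 1, cur_V + it.volume, cur_W + it.weight);
        search(level + 1, cur_V, cur_W);
    }
};

// Returns the largest total weight that fits into capacity V.
unsigned knapsack_bb(std::vector<Item> items, unsigned V) {
    std::sort(items.begin(), items.end(), [](const Item& a, const Item& b) {
        return std::uint64_t(a.weight) * b.volume > std::uint64_t(b.weight) * a.volume;
    });
    Solver s{std::move(items), V};
    s.search(0, 0, 0);
    return s.opt_W;
}
```

Sorting by ratio first is a design choice of this sketch: it makes the greedy bound tight and tends to find good solutions early, so that more branches are cut by the bounding test.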

    x = 0;       //current solution
    opt_x = 0;   //optimal solution found so far
    opt_W = 0;   //weight of the optimal solution
    cur_V = 0;   //volume of the current solution
    level = 0;   //level in the search tree

    Knapsack_1 (level, cur_V)
    {
    begin:
        if (level == n) {
            // leaf of the search tree: weight(x) stands for the sum of w[i]*x[i]
            if (weight(x) > opt_W) {
                opt_W = weight(x);
                opt_x = x;
            }
        }
        else {
            if ( (cur_V + v[level]) <= V ) {
                x[level] = 1;
                Knapsack_1(level+1, cur_V + v[level]);
            }
            x[level] = 0;
            level = level + 1;
            goto begin;   //instead of Knapsack_1(level+1, cur_V);
        }
    }

Suppose there exist 4 objects having volumes 10, 11, 7, and 8; weights 20, 16, 18, and 23 (respectively); and capacity V=25. The resulting search tree constructed by applying the algorithm Knapsack_1 is shown below:

[Figure: binary search tree built by Knapsack_1. The root is x=[ _ _ _ _ ]; the inner nodes are partial solutions such as x=[ 0 _ _ _ ], x=[ 1 _ _ _ ], x=[ 1 0 - - ] and x=[ 1 1 0 - ]; the numbers attached to the nodes appear to be the current volumes. The leaves reached are x=[ 0 0 0 0 ], [ 0 0 0 1 ], [ 0 0 1 0 ], [ 0 0 1 1 ], [ 0 1 0 0 ], [ 0 1 0 1 ], [ 0 1 1 0 ], [ 1 0 0 0 ], [ 1 0 0 1 ], [ 1 0 1 0 ], [ 1 0 1 1 ] and [ 1 1 0 0 ].]
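To check the example by hand, Knapsack_1 can be modelled in ordinary C++. The sketch below is an illustration under assumptions: the structure Knapsack1 and its member names are hypothetical, a while loop replaces the goto of the pseudocode, and the leaf test compares the weight of the current vector x with the best weight found so far.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical software model of Knapsack_1 (feasibility pruning only).
struct Knapsack1 {
    std::vector<unsigned> v, w;    // volumes and weights of the objects
    unsigned V;                    // capacity of the knapsack
    std::vector<int> x, opt_x;     // current and best solution vectors
    unsigned opt_W = 0;            // weight of the best solution

    Knapsack1(std::vector<unsigned> volumes, std::vector<unsigned> weights,
              unsigned capacity)
        : v(std::move(volumes)), w(std::move(weights)), V(capacity),
          x(v.size(), 0), opt_x(v.size(), 0) {
        assert(v.size() == w.size());
    }

    // Total weight of the objects selected in sel.
    unsigned weight_of(const std::vector<int>& sel) const {
        unsigned total = 0;
        for (std::size_t i = 0; i < sel.size(); ++i)
            if (sel[i]) total += w[i];
        return total;
    }

    void search(std::size_t level, unsigned cur_V) {
        while (level != v.size()) {            // the loop plays the role of "goto begin"
            if (cur_V + v[level] <= V) {       // the object fits: explore x[level] = 1
                x[level] = 1;
                search(level + 1, cur_V + v[level]);
            }
            x[level] = 0;                      // then continue with x[level] = 0
            ++level;
        }
        const unsigned cur_W = weight_of(x);   // leaf: x is a complete feasible vector
        if (cur_W > opt_W) { opt_W = cur_W; opt_x = x; }
    }

    void solve() { search(0, 0); }
};
```

For the data above (volumes 10, 11, 7, 8; weights 20, 16, 18, 23; V=25), calling solve() leaves opt_W equal to 61 and opt_x equal to [1 0 1 1], i.e. the objects with volumes 10, 7 and 8, which fill the knapsack exactly.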

[Figure: a) An HGS describing the recursive search algorithm employed for solving the knapsack problem; b) Recursive hierarchical finite state machine (RHFSM) with a Handel-C example of a new module invocation.]

HC_knapsack.zip

    // Handel-C function for solving the knapsack problem
    void ExhaustiveSearch()
    {
        HFSM_reset();
        do {
            par {
                module = get_module();
                state = get_state();
            }
            switch(module) {
            case 0:                  // description of the module z0
                switch (state) {
                case 0:              // state a0
                    par {
                        //initialize variables
                        HFSM_next_state(1);
                        HFSM_new_module(1);
                    }
                    break;
                case 1:              // state a1
                    HFSM_end_module();
                }
                break;
            case 1:                  // description of the module z1
                switch (state) {
                case 0:              // state a0
                    if ( not_equal(level, n) && /*…*/ )
                        HFSM_next_state(1);
                    else
                        HFSM_end_module();
                    break;
                case 1:              // state a1 - update current solution
                    HFSM_next_state(2);
                    break;
                case 2:              // state a2
                    par {
                        //update the best solution
                        HFSM_new_module(1);   // recursive call
                        HFSM_next_state(3);
                    }
                    break;
                case 3:              // state a3
                    // restore the previous solution
                    if (not_equal(level, n))
                        par {
                            // increment the search tree level
                            HFSM_next_state(0);
                        }
                    else
                        HFSM_end_module();
                    break;
                }
                break;
            }
            done = test_end();
        } while(!done);
    }

Handel-C functions responsible for the RHFSM reset, state transitions, hierarchical module calls and returns, etc.

Global variables:

    unsigned 1 end;
    unsigned NUMBER_LEVELS stack_ptr;

Stacks:

    unsigned MODULE_SIZE M_stack[MAX_LEVELS];
    unsigned STATE_SIZE FSM_stack[MAX_LEVELS];

Reset the RHFSM:

    void HFSM_reset()
    {
        par { stack_ptr = 0; end = FALSE; }
    }

Get active module:

    unsigned MODULE_SIZE get_module() { return M_stack[stack_ptr]; }

Get active state:

    unsigned STATE_SIZE get_state() { return FSM_stack[stack_ptr]; }

Set next state:

    void HFSM_next_state(unsigned STATE_SIZE state) { FSM_stack[stack_ptr] = state; }

Switch to new module:

    void HFSM_new_module(unsigned module)
    {
        if (stack_ptr != (MAX_LEVELS-1))
            par {
                // all three statements run in the same clock cycle,
                // so stack_ptr+1 refers to the new top of the stacks
                stack_ptr++;
                FSM_stack[stack_ptr+1] = 0;
                M_stack[stack_ptr+1] = module;
            }
        else delay;
    }

Terminate currently active module:

    void HFSM_end_module(void)
    {
        if (stack_ptr == 0) end = TRUE;
        else stack_ptr--;
    }

Check if the RHFSM finished its execution:

    unsigned 1 test_end() { return end; }
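The behaviour of these functions can be tried out in software before moving to hardware. The following C++ sketch is a sequential model built on assumptions, not a translation of the Handel-C project: the class Hfsm and the function gcd_hfsm are hypothetical names, a std::vector replaces the fixed-size stacks, reset places module 0 in state 0 on the bottom of the stack, and the parallel assignments of a par block are imitated by reading A and B once per iteration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical software model of the module and state stacks.
class Hfsm {
    struct Frame { unsigned module, state; };
    std::vector<Frame> stack_;
    bool ended_ = false;
public:
    // Assumption of this sketch: reset starts module 0 in state 0.
    void reset() { stack_.assign(1, Frame{0, 0}); ended_ = false; }
    unsigned module() const { return stack_.back().module; }
    unsigned state() const { return stack_.back().state; }
    void next_state(unsigned s) { stack_.back().state = s; }
    void new_module(unsigned m) { stack_.push_back(Frame{m, 0}); }   // hierarchical call
    void end_module() {                                              // hierarchical return
        if (stack_.size() == 1) ended_ = true;
        else stack_.pop_back();
    }
    bool ended() const { return ended_; }
};

// The gcd module with states 0 to 4, executed one transition per iteration.
unsigned gcd_hfsm(unsigned a, unsigned b) {
    Hfsm fsm;
    fsm.reset();
    unsigned data_a = a, data_b = b, result = 0;
    while (!fsm.ended()) {
        const unsigned A = data_a, B = data_b;   // snapshot, as in the par block
        switch (fsm.state()) {                   // only module 0 exists here
        case 0:
            if (B > A) fsm.next_state(1);
            else if (B == 0) fsm.next_state(2);
            else fsm.next_state(3);
            break;
        case 1:                                  // swap the arguments and recurse
            fsm.next_state(4); data_a = B; data_b = A; fsm.new_module(0);
            break;
        case 2:                                  // end of recursion: record the result
            fsm.next_state(4); result = A;
            break;
        case 3:                                  // recurse on (B, A % B)
            fsm.next_state(4); data_a = B; data_b = A % B; fsm.new_module(0);
            break;
        case 4:                                  // return to the calling level
            fsm.end_module();
            break;
        }
    }
    return result;
}
```

In this model next_state is called before new_module, so the return state 4 is stored in the calling level before the new level is pushed; in the Handel-C code both happen in the same clock cycle.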

There are 4 examples available that demonstrate different steps of the tutorial:

1. Visual C++ zipped project for the C++ program considered at the beginning
2. Handel-C zipped project for DK2/ISE 6.2 (discovering the greatest common divisor of two integers)
3. Handel-C zipped project for DK2/ISE 6.2 (sorting of integers)
4. Handel-C zipped project for DK2/ISE 6.2 (solving the knapsack problem)

The names of the projects are:

1. VC_gcd.zip
2. HC_gcd.zip
3. HC_sort.zip
4. HC_knapsack.zip
