System Development. Numerical Techniques for Matrix Inversion.


1 System Development

2 Numerical Techniques for Matrix Inversion

3 The Elementary Technique Matrix Inversion using Co-Factors

4 Inversion using Co-Factors? Not Suitable Computationally!! This technique is a very poor candidate for implementation. Complexity: N! (N x (N-1) x (N-2) x … x 3 x 2 x 1) (Evaluated for SIMD machines). A recursive algorithm may lend an elegant solution, but it: – Devours memory resources with extreme greed – Drags the processor out from the Grand Prix into a traffic jam. It is therefore a computationally very expensive algorithm with enormous memory requirements. Above all, an SPS (Special-Purpose System) hardware architecture for this technique is a distant reality because of the irregular global communication requirements imposed by its recursive algorithm.
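
To get a feel for the factorial growth mentioned above, here is a minimal Python sketch of a recursive determinant by cofactor (Laplace) expansion that also counts how many terms it expands. The function names and the counting convention (one count per expanded term) are assumptions made for illustration; they are not taken from the slides.

```python
import math

def det_cofactor(m, counter):
    """Determinant by cofactor expansion along the first row (recursive)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Minor: drop row 0 and column j (a fresh copy at every level).
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        counter[0] += 1  # one expanded term
        total += (-1) ** j * m[0][j] * det_cofactor(minor, counter)
    return total

def count_terms(n):
    """Number of terms expanded for an n x n identity matrix."""
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    counter = [0]
    det_cofactor(identity, counter)
    return counter[0]

if __name__ == "__main__":
    for n in range(2, 8):
        print(n, count_terms(n), math.factorial(n))
```

In this sketch the count obeys T(n) = n x (1 + T(n-1)), so it is never smaller than N!, and every recursion level allocates new minors, which is where the memory appetite comes from.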

5 Any Alternatives? Fortunately YES! A technique which employs LU Decomposition and Triangular Matrix Inversion for its solution. Complexity: N^3 (Evaluated for SIMD machines). What are these numerical techniques? (We’ll soon get to learn them.) The distinct advantage of these techniques is that their solution mimics the Gaussian Elimination procedure, which in turn is an excellent candidate for systolic implementations.
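
As a rough software sketch of this alternative, the following Python code factors A = LU and then forms the inverse as U^-1 L^-1. It assumes a Doolittle factorization without pivoting (so it fails on a zero pivot), and all function names are hypothetical; the slides may use a different variant.

```python
def lu_decompose(a):
    """Doolittle LU without pivoting: a = l * u, with l unit lower triangular."""
    n = len(a)
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    u = [[float(x) for x in row] for row in a]
    for k in range(n):
        if u[k][k] == 0.0:
            raise ZeroDivisionError("zero pivot; this sketch does no pivoting")
        for i in range(k + 1, n):
            l[i][k] = u[i][k] / u[k][k]          # Gaussian elimination multiplier
            for j in range(k, n):
                u[i][j] -= l[i][k] * u[k][j]     # eliminate below the pivot
    return l, u

def invert_upper(u):
    """Invert an upper triangular matrix, one column at a time."""
    n = len(u)
    x = [[0.0] * n for _ in range(n)]
    for j in range(n):
        x[j][j] = 1.0 / u[j][j]
        for i in range(j - 1, -1, -1):
            s = sum(u[i][k] * x[k][j] for k in range(i + 1, j + 1))
            x[i][j] = -s / u[i][i]
    return x

def invert_lower(l):
    """Invert a lower triangular matrix, one column at a time."""
    n = len(l)
    x = [[0.0] * n for _ in range(n)]
    for j in range(n):
        x[j][j] = 1.0 / l[j][j]
        for i in range(j + 1, n):
            s = sum(l[i][k] * x[k][j] for k in range(j, i))
            x[i][j] = -s / l[i][i]
    return x

def matmul(a, b):
    """Plain triple-loop matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def invert(a):
    """A^-1 = U^-1 * L^-1, since A = L * U."""
    l, u = lu_decompose(a)
    return matmul(invert_upper(u), invert_lower(l))

if __name__ == "__main__":
    a = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
    for row in matmul(a, invert(a)):
        print([round(v, 6) for v in row])
```

Each stage is a triple loop at most, which matches the N^3 order quoted above, and every loop has the regular, local data access pattern of Gaussian Elimination.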

6 To the Computationally Efficient Numerical Techniques Matrix Inversion using LU Decomposition and Triangular Matrix Inversion

7

8

9

10

11

12 Upper Triangular Matrix

13

14

15

16 Lower Triangular Matrix

17

18

19

20 A Systolic Architecture for Triangular Matrix Inversion Matrix Order is 4 x 4

21 Regular Cells

22 Boundary Cells

23 The abstract computational working of the following architecture is illustrated using the upper triangular matrix. The same architecture, after some rearrangement of the data, can be employed for inverting a lower triangular matrix.
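
One possible data rearrangement, offered here as an assumption since the slides do not spell it out, is transposition: the transpose of a lower triangular matrix is upper triangular, and the inverse of a transpose is the transpose of the inverse. A minimal Python sketch with hypothetical function names:

```python
def invert_upper(u):
    """Invert an upper triangular matrix, one column at a time."""
    n = len(u)
    x = [[0.0] * n for _ in range(n)]
    for j in range(n):
        x[j][j] = 1.0 / u[j][j]
        for i in range(j - 1, -1, -1):
            s = sum(u[i][k] * x[k][j] for k in range(i + 1, j + 1))
            x[i][j] = -s / u[i][i]
    return x

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def invert_lower_via_upper(l):
    """Reuse the upper triangular routine: L^-1 = ((L^T)^-1)^T."""
    return transpose(invert_upper(transpose(l)))

if __name__ == "__main__":
    print(invert_lower_via_upper([[2.0, 0.0], [1.0, 4.0]]))
```

Only the order in which data enters and leaves changes; the arithmetic routine itself is untouched, which is the kind of reuse the slide points at.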

24

25

26

27

28

29

30

31

32

33 Array for LU Decomposition? Left for you to practice! Try to develop an idea of its dataflow independently and without any help. It will give you an excellent understanding of systolic data flows.

34 A Systolic System for the Complete Matrix Inversion Algorithm

35

36

37

38

39

40

41

42

43 Mapping Mapping is a procedure through which we can achieve Resource Reuse. Mapping means that two or more algorithms use the same hardware architecture for their execution. It turns out that the best candidates for Resource Reuse are Arithmetic Blocks or, as in our case, the Processing Elements. Usually, before mapping algorithms onto the same set of Processing Elements, we need to develop a Scheduling Algorithm. A Scheduling Algorithm decides in ‘which time interval’ a particular processing element will process ‘what data’ for a particular algorithm, out of the given set of algorithms required to be mapped onto the system.
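
The idea of a schedule can be sketched in software. In the hypothetical Python example below, two independent dot products (standing in for two algorithms) share a single multiply-accumulate Processing Element, and a round-robin schedule decides which algorithm's data the element processes in each time slot. The names and the round-robin policy are assumptions for illustration only.

```python
def round_robin(lengths):
    """Build a schedule: one (algorithm, operand index) pair per time slot."""
    schedule = []
    for idx in range(max(lengths.values())):
        for name, n in lengths.items():
            if idx < n:
                schedule.append((name, idx))
    return schedule

def run_schedule(schedule, operands):
    """Run every time slot on one shared multiply-accumulate element."""
    acc = {name: 0 for name in operands}
    for name, idx in schedule:
        a, b = operands[name]
        acc[name] += a[idx] * b[idx]   # the single shared PE does this work
    return acc

if __name__ == "__main__":
    operands = {
        "alg_a": ([1, 2, 3], [4, 5, 6]),
        "alg_b": ([1, 1], [2, 3]),
    }
    schedule = round_robin({name: len(v[0]) for name, v in operands.items()})
    print(schedule)
    print(run_schedule(schedule, operands))
```

One element serves both algorithms at the price of five time slots instead of three; in hardware the schedule would take the form of control signals rather than a Python list.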

44 An Example for Mapping The Square Matrix Multiplication Array on the Band Matrix Multiplication Array

45 The Array for Band Matrices

46 The Array for Square Matrices

47 The Combined or “Mapped” Array The ‘maroon’ lines represent common connections to each array

48

49

50 The control signal, in a sense, performs the scheduling of operations

51

52

53 In my experience, I’ve found the Multiplexer to be arguably the single most important logic element for Datapath design. Its use is especially imperative to resource-efficient system design, as well as in devising the data flow (data routing) between various devices within the system. Therefore, learning to utilize and eventually control multiplexers in system interconnection is essential for system design. I insist that you develop a thorough understanding of this device, as expertise with it will facilitate your design process and help you grow into excellent ‘Special-Purpose-System’ Datapath Designers. A Sincere Advice!!
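
As a small behavioral illustration of multiplexer-based routing, the hypothetical Python sketch below models a multiply-accumulate Processing Element whose two operands are each picked by a multiplexer. The select signals and source lists are assumptions made for the example, not a description of the arrays in these slides.

```python
def mux(select, inputs):
    """Return the input chosen by the select signal, as a multiplexer would."""
    return inputs[select]

def processing_element(sel_a, sel_b, sources_a, sources_b, acc):
    """Multiply-accumulate on operands routed through two multiplexers."""
    a = mux(sel_a, sources_a)
    b = mux(sel_b, sources_b)
    return acc + a * b

if __name__ == "__main__":
    # Same element, two different routings chosen purely by the select lines.
    print(processing_element(0, 1, [2, 5], [3, 7], 10))
    print(processing_element(1, 0, [2, 5], [3, 7], 0))
```

Changing only the select values lets the same arithmetic block serve different data sources, which is what makes resource reuse possible.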

54 General Framework for Datapath Development involving Processing Elements which require various Data Sources

55 Procedure that can be adopted for routing data of various algorithms and tasks that may be utilizing the same Processing Elements

56 The Do-It-Yourself Thing

57

58

59 Resource Efficiency ‘Mapping’ is a technique that results in reduced Logic Resource Consumption. Another effective technique for Area Optimization is developing ‘Partially-Parallel/Semi-Parallel Architectures’ from the Fully-Parallel Algorithm Datapath. This is considered a ‘Time to Area Tradeoff’ approach and is valid only as long as it satisfies the Real-time requirements of the Special Purpose System being developed.
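
The tradeoff can be put into numbers with a deliberately idealized model, which is an assumption of this sketch: an N x N matrix multiplication needs N^3 multiply-accumulate operations, each Processing Element completes one per clock cycle, and there is no pipeline fill or communication overhead. The function names are hypothetical.

```python
def cycles_needed(n, num_pes):
    """Idealized cycle count: N^3 operations shared evenly, rounded up."""
    return -(-(n ** 3) // num_pes)   # integer ceiling division

def smallest_array(n, cycle_budget, candidates):
    """Fewest processing elements that still meet the real-time cycle budget."""
    for p in sorted(candidates):
        if cycles_needed(n, p) <= cycle_budget:
            return p
    return None   # no candidate meets the budget

if __name__ == "__main__":
    for p in (1, 4, 16):
        print(p, "PEs ->", cycles_needed(4, p), "cycles")
    print(smallest_array(4, 20, [1, 4, 16]))
```

Under this model a 4 x 4 product takes 64 cycles on one element, 16 on four and 4 on sixteen; a semi-parallel design is acceptable only while its cycle count stays within the real-time budget.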

60 I’ll throw light upon SPS Semi-Parallel Architectures using the Matrix Multiplication Problem

61

62 The Single Processing Element Approach

63

64 The Fully Parallel (Simple and Systolic) Architecture for Matrix Multiplication

65

66

67 The Semi-Parallel (Simple and Systolic) Architecture for Matrix Multiplication

68

69

70 Towards Complete Systems

71 Kalman Filter Equations

72 Extended Kalman Filter Equations

73

74 The Local Control These are usually state machines or counters. In this particular example they are used to: – Generate addresses and read/write signals for the data storages – Specify the function to be performed by the processing elements of the array. They may also be used for selecting the data inputs of multiplexers for data transfer between the arrays, and for set, reset and load operations for various registers.
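
A counter-based local control can be sketched behaviorally as follows. In this hypothetical Python model a single counter runs through a read phase and then a write phase over a small memory, and each count is decoded into an address, read/write enables and a function code for the processing elements. The two-phase layout and the function names "mac" and "pass" are assumptions made for illustration.

```python
def local_control(depth):
    """Decode a free-running counter into one control word per clock cycle."""
    words = []
    for count in range(2 * depth):
        reading = count < depth              # first half: read phase
        words.append({
            "addr": count % depth,           # address into the data storage
            "read": reading,
            "write": not reading,            # second half: write phase
            "pe_function": "mac" if reading else "pass",
        })
    return words

if __name__ == "__main__":
    for word in local_control(2):
        print(word)
```

In hardware the same decoding would be a few comparators on the counter output; a state machine version would replace the comparison with explicit states.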

75 The Global Control These are usually wait-assert or interrupt-based state machines. The global control may again be a state machine or a counter (at times rather large and complex). It may also be a Programmable State Machine! Programmable State Machine? These are like small microcontrollers that can be programmed through software.

76 HW/SW Co-Design HW/SW stands for hardware/software. The concept is to solve the problem partly in software and the rest in hardware. Why software? Because sequential problems are better suited to software solutions. Let’s understand this through the particular example of Kalman/H-Infinity Filter design using the Xilinx 8-bit PicoBlaze or KCPSM (Constant Coded Programmable State Machine)

77 A Glance at the PicoBlaze Architecture

78 But Why? Why PicoBlaze?

79 Application of Wait-Assert type Global Control in Kalman System Design

80

81 Down Memory Lane Remember and Relate!!

82

83

84 Q & A

