Published by Maddison Bufford. Modified over 4 years ago.

1
IEEEI 2010 ISE for Computation on Complex Floating Point Numbers
Instruction Set Extensions for Computation on Complex Floating Point Numbers
Authors: Philipp Digeser, Marco Tubolino, Martin Klemm, Daniel Shapiro, Axel Sikora and Miodrag Bolic
Email: {digeserp, tubolinm, klemmm, sikora}@dhbw-loerrach.de, {dshap092, mbolic}@site.uottawa.ca

2
Overview
– Prior Art
– Complex Floating Point Division
– Instruction Set Extensions (ISE)
– Instruction Hardware
– Software Interface
– Experiment
– Performance Evaluation
– Hardware Resource Utilization
– Future Work
– Conclusion

3
Prior Art
– In prior work we described the possibility of accelerating scientific observation using ISEs instead of software libraries such as carith.
– In this work we demonstrate this possibility.
– This extension of our prior work performs several operations (complex addition, subtraction, multiplication and division), which improves the chances of the ISE being widely applicable.

4
Complex Floating Point Computations
– Unlike real multiplication or division, mathematical operations on complex numbers are usually provided by slow software routines.
– Consider complex division. It is slow in software because each quotient requires:
  – 3 additions/subtractions
  – 6 multiplications
  – 2 divisions

5
Complex Floating Point Computations
Fast complex computations are necessary for:
– Image and audio manipulation
– Multi-antenna
– Correlation
– Others
Example: STSDAS offers math libraries for image analysis, including stsdas.analysis.fourier.carith, which is used to multiply or divide two complex images [1].

6
Instruction Set Extension
– Instruction set extensions, as the name implies, involve the addition of custom instructions to a processor's instruction set.
[Figure: Generic custom instruction datapath [2]]

7
Instruction Set Extension
– An ISE candidate has limited I/O access to the register file.
– We use multicycle reads from and writes to the register bank in order to squeeze several operands through the two-input, one-output register file [4].
– The computations can be distributed to one adder, one multiplier and one divider.
– They can be pipelined.
– Flags are set in case of divide by zero or overflow.
[Figure: Original custom logic block [3]]

8
Instruction Hardware
[Figures: operation of the complex core when n=0 and when n=1]

9
Software Interface
The designed hardware for complex division can be used easily from inline assembly or from C/C++ code, as shown below:

    ALT_CI_COMPLEX_CORE_INST(0, in_A, in_C);
    out_real = ALT_CI_COMPLEX_CORE_INST(1, in_B, in_D);
    out_imag = ALT_CI_COMPLEX_CORE_INST(0, 0, 0);

10
Experiment
– h(u,v) is a blurred picture taken by a telescope (e.g. Hubble).
  – Motion blurring is caused by a long exposure time and movement of the camera.
– g(u,v) is the image to be recovered.
– f(u,v) is the degradation, called the point spread function; it can be calculated from the known movement of the target.
[Figure panels: h(u,v), g(u,v), f(u,v)]

11
Experiment
– To restore the image, the pictures must be transformed into the frequency domain by applying an FFT, and the result transformed back using an IFFT.
– This transformation leads to complex arrays in the frequency domain that need to be divided element by element:
    f(u,v) * g(u,v) = h(u,v)        (* denotes convolution)
    G(u,v) = H(u,v) / F(u,v)
[Figure panels: h(u,v), f(u,v), g(u,v)]

12
Performance Evaluation

Operation        SW time (s)   ISE time (s)   Loop overhead (s)   Speedup
Division         9.17673       0.77180        0.02258             12.2182
Multiplication   6.41827       0.76075        0.02273             8.6651
Addition         2.50610       0.74385        0.02259             3.44344
Subtraction      2.58661       0.74477        0.02260             3.55442

13
Hardware Resource Utilization
– The hardware resource utilization is considerable.
– The entire system requires 8864 Logic Elements and 27 9-bit DSP units.
– The complex core requires 2520 Logic Elements and 23 9-bit DSP units.
– Optimizing the ISE hardware to maximize reuse was essential to limiting the hardware size.

14
Future Work
– Adding FFT and IFFT
  – To accelerate other embedded complex mathematics algorithms
– Correlation of pictures
  – Instead of doing a slow time domain correlation
  – Heavy complex multiplication in the frequency domain

15
Conclusion
– The designed ISE can be used to accelerate embedded complex mathematics operations.
– Significant speedup (up to about 12x, measured for complex division).

16
Questions?

17
References
[1] Space Telescope Science Institute. (2010) carith. [Online]. Available: http://stsdas.stsci.edu/cgi-bin/gethelp.cgi?carith.hlp
[2] Altera Corporation. (2007) Nios II custom instruction user guide. [Online]. Available: http://www.altera.com/literature/tt/tt_nios2_multiprocessor_tutorial.pdf
[3] P. Digeser, M. Tubolino, M. Klemm, D. Shapiro, and M. Bolic, "Instruction set extension in the NIOS II: A floating point divider for complex numbers," in CCECE, 2010.
[4] L. Pozzi and P. Ienne, "Exploiting pipelining to relax register-file port constraints of instruction-set extensions," in CASES '05: Proceedings of the 2005 International Conference on Compilers, Architectures and Synthesis for Embedded Systems. New York, NY, USA: ACM, 2005, pp. 2–10.
