Reconfigurable HPC: Notes on Datastream-Based FFT
Reiner Hartenstein, TU Kaiserslautern
Baden-Baden, 12 June 2013
Derived from: R. Hartenstein: Reconfigurable Technologies; seminar given at Kyushu University, Fukuoka, Japan, 23 July 2004
© 2004, TU Kaiserslautern
Application-specific distributed memory*
Application-specific memory: rapidly growing markets:
– IP cores
– Module generators
– EDA environments
Optimization of memory bandwidth for application-specific distributed memory
*) see Herz et al.: Proc. IEEE ICECS 2002
MoM anti machine: an Xputer architecture with multiple scan windows
[Figure labels: data counter, memory bank, asM, asMA, distributed memory, rDPU, smart memory interface; example: 4x4 scan windows]
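The slide names the components but does not spell out how a data counter moves a scan window across memory. As a rough illustration only, the sketch below assumes a 2-D memory addressed by (row, column), non-overlapping 4x4 windows, and a row-major scan pattern. All of these are assumptions, and the function names `linear_scan` and `window_addresses` are hypothetical, not taken from the Xputer tools.

```python
def linear_scan(rows, cols, win_h=4, win_w=4):
    """Yield the top-left (row, col) of each scan-window position.

    Assumption: windows do not overlap and are visited in row-major
    order. A real data sequencer may use other scan patterns.
    """
    for r in range(0, rows - win_h + 1, win_h):
        for c in range(0, cols - win_w + 1, win_w):
            yield (r, c)


def window_addresses(origin, win_h=4, win_w=4):
    """Return every (row, col) address covered by a window at `origin`."""
    r0, c0 = origin
    return [(r0 + dr, c0 + dc) for dr in range(win_h) for dc in range(win_w)]


if __name__ == "__main__":
    # An 8x8 memory holds four non-overlapping 4x4 window positions.
    for origin in linear_scan(8, 8):
        cells = window_addresses(origin)
        print(origin, "covers", cells[0], "to", cells[-1])
```

The point of the sketch is that the address sequence is generated by the memory side, so the datapath only ever sees the words currently inside its window.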
16-point CGFFT: mapped onto 2-D memory space
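The memory layout in the figure is not recoverable from the slide text. Assuming CGFFT stands for the constant-geometry FFT, the minimal sketch below shows what makes that variant attractive for a fixed scan pattern: every stage reads positions i and i + N/2 and writes positions 2i and 2i + 1, so the addressing is identical in all stages and only the coefficient changes. This is a 1-D radix-2 decimation-in-frequency formulation chosen for illustration, not necessarily the mapping used on the slide; `cgfft`, `bit_reverse` and `naive_dft` are hypothetical names.

```python
import cmath


def bit_reverse(q, bits):
    """Reverse the lowest `bits` bits of q."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (q & 1)
        q >>= 1
    return r


def cgfft(x):
    """Constant-geometry radix-2 FFT (sketch); len(x) must be a power of two."""
    n = len(x)
    m = n.bit_length() - 1
    if n == 0 or (1 << m) != n:
        raise ValueError("length must be a power of two")
    half = n // 2
    twiddle = [cmath.exp(-2j * cmath.pi * k / n) for k in range(half)]
    data = [complex(v) for v in x]
    for stage in range(m):            # outer loop: one pass per stage
        nxt = [0j] * n
        for i in range(half):         # inner loop: same addresses every stage
            a, b = data[i], data[i + half]
            k = (i >> stage) << stage  # only the coefficient index depends on the stage
            nxt[2 * i] = a + b
            nxt[2 * i + 1] = (a - b) * twiddle[k]
        data = nxt
    # The passes leave the spectrum in bit-reversed order; undo that here.
    return [data[bit_reverse(k, m)] for k in range(n)]


def naive_dft(x):
    """O(N^2) reference DFT, used only to check the sketch."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


if __name__ == "__main__":
    samples = [complex(t % 5, t % 3) for t in range(16)]
    err = max(abs(g - r) for g, r in zip(cgfft(samples), naive_dft(samples)))
    print("max deviation from reference DFT:", err)
```

For 16 points the outer loop runs 4 times and the inner loop 8 times, which gives 32 butterfly operations in total.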
CGFFT: Nested and Parallel Scan Pattern
[Figure labels: memory areas input, coeff., temp, output; scan window contents in i, in i+1, coeff., empty; MAC]
CGFFT: Parallel Scan Pattern Animation
[Figure labels: in i, in i+1, coeff., empty; MAC; out j, out k]
32 steps
CGFFT: Parallel Scan Pattern Animation
[Figure labels: two MAC units; inputs in i, in i+1, in i+2, in i+3; coeff.; empty; outputs out j, out j+1, out k, out k+1]
Two MAC units, as drawn: 16 steps
4 MAC units in parallel: 8 steps
8 MAC units in parallel: 4 steps
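The step counts quoted on these slides (32, 16, 8 and 4) are consistent with a simple accounting: a radix-2 FFT of N points has (N/2) · log2(N) butterflies, and the work divides evenly over the MAC units. The sketch below assumes each MAC unit completes one butterfly per step, which is an assumption about how the slides count rather than something they state; `scan_steps` is a hypothetical helper name.

```python
def scan_steps(n_points, mac_units):
    """Steps needed if each MAC unit handles one butterfly per step (assumed)."""
    stages = n_points.bit_length() - 1
    if n_points < 2 or (1 << stages) != n_points:
        raise ValueError("n_points must be a power of two >= 2")
    if mac_units < 1:
        raise ValueError("mac_units must be at least 1")
    butterflies = (n_points // 2) * stages
    # Ceiling division: a partly filled last step still costs a full step.
    return -(-butterflies // mac_units)


if __name__ == "__main__":
    for units in (1, 2, 4, 8):
        print(units, "MAC unit(s):", scan_steps(16, units), "steps")
```

Under this accounting, going beyond 8 units would not help a 16-point transform, since each stage contains only 8 independent butterflies.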
CGFFT: Nested and Parallel Scan Pattern