1 Sonic Millip3De: Massively Parallel 3D Stacked Accelerator for 3D Ultrasound
Richard Sampson*, Ming Yang†, Siyuan Wei†, Chaitali Chakrabarti†, Thomas F. Wenisch*
* University of Michigan, † Arizona State University

2 Portable Medical Imaging Devices
Medical imaging moving towards portability
  – MEDICS (X-Ray CT) [Dasika ‘10]
  – Handheld 2D Ultrasound [Fuller ‘09]
Not just a matter of convenience
  – Improved patient health [Gunnarsson ‘00, Weinreb ‘08]
  – Access in developing countries
Why ultrasound?
  – Low transmit power [Nelson ‘10]
  – No dangers or side-effects

3 Handheld 3D Ultrasound
3D has numerous benefits over 2D
  – Easier to interpret images
  – Greater volumetric accuracy
… as well as many challenges
  – 12k transducers, 10M image points (10-20x beyond state of the art)
  – High raw data bandwidth (6 Tb/s) (major bottleneck in state of the art)
  – Tight handheld power budget (5 W)

4 Why a Custom Accelerator?
Software algorithms load/store intensive
  – von Neumann designs inefficient
Large system would require over 700 DSPs
  – General purpose CPUs even less efficient

Architecture                          Energy/Scanline (1 fps)   Single Core Time/Scanline
Intel Core i [model lost]             [value lost] J            4.46 s
ARM Cortex-A8                         33.04 J                   132.18 s
TI C6678 DSP                          2.84 J                    2.27 s

5 Contributions
Iterative delay calculation algorithm
  – Reduces storage by over 400x
  – Enables streaming data flow
Sonic Millip3De design
  – Leverages 3D die stacking technology
  – Transform-select-reduce accelerator framework
Power and image analysis of Sonic Millip3De
  – Negligible change in image quality
  – Able to meet 5 W power budget by 11 nm node

6 Outline
Introduction
Ultrasound background
Algorithm design
System design
  – Sonic Millip3De
  – Select Sub-Unit
Results and analysis
Conclusions

7 Ultrasound: Transmit and Receive
[Figure labels: Transmit Transducer, Receive Transducer, Focal Points, Image Space, Receive Raw Channel Data]

20 Ultrasound: Transmit and Receive
Each transducer stores an array of raw receive data

21 Ultrasound: Image Reconstruction
Image reconstructed from data based on round trip delay

22 Ultrasound: Image Reconstruction
Images from each transducer combined to produce full frame

23 Delay Index Calculation
Iterate through all image points for each transducer and calculate delay index
Often done with lookup tables (LUTs) instead
50 GB LUT required for target 3D system
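A minimal sketch of the direct calculation this slide describes, not the authors' code. It assumes the delay index is the round-trip path length (transmit source to focal point to receive transducer) converted to a sample count; the function name and the coordinate convention are hypothetical, while the speed of sound and interpolated sampling frequency are taken from the System Parameters slide.

```python
import math

SPEED_OF_SOUND = 1540.0  # m/s in tissue, from the System Parameters slide
SAMPLE_RATE = 160e6      # Hz, interpolated sampling frequency from the same slide


def delay_index(tx, focal, rx, c=SPEED_OF_SOUND, fs=SAMPLE_RATE):
    """Index of the receive sample corresponding to one focal point.

    tx, focal, rx are (x, y, z) positions in metres (hypothetical convention).
    """
    path = math.dist(tx, focal) + math.dist(focal, rx)  # round-trip distance
    return round(path / c * fs)                          # seconds -> samples


# A focal point 15.4 mm straight in front of a co-located transmitter/receiver.
idx = delay_index((0, 0, 0), (0, 0, 0.0154), (0, 0, 0))
```

Repeating this for every focal point and every transducer is what makes a precomputed table so large, which motivates the iterative algorithm later in the talk.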

24 Challenges of Handheld 3D Ultrasound
Delay index LUT requires too much storage
  – New iterative algorithm reduces necessary constant storage by 400x
Peak raw data bandwidth (6 Tb/s) infeasible
  – Sub-aperture multiplexing reduces peak data rate, but requires more transmits
Handheld power budget very tight (5 W)
  – 3D stacked, highly parallel data streaming design reconstructs images efficiently

25 Iterative Delay Index Calculation
Deltas between adjacent focal points on a scanline form a smooth curve
Fit piecewise quadratic approximation to the delta function
Two sections sufficient for negligible error
[Figure: delta curve split into Section 1 and Section 2]
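The idea on this slide can be illustrated with a rough sketch. The slide does not say how the quadratics are fitted, so the code below swaps in the simplest option, a parabola through three points of each section, rather than whatever fit the authors used. The geometry (transmit from the origin along the scanline, receiver displaced 1 cm sideways), the split point of 512, and all function names are assumptions for illustration only.

```python
import math


def scanline_delays(n_points, depth_m, offset_m, c=1540.0, fs=160e6):
    """Round-trip delay, in samples, to each focal point on a straight scanline."""
    dz = depth_m / n_points
    return [(k * dz + math.hypot(k * dz, offset_m)) / c * fs
            for k in range(1, n_points + 1)]


def quadratic_through(p0, p1, p2):
    """Coefficients (a, b, c) of the parabola through three (x, y) points."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    a = ((y2 - y0) / (x2 - x0) - (y1 - y0) / (x1 - x0)) / (x2 - x1)
    b = (y1 - y0) / (x1 - x0) - a * (x0 + x1)
    c = y0 - a * x0 * x0 - b * x0
    return a, b, c


def fit_two_sections(deltas, split):
    """Fit one parabola per section, each in its own local index n = i - start."""
    def fit(lo, hi):
        mid = (lo + hi) // 2
        return quadratic_through(*[(i - lo, deltas[i]) for i in (lo, mid, hi)])
    return fit(0, split - 1), fit(split, len(deltas) - 1)


def evaluate(sections, split, i):
    """Approximate delta at global index i using the section that covers it."""
    a, b, c = sections[0] if i < split else sections[1]
    n = i if i < split else i - split
    return a * n * n + b * n + c


delays = scanline_delays(4096, 0.06, 0.01)
deltas = [d1 - d0 for d0, d1 in zip(delays, delays[1:])]
sections = fit_two_sections(deltas, split=512)
worst = max(abs(evaluate(sections, 512, i) - deltas[i]) for i in range(len(deltas)))
print(f"worst delta error: {worst:.4f} samples")
```

The printed error depends on the assumed geometry and split point, so it should not be read as a reproduction of the accuracy claimed in the talk.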

26 Sub-aperture Multiplexing
Peak raw data bandwidth (6 Tb/s) infeasible
Solution: sub-aperture multiplexing
  – Transmit multiple times from same location
  – Receive with subset of transducers (sub-aperture)
  – Sum images together
Prior work: reduce data rate
Our design: also reduces HW and power requirements
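The three steps above can be sketched in a toy model. It assumes reconstruction is a plain delay-and-sum (each transducer contributes the sample selected by its delay index) and that every repeated transmit produces identical echoes, so processing one sub-aperture at a time and summing the partial images matches processing all transducers at once. Function names and the tiny data set are hypothetical.

```python
def delay_and_sum(channel_data, delay_idx):
    """For each focal point, sum the selected sample from every transducer."""
    n_focal = len(delay_idx[0])
    return [sum(channel_data[t][delay_idx[t][p]] for t in range(len(channel_data)))
            for p in range(n_focal)]


def multiplexed(channel_data, delay_idx, sub_size):
    """Reconstruct one sub-aperture per (simulated) transmit, then sum the images."""
    image = [0] * len(delay_idx[0])
    for start in range(0, len(channel_data), sub_size):
        part = delay_and_sum(channel_data[start:start + sub_size],
                             delay_idx[start:start + sub_size])
        image = [a + b for a, b in zip(image, part)]
    return image


# Four transducers, three stored samples each, two focal points.
data = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
idx = [[0, 2], [1, 1], [2, 0], [0, 1]]
full = delay_and_sum(data, idx)
split = multiplexed(data, idx, sub_size=2)
```

Only `sub_size` channels are live at any moment in `multiplexed`, which is the property that lets the hardware be sized for one sub-aperture at the cost of more transmits.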

27 System Design

28 System Design
Sonic Millip3De comprises 1,024 parallel pipelines

29 System Design: Transducers
Interchangeable CMOS transducer layer; can use older process

30 System Design: ADC/Storage
Separate storage layer to reduce wire lengths

31 System Design: Transform-Select-Reduce
Accelerator units in fast, low power process

32 Select Sub-Unit Design
Selects sample closest to each focal point using our algorithm

33 Select Sub-Unit Design
All delays for a scanline estimated using 9 constants
[Figure: Section 1 and Section 2]

34 Select Sub-Unit Design
Adders calculate next iteration of quadratic approximation
$A(n+1)^2 + B(n+1) + C = (An^2 + Bn + C) + 2An + (A + B)$
[Figure: Section 1 and Section 2]
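The identity on this slide means each new value of the quadratic can be produced with additions only. The following is a software sketch of that recurrence, not a description of the actual adder circuit; the function and variable names are hypothetical.

```python
def quadratic_by_adders(a, b, c, count):
    """Yield A*n^2 + B*n + C for n = 0 .. count-1 using only additions."""
    value = c        # the quadratic evaluated at n = 0
    step = a + b     # value(1) - value(0), i.e. 2*A*0 + (A + B)
    out = []
    for _ in range(count):
        out.append(value)
        value += step    # first adder: advance to the next n
        step += 2 * a    # second adder: the increment itself grows by 2A
    return out


values = quadratic_by_adders(2, 3, 5, 4)
```

Because no multiplier is needed per focal point, the per-step cost is two additions regardless of how many points lie on the scanline.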

35 Select Sub-Unit Design
Decrementor selects sample for next image focal point
[Figure: Section 1 and Section 2]

36 Select Sub-Unit Design
Section decrementor indicates when to change constants
[Figure: Section 1 and Section 2]
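Putting the last few slides together, a behavioural model of the select step might look like the sketch below. It is an assumption-laden illustration: the slides do not list the 9 constants, so the sketch uses a hypothetical starting index plus `(A, B, C, length)` per section, and it replaces the hardware decrementors with ordinary loop counters.

```python
def select_samples(samples, first_index, sections):
    """Pick one stored sample per focal point along a scanline.

    sections is a list of (a, b, c, length) tuples (hypothetical layout):
    a, b, c define the quadratic delta for that section and length is how
    many focal points it covers before the constants change.
    """
    selected = []
    position = first_index
    for a, b, c, length in sections:
        delta, step = c, a + b       # load this section's constants
        for _ in range(length):      # stands in for the section decrementor
            selected.append(samples[round(position)])
            position += delta        # advance to the next focal point's sample
            delta += step            # adders update the quadratic delta
            step += 2 * a
    return selected


samples = list(range(100))           # toy channel data where sample i holds i
picked = select_samples(samples, 0, [(0, 0, 2, 3), (0, 1, 3, 2)])
```

In this toy run the first section advances by a constant 2 samples per focal point and the second starts at 3 and grows by 1, so the change of constants is visible in the spacing of the selected indices.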

37 Outline
Introduction
Ultrasound background
Algorithm design
System design
  – Sonic Millip3De
  – Select Sub-Unit
Results and analysis
Conclusions

38 System Parameters

Parameter                                    Value
Sub-apertures                                12
Transmit Sources                             16
Transmits per Frame                          192
Transducers per Sub-aperture                 1,024
Total Transducers                            12,288
Storage per Transducer                       4,096 x 12 bits
Focal Points per Scanline                    4,096
Image Depth                                  6 cm
Image Angular Width                          π/4
Sampling Frequency                           40 MHz
Interpolation Factor                         4x
Interpolated Sampling Frequency (f_s)        160 MHz
Speed of Sound (tissue)                      1,540 m/s
Target Frame Rate                            1 fps
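Several of these parameters can be cross-checked with simple arithmetic. The sketch below is a reader's sanity check, not a calculation from the talk: it assumes each sample is 12 bits wide (matching the storage width) and that the quoted 6 Tb/s peak rate corresponds to all transducers streaming at the 40 MHz sampling frequency, neither of which the slides state explicitly.

```python
SUB_APERTURES = 12
TRANSMIT_SOURCES = 16
TRANSDUCERS_PER_SUB_APERTURE = 1024
SAMPLE_BITS = 12              # assumed equal to the 12-bit storage width
SAMPLE_RATE_HZ = 40e6
STORAGE_SAMPLES = 4096
IMAGE_DEPTH_M = 0.06
SPEED_OF_SOUND = 1540.0       # m/s

# One transmit per (source, sub-aperture) pair.
transmits_per_frame = SUB_APERTURES * TRANSMIT_SOURCES

total_transducers = SUB_APERTURES * TRANSDUCERS_PER_SUB_APERTURE

# Raw data rate if every transducer streamed at once, in bits per second.
peak_rate_bps = total_transducers * SAMPLE_RATE_HZ * SAMPLE_BITS

# Rate with only one sub-aperture active per transmit.
sub_aperture_rate_bps = TRANSDUCERS_PER_SUB_APERTURE * SAMPLE_RATE_HZ * SAMPLE_BITS

# Samples captured during an on-axis round trip to full image depth.
round_trip_samples = 2 * IMAGE_DEPTH_M / SPEED_OF_SOUND * SAMPLE_RATE_HZ

print(transmits_per_frame, total_transducers)
print(f"{peak_rate_bps / 1e12:.2f} Tb/s vs {sub_aperture_rate_bps / 1e12:.2f} Tb/s")
print(f"{round_trip_samples:.0f} samples per round trip")
```

Under these assumptions the full-aperture rate comes out just under 6 Tb/s, the sub-aperture rate is a twelfth of that, and an on-axis round trip to 6 cm needs fewer samples than the 4,096 stored per transducer.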

39 Image Quality Comparison
[Figure: simulated images labeled Ideal, Our Design (12 bit), and 11 bit, with a table of CNR by bit width; values not recoverable]
Our design has negligible difference from ideal system
Simulations using Field II [Jensen ‘92, ‘95]

40 Power Analysis and Scaling
Can meet 5 W by 11 nm node

41 Conclusions
3D die stacked Sonic Millip3De design is able to meet 5 W power budget by 11 nm
Algorithm/HW co-design enables order-of-magnitude gains
  – Power and output quality goals often in conflict
  – Need guidance from domain experts to balance
Architects have much to offer for application-specific system designs

42 Questions?
Special thanks to: Brian Fowlkes, Oliver Kripfgans, Ron Dreslinski

