Presentation is loading. Please wait.

Presentation is loading. Please wait.

Towards a Digital Human Kathy Yelick EECS Department U.C. Berkeley.

Similar presentations


Presentation on theme: "Towards a Digital Human Kathy Yelick EECS Department U.C. Berkeley."— Presentation transcript:

1 Towards a Digital Human Kathy Yelick EECS Department U.C. Berkeley

2 The 20+ Year Vision Imagine a “digital body double” –3D image-based medical record –Includes diagnostic, pathologic, and other information Used for: –Diagnosis –Less invasive surgery-by-robot –Experimental treatments Digital Human Effort –Lead by the Federation of American Scientists

3 Digital Human Today: Imaging The Visible Human Project –18,000 digitized sections of the body Male: 1mm sections, released in 1994 Female:.33mm sections, released in 1995 –Goals study of human anatomy testing medical imaging algorithms –Current applications: educational, diagnostic, treatment planning, virtual reality, artistic, mathematical and industrial Used by > 1,400 licensees in 42 countries Image Source: www.madsci.org

4 Digital Human Roadmap 1995200020052010 1 organ 1 model scalable implementations 1 organ multiple models multiple organs 3D model construction better algorithms organ system coupled models faster computers improved programmability digital human

5 Building 3D Models from Images Image source: John Sullivan et al, WPI Image data from visible human MRI Laboratory experiments Automatic construction Surface mesh Volume mesh John Sullivan et al, WPI

6 Simulation: From Genome to Function (James B.Bassingthwaighte, UW) Genes Health Organism Organ Tissue Cell Molecule The Physiome Project http://www.physiome.org Structure to Function: Experiments, Databases Problem Formulation Engineering the Solutions Quantitative System Modeling Archiving & Dissemination

7 Organ Simulation Cardiac cells/muscles –SDSC, Auckland, UW, Utah, Cardiac flow –NYU,… Lung transport –Vanderbilt Lung flow – ORNL Cochlea –Caltech, UM, UCB Kidney mesh generation – Dartmouth Electrocardiography –Johns Hopkins,… Skeletal mesh generation Brain –ElIisman Just a few of the efforts at understanding and simulating parts of the human body

8 Immersed Boundaries within the Body Fluid flow within the body is one of the major challenges, e.g., –Blood through the heart –Coagulation of platelets in clots –Movement of bacteria A key problem is modeling an elastic structure immersed in a fluid –Irregular moving boundaries –Wide range of scales –Vary by structure, connectivity, viscosity, external forced, internally-generated forces, etc.

9 Heart Simulation Developed by Peskin and McQueen at NYU –On a Cray C90: 1 heart-beat in 100 CPU hours –128 3 point fluid mesh; coarser mesh incorrect –Heart represented by muscle fibers –Applications: Understanding structural abnormalities Evaluating artificial heart valves Eventually, artificial hearts Source: www.psc.org

10 Heart Simulation Source: www.psc.org Animation of lower portion of the heart

11 Platelet Simulation Simulation of blood clotting –Developed by Aaron Fogelson and Charlie Peskin –Cells are represented by polygons Rules added to simulate adhesion –Artery walls represented as fibers –2D Simulation Main implementation in F77 for vectors Our implemented in Split-C for distributed memory

12 Cochlea Simulation –Simulates fluid-structure interactions due to incoming sound waves –Potential applications: design of cochlear implants Simulation –OpenMP code E. Givelberg J. Bunn M. Rajan –256x256x128 fluid grid –18 hours on HP Superdome

13 Cochlea Simulation Embedded structures are –Elastic membranes –Shell model –Variable elasticity Capture shape of cochlea Shown here is a plate –Implemented in a few weeks within Titanium code Source: Titanium simulation on a T3E

14 Bacteria Simulation Simulation of bacteria in a fluid –Requires smaller scale –Small animal swimming Lisa Fauchy, Tulane and James Sullivan, Tulane E coli simulation –Video © James A. Sullivan, Cells Alive! Brownian motion extension –Under development by Peter Kramer, RPI –microscopic processes where thermal fluctuation is important

15 Insect Flight Simulation Wings are –Immersed 2D structure Under development –UW and NYU Applications to –Insect robot design Source: Dickenson, UCB

16 Other Applications The immersed boundary method has also been used, or is being applied to –Flags and parachutes –Flagella –Embryo growth –Valveless pumping –Paper making

17 Immersed Boundary Method Structure Fiber activation & force calculation Interpolate Velocity Navier-Stokes Solver Spread Force 4 steps in each timestep Fiber Points Interaction Fluid Lattice

18 Challenges to Parallelization Irregular fiber lists need to interact with regular fluid lattice. –Trade-off between load balancing of fibers and minimizing communication Efficient “scatter-gather” across processors Need a scalable elliptic solver –Plan to uses multigrid –Eventually add Adaptive Mesh Refinement New algorithms under development by Colella’s group at LBNL and Baden at UCSD

19 Software Architecture Application Models Generic Immersed Boundary Method (Titanium) Heart (Titanium) Cochlea Flagellate Swimming … Spectral (Titanium) Multigrid MLC (KeLP) AMR Extensible Simulation Solvers Multigrid (Titanium) –Based on Generic Immersed Boundary Method –Can plug in new models by extending fiber points –Uses Java inheritance for simplicity

20 Immersed Boundaries in Titanium –Heartbeat simulation Experiment 800 1/5 th of heartbeat done Using to validate code Recent improvements –FFTW in solver: 10x faster –Study of fluid/fiber communication Better load balancing Source: Titanium simulation on Millennium Immersed Boundary Method in Titanium is complete –Download from www.cs.berkeley.edu/~smyau

21 Reduced Solver Bottleneck Performance improvements in past year –About 5x overall; ~10 seconds/ts on 64 T3E nodes –Bottleneck has changed to interaction phases 2000 Performance2002 Performance

22 Load Balance and Locality Tried various assignments of fiber points Optimized for Locality Optimized for Load Balance Optimizing for load balance generally better –Application (fiber structure) dependent –Somewhat constrained by spectral solver –Multigrid will give more options

23 Scallop: New Multigrid Poisson Solver A latency tolerant elliptic solver library –Based on domain decomposition –Trades off extra computation for fewer messages –Will be used to build Navier-Stokes Solver Work by Scott Baden and Greg Balls, UCSD –Based on Balls/Colella algorithm –Implemented in C++/KeLP, with a simple interface Solver not yet integrated –Algorithm is complete –Performance tuning is ongoing –Interface between Titanium and C++/KeLP developed

24 Tools for High Performance Challenges to parallel simulation of a digital human are generic Parallel machines are too hard to program –Users “left behind” with each new major generation Efficiency is too low –Even after a large programming effort –Single digit efficiency numbers are common Approach –Titanium: A modern (Java-based) language that provides performance transparency –Sparsity: Self-tuning scientific kernels

25 Titanium Overview Object-oriented language based on Java with: Scalable parallelism –Single Program Multiple Data (SPMD) model of parallelism, 1 thread per processor Global address space –Processors can read/write memory on other processor –Pointer dereference can cause communication Intermediate point between message passing and shared memory

26 Performance Challenges Two separate performance problems –Serial Performance Can Java be used for high performance? What if we compile to machine code, rather than using the JVM? –Communication Performance Global address space encourages the use of small messages Overlapping used to tolerate latencies How well does this work on current networks?

27 SciMark Benchmark Numerical benchmark for Java, C/C++ Five kernels: –FFT (complex, 1D) –Successive Over-Relaxation (SOR) –Monte Carlo integration (MC) –Sparse matrix multiply –dense LU factorization Results are reported in Mflops Download and run on your machine from: –http://math.nist.gov/scimark2http://math.nist.gov/scimark2 –C and Java sources also provided Roldan Pozo, NIST, http://math.nist.gov/~Rpozo

28 SciMark: Java vs. C (Sun UltraSPARC 60) * Sun JDK 1.3 (HotSpot), javac -0; Sun cc -0; SunOS 5.7 Roldan Pozo, NIST, http://math.nist.gov/~Rpozo

29 SciMark: Java vs. C (Intel PIII 500MHz, Win98) * Sun JDK 1.2, javac -0; Microsoft VC++ 5.0, cl -0; Win98 Roldan Pozo, NIST, http://math.nist.gov/~Rpozo

30 Can we do better without the JVM? Pure Java with a JVM (and JIT) – Within 2x of C and sometimes better OK for many users, even those using high end machines –Depends on quality of both compilers We can try to do better using a traditional compilation model –E.g., Titanium compiler at Berkeley Compiles Java extension to C Bounds checking of arrays is crucial – compiler flag to turn this off

31 Java Compiled by Titanium Compiler

32

33 Language Support for Performance Multidimensional arrays –Contiguous storage –Support for sub-array operations without copying Support for small objects –E.g., complex numbers –Called “immutables” in Titanium –Sometimes called “value” classes Semi-automatic memory management –Create named “regions” for new and delete –Avoids distributed garbage collection

34 Performance Challenges Two separate performance problems –Serial Performance Can Java be used for high performance? What if we compile to machine code, rather than using the JVM? –Communication Performance Global address space encourages the use of small messages Overlapping used to tolerate latencies How well does this work on current networks?

35 The Network Parameters EEL – End to end latency or time spent sending a short message between two processes. Parameters of the LogP Model –L – “Latency”or time spent on the network During this time, processor can be doing other work –O – “Overhead” or processor busy time on the sending or receiving side. –G – “gap” the rate at which messages can be pushed onto the network. –P – the number of processors Overhead is the most important parameter – cannot be hidden

36 LogP Parameters: Overhead & Latency Non-overlapping overhead Send and recv overhead can overlap P0 P1 o send L o recv P0 P1 o send o recv EEL = o send + L + o recv EEL < o send + o recv

37 Performance of Remote Read/Write 8-byte messages

38 LogP Parameters: gap The Gap is the delay between sending messages Gap may be larger than send overhead, e.g., –NIC may be busy processing the last message. –Flow control on the network may prevent the NIC from accepting the next message. If the gap is higher than overhead –Need to overlap communication with computation, not other communication P0 P1 o send gap

39 Results: Gap and Overhead

40 Send Overhead Over Time Overhead has not improved significantly –Lack of integration; lack of attention in software

41 Summary Several categories –Application development Heart and cochlea-component simulation –Application-level package Generic immersed boundary method Parallel for shared and distributed memory Enables new larger-scale simulations; finer grid –Titanium language and compiler High level, high performance programming –Communication runtime layer Lightweight communication and benchmarks application data software systems

42 Collaborators Susan Graham Paul Hilfinger Jim Demmel Alex Aiken Phillip Colella Peter McQuorquodale Mike Welcome Sabrina Merchant Kaushik Datta Dan Bonachea Rich Vuduc Jason Duell Paul Hargrove Wei Chen Eun-Jin Im Szu-Huey Chang Carrie Fei Ben Liblit Robert Lin Geoff Pike Jimmy Su Ellen Tsai Siu Man Yau Shaoib Kamil Benjamin Lee Rajesh Nishtala Costin Iancu http://titanium.cs.berkeley.edu/


Download ppt "Towards a Digital Human Kathy Yelick EECS Department U.C. Berkeley."

Similar presentations


Ads by Google