
1 David P. Anderson Space Sciences Lab U.C. Berkeley Exa-Scale Volunteer Computing

2 A brief history of volunteer computing (timeline, 1995–2008)
- Applications: distributed.net, GIMPS (1995); SETI@home, Folding@home (~2000); Climateprediction.net, Predictor@home, WCG, Einstein, Rosetta, ... (2005 onward)
- Platforms: Bayanihan, Javelin, ...; commercial: Entropia, United Devices, ...; BOINC

3 Applications
- Computational biology (biomedical, plant genomics; cancer, AIDS, Alzheimer's, dengue fever)
  - protein folding and structure prediction (Rosetta++)
  - virtual drug design (Autodock, CHARMM)
  - genetic linkage analysis
  - phylogenetics
- Epidemiology: malaria model
- Environmental studies: "Virtual Prairie" simulation

4 More applications
- High-energy physics: CERN accelerator and collision simulations
- Climate prediction: HADSM3 (U.K.), WRF (NCAR)
- Astronomy: gravitational wave detection, SETI, Milky Way and Big Bang studies
- Nanotechnology
- Mathematics
- Distributed seismography

5 The PetaFLOPS milestone
- Folding@home: Sept 19, 2007
  - current average: 2.67 PetaFLOPS
  - 40% Cell (40K Sony PS3), 40% GPU (10K NVIDIA), 20% CPU (250,000 computers)
- BOINC: Jan 31, 2008
  - current average: 1.2 PetaFLOPS
  - 568,000 computers (87% Windows)
- First supercomputer: May 25, 2008
  - IBM Roadrunner: 1.026 PetaFLOPS, $133M

6 Cost per TeraFLOPS-year
- Cluster: $124,000
- Amazon EC2: $1,750,000
- Volunteer computing: $2,000
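The gap between these figures is easier to see as ratios; a quick check of the slide's numbers:

```python
# Cost per TeraFLOPS-year as quoted on the slide (2008 figures).
costs = {
    "cluster": 124_000,
    "Amazon EC2": 1_750_000,
    "volunteer computing": 2_000,
}

# How many times more expensive each alternative is than volunteer computing.
baseline = costs["volunteer computing"]
ratios = {name: cost / baseline for name, cost in costs.items()}
print(ratios)  # cluster: 62x, Amazon EC2: 875x
```

By this accounting, volunteer computing is 62x cheaper than a dedicated cluster and 875x cheaper than EC2, since volunteers donate the hardware and electricity.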

7 The real goals
- Enable paradigm-shifting science: change the way resources are allocated
- Revive public interest in science: avoid a return to the Dark Ages
- So we need to:
  - make volunteer computing feasible for all scientists
  - involve the entire public, not just the geeks
  - solve the "project discovery" problem
- Progress: non-zero but small

8 The road to ExaFLOPS
- Consumer computing resources:
  - CPUs in PCs (desktop, laptop)
  - GPUs in PCs
  - video-game consoles
  - mobile devices
  - home media devices
- For each type:
  - what is its performance potential? how will it change over time?
  - ease of programming?
  - energy efficiency?
  - network connectivity?
  - how to publicize and deploy?

9 CPUs
- 2 billion PCs by 2015
- Performance increases largely from multicore: need to develop parallel apps
- Availability will decline (green computing)
- 1 ExaFLOPS: 40,000,000 PCs x 100 GFLOPS x 0.25 availability
- Promotional partner: MS? HP? Dell?

10 GPUs
- NVIDIA 8800: ~500 GFLOPS
- Programmability: CUDA; OpenCL?
- 1 ExaFLOPS: 4,000,000 GPUs x 1,000 GFLOPS x 0.25 availability

11 Video-game consoles
- Sony PlayStation 3: Cell (~100 GFLOPS) + GPU; ships with Folding@home; hard to program
- Microsoft Xbox: 3 PowerPC cores (~30 GFLOPS) + GPU
- 0.25 ExaFLOPS: 10,000,000 consoles x 100 GFLOPS x 0.25 availability

12 Mobile devices (while recharging)
- Cell phones, PDAs, media players, Kindle, etc.
- Hardware convergence:
  - 0.5 GFLOPS CPU (Freescale i.MX37, 65 nm), low power (best FLOPS/watt)
  - >256 MB RAM
  - >10 GB stable storage
  - Internet access
- Software: Google Android?
- 3.3 billion cell phones in 2010
- 0.5 ExaFLOPS: 1 billion devices x 1 GFLOPS x 0.5 availability

13 Home media players
- Cable set-top boxes, Blu-ray players
- Hardware: low-end PC
- Software environment: Java-based Multimedia Home Platform (MHP)
- 0.1 ExaFLOPS: 100M devices x 2 GFLOPS x 0.5 availability
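Slides 9–13 each pair a device count with a per-device speed and an availability factor; multiplying them out reproduces the stated totals. (The counts and availabilities are the slides' own projections, not measurements.)

```python
# (device count, sustained GFLOPS per device, availability) from slides 9-13.
resources = {
    "PC CPUs":        (40_000_000, 100, 0.25),
    "PC GPUs":        (4_000_000, 1_000, 0.25),
    "game consoles":  (10_000_000, 100, 0.25),
    "mobile devices": (1_000_000_000, 1, 0.5),
    "media players":  (100_000_000, 2, 0.5),
}

GFLOPS_PER_EXAFLOPS = 1e9  # 1 ExaFLOPS = 10^9 GFLOPS

# Sustained capacity of each device class, in ExaFLOPS.
exaflops = {
    name: count * gflops * avail / GFLOPS_PER_EXAFLOPS
    for name, (count, gflops, avail) in resources.items()
}
for name, exa in exaflops.items():
    print(f"{name}: {exa:.2f} ExaFLOPS")
print(f"total: {sum(exaflops.values()):.2f} ExaFLOPS")
```

Summed across all five classes, the projections come to roughly 2.85 ExaFLOPS of sustained capacity.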

14 The BOINC project
- NSF-funded, based at UC Berkeley: 2.5 FTEs plus many volunteers
- Functions:
  - develop technology for volunteer and desktop grid computing
  - enable online communities
  - do research related to volunteer computing

15 BOINC server software
- Job scheduling: high performance (10M jobs/day), scalability
- Web code (PHP): community, social network
- Ways to create a project:
  - set up a server on a Linux box
  - run the BOINC server VM (VMware)
  - run the BOINC server VM on Amazon EC2
- Architecture (from diagram): MySQL DB (~1M jobs) -> feeder -> shared memory (~1K jobs) -> scheduler (CGI) -> clients; plus various daemons
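The diagram's feeder/shared-memory split is the key to the 10M-jobs/day figure: scheduler processes never scan the million-row job table, only a small in-memory cache that a feeder daemon keeps topped up. A minimal sketch of that pattern (class names and sizes are illustrative, not BOINC's actual code):

```python
# Feeder pattern: DB holds ~1M jobs; scheduler instances read only from a
# small in-memory cache (~1K jobs) that the feeder daemon keeps full.
from collections import deque

CACHE_SIZE = 1_000  # size of the shared-memory job cache

class Feeder:
    def __init__(self, db_jobs):
        self.db_jobs = deque(db_jobs)  # stand-in for the MySQL job table
        self.shared_memory = deque()   # stand-in for the shared-memory segment

    def top_up(self):
        """Move jobs from the DB into shared memory until the cache is full."""
        while self.db_jobs and len(self.shared_memory) < CACHE_SIZE:
            self.shared_memory.append(self.db_jobs.popleft())

def schedule(feeder, n):
    """A scheduler instance hands out cached jobs; it never touches the DB."""
    n = min(n, len(feeder.shared_memory))
    return [feeder.shared_memory.popleft() for _ in range(n)]

feeder = Feeder(range(1_000_000))
feeder.top_up()
batch = schedule(feeder, 4)  # jobs handed to one requesting client
```

In the real system the feeder runs continuously and the scheduler is a CGI program reading the shared-memory segment; the deque here only illustrates the division of labor.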

16 BOINC client software
- Components (from diagram): core client; application and graphics app, each linked with the BOINC library; GUI; screensaver
- The core client talks to the GUI and screensaver over local TCP, and to project schedulers and data servers
- User preferences and control
- Cross-platform (Win/Mac/Linux)
- Simple, configurable, secure, invisible

17 BOINC's project/volunteer model
- Projects are independent; there is no central authority; a project's ID is its URL
- Volunteer PCs attach to any set of projects (e.g. Climateprediction.net, Superlink@Technion, World Community Grid, Rosetta@home)

18 Facilitating project discovery
- Account managers (web services) sit between the volunteer PC and the BOINC-based projects (Climateprediction.net, Superlink@Technion, World Community Grid, Rosetta@home), giving volunteers one place to find and attach to projects

19 Application platform
- Multithread and coprocessor support
- Client reports its list of platforms, coprocessors, and #CPUs to the scheduler
- For each app version, an app planning function computes:
  - inputs: host, app class
  - outputs: avg/max #CPUs, coprocessor usage, estimated FLOPS
- Scheduler replies with jobs and matching app versions
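The planning function's job is to answer, per host and app class: can this version run here, with how many CPUs and which coprocessors, and how fast? A hypothetical sketch with made-up field names and numbers (BOINC's real server code is C++, and this is not its API):

```python
# Illustrative "app planning function": given a host and an app version's
# class, decide whether/how the version runs there and estimate its speed.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Host:
    ncpus: int
    cpu_gflops: float   # per-core sustained speed
    ngpus: int
    gpu_gflops: float

@dataclass
class Plan:
    avg_ncpus: float
    max_ncpus: float
    coproc_usage: float  # fraction of a GPU used
    est_gflops: float    # estimated speed of this app version on this host

def app_plan(host: Host, app_class: str) -> Optional[Plan]:
    if app_class == "cuda":
        if host.ngpus == 0:
            return None  # don't send this version to this host
        # GPU version: a sliver of CPU time feeds one whole GPU
        return Plan(avg_ncpus=0.1, max_ncpus=1, coproc_usage=1.0,
                    est_gflops=host.gpu_gflops)
    if app_class == "multithread":
        # use all cores, discounted for parallel overhead
        return Plan(avg_ncpus=host.ncpus * 0.9, max_ncpus=host.ncpus,
                    coproc_usage=0.0,
                    est_gflops=host.ncpus * 0.9 * host.cpu_gflops)
    # default: plain single-threaded version
    return Plan(avg_ncpus=1, max_ncpus=1, coproc_usage=0.0,
                est_gflops=host.cpu_gflops)
```

The scheduler can then send each host the app version with the highest estimated FLOPS, which is how one application can ship CPU, multicore, and GPU builds simultaneously.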

20 Adaptive replication
- Volunteer PCs are anonymous and untrusted: how do we know results are correct?
- Replicated computing: require consensus of equivalent results; 2x throughput penalty
- Adaptive replication:
  - maintain an estimate of each host's "validity rate" V(h)
  - if V(h) > K, replicate; else replicate with probability V(h)/K
  - goal: reduce throughput penalty to 1+ε
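Taken literally with V(h) as a validity rate, this rule would replicate most for the most reliable hosts; the stated 1+ε goal suggests V(h) here tracks a host's rate of *invalid* results, shrinking toward zero as the host proves trustworthy. A minimal sketch under that reading (the threshold K and all names are illustrative assumptions, not BOINC's code):

```python
# Adaptive replication sketch: always double-check untrusted hosts, and
# spot-check trusted ones with probability proportional to their estimated
# invalid-result rate V(h).
import random

K = 0.1  # replication threshold (illustrative value)

class HostStats:
    """Running estimate of a host's invalid-result rate V(h)."""
    def __init__(self):
        self.results = 0
        self.invalid = 0

    def record(self, valid: bool):
        self.results += 1
        if not valid:
            self.invalid += 1

    def v(self) -> float:
        if self.results == 0:
            return 1.0  # new hosts start fully untrusted
        return self.invalid / self.results

def should_replicate(host: HostStats, rng=random) -> bool:
    v = host.v()
    if v > K:
        return True              # unreliable host: always replicate
    return rng.random() < v / K  # reliable host: occasional spot-check
```

With this rule a host that has returned 1,000 results, 5 of them invalid, is replicated only about 5% of the time, so the aggregate throughput penalty approaches 1+ε rather than 2x.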

21 Simulators
- Scheduling policies:
  - client: when to fetch work? from which project? how much? plus CPU scheduling
  - server: which jobs to send to a given client?
- Problems with in-situ experimentation: hard to control; can do a lot of damage
- Simulators:
  - client simulator: 1 client, N projects
  - server simulator (EmBA): 1 project, N clients

22 Volunteer-facing features
- Motivators: competition, community
- Credit: cross-project statistics
- Web features: friend lists, private messages, message boards, teams
- MySpace and Facebook widgets and apps

23 Organizational models
- Single-scientist projects: a dead end?
- Campus-level meta-projects, e.g. U. of Houston: 1,000 instructional PCs; 5,000 faculty/staff; 30,000 students; 400,000 alumni
- Lattice: U. Maryland Center for Bioinformatics
- MindModeling.org: ACT-R community (~20 universities)
- IBM World Community Grid: ~8 applications from various institutions
- Extremadura (Spain): consortium of 5–10 universities
- EDGeS (SZTAKI): EGEE@home?
- Almere Grid: community grid

24 Distributed thinking
- Stardust@home, Clickworkers, Galaxy Zoo, Foldit
- What can people do better than computers?

25 New software initiatives
- Bossa: middleware for distributed thinking
  - job queueing and replication
  - volunteer skill estimation (from malicious and useless through useful to savant)
- Bolt: middleware for web-based training and education
- Shared infrastructure: "BOINC Basics" (accounts, groups, credit, communication) underlying BOINC (volunteer computing), Bolt (teaching, training), and Bossa (distributed thinking)

26 Conclusion
- Volunteer computing:
  - some big achievements, but not close to potential
  - the problems are organizational/political, not technical
  - volunteer computing + GPUs = ExaFLOPS
- Distributed thinking:
  - what are the apps?
  - what are the middleware requirements?
- Interested in either one? Let's talk! davea@ssl.berkeley.edu

