Copyright Gordon Bell Clusters & Grids The CC – GRID? Era CCGSC 2002 Gordon Bell Bay Area Research Center Microsoft Corporation.

1 The CC – GRID? Era
CCGSC 2002
Gordon Bell (gbell@microsoft.com)
Bay Area Research Center, Microsoft Corporation
Copyright Gordon Bell, Clusters & Grids

3 Observations from a mostly Grid workshop
Clusters. Let’s finish the job!
Grids generally.
Grids as arbitrary cluster platforms… why?
Examples of Grid-types, especially web services
Summary…

4 Blades aka a “cluster in a cabinet”
366 servers per 44U cabinet
  – Single processor
  – 2 - 30 GB/computer (24 TBytes)
  – 2 - 100 Mbps Ethernets
~10x perf*, power, disk, I/O per cabinet
~3x price/perf
Network services… Linux based
*42, 2 processors, 84 Ethernet, 3 TBytes

5 Clusters aren’t as bad as programs make them out to be, but we need to make them work better and be more transparent.
Everything is becoming a cluster. Certainly all of 500!
64 bit addressing will cause more change!
Future nodes should bet on CLMP smP’s (p = 4-32).
Utilize existing and emerging smP’s nodes versus assuming lcd PM-pairs & MPI.
Massive gains from compiler and runtime. ES has set a new standard of efficiency and system transparency for “clusters”.
Expand the MPI programming model:
  – Full transparency of MPI needs to be the goal
  – Objectify for greater flexibility and greater insulation from latency

6 Grids: If they are the solution, what’s the problem?
Economics… thief, scavenger, power, efficiency or resource sharing?
Research funding… that’s where the money is
Are they where the problems lie?
Does the massive collaboration that Grids enable create massive overhead and generally less output? Unless the output is for a community!
Are funding and middleware a good investment?

7 Same observations as 2000
GRID was/is an exciting concept …
  – They can/must work within a community, organization, or project. Apps need to drive.
  – “Necessity is the mother of invention.”
Taxonomy… interesting vs necessity
  – Cycle scavenging and object evaluation (e.g. seti@home, QCD)
  – File distribution/sharing for IP theft e.g. Napster
  – Databases &/or programs for a community (astronomy, bioinformatics, CERN, NCAR)
  – Workbenches: web workflow chem, bio…
  – Exchanges… many sites operating together
  – Single, large objectified pipeline… e.g. NASA.
  – Grid as a cluster platform! Transparent & arbitrary access including load balancing
Web SVCs X

8 Grid n j. An arbitrary distributed, cluster platform
A geographical and multi-organizational collection of diverse computers dynamically configured as cluster platforms responding to arbitrary, ill-defined jobs “thrown” at it.
Costs are not necessarily favorable, e.g. disks are less expensive than the cost to transfer data.
Latency and bandwidth are non-deterministic, thereby changing cluster characteristics.
Once a large body of data exists for a job, it is inherently bound to (set into) fixed resources.
Large datasets & I/O bound programs need to be with their data or be database accesses…
But are there resources there to share?
Bound to cost more?

9 Bright spots… near term, user focus, a lesson for Grid suppliers
Tony Hey apps-based funding. Web services based Grid & data orientation.
David Abramson - Nimrod.
  – Parameter scans… other low hanging fruit
  – Encapsulate apps! “Excel”-- language/control mgmt.
  – “Legacy apps are programs that users just want, and there’s no time or resources to modify code …independent of age, author, or language e.g. Java.”
Andrew Grimshaw - Avaki
  – Making Legion vision real. A reality check.
Lip 4 pairs of “web services” based apps
Gray et al Skyservice and Terraservice
Goal: providing a web service must be as easy as publishing a web page…and will occur!!!

10 SkyServer: delivering a web service to the astronomy community. Prototype for other sciences?
Gray, Szalay, et al
First paper on the SkyServer
http://research.microsoft.com/~gray/Papers/MSR_TR_2001_77_Virtual_Observatory.pdf
http://research.microsoft.com/~gray/Papers/MSR_TR_2001_77_Virtual_Observatory.doc
Later, more detailed paper for database community
http://research.microsoft.com/~gray/Papers/MSR_TR_01_104_SkyServer_V1.pdf
http://research.microsoft.com/~gray/Papers/MSR_TR_01_104_SkyServer_V1.doc

11 What can be learned from Sky Server?
It’s about data, not about harvesting flops
1-2 hr. query programs versus 1 wk programs based on grep
10 minute runs versus 3 day compute & searches
Database viewpoint. 100x speed-ups
  – Avoid costly re-computation and searches
  – Use indices and PARALLEL I/O. Read / Write >>1.
  – Parallelism is automatic, transparent, and just depends on the number of computers/disks.
Limited experience and talent to use dbases.
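The "use indices" point can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in (SkyServer itself is not built on SQLite), and the table name `objects`, the index name `idx_ra`, and the synthetic rows are all hypothetical. The query plan is printed before and after the index is created, so the change from a full scan to an index search is visible.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's textual query plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the human-readable detail.
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objects (id INTEGER PRIMARY KEY, ra REAL, dec REAL, mag REAL)"
)
# Synthetic rows standing in for a sky catalog (hypothetical data).
conn.executemany(
    "INSERT INTO objects (ra, dec, mag) VALUES (?, ?, ?)",
    [
        ((i * 37) % 3600 / 10.0, (i * 17) % 1800 / 10.0 - 90.0, 15.0 + (i % 50) / 10.0)
        for i in range(10000)
    ],
)

SQL = "SELECT id, mag FROM objects WHERE ra BETWEEN ? AND ?"
WINDOW = (120.0, 121.0)

# Without an index, every row has to be examined (the "grep" approach).
plan_before = query_plan(conn, SQL, WINDOW)
rows_before = sorted(conn.execute(SQL, WINDOW).fetchall())

# With an index on ra, only the matching range is visited.
conn.execute("CREATE INDEX idx_ra ON objects (ra)")
plan_after = query_plan(conn, SQL, WINDOW)
rows_after = sorted(conn.execute(SQL, WINDOW).fetchall())

print("without index:", plan_before)
print("with index:   ", plan_after)
print("rows returned:", len(rows_after))
```

Both runs return the same rows; only the access path changes, which is the sense in which the speed-up comes from avoiding searches rather than from more flops.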

12 Heuristics for building communities that need to share data & programs
Always go from working to working
Do it by induction in time and space (Why version 3 is pretty good.)
Put ONE database in place that’s useful by itself in terms of UI, content, & queries
Invent and demo 10-20 instances of use
Get two working in a single location
Extend to include a second community, with an appropriate superset capability

13 Some science is hitting a wall: FTP and GREP are not adequate (Jim Gray)
You can GREP 1 GB in a minute
You can GREP 1 TB in 2 days
You can GREP 1 PB in 3 years.
You can FTP 1 MB in 1 sec.
You can FTP 1 GB / min.
… 2 days and 1K$
… 3 years and 1M$
1PB ~10,000 >> 1,000 disks
At some point you need indices to limit search, parallel data search and analysis
Goal using dbases. Make it easy to
  – Publish: Record structured data
  – Find data anywhere in the network. Get the subset you need!
  – Explore datasets interactively
Database becomes the file system!!!
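The scaling behind the GREP figures can be sketched as simple arithmetic. The sketch assumes decimal units, a constant scan rate taken from the slide's "1 GB in a minute", and perfectly even splitting across disks; the function name `scan_seconds` is illustrative only.

```python
GB, TB, PB = 10**9, 10**12, 10**15
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

def scan_seconds(size_bytes, bytes_per_second, n_disks=1):
    """Time for one full sequential scan, split evenly across n_disks."""
    return size_bytes / (bytes_per_second * n_disks)

RATE = GB / 60  # assumed scan rate: the slide's "1 GB in a minute"

tb_days = scan_seconds(TB, RATE) / SECONDS_PER_DAY
pb_years = scan_seconds(PB, RATE) / SECONDS_PER_YEAR
pb_days_1000_disks = scan_seconds(PB, RATE, n_disks=1000) / SECONDS_PER_DAY

print(f"1 TB, one disk:    {tb_days:.2f} days")
print(f"1 PB, one disk:    {pb_years:.2f} years")
print(f"1 PB, 1,000 disks: {pb_days_1000_disks:.2f} days")
```

A strictly linear extrapolation gives about 0.7 days per TB and about 1.9 years per PB, the same scale as the slide's rounded "2 days" and "3 years". Spreading the scan over 1,000 disks brings a PB back to under a day, which is the case for parallel search, while an index avoids most of the scan altogether.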

14 Network concerns
Very high cost
  – $(1 + 1) / GByte to send on the net; Fedex and 160 GByte shipments are cheaper
  – DSL at home is $0.15 - $0.30
Disks cost less than $2/GByte to purchase
Low availability of fast links (last mile problem)
  – Labs & universities have DS3 links at most, and they are very expensive
  – Traffic: Instant messaging, music stealing
Performance at desktop is poor
  – 1 - 10 Mbps; very poor communication links
Manage: trade-in fast links for cheap links!!
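To put the link speeds and the $(1 + 1)/GByte figure in perspective, here is a rough sketch of what moving 1 TByte over the net would take. It assumes decimal units, a fully utilized link with no protocol overhead, and the standard DS3 line rate of 44.736 Mbps; the function names are illustrative only.

```python
TB = 10**12
SECONDS_PER_DAY = 86_400

def transfer_days(size_bytes, link_mbps):
    """Days to move size_bytes over a fully utilized link (no overhead assumed)."""
    bits = size_bytes * 8
    return bits / (link_mbps * 1e6) / SECONDS_PER_DAY

def network_cost_dollars(size_bytes, dollars_per_gbyte=2.0):
    """Cost at the slide's $(1 + 1) per GByte, read as $1 at each end."""
    return size_bytes / 1e9 * dollars_per_gbyte

for label, mbps in (("desktop, 1 Mbps", 1), ("desktop, 10 Mbps", 10), ("DS3", 44.736)):
    print(f"{label:>17}: {transfer_days(TB, mbps):6.1f} days per TByte")

print(f"network cost per TByte: ${network_cost_dollars(TB):,.0f}")
```

Under these assumptions a TByte takes roughly 93 days at 1 Mbps, 9 days at 10 Mbps, and 2 days on a dedicated DS3, at about $2,000 in network charges, which is comparable to the cost of buying the disks outright at under $2/GByte.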

15 Gray’s $2.4 K, 1 TByte Sneakernet aka Disk Brick
Cost to move a Terabyte
Cost, time, and speed to move a Terabyte
Cost of a “Sneaker-Net” TB
We now ship NTFS/SQL disks.
Not good format for Linux.
Ship NFS/CIFS/ODBC servers (not disks).
Plug “disk” into LAN.
DHCP then file or DB serve…
Web Service in long term
Courtesy of Jim Gray, Microsoft Bay Area Research

16 Cost to move a Terabyte

17 Cost, time of Sneaker-net vs Alts
Media        Robot$   Media$   TB read + write time   ship time   Total Time/TB   Mbps   Cost (10 TB)   $/TB shipped
CD 1500      2x800    240      60 hrs                 24 hrs      6 days          28     $2 K           $208
DVD 200      2x8K     400      60 hrs                 24 hrs      6 days          28     $20 K          $2,000
Tape 25      2x15K    1000     92 hrs                 24 hrs      5 days          18     $31 K          $3,100
DiskBrick 7  1K       1,400    19 hrs                 24 hrs      2 days          52     $2.6 K         $260
Courtesy of Jim Gray, Microsoft Bay Area Research
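The Mbps column appears to be the effective bandwidth of shipping one TByte, that is, the data volume divided by read + write time plus ship time. The sketch below checks that reading against the table. It assumes decimal units (1 TByte = 8 x 10^12 bits), and the dictionary simply restates the table's hours and Mbps values.

```python
TB_BITS = 8 * 10**12  # one TByte in bits, decimal units assumed

# media: (read + write hours, ship hours, Mbps as listed in the table)
ROWS = {
    "CD": (60, 24, 28),
    "DVD": (60, 24, 28),
    "Tape": (92, 24, 18),
    "DiskBrick": (19, 24, 52),
}

def effective_mbps(read_write_hours, ship_hours):
    """Effective bandwidth of moving 1 TByte, counting copy and shipping time."""
    seconds = (read_write_hours + ship_hours) * 3600
    return TB_BITS / seconds / 1e6

for media, (rw_hours, ship_hours, listed) in ROWS.items():
    computed = effective_mbps(rw_hours, ship_hours)
    print(f"{media:>9}: computed {computed:5.1f} Mbps, table lists {listed}")
```

The computed values land within about 10% of the listed ones, so the disk brick's advantage comes mainly from its much shorter read + write time, since the 24-hour ship time is the same for every medium.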

18 Grids: Real and “personal”
Two carrots, one downside. A bet.
Bell will match any Gordon Bell Prize (parallelism, performance, or performance/cost) winner’s prize that is based on “Grid Platform Technology”.
I will bet any individual or set of individuals of the Grid Research community up to $5,000 that a Grid application will not win the above by SC2005.

19 The End
How can GRIDs become a real, useful, computer structure?
Get a life. Adopt an application community!
Success if CCGSC2004 is the last …by making Grids ubiquitous.

