Microsoft Large Databases and Grid Computing Jim Gray Microsoft Research Presentation to Kaiser.


1 Microsoft Large Databases and Grid Computing Jim Gray Microsoft Research Gray@Microsoft.com http://research.Microsoft.com/~gray Presentation to Kaiser Information Management Briefing 21 May 2003

2 About me in Microsoft research (located in San Francisco) A database researcher –IBM, Tandem, DEC, Microsoft Work on Scalable Systems –Building supercomputers from commodity components. Do academic/government things too –PITAC, GriPhyn TAB, NSF/CISE, Library of Congress, … For the last 4 years, been working with the astronomy community to build the World Wide Telescope.

3 Agenda TerraServer –What it is –What we learned –What we are doing now. SkyServer / WWT –What it is –What we learned –What we are doing now Grid Computing –General comments –Build a web service

4 TerraServer TerraService.net A photo of the United States –1 meter resolution (photographic/topographic) –USGS data –Some demographic data (BestPlaces.net) –Home sales data –Linked to Encarta Encyclopedia 15 TB raw, 6 TB cooked (grows 10GB/w) Point, pan, zoom interface Among top 1,000 websites –40k visitors/day –4M queries/day –3 B page views (in 5 years) All in an SQL database

5 TerraServer Statistics
[Timeline figure with milestones June 98, Jan 99, Jan 00, May 00, Sept 01, Dec 02. Software: SQL 7.0, later SQL 2000. Database size labels range from .75 TB to 2.0 TB. Row count labels range from 173 m to 900 m rows. Hardware: 1 server / Win NT 4.0 EE, then 2nd server / Win 2k DataCenter, then 4-node Win2k Datacenter failover cluster.]
                 Daily Average    June 1998 - Oct 2002
Unique Users            40,011              63,656,904
Page Views           1,266,838           2,015,539,605
Image Tiles          3,735,789           5,943,641,024
Db Queries           4,484,089           7,134,186,170
Bytes Xfered             70 gb                   108tb
Peak Day: 277,292 unique users; 163 gb transferred; other peak-day counts 12,388,104, 10,475,674 and 2,401,209.

6 TerraServer Cluster SQL\Inst1, SQL\Inst2, SQL\Inst3, Spare One SQL database per rack Each rack contains 4.5 TB 1 rack not in picture 18.0 TB total Meta Data: stored on 101 GB fast, small disks (18 x 18.2 GB) Imagery Data: stored on 4 339 GB slow, big disks (15 x 73.8 GB) Added 90 72.8 GB disks in Feb 2001 to create 18 TB SAN 8 Compaq DL360 Photon Web Servers Fiber SAN Switches 4 Compaq ProLiant 8500 Db Servers

7 Cluster Configuration Compaq SAN switch by Brocade Communications; Compaq StorageWorks MA8000/HSG80 Controllers (3); Compaq ProLiant 8500 (4); 100-Mbps Ethernet; Internet; Gigabit Ethernet; Microsoft Corporate LAN; Extreme Networks Summit 48 Switch; Summit 7i Switch (2); Cisco 12000 Internet Router; Compaq DL360 (6) (Windows 2000 Web Servers); TerraServer.microsoft.com; Compaq DL360 (10); Database Cluster; ADIC LTO Tape Library; TerraServer SAN

8 TerraServer Becomes a Web Service TerraServer.net -> TerraService.Net Web server is for people. Web Service is for programs –The end of screen scraping –No faking a URL: pass real parameters. –No parsing the answer: data formatted into your address space. Hundreds of users but a specific example: –US Department of Agriculture

9 And now.. 4 slides from the customer who built a portal using TerraService

10 Data Gateway Functional Overview Navigation Service Catalog Service Ship Service > Item Broker Customer Orders Data XML Order Placer Listen for OrderPlacer Raised Event Select sequenced Item Output XML raise event : stats.delivery start validate (dtd) Insert into SQL @@Identity / GUID to client return est time raise OrderMgr.event Order Database Selects from XML Request for data Logger Called by anyone raises to stats svc ASP XML Soil Data Viewer Geospatial Data Acknowledges item ready for delivery Data Services Package Service Send order info FTP Services Rimage CD Service Product Catalog Updates Billing Services NCGC - Fort Worth, Texas; ITC - Fort Collins, Colorado Terra Service

11 Custom End Product Web Soil Data Viewer; XML Soil Report; Soil Interpretation Map

12 ESRI Spatial Data Engine WebSDV ArcIMS Connector Connects to ArcIMS; communication is done through ArcIMS XML (AXL) Retrieves and processes Soils Data from the NASIS relational Database Image Retriever IMSNavigator Generates maps (JPGs) using ArcIMS Retrieves imagery from the Microsoft TerraServer Terraserver Geospatial Data Business Rules National Soils Data Database Server - Microsoft SQL Server Database Server - ESRI Spatial Data Server Web Server - COM+ Applications Microsoft Terraserver

13 Brief tour of TerraService Show map service Show some methods See TerraService.NET: An Introduction to Web Services, Tom Barclay; Jim Gray; Eric Strand; Steve Ekblad; Jeffrey Richter, MSR TR 2002-53, pp 13, June 2002

14 What We Learned You can build and manage a very popular website with relatively little effort (if you do it right and have Tom Barclay) Loading 20 TB takes a lot of energy And you get to do it many times, so automate Tape and tape software are problematic Triplex and snap-shot disks work (we have never had to use them, but..) The Internet gives you 2 9s Servers can run at 4 9s easily, 5 9s with effort.
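The "9s" shorthand on this slide maps directly to allowed downtime. A minimal sketch, assuming that N nines means an availability of 1 - 10^-N and a 365-day year (the function name is illustrative, not from the talk):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: 365-day year


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of `nines` nines (2 -> 99%)."""
    unavailability = 10 ** (-nines)
    return MINUTES_PER_YEAR * unavailability


# The three levels mentioned on the slide: Internet (2), servers (4), servers with effort (5).
for n in (2, 4, 5):
    print(f"{n} nines: {downtime_minutes_per_year(n):.1f} minutes/year")
```

Under these assumptions, 2 nines allows roughly 3.7 days of downtime a year, 4 nines under an hour, and 5 nines a little over five minutes.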

15 What we are doing now. Building with 3K$ 2TB bricks 4 bricks = 1 backend Triplexing systems Duplexing sites. 4*3*2 = 24k$ for Geoplex Very simple operations model See: TeraScale SneakerNet: Using Inexpensive Disks for Backup, Archiving, and Data Exchange, Jim Gray; Wyman Chong; Tom Barclay; Alex Szalay; Jan Vandenberg, pp. 1-8, May 2002

16 Agenda TerraServer –What it is –What we learned –What we are doing now. SkyServer / WWT –What it is –What we learned –What we are doing now Grid Computing –General comments –Build a web service

17 SkyServer SkyServer.SDSS.org Like the TerraServer, but looking the other way: a picture of ¼ of the universe Pixels + Data Mining Astronomers get about 400 attributes for each object Get Spectrograms for 1% of the objects

18 Why Astronomy Data? It has no commercial value –No privacy concerns –Can freely share results with others –Great for experimenting with algorithms It is real and well documented –High-dimensional data (with confidence intervals) –Spatial data –Temporal data Many different instruments from many different places and many different times Federation is a goal The questions are interesting –How did the universe form? There is a lot of it (petabytes) [Image panels: IRAS 100, ROSAT ~keV, DSS Optical, 2MASS 2, IRAS 25, NVSS 20cm, WENSS 92cm, GB 6cm]

19 Demo of SkyServer Shows standard web server Pixel/image data Point and click Explore one object Explore sets of objects (data mining)

20 Virtual Observatory http://www.astro.caltech.edu/nvoconf/ http://www.voforum.org/ Premise: Most data is (or could be) online So, the Internet is the world's best telescope: –It has data on every part of the sky –In every measured spectral band: optical, x-ray, radio.. –As deep as the best instruments (2 years ago). –It is up when you are up. The seeing is always great (no working at night, no clouds, no moons, no..). –It's a smart telescope: links objects and data to literature on them.

21 Time and Spectral Dimensions The Multiwavelength Crab Nebulae X-ray, optical, infrared, and radio views of the nearby Crab Nebula, which is now in a state of chaotic expansion after a supernova explosion first sighted in 1054 A.D. by Chinese Astronomers. Slide courtesy of Robert Brunner @ CalTech. Crab star 1053 AD

22 Federation Data Federations of Web Services Massive datasets live near their owners: –Near the instruments software pipeline –Near the applications –Near data knowledge and curation –Super Computer centers become Super Data Centers Each Archive publishes a web service –Schema: documents the data –Methods on objects (queries) Scientists get personalized extracts Uniform access to multiple Archives –A common global schema

23 Grid and Web Services Synergy I believe the Grid will be many web services that share data (computrons are free) IETF standards provide –Naming –Authorization / Security / Privacy –Distributed Objects: Discovery, Definition, Invocation, Object Model –Higher level services: workflow, transactions, DB,.. Synergy: commercial Internet & Grid tools

24 Web Services: The Key? Web SERVER: –Given a url + parameters –Returns a web page (often dynamic) Web SERVICE: –Given an XML document (soap msg) –Returns an XML document –Tools make this look like an RPC. F(x,y,z) returns (u, v, w) –Distributed objects for the web. –+ naming, discovery, security,.. Internet-scale distributed computing [Diagram: your program gets a web page from a Web Server over http; your program gets data in its address space from a Web Service as a soap object in xml]
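The "XML document in, XML document out, looks like an RPC" idea can be sketched offline with Python's standard library. This is only an illustration: the service namespace, the method name F and the response element FResponse are hypothetical, and no real service is contacted. The envelope namespace is the standard SOAP 1.1 one.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SVC_NS = "http://example.org/demo"  # hypothetical service namespace


def build_message(element: str, fields: dict) -> bytes:
    """Wrap named fields in a SOAP envelope: Envelope/Body/<element>/<field>."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SVC_NS}}}{element}")
    for name, value in fields.items():
        ET.SubElement(call, f"{{{SVC_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_message(xml_bytes: bytes) -> dict:
    """Return the fields of the first element inside the SOAP Body as a dict."""
    root = ET.fromstring(xml_bytes)
    body = root.find(f"{{{SOAP_NS}}}Body")
    payload = body[0]  # first child of Body carries the call or the result
    return {child.tag.split("}")[1]: child.text for child in payload}


# Request for the slide's F(x, y, z); a real client would POST this over http.
request = build_message("F", {"x": 1, "y": 2, "z": 3})

# Canned reply standing in for the service's answer (u, v, w).
reply = build_message("FResponse", {"u": 4, "v": 5, "w": 6})
print(parse_message(reply))
```

Client toolkits generate this wrapping and unwrapping from the service description, which is why the caller sees an ordinary function call and typed results instead of a page to scrape.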

25 SkyQuery: a prototype Defining Astronomy Objects and Methods. Federated 3 Web Services (fermilab/sdss, jhu/first, Cal Tech/dposs) multi-survey cross-match Distributed query optimization (T. Malik, T. Budavari, Alex Szalay @ JHU) http://SkyQuery.net/ My first web service (cutout + annotated SDSS images) online –http://skyservice.pha.jhu.edu/devel/ImgCutout/chart.asp WWT is a great Web Services (.Net) application –Federating heterogeneous data sources. –Cooperating organizations –An Information At Your Fingertips challenge.

26 Demo of Image Cutout Service Shows image cutout Show project and debugging project Show hello World Show theAnswer method

27 SkyQuery (http://skyquery.net/) Distributed Query tool using a set of services Feasibility study, built in 6 weeks from scratch –Tanu Malik (JHU CS grad student) –Tamas Budavari (JHU astro postdoc) Implemented in C# and .NET Allows queries like:
    SELECT o.objId, o.r, o.type, t.objId
    FROM SDSS:PhotoPrimary o, TWOMASS:PhotoPrimary t
    WHERE XMATCH(o,t)<3.5 AND AREA(181.3,-0.76,6.5) AND o.type=3 and (o.I - t.m_j)>2

28 SkyNode Basic Web Services Metadata information about resources –Waveband –Sky coverage –Translation of names to universal dictionary (UCD) Simple search patterns on the resources –Cone Search –Image mosaic –Unit conversions Simple filtering, counting, histogramming On-the-fly recalibrations

29 Portals: Higher Level Services Built on Atomic Services Perform more complex tasks Examples –Automated resource discovery –Cross-identifications –Photometric redshifts –Outlier detections –Visualization facilities Goal: –Build custom portals in days from existing building blocks (like today in IRAF or IDL)

30 Architecture Image cutout SkyNode SDSS SkyNode 2Mass SkyNode First SkyQuery Web Page

31 Summary So Far Some real web services deployed today Easy to build & deploy Services publish data, Portals unify it Tools really work! I'm using C# and the foundation classes of Visual Studio, a great tool! A nice book explaining the ideas: (.Net Framework Essentials, Thai, Lam isbn 0-596-00302-1)

32 Possible Relevance to You This web service stuff is REAL If you have a class, it is a way to publish data: Internet, Intranet It is a way to find data: data comes with schema, no more screen scraping/parsing Business model unclear –Your ideas go here. [Diagram: your program gets data in its address space from a Web Service as a soap object in xml]

33 What We Learned Web services really are a breakthrough. Data mining worked beautifully. See Data Mining the SDSS SkyServer Database, J. Gray, D. Slutz, A. Szalay, A. Thakar, P. Kuntz, C. Stoughton, MSR TR 2002-1, pp 1-40, 2002. You can operate a system in Chicago from San Francisco – Terminal Server is wonderful. The Internet gives you 2 9s of availability TeraScale SneakerNet works well

34 What we are doing now. Loading more data (next data release) Preparing for the next generation Building the WWT See: Web Services for the Virtual Observatory, Alexander S. Szalay, Tamás Budavári, Tanu Malik, Jim Gray, and Ani Thakar, SPIE Astronomy Telescopes and Instruments, 22-28 August 2002, Waikoloa, Hawaii. Petabyte Scale Data Mining: Dream or Reality?, Alexander S. Szalay; Jim Gray; Jan vandenBerg, SPIE Astronomy Telescopes and Instruments, 22-28 August 2002, Waikoloa, Hawaii. Online Scientific Data Curation, Publication, and Archiving, Jim Gray; Alexander S. Szalay; Ani R. Thakar; Christopher Stoughton; Jan vandenBerg, SPIE Astronomy Telescopes and Instruments, 22-28 August 2002, Waikoloa, Hawaii.

35 Agenda TerraServer –What it is –What we learned –What we are doing now. SkyServer / WWT –What it is –What we learned –What we are doing now Grid Computing –General comments –Build a web service

36 The Grid Computation Grid: harvest Internet cpus. Data Grid: Share files Application Grid: Web services Access Grid: teleconferencing

37 The Microsoft View Web Services will subsume the Grid –The Grid will be data and services not renting cycles OGSA: evolution of Globus Toolkit to Web services concepts and technologies… Lots of encouragement from Microsoft, IBM, Oracle, Sun GGF as forum for discussion

38 Engagement with Grid Community Goal: GXA as infrastructure for Grids Working with Globus & GGF –Funding work at Argonne National Lab (Globus) –Globus Toolkit 3, and CondorG on Windows http://www.globus.org/win-alpha/ (we sponsored this) –OGSA for .NET (prototyping) http://www.globus.org/ogsa/ –Also OGSI.NET at U. VA is very interesting http://www.cs.virginia.edu/~gsw2c/ogsi.net.html –GGF active membership HPC.net kit – see http://www.microsoft.com/HPC –Part of .net server scale out development –Includes MPI-CH 1.2.4, distributed job scheduler,… –Thomas Sterling, Beowulf on Windows, MIT Press 2001

39 What's Microsoft Doing Mostly .NET, W3C standards, web services, … I think SkyQuery is the best web service (grid app) in GriPhyN today. My stuff is grid computing But… Globus (GT3), OGSA, and CondorG ported to Windows (we sponsored it) We have a HPC toolkit: MPI-CH 1.2.4 See http://www.microsoft.com/windows2000/hpc/ for many useful links

40 I Can Talk About Computing on Demand But… Best to read Distributed Computing Economics, Jim Gray, MSR-TR-2003-24, March 2003 The slides that follow are based on that paper.

41 Distributed Computing Economics Why is Seti@Home a great idea? Why is Napster a great deal? Why is the Computational Grid uneconomic? When does computing on demand work? What is the right level of abstraction? Is the Access Grid the real killer app?

42 Computing is Free Computers cost 1k$ (if you shop right) So 1 cpu day == 1$ If you pay the phone bill (and I do) Internet bandwidth costs 50 … 500$/mbps/m (not including routers and management). So 1GB costs 1$ to send and 1$ to receive
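The two round numbers on this slide can be checked with a few lines of arithmetic. A minimal sketch, assuming a 3-year machine lifetime, a 30-day billing month and a fully used link; none of these assumptions is stated on the slide, and the function names are illustrative:

```python
def dollars_per_cpu_day(price_dollars: float, lifetime_years: float = 3.0) -> float:
    """Purchase price spread evenly over the machine's assumed lifetime."""
    return price_dollars / (lifetime_years * 365)


def dollars_per_gb(dollars_per_mbps_month: float) -> float:
    """Cost of moving 1 GB over a link rented per Mbps per month, fully used."""
    seconds_per_month = 30 * 24 * 3600
    gb_per_mbps_month = 1e6 * seconds_per_month / 8 / 1e9  # 324 GB
    return dollars_per_mbps_month / gb_per_mbps_month


print(f"1k$ computer: {dollars_per_cpu_day(1000):.2f} $/cpu-day")
for rent in (50, 500):
    print(f"{rent} $/mbps/month: {dollars_per_gb(rent):.2f} $/GB")
```

With these assumptions a 1k$ machine comes to a little under 1$ per day, and a fully used link costs between roughly 0.15$ and 1.5$ per GB, so 1$/GB is a round figure of the right order.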

43 Why is Seti@Home a Good Deal? Sending 300 KB costs 3e-4$ User computes for ½ day: benefit 5e-1$ ROI: 1500:1

44 Why is Napster a Good Deal? Send 5 MB costs 5e-3$ ½ a penny per song Both sender and receiver can afford it. Same logic powers web sites (Yahoo!...): –1e-3$/page view advertising revenue –1e-5$/page view cost of serving web page –100:1 ROI
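The Seti@Home and Napster figures follow from the unit prices on the "Computing is Free" slide (1$ per GB sent, 1$ per cpu day). A small sketch of that arithmetic; the constant and function names are illustrative:

```python
DOLLARS_PER_GB = 1.0       # from the "Computing is Free" slide
DOLLARS_PER_CPU_DAY = 1.0  # from the "Computing is Free" slide


def send_cost(num_bytes: float) -> float:
    """Cost in dollars of sending num_bytes over the Internet."""
    return num_bytes / 1e9 * DOLLARS_PER_GB


def roi(benefit: float, cost: float) -> float:
    """Return on investment expressed as benefit per dollar of cost."""
    return benefit / cost


seti_roi = roi(0.5 * DOLLARS_PER_CPU_DAY, send_cost(300e3))  # half a cpu day per 300 KB
napster_song_cost = send_cost(5e6)                           # one 5 MB song
web_page_roi = roi(1e-3, 1e-5)                               # ad revenue vs serving cost

print(f"Seti@Home ROI about {seti_roi:.0f}:1")
print(f"Napster song costs {napster_song_cost:.3f} $")
print(f"Web page ROI about {web_page_roi:.0f}:1")
```

The unrounded Seti@Home ratio comes out near 1,700:1, the same order as the slide's 1500:1.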

45 The Cost of Computing: Computers are NOT free! Capital Cost of a TpcC system is mostly storage and storage software (database) IBM 32 cpu, 512 GB ram 2,500 disks, 43 TB (680,613 tpmC @ 11.13 $/tpmc available 11/08/03) http://www.tpc.org/results/individual_results/IBM/IBMp690es_05092003.pdf A 7.5M$ super-computer Total Data Center Cost: 40% capital & facilities 60% staff (includes app development)

46 Computing Equivalents 1 $ buys: –1 day of cpu time –4 GB ram for a day –1 GB of network bandwidth –1 GB of disk storage –10 M database accesses –10 TB of disk access (sequential) –10 TB of LAN bandwidth (bulk)

47 Some consequences Beowulf networking is 10,000x cheaper than WAN networking: factors of 10^5 matter. The cheapest and fastest way to move a Terabyte cross country is sneakernet. 24 hours = 4 MB/s 50$ shipping vs 1,000$ wan cost. Sending 10PB CERN data via network is silly: buy disk bricks in Geneva, fill them, ship them – one way. TeraScale SneakerNet: Using Inexpensive Disks for Backup, Archiving, and Data Exchange, Jim Gray; Wyman Chong; Tom Barclay; Alex Szalay; Jan vandenBerg, Microsoft Technical Report, May 2002, MSR-TR-2002-54 http://research.microsoft.com/research/pubs/view.aspx?tr_id=569

48 How Do You Move A Terabyte?
Context       Speed Mbps   Rent $/month   $/Mbps   $/TB Sent   Time/TB
Home phone          0.04             40    1,000       3,086   6 years
Home DSL             0.6             70      117         360   5 months
T1                   1.5          1,200      800       2,469   2 months
T3                    43         28,000      651       2,010   2 days
OC3                  155         49,000      316         976   14 hours
100 Mbps             100                                       1 day
Gbps                1000                                       2.2 hours
OC 192              9600      1,920,000      200         617   14 minutes
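The Time/TB and $/TB Sent columns can be recomputed from the speed and monthly rent. A minimal sketch, assuming 1 TB = 10^12 bytes, a 30-day billing month and a link kept fully busy (assumptions chosen here because they reproduce the slide's figures; the names are illustrative):

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day billing month
BITS_PER_TB = 8e12                  # assumption: 1 TB = 10^12 bytes


def seconds_to_send_tb(speed_mbps: float) -> float:
    """Time to push one terabyte through a link running flat out."""
    return BITS_PER_TB / (speed_mbps * 1e6)


def dollars_per_tb(speed_mbps: float, rent_per_month: float) -> float:
    """Share of the monthly rent used up while sending one terabyte."""
    return rent_per_month * seconds_to_send_tb(speed_mbps) / SECONDS_PER_MONTH


# (context, speed in Mbps, rent in $/month) taken from the table above
links = [
    ("Home phone", 0.04, 40),
    ("Home DSL", 0.6, 70),
    ("T1", 1.5, 1_200),
    ("T3", 43, 28_000),
    ("OC3", 155, 49_000),
    ("OC 192", 9600, 1_920_000),
]

for name, mbps, rent in links:
    days = seconds_to_send_tb(mbps) / 86400
    print(f"{name:10s} {days:10.3f} days  {dollars_per_tb(mbps, rent):8.0f} $/TB")
```

The computed costs match the table to the dollar, which suggests the slide was built from the same formula.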

49 Computational Grid Economics To the extent that the computational grid is like Seti@Home or ZetaNet or Folding@home or… it is a great thing. To the extent that the computational grid is MPI or data analysis, it fails on economic grounds: move the programs to the data, not the data to the programs. The Internet is NOT the cpu backplane. The USG should not hide this economic fact from the academic/scientific research community.
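The "move the programs to the data" argument can be turned into a break-even test using the unit prices from the Computing Equivalents slide (1$ buys 1 cpu day or 1 GB of network bandwidth). This is a sketch of the slide's reasoning with its round numbers, not the paper's exact model; the function names and the two example workloads are illustrative:

```python
CPU_SECONDS_PER_DOLLAR = 24 * 3600  # 1$ buys 1 day of cpu time
WAN_BYTES_PER_DOLLAR = 1e9          # 1$ buys 1 GB of network bandwidth


def network_cost(bytes_moved: float) -> float:
    """Dollars spent moving the job's data across the Internet."""
    return bytes_moved / WAN_BYTES_PER_DOLLAR


def cpu_value(cpu_seconds: float) -> float:
    """Dollar value of the remote cpu time the job consumes."""
    return cpu_seconds / CPU_SECONDS_PER_DOLLAR


def worth_shipping(bytes_moved: float, cpu_seconds: float) -> bool:
    """True when the cpu time obtained is worth more than the network bill."""
    return cpu_value(cpu_seconds) > network_cost(bytes_moved)


# Break-even: cpu seconds needed per megabyte moved.
break_even_seconds_per_mb = CPU_SECONDS_PER_DOLLAR / (WAN_BYTES_PER_DOLLAR / 1e6)

print(f"break-even: {break_even_seconds_per_mb:.1f} cpu seconds per MB")
print("Seti-like job (300 KB, half a cpu day):", worth_shipping(300e3, 12 * 3600))
print("Data scan (1 TB, one cpu hour):", worth_shipping(1e12, 3600))
```

Under these prices a job needs on the order of a minute and a half of cpu time per megabyte moved before remote cycles pay for the network traffic, which a Seti-like job clears easily and a bulk data scan does not.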

50 Computing on Demand Was called outsourcing / service bureaus in my youth. CSC and IBM did it. Payroll is the standard outsource. Now we have Hotmail, Salesforce.com, Oracle.com,…. Works for standard apps. Airlines outsource reservations. Banks outsource ATMs. But Amazon, Amex, Wal-Mart,... can't outsource their core competence. So, COD works for commoditized services. It is not a new way of doing things: think payroll.

51 What's the right abstraction level for Internet Scale Distributed Computing? Disk block? No, too low. File? No, too low. Database? No, too low. Application? Yes, of course. –Blast search –Google search –Send/Get eMail –Portals that federate astronomy archives (http://skyQuery.Net/) Web Services (.NET, EJB, OGSA) give this abstraction level.

52 Access Grid Q: What comes after the telephone? A: eMail? A: Instant messaging? Both seem retro technology: text & emoticons. Access Grid could revolutionize human communication. But, it needs a new idea.

53 Distributed Computing Economics Why is Seti@Home a great idea? Why is Napster a great deal? Why is the Computational Grid uneconomic? When does computing on demand work? What is the right level of abstraction? Is the Access Grid the real killer app? Based on: Distributed Computing Economics, Jim Gray, Microsoft Tech report, March 2003, MSR-TR-2003-24 http://research.microsoft.com/research/pubs/view.aspx?tr_id=655

54 Agenda TerraServer –What it is –What we learned –What we are doing now. SkyServer / WWT –What it is –What we learned –What we are doing now Grid Computing –General comments –Build a web service

