1 Tom Barclay, Jim Gray, Don Slutz, Greg Smith, many others. Microsoft Research, SPIN-2

2 Scaleup - Big Database
• Build a 1 TB SQL Server database
• Data must be
  – 1 TB
  – Unencumbered
  – Interesting to everyone everywhere
  – And not offensive to anyone anywhere
• Loaded
  – 1.1 M place names from Encarta World Atlas
  – 1 M sq km from USGS (1 meter resolution)
  – 2 M sq km from Russian Space Agency (2 m)
• Will be on web (world's largest atlas)
• Sell images with commerce server.

3 What's a Terabyte?
1 Terabyte is:
  1,000,000,000 business letters    150 miles of book shelf
  100,000,000 book pages            15 miles of book shelf
  50,000,000 FAX images             7 miles of book shelf
  10,000,000 TV pictures (mpeg)     10 days of video
  4,000 LandSat images              16 earth images (100m)
Library of Congress (in ASCII) is 25 TB
1980: 200 M$ of disc (10,000 discs); 5 M$ of tape silo (10,000 tapes)
1998: 100 K$ of magnetic disc (60 discs); 50 K$ nearline tape (30 tapes)
Terror Byte !!

4 Some Other Terror-Byte Databases
Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta
• TerraServer
• Sloan Digital Sky Survey:
  – 40 TB raw, 2 TB cooked
• EOS/DIS (picture of planet each week)
  – 15 PB by 2007
• Federal Reserve Clearing house: images of checks
  – 15 PB by 2006 (7 year history)
• Nuclear Stockpile Stewardship Program
  – 10 Exabytes (???!!)

5 TerraServer is:
1. An on-line demo and sales tool directed at IT customers and ISVs
2. A test of the Sphinx VLDB features:
  – Load performance
  – Online Backup/Restore
  – Query Performance
3. A "cool 90s app"
  – Image and Text data
  – Web-lication
  – Electronic Commerce
"A shameless advertisement of WNT and SQL Server Scalability"

6 Application Requirements
• BIG: 1 TB of data.
• PUBLIC: available on the world wide web.
• INTERESTING: to a wide audience
• ACCESSIBLE: using standard browsers (IE, Netscape)
• REAL: a real application (users can buy imagery)
• FREE: cannot require NDA or money to access
• FAST: impress customers for BackOffice, StorageWorks
• EASY: inexpensive to develop, deploy, and maintain

7 Project Partners: Motivation
• Distribute DOQs to a wider audience; lower cost of distribution
• SPIN-2: Demo scope & quality of Spin-2 imagery; open new markets for imagery sales
• Demo DEC Alpha & StorageWorks™ scalability; recognized as superior h/w vendor
• Demo scalability of NT & SQL Server

8 Database & App UI
• Coverage: Range from 70ºN to 70ºS; 35% U.S., 1% outside U.S.
• Source Imagery:
  – 3.5 TB, 1 sq meter/pixel Aerial (USGS: 60,000 files, 46 Mb B&W to 151 Mb Color IR)
  – 700 GB, 1.56 meter/pixel Satellite (Spin-2: 2400 300 Mb B&W)
• Display Imagery: 80 m 225 x 150 pixel images, 1.6 m x 3 sub-sampled views
• Nav Tools:
  – 1.5 m place names
  – "Click-on" Coverage map
  – Expedia & Virtual Globe map
• Concept: User navigates an 'almost seamless' image of earth
Figure labels: 1.8x1.2km 32m "city view"; 1.8x1.2km 16m thumbnail; 1.8x1.2km 8m browse; 225x150m tile

9 TerraServer Demo
• Intranet Beta Sites:
  – http://terraweb1
  – http://terraweb2
• Internal Beta Schedule
  – Mon April 27 - June 23

10 What Microsoft & DEC Contribute
• Microsoft's contribution:
  – Build an "internet UI"
  – Design the app and the database
  – Slice & Dice & Load the data.
  – Build "electronic stores" for USGS and Aerial Images to operate, to sell & distribute images
  – Run a "robust" web site for 18 months
• Digital contribution:
  – Provide high-performance processors
  – Provide high-capacity, reliable storage.
  – Provide technical advice

11 World's Largest PC!
  – 324 disks (2.4 TB)
  – 8 x 440 MHz Alpha CPU
  – 10 GB RAM

12 Site Configuration
Figure labels: Alpha 8400 (8 x 440 MHz), 10 GB RAM; Enterprise Storage Array; StorageTek; 9 HSZ70 Ultra-SCSI dual redundant controllers; 324 9.1 GB Seagate disks; 6 DLT7000 Quantum drives; FWD SCSI; 2 x Compaq 5500 (4 x 200 MHz) web servers; to the Web

13 Terra-Server Web Site Software
Figure labels: Web Client (browser, HTML, Java Viewer); The Internet; Microsoft Automap ActiveX Server; Image Provider Site(s) (Internet Info Server 4.0, Image Delivery Application, SQL Server 7, Microsoft Site Server EE, Internet Information Server 4.0); Terra-Server DB; Automap Server; Sphinx (SQL Server); Terra-Server Stored Procedures; Internet Information Server 4.0; Image Server; Active Server Pages; MTS

14 System Management & Maintenance
• Backup and Recovery
  – Cheyenne ArcServe
  – Legato Networker
  – Seagate Backup Exec
  – Sphinx Backup/Restore Utility
• SQL Server Enterprise Manager
  – DBA Maintenance
  – SQL Performance Monitor

15 How We Did It
• "Chopped" big images into small "tiles"
  – Sub-sampled tiles to create zoom levels
  – Tile sizes map to Lat/Lon system
  – Unique ID assigned to each Tile location
    ◦ (Z-transform of lat/long or UTM)
  – Unique ID clusters adjacent tiles onto the same database & index pages
• Wrote Load Management program
  – Runs image cutting job
  – Loads meta and image data into SQL
  – Multiple Loaders can run in parallel
  – Web Active Server Page controls load process
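
The slide says only that each tile ID is a "Z-transform of lat/long or UTM"; the exact encoding is not given. A minimal sketch, assuming the Z-transform is a Morton (Z-order) bit interleave of integer grid coordinates, shows why such an ID tends to place adjacent tiles next to each other in a clustered index. The function names and the 16-bit coordinate width are hypothetical.

```python
def z_order_id(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Morton key.

    Assumption: x and y are non-negative integer grid coordinates
    (for example tile column and row). The real TerraServer encoding
    is not specified on the slide.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return key


def z_order_decode(key: int, bits: int = 16) -> tuple[int, int]:
    """Inverse of z_order_id: recover (x, y) from a Morton key."""
    x = y = 0
    for i in range(bits):
        x |= ((key >> (2 * i)) & 1) << i
        y |= ((key >> (2 * i + 1)) & 1) << i
    return x, y


# The four tiles of an aligned 2 x 2 block receive four consecutive keys,
# so a clustered index on the key stores them side by side.
block = sorted(z_order_id(x, y) for x in (2, 3) for y in (4, 5))
```

Sorting load files by such a key before bulk loading, as the pre-process step later in the deck does with ZLatLong, would keep inserts in index order.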

16 USGS Editing Process
Figure labels: 1 Degree Latitude; 1 Degree Longitude; DOQQ Origin Point; 1 Quadrangle (7.5' x 7.5'); 1 "QUAD" DOQ Photo (3.75' x 3.75'); Quad cut 3x6 into Jump, Thumbnail & Browse images (cells numbered 1 to 18); DOQ Tiles (cells numbered 1 to 64)

17 Spin-2 Image Editing Process
• 48 x 96 cells per sq degree
• Image aligned to left corner of grid system
• Non-image squares (all white) are discarded
• Cut: images are extracted
• SubSample (figure labels, in source order): Jump, 16m, Browse, 8m, 32m, Thumb
• Tiles are cut 5x5, scrambled, output JPEG

18 Spin-2 Meta Data
• File name (of image)
• City [1]
• State [1]
• Country
• Number of Rows
• Number of Columns
• Shooting Height
• Height of Sun
• Date of survey (mm/dd/yyyy)
• Time of survey (GMT) (hr:mn:ss)
• Upper Left Latitude
• Upper Left Longitude
• Lower Right Latitude
• Lower Right Longitude
• Camera System [1]
• Pixel size [1]
• Copyright [1]
[1] Field is not required; if not present, then a blank field is present.
Semicolon-delimited fields, ASCII encoding, 1 record per line
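
The record layout above can be read with a few lines of code. This is a sketch only: the Python field names are hypothetical labels for the slide's field list, the sample record is invented for illustration, and the real files may differ in details the slide does not show (for example numeric formats).

```python
from datetime import datetime

# Field order as listed on the slide; the identifiers are hypothetical.
FIELDS = [
    "file_name", "city", "state", "country", "rows", "columns",
    "shooting_height", "sun_height", "survey_date", "survey_time",
    "ul_lat", "ul_lon", "lr_lat", "lr_lon",
    "camera_system", "pixel_size", "copyright",
]
# Fields marked [1] on the slide: they may be blank but are always present.
OPTIONAL = {"city", "state", "camera_system", "pixel_size", "copyright"}


def parse_record(line: str) -> dict:
    """Parse one semicolon-delimited metadata record (one record per line)."""
    values = [v.strip() for v in line.rstrip("\r\n").split(";")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    rec = dict(zip(FIELDS, values))
    for name in FIELDS:
        if rec[name] == "":
            if name not in OPTIONAL:
                raise ValueError(f"required field {name!r} is blank")
            rec[name] = None  # optional field left blank
    # Check the date and time formats given on the slide.
    rec["survey_at"] = datetime.strptime(
        f"{rec['survey_date']} {rec['survey_time']}", "%m/%d/%Y %H:%M:%S"
    )
    rec["rows"], rec["columns"] = int(rec["rows"]), int(rec["columns"])
    return rec


# Invented sample record, not real Spin-2 data.
sample = ("img001.tif;;;USA;100;200;220;35;04/27/1998;12:30:00;"
          "38.9;-77.1;38.8;-77.0;;;")
record = parse_record(sample)
```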

19 Database Design and Load
• Build a 1 TB (2**40 B) SQL Server Database
• Database includes
  – Gazetteer data for searching
  – Image data pyramid and metadata
• Load the Database
  – Chop the big images into tiles
  – BCP data and metadata in
  – Allow for restart and undo of loads
  – Create indexes
  – Check consistency of the data
• Keep it Simple, no Tricks, Test the Scaling

20 The Image Pyramid
• Zooming in on the Washington Monument
  – Jump image: 1 pixel = 32x32 m²
  – Dithered Browse image: 1 pixel = 16x16 m²
  – Dithered Thumb image: 1 pixel = 8x8 m²
  – USGS Tile image (DOQ of Washington Monument): 1 pixel = 1 sq meter
Figure ratio label: 64:11:1
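
The deck does not say how the coarser pyramid levels were resampled. As an illustration only, the sketch below assumes simple block averaging of a grayscale image held as a list of rows, and builds 8 m, 16 m, and 32 m levels from a 1 m base. The function names and the test image are hypothetical.

```python
def subsample(pixels: list[list[int]], factor: int) -> list[list[int]]:
    """Shrink a grayscale image by averaging factor x factor pixel blocks.

    Rows and columns that do not fill a whole block are dropped.
    """
    height, width = len(pixels), len(pixels[0])
    out = []
    for r in range(0, height - height % factor, factor):
        row = []
        for c in range(0, width - width % factor, factor):
            block = [pixels[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def build_pyramid(base: list[list[int]], factors=(8, 16, 32)) -> dict:
    """Return {meters_per_pixel: image}, assuming the base is 1 m per pixel."""
    pyramid = {1: base}
    for factor in factors:
        pyramid[factor] = subsample(base, factor)
    return pyramid


# Hypothetical 64 x 64 test image: a left-to-right gray ramp.
base = [[col * 4 for col in range(64)] for _ in range(64)]
pyramid = build_pyramid(base)
```

Each level here is computed from the base image; computing each level from the previous one would be cheaper and is an equally plausible reading of the slides.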

21 'Logical' Schema
Gazetteer (star schema): Country, State, Place, PlaceType, FeatureType
• Index on
  – image, place, type
  – image, state, type
  – image, state, country, type
  – image, place, state, type
  – image, place, country, type
• All lookups are fast
Image Data & Meta Data: ImgMeta, TileMeta, JumpImg, BrowseImg, TileImg, ThumbImg, Theme, TileLog (meta information)
• Lookup by UGrid or ZGrid ID plus resolution
• Lookups are fast. Indices are in DRAM (auto-magically by SQL)
• SQL manages all the tiles and indices
• Images are brought in on demand
Join key between the two: Lat/Long (U/ZGridId)

22 Gazetteer Design
• Classic Snowflake Schema
• Top 10 Hint to RE for Cursor Select

23 Image Data Design
• Image pyramid stored in DBMS (250 M recs)

24

25 TerraServer File Group Design
• Make 28 RAID5 sets from 324 disks
  – Each RAID set has 11 disks (16 spare drives)
• Make 4 595 GB NT volumes (F:, G:, H:, I:)
  – Each striped over 7 RAID sets on 7 controllers
• Create 26 20,000 MB files on F:, 27 on G:
• DB is File Group of 53 files (1.011 TB)
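
The figures on this slide are mutually consistent, which a few lines of arithmetic confirm. The only assumption is that "TB" here means 2**40 bytes and "MB" means 2**20 bytes, which matches the "1 TB (2**40 B)" wording on slide 19.

```python
DISKS = 324
RAID_SETS = 28
DISKS_PER_SET = 11
SPARES = 16
VOLUMES = 4
SETS_PER_VOLUME = 7
FILES = 26 + 27          # files on F: plus files on G:
FILE_MB = 20_000

disks_in_raid = RAID_SETS * DISKS_PER_SET       # 308 disks in RAID5 sets
total_disks = disks_in_raid + SPARES            # plus spares
sets_in_volumes = VOLUMES * SETS_PER_VOLUME     # RAID sets used by NT volumes
db_mb = FILES * FILE_MB                         # file group size in MB
db_tb = db_mb / 1024 / 1024                     # binary TB (2**20 MB per TB)

print(total_disks, sets_in_volumes, FILES, round(db_tb, 3))
```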

26 Physical DB Design
• 324 disks ~ 3 TB of disk space
• Configured as RAID5 => ~2.4 TB
• Configured as 20 NT volumes
  – Each volume ~ 120 GB
  – Big files!
• SQL data spread across all volumes.
  – Combines the 20 files.
  – One BIG table for the tiles
  – Images stored as blobs (JPEG compressed)
• 2 GB RAM holds
  – all indices and
  – gazetteer.
Figure labels: DEC Alpha Server T2B2; Alpha 8400, 4 x 400 MHz, 2 GB; HSZ50 Controllers; 36 9 GB disks; PCI U-SCSI; HSZ50 Controller (2); 10x tapes

27 Other Details
• Active Server pages: faster and easier than DB stored procedures.
• Commerce Server is interesting
  – Images are the inventory
    ◦ no SKU,
    ◦ millions of them
  – USGS built their own
    ◦ they are very smart, but it is easy
    ◦ masquerade as a credit-card reader.
• The earth is a geoid, and
• Every Geographer has a coordinate system (or two).
• Tapes are still a nightmare.
• Everyone is a UI expert.

28 Physical Database
• 53 files, 20,000 MB each
• 16,960,000 extents
• 135,680,000 pages
• Separate tables for DOQ, Spin 'Themes'
• Each image stored in column of type 'image'
• All tile images in one (big) table
• A number of indexes too
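
The extent and page counts follow from the file sizes. SQL Server 7.0 stores data in 8 KB pages grouped into 64 KB extents of eight pages each; with MB taken as 1024 KB, the slide's numbers come out exactly.

```python
FILES = 53
FILE_MB = 20_000
PAGE_KB = 8                 # SQL Server 7.0 page size
PAGES_PER_EXTENT = 8        # one extent = 8 pages = 64 KB

total_kb = FILES * FILE_MB * 1024
extents = total_kb // (PAGE_KB * PAGES_PER_EXTENT)
pages = extents * PAGES_PER_EXTENT

print(f"{extents:,} extents, {pages:,} pages")
```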

29 TerraServer Tables
• USGS DOQ Data
  – 48,000 DOQQ images (45-55 MB / image)
  – Creates 864,000 Jump, Thumb, & Browse images (3.5 m rows)
  – Creates 55.3 m Tile images (110.6 m rows)
• SPIN-2 Data
  – 3200 278 MB images (approximate size)
  – Creates 620,800 Jump, Thumb, & Browse images (2.5 m rows)
  – Creates 15.5 m Tile images (31 m rows)
• Gazetteer Data
  – 1.1 m named places (Encarta World Atlas)
  – 45 m cell names
• Total Rows = 193.7 M
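
A quick check of the counts on this slide. The row total adds up as stated, and the ratios line up with the cutting schemes shown earlier: 18 views per DOQQ matches the 3x6 cut on slide 16, and roughly 25 tiles per Spin-2 view matches the 5x5 tile cut on slide 17. Reading the ratios this way is an inference from the numbers, not something the slide states.

```python
# Row counts in millions, as given on the slide.
rows_m = {
    "DOQ jump/thumb/browse": 3.5,
    "DOQ tiles": 110.6,
    "SPIN-2 jump/thumb/browse": 2.5,
    "SPIN-2 tiles": 31.0,
    "named places": 1.1,
    "cell names": 45.0,
}
total_rows_m = sum(rows_m.values())

doq_images, doq_views, doq_tiles = 48_000, 864_000, 55.3e6
spin_images, spin_views, spin_tiles = 3_200, 620_800, 15.5e6

views_per_doq = doq_views / doq_images          # views cut from each DOQQ
views_per_spin = spin_views / spin_images       # views cut from each Spin-2 image
tiles_per_doq_view = doq_tiles / doq_views      # tiles under each DOQ view
tiles_per_spin_view = spin_tiles / spin_views   # tiles under each Spin-2 view

print(round(total_rows_m, 1), views_per_doq, views_per_spin,
      round(tiles_per_doq_view), round(tiles_per_spin_view))
```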

30 The Loading Process
• Includes Cutting Images, building BCP files, BCP meta data, BCP image data
• First Load 1/97-5/97 for Scalability Day
  – 190 GB actual image data, 800 GB duplicates
  – Pre-beta Sphinx
• Second Load 12/97-4/98 for Web Server
  – 750 GB actual image data, all images recut

31 Image Preparation and Load
Figure labels: DLT Tape "tar"; \Drop'N'; DoJob; Wait 4 Load; LoadMgr DB; 100mbit EtherSwitch; Enterprise Storage Array (3 x 108 9.1 GB drives); Alpha Server 8400; STC DLT Tape Library; ESA (60 4.3 GB drives); 2 x Alpha Server 4100; LoadMgr; DLT Tape NT Backup; ImgCutter; \Drop'N'\Images
LoadMgr steps:
  10: ImgCutter
  20: Partition
  30: ThumbImg
  40: BrowseImg
  45: JumpImg
  50: TileImg
  55: Meta Data
  60: Tile Meta
  70: Img Meta
  80: Update Place...

32 Meta & Image Load Process
Input: NT Backup, *.IMD & *.JPG files
• Pre-Process Data: Read *.IMD files; Generate Ids; Generate ZLatLong; Sort by ZLatLong (outputs: Image Meta, Tile Meta)
• Load Thumb Img: Read Image Meta; Read Image Data; BCP into ImgTbl
• Load Browse Img: Read Image Meta; Read Image Data; BCP into ImgTbl
• Load Tile Img: Read Tile Meta; Read Tile Data; BCP into TileTbl
• Load Tile Meta: Read Image Meta; BCP into TileMeta
• Load Img Meta: Read Image Meta; BCP into TileMeta
Tables:
  ImgMeta: ImgMetaId int, OrigMetaId int, SrcId int, ImgTypeId int, XGridId int, YGridId int, ImgDate Date, Hemisphere smallint, Continent smallint, xxLat smallint, xxLong smallint, ZLatLong int, MetaStr vchar(255)
  TileMeta: TileMetaId int, ImgMetaId int, OrigMetaId int, SrcId int, ImgTypeId int, XGridId int, YGridId int, Hemisphere smallint, Continent smallint, xxLat smallint, xxLong smallint, ZLatLong int
  "SRC"ThumbImg: ThumbImgId int, ImgMetaId int, ZLatLong int, SrcId int, ImgTypeId int, PixWidth int, PixHeight int, ImgData Blob
  "SRC"BrowseImg: BrowseImgId int, ImgMetaId int, ZLatLong int, SrcId int, ImgTypeId int, PixWidth int, PixHeight int, ImgData Blob
  "SRC"TileImg: TileImgId int, TileMetaId int, ZLatLong int, SrcId int, ImgTypeId int, PixWidth int, PixHeight int, ImgData Blob

33 The Load Manager
• A Workflow System. Manages Job 'Steps'.
• Built as an SQL Database App. Collects Stats.
• Would use Data Transformation Services today
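
The deck does not show the Load Manager's schema or code. The sketch below is a hypothetical illustration of the idea only: job steps recorded in a database table, run in order, timed, and resumable after a failure. It uses Python's sqlite3 module as a stand-in for SQL Server; the table, column, and function names are invented, and the step list is copied from the LoadMgr steps on slide 31.

```python
import sqlite3
import time

# Step numbers and names as listed on slide 31 (last name is truncated there).
STEPS = [(10, "ImgCutter"), (20, "Partition"), (30, "ThumbImg"),
         (40, "BrowseImg"), (45, "JumpImg"), (50, "TileImg"),
         (55, "Meta Data"), (60, "Tile Meta"), (70, "Img Meta"),
         (80, "Update Place...")]


def create_job(db: sqlite3.Connection, job_id: int) -> None:
    """Register a job with every step in the 'pending' state."""
    db.execute("CREATE TABLE IF NOT EXISTS job_step ("
               "job_id INTEGER, step_no INTEGER, name TEXT, "
               "status TEXT DEFAULT 'pending', seconds REAL, "
               "PRIMARY KEY (job_id, step_no))")
    db.executemany(
        "INSERT INTO job_step (job_id, step_no, name) VALUES (?, ?, ?)",
        [(job_id, no, name) for no, name in STEPS])


def run_job(db: sqlite3.Connection, job_id: int, actions: dict) -> bool:
    """Run unfinished steps in order; stop at the first failure.

    Calling run_job again resumes at the failed step, because steps
    already marked 'done' are skipped.
    """
    rows = db.execute(
        "SELECT step_no, name FROM job_step "
        "WHERE job_id = ? AND status != 'done' ORDER BY step_no",
        (job_id,)).fetchall()
    for step_no, name in rows:
        start = time.perf_counter()
        try:
            actions[name]()
            status = "done"
        except Exception:
            status = "failed"
        db.execute(
            "UPDATE job_step SET status = ?, seconds = ? "
            "WHERE job_id = ? AND step_no = ?",
            (status, time.perf_counter() - start, job_id, step_no))
        if status == "failed":
            return False
    return True
```

Keeping status and elapsed time per step in the same table is what would let a tool like this both restart loads and report statistics such as those on the next slide.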

34 Load Statistics
• 601 DOQ Jobs, 818 Spin Jobs
  – Each job does 3 meta BCP, 4 Image BCP steps
• 5676 Image BCP Steps
  – 106 million total images loaded
  – 546 GB total. 5.4 KB avg image size
• For Tile Images (96% of the database)
  – avg 68,000 images/step. max 757,000
  – avg 33 minutes/step. max 596
  – total time 796 hours (33 days)
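
Several of these statistics can be re-derived from one another. The average image size matches the slide only if GB and KB are read as binary units (2**30 and 2**10 bytes); that reading is an assumption, and with decimal units the average would come out nearer 5.15 KB.

```python
doq_jobs, spin_jobs = 601, 818
image_steps_per_job = 4

jobs = doq_jobs + spin_jobs
image_bcp_steps = jobs * image_steps_per_job     # total Image BCP steps

total_bytes = 546 * 2**30                        # 546 GB, binary units assumed
images = 106_000_000
avg_kb = total_bytes / images / 1024             # average image size in KB

tile_load_days = 796 / 24                        # 796 hours expressed in days

print(image_bcp_steps, round(avg_kb, 1), round(tile_load_days, 1))
```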

35 System Maintenance: Backup & Recovery
• Industrial Strength
  – High Performance
  – Online Backups
  – Simple, Error Free Media Handling
  – Minimal Recovery Time

36 Project Phases & Characteristics
• Load Phase
  – Ongoing Massive Data Loads
  – Updates to Fix Errors in Meta-Data
  – Backups at Key Milestones
• Deployed
  – 7 x 24
  – Some Updates to Existing Data
  – Small Loads as More Data Arrives
  – Infrequent Large Loads

37 SQL Server 7.0 Backup/Restore Features
• Fast
• Online Backup Under Load
  – Minimal Impact
• Just the Data
• Backup Part of the Database
• Minimize Recovery Time
  – Differential Backups, Log Backups
  – Restore Only Damaged Files

38 Backup ISVs Address Limitations
• Legato NetWorker™
• Computer Associates ArcServe™
• Seagate Backup Exec™
• Others…
These products support SQL Server 6.5. None support SQL Server 7.0 yet.

39 Deployed 6/98...
• ISV Supports SQL Server 7.0 High Performance Backup API
• ISV Supports Full Range of SQL Server 7.0 Backup/Restore Features
Figure labels: Backup Software; Backup API; SQL Server; Tape Library

40 Backup API Performance

41 Verifying Backup/Restore
• Minimal Risk Restore to a Separate System at DECWest
  – Early Problems with Unreadable Tapes
Figure labels: TerraServer; Test System; Another Terabyte of Disk!

42 TerraServer Backup/Restore Factoids
• Backup/Restore Rate: 200 GB/Hr (57 MB/sec)
• Time Required for Full Database Backup: 5 Hours
• Number of DLT Tape Cartridges: 36
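
These factoids agree with the database size given on slide 25. The sketch assumes binary units (1 GB = 1024 MB) and, for the cartridge check, the 35 GB native capacity of a DLT7000 cartridge (the drive type named on slide 12); the slide itself does not state how full each cartridge was.

```python
RATE_GB_PER_HOUR = 200
DB_MB = 53 * 20_000                  # file group size from slide 25
CARTRIDGES = 36
DLT7000_NATIVE_GB = 35               # assumption: uncompressed capacity

mb_per_sec = RATE_GB_PER_HOUR * 1024 / 3600      # sustained transfer rate
db_gb = DB_MB / 1024
backup_hours = db_gb / RATE_GB_PER_HOUR          # time for a full backup
tape_capacity_gb = CARTRIDGES * DLT7000_NATIVE_GB

print(round(mb_per_sec), round(backup_hours, 1), tape_capacity_gb)
```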


44 Thank You! SPIN-2 Microsoft BackOffice

45 SQL 7 Testimonial
• We started using it March 4 1997
  – SQL 7 Pre-Alpha
  – SQL 7 Alpha
  – SQL 7 Beta 1
  – SQL 7 Beta
• Loaded the DB twice
  – (we made application mistakes)
• Now doing it "right"
• Reliability: Great! SQL 7 never lost data
• Ease of use: Great!
• Functionality: Great!

46

