Tandem Daytona TeraByte Sort: Tsort 1 TB in 47.5 Minutes. David Cossock, Sam Fineberg, Pankaj Mehra, John Peck. Trophy presentation by Jim Gray.


1 Tandem Daytona TeraByte Sort: Tsort 1 TB in 47.5 Minutes
David Cossock, Sam Fineberg, Pankaj Mehra, John Peck
Trophy presentation by Jim Gray

2 Benchmark History
(Timeline figure, 1970 to 2000; labels in slide order)
Wisconsin: Bitton, Boral, DeWitt, Turbyfill
IBM TP 1-7: CA and Tony Lukes
Debit Credit: Gray
Datamation: Anon et al
TPC-A
MCC: Boral &...
TPC-B, TPC-C, TPC-W, ?
Teradata: Bollinger &...
TPC-D
Sort, PennySort, MinuteSort

3 A Short History of Sort
April Fools 1985: Datamation Sort
–Sort 1M 100-byte records
–An IO benchmark: 15 min to 1 hr!
1993: {Minute | Penny} x {Daytona | Indy}
1998: TeraByte Sort
Web site: http://research.Microsoft.com/barc/SortBenchmark/

4 Ground Rules
How much can you sort for a penny (in a minute)?
–Hardware and software cost
–Depreciated over 3 years
–1 M$ system gets about 1 second
–1 k$ system gets about 1,000 seconds
–Time (seconds) = 946,080 / SystemPrice ($)
Input and output are disk resident
Input is
–100-byte records (random data)
–key is first 10 bytes
Must create output file and fill with sorted version of input file
Daytona (product) and Indy (special) categories
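The penny budget follows directly from the three-year depreciation rule: three years is 94,608,000 seconds, so one cent buys 946,080 / price seconds, which matches the 1 M$ and 1 k$ rules of thumb on the slide. A minimal sketch of that arithmetic, assuming 365-day years; the function and constant names are illustrative and not from the presentation:

```python
# Sketch of the PennySort time budget: how many seconds of machine time one
# cent buys when the system price is depreciated linearly over three years.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # 31,536,000 (assumes 365-day years)
DEPRECIATION_YEARS = 3
PENNY_DOLLARS = 0.01


def penny_budget_seconds(system_price_dollars: float) -> float:
    """Seconds of run time that one penny pays for on this system."""
    lifetime_seconds = DEPRECIATION_YEARS * SECONDS_PER_YEAR  # 94,608,000
    # Equivalent to 946,080 / price.
    return PENNY_DOLLARS * lifetime_seconds / system_price_dollars


for price in (1_000_000, 1_000):
    print(f"{price:>9,} $ system: {penny_budget_seconds(price):8.2f} s per penny")
```

A cheaper machine gets proportionally more time, which is why PennySort entries tend to be commodity PCs rather than large clusters.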

5 Bottleneck Analysis
Drawn to linear scale
Theoretical Bus Bandwidth: 422 MBps = 66 MHz x 64 bits
Memory Read/Write: ~150 MBps
MemCopy: ~50 MBps
Disk R/W: ~15 MBps

6 Bottleneck Analysis
NTFS Read/Write
18 Ultra 3 SCSI on 4 strings (2x4 and 2x5), 3 PCI 64
–~155 MBps unbuffered read (175 raw)
–~95 MBps unbuffered write
Recently: SQL Server on Xeon: 190 MBps scan. Good, but 10x down from S390/SGI/UE10k
(Figure labels) Memory Read/Write: ~250 MBps; PCI: ~110 MBps; Adapter: ~70 MBps; PCI Adapter: 155 MBps

7 PennySort
Hardware
–266 MHz Intel PPro
–64 MB SDRAM (10 ns)
–Dual Fujitsu DMA 3.2 GB EIDE disks
Software
–NT workstation 4.3
–NT 5 sort
Performance
–sort 15 M 100-byte records (~1.5 GB)
–Disk to disk
–elapsed time 820 sec, cpu time = 404 sec

8 Recent Results
NCSAsort: 10.3 GB in 0.9 minute, 60 Intel/NT/Myrinet nodes
MillenniumSort: 16x Dell NT cluster: 100 MB in 1.18 sec (Datamation)

9 1999 PennySort
Daytona & Indy: 2.58 GB in 917 sec
HMsort: Brad Helmkamp, Keith McCready, Stenograph LLC
Intel 400 MHz, 2 IDE disks

10 1998 TB Sort
Chris Nyberg, Nsort
SGI 32x Origin2000
151 Minutes

11 1999 Terabyte Sort
Daytona: David Cossock, Sam Fineberg, Pankaj Mehra, John Peck, Tandem/Sandia TSort: 68 nodes, ServerNet, 47 minutes
Indy: IBM SPsort, 488 nodes, 1952 cpu, 2168 disks, 17.6 minutes = 1057 sec (all for 1/3 of 94 M$; slice price is 64 k$ for 4 cpu, 2 GB ram, 6 9 GB disks + interconnect)

12 Sandia/Compaq/ServerNet/NT Sort
Sort 1.1 Terabyte (13 Billion records) in 47 minutes
68 nodes (dual 450 MHz processors), 543 disks, 1.5 M$
1.2 GBps network rap (2.8 GBps pap)
5.2 GBps of disk rap (same as pap)
(rap = real application performance, pap = peak advertised performance)
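To put the slide's figures side by side, the end-to-end sort rate can be derived from the data volume and elapsed time and compared with the quoted network numbers. This is only a rough sketch: it assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes), and the variable names are illustrative rather than anything from the presentation.

```python
# End-to-end sort rate for the 1.1 TB / 47 minute run, plus the ratio of
# real (rap) to peak advertised (pap) network bandwidth quoted on the slide.
# Assumption: decimal units, 1 TB = 1e12 bytes and 1 GB = 1e9 bytes.


def rate_gb_per_s(bytes_moved: float, elapsed_s: float) -> float:
    """Average throughput in GBps over the whole run."""
    return bytes_moved / elapsed_s / 1e9


data_bytes = 1.1e12          # 1.1 Terabyte sorted
elapsed_s = 47 * 60          # 47 minutes

sort_rate = rate_gb_per_s(data_bytes, elapsed_s)

network_rap_gbps = 1.2       # real application performance
network_pap_gbps = 2.8       # peak advertised performance
network_efficiency = network_rap_gbps / network_pap_gbps

print(f"end-to-end sort rate: {sort_rate:.2f} GBps")
print(f"network rap / pap:    {network_efficiency:.0%}")
```

The end-to-end rate is lower than either the network or disk rap because each byte crosses the disks and the network more than once during the sort.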

13 SP sort
2 – 4 GBps!

14 1999 Sort Records
Penny, Daytona and Indy: 2.58 GB in 917 sec. HMsort: Brad Helmkamp, Keith McCready, Stenograph LLC
Minute, Daytona: 7.6 GB in 60 seconds. Ordinal Nsort, SGI 32 cpu Origin, IRIX
Minute, Indy: 10.3 GB in 56.51 sec. NOW+MPI HPVMsort: Luis Rivera (UIUC) & Andrew Chien (UCSD)
TeraByte, Daytona: 49 minutes. David Cossock, Sam Fineberg, Pankaj Mehra, John Peck, 68x2 Compaq & Sandia Labs
TeraByte, Indy: 1057 seconds. SPsort, 1952 SP cluster, 2168 disks, Jim Wyllie. PDF: SPsort.pdf (80KB)
Datamation: 1.18 seconds. Phillip Buonadonna, Spencer Low, Josh Coates, UC Berkeley Millennium Sort, 16x2 Dell NT Myrinet

15 Partly hardware
Partly software
Partly economics
2x/year!

16 Progress on Sorting
Speedup comes from Moore’s law: 40%/year
Processor/Disk/Network arrays: 60%/year (this is a software speedup)
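One way to relate these two rates to the "2x/year" figure on the previous slide is to assume they compound multiplicatively. The slides do not state how the two effects combine, so treat this as an illustrative assumption rather than the presenter's calculation:

```python
# Illustrative only: combine the two yearly improvement rates under the
# assumption that they multiply (hardware gain times array/software gain).

moore_gain = 0.40    # Moore's law: 40% per year
array_gain = 0.60    # processor/disk/network arrays: 60% per year

combined_factor = (1 + moore_gain) * (1 + array_gain)

print(f"combined yearly speedup: {combined_factor:.2f}x")
```

Under that assumption the combined factor is a little over 2x per year.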

17 Musings: PennySort = TBsort
Sorts 1 TB in 1 Minute
2 pass, so 3 TB of disk = 10 disks if 330 GB/disk
= 5 GBps (if each disk is 500 MBps)
So, 600 seconds (3 TB / 5 GBps)
So, node costs 1.5 k$
Costs 100x that today; maybe in 10 years?
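The chain of estimates on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal units and reuses the penny budget rule from the Ground Rules slide (one cent buys 946,080 / price seconds, so a run of t seconds fits a system costing 946,080 / t dollars). Note that a 5 GBps aggregate spread over 10 disks works out to 500 MBps per disk. All names are illustrative.

```python
# Back-of-envelope check of the "PennySort = TBsort" musing.
# Assumptions: decimal units; the 946,080 constant is the number of seconds
# one cent buys per dollar of system price under 3-year depreciation.

DISKS = 10
DISK_CAPACITY_GB = 330
TOTAL_DISK_BYTES = 3 * 10**12       # 3 TB for a two-pass sort of 1 TB
AGGREGATE_BYTES_PER_S = 5 * 10**9   # 5 GBps across all disks
PENNY_SECONDS_PER_DOLLAR = 946_080

capacity_tb = DISKS * DISK_CAPACITY_GB / 1000            # total disk space
per_disk_mbps = AGGREGATE_BYTES_PER_S / DISKS / 10**6    # needed per disk
elapsed_s = TOTAL_DISK_BYTES / AGGREGATE_BYTES_PER_S     # time to move 3 TB
node_price_dollars = PENNY_SECONDS_PER_DOLLAR / elapsed_s

print(f"capacity:       {capacity_tb:.1f} TB")
print(f"per-disk rate:  {per_disk_mbps:.0f} MBps")
print(f"elapsed time:   {elapsed_s:.0f} s")
print(f"penny-priced node: about {node_price_dollars:,.0f} $")
```

The result, roughly 1.6 k$, is in line with the slide's 1.5 k$ node cost.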

18 Data Gravity: Processing Moves to Transducers
Move processing to data sources
Move to where the power (and sheet metal) is
Processor in
–Modem
–Display
–Microphones (speech recognition) & cameras (vision)
–Storage: Data storage and analysis
System is “distributed” (a cluster/mob)

19 Disk = Node
has magnetic storage (100 GB?)
has processor & DRAM
has SAN attachment
has execution environment
(Software stack figure) OS Kernel, SAN driver, Disk driver, File System, RPC, ..., Services, DBMS, Applications

20 SAN: Standard Interconnect
Gbps SAN: 110 MBps
PCI: 70 MBps
UW Scsi: 40 MBps
FW scsi: 20 MBps
scsi: 5 MBps
LAN faster than memory bus?
1 GBps links in lab.
100$ port cost soon
Port is computer
Winsock: 110 MBps (10% cpu utilization at each end)
RIP FDDI, RIP ATM, RIP SCI, RIP SCSI, RIP FC, RIP ?

