The Personal Petabyte The Enterprise Exabyte

The Personal Petabyte The Enterprise Exabyte
Jim Gray Microsoft Research Presented at IIST Asilomar 10 December 2003

Outline History Changing Ratios Who Needs a Petabyte? Thesis:
in 20 years, Personal Petabyte will be affordable. Most personal bytes will be video. Enterprise Exabytes will be sensor data.

An Early Disk Phaistos Disk: No one can read it  1700 BC
Minoan (Cretian, Greek) No one can read it 

Early Magnetic Disk 1956 IBM 305 RAMAC 4 MB 50x24” disks 1200 rpm
100 ms access 35k$/y rent Included computer & accounting software (tubes not transistors)

10 years later (1966 Illiac) 30 MB 1.6 meters

Or 1970 IBM 2314 at 29MB 970

History: 1980 Winchester Seagate 5 ¼” 5 MB Fujitsu Eagle 10” 450MB

The MAD Future Terror Bytes8
In the beginning there was the Paramagnetic Limit: 10Gbpsi Limit keeps growing (now ~ 200Gbpsi) Mark H. Kryder, Seagate Future Magnetic Recording Technologies FAST PDF. apologizes: “Only 100x density improvement, then we are out of ideas” That’s 20 TB desktop TB laptop!

Outline Changing Ratios History Who Needs a Petabyte? Disk to Ram
DASD is Dead Disk space is free Disk Archive-Interchange Network faster than disk Capacity, Access TCO == people cost Smart disks happened The entry cost barrier Who Needs a Petabyte?

Storage Ratios Changed
10x better access time 10x more bandwidth 100x more capacity Data 25x cooler (1Kaps/20MB vs 1Kaps/500MB) 4,000x lower media price 20x to 100x lower disk price Scan takes 10x longer (3 min vs 45 min) RAM/disk media price ratio changed :1 :1 :1 today ~ $/GB disk 200: $/GB dram

Price_Ram_TB(t+10) = Price_Disk_TB(t) Disk Data Can Move to RAM in 10 years
Disk ~100x cheaper than RAM per byte Both get 100x bigger in 10 years. Move data to main memory Seems: RAM/Disk bandwidth ~100:1 100:1 10 years

DASD (direct access storage device) is Dead
accesses got cheaper Better disks Cheaper disks! Disk access/bandwidth: the scarce resource 2003: 100 minute Scan : 5 minute Scan Sequential bandwidth 50x faster than random Random Scan 3 days Ratio will get 10x worse in 10 years 100x more capacity, 10x more bandwidth. Invent ways to trade capacity for bandwidth Use the capacity without using bandwidth. 300 GB 50 MB/s

Disk Space is “free” Bandwidth & Accesses/sec are not
1k$/TB going to 100$/TB 20 TB disks on the (distant) horizon 100x density, Waste capacity intelligently Version everything Never delete anything Keep many copies Snapshots Mirrors (triple and geoplex) Cooperative caching (Farsite and OceanStore) Disk Archive

Disk as Archive-Interchange
Tape is archive / interchange / low cost Disc now competitive in all 3 categories What format? Fat? CDFS?.. What tools? Need the software to do disk-based backup/restore Commonly snapshot (multi-version FS) Radical: peer-to-peer file archiving Many researchers looking at this OceanStore, Farsite, others…

Disk vs Network Now the Network is Faster (!)
Old days: 10 MBps disk, low cpu cost ( 0.1 ins/b) 1 MBps net, huge cpu cost (10 ins/b) New days: 50 MBps disk, low cpu cost 100 MBps net, low cpu cost (toe, rdma) Consequence: You can remote disks. Allows consolidation Aggregate (bisection) bandwidth still a problem.

Storage TCO == people time
1980 rules-of-thumb: 1 systems programmer per mips 1 data admin per 10GB 800 sys programmers + 4 data admins for your laptop Sometimes it must seem like that but… Today one data admin per 1 TB TB Depending on process and data value. Automate everything Use redundancy to mask (and repair) problems. Save people, spend hardware

Disk Evolution: Smart Disks
Kilo Mega Giga Tera Peta Exa Zetta Yotta System on a chip High-speed LAN Disk is super computer!

Smart Disks Happened Disk appliances are here: Cameras Games PVRs
FileServers Challenge: entry price

The Entry Cost Barrier Connect the Dots
Consumer electronics want low entry cost 1970: 20,000$ 1980: 2,000$ 2000: $ $ If magnetics can’t do this, another technology will. Think: copiers, hydraulic shovels,… WantedToday ln(price) Time

Outline Yotta Zetta Exa History Peta Changing Ratios
Tera Giga Mega Kilo Outline History Changing Ratios Who Needs a Petabyte? Petabyte for 1k$ in years Affordable but useless How much information is there? The Memex vision MyLifeBits The other 20% (enterprise storage) We are here

A Bleak Future: The ½ Platter Society?
Conclusion from Information Storage Industry Consortium HDD Applications Roadmap Workshop: “Most users need only 20GB” We are heading to a ½ platter industry. 80% of units and capacity is personal disks (not enterprise servers). The end of disk capacity demand. A zero billion dollar industry?

Try to fill a terabyte in a year
Item Items/TB Items/day 300 KB JPEG 3 M 9,800 1 MB Doc 1 M 2,900 1 hour 256 kb/s MP3 audio 9 K 26 1 hour 1.5 Mbp/s MPEG video 290 0.8 Petabyte volume has to be some form of video.

Growth Comes From NEW Apps
The 10M$ computer of 1980 costs 1k$ today If we were still doing the same things, IT would be a 0 B$/y industry NEW things absorb the new capacity 2010 Portable ? 100 Gips processor 1 GB RAM 1 TB disk 1 Gbps network Many form factors

The Terror Bytes are Here
1 TB costs 1k$ to buy 1 TB costs 300k$/y to own Management & curation are expensive (I manage about 15TB in my spare time. no, I am not paid 4.5M$/y to manage it) Searching 1TB takes minutes or hours or days or.. I am Petrified by Peta Bytes But… people can “afford” them so, we have lots to do – Automate! Yotta Zetta Exa Peta Tera Giga Mega Kilo We are here

How much information is there?
Yotta Zetta Exa Peta Tera Giga Mega Kilo Soon everything can be recorded and indexed Most bytes will never be seen by humans. Data summarization, trend detection anomaly detection are key technologies See Mike Lesk: How much information is there: See Lyman & Varian: How much information Everything! Recorded All Books MultiMedia All books (words) .Movie A Photo A Book 24 Yecto, 21 zepto, 18 atto, 15 femto, 12 pico, 9 nano, 6 micro, 3 milli

Memex As We May Think, Vannevar Bush, 1945
“A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility” “yet if the user inserted 5000 pages of material a day it would take him hundreds of years to fill the repository, so that he can be profligate and enter material freely”

Why Put Everything in Cyberspace?
Low rent min $/byte Shrinks time now or later Shrinks space here or there Automate processing knowbots Point-to-Point OR Broadcast Immediate OR Time Delayed Locate Process Analyze Summarize

How Will We Find Anything?
Need Queries, Indexing, Pivoting, Scalability, Backup, Replication, Online update, Set-oriented access If you don’t use a DBMS, you will implement one! Simple logical structure: Blob and link is all that is inherent Additional properties (facets == extra tables) and methods on those tables (encapsulation) More than a file system Unifies data and meta-data SQL ++ DBMS

MyLifeBits The guinea pig
Gordon Bell is digitizing his life Has now scanned virtually all: Books written (and read when possible) Personal documents (correspondence, memos, , bills, legal,0…) Photos Posters, paintings, photo of things (artifacts, …medals, plaques) Home movies and videos CD collection And, of course, all PC files Now recording: phone, radio, TV (movies), web pages… conversations Paperless throughout ” scanned, 12’ discarded. Only 30 GB!!! Excluding digital videos Video is 2+ TB and growing fast

Capture and encoding

I mean everything

gbell wag: 67 yr, 25Kday life a Personal Petabyte
1PB

80% of data is personal / individual. But, what about the other 20%?
Business Wall Mart online: 1PB and growing…. Paradox: most “transaction” systems < 1 PB. Have to go to image/data monitoring for big data Government Government is the biggest business. Science LOTS of data.

Information Avalanche
Both better observational instruments and Better simulations are producing a data avalanche Examples Turbulence: 100 TB simulation then mine the Information BaBar: Grows 1TB/day 2/3 simulation Information 1/3 observational Information CERN: LHC will generate 1GB/s 10 PB/y VLBA (NRAO) generates 1GB/s today NCBI: “only ½ TB” but doubling each year, very rich dataset. Pixar: 100 TB/Movie Image courtesy of C. Meneveau & A. Szalay @ JHU

Q: Where will the Data Come From? A: Sensor Applications
Earth Observation 15 PB by 2007 Medical Images & Information + Health Monitoring Potential 1 GB/patient/y  1 EB/y Video Monitoring ~1E8 video 1E5 MBps  10TB/s  100 EB/y  filtered??? Airplane Engines 1 GB sensor data/flight, 100,000 engine hours/day 30PB/y Smart Dust: ?? EB/y

Instruments: CERN – LHC Peta Bytes per Year
Looking for the Higgs Particle Sensors: GB/s (1TB/s ~ 30 EB/y) Events GB/s Filtered GB/s Reduced GB/s ~ 2 PB/y Data pyramid: 100GB : 1TB : 100TB : 1PB : 10PB CERN Tier 0

LHC Requirements (2005- ) 1E9 events pa @ 1MB/ev = 1PB/year/expt
Reconstructed = 100TB/recon/year/expt Send to Tier1 Regional Centres => 400TB/year to RAL? Keep one set + derivatives on disk …and rest on tape But UK plans a Tier1 clone Many data clones Source: John Gordon IT Department, CLRC/RAL CUF Meeting, October 2000

Science Data Volume ESO/STECF Science Archive
100 TB archive Similar at Hubble, Keck, SDSS,… ~1PB aggregate

Data Pipeline: NASA Level 0: raw data data stream
Level 1: calibrated data measured values Level 1A: calibrated & normalized flux/magnitude/… Level 2: derived data metrics vegetation index Data volume 0 ~ 1 ~ 1A << 2 Level 2 >> level 1 because MANY data products Must keep all published data Editions (versions) E1 E2 E3 E4 time Level 1A 4 editions of 4 Level 2 products, each is small, but… EOSDIS Core System Information for Scientists,

DataGrid Computing Store exabytes twice (for redundancy)
Access them from anywhere Implies huge archive/data centers Supercomputer centers become super data centers Examples: Google, Yahoo!, Hotmail, BaBar, CERN, Fermilab, SDSC, …

Outline History Changing Ratios Who Needs a Petabyte? Thesis:
in 20 years, Personal Petabyte will be affordable. Most personal bytes will be video. Enterprise Exabytes will be sensor data.

Bonus Slides

TerraServer V4 8 web front end 4x8cpu+4GB DB
18TB triplicate disks Classic SAN (tape not shown) ~2M$ Works GREAT! 2000…2004 Now replaced by.. WEB x8 SAN SQL x4

TerraServer V5 Storage Bricks
KVM / IP Storage Bricks “White-box commodity servers” 4tb raw / 2TB Raid1 SATA storage Dual Hyper-threaded Xeon 2.4ghz, 4GB RAM Partitioned Databases (PACS – partitioned array) 3 Storage Bricks = 1 TerraServer data Data partitioned across 20 databases More data & partitions coming Low Cost Availability 4 copies of the data RAID1 SATA Mirroring 2 redundant “Bunches” Spare brick to repair failed brick 2N+1 design Web Application “bunch aware” Load balances between redundant databases Fails over to surviving database on failure ~100K$ capital expense.

How Do You Move A Terabyte?
Time/TB $/TB Sent $/Mbps Rent $/month Speed Mbps Context 6 years 3,086 1,000 40 0.04 Home phone 5 months 360 117 70 0.6 Home DSL 2 months 2,469 800 1,200 1.5 T1 2 days 2,010 651 28,000 43 T3 14 hours 976 316 49,000 155 OC3 14 minutes 617 200 1,920,000 9600 OC 192 1 day 100 100 Mpbs 2.2 hours 1000 Gbps Source: TeraScale Sneakernet, Microsoft Research, Jim Gray et. all

Key Observations for Personal Store And for Larger Stores.
Schematized storage can help organization and search. Schematized XML data sets a universal way exchange data answers and new data. If data are objects, then need standard representation for classes & methods.

Longhorn - For Knowledge Workers
Simple (Self-*): auto install/manage/tune/repair. Schema: data carries semantics Search: find things fast (driven by schema) Sync: “desktop state” anywhere Security: (Palladium) -- trustworthy - privacy - trustworthy (virus, spam,..) - DRM (protect IP) Shell: task-based UI (aka activity-based UI) Office-Longhorn Intelligent documents XML and Schemas

How Do We Represent It To The Outside World? Schematized Storage
<?xml version="1.0" encoding="utf-8" ?> - <DataSet xmlns=" - <xs:schema id="radec" xmlns="" xmlns:xs=" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata"> <xs:element name="radec" msdata:IsDataSet="true"> <xs:element name="Table"> <xs:element name="ra" type="xs:double" minOccurs="0" /> <xs:element name="dec" type="xs:double" minOccurs="0" /> … - <diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1"> - <radec xmlns=""> - <Table diffgr:id="Table1" msdata:rowOrder="0"> <ra> </ra> <dec> </dec> </Table> - <Table diffgr:id="Table10" msdata:rowOrder="9"> <ra> </ra> <dec> </dec> </Table> </radec> </diffgr:diffgram> </DataSet> File metaphor too primitive: just a blob Table metaphor too primitive: just records Need Metadata describing data context Format Providence (author/publisher/ citations/…) Rights History Related documents In a standard format XML and XML schema DataSet is great example of this World is now defining standard schemas schema Data or difgram

There Is A Problem Niklaus Wirth: Algorithms + Data Structures = Programs GREAT!!!! XML documents are portable objects XML documents are complex objects WSDL defines the methods on objects (the class) But will all the implementations match? Think of UNIX or SQL or C or… This is a work in progress.

Disk Storage Cheaper Than Paper
File Cabinet (4 drawer) 250$ Cabinet: Paper (24,000 sheets) 250$ Space 10€/ft2) 180$ Total 700$ $/sheet pennies per page Disk: disk (250 GB =) 250$ ASCII: 100 m pages 2e-6 $/sheet(10,000x cheaper) micro-dollar per page Image: m photos 3e-4 $/photo (100x cheaper) milli-dollar per photo Store everything on disk Note: Disk is 100x to 1000x cheaper than RAM

Data Analysis Looking for Needles are easier than haystacks
Needles in haystacks – the Higgs particle Haystacks: Dark matter, Dark energy Needles are easier than haystacks Global statistics have poor scaling Correlation functions are N2, likelihood techniques N3 As data and computers grow at same rate, we can only keep up with N logN A way out? Discard notion of optimal (data is fuzzy, answers are approximate) Don’t assume infinite computational resources or memory Requires combination of statistics & computer science

Analysis and Databases
Much statistical analysis deals with Creating uniform samples – Data filtering Assembling relevant subsets Estimating completeness Censoring bad data Counting and building histograms Generating Monte-Carlo subsets Likelihood calculations Hypothesis testing Traditionally these are performed on files Most of these tasks are much better done inside DB Bring Mohamed to the mountain, not the mountain to him

Data Access is hitting a wall FTP and GREP are not adequate
You can GREP 1 MB in a second You can GREP 1 GB in a minute You can GREP 1 TB in 2 days You can GREP 1 PB in 3 years. Oh!, and 1PB ~5,000 disks At some point you need indices to limit search parallel data search and analysis This is where databases can help You can FTP 1 MB in 1 sec You can FTP 1 GB / min (= 1 $/GB) … days and 1K$ … 3 years and 1M$

Smart Data (active databases)
If there is too much data to move around, take the analysis to the data! Do all data manipulations at database Build custom procedures and functions in the database Automatic parallelism Easy to build-in custom functionality Databases & Procedures being unified Example temporal and spatial indexing pixel processing, … Easy to reorganize the data Multiple views, each optimal for certain types of analyses Building hierarchical summaries are trivial Scalable to Petabyte datasets

The Personal Petabyte The Enterprise Exabyte

Similar presentations

Presentation on theme: "The Personal Petabyte The Enterprise Exabyte"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

The Personal Petabyte The Enterprise Exabyte

Similar presentations

Presentation on theme: "The Personal Petabyte The Enterprise Exabyte"— Presentation transcript:

Similar presentations

About project

Feedback