Site-Wide Backup Briefing Ray Pasetes Core Support Services April 16, 2004.

1 Site-Wide Backup Briefing Ray Pasetes Core Support Services April 16, 2004

2 Agenda
  Progress Update
  Proposed Model
  Proposed Rollout
  Initial Costs
  Estimated 6-year TCO
  Needed Decisions
  Critical Timeline

3 Progress Update

4
  12/03: Reduce scope to division-wide
    – Site-wide requirements too diverse
    – Start small, let others see it work, then expand
  1/04: Interview groups
    – 6 Groups
      ClueD0, D0-offline, CDF, CMS, FTP, ISA
      SCS (represents 11 clusters/groups)
    – Along with CSI, 22 total clusters/groups represented

5 Progress Update
  2/1: Interviews finished
  2/06: CSS department review of data
    – Most accepted 8:00 to 17:00, 5 days/week service
    – Most accepted same-day or next-business-day (NBD) restore
    – Investigate using existing resources
    – Investigate costs for deployment
    – Initial deployment will back up 7+TB of data
    – Start pilot small

6 Proposed Model

7 Goals
  Provide a reliable data backup service
  Reduce redundant effort, allowing division to be more productive
  Long-term goal: reduce overall division spending on data backups via consolidation
  Long-term goal: service accessible across entire site, desktops included

8 Service Model
  Use farms model –> backup blocks
    – Backup block == 1 server + 4 or more tape drives
    – Smaller customers share the same backup block
    – Larger customers would have their own backup block(s)

9 Costing Model
  Charge based on total GB backed up
    – Cost should completely cover tape costs
    – Cost should cover hardware and software maintenance costs
    – Cost should cover hardware costs
  “Profit” will be used to expand and enhance the system

10 Costing Model - Example
  Year 1 customer: $1.15/GB on tape/yr or $34.50/GB of data to backup/year
    – Covers hardware, tape and maintenance costs
  Year 2+ customer: $0.33/GB on tape/yr or $9.82/GB of data to backup/year
    – Covers maintenance, tapes, additional slots, etc.
    – No charge for existing hardware
  No connection or per client fees
  No restore fees
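The example rates above can be turned into a small charge calculator. This is a minimal sketch, assuming the per-GB rates quoted on the slide; the function names and the 500 GB customer are hypothetical. The ratio of the two year-1 rates (34.50 / 1.15 = 30) suggests roughly 30 GB stored on tape per GB of protected data, which is an inference from the quoted prices rather than a figure stated on the slide.

```python
# Rates quoted on the slide, in dollars per GB of data to back up per year.
YEAR1_RATE_PER_GB = 34.50
LATER_RATE_PER_GB = 9.82

# Rates quoted on the slide, in dollars per GB stored on tape per year.
YEAR1_TAPE_RATE = 1.15
LATER_TAPE_RATE = 0.33


def annual_charge(data_gb: float, customer_year: int) -> float:
    """Annual charge for a customer protecting `data_gb` of data.

    customer_year is 1 for a first-year customer, 2 or more afterwards.
    No connection, per-client, or restore fees are added (per the slide).
    """
    if customer_year < 1:
        raise ValueError("customer_year must be >= 1")
    rate = YEAR1_RATE_PER_GB if customer_year == 1 else LATER_RATE_PER_GB
    return round(data_gb * rate, 2)


def implied_tape_multiplier(data_rate: float, tape_rate: float) -> float:
    """GB on tape per GB of protected data implied by a pair of rates."""
    return data_rate / tape_rate


if __name__ == "__main__":
    # Hypothetical customer with 500 GB of data to protect.
    print("Year 1 charge:", annual_charge(500, 1))
    print("Year 2 charge:", annual_charge(500, 2))
    print("Implied tape multiplier:",
          round(implied_tape_multiplier(YEAR1_RATE_PER_GB, YEAR1_TAPE_RATE), 2))
```

Under these rates a customer's bill drops by roughly 70% after the first year, which matches the slide's point that existing hardware is not charged again.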

11 Proposed Rollout

12
  Start pilot small ~ 7.1TB
  The following groups will be asked to partake in pilot project
    – Astro, CEPA, CMS, D0, ESH, FESS, ISA, KTEV, LSS, MINOS, NUMI, PPD, SDSS, SIDET, Theory, VMS
  Desktops NOT part of initial rollout
  Lessons learned in year 1 determine year 2 growth

13 Proposed Rollout – Unknowns
  Limit of each backup block
    – Highly dependent on tape rotation and daily delta
    – A single server at CMU can handle 15TB of data
  Compressibility of data
    – Affects cost of service
    – Affects performance

14 Proposed Rollout – Timeline
  9/1 – Equipment delivered
  9/8 – Equipment installed
  9/8-9/15 – Functionality test
  9/16-10/01 – Systems testing
  10/04 – Start rollout
  12/04 – Complete initial rollout
  ~2 FTE effort required for initial rollout

15 Initial Costs

16 Costs - Library
  Three solutions:
  STK and 9940B tape drives
  ADIC and LTO-2 tape drives
  SpectraLogic and SAIT-1 tape drives

17 Costs – Library
  Item          | STK                  | Spectra              | D0 ADIC           | CDF ADIC
  Library       | $0                   | $69,418.12           | $19,302.00        | $19,302.00+
  Drives Needed | 8x 9940B             | 8x SAIT-1            | 8x LTO-2          | 8x LTO-2
  Drive Cost    | $208,000             | $122,894.87          | $76,480           | $76,480
  Tapes Needed  | 1095                 | 450                  | 1095              | 1095
  Tape Cost     | $78,292 ($357.50/TB) | $80,970 ($359.87/TB) | $72,270 ($330/TB) | $72,270 ($330/TB)
  TiBS S/W Port | $30,000+             | $0                   | $30,000+          | $30,000+
  Maintenance   | $6,960               | $14,436              | $39,276           | $48,015
  Total year 1  | $323,252+            | $287,719.29          | $237,328+         | $246,067+
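The line items in this comparison can be cross-checked with a short script. This is a sketch, not part of the original briefing: the dictionary layout and function names are illustrative, and the cartridge capacities (200 GB for 9940B and LTO-2, 500 GB for SAIT-1) are assumed native capacities that the slide does not state. With those capacities the computed cost per TB matches the figures quoted in parentheses. The Spectra line items sum to about $0.30 less than the quoted year-1 total, which may reflect rounding in one of the quoted items.

```python
# Quoted line items in dollars. A trailing "+" on the slide (open-ended cost)
# is not modeled here; these are the quoted lower bounds.
OPTIONS = {
    "STK":      {"library": 0.00,     "drives": 208000.00, "tapes": 78292.00,
                 "port": 30000.00, "maintenance": 6960.00},
    "Spectra":  {"library": 69418.12, "drives": 122894.87, "tapes": 80970.00,
                 "port": 0.00,     "maintenance": 14436.00},
    "D0 ADIC":  {"library": 19302.00, "drives": 76480.00,  "tapes": 72270.00,
                 "port": 30000.00, "maintenance": 39276.00},
    "CDF ADIC": {"library": 19302.00, "drives": 76480.00,  "tapes": 72270.00,
                 "port": 30000.00, "maintenance": 48015.00},
}

# Number of tapes needed, as quoted on the slide.
TAPE_COUNT = {"STK": 1095, "Spectra": 450, "D0 ADIC": 1095, "CDF ADIC": 1095}

# ASSUMPTION: native cartridge capacity in GB (not stated on the slide).
TAPE_GB = {"STK": 200, "Spectra": 500, "D0 ADIC": 200, "CDF ADIC": 200}


def year1_total(option: str) -> float:
    """Sum of all quoted year-1 line items for one library option."""
    return round(sum(OPTIONS[option].values()), 2)


def tape_cost_per_tb(option: str) -> float:
    """Tape cost divided by total native capacity, in dollars per TB."""
    total_tb = TAPE_COUNT[option] * TAPE_GB[option] / 1000
    return round(OPTIONS[option]["tapes"] / total_tb, 2)


if __name__ == "__main__":
    for name in OPTIONS:
        print(f"{name:9s} year 1: ${year1_total(name):>12,.2f}   "
              f"tape: ${tape_cost_per_tb(name):.2f}/TB")
```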

18 Costs – Server
  2x Sun V440 -- $28K
  1x SATA RAID Disk Cache -- $14K

19 Costs – Software - TiBS
  No additional client costs
  Servers: $8250 each
    – 2 have already been purchased
    – Additional cost for OFM s/w for Windows
  Process packs
    – Processes == number of parallel backups
    – 45 processes have been purchased
  Maintenance – 15%
    – Currently $11,590.35 annually
  Prices change if not managed by CSS

20 Possible Funding
  CSS Backup OP: $130K
  CSS Backup EQ: $120K
  CMS Backup EQ: $100K
  Other groups?

21 Estimated 6-year TCO
  Two Growth Models
  1. Fermi Standard
    – 10% daily delta
    – Data growth doubles yearly
    – Approximately 40% above industry standard
  2. Fermi Active
    – 45% daily delta (CMS scenario)
    – Data growth doubles yearly
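Both growth models can be projected with a few lines of arithmetic. This is a minimal sketch, assuming the ~7.1 TB pilot size as the year-1 starting point and a strict yearly doubling; the function names are illustrative. Strict doubling gives 56.8 TB in year 4, slightly above the 56.4 TB quoted on the year-4 slide, so the slide figures for years 4 to 6 may have been rounded or derived from a slightly different base.

```python
def project_data_tb(initial_tb: float, years: int,
                    growth_factor: float = 2.0) -> list:
    """Total protected data (TB) for each year, starting with year 1."""
    return [round(initial_tb * growth_factor ** y, 1) for y in range(years)]


def daily_change_tb(total_tb: float, daily_delta: float) -> float:
    """Data changed per day (TB) for a given daily delta fraction."""
    return round(total_tb * daily_delta, 3)


if __name__ == "__main__":
    sizes = project_data_tb(7.1, 6)
    print("Projected TB by year:", sizes)

    # Daily delta per scenario: 10% (Fermi Standard), 45% (Fermi Active).
    for year, size in enumerate(sizes, start=1):
        print(f"Year {year}: standard {daily_change_tb(size, 0.10)} TB/day, "
              f"active {daily_change_tb(size, 0.45)} TB/day")
```

The daily change volume is what differs between the two scenarios: at the same total size, the Active scenario moves 4.5 times as much data per day, which is consistent with it needing more backup blocks in the later slides.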

22 Estimated Year 2 (14.2TB)
  Fermi Standard
    – Double slots
    – Add caching disk
    – Tape cost down 25%
  Configuration:
    – 2 backup blocks
    – Slots 2x year 1
    – Increased caching disk
  Fermi Active
    – ~Triple slots
    – Double backup blocks
    – Add caching disk
    – Tape cost down 25%
  Configuration:
    – 4 backup blocks
    – Slots ~3x year 1
    – Increased caching disk

23 Estimated Year 3 (28.4TB)
  Fermi Standard
    – Double slots
    – Add caching disk
  Configuration:
    – 2 backup blocks
    – Slots 4x year 1
    – Increased caching disk
  Fermi Active
    – Double backup blocks
    – Double slots
    – Add caching disk
    – Drive cost down 25%
  Configuration:
    – 8 backup blocks
    – Slots ~6.5x year 1
    – Increased caching disk

24 Estimated Year 4
  Tape technology changes
    – Media capacity quadruples
    – Drive performance quadruples
  10GigE standard
    – Servers now equipped with 10GigE
    – Increased bus speeds
    – Generally available to the lab

25 Estimated Year 4
  Roll in new servers
    – Backup block == 1 server + 8 tape drives
  Migrate customers onto new servers
  Allow old tapes to migrate off via tape retention policies
  Slowly tear down old backup blocks
  Approximately 1 year migration

26 Estimated Year 4 (56.4TB)
  Fermi Standard
    – 2 new backup blocks
    – Increase slots by half
  Configuration:
    – 2 old backup blocks
    – 2 new backup blocks
    – 2/3 library old slots
    – 1/3 library new slots
    – Slots 6x year 1
  Fermi Active
    – 2 new backup blocks
    – Increase slots by half
  Configuration:
    – 8 old backup blocks
    – 2 new backup blocks
    – 2/3 library old slots
    – 1/3 library new slots
    – Slots ~10x year 1

27 Estimated Year 5 (112.8TB)
  Fermi Standard
    – Add caching disk
    – Convert slots
    – Tape cost down 25%
  Configuration:
    – 2 backup blocks
    – All slots new
      2/3 library used
      1/3 unused
    – Slots 6x year 1
  Fermi Active
    – Double backup blocks
    – Convert slots
    – Add caching disk
    – Tape cost down 25%
  Configuration:
    – 4 backup blocks
    – All slots new
      2/3 library used
      1/3 unused
    – Slots ~10x year 1

28 Estimated Year 6 (225.6TB)
  Fermi Standard
    – Add caching disk
    – Increase slots by 1/3
  Configuration:
    – 2 backup blocks
    – Slots 8x year 1
  Fermi Active
    – Add caching disk
    – 3 additional backup blocks
    – Increase slots 1/3
    – Drive cost down 25%
  Configuration:
    – 7 backup blocks
    – Slots ~13x year 1

29 Estimated 6-year TCO
  Year  | Spectra Standard | Spectra Active | ADIC Standard | ADIC Active
  1     | $329,719         | $329,719       | $279,328      | $279,328
  2     | $135,064         | $454,402       | $142,848      | $474,458
  3     | $132,569         | $721,767       | $234,420      | $830,748
  4     | $582,057         | $756,795       | $723,142      | E
  5     | $194,192         | $623,280       | $268,688      | E
  6     | $367,681         | $1.13M         | $505,083      | E
  Total | ~$1.74M          | ~$4.02M        | ~$2.15M       | E
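The totals row can be reproduced by summing the yearly figures, and the same data shows when the higher initial cost of the Spectra option is recovered. This is a sketch under stated assumptions: the names are illustrative, the year-6 Spectra Active value is entered as the rounded $1.13M quoted on the slide, and the ADIC Active years marked "E" are simply left out rather than estimated.

```python
from itertools import accumulate

# Yearly costs in dollars, as quoted on the slide.
TCO = {
    "Spectra Standard": [329719, 135064, 132569, 582057, 194192, 367681],
    # Year 6 is quoted only as "$1.13M"; the rounded value is used here.
    "Spectra Active":   [329719, 454402, 721767, 756795, 623280, 1130000],
    "ADIC Standard":    [279328, 142848, 234420, 723142, 268688, 505083],
    # Years 4-6 are marked "E" on the slide and are omitted, not estimated.
    "ADIC Active":      [279328, 474458, 830748],
}


def total(scenario: str) -> int:
    """Sum of all quoted yearly costs for a scenario."""
    return sum(TCO[scenario])


def cumulative(scenario: str) -> list:
    """Running total of costs by year."""
    return list(accumulate(TCO[scenario]))


def crossover_year(cheaper_later: str, cheaper_first: str):
    """First year in which the cumulative cost of `cheaper_later`
    drops below that of `cheaper_first`, or None if it never does."""
    pairs = zip(cumulative(cheaper_later), cumulative(cheaper_first))
    for year, (a, b) in enumerate(pairs, start=1):
        if a < b:
            return year
    return None


if __name__ == "__main__":
    for name in ("Spectra Standard", "Spectra Active", "ADIC Standard"):
        print(f"{name}: ${total(name) / 1e6:.2f}M over 6 years")
    print("Spectra Standard overtakes ADIC Standard in year",
          crossover_year("Spectra Standard", "ADIC Standard"))
```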

30 Needed Decisions

31 Decisions – Library
  Share D0 ADIC robot?
    – Initially, makes most sense financially
    – Backup s/w will need to be ported
      Custom engineering costs
    – Current s/w assumes all tapes are owned by backup service
      We will be the only deployment
      Fermi will need to provide development resources
    – Hardware
    – People/coordination
    – Active scenario outgrows robot at year 4. This assumes ZERO growth by D0 and SDSS (current users of robot).
    – Standard scenario will use 2/3 of D0 robot.

32 Decisions - Library
  Purchase new robot? (SpectraLogic)
    – Higher initial cost
    – Lower TCO over 6 years.
      Years 2-6 lower operational costs than ADIC
    – Backup software already works with it. Service can be brought up more quickly.
    – SpectraLogic library running TiBS software to be deployed at MIT (not for public disclosure).

33 Decisions - Library
  Purchase new robot (part 2)?
    – Small footprint. Fully populated library is 7 racks wide, 2 racks deep.
    – Library supports SAIT, LTO, LTO-2, SDLT, DLT
    – In ’05, will support disk
    – Relatively new product. Higher possibility hardware costs will decrease.

34 Footprint Comparison

35 Critical Timeline

36
  4/21: Division request for departments to determine how budgets will be spent
    – CSI/SCS robots running out of room
      Windows support 2.3TB pending decision
    – D0 Legato renewal
  6/1/04: Promised date to CMS for backup service to be online
  6/30/04: Latest time for TiBS to begin porting effort to make FY ’04 timeline.

37 Fini Open Discussion

38 CSI Backup System - Example
  Initially deployed in 2001 for AFS backups
  Expanded to include UNIX and some Windows
  2003 – SCS added to backup system
  Data has grown >600% since 2001.
  Backup system has not grown

39 CSI Backup System - Example

