Total Cost of Preservation: Cost Modeling for Sustainable Services. Stephen Abrams, Patricia Cruse, John Kunze, University of California Curation Center, California Digital Library. Screening the Future 2012: Pause, Play, and Press Forward, Los Angeles, May 21-23, 2012
Outline Goals; Prior work; Modeling preservation activity; Total cost of preservation ► Pay-as-you-go price model ► Paid-up price model; Conclusions; Questions and discussion http://wiki.ucop.edu/display/Curation/Cost+Modeling
Goals Understand costs in order to plan for and implement sustainable preservation services Investigate the possibility of paid-up pricing in order to address ► Boom-or-bust budget cycles ► Fixed-term, grant-funded projects (end date!)
Prior work Nationaal Archief (2005) http://www.nationaalarchief.nl/sites/default/files/docs/kennisbank/codpv1.pdf LIFE (2008) http://www.life.ac.uk/ KRDS (2010) http://www.beagrie.com/krds.php DataSpace (2010) http://arks.princeton.edu/ark:/88435/dsp01w6634361k Jean-Daniel Zeller (2010) “Cost of digital archiving: Is there a universal model?” 8th European Conference on Digital Archiving, Geneva, April 28-30, 2010 http://regarddejanus.files.wordpress.com/2010/05/costsdigitalarchiving-_jdz_eca2010.pdf Rosenthal (2011) http://blog.dshr.org/2011/09/modeling-economics-of-long-term-storage.html Themes in this work: identification of granular cost components; assumption of annual decrease in aggregate cost, i.e., discounted cash flow (DCF); critique of the DCF approach
Key assumptions Consider only the costs incurred by the preservation service provider ► Costs of content creation by collection managers are out of scope Costs can be categorized unambiguously as fixed or marginal, and one-time or recurring ► One-time costs can be annualized over the effective lifespan of the activity or system component
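A minimal sketch of annualizing a one-time cost. It assumes straight-line allocation over the lifespan, which the slide does not specify; the function name and the example figures are hypothetical.

```python
def annualize(one_time_cost: float, lifespan_years: float) -> float:
    """Spread a one-time cost evenly over its effective lifespan
    (straight-line allocation, an assumption of this sketch)."""
    if lifespan_years <= 0:
        raise ValueError("lifespan_years must be positive")
    return one_time_cost / lifespan_years


# Hypothetical figures: a $12,000 server with a 4-year effective lifespan.
server_annual_cost = annualize(12_000, 4)
print(server_annual_cost)  # 3000.0
```

Once annualized, a one-time cost can be treated like any other recurring cost in the model.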
Cost model components System, composed of various Services for necessary/desirable functions, running on Servers, deployed by Staff, in support of content Producers, who use Workflows to submit instances of Content Types, which occupy Storage, and are subject to ongoing Monitoring and periodic Interventions; all subject to managerial Oversight
Total cost of preservation (total cost to service provider) = fixed cost of System + (number × unit cost of Producers) + (number × unit cost of Workflows) + (number × unit cost of Content Types) + (number × unit cost of Storage) + (number × unit cost of Monitoring) + (number × unit cost of Interventions) + fixed cost of Oversight. The System component subsumes Services and Servers; Staff costs are subsumed by other components
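The total can be sketched in a few lines of Python, assuming the additive structure that the slide's labels suggest: two fixed costs plus a count-times-unit-cost term for each remaining component. The class, the function name, and all dollar figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Component:
    count: float      # number of units of this component
    unit_cost: float  # annual cost per unit

    def cost(self) -> float:
        return self.count * self.unit_cost


def total_cost(system_fixed, oversight_fixed, components):
    """Fixed costs plus the sum of count * unit cost over all components."""
    return system_fixed + oversight_fixed + sum(c.cost() for c in components.values())


# Hypothetical annual figures, for illustration only.
components = {
    "Producers": Component(20, 500),
    "Workflows": Component(5, 1_000),
    "Content Types": Component(10, 800),
    "Storage": Component(100, 650),
    "Monitoring": Component(1, 4_000),
    "Interventions": Component(2, 3_000),
}
print(total_cost(50_000, 10_000, components))  # 158000
```

Excluding a subsidized or non-recoverable component amounts to dropping its entry from the dictionary.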
Total cost of preservation Model is rich enough to represent the full economic cost of preservation Implemented by a spreadsheet that captures all subsidiary costs
Total cost of preservation Model is rich enough to represent the full economic cost of preservation But service providers can customize the model to exclude components whose costs are not recoverable or are subsidized as a matter of local policy
Assumption: Cost allocation The costs of the Archive, Workflows, Content Types, Monitoring, and Interventions are “common goods” ► Equally beneficial to all Producers ► Properly apportioned across all Producers
Cost of a single Producer [Equation annotated with: total cost attributable to a given Producer; unit cost of a Producer; number of Producers; number of Storage units attributable to the Producer]
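One way to read the allocation is sketched below. It assumes the common-good costs are split equally across Producers, and that each Producer additionally bears its own unit cost and the cost of the Storage it uses. The exact form of the slide's equation is not recoverable from the transcript, so the function and the figures are hypothetical.

```python
def producer_cost(common_cost, n_producers, producer_unit_cost,
                  storage_units, storage_unit_cost):
    """Annual cost attributable to one Producer.

    common_cost: annual cost of the shared "common goods", split equally
                 across all Producers (an assumption of this sketch).
    """
    if n_producers <= 0:
        raise ValueError("n_producers must be positive")
    equal_share = common_cost / n_producers
    return equal_share + producer_unit_cost + storage_units * storage_unit_cost


# Hypothetical: $83,000 of common costs, 20 Producers, a $500 per-Producer
# cost, and 5 storage units at $650 each.
print(producer_cost(83_000, 20, 500, 5, 650))  # 7900.0
```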
Assumptions: Billing Costs are billed at the end of the period of service The cost model should be revenue neutral
Pay-as-you-go cash flow [Cash flow diagram: at each of t = 1, 2, 3, an expense of G is matched by an income of G] G is the pay-as-you-go price for a single Producer, equal to the cost of a single Producer. Cumulative pay-as-you-go price over time period T: $G(T) = T \cdot G$
Cumulative pay-as-you-go price [Plot: cumulative pay-as-you-go price G(T) as a function of time T, including “forever”; x-axis: Year (T), 0 to 30; y-axis: Cost ($), $0 to $16,000]
Assumptions: Costs over time The aggregate cost of providing preservation service decreases over time; and that decrease is uniform ► Moore’s and Kryder’s laws [Figures: Moore’s law, 1971 – 2011; Kryder’s law, 1980 – 2012. Source: Wikipedia]
Assumptions: Costs over time The aggregate cost of providing preservation service decreases over time; and that decrease is uniform ► Moore’s and Kryder’s laws ► State-of-the-art tools and understanding ► Productivity increases
Discounted pay-as-you-go cash flow [Discounted cash flow (DCF) diagram: at t = 1, 2, 3, expense and income are G, (1–d)·G, and (1–d)²·G; the discounting factor (1–d) compounds over time] G is the pay-as-you-go price for a single Producer, equal to the cost of a single Producer. Discounted pay-as-you-go price over time period T: $G(T,d) = \sum_{t=1}^{T} (1-d)^{t-1} \cdot G$
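The discounted payments form a finite geometric series, so the total has a closed form. The derivation below is a sketch that assumes the first payment of G falls at t = 1, each later payment shrinks by the factor (1–d) as in the diagram, and 0 < d < 1.

```latex
G(T,d) \;=\; \sum_{t=1}^{T} (1-d)^{t-1}\, G
       \;=\; G \cdot \frac{1-(1-d)^{T}}{d},
\qquad
G(\infty,d) \;=\; \lim_{T\to\infty} G(T,d) \;=\; \frac{G}{d}.
```

With G = $650 and d = 5%, the values used in the paid-up example, this gives G(10, d) ≈ $5,216 and G(∞, d) = $13,000.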
Discounted pay-as-you-go price [Plot as a function of time T; x-axis: Year (T), 0 to 30; y-axis: Cost ($), $0 to $16,000; curves: (1–d)^t discount factor, discounted pay-as-you-go price over time period T G(T, d), discounted pay-as-you-go price for “forever” G(∞, d), cumulative pay-as-you-go G(T)]
Discount factor d is the weighted sum of the expected changes in number and unit cost of individual components Weighting factors ω are the proportions that particular components contribute to the aggregate cost G
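A minimal sketch of the weighted sum. It assumes each component's expected change in number and unit cost is collapsed into a single annual rate of decrease, which is a simplification made for this example. The component figures and the function name are hypothetical.

```python
def aggregate_discount(components):
    """Weighted-sum discount factor d.

    components maps a name to (annual_cost, expected_annual_decrease).
    Each weight is the component's share of the aggregate cost.
    """
    aggregate = sum(cost for cost, _ in components.values())
    return sum((cost / aggregate) * decrease
               for cost, decrease in components.values())


# Hypothetical: Storage gets 10% cheaper per year, Monitoring 2% cheaper,
# and Interventions 5% more expensive (a negative decrease).
d = aggregate_discount({
    "Storage": (600.0, 0.10),
    "Monitoring": (300.0, 0.02),
    "Interventions": (100.0, -0.05),
})
print(round(d, 3))  # 0.061
```

Because the weights are cost shares, a component that dominates the aggregate cost also dominates d.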
Drawbacks to pay-as-you-go pricing Only viable for Producers with reliable annual funding sources Boom-or-bust budgeting or the termination of funded project work can interrupt this funding Any interruption in proactive preservation care can lead to irretrievable data loss
Assumptions: Investment return Preservation service providers can carry forward budgetary surpluses across fiscal years Surplus funds can be invested with the return supplementing the surplus
Paid-up cash flow [Cash flow diagram] F is the paid-up price for time period T, received as income at t = 0; G is the cost of a single Producer; r is the investment return. Year 1: investment return r·F, expense G, surplus (1+r)·F – G. Year 2: investment return r·[(1+r)·F – G], expense (1–d)·G, surplus (1+r)·[(1+r)·F – G] – (1–d)·G. Year 3: investment return r·[(1+r)·[(1+r)·F – G] – (1–d)·G], expense (1–d)²·G, surplus (1+r)·[(1+r)·[(1+r)·F – G] – (1–d)·G] – (1–d)²·G
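The surplus sequence in the diagram leads to a closed form for F. The sketch below assumes that revenue neutrality means the surplus is exactly zero after the final payment at t = T, and that r + d ≠ 0. The symbol S_t for the surplus after year t is introduced here for the derivation and does not appear in the slides.

```latex
\begin{aligned}
S_0 &= F, \qquad S_t = (1+r)\,S_{t-1} - (1-d)^{t-1}\,G \quad (t = 1,\dots,T),\\[4pt]
S_T &= 0 \;\Longrightarrow\;
F(T,d,r) \;=\; G \sum_{t=1}^{T} \frac{(1-d)^{t-1}}{(1+r)^{t}}
        \;=\; G \cdot \frac{1-\left(\dfrac{1-d}{1+r}\right)^{T}}{r+d}.
\end{aligned}
```

Each term of the sum is the year-t cost divided by the growth the fund has earned by the end of year t.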
Paid-up price [Plot as a function of time T; x-axis: Year (T), 0 to 30; y-axis: Cost ($), $0 to $16,000; curves: (1–d)^t discount factor, (1+r)^t investment return, paid-up price for time period T F(T, d, r), paid-up price for “forever” F(∞, d, r), discounted pay-as-you-go G(T, d), discounted pay-as-you-go for “forever” G(∞, d), cumulative pay-as-you-go G(T)]
Paid-up example Pay-as-you-go price, G: $650 (1 TB); Discount factor, d: 5%; Investment return, r: 2%; Term, T: 10 years; Paid-up price, F: $4,725 (with d and r) < $5,216 (discounted pay-as-you-go, with d only) < $6,500 (cumulative pay-as-you-go)
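The three figures in this example can be reproduced with a short script. It is a sketch under the assumptions read off the cash flow slides: payments fall at the end of each year, the cost shrinks by (1–d) per year, and the fund earns r per year. The function names are hypothetical.

```python
def cumulative_price(G, T):
    """Undiscounted pay-as-you-go total over T years."""
    return G * T


def discounted_price(G, T, d):
    """Pay-as-you-go total when the yearly cost shrinks by (1 - d)."""
    return sum(G * (1 - d) ** (t - 1) for t in range(1, T + 1))


def paid_up_price(G, T, d, r):
    """One-time payment that, invested at r, covers all T discounted payments."""
    return sum(G * (1 - d) ** (t - 1) / (1 + r) ** t for t in range(1, T + 1))


def final_surplus(F, G, T, d, r):
    """Replay the cash flow: grow the fund by r, then pay that year's cost."""
    balance = F
    for t in range(1, T + 1):
        balance = balance * (1 + r) - G * (1 - d) ** (t - 1)
    return balance


G, T, d, r = 650.0, 10, 0.05, 0.02
F = paid_up_price(G, T, d, r)
print(round(cumulative_price(G, T)))     # 6500
print(round(discounted_price(G, T, d)))  # 5216
print(round(F))                          # 4725
```

Replaying the cash flow with `final_surplus(F, G, T, d, r)` returns a value within floating-point error of zero, which is the revenue-neutral condition.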
Coefficient of permanence It is useful to be able to transition from a pay-as-you-go to a paid-up price basis If you’re currently paying G on a pay-as-you-go basis, you can upgrade to a paid-up basis with a one-time payment of F = G·φ, where φ is the coefficient of permanence ► Princeton DataSpace, φ ≈ 30 (T = ∞) ► USC digital repository, φ ≈ 1.2 (T = 20)
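A minimal sketch of computing φ as the ratio F/G. It assumes the paid-up cash flow in which year t costs (1–d)^(t–1)·G and the fund earns r per year, so that φ depends only on T, d, and r. The function name is hypothetical, and the sketch makes no attempt to reproduce the Princeton or USC figures, whose parameters are not given in the slides.

```python
import math


def coefficient_of_permanence(T, d, r):
    """phi = F / G for term T (use math.inf for "forever")."""
    q = (1 - d) / (1 + r)  # year-over-year ratio of discounted payments
    if math.isinf(T):
        if abs(q) >= 1:
            raise ValueError("no finite limit: requires |(1 - d) / (1 + r)| < 1")
        return 1 / (r + d)  # limit of the geometric series
    return sum((1 - d) ** (t - 1) / (1 + r) ** t for t in range(1, T + 1))


phi_10 = coefficient_of_permanence(10, 0.05, 0.02)
phi_forever = coefficient_of_permanence(math.inf, 0.05, 0.02)
print(round(650 * phi_10))  # 4725
```

For T = ∞ the coefficient reduces to 1/(r + d), so it is highly sensitive to small changes in either rate when both are small.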
Problems with r and d TCP modeling is dependent on the predictive reliability of r and d ► For d, extrapolate from Moore’s and Kryder’s laws? [Figures: Moore’s law, 1971 – 2011; Kryder’s law, 1980 – 2012. Source: Wikipedia]
Problems with r and d TCP modeling is dependent on the predictive reliability of r and d ► For d, extrapolate from Moore’s and Kryder’s laws? ► For r, extrapolate from 30 year Treasury bonds? [Figures: 30 year treasuries, 2007 – 2012, source: http://ycharts.com/indicators/30_year_treasury_rate; 30 year treasuries, 1882 – 2012, source: Robert Shiller]
Model the risk Round up r and d, i.e., add a fixed “risk premium” Add an additional risk component R to the formula for G (a further term, + R) ► Its influence on the price can grow over time, reflecting increasing uncertainty, by setting a negative discount factor d_R so that 1–d_R > 1 ► Note, however, that if the weighted sum d becomes less than 0 and |d| > r then G(T) will not converge to a limit
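The effect of a growing risk component can be sketched as follows. The sketch assumes the yearly cost is a discounted base cost plus a risk term R that compounds at (1–d_R), and it reads the convergence note as a condition on the ratio (1–d)/(1+r); that reading is an interpretation, not something stated on the slide. All names and figures are hypothetical.

```python
def yearly_cost(t, base, d, risk, d_risk):
    """Cost in year t (t = 1, 2, ...): discounted base cost plus risk component."""
    return base * (1 - d) ** (t - 1) + risk * (1 - d_risk) ** (t - 1)


def risk_share(t, base, d, risk, d_risk):
    """Fraction of the year-t cost that is due to the risk component."""
    risk_part = risk * (1 - d_risk) ** (t - 1)
    return risk_part / yearly_cost(t, base, d, risk, d_risk)


def has_finite_limit(d, r):
    """True when successive discounted payments shrink, i.e. d > -r."""
    return (1 - d) / (1 + r) < 1


# Hypothetical: base cost $600 falling 5% per year, risk $50 growing 3% per year.
share_year_1 = risk_share(1, 600.0, 0.05, 50.0, -0.03)
share_year_10 = risk_share(10, 600.0, 0.05, 50.0, -0.03)
print(share_year_1 < share_year_10)   # True
print(has_finite_limit(0.05, 0.02))   # True
print(has_finite_limit(-0.03, 0.02))  # False
```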
Recalibrate the model G and F do not have to be fixed values over time ► Periodically recalculate based on current conditions (actual costs for G ) and predictions ( r and d ), and apply prospectively ► Retrospective service contracts remain “locked-in”
Hybrid price model Distinguish between costs that are (relatively) easy to quantify and forecast, and those that aren’t ► Use the paid-up model for the former and pay-as-you-go for the latter. Easy: Archive, Producer, Workflow, Content Type, Monitoring, Storage. Difficult: Intervention
Hybrid price model Distinguish between costs that are (relatively) easy to quantify and forecast, and those that aren’t ► Use the paid-up model for the former and pay-as-you-go for the latter ► Bit preservation only. Easy: Archive, Producer, Storage. Difficult: Content Type, Workflow, Monitoring, Intervention
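A hybrid price could be computed along these lines. The sketch assumes each component's annual per-Producer cost is known, prices the "easy" set with the paid-up sum, and leaves the "difficult" set as a first-year pay-as-you-go charge. The per-component dollar figures and function names are hypothetical.

```python
def paid_up_factor(T, d, r):
    """Sum of discounted, investment-adjusted yearly payments per $1 of cost."""
    return sum((1 - d) ** (t - 1) / (1 + r) ** t for t in range(1, T + 1))


def hybrid_price(annual_costs, easy, T, d, r):
    """Return (one-time paid-up amount, first-year pay-as-you-go amount)."""
    easy_annual = sum(c for name, c in annual_costs.items() if name in easy)
    difficult_annual = sum(c for name, c in annual_costs.items() if name not in easy)
    return easy_annual * paid_up_factor(T, d, r), difficult_annual


# Hypothetical annual costs, split as on the "bit preservation only" slide.
annual_costs = {
    "Archive": 100.0, "Producer": 50.0, "Storage": 500.0,
    "Content Type": 40.0, "Workflow": 30.0, "Monitoring": 20.0,
    "Intervention": 60.0,
}
easy = {"Archive", "Producer", "Storage"}
upfront, annual = hybrid_price(annual_costs, easy, T=10, d=0.05, r=0.02)
print(round(upfront), annual)  # 4725 150.0
```

The annual amount is only the first year's charge; how the difficult components are repriced in later years is left open here, as it is in the slides.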
Bound the uncertainty The discounted cash flow (DCF) approach is problematic on practical and theoretical grounds ► Difficulty in setting fixed values for r and d that realistically represent financial and technological trends over time Stochastic modeling to determine the probability distribution of possible outcomes ► Cf. David Rosenthal, FAST ’12 http://blog.dshr.org/2012/02/fast-2012.html CNI Fall 2011 http://www.youtube.com/watch?v=_5lQxmyz3xY
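As a rough illustration of stochastic modeling, the sketch below redraws d and r every year from normal distributions and collects the paid-up amount that each simulated path would have required. This is a generic Monte Carlo example, not Rosenthal's model; the distributions, their spreads, and all names are assumptions made for illustration.

```python
import random


def simulate_paid_up(G, T, d_mean, d_sd, r_mean, r_sd, rng):
    """Paid-up amount needed along one random path of yearly d and r."""
    cost = G        # cost payable at the end of the current year
    growth = 1.0    # cumulative investment growth of $1 paid at t = 0
    required = 0.0
    for _ in range(T):
        growth *= 1 + rng.gauss(r_mean, r_sd)  # fund grows during the year
        required += cost / growth              # amount at t = 0 that covers this payment
        cost *= 1 - rng.gauss(d_mean, d_sd)    # next year's cost
    return required


rng = random.Random(0)  # fixed seed so runs are repeatable
outcomes = sorted(
    simulate_paid_up(650.0, 10, 0.05, 0.03, 0.02, 0.01, rng)
    for _ in range(10_000)
)
median = outcomes[len(outcomes) // 2]
p95 = outcomes[int(0.95 * len(outcomes))]
print(f"median ≈ ${median:,.0f}, 95th percentile ≈ ${p95:,.0f}")
```

With both standard deviations set to zero the function reduces to the deterministic paid-up price, which gives a quick sanity check. Pricing at a high percentile rather than the median is one way to bound the risk of undercharging.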
Preservation forever Some things are intended to last forever… [Images; sources: John Church Company, United Artists]
Preservation forever ? Some things are intended to last forever…
Preservation for … A fixed term – 10 years? 20 years? – may be appropriate for much content ► Give content an opportunity to prove its worth, as evidenced by someone’s commitment to pay for its subsequent preservation
Transparency and opportunity Possible outcomes… ► We overestimate our costs and collect too much ● Fund a higher level of service ● Refund some portion ► We underestimate ● Ask for additional funds ● Lower service levels ● De-accession content – but at least it was preserved up to that point and had a chance to prove its value, and gain an advocate
Conclusions Different customers have different funding capabilities ► Flexibility in price models is important Any price model is based on an idealization of the real world ► Assumptions matter Understanding all of your costs is a precondition to a policy decision to recover all or part of those costs ► Cost accounting is difficult If investment return and discount factor can be reliably projected, DCF can be used to model long-term costs ► What if not?
Conclusions Even if we don’t have a perfect model, we need to move forward now with a “good enough” model
For more information Total Cost of Preservation: Cost Modeling for Sustainable Services http://wiki.ucop.edu/display/Curation/Cost+Modeling UC Curation Center http://www.cdlib.org/uc3 Stephen Abrams, Patricia Cruse, Scott Fisher, Erik Hetzner, Greg Janée, John Kunze, Margaret Low, David Loy, Mark Reyes, Abhishek Salve, Joan Starr, Tracy Seneca, Carly Strasser, Marisa Strong, Adrian Turner, Perry Willett