Long tails and Archive systems Elliot Jaffe FDIS 2005.


1 Long tails and Archive systems Elliot Jaffe FDIS 2005

2 Archive Metrics
What
–Distribution of file sizes
–Distribution of occupied storage
–How are files accessed
Why
–System architecture
–Scaling for access

3 File size studies
UFS93 (1993)
–12 million files
–UNIX only
–Avg. file size is 2k
–90% of storage in 11% of files
HUJI (2005)
–4 million files
–UNIX + Windows
–Avg. file size is 8k
–90% of storage in 5.5% of files
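The "90% of storage in N% of files" figures can be reproduced from any list of file sizes. The following is a minimal sketch of that calculation; the function name and the sample sizes are illustrative assumptions, not taken from the UFS93 or HUJI studies.

```python
def fraction_of_files_holding(sizes, share=0.90):
    """Return the smallest fraction of files (largest first) that
    together occupy at least `share` of the total bytes."""
    if not sizes:
        raise ValueError("sizes must be non-empty")
    total = sum(sizes)
    if total <= 0:
        raise ValueError("total size must be positive")
    target = share * total
    running = 0
    # Walk the files from largest to smallest, accumulating bytes.
    for count, size in enumerate(sorted(sizes, reverse=True), start=1):
        running += size
        if running >= target:
            return count / len(sizes)
    return 1.0


# Hypothetical sample: nine tiny files and one large one.
sample = [1] * 9 + [91]
print(fraction_of_files_holding(sample))
```

A smaller result means storage is more concentrated in a few large files, which is the direction of the change between the 1993 and 2005 figures.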

4 What’s Changed
Then
–JAWS, NOW
–Online was expensive
–Offline tape storage
Now
–Central File Servers
–Digital Libraries
–Online is cheap
–No offline storage
–XML
–Multimedia

5 Empirical Data

6 Questions
What is the future of these distributions?
Are the changes extensions of the tails with power laws, so that 10/90 and 20/80 rules no longer work and are the wrong way to think about them?
Are the changes based on external factors that are unpredictable?
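One common way to probe whether the tail of a size distribution looks like a power law is the Hill estimator, which estimates the tail exponent from the k largest observations. The slides do not propose this method; the sketch below is an assumption about how the question could be explored, and the choice of `k` is left to the analyst.

```python
import math


def hill_tail_index(sizes, k):
    """Hill estimate of the power-law tail exponent, computed from
    the k largest positive values in `sizes`."""
    ordered = sorted((s for s in sizes if s > 0), reverse=True)
    if not 0 < k < len(ordered):
        raise ValueError("k must satisfy 0 < k < number of positive sizes")
    threshold = ordered[k]  # the (k+1)-th largest value
    log_excess = sum(math.log(x / threshold) for x in ordered[:k])
    if log_excess == 0:
        raise ValueError("the top k values all equal the threshold")
    return k / log_excess


# Hypothetical sizes; real input would be file sizes from a scan.
print(hill_tail_index([1, 2, 4, 8], k=3))
```

A lower exponent indicates a heavier tail. Comparing estimates from scans taken years apart would show whether the tail is extending, although the estimate is sensitive to `k` and a single number cannot confirm a power law on its own.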

7 The Long Tail
Chris Anderson (2004)
–http://www.wired.com/wired/archive/12.10/tail.html
The long tail of a distribution has tremendous mass and creates new market opportunities
Amazon, Netflix, Wikipedia

8 Today’s landscape
[Figure: axes Storage Capacity and Access Frequency; labels NOW, File Servers, Sarbanes Oxley, Digital Libraries]

9 Next Steps
Collecting data from large storage systems
–File Sizes, Created, Last Modified, Last Access, Frequency of Reads
Goal: New architectures for Digital libraries
–Focus on Operations
–Store large and small files differently
–Store very-low access files in slow access
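The data collection and placement ideas above can be sketched in a few lines. This is an illustrative sketch only: `FileRecord`, `scan`, `choose_tier`, and both thresholds are hypothetical names and values, not part of the talk. Read frequency is not available from ordinary file metadata, so the sketch uses last access time as a stand-in, and it omits creation time because its availability differs between platforms.

```python
import os
import stat
import time
from dataclasses import dataclass


@dataclass
class FileRecord:
    path: str
    size: int        # bytes
    modified: float  # last-modified time, seconds since the epoch
    accessed: float  # last-access time, seconds since the epoch


def scan(root):
    """Collect size and timestamps for every regular file under root."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                records.append(
                    FileRecord(path, st.st_size, st.st_mtime, st.st_atime)
                )
    return records


# Hypothetical thresholds, chosen only for illustration.
LARGE_FILE_BYTES = 1024 * 1024
COLD_AFTER_DAYS = 365


def choose_tier(record, now=None):
    """Assign a file to 'slow', 'large', or 'small' storage."""
    now = time.time() if now is None else now
    idle_days = (now - record.accessed) / 86400
    if idle_days > COLD_AFTER_DAYS:
        return "slow"  # very-low-access files go to slow storage
    return "large" if record.size >= LARGE_FILE_BYTES else "small"
```

Calling `scan(".")` and mapping `choose_tier` over the result gives a rough picture of how a directory tree would split across the three tiers. Access times may be unreliable on file systems mounted with access-time updates disabled.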

