The Spensa File System Douglas Santry Computer Laboratory University of Cambridge.


1 The Spensa File System Douglas Santry Computer Laboratory University of Cambridge

2 Target Environment “Lots” of physical machines in a machine room Physical machines interconnected by “high” quality network Machines are cheap and stuffed with “large” ATA disk drives

3 What are they doing? Machines are running virtual machines (Xen or VMware) Virtual machines are mobile, that is, they migrate between physical machines There is very little explicit file sharing between virtual machines Candidates include corporate data centres, “service” providers, e-commerce sites

4 Challenges Data availability and reliability Load balancing and performance tuning Service differentiation and guarantees Location transparency – virtual machines and data need to move transparently to one another ATA disks are cheap – they WILL fail

5 Spensa Features Service Differentiation Service Guarantees Service Isolation Automatic load balancing Automatic performance tuning

6 Spensa A Distributed File System Two components: a client file system and a server Servers store opaque objects – they have no notion of file systems The client file system is backed by objects on the servers and offers the traditional file system hierarchy and name space

7 An instance of a Spensa (Name: foo) [Figure: directory tree (/, usr, home, mnt); Machine A, Machine B, Machine C; label “Foo’s bascauda”] Spensa operates on objects

8 [Figure: Bascauda A, Bascauda B, Bascauda C; VMs with mounted Spensa A, Spensa B, Spensa C]

9 Spensa continued Every physical machine runs application virtual machines and a Spensa server Spensa servers run inside dedicated virtual machines – one per physical machine

10 Reliability and Availability Replication At 50 cents/GB one can afford to be free with it Replication factor specified on a per-Spensa basis

11 Reading Replicas Spensa client broadcasts request for data to all copies of it First machine to fetch it answers and cancels fetch on peers
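The read path on this slide (ask every replica, take the first answer, cancel the rest) can be sketched as follows. This is only an illustration under stated assumptions: the replica names, the simulated delays, and the functions `fetch_from_replica` and `read_object` are hypothetical and not part of Spensa, and threads stand in for network requests.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait


def fetch_from_replica(name, delay_s, data, cancel):
    """Simulate one replica fetching an object (hypothetical helper).

    The delay stands in for disk and network latency. If the cancel
    event is set before the delay elapses, the fetch is abandoned.
    """
    if cancel.wait(timeout=delay_s):
        return None  # a peer answered first
    return (name, data)


def read_object(replicas, data):
    """Send the request to every replica and return the first answer.

    `replicas` maps a replica name to its simulated latency in seconds.
    """
    cancel = threading.Event()
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [
            pool.submit(fetch_from_replica, name, delay, data, cancel)
            for name, delay in replicas.items()
        ]
        done, _pending = wait(futures, return_when=FIRST_COMPLETED)
        cancel.set()  # tell the slower replicas to stop fetching
        return next(iter(done)).result()


if __name__ == "__main__":
    print(read_object({"A": 0.3, "B": 0.01, "C": 0.3}, b"block"))
```

In this sketch the client issues the cancellation; on the slide it is the answering machine that cancels the fetch on its peers, which saves a round trip through the client.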

12 Caching Servers reside in virtual machines with all of the other virtual machines – memory is critical Servers do not cache client data Servers cache path-critical metadata to minimize latency (backing file system’s inodes, bitmaps, etc.)

13 Service Service can be specified in terms of time or bandwidth Time is specified in terms of percentage Bandwidth specified in KB/s Latency in milliseconds A Server is configured for either time or bandwidth. They are mutually exclusive
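The mutual exclusion between time and bandwidth could be captured in a small configuration type. The sketch below is an assumption about shape only: the class `ServiceSpec` and its field names are hypothetical, and treating latency as an optional extra is a guess, since the slide does not say how latency relates to the two modes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceSpec:
    """Hypothetical service specification for one server."""

    time_percent: Optional[float] = None      # share of server time, in percent
    bandwidth_kb_per_s: Optional[float] = None  # bandwidth, in KB/s
    latency_ms: Optional[float] = None        # latency, in milliseconds (assumed optional)

    def __post_init__(self):
        has_time = self.time_percent is not None
        has_bandwidth = self.bandwidth_kb_per_s is not None
        # Time and bandwidth are mutually exclusive: exactly one must be set.
        if has_time == has_bandwidth:
            raise ValueError("specify exactly one of time_percent or bandwidth_kb_per_s")
        if has_time and not 0 < self.time_percent <= 100:
            raise ValueError("time_percent must be in (0, 100]")
        if has_bandwidth and self.bandwidth_kb_per_s <= 0:
            raise ValueError("bandwidth_kb_per_s must be positive")
        if self.latency_ms is not None and self.latency_ms <= 0:
            raise ValueError("latency_ms must be positive")


if __name__ == "__main__":
    print(ServiceSpec(bandwidth_kb_per_s=512, latency_ms=20))
    print(ServiceSpec(time_percent=25))
```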

14 Service Continued Enforcement is distributed. There are no centralised or interposed enforcement machines or mechanisms Bandwidth seems to be more intuitive for humans to specify Bandwidth offers tighter short-term control
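The slides do not say how a server enforces a bandwidth figure. One common technique for local, per-server rate limiting is a token bucket, shown here purely as an illustration and not as Spensa's mechanism. The class `TokenBucket`, its parameters, and the explicit `now` argument (used instead of a wall clock so the behaviour is reproducible) are all assumptions.

```python
class TokenBucket:
    """Token-bucket limiter for one client on one server (illustrative)."""

    def __init__(self, rate_kb_per_s, burst_kb):
        self.rate = rate_kb_per_s   # refill rate in KB/s
        self.capacity = burst_kb    # largest burst allowed, in KB
        self.tokens = burst_kb      # start with a full bucket
        self.last = 0.0             # time of the previous request, in seconds

    def allow(self, size_kb, now):
        """Return True if a request of size_kb may proceed at time `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never beyond the burst capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if size_kb <= self.tokens:
            self.tokens -= size_kb
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate_kb_per_s=100, burst_kb=50)
    print(bucket.allow(50, now=0.0))  # the burst allowance is spent
    print(bucket.allow(10, now=0.0))  # nothing left yet
    print(bucket.allow(40, now=0.5))  # half a second refills 50 KB
```

Because each server keeps its own bucket, no central or interposed machine is needed, which is consistent with the distributed enforcement the slide describes.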

15 Load Balancing Too many machines (real and virtual) for a human to make provisioning decisions - Spensa auto-provisions Load balancing mitigates poor decisions Virtual diffusion with direct migration
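The slide does not define "virtual diffusion with direct migration". One possible reading is that a diffusion process is run on paper to find the balanced load per machine, and data is then moved straight from overloaded to underloaded machines instead of hopping between neighbours. The sketch below follows that reading only; the functions `diffuse` and `direct_migrations`, the coefficient `alpha`, and the load units are hypothetical.

```python
def diffuse(loads, neighbours, alpha=0.25, rounds=50):
    """Run diffusion virtually: no data moves, only the numbers change.

    Each round, every node shifts its load towards each neighbour's by
    alpha times the difference. With symmetric neighbour lists the total
    load is conserved; alpha times the largest degree should stay below 1.
    """
    current = dict(loads)
    for _ in range(rounds):
        nxt = dict(current)
        for node, peers in neighbours.items():
            for peer in peers:
                nxt[node] += alpha * (current[peer] - current[node])
        current = nxt
    return current


def direct_migrations(loads, target, tolerance=1e-9):
    """Match surplus nodes to deficit nodes and return (src, dst, amount)."""
    surplus = [[n, loads[n] - target[n]] for n in sorted(loads)
               if loads[n] - target[n] > tolerance]
    deficit = [[n, target[n] - loads[n]] for n in sorted(loads)
               if target[n] - loads[n] > tolerance]
    moves, i, j = [], 0, 0
    while i < len(surplus) and j < len(deficit):
        amount = min(surplus[i][1], deficit[j][1])
        moves.append((surplus[i][0], deficit[j][0], amount))
        surplus[i][1] -= amount
        deficit[j][1] -= amount
        if surplus[i][1] <= tolerance:
            i += 1
        if deficit[j][1] <= tolerance:
            j += 1
    return moves


if __name__ == "__main__":
    loads = {"A": 90.0, "B": 30.0, "C": 0.0}
    neighbours = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
    target = diffuse(loads, neighbours)
    for src, dst, amount in direct_migrations(loads, target):
        print(f"move {amount:.1f} units of load from {src} to {dst}")
```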

16 Diffusion Bascaudae need to be decomposed for partial migration Bascaudae are decomposed in the object name space (the server has no knowledge of the file system’s name space) Traffic is not Poisson – use the real distribution Servers keep a per-bascauda load and address reference histogram
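A per-bascauda reference histogram over the object name space might look like the sketch below, which also shows how the measured distribution could pick a split point for partial migration. Everything here is an assumption made for illustration: the class `LoadTracker`, integer object identifiers, the fixed bucket size, and the rule of splitting at a contiguous range of object addresses.

```python
from collections import Counter, defaultdict


class LoadTracker:
    """Per-bascauda reference counts, bucketed by object address."""

    def __init__(self, bucket_size=1024):
        self.bucket_size = bucket_size
        self.histograms = defaultdict(Counter)

    def record(self, bascauda, object_id):
        """Count one reference to an object in the given bascauda."""
        self.histograms[bascauda][object_id // self.bucket_size] += 1

    def load(self, bascauda):
        """Total number of references seen for the bascauda."""
        return sum(self.histograms.get(bascauda, Counter()).values())

    def split_point(self, bascauda, fraction):
        """Lowest bucket b such that buckets <= b carry at least
        `fraction` of the observed load, or None if nothing was seen."""
        hist = self.histograms.get(bascauda, Counter())
        total = sum(hist.values())
        if total == 0:
            return None
        running = 0
        for bucket in sorted(hist):
            running += hist[bucket]
            if running >= fraction * total:
                return bucket
        return max(hist)


if __name__ == "__main__":
    tracker = LoadTracker(bucket_size=1024)
    for _ in range(10):
        tracker.record("foo", 5)      # falls in bucket 0
    for _ in range(30):
        tracker.record("foo", 1500)   # falls in bucket 1
    for _ in range(60):
        tracker.record("foo", 5200)   # falls in bucket 5
    print(tracker.load("foo"), tracker.split_point("foo", 0.25))
```

Splitting on the measured histogram, and not on an assumed Poisson model, follows the slide's advice to use the real distribution.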

