The Zebra Striped Network File System Presentation by Joseph Thompson
Purpose Single file server architectures will not be able to support future throughput needs. A striping technique is needed that supports writes of all file sizes in an effective and uniform manner.
Striping in Zebra RAID Per-File Striping in a Network File System Log-Structured File Systems and Per-Client Striping
RAID Problems Small writes in RAID are about four times as expensive as they would be in a disk array without parity. All the disks are attached to a single machine, so its memory and I/O system are performance bottlenecks. Note: there is no reason there has to be a dedicated parity disk.
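The four-times cost of a small write can be made concrete with a short sketch (illustrative only, not from the paper): updating one data block forces the array to read the old data and old parity before writing the new data and new parity, because the new parity is the XOR of all three.

```python
# Sketch: why a small write in a parity array costs four disk I/Os.
# new_parity = old_parity XOR old_data XOR new_data

def small_write(disks, parity_disk, block, new_data):
    """Update one block and its parity; returns the number of disk I/Os."""
    ios = 0
    old_data = disks[block]; ios += 1        # I/O 1: read old data
    old_parity = parity_disk[0]; ios += 1    # I/O 2: read old parity
    new_parity = old_parity ^ old_data ^ new_data
    disks[block] = new_data; ios += 1        # I/O 3: write new data
    parity_disk[0] = new_parity; ios += 1    # I/O 4: write new parity
    return ios

# A non-parity disk array would need only the single data write (1 I/O).
disks = {0: 0b1010, 1: 0b0110}
parity_disk = [disks[0] ^ disks[1]]
assert small_write(disks, parity_disk, 0, 0b1111) == 4
assert parity_disk[0] == disks[0] ^ disks[1]   # parity still consistent
```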
Per-File Striping in a Network File System Note: a collection of file data that spans the servers is called a stripe, and the portion of a stripe stored on a single server is called a stripe fragment. Problems: small files are difficult to handle efficiently, and parity management during updates is inefficient.
Log-Structured File Systems and Per-Client Striping Solution to the per-file problems: Zebra applies the techniques of log-structured file systems (LFS) and per-client striping. Each client forms an append-only log, which lets it batch many small writes into one large write to a single stripe (the client is responsible for computing parity). Requires a File Manager to facilitate client interaction and keep file metadata such as file attributes, directory structure, etc. Like all other LFSs, this design also requires a stripe cleaner.
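The per-client log idea can be sketched as follows (hypothetical names and toy sizes, not Zebra's actual API): small writes accumulate in a buffer, and only when a full stripe's worth of data exists does the client compute parity and ship one fragment per server.

```python
# Illustrative sketch of per-client logging: many small writes are
# appended to an in-memory log; once a full stripe has accumulated,
# the client computes the parity fragment itself and ships the stripe.
from functools import reduce

FRAGMENT_SIZE = 4        # toy fragment size in bytes
NUM_DATA_SERVERS = 3

class ClientLog:
    def __init__(self):
        self.buffer = b""
        self.shipped_stripes = []

    def append(self, data: bytes):
        self.buffer += data
        stripe_bytes = FRAGMENT_SIZE * NUM_DATA_SERVERS
        while len(self.buffer) >= stripe_bytes:
            stripe, self.buffer = (self.buffer[:stripe_bytes],
                                   self.buffer[stripe_bytes:])
            fragments = [stripe[i*FRAGMENT_SIZE:(i+1)*FRAGMENT_SIZE]
                         for i in range(NUM_DATA_SERVERS)]
            parity = bytes(reduce(lambda a, b: a ^ b, col)
                           for col in zip(*fragments))  # client-side parity
            self.shipped_stripes.append(fragments + [parity])

log = ClientLog()
for small_write in [b"ab", b"cd", b"efgh", b"ijkl"]:  # four small writes
    log.append(small_write)
assert len(log.shipped_stripes) == 1                  # batched into one stripe
```

Because the stripe is written as a whole, the parity is computed in memory and written once, avoiding the read-old-data/read-old-parity cycle of a small in-place update.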
Storage Servers Storage server requirements: –Store a fragment –Append to an existing fragment (used for periodic writes of the log) –Retrieve a fragment –Delete a fragment –Identify fragments (used to find the end of client logs after crashes)
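The five requirements above amount to a deliberately narrow interface; a minimal sketch (hypothetical names, not Zebra's RPC interface) makes the point that servers deal only in fragments, never in files:

```python
# Hypothetical storage-server interface covering the five operations
# listed above. The server knows nothing about files or stripes as a
# whole; it stores and serves opaque fragments by identifier.

class StorageServer:
    def __init__(self):
        self.fragments = {}                  # fragment id -> bytes

    def store(self, frag_id, data):          # store a fragment
        self.fragments[frag_id] = data

    def append(self, frag_id, data):         # append to an existing fragment
        self.fragments[frag_id] += data      # (periodic log writes)

    def retrieve(self, frag_id):             # retrieve a fragment
        return self.fragments[frag_id]

    def delete(self, frag_id):               # delete a fragment
        del self.fragments[frag_id]

    def identify(self):                      # identify fragments: lets a
        return sorted(self.fragments)        # client find its log's end

s = StorageServer()
s.store("log-001", b"delta1")
s.append("log-001", b"delta2")
assert s.retrieve("log-001") == b"delta1delta2"
assert s.identify() == ["log-001"]
```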
Clients On Read –The client must determine which stripe fragments store the desired data, retrieve that data from the storage servers, and return it to the application. On Write –The client appends the new data to its log by creating new stripes to hold the data, computing the parity of each stripe, and writing the stripes to the storage servers.
File Managers The File Manager stores all of the information in the file system except file data. The client requests block pointers from the File Manager, then accesses the block data itself. Performance of the File Manager is a concern because it is a centralized resource. –Solution: clients cache naming information from the File Manager so that they contact it less often.
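The caching idea can be shown with a small sketch (hypothetical names and pointer layout): once a block pointer has been fetched, repeated reads of the same block never touch the centralized File Manager again.

```python
# Sketch of client-side block-pointer caching. The pointer format
# (server, fragment, offset) is illustrative, not Zebra's actual layout.

class FileManager:
    def __init__(self, pointers):
        self.pointers = pointers
        self.requests = 0                     # counts manager traffic

    def lookup(self, file_id, block_no):
        self.requests += 1
        return self.pointers[(file_id, block_no)]

class Client:
    def __init__(self, manager):
        self.manager = manager
        self.cache = {}

    def block_pointer(self, file_id, block_no):
        key = (file_id, block_no)
        if key not in self.cache:             # miss: ask the manager once
            self.cache[key] = self.manager.lookup(file_id, block_no)
        return self.cache[key]                # hit: no manager traffic

fm = FileManager({(1, 0): ("server-2", "frag-9", 128)})
c = Client(fm)
for _ in range(5):
    c.block_pointer(1, 0)
assert fm.requests == 1   # manager contacted once despite five reads
```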
Stripe Cleaners (first glance) The only way to reuse free space in a stripe is to clean the stripe so that it contains no live data, then delete it. Since the cleaner is itself a client, it simply reads the live data from the stripes with the most free space, appends that data to its own client log to be written to a new stripe, and then deletes the old stripes.
System Operations Communication Deltas Stripe Cleaning (additional details) Adding Additional Storage Servers
Communication Deltas Deltas provide a simple and reliable way for the various system components to communicate changes to files. A client's log also contains deltas. Delta information: –File ID, file version (time edited), block number, old block pointer, new block pointer. Three types of deltas: –Update delta, cleaner delta, reject delta.
Stripe Cleaning (additional details) Evaluating stripe space utilization –The cleaner must process the deltas in every client log to keep a running count of each stripe's free space. –The cleaner appends all of the deltas that refer to a given stripe to a special file for that stripe, called a stripe status file. Conflicts between cleaning and file access –The stripe cleaner does not lock any files during cleaning; it only issues a special cleaner delta. –If a conflict occurs because an update takes place during cleaning, the file manager will notice the two different deltas and make sure the final pointer for the block reflects the update delta. –The manager then generates a reject delta that tells the cleaner the new block it created is unused. –(This shows how adding a stripe cleaner significantly increases complexity.)
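The conflict rule above can be sketched as a toy resolver (hypothetical names and pointer encoding): an update delta always wins, and a cleaner delta whose old pointer no longer matches the current pointer is turned into a reject.

```python
# Toy sketch of update-vs-cleaner conflict resolution. A delta here is
# ("update", block, new_ptr) or ("cleaner", block, new_ptr, old_ptr).

def resolve(block_ptrs, deltas):
    rejects = []
    for d in deltas:
        kind, block, new_ptr = d[0], d[1], d[2]
        if kind == "update":
            block_ptrs[block] = new_ptr       # updates always win
        elif kind == "cleaner":
            old_ptr = d[3]
            if block_ptrs.get(block) != old_ptr:
                # the block changed while cleaning was in progress:
                # keep the update's pointer, void the cleaner's copy
                rejects.append(("reject", block, new_ptr))
            else:
                block_ptrs[block] = new_ptr   # no conflict: accept the move
    return rejects

# The cleaner copied block 5 (old ptr "A" -> new ptr "B"), but an
# update to "C" raced ahead of it.
ptrs = {5: "C"}
rejects = resolve(ptrs, [("cleaner", 5, "B", "A")])
assert ptrs[5] == "C"                  # update delta's pointer survives
assert rejects == [("reject", 5, "B")]
```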
Adding Additional Storage Servers When a new storage server becomes available, all that must be done is to notify the clients, file manager, and stripe cleaner that each stripe now has one more fragment.
Restoring Consistency After Crashes Two general issues upon a crash –Consistency –Availability Zebra uses a checkpoint and roll-forward method for restoring consistency. Three new consistency problems –Stripes may become internally inconsistent (some of the data or parity was written, but not all of it) –Information written to stripes may become inconsistent with metadata –Stripe cleaner state may become inconsistent with stripes
Stripes may become internally inconsistent Zebra stores a simple checksum for each fragment. On storage server reboot: –Verifies checksums only for fragments written around the time of the crash (using deltas) –Discards fragments whose checksums indicate incomplete writes –Queries the other storage servers to find out what new stripes were written while the server was down.
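A per-fragment checksum is enough to detect a torn write after a crash; here is a minimal sketch using CRC32 (an assumed choice for illustration, not Zebra's actual checksum):

```python
# Minimal sketch of per-fragment checksums used to detect a fragment
# left internally inconsistent by a crash. Storage layout is hypothetical.
import zlib

def write_fragment(store, frag_id, data):
    store[frag_id] = (data, zlib.crc32(data))   # checksum stored with data

def verify(store, frag_id):
    data, checksum = store[frag_id]
    return zlib.crc32(data) == checksum

store = {}
write_fragment(store, "f1", b"stripe data")
assert verify(store, "f1")

# Simulate a crash mid-write: data replaced, stale checksum remains.
store["f1"] = (b"torn write", store["f1"][1])
assert not verify(store, "f1")       # reboot scan discards this fragment
```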
Information written to stripes may become inconsistent with metadata If a client crashes, the file manager must check that client's log to make sure the last log writes completed successfully. If the manager crashes, it has to run through every client's log from the manager's last checkpoint, rolling forward through the rest of each log to update its metadata with everything written since that checkpoint.
Stripe cleaner state becomes inconsistent with stripes The stripe cleaner also stores periodic checkpoints; on restart it reads (and corrects) the stripe status files, then resumes collecting utilization information from the point of its last checkpoint (roll-forward).
Conclusion Zebra provides higher throughput, availability, and scalability than previous file systems, at the cost of increased system complexity. It is only step one: as we saw, xFS included and improved upon Zebra's core functionality.