CS 414 – Multimedia Systems Design Lecture 26 – Media Server (Part 2) Klara Nahrstedt Spring 2011 CS 414 - Spring 2011
Outline Storage Management Multiple Disks and Multimedia Multimedia Data Placement Strategies on Disk Multiple Disks and Multimedia RAID Data Striping Group Creation Disk Management Data Interleaving Disk Scheduling EDF ; SCAN-EDF CS 414 - Spring 2011
Client/Server Video-on-Demand System CS 414 - Spring 2011
Video-on-Demand Systems must be designed with goals: Avoid starvation Minimize buffer space requirement Minimize initiation latency (video startup time) Optimize cost CS 414 - Spring 2011
Some Facts on YouTube Video Service (from 2009) Checked with http://www.wired.com/epicenter/2009/10/youtube-over-one-billion-videos-served-per-day/ Over 1 billion videos per day Bandwidth accounts for about 51% of expenses -- with a run rate of $1 million per day -- with content licensing accounting for 36% http://www.multichannel.com/article/191223-YouTube_May_Lose_470_Million_In_2009_Analysts.php CS 414 - Spring 2011
YouTube Video Server (2010) http://tech.fortune.cnn.com/2010/05/17/youtube-at-5-years-old-2-billion-served-per-day/ May 2010, 2 Billion videos served per day More than 24 hours of video uploaded every minute Videos usually less than 10 minutes long CS 414 - Spring 2011
Media Server Architecture Delivered data Incoming request Network Attachment Content Directory Memory Management File System Storage management Disk controller (Disk Management) Storage device CS 414 - Spring 2011
Storage Management Storage access time to read/write disk block is determined by 3 components Seek Time Time required for the movement of read/write head Rotational Time (Latency Time) Time during which transfer cannot proceed until the right block or sector rotates under read/write head Data Transfer Time Time needed for data to copy from disk into main memory CS 414 - Spring 2011
Placement of MM Data Blocks on Single Disk Continuous Placement Scattered Placement Simple to implement, but subject to fragmentation Avoids fragmentation Enormous copying overhead during insert/delete to maintain continuity Avoid copying overhead When reading file, only one seek required to position the disk head at the start of data When reading file, seek operation incurs for each block , hence intrafile seek CS 414 - Spring 2011
Intra-file Seek Time Intra-file seek – can be avoided in scattered layout if the amount read from a stream always evenly divides block Solution: select sufficient large block and read one block in each round If more than one block is required to prevent starvation prior to next read, deal with intra-file seek Solution: constrained placement or log-structure placement CS 414 - Spring 2011
Scattered Non-continuous Placement CS 414 - Spring 2011
Constrained Placement Approach: separation between successive file blocks is bounded Bound on separation – not enforced for each pair of successive blocks, but only on average over finite sequence of blocks Attractive for small block sizes Implementation – expensive For constrained latency to yield full benefit, scheduling algorithm must retrieve immediately all blocks for a given stream before switching to another stream CS 414 - Spring 2011
Log-Structure Placement This approach writes modified blocks sequentially in a large contiguous space, instead of requiring seek for each block in stream when writing (recording) Reduction of disk seeks Large performance improvements during recording, editing video and audio Problem: bad performance during playback Implementation: complex CS 414 - Spring 2011
Placement of Multiple MM Files on Single Disk Popularity concept among multimedia content - very important Take popularity into account when placing movies on disk Model of popularity distribution – Zipf’s Law Movies are kth ranked if their probability of customer usage is C/k, C = normalization factor Condition holds: C/1 + C/2 + … C/N = 1, N is number of customers CS 414 - Spring 2011
Example Assume N = 5 movies Problem: what is the probability that the next customer picks 3rd ranked movie? Solution: Solve C from the equation C/1 + C/2 + C/3 + C/4 + C/5 = 1 C = 0.437 Probability to pick 3rd ranked movie is C/3 = 0.437/3 = 0.1456 CS 414 - Spring 2011
Placement Algorithm for Multiple Files on Single Disk Organ-Pipe Algorithms (Grossman and Silverman 1973) 1st rank (most popular movie) 2nd ranked movie 3rd 4th 5th 6th 7th 8th 9th Middle of disk (in case of traditional disk layout) Note: In case of ZBR disk layout , place most popular disks at the outer tracks CS 414 - Spring 2011
Need for Multiple Disks Solutions for Media Server Limitation of Single Disk: Disk Throughput Approach: 1 Maintain multiple copies of the same file on different disks Very expensive Approach 2: Scatter multimedia file across multiple disks CS 414 - Spring 2011
Approach: Data Striping RAID (Redundant Arrays of Inexpensive Disks) Addresses both performance and security (0-6) RAID levels – different approach at combining performance enhancements with security/fault-tolerance enhancements Disks spindle synchronously Operate in lock-step parallel mode Striping improves BW, but does not improve seek or rotational delay CS 414 - Spring 2011
Data Striping – Group Creation Multiple RAID: Creation of Subgroups of disks into independent logical disk arrays; limits # of disks per file Declustering: Groups are not made up of complete disks ; # of disks for any stripe is fixed and of same size, but disks on which stripe is located differs Dynamic Declustering: Non - static strip allocation to disks CS 414 - Spring 2011
Storage/Disk Management Disk access – slow and costly Reduce disk access Use block caches (anticipate future reads or writes) Reduce disk arm motion Blocks accesses in sequence (continuously) , place together on one cylinder Interleaved vs non-interleaved storage CS 414 - Spring 2011
Data Interleaving Data Interleaving On single disk (consecutive blocks are placed on The same cylinder But in interleaved way Data Interleaving On Multiple Disks (Disks are not Synchronized) CS 414 - Spring 2011
Disk Scheduling Policies Goal of Scheduling in Traditional Disk Management Reduce cost of seek time Achieve high throughput Provide fair disk access Goal of Scheduling in Multimedia Disk Management Meet deadline of all time-critical tasks Keep necessary buffer requirements low Serve many streams concurrently Find balance between time constraints and efficiency CS 414 - Spring 2011
Disk Scheduling Framework Must support multiple QoS levels and requests Source: Reddy et al, “Disk Scheduling in a Multimedia I/O System”, ACM TOMCCAP 2005 CS 414 - Spring 2011
EDF (Earliest Deadline First) Disk Scheduling Each disk block request is tagged with deadline Very good scheduling policy for periodic requests Policy: Schedule disk block request with earliest deadline Excessive seek time – high overhead Pure EDF must be adapted or combined with file system strategies CS 414 - Spring 2011
EDF Example Note: Consider that block number Implicitly encapsulates the disk track number CS 414 - Spring 2011
Conclusion Disk Scheduling – important component in the timely delivery of streams Admission should be done if one cares not to over subscribe CS 414 - Spring 2011