Using Conviva 29 Aug 2013. Summary Who are we? What is the problem we needed to solve? How was Spark essential to the solution? What can Spark.

Slides:



Advertisements
Similar presentations
Building Cloud-ready Video Transcoding System for Content Delivery Networks(CDNs) Zhenyun Zhuang and Chun Guo Speaker: 饒展榕.
Advertisements

NDN in Local Area Networks Junxiao Shi The University of Arizona
CONFIDENTIAL©2008 MEDIAMELON, INC. DCIA PRESENTATION Kumar Subramanian
White Master Replace with a graphic 5.5” Tall & 4.3” Wide © 2010 Adobe Systems Incorporated. All Rights Reserved. Video Distribution Philippe Degery DMO.
Using Conviva Spark Summit Summary Who are we? What is the problem we needed to solve? How was Spark essential to the solution? What can.
Suphakit Awiphan, Takeshi Muto, Yu Wang, Zhou Su, Jiro Katto
Chapter 19: Network Management Business Data Communications, 4e.
Traffic Engineering With Traditional IP Routing Protocols
Internet Traffic Patterns Learning outcomes –Be aware of how information is transmitted on the Internet –Understand the concept of Internet traffic –Identify.
Two main requirements: 1. Implementation Inspection policies (scheduling algorithms) that will extand the current AutoSched software : Taking to account.
Multiple Sender Distributed Video Streaming Thinh Nguyen, Avideh Zakhor appears on “IEEE Transactions On Multimedia, vol. 6, no. 2, April, 2004”
Chapter 13 Embedded Systems
A Routing Control Platform for Managing IP Networks Jennifer Rexford Princeton University
All content in this presentation is protected – © 2008 American Power Conversion Corporation Rael Haiboullin System Engineer Capacity Manager.
Bringing Internet Video to Prime Time Aditya Ganjam Vice President of Engineering Conviva.
New Challenges in Cloud Datacenter Monitoring and Management
Scalable Server Load Balancing Inside Data Centers Dana Butnariu Princeton University Computer Science Department July – September 2010 Joint work with.
11 SERVER CLUSTERING Chapter 6. Chapter 6: SERVER CLUSTERING2 OVERVIEW  List the types of server clusters.  Determine which type of cluster to use for.
Network Simulation Internet Technologies and Applications.
Microsoft ® Application Virtualization 4.6 Infrastructure Planning and Design Published: September 2008 Updated: February 2010.
Real-time Stream Processing Architecture for Comcast IP Video
Network Planète Chadi Barakat
Research on cloud computing application in the peer-to-peer based video-on-demand systems Speaker : 吳靖緯 MA0G rd International Workshop.
USING HADOOP & HBASE TO BUILD CONTENT RELEVANCE & PERSONALIZATION Tools to build your big data application Ameya Kanitkar.
Portable and Predictable Performance on Heterogeneous Embedded Manycores (ARTEMIS ) ARTEMIS Project Review 28 nd October 2014 Multimedia Demonstrator.
Tyson Condie.
Technology Overview. Agenda What’s New and Better in Windows Server 2003? Why Upgrade to Windows Server 2003 ?  From Windows NT 4.0  From Windows 2000.
Cloud Computing 1. Outline  Introduction  Evolution  Cloud architecture  Map reduce operation  Platform 2.
NAB 2012 Cloud Computing Conference New Levels of Media Performance Data Enabled by Cloud Computing -- and Impact on Other Sectors Scott Brown, GM US and.
Active Monitoring in GRID environments using Mobile Agent technology Orazio Tomarchio Andrea Calvagna Dipartimento di Ingegneria Informatica e delle Telecomunicazioni.
An Integration Framework for Sensor Networks and Data Stream Management Systems.
Wireless Networks Breakout Session Summary September 21, 2012.
Department of Information Engineering The Chinese University of Hong Kong A Framework for Monitoring and Measuring a Large-Scale Distributed System in.
1 Wenguang WangRichard B. Bunt Department of Computer Science University of Saskatchewan November 14, 2000 Simulating DB2 Buffer Pool Management.
HUAWEI TECHNOLOGIES CO., LTD. Page 1 Survey of P2P Streaming HUAWEI TECHNOLOGIES CO., LTD. Ning Zong, Johnson Jiang.
FUTURE OF NETWORKING SAJAN PAUL JUNIPER NETWORKS.
1 Geospatial and Business Intelligence Jean-Sébastien Turcotte Executive VP San Francisco - April 2007 Streamlining web mapping applications.
CE Operating Systems Lecture 3 Overview of OS functions and structure.
Issues Autonomic operation (fault tolerance) Minimize interference to applications Hardware support for new operating systems Resource management (global.
How Companies are Using Spark And where the Edge in Big Data will be Matei Zaharia.
Developer TECH REFRESH 15 Junho 2015 #pttechrefres h Understand your end-users and your app with Application Insights.
Zibin Zheng DR 2 : Dynamic Request Routing for Tolerating Latency Variability in Cloud Applications CLOUD 2013 Jieming Zhu, Zibin.
A Utility-based Approach to Scheduling Multimedia Streams in P2P Systems Fang Chen Computer Science Dept. University of California, Riverside
11 CLUSTERING AND AVAILABILITY Chapter 11. Chapter 11: CLUSTERING AND AVAILABILITY2 OVERVIEW  Describe the clustering capabilities of Microsoft Windows.
Automatic Statistical Evaluation of Resources for Condor Daniel Nurmi, John Brevik, Rich Wolski University of California, Santa Barbara.
Resilient Distributed Datasets: A Fault- Tolerant Abstraction for In-Memory Cluster Computing Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave,
Chapter 7: Consistency & Replication IV - REPLICATION MANAGEMENT By Jyothsna Natarajan Instructor: Prof. Yanqing Zhang Course: Advanced Operating Systems.
MiddleMan: A Video Caching Proxy Server NOSSDAV 2000 Brian Smith Department of Computer Science Cornell University Ithaca, NY Soam Acharya Inktomi Corporation.
CoopNet: Cooperative Networking
Performance Testing Test Complete. Performance testing and its sub categories Performance testing is performed, to determine how fast some aspect of a.
Microsoft Cloud Solution.  What is the cloud?  Windows Azure  What services does it offer?  How does it all work?  How to go about using it  Further.
WHAT'S THE DIFFERENCE BETWEEN A WEB APPLICATION STREAMING NETWORK AND A CDN? INSTART LOGIC.
© 2016 TM Forum | 1 Virtual CPE Platform in the Home.
BIG DATA/ Hadoop Interview Questions.
Devices 10 billion Internet- connected devices by 2016 People 1 billion+ people use social media services today Cloud 30 % of data will live in or pass.
Connected Infrastructure
TV Broadcasting What to look for Architecture TV Broadcasting Solution
Univa Grid Engine Makes Work Management Automatic and Efficient, Accelerates Deployment of Cloud Services with Power of Microsoft Azure MICROSOFT AZURE.
University of Maryland College Park
Live Global Sports Events
Enterprise Town Hall solution
Pytheas: Enabling Data-Driven Quality of Experience Optimization Using Group-Based Exploration-Exploitation Junchen Jiang (CMU) Shijie Sun (Tsinghua Univ.)
CFA: A Practical Prediction System for Video Quality Optimization
Connected Infrastructure
Cloud Computing By P.Mahesh
Discretized Streams: A Fault-Tolerant Model for Scalable Stream Processing Zaharia, et al (2012)
Chapter 2: Operating-System Structures
Chapter 2: Operating-System Structures
Mark Quirk Head of Technology Developer & Platform Group
Conviva & Sky A real-world OTT video Quality of Experience case study
Presentation transcript:

Using Conviva 29 Aug 2013

Summary Who are we? What is the problem we needed to solve? How was Spark essential to the solution? What can Spark enable us to do in the future?

Some context… Quality has a huge impact on user engagement in online video

What we do Optimize end user online video experience Provide historical and real-time metrics for online video providers and distributers Maximize quality through  Adaptive Bit Rate (ABR)  Content Distribution Network (CDN) switching Enable Multi-CDN infrastructure with fine grained traffic shaping control

About four billion streams per month Mostly premium content providers (e.g., HBO, ESPN, BSkyB, CNTV) but also User Generated Video sites (e.g., Ustream) Live events (e.g., NCAA March Madness, FIFA World Cup, MLB), short VoDs (e.g., MSNBC), and long VoDs (e.g., HBO, Hulu) Various streaming protocols (e.g., Flash, SmoothStreaming, HLS, HDS), and devices (e.g., PC, iOS devices, Roku, XBOX, …) Traffic from all major CDNs, including ISP CDNs (e.g., Verizon, AT&T) What traffic do we see?

The Truth Video delivery over the internet is hard  There is a big disconnect between viewers and publishers

The Viewer’s Perspective ISP & Home Net Video Player Start of the viewer’s perspective ?

Video Source Encoders & Video Servers CMS and Hosting Content Delivery Networks (CDN) The Provider’s Perspective ?

Video Source Encoders & Video Servers CMS and Hosting Content Delivery Networks (CDN) ISP & Home Net Video Player Closing the loop

The Truth Video delivery over the internet is hard  CDN variability makes it nearly impossible to deliver high quality everywhere with just one CDN SIGCOMM ‘12

The Truth Video delivery over the internet is hard  CDN variability makes it nearly impossible to deliver high quality all the time with just one CDN SIGCOMM ‘12

The Idea Where there is heterogeneity, there is room for optimization For each viewer we want to decide what CDN to stream from But it’s difficult to model the internet, and things can rapidly change over time So we will make this decision based on the real-time data that we collect

CDN1 (buffering ratio) CDN2 (buffering ratio) The Idea Avg. buff ratio of users in Region[1] streaming from CDN1 Avg. buff ratio of users in Region[1] streaming from CDN2 For each CDN, partition clients by City For each partition compute Buffering Ratio Seattle San Francisco New York London Hong Kong

The Idea For each partition select best CDN and send clients to this CDN CDN1 (buffering ratio) CDN2 (buffering ratio) Seattle San Francisco New York London Hong Kong

The Idea For each partition select best CDN and send clients to this CDN CDN1 (buffering ratio) CDN2 (buffering ratio) CDN2 >> CDN1 Seattle San Francisco New York London Hong Kong

The Idea For each partition select best CDN and send clients to this CDN CDN1 (buffering ratio) CDN2 (buffering ratio) Seattle San Francisco New York London Hong Kong

Best CDN (buffering ratio) The Idea CDN1 (buffering ratio) CDN2 (buffering ratio) For each partition select best CDN and send clients to this CDN Seattle San Francisco New York London Hong Kong

The Idea Best CDN (buffering ratio) CDN1 (buffering ratio) CDN2 (buffering ratio) What if there are changes in performance? Seattle San Francisco New York London Hong Kong

The Idea Best CDN (buffering ratio) CDN1 (buffering ratio) CDN2 (buffering ratio) Use online algorithm respond to changes in the network. Seattle San Francisco New York London Hong Kong

Content owners (CMS & Origin) CDN 1 CDN 2 CDN 3 Clients Coordinator Continuous measurements Business Policies control How? Coordinator implementing an optimization algorithm that dynamically selects a CDN for each client based on  Individual client  Aggregate statistics  Content owner policies All based on real-time data

What processing framework do we use? Twitter Storm  Fault tolerance model affects data accuracy  Non-deterministic streaming model Roll our own  Too complex  No need to reinvent the wheel Spark  Easily integrates with existing Hadoop architecture  Flexible, simple data model  Writing map() is generally easier than writing update()

Spark Usage for Production Decision Maker Compute performance metrics in spark cluster Relay performance information to decision makers Spark Cluster Spark Node Spark Node Spark Node Storage Layer Clients

Results Non-buffering views Average Bitrate (Kbps)

Spark’s Role Spark development was incredibly rapid, aided both by its excellent programming interface and highly active community Expressive:  Develop complex on-line ML decision based algorithm in ~1000 lines of code  Easy to prototype various algorithms It has made scalability a far more manageable problem After initial teething problems, we have been running Spark in a production environment reliably for several months.

Problems we faced Silent crashes… Often difficult to debug, requiring some tribal knowledge Difficult configuration parameters, with sometimes inexplicable results Fundamental understanding of underlying data model was essential to writing effective, stable spark programs

Enforcing constraints on optimization Imagine swapping clients until an optimal solution is reached Constrained Best CDN (buffering ratio) CDN1 (buffering ratio) CDN2 (buffering ratio) 50%

Enforcing constraints on top of optimization Solution is found after clients have already joined. Therefore we need to parameterize solution to clients already seen for online use. Need to compute an LP on real time data Spark Supported it  20 LPs  Each with 4000 decisions variables and 350 constraints  5 seconds.

Tuning Can’t select a CDN based solely on one metric.  Select utility functions that best predict engagement Confidence in a decision, or evaluation will depend on how much data we have collected  Need to tune time window  Select different attributes for separation

Tuning Need to validate algorithm changes quickly Simulation of algorithm offline, is essential

Spark Usage for Simulation Spark Node Spark Node Spark Node HDFS with Production traces Load production traces with randomized initial decisions Generate decision table (with artificial delay) Produce simulated decision set Evaluate decisions against actual traces to estimate expected quality improvement

Future of Spark and Conviva Leverage spark streaming Unify live and historical processing Develop platform to build various processing ‘apps’ (e.g. Anomaly Detection, Customer Tailored Reporting)  Can share the same data API  Will all have consistent input

In Summary Spark was able to support our initial requirement of fast fault tolerant performance computation for an on-line decision maker New complexities like LP calculation ‘just worked’ in the existing architecture Spark has become an essential tool in our software stack

Thank you!

Questions?